Compute Library
 21.02
NESpaceToDepthLayerKernel Class Reference

Interface for the space to depth kernel. More...

#include <NESpaceToDepthLayerKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel. More...
 
 NESpaceToDepthLayerKernel ()
 Default constructor. More...
 
 NESpaceToDepthLayerKernel (const NESpaceToDepthLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers). More...
 
NESpaceToDepthLayerKernel & operator= (const NESpaceToDepthLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers). More...
 
 NESpaceToDepthLayerKernel (NESpaceToDepthLayerKernel &&)=default
 Allow instances of this class to be moved. More...
 
NESpaceToDepthLayerKernel & operator= (NESpaceToDepthLayerKernel &&)=default
 Allow instances of this class to be moved. More...
 
 ~NESpaceToDepthLayerKernel ()=default
 Default destructor. More...
 
void configure (const ITensor *input, ITensor *output, int32_t block_shape)
 Initialise the kernel's inputs and output. More...
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window. More...
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor. More...
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version. More...
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window. More...
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor. More...
 
virtual ~IKernel ()=default
 Destructor. More...
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable. More...
 
virtual BorderSize border_size () const
 The size of the border for that kernel. More...
 
const Window & window () const
 The maximum window the kernel can be executed on. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *output, int32_t block_shape)
 Static function to check if given info will lead to a valid configuration of NESpaceToDepthLayerKernel. More...
 

Detailed Description

Interface for the space to depth kernel.

Definition at line 35 of file NESpaceToDepthLayerKernel.h.

Constructor & Destructor Documentation

◆ NESpaceToDepthLayerKernel() [1/3]

Default constructor.

Definition at line 72 of file NESpaceToDepthLayerKernel.cpp.

Referenced by NESpaceToDepthLayerKernel::name().

NESpaceToDepthLayerKernel::NESpaceToDepthLayerKernel()
    : _input(nullptr), _output(nullptr), _block_shape(), _data_layout(DataLayout::UNKNOWN)
{
}

◆ NESpaceToDepthLayerKernel() [2/3]

Prevent instances of this class from being copied (as this class contains pointers).

◆ NESpaceToDepthLayerKernel() [3/3]

Allow instances of this class to be moved.

◆ ~NESpaceToDepthLayerKernel()

Default destructor.

Referenced by NESpaceToDepthLayerKernel::name().

Member Function Documentation

◆ configure()

void configure (const ITensor *input, ITensor *output, int32_t block_shape)

Initialise the kernel's inputs and output.

Parameters
[in]  input        Tensor input. Supported tensor rank: 4. Data types supported: All.
[out] output       Tensor output. Data types supported: same as input.
[in]  block_shape  Block shape value.
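
To illustrate how block_shape relates the input and output tensors, the sketch below computes the conventional space-to-depth output shape: width and height shrink by a factor of block_shape, and the channel count grows by block_shape squared. Shape4D and space_to_depth_shape are hypothetical names used only for this example, and it is an assumption that the library's own shape helper follows the same convention.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 4D shape used only for this illustration (not a library type).
struct Shape4D
{
    std::size_t width;
    std::size_t height;
    std::size_t channels;
    std::size_t batches;
};

// Conventional space-to-depth shape rule (assumed; not copied from the library source).
inline Shape4D space_to_depth_shape(const Shape4D &in, std::size_t block_shape)
{
    // The spatial dimensions must tile exactly into block_shape x block_shape blocks.
    assert(block_shape >= 1);
    assert(in.width % block_shape == 0 && in.height % block_shape == 0);

    return Shape4D{ in.width / block_shape,
                    in.height / block_shape,
                    in.channels * block_shape * block_shape,
                    in.batches };
}
```

The total number of elements is unchanged by this rule, which is consistent with the kernel copying every input element exactly once.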

Definition at line 77 of file NESpaceToDepthLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), arm_compute::misc::shape_calculator::compute_space_to_depth_shape(), ITensorInfo::data_layout(), ITensorInfo::data_type(), ITensor::info(), arm_compute::test::validation::input, arm_compute::test::validation::output_shape, and arm_compute::validate_arguments().

Referenced by NESpaceToDepthLayerKernel::name().

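void NESpaceToDepthLayerKernel::configure(const ITensor *input, ITensor *output, int32_t block_shape)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);

    // Derive the output shape and initialise the output info if it is still empty
    TensorShape output_shape = compute_space_to_depth_shape(input->info(), block_shape);
    auto_init_if_empty(*output->info(), output_shape, 1, input->info()->data_type());

    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), output->info(), block_shape));

    _input       = input;
    _block_shape = block_shape;
    _output      = output;
    _data_layout = input->info()->data_layout();

    // Configure kernel window
    Window win = calculate_max_window(*output->info(), Steps());
    INEKernel::configure(win);
}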

◆ name()

◆ operator=() [1/2]

NESpaceToDepthLayerKernel & operator= (const NESpaceToDepthLayerKernel &)
delete

Prevent instances of this class from being copied (as this class contains pointers).

Referenced by NESpaceToDepthLayerKernel::name().

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run (const Window &window, const ThreadInfo &info)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window().
Note
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window().)
[in]  info    Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 102 of file NESpaceToDepthLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::CHANNEL, ITensorInfo::dimension(), ITensorInfo::element_size(), arm_compute::execute_window_loop(), Window::first_slice_window_3D(), arm_compute::get_data_layout_dimension_index(), ITensor::info(), arm_compute::NCHW, Iterator::ptr(), ITensor::ptr_to_element(), Window::slide_window_slice_3D(), IKernel::window(), Window::x(), and Window::z().

Referenced by NESpaceToDepthLayerKernel::name().

void NESpaceToDepthLayerKernel::run(const Window &window, const ThreadInfo &info)
{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(IKernel::window(), window);

    const int channel_idx  = get_data_layout_dimension_index(_data_layout, DataLayoutDimension::CHANNEL);
    const int element_size = _input->info()->element_size();

    // Number of channels in the input tensor
    const size_t channel_size = _input->info()->dimension(channel_idx);

    Window slice_out = window.first_slice_window_3D();

    int batch_id = 0;

    // Main loop for NCHW and NHWC
    if(_data_layout == DataLayout::NCHW)
    {
        do
        {
            Iterator out(_output, slice_out);
            execute_window_loop(slice_out, [&](const Coordinates & id)
            {
                // Output coordinates are (x, y, channel); map them back to the input element
                const size_t channel_id = id.z();
                const size_t in_x       = id.x() * _block_shape + (channel_id / channel_size) % _block_shape;
                const size_t in_y       = id.y() * _block_shape + (channel_id / channel_size) / _block_shape;
                const int    z          = channel_id % channel_size;
                Coordinates  input_coords{ in_x, in_y, z, batch_id };
                memcpy(out.ptr(), _input->ptr_to_element(input_coords), element_size);
            },
            out);
            ++batch_id;
        }
        while(window.slide_window_slice_3D(slice_out));
    }
    else
    {
        do
        {
            Iterator out(_output, slice_out);
            execute_window_loop(slice_out, [&](const Coordinates & id)
            {
                // NHWC: output coordinates are (channel, x, y)
                const size_t channel_id = id.x();
                const size_t in_x       = id.y() * _block_shape + (channel_id / channel_size) % _block_shape;
                const size_t in_y       = id.z() * _block_shape + (channel_id / channel_size) / _block_shape;
                const int    z          = channel_id % channel_size;
                Coordinates  input_coords{ z, in_x, in_y, batch_id };
                memcpy(out.ptr(), _input->ptr_to_element(input_coords), element_size);
            },
            out);
            ++batch_id;
        }
        while(window.slide_window_slice_3D(slice_out));
    }
}
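
The index arithmetic in the NCHW branch above can be restated on a plain flat buffer, which makes it easier to check by hand. The following is a minimal sketch under stated assumptions: space_to_depth_nchw is a hypothetical helper (not part of the library), the buffer holds int values laid out as [batch][channel][y][x] with x varying fastest, and the output shape follows the conventional rule of dividing width and height by the block size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone restatement of the NCHW mapping on a flat buffer.
// "channels" is the INPUT channel count, matching channel_size in the listing above.
inline std::vector<int> space_to_depth_nchw(const std::vector<int> &in,
                                            std::size_t batches, std::size_t channels,
                                            std::size_t height, std::size_t width,
                                            std::size_t block)
{
    assert(block >= 1 && width % block == 0 && height % block == 0);
    assert(in.size() == batches * channels * height * width);

    const std::size_t out_w = width / block;
    const std::size_t out_h = height / block;
    const std::size_t out_c = channels * block * block;

    std::vector<int> out(in.size());
    std::size_t      o = 0; // Output is written in [batch][channel][y][x] order
    for(std::size_t n = 0; n < batches; ++n)
    {
        for(std::size_t z = 0; z < out_c; ++z)
        {
            for(std::size_t y = 0; y < out_h; ++y)
            {
                for(std::size_t x = 0; x < out_w; ++x)
                {
                    // Same arithmetic as the kernel: the quotient selects the position
                    // inside the block, the remainder selects the input channel.
                    const std::size_t offset = z / channels;
                    const std::size_t in_x   = x * block + offset % block;
                    const std::size_t in_y   = y * block + offset / block;
                    const std::size_t in_c   = z % channels;
                    out[o++] = in[((n * channels + in_c) * height + in_y) * width + in_x];
                }
            }
        }
    }
    return out;
}
```

For a single batch with 2 channels, a 2x2 spatial extent, a block size of 2, and input values 0 to 7 in storage order, this sketch yields 0, 4, 1, 5, 2, 6, 3, 7: each output channel alternates between the two input channels before moving to the next position inside the block.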

◆ validate()

Status validate (const ITensorInfo *input, const ITensorInfo *output, int32_t block_shape)
static

Static function to check if given info will lead to a valid configuration of NESpaceToDepthLayerKernel.

Parameters
[in]  input        Tensor input info. Supported tensor rank: 4. Data types supported: All.
[in]  output       Tensor output info. Data types supported: same as input.
[in]  block_shape  Block shape value.
Returns
a status
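
The checks performed by validate_arguments() are not listed on this page. Purely to illustrate the kind of precondition a space-to-depth configuration needs, the sketch below rejects shapes that the operation cannot tile. TensorDesc and check_space_to_depth are hypothetical names, and the specific checks are assumptions rather than the library's actual rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical shape description used only for this illustration (not a library type).
struct TensorDesc
{
    std::size_t width;
    std::size_t height;
    std::size_t channels;
    std::size_t batches;
};

// Returns an empty string when the configuration looks valid, otherwise a description
// of the first problem found. The checks are assumed, not taken from the library.
inline std::string check_space_to_depth(const TensorDesc &in, const TensorDesc &out, std::int32_t block_shape)
{
    if(block_shape < 1)
    {
        return "block_shape must be at least 1";
    }
    const std::size_t b = static_cast<std::size_t>(block_shape);
    if(in.width % b != 0 || in.height % b != 0)
    {
        return "input width and height must be multiples of block_shape";
    }
    if(out.width != in.width / b || out.height != in.height / b)
    {
        return "output spatial size does not match input size divided by block_shape";
    }
    if(out.channels != in.channels * b * b)
    {
        return "output channel count does not match input channels times block_shape squared";
    }
    if(out.batches != in.batches)
    {
        return "batch count must be preserved";
    }
    return std::string{};
}
```

Returning a description instead of throwing loosely mirrors the idea of a status-returning validate() that callers can query before calling configure().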

Definition at line 96 of file NESpaceToDepthLayerKernel.cpp.

References ARM_COMPUTE_RETURN_ON_ERROR, and arm_compute::validate_arguments().

Referenced by NESpaceToDepthLayerKernel::name(), and NESpaceToDepthLayer::validate().

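Status NESpaceToDepthLayerKernel::validate(const ITensorInfo *input, const ITensorInfo *output, int32_t block_shape)
{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, output, block_shape));
    return Status{};
}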

The documentation for this class was generated from the following files: