Compute Library
 21.02
NEFFTConvolutionLayer Class Reference

Basic function to execute FFT-based convolution on Neon. More...

#include <NEFFTConvolutionLayer.h>

Collaboration diagram for NEFFTConvolutionLayer:
[legend]

Public Member Functions

 NEFFTConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
 NEFFTConvolutionLayer (const NEFFTConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEFFTConvolutionLayer (NEFFTConvolutionLayer &&)=delete
 Prevent instances of this class from being moved (As this class contains non movable objects) More...
 
NEFFTConvolutionLayer & operator= (const NEFFTConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEFFTConvolutionLayer & operator= (NEFFTConvolutionLayer &&)=delete
 Prevent instances of this class from being moved (As this class contains non movable objects) More...
 
 ~NEFFTConvolutionLayer ()
 Default destructor. More...
 
void configure (ITensor *input, const ITensor *weights, const ITensor *biases, ITensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Static function to check if given info will lead to a valid configuration of NEFFTConvolutionLayer. More...
 

Detailed Description

Basic function to execute FFT-based convolution on Neon.

This function calls the following Neon functions/kernels:

  1. NEPermute Permute input if NHWC (only NCHW is supported).
  2. NEPadLayer Pad input.
  3. NEFFT2D Forward transform to the frequency domain.
  4. NEComplexPixelWiseMultiplication Complex element-wise product of input and the weights.
  5. NEReductionOperation Reduction across channels.
  6. NEFFT2D Inverse transform back to the time domain.
  7. NEStridedSlice Extract valid output.
  8. NEArithmeticAddition Add bias.
  9. NEActivationLayer Perform activation.
  10. NEPermute Permute output if NHWC (only NCHW is supported).

Definition at line 59 of file NEFFTConvolutionLayer.h.
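
A minimal usage sketch, assuming hypothetical F32 NCHW tensors (64x64 input with 3 channels, a 5x5 kernel, 8 output feature maps), "same" padding and the default (null) memory manager; the shapes and include paths reflect the release source tree and are illustrative only:

#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/functions/NEFFTConvolutionLayer.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    // Hypothetical shapes: input [width, height, IFM], weights [kernel_x, kernel_y, IFM, OFM],
    // biases [OFM], output [width, height, OFM].
    Tensor src, weights, biases, dst;
    src.allocator()->init(TensorInfo(TensorShape(64U, 64U, 3U), 1, DataType::F32));
    weights.allocator()->init(TensorInfo(TensorShape(5U, 5U, 3U, 8U), 1, DataType::F32));
    biases.allocator()->init(TensorInfo(TensorShape(8U), 1, DataType::F32));
    dst.allocator()->init(TensorInfo(TensorShape(64U, 64U, 8U), 1, DataType::F32));

    // FFT-based convolution expects unit strides and "same" padding (kernel_size / 2 per side).
    NEFFTConvolutionLayer fft_conv;
    fft_conv.configure(&src, &weights, &biases, &dst, PadStrideInfo(1, 1, 2, 2));

    // Allocate backing memory, then fill src/weights/biases before running.
    src.allocator()->allocate();
    weights.allocator()->allocate();
    biases.allocator()->allocate();
    dst.allocator()->allocate();

    fft_conv.run(); // prepare() is invoked automatically on the first run
    return 0;
}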

Constructor & Destructor Documentation

◆ NEFFTConvolutionLayer() [1/3]

NEFFTConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Default constructor.

Definition at line 61 of file NEFFTConvolutionLayer.cpp.

62  : _memory_group(memory_manager),
63  _flip_weights_func(),
64  _permute_input_func(),
65  _permute_output_func(),
66  _permute_weights_func(),
67  _permute_bias_func(),
68  _pad_input_func(),
69  _pad_weights_func(),
70  _transform_input_func(memory_manager),
71  _transform_weights_func(),
72  _itransform_output_func(memory_manager),
73  _prod_func(),
74  _reduce_func(),
75  _extract_output_func(),
76  _bias_add_func(),
77  _activation_layer_func(),
78  _permuted_input(),
79  _permuted_weights(),
80  _permuted_bias(),
81  _permuted_output(),
82  _padded_input(),
83  _padded_weights(),
84  _flip_axis(),
85  _flipped_weights(),
86  _transformed_input(),
87  _transformed_weights(),
88  _input_weights_product(),
89  _output_product(),
90  _output_reduced(),
91  _itransformed_output(),
92  _reshaped_output(),
93  _bias_output(),
94  _original_weights(nullptr),
95  _original_bias(nullptr),
96  _is_activationlayer_enabled(false),
97  _needs_permute(false),
98  _has_bias(false),
99  _is_prepared(false)
100 {
101 }

◆ NEFFTConvolutionLayer() [2/3]

NEFFTConvolutionLayer ( const NEFFTConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEFFTConvolutionLayer() [3/3]

NEFFTConvolutionLayer ( NEFFTConvolutionLayer && )
delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ ~NEFFTConvolutionLayer()

~NEFFTConvolutionLayer ( )
default

Default destructor.

Member Function Documentation

◆ configure()

void configure ( ITensor * input,
const ITensor * weights,
const ITensor * biases,
ITensor * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool  enable_fast_math = false 
)

Set the input and output tensors.

Note
This function only works with square kernels and unit strides, for both NCHW and NHWC data layouts.
Parameters
[in]  input             Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F32.
[in]  weights           Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in]  biases            Biases tensor. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[out] output            Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info         Contains padding and stride information described in PadStrideInfo.
[in]  act_info          (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. Unused for the Neon backend.
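
Another minimal sketch of a configure() call, assuming hypothetical shapes, no bias and a fused ReLU activation:

Tensor src, weights, dst;
src.allocator()->init(TensorInfo(TensorShape(32U, 32U, 16U), 1, DataType::F32));        // [width, height, IFM]
weights.allocator()->init(TensorInfo(TensorShape(3U, 3U, 16U, 32U), 1, DataType::F32)); // [kernel_x, kernel_y, IFM, OFM]
dst.allocator()->init(TensorInfo(TensorShape(32U, 32U, 32U), 1, DataType::F32));        // [width, height, OFM]

NEFFTConvolutionLayer fft_conv;
fft_conv.configure(&src, &weights, nullptr /* no biases */, &dst,
                   PadStrideInfo(1, 1, 1, 1), // unit strides, pad = kernel_size / 2
                   ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU));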

Definition at line 104 of file NEFFTConvolutionLayer.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_UNUSED, arm_compute::auto_init_if_empty(), Tensor::buffer(), ICloneable< T >::clone(), NEReverse::configure(), NEPermute::configure(), NEFFT2D::configure(), NEReductionOperation::configure(), NEActivationLayer::configure(), NEArithmeticAddition::configure(), NEPadLayer::configure(), NESlice::configure(), NEComplexPixelWiseMultiplication::configure(), ITensorInfo::data_layout(), FFT2DInfo::direction, ActivationLayerInfo::enabled(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, ITensor::info(), Tensor::info(), TensorAllocator::init(), arm_compute::test::validation::input, arm_compute::Inverse, MemoryGroup::manage(), arm_compute::NCHW, arm_compute::NHWC, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), TensorShape::remove_dimension(), ITensorInfo::set_data_layout(), arm_compute::SUM, ITensorInfo::tensor_shape(), arm_compute::U, arm_compute::U32, arm_compute::WIDTH, arm_compute::WRAP, Dimensions< T >::x(), and Dimensions< T >::y().

106 {
107  ARM_COMPUTE_UNUSED(enable_fast_math);
108 
109  _original_weights = weights;
110  _original_bias = biases;
111 
112  // Flag if bias addition is required
113  _has_bias = biases != nullptr;
114 
115  // Get indices for the width and height
116  const size_t idx_width = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::WIDTH);
117  const size_t idx_height = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::HEIGHT);
118 
119  // Input shape, kernel size and output tile
120  const Size2D input_dims = Size2D(input->info()->tensor_shape()[idx_width], input->info()->tensor_shape()[idx_height]);
121  const Size2D kernel_size = Size2D(weights->info()->tensor_shape()[idx_width], weights->info()->tensor_shape()[idx_height]);
122  const Size2D pad_valid = Size2D(pad_decomposable(input_dims.x() + kernel_size.x() - 1),
123  pad_decomposable(input_dims.y() + kernel_size.y() - 1));
124  // Tensors to use
125  ITensor *input_to_use = input;
126  const ITensor *weights_to_use = weights;
127  ITensor *output_to_use = _has_bias ? &_bias_output : output;
128 
129  // Permute bias
130  if(biases != nullptr)
131  {
132  _permute_bias_func.configure(biases, &_permuted_bias, PermutationVector(1U, 2U, 0U));
133  _permuted_bias.info()->set_data_layout(DataLayout::NCHW);
134  }
135 
136  // Permute input if needed
137  _needs_permute = input->info()->data_layout() == DataLayout::NHWC;
138  if(_needs_permute)
139  {
140  _memory_group.manage(&_permuted_input);
141  // Configure the function to transform the input tensor from NHWC -> NCHW
142  _permute_input_func.configure(input, &_permuted_input, PermutationVector(1U, 2U, 0U));
143  _permuted_input.info()->set_data_layout(DataLayout::NCHW);
144 
145  // Configure the function to transform the weights tensor from HWI -> IHW
146  _permute_weights_func.configure(weights, &_permuted_weights, PermutationVector(1U, 2U, 0U));
147  _permuted_weights.info()->set_data_layout(DataLayout::NCHW);
148 
149  input_to_use = &_permuted_input;
150  weights_to_use = &_permuted_weights;
151  }
152 
153  // Flip weights
154  _flipped_weights.allocator()->init(weights_to_use->info()->clone()->set_is_resizable(true).reset_padding());
155  _flip_axis.allocator()->init(TensorInfo(TensorShape(2U), 1, DataType::U32));
156  _flip_weights_func.configure(weights_to_use, &_flipped_weights, &_flip_axis);
157 
158  // Pad weights
159  const PaddingList padding_w = { { 0, input_dims.x() + pad_valid.x() - 1 }, { 0, input_dims.y() + pad_valid.y() - 1 } };
160  _pad_weights_func.configure(&_flipped_weights, &_padded_weights, padding_w);
161 
162  // Transform weights
163  _transform_weights_func = std::make_unique<NEFFT2D>();
164  _transform_weights_func->configure(&_padded_weights, &_transformed_weights, FFT2DInfo());
165 
166  // Pad input
167  const PaddingList padding_in = { { 0, kernel_size.x() + pad_valid.x() - 1 }, { 0, kernel_size.y() + pad_valid.y() - 1 } };
168  _memory_group.manage(&_padded_input);
169  _pad_input_func.configure(input_to_use, &_padded_input, padding_in);
170  if(_needs_permute)
171  {
172  _permuted_input.allocator()->allocate();
173  }
174 
175  // Transform input
176  _memory_group.manage(&_transformed_input);
177  _transform_input_func.configure(&_padded_input, &_transformed_input, FFT2DInfo());
178  _padded_input.allocator()->allocate();
179 
180  // Perform product
181  _memory_group.manage(&_output_product);
182  _prod_func.configure(&_transformed_input, &_transformed_weights, &_output_product);
183  _transformed_input.allocator()->allocate();
184 
185  // Perform reduction
186  _memory_group.manage(&_output_reduced);
187  _reduce_func.configure(&_output_product, &_output_reduced, 2, ReductionOperation::SUM);
188  _output_product.allocator()->allocate();
189 
190  // Transform output
191  _memory_group.manage(&_itransformed_output);
192  FFT2DInfo itranform_info;
193  itranform_info.direction = FFTDirection::Inverse;
194  _itransformed_output.allocator()->init(_output_reduced.info()->clone()->set_is_resizable(true).set_num_channels(1).reset_padding());
195  _itransform_output_func.configure(&_output_reduced, &_itransformed_output, itranform_info);
196  _output_reduced.allocator()->allocate();
197 
198  // Reshape output
199  TensorShape reshaped_shape = _itransformed_output.info()->tensor_shape();
200  reshaped_shape.remove_dimension(2);
201  _reshaped_output.allocator()->init(_itransformed_output.info()->clone()->set_tensor_shape(reshaped_shape));
202 
203  // Extract correct region
204  const int start_left = kernel_size.x() - conv_info.pad_left() - 1;
205  const int start_top = kernel_size.y() - conv_info.pad_top() - 1;
206  const int end_right = _reshaped_output.info()->tensor_shape().x() - (kernel_size.x() - conv_info.pad_right() - 1) - pad_valid.x();
207  const int end_botton = _reshaped_output.info()->tensor_shape().y() - (kernel_size.y() - conv_info.pad_bottom() - 1) - pad_valid.y();
208  if(_has_bias)
209  {
210  _memory_group.manage(&_bias_output);
211  }
212  else if(_needs_permute)
213  {
214  output_to_use = &_permuted_output;
215  _memory_group.manage(&_permuted_output);
216  }
217  _extract_output_func.configure(&_reshaped_output, output_to_use, Coordinates(start_left, start_top), Coordinates(end_right, end_botton));
218  _reshaped_output.allocator()->allocate();
219  _itransformed_output.allocator()->allocate();
220 
221  // Add bias
222  if(biases != nullptr)
223  {
224  output_to_use = output;
225  if(_needs_permute)
226  {
227  output_to_use = &_permuted_output;
228  _memory_group.manage(&_permuted_output);
229  }
230  auto_init_if_empty(*output_to_use->info(), *_bias_output.info());
231  _bias_add_func.configure(&_bias_output, &_permuted_bias, output_to_use, ConvertPolicy::WRAP);
232  _bias_output.allocator()->allocate();
233  }
234 
235  // Permute output
236  if(_needs_permute)
237  {
238  // Configure the function to transform the convoluted output to ACL's native ordering format NCHW
239  _permuted_output.info()->set_data_layout(DataLayout::NCHW);
240  _permute_output_func.configure(&_permuted_output, output, PermutationVector(2U, 0U, 1U));
241 
242  // Allocate tensors
243  _permuted_output.allocator()->allocate();
244  }
245 
246  // Configure Activation Layer
247  _is_activationlayer_enabled = act_info.enabled();
248  if(_is_activationlayer_enabled)
249  {
250  _activation_layer_func.configure(output, nullptr, act_info);
251  }
252 
253  // Setup flip axis data
254  _flip_axis.allocator()->allocate();
255 
256  auto axis_data = reinterpret_cast<uint32_t *>(_flip_axis.buffer());
257  axis_data[0] = 0;
258  axis_data[1] = 1;
259 }

◆ operator=() [1/2]

NEFFTConvolutionLayer& operator= ( const NEFFTConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEFFTConvolutionLayer& operator= ( NEFFTConvolutionLayer &&  )
delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.
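
A short sketch, reusing the hypothetical fft_conv, src, weights, biases and dst from the earlier example, of calling prepare() explicitly so the one-off weight permute/flip/pad/FFT chain runs before the first latency-critical inference:

fft_conv.configure(&src, &weights, &biases, &dst, PadStrideInfo(1, 1, 2, 2));
// ... allocate tensors and fill weights/biases ...
fft_conv.prepare(); // one-off: transforms the weights to the frequency domain and frees the intermediates
fft_conv.run();     // subsequent runs reuse the already-transformed weights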

Definition at line 348 of file NEFFTConvolutionLayer.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON, TensorAllocator::free(), ITensor::is_used(), ITensor::mark_as_unused(), INESimpleFunctionNoBorder::run(), NEPermute::run(), and NEPadLayer::run().

Referenced by NEFFTConvolutionLayer::run().

349 {
350  if(!_is_prepared)
351  {
352  // Permute bias to NCHW
353  if(_original_bias != nullptr)
354  {
355  _permuted_bias.allocator()->allocate();
356  _permute_bias_func.run();
357  _original_bias->mark_as_unused();
358  }
359 
360  const ITensor *cur_weights = _original_weights;
361 
362  // Permute weights
363  if(_needs_permute)
364  {
365  ARM_COMPUTE_ERROR_ON(!cur_weights->is_used());
366 
367  _permuted_weights.allocator()->allocate();
368  _permute_weights_func.run();
369  cur_weights->mark_as_unused();
370  cur_weights = &_permuted_weights;
371  }
372 
373  // Flip weights
374  _flipped_weights.allocator()->allocate();
375  _flip_weights_func.run();
376  cur_weights->mark_as_unused();
377 
378  // Pad weights
379  _padded_weights.allocator()->allocate();
380  _pad_weights_func.run();
381  _flipped_weights.mark_as_unused();
382  _flipped_weights.allocator()->free();
383 
384  // Transform weights to frequency domain
385  _transformed_weights.allocator()->allocate();
386  _transform_weights_func->run();
387  _transform_weights_func.reset();
388 
389  _padded_weights.mark_as_unused();
390  _padded_weights.allocator()->free();
391 
392  _is_prepared = true;
393  }
394 }

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it hasn't already been done
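
A small sketch, assuming the Neon backend and the hypothetical fft_conv object from the earlier example, of capping the number of worker threads before running:

CPPScheduler::get().set_num_threads(4); // or Scheduler::get().set_num_threads(4)
fft_conv.run();                         // parallelisable kernels are split across at most 4 threads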

Implements IFunction.

Definition at line 307 of file NEFFTConvolutionLayer.cpp.

References Tensor::allocator(), Tensor::buffer(), TensorAllocator::import_memory(), NEFFTConvolutionLayer::prepare(), NEFFT2D::run(), NEPermute::run(), NEReductionOperation::run(), NEActivationLayer::run(), NEArithmeticAddition::run(), NEPadLayer::run(), NESlice::run(), and NEComplexPixelWiseMultiplication::run().

308 {
309  prepare();
310 
311  MemoryGroupResourceScope scope_mg(_memory_group);
312 
313  // Transform input
314  if(_needs_permute)
315  {
316  _permute_input_func.run();
317  }
318  _pad_input_func.run();
319  _transform_input_func.run();
320 
321  // Perform operations to frequency domain
322  _prod_func.run();
323 
324  _reduce_func.run();
325 
326  // Transform output
327  _itransform_output_func.run();
328  _reshaped_output.allocator()->import_memory(_itransformed_output.buffer());
329  _extract_output_func.run();
330 
331  // Add bias
332  if(_has_bias)
333  {
334  _bias_add_func.run();
335  }
336  if(_needs_permute)
337  {
338  _permute_output_func.run();
339  }
340 
341  // Run activation layer
342  if(_is_activationlayer_enabled)
343  {
344  _activation_layer_func.run();
345  }
346 }

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool  enable_fast_math = false 
)
static

Static function to check if given info will lead to a valid configuration of NEFFTConvolutionLayer.

Note
This function only works with square kernels and unit strides, for both NCHW and NHWC data layouts.
Parameters
[in]  input             Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F32.
[in]  weights           Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in]  biases            Biases tensor. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[in]  output            Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info         Contains padding and stride information described in PadStrideInfo.
[in]  act_info          (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. Unused for the Neon backend.
Returns
a status
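
A hedged sketch, using hypothetical shapes matching the earlier example, of checking validity from ITensorInfo metadata before paying the cost of configure():

const TensorInfo src_info(TensorShape(64U, 64U, 3U), 1, DataType::F32);
const TensorInfo wei_info(TensorShape(5U, 5U, 3U, 8U), 1, DataType::F32);
const TensorInfo dst_info(TensorShape(64U, 64U, 8U), 1, DataType::F32);

const Status st = NEFFTConvolutionLayer::validate(&src_info, &wei_info, nullptr, &dst_info,
                                                  PadStrideInfo(1, 1, 2, 2));
if(!bool(st))
{
    // e.g. report st.error_description() and fall back to another convolution method
}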

Definition at line 261 of file NEFFTConvolutionLayer.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ON_ERROR, ARM_COMPUTE_UNUSED, arm_compute::CHANNEL, ITensorInfo::data_layout(), ActivationLayerInfo::enabled(), arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), PadStrideInfo::stride(), ITensorInfo::tensor_shape(), ITensorInfo::total_size(), NEActivationLayer::validate(), arm_compute::WIDTH, and Dimensions< T >::x().

Referenced by NEConvolutionLayer::get_convolution_method(), and NEConvolutionLayer::validate().

263 {
264  ARM_COMPUTE_UNUSED(enable_fast_math);
265 
266  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::F32);
267  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, weights);
268 
269  // Get indices for the width and height
270  const size_t idx_width = get_data_layout_dimension_index(input->data_layout(), DataLayoutDimension::WIDTH);
271  const size_t idx_height = get_data_layout_dimension_index(input->data_layout(), DataLayoutDimension::HEIGHT);
272 
273  // Input shape, kernel size and output tile
274  const Size2D kernel_size = Size2D(weights->tensor_shape()[idx_width], weights->tensor_shape()[idx_height]);
275 
276  // Strides
277  const auto strides = conv_info.stride();
278  ARM_COMPUTE_RETURN_ERROR_ON(strides.first != strides.second && strides.first != 1);
279  ARM_COMPUTE_RETURN_ERROR_ON(kernel_size.x() != kernel_size.y());
280  ARM_COMPUTE_RETURN_ERROR_ON(conv_info.pad_left() != (kernel_size.x() / 2) || conv_info.pad_right() != (kernel_size.x() / 2));
281  ARM_COMPUTE_RETURN_ERROR_ON(conv_info.pad_top() != (kernel_size.y() / 2) || conv_info.pad_bottom() != (kernel_size.y() / 2));
282 
283  // Validate biases
284  if(biases != nullptr)
285  {
286  const size_t idx_channels = get_data_layout_dimension_index(input->data_layout(), DataLayoutDimension::CHANNEL);
287  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, biases);
288  ARM_COMPUTE_RETURN_ERROR_ON(input->tensor_shape()[idx_channels] != biases->tensor_shape().x());
289  }
290 
291  // Checks performed when output is configured
292  if((output != nullptr) && (output->total_size() != 0))
293  {
295  ARM_COMPUTE_RETURN_ERROR_ON((input->tensor_shape()[idx_height] != output->tensor_shape()[idx_height]) || (input->tensor_shape()[idx_width] != output->tensor_shape()[idx_width]));
296 
297  // Validate Activation Layer
298  if(act_info.enabled())
299  {
300  ARM_COMPUTE_RETURN_ON_ERROR(NEActivationLayer::validate(output, nullptr, act_info));
301  }
302  }
303 
304  return Status{};
305 }

The documentation for this class was generated from the following files:

  NEFFTConvolutionLayer.h
  NEFFTConvolutionLayer.cpp