Compute Library
 19.08
NEDirectConvolutionLayer Class Reference

Function to run the direct convolution.

#include <NEDirectConvolutionLayer.h>

Public Member Functions

 NEDirectConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor.
 
void configure (ITensor *input, const ITensor *weights, const ITensor *bias, ITensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo())
 Set the input, weights, biases and output tensors.
 
void run () override
 Run the kernels contained in the function.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 
virtual void prepare ()
 Prepare the function for executing.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *bias, const ITensorInfo *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo())
 Static function to check if given info will lead to a valid configuration of NEDirectConvolutionLayer.
 

Detailed Description

Function to run the direct convolution.

This function calls the following NEON kernels:

  1. NEFillBorderKernel for the input
  2. NEDirectConvolutionLayerOutputStageKernel
  3. NEDirectConvolutionLayerKernel

Definition at line 49 of file NEDirectConvolutionLayer.h.
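
The division of work between the three kernels can be illustrated with a plain C++ sketch. This is not the library's implementation: the Image type and the functions fill_border_zero, direct_convolve and add_bias are hypothetical names that only mirror the role of each kernel, for a single channel, float data, and a kernel that is applied without flipping. The stages appear in the order in which run() schedules them (border fill, convolution, output stage).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type; none of the names below are Compute Library API.
struct Image
{
    std::size_t        width;
    std::size_t        height;
    std::vector<float> data; // row-major, width * height elements
};

// Role of NEFillBorderKernel: copy the input into a larger buffer whose
// border of `pad` pixels on every side is filled with zeros.
Image fill_border_zero(const Image &in, std::size_t pad)
{
    Image out{ in.width + 2 * pad, in.height + 2 * pad, {} };
    out.data.assign(out.width * out.height, 0.0f);
    for(std::size_t y = 0; y < in.height; ++y)
    {
        for(std::size_t x = 0; x < in.width; ++x)
        {
            out.data[(y + pad) * out.width + (x + pad)] = in.data[y * in.width + x];
        }
    }
    return out;
}

// Role of NEDirectConvolutionLayerKernel: slide a k x k kernel over the
// padded input with the given stride and accumulate the products directly.
Image direct_convolve(const Image &padded, const std::vector<float> &kernel, std::size_t k, std::size_t stride)
{
    assert(kernel.size() == k * k);
    assert(stride >= 1 && padded.width >= k && padded.height >= k);
    Image out{ (padded.width - k) / stride + 1, (padded.height - k) / stride + 1, {} };
    out.data.assign(out.width * out.height, 0.0f);
    for(std::size_t oy = 0; oy < out.height; ++oy)
    {
        for(std::size_t ox = 0; ox < out.width; ++ox)
        {
            float acc = 0.0f;
            for(std::size_t ky = 0; ky < k; ++ky)
            {
                for(std::size_t kx = 0; kx < k; ++kx)
                {
                    acc += padded.data[(oy * stride + ky) * padded.width + (ox * stride + kx)] * kernel[ky * k + kx];
                }
            }
            out.data[oy * out.width + ox] = acc;
        }
    }
    return out;
}

// Role of NEDirectConvolutionLayerOutputStageKernel: add the bias of this
// output feature map to every accumulated value.
void add_bias(Image &img, float bias)
{
    for(float &v : img.data)
    {
        v += bias;
    }
}
```

Calling fill_border_zero, direct_convolve and add_bias in sequence reproduces the pipeline for one output feature map. The real kernels operate on multi-channel tensors and vectorised data.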

Constructor & Destructor Documentation

◆ NEDirectConvolutionLayer()

NEDirectConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Constructor.

Definition at line 36 of file NEDirectConvolutionLayer.cpp.

37  : _memory_group(std::move(memory_manager)), _output_stage_kernel(), _conv_kernel(), _input_border_handler(), _activationlayer_function(), _accumulator(), _has_bias(false),
38  _is_activationlayer_enabled(false), _dim_split(Window::DimZ)
39 {
40 }

Member Function Documentation

◆ configure()

void configure ( ITensor input,
const ITensor weights,
const ITensor bias,
ITensor output,
const PadStrideInfo conv_info,
const ActivationLayerInfo act_info = ActivationLayerInfo() 
)

Set the input, weights, biases and output tensors.

Note
DirectConvolution only works in the following configurations:

  • 1x1 convolution with stride_x = 1/2/3, stride_y = 1/2/3, data type = F16/F32
  • 3x3 convolution with stride_x = 1/2/3, stride_y = 1/2/3, data type = F16/F32
  • 5x5 convolution with stride_x = 1/2/3, stride_y = 1/2/3, data type = F32

Parameters
[in,out] input      Input tensor. Data types supported: F16/F32.
[in]     weights    Set of kernels to convolve the input volume. Supported sizes: 1x1, 3x3 and 5x5. The 3rd dimension must be the same as the 3rd dimension of the input volume. Data type supported: Same as input.
[in]     bias       Set of biases. Can be nullptr. Data type supported: Same as input.
[out]    output     Output tensor. The 3rd dimension must be equal to the 4th dimension of the kernels tensor. Data types supported: Same as input.
[in]     conv_info  Contains padding and stride information described in PadStrideInfo.
[in]     act_info   (Optional) Activation layer information in case of a fused activation.
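
The output tensor passed to configure() has to be consistent with the input size, the kernel size and the padding and stride held in conv_info. As a rough guide, the spatial extent of the output along one axis can be estimated with the usual convolution formula. The function conv_output_extent below is a hypothetical helper, not part of the library, and it assumes floor rounding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not a Compute Library function.
// Output extent along one axis of a convolution, assuming floor rounding:
//   out = (in + pad_before + pad_after - kernel) / stride + 1
std::size_t conv_output_extent(std::size_t in, std::size_t kernel, std::size_t stride,
                               std::size_t pad_before, std::size_t pad_after)
{
    assert(stride >= 1);
    const std::size_t padded = in + pad_before + pad_after;
    assert(padded >= kernel); // the kernel must fit inside the padded input
    return (padded - kernel) / stride + 1;
}
```

For instance, a 3x3 kernel with stride 1 and a padding of 1 on each side keeps the spatial size unchanged, while stride 2 roughly halves it.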

Definition at line 42 of file NEDirectConvolutionLayer.cpp.

43 {
45 
46  // Free accumulator
47  if(_accumulator.buffer() != nullptr)
48  {
49  _accumulator.allocator()->free();
50  }
51 
52  _dim_split = input->info()->data_layout() == DataLayout::NCHW ? Window::DimZ : Window::DimY;
53 
54  // Check if bias should be added in the convolution result
55  _has_bias = (bias != nullptr);
56 
57  _conv_kernel.configure(input, weights, output, conv_info);
58  if(_has_bias)
59  {
60  _output_stage_kernel.configure(output, bias);
61  }
62 
63  // Add zero padding XY
64  _input_border_handler.configure(input, _conv_kernel.border_size(), BorderMode::CONSTANT, PixelValue(static_cast<float>(0.f)));
65 
66  //Configure Activation Layer
67  _is_activationlayer_enabled = act_info.enabled();
68  if(_is_activationlayer_enabled)
69  {
70  _activationlayer_function.configure(output, nullptr, act_info);
71  }
72 }

References arm_compute::test::validation::act_info, Tensor::allocator(), ARM_COMPUTE_ERROR_ON, arm_compute::test::validation::bias, NEDirectConvolutionLayerKernel::border_size(), Tensor::buffer(), NEActivationLayer::configure(), NEFillBorderKernel::configure(), NEDirectConvolutionLayerOutputStageKernel::configure(), NEDirectConvolutionLayerKernel::configure(), arm_compute::CONSTANT, arm_compute::test::validation::conv_info, ITensorInfo::data_layout(), Window::DimY, Window::DimZ, TensorAllocator::free(), ITensor::info(), arm_compute::NCHW, arm_compute::UNKNOWN, and arm_compute::test::validation::weights.

Referenced by NEDepthwiseSeparableConvolutionLayer::configure().

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to set the number of threads manually.

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Calls prepare() on the first run if it has not been called already.

Implements IFunction.
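
To give an idea of what scheduling a parallelisable kernel along a dimension means, the sketch below partitions the extent of that dimension into contiguous sub-ranges, one per worker thread. It is only an illustration under assumptions: split_range is a hypothetical helper, not the library's scheduler, and the actual splitting policy may differ. The thread count passed in could come from std::thread::hardware_concurrency().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not the library's scheduler.
// Split [0, extent) into at most num_threads contiguous, near-equal
// half-open sub-ranges [begin, end).
std::vector<std::pair<std::size_t, std::size_t>> split_range(std::size_t extent, std::size_t num_threads)
{
    assert(num_threads >= 1);
    // Never create more chunks than there are elements to process.
    const std::size_t parts = std::min(extent, num_threads);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    ranges.reserve(parts);
    std::size_t begin = 0;
    for(std::size_t i = 0; i < parts; ++i)
    {
        // The first (extent % parts) chunks receive one extra element.
        const std::size_t len = extent / parts + (i < extent % parts ? 1 : 0);
        ranges.emplace_back(begin, begin + len);
        begin += len;
    }
    return ranges;
}
```

Each worker would then process only the slices of the tensor that fall inside its sub-range along the chosen dimension.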

Definition at line 104 of file NEDirectConvolutionLayer.cpp.

105 {
106  NEScheduler::get().schedule(&_input_border_handler, Window::DimZ);
107 
108  MemoryGroupResourceScope scope_mg(_memory_group);
109 
110  NEScheduler::get().schedule(&_conv_kernel, _dim_split);
111  if(_has_bias)
112  {
113  NEScheduler::get().schedule(&_output_stage_kernel, Window::DimY);
114  }
115 
116  if(_is_activationlayer_enabled)
117  {
118  _activationlayer_function.run();
119  }
120 }

References Window::DimY, Window::DimZ, Scheduler::get(), INESimpleFunctionNoBorder::run(), and IScheduler::schedule().

Referenced by NEDepthwiseSeparableConvolutionLayer::run().

◆ validate()

Status validate ( const ITensorInfo input,
const ITensorInfo weights,
const ITensorInfo bias,
const ITensorInfo output,
const PadStrideInfo conv_info,
const ActivationLayerInfo act_info = ActivationLayerInfo() 
)
static

Static function to check if given info will lead to a valid configuration of NEDirectConvolutionLayer.

Note
DirectConvolution only works in the following configurations:

  • 1x1 convolution with stride_x = 1/2/3, stride_y = 1/2/3, data type = F16/F32
  • 3x3 convolution with stride_x = 1/2/3, stride_y = 1/2/3, data type = F16/F32
  • 5x5 convolution with stride_x = 1/2/3, stride_y = 1/2/3, data type = F32

Parameters
[in] input      Input tensor. Data types supported: F16/F32.
[in] weights    Set of kernels to convolve the input volume. Supported sizes: 1x1, 3x3 and 5x5. The 3rd dimension must be the same as the 3rd dimension of the input volume. Data type supported: Same as input.
[in] bias       Set of biases. Can be nullptr. Data type supported: Same as input.
[in] output     Output tensor. The 3rd dimension must be equal to the 4th dimension of the kernels tensor. Data types supported: Same as input.
[in] conv_info  Contains padding and stride information described in PadStrideInfo.
[in] act_info   (Optional) Activation layer information in case of a fused activation.
Returns
a status
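
The configurations listed in the note can be expressed as a small standalone check. The sketch below is an illustration only: ElementType, CheckResult and check_direct_conv_config are hypothetical names, and the function covers just the kernel size, stride and data type combinations from the note, not the tensor shape checks performed by validate().

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration; they are not Compute Library types.
enum class ElementType
{
    F16,
    F32
};

struct CheckResult
{
    bool        ok;
    std::string message; // empty when ok is true
};

// Returns whether a square kernel of the given size, with the given strides
// and element type, is one of the combinations listed in the note.
CheckResult check_direct_conv_config(unsigned kernel_size, unsigned stride_x, unsigned stride_y, ElementType type)
{
    const bool stride_ok = stride_x >= 1 && stride_x <= 3 && stride_y >= 1 && stride_y <= 3;
    if(!stride_ok)
    {
        return { false, "stride_x and stride_y must each be 1, 2 or 3" };
    }
    switch(kernel_size)
    {
        case 1:
        case 3:
            return { true, "" }; // both F16 and F32 are listed
        case 5:
            if(type != ElementType::F32)
            {
                return { false, "5x5 convolution is listed for F32 only" };
            }
            return { true, "" };
        default:
            return { false, "kernel size must be 1x1, 3x3 or 5x5" };
    }
}
```

A caller could run such a check before allocating tensors, in the same spirit as calling validate() before configure().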

Definition at line 74 of file NEDirectConvolutionLayer.cpp.

76 {
78 
79  DataType data_type = output->data_type();
80  TensorInfo accumulator(output->clone()->set_is_resizable(true).reset_padding().set_data_type(data_type));
81 
82  // Validate Convolution kernel
84 
85  if(bias != nullptr)
86  {
88  ARM_COMPUTE_RETURN_ERROR_ON_MSG(bias->dimension(0) != weights->dimension(3),
89  "Biases size and number of input feature maps should match");
90  ARM_COMPUTE_RETURN_ERROR_ON_MSG(bias->num_dimensions() > 1, "Biases should be one dimensional");
91  }
92 
93  // Validate bias kernel
95 
96  if(act_info.enabled())
97  {
99  }
100 
101  return Status{};
102 }

References arm_compute::test::validation::act_info, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::test::validation::bias, ICloneable< T >::clone(), arm_compute::test::validation::conv_info, arm_compute::test::validation::data_type, ITensorInfo::data_type(), NEActivationLayer::validate(), NEDirectConvolutionLayerOutputStageKernel::validate(), NEDirectConvolutionLayerKernel::validate(), and arm_compute::test::validation::weights.

Referenced by NEConvolutionLayer::get_convolution_method(), and NEConvolutionLayer::validate().


The documentation for this class was generated from the following files: