Compute Library
 21.02
CLReduceMean Class Reference

Basic function to perform a reduce mean operation. More...

#include <CLReduceMean.h>

Collaboration diagram for CLReduceMean:

Public Member Functions

 CLReduceMean (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
void configure (ICLTensor *input, const Coordinates &reduction_axis, bool keep_dims, ICLTensor *output)
 Configure kernel. More...
 
void configure (const CLCompileContext &compile_context, ICLTensor *input, const Coordinates &reduction_axis, bool keep_dims, ICLTensor *output)
 Configure kernel. More...
 
void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const Coordinates &reduction_axis, bool keep_dims, const ITensorInfo *output)
 Static function to check if given info will lead to a valid configuration of CLReduceMean. More...
 

Detailed Description

Basic function to perform a reduce mean operation.

Definition at line 41 of file CLReduceMean.h.
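A reduce mean averages the input values along the given reduction axes. As a plain-C++ sketch of those semantics (this is not the Compute Library API; `reduce_mean_axis1` is an illustrative name), here is the operation for a row-major 2-D tensor reduced along axis 1:

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch (not the Compute Library API): average a row-major
// rows x cols tensor along axis 1. With keep_dims=true the result shape
// would be {rows, 1}; without it, {rows}. The values are identical.
std::vector<float> reduce_mean_axis1(const std::vector<float>& data,
                                     std::size_t rows, std::size_t cols)
{
    std::vector<float> out(rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r)
    {
        float sum = 0.0f;
        for (std::size_t c = 0; c < cols; ++c)
        {
            sum += data[r * cols + c];
        }
        out[r] = sum / static_cast<float>(cols); // MEAN_SUM: sum, then divide
    }
    return out;
}
```

For example, reducing a 2x3 tensor {1,2,3,4,5,6} along axis 1 yields {2, 5}.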

Constructor & Destructor Documentation

◆ CLReduceMean()

CLReduceMean ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default constructor.

Definition at line 101 of file CLReduceMean.cpp.

102  : _memory_group(std::move(memory_manager)), _reduction_kernels(), _reduced_outs(), _reshape(), _dequant(), _requant(), _reduction_ops(), _keep_dims(), _do_requant(), _input_no_quant(),
103  _output_no_quant()
104 {
105 }

Member Function Documentation

◆ configure() [1/2]

void configure ( ICLTensor * input,
const Coordinates & reduction_axis,
bool  keep_dims,
ICLTensor * output 
)

Configure kernel.

Note
Supported tensor rank: up to 4
Parameters
 [in]  input           Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32
 [in]  reduction_axis  Reduction axis vector.
 [in]  keep_dims       If true, retains reduced dimensions with length 1.
 [out] output          Destination tensor. Data type supported: Same as input

Definition at line 107 of file CLReduceMean.cpp.

References CLKernelLibrary::get().

108 {
109  configure(CLKernelLibrary::get().get_compile_context(), input, reduction_axis, keep_dims, output);
110 }

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
ICLTensor * input,
const Coordinates & reduction_axis,
bool  keep_dims,
ICLTensor * output 
)

Configure kernel.

Note
Supported tensor rank: up to 4
Parameters
 [in]  compile_context The compile context to be used.
 [in]  input           Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32
 [in]  reduction_axis  Reduction axis vector.
 [in]  keep_dims       If true, retains reduced dimensions with length 1.
 [out] output          Destination tensor. Data type supported: Same as input

Definition at line 112 of file CLReduceMean.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), Dimensions< T >::begin(), arm_compute::misc::shape_calculator::calculate_reduce_mean_shape(), ICloneable< T >::clone(), CLDequantizationLayer::configure(), CLQuantizationLayer::configure(), CLReshapeLayer::configure(), arm_compute::convert_negative_axis(), ITensorInfo::data_type(), arm_compute::F32, ITensor::info(), CLTensor::info(), arm_compute::test::validation::info, arm_compute::test::validation::input, arm_compute::is_data_type_quantized(), MemoryGroup::manage(), arm_compute::MEAN_SUM, ITensorInfo::num_channels(), Dimensions< T >::num_dimensions(), ITensorInfo::num_dimensions(), arm_compute::test::validation::output_shape, ITensorInfo::quantization_info(), TensorShape::remove_dimension(), TensorShape::set(), TensorInfo::set_data_type(), ITensorInfo::tensor_shape(), and CLReduceMean::validate().

113 {
114  // Perform validate step
115  ARM_COMPUTE_ERROR_THROW_ON(CLReduceMean::validate(input->info(), reduction_axis, keep_dims, output->info()));
116  // Output auto initialization if not yet initialized
117  const TensorShape output_shape = arm_compute::misc::shape_calculator::calculate_reduce_mean_shape(input->info(), reduction_axis, keep_dims);
118  auto_init_if_empty(*output->info(), input->info()->clone()->set_tensor_shape(output_shape));
119 
120  _do_requant = is_data_type_quantized(input->info()->data_type()) && input->info()->quantization_info() != output->info()->quantization_info();
121  _reduction_ops = reduction_axis.num_dimensions();
122  _reduction_kernels.resize(_reduction_ops);
123  _reduced_outs.resize(_reduction_ops - (keep_dims ? 1 : 0));
124  _keep_dims = keep_dims;
125 
126  ICLTensor *tmp_input = input;
127  ICLTensor *tmp_output = output;
128  if(_do_requant)
129  {
130  _memory_group.manage(&_input_no_quant);
131  _memory_group.manage(&_output_no_quant);
132  TensorInfo output_no_quant_info = input->info()->clone()->set_tensor_shape(output_shape);
133  output_no_quant_info.set_data_type(DataType::F32);
134  auto_init_if_empty(*_output_no_quant.info(), output_no_quant_info);
135  auto_init_if_empty(*_input_no_quant.info(), input->info()->clone()->set_data_type(DataType::F32));
136  _dequant.configure(compile_context, input, &_input_no_quant);
137  tmp_input = &_input_no_quant;
138  tmp_output = &_output_no_quant;
139  }
140 
141  Coordinates axis_local = reduction_axis;
142  const int input_dims = tmp_input->info()->num_dimensions();
143 
144  convert_negative_axis(axis_local, input_dims);
145 
146  // Perform reduction for every axis
147  for(int i = 0; i < _reduction_ops; ++i)
148  {
149  TensorShape out_shape = i == 0 ? tmp_input->info()->tensor_shape() : (&_reduced_outs[i - 1])->info()->tensor_shape();
150  out_shape.set(axis_local[i], 1);
151  auto in = (i == 0) ? tmp_input : (&_reduced_outs[i - 1]);
152 
153  if(i == _reduction_ops - 1 && keep_dims)
154  {
155  _reduction_kernels[i].configure(compile_context, in, tmp_output, axis_local[i], ReductionOperation::MEAN_SUM);
156  }
157  else
158  {
159  _reduced_outs[i].allocator()->init(TensorInfo(out_shape, tmp_input->info()->num_channels(), tmp_input->info()->data_type(), tmp_input->info()->quantization_info()));
160  _memory_group.manage(&_reduced_outs[i]);
161  _reduction_kernels[i].configure(compile_context, in, &_reduced_outs[i], axis_local[i], ReductionOperation::MEAN_SUM);
162  }
163  }
164 
165  // Allocate intermediate tensors
166  for(int i = 0; i < _reduction_ops - (keep_dims ? 1 : 0); ++i)
167  {
168  _reduced_outs[i].allocator()->allocate();
169  }
170 
171  // Configure reshape layer if we want to drop the dimensions
172  if(!_keep_dims)
173  {
174  TensorShape out_shape = tmp_input->info()->tensor_shape();
175 
176  // We have to sort the reduction axis vectors in order for remove_dimension
177  // to work properly
178  std::sort(axis_local.begin(), axis_local.begin() + _reduction_ops);
179  for(int i = 0; i < _reduction_ops; ++i)
180  {
181  out_shape.remove_dimension(axis_local[i] - i);
182  }
183  auto_init_if_empty(*tmp_output->info(), tmp_input->info()->clone()->set_tensor_shape(out_shape));
184  _reshape.configure(compile_context, &_reduced_outs[_reduction_ops - 1], tmp_output);
185  }
186  if(_do_requant)
187  {
188  _requant.configure(compile_context, &_output_no_quant, output);
189  _input_no_quant.allocator()->allocate();
190  _output_no_quant.allocator()->allocate();
191  }
192 }
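The destination shape logic in the listing above can be sketched in plain C++ (illustrative code, not the library's TensorShape class): with keep_dims each reduced axis becomes 1; without it, the axes are first sorted and then erased with a shrinking offset, exactly as the sorted remove_dimension(axis_local[i] - i) loop does.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of the destination-shape derivation in configure(). The axes are
// assumed already non-negative (convert_negative_axis has been applied).
std::vector<int> reduce_mean_shape(std::vector<int> shape,
                                   std::vector<int> axes, bool keep_dims)
{
    if (keep_dims)
    {
        for (int a : axes) shape[a] = 1; // reduced dimensions kept with length 1
        return shape;
    }
    // Sort so that erasing earlier axes shifts later ones left by exactly
    // one position each, compensated by the "- i" offset.
    std::sort(axes.begin(), axes.end());
    for (std::size_t i = 0; i < axes.size(); ++i)
    {
        shape.erase(shape.begin() + (axes[i] - static_cast<int>(i)));
    }
    return shape;
}
```

For a {4, 3, 2} input reduced over axes {0, 2}: keep_dims gives {1, 3, 1}, and dropping the dimensions gives {3}.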

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done

Implements IFunction.

Definition at line 199 of file CLReduceMean.cpp.

References ICLSimpleFunction::run(), and CLReshapeLayer::run().

200 {
201  MemoryGroupResourceScope scope_mg(_memory_group);
202 
203  if(_do_requant)
204  {
205  _dequant.run();
206  }
207  for(auto &kernel : _reduction_kernels)
208  {
209  kernel.run();
210  }
211  if(!_keep_dims)
212  {
213  _reshape.run();
214  }
215  if(_do_requant)
216  {
217  _requant.run();
218  }
219 }
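The requantization path in the listing above (dequantize, reduce, requantize, taken when input and output QuantizationInfo differ) can be sketched in plain C++. This is illustrative only; the scale/offset values and function names are made up, and the library performs these steps as OpenCL kernels:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Simple affine (de)quantization, as used for QASYMM8-style data.
float dequantize(std::uint8_t q, float scale, int offset)
{
    return (static_cast<int>(q) - offset) * scale;
}

std::uint8_t quantize(float v, float scale, int offset)
{
    return static_cast<std::uint8_t>(std::lround(v / scale) + offset);
}

// Mirrors the order run() follows when _do_requant is set:
// _dequant -> reduction kernels -> _requant.
std::uint8_t mean_requantized(const std::vector<std::uint8_t>& in,
                              float in_scale, int in_offset,
                              float out_scale, int out_offset)
{
    float sum = 0.0f;
    for (std::uint8_t q : in)
    {
        sum += dequantize(q, in_scale, in_offset); // _dequant.run()
    }
    const float mean = sum / static_cast<float>(in.size()); // reduction kernels
    return quantize(mean, out_scale, out_offset);           // _requant.run()
}
```

For example, with input scale 0.5/offset 0 and output scale 0.25/offset 0, the values {10, 20, 30} dequantize to {5, 10, 15}, average to 10, and requantize to 40.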

◆ validate()

Status validate ( const ITensorInfo * input,
const Coordinates & reduction_axis,
bool  keep_dims,
const ITensorInfo * output 
)
static

Static function to check if given info will lead to a valid configuration of CLReduceMean.

Parameters
 [in] input           Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32
 [in] reduction_axis  Reduction axis vector.
 [in] keep_dims       If true, retains reduced dimensions with length 1.
 [in] output          Destination tensor. Data type supported: Same as input
Returns
A status

Definition at line 194 of file CLReduceMean.cpp.

Referenced by CLReduceMean::configure(), and arm_compute::test::validation::DATA_TEST_CASE().

195 {
196  return validate_config(input, reduction_axis, keep_dims, output);
197 }
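As a plain-C++ sketch of the kind of checks such a validation step performs (illustrative only; the real validate_config() also checks data types and the inferred output shape, and the exact rules are in CLReduceMean.cpp): the rank must not exceed the supported maximum of 4, and every reduction axis must address an existing dimension, with negative axes counting from the back as in convert_negative_axis:

```cpp
#include <set>
#include <vector>

// Illustrative configuration check, not the library's validate().
// Supported tensor rank is up to 4; axes may be given as negative
// indices, which are converted to [0, rank) before duplicate detection.
bool valid_reduce_mean_config(int rank, const std::vector<int>& axes)
{
    if (rank < 1 || rank > 4)
    {
        return false; // rank outside the supported range
    }
    std::set<int> seen;
    for (int a : axes)
    {
        if (a < -rank || a >= rank)
        {
            return false; // axis addresses a non-existent dimension
        }
        seen.insert(a < 0 ? a + rank : a); // convert negative axis
    }
    return seen.size() == axes.size(); // reject duplicate axes
}
```

For example, rank 3 with axes {0, -1} is accepted (-1 maps to axis 2), while rank 5 or an axis equal to the rank is rejected.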

The documentation for this class was generated from the following files:

CLReduceMean.h
CLReduceMean.cpp