Compute Library
 19.08
CLReduceMean Class Reference

Basic function to perform reduce operation. More...

#include <CLReduceMean.h>


Public Member Functions

 CLReduceMean (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
void configure (ICLTensor *input, const Coordinates &reduction_axis, bool keep_dims, ICLTensor *output)
 Configure kernel. More...
 
void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const Coordinates &reduction_axis, bool keep_dims, const ITensorInfo *output)
 Static function to check if given info will lead to a valid configuration of CLReduceMean. More...
 

Detailed Description

Basic function to perform reduce operation.

Definition at line 39 of file CLReduceMean.h.

Constructor & Destructor Documentation

◆ CLReduceMean()

CLReduceMean (std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Default constructor.

Definition at line 36 of file CLReduceMean.cpp.

    : _memory_group(std::move(memory_manager)), _reduction_kernels(), _reduced_outs(), _reshape(), _reduction_ops(), _keep_dims()
{
}

Member Function Documentation

◆ configure()

void configure (ICLTensor *input,
                const Coordinates &reduction_axis,
                bool keep_dims,
                ICLTensor *output
               )

Configure kernel.

Note
Supported tensor rank: up to 4
Parameters
[in]  input           Source tensor. Data type supported: QASYMM8/F16/F32
[in]  reduction_axis  Reduction axis vector.
[in]  keep_dims       If true, retains reduced dimensions with length 1.
[out] output          Destination tensor. Data type supported: Same as input

Definition at line 40 of file CLReduceMean.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);

    _reduction_ops = reduction_axis.num_dimensions();
    _reduction_kernels.resize(_reduction_ops);
    _reduced_outs.resize(_reduction_ops - (keep_dims ? 1 : 0));
    _keep_dims = keep_dims;

    Coordinates axis_local = reduction_axis;
    const int   input_dims = input->info()->num_dimensions();

    // Convert negative axis
    for(unsigned int i = 0; i < _reduction_ops; ++i)
    {
        axis_local[i] = wrap_around(axis_local[i], input_dims);
    }

    // Perform reduction for every axis
    for(unsigned int i = 0; i < _reduction_ops; ++i)
    {
        TensorShape out_shape = i == 0 ? input->info()->tensor_shape() : (&_reduced_outs[i - 1])->info()->tensor_shape();
        out_shape.set(axis_local[i], 1);
        auto in = (i == 0) ? input : (&_reduced_outs[i - 1]);

        if(i == _reduction_ops - 1 && keep_dims)
        {
            _reduction_kernels[i].configure(in, output, axis_local[i], ReductionOperation::MEAN_SUM);
        }
        else
        {
            _reduced_outs[i].allocator()->init(TensorInfo(out_shape, input->info()->num_channels(), input->info()->data_type(), input->info()->quantization_info()));
            _memory_group.manage(&_reduced_outs[i]);
            _reduction_kernels[i].configure(in, &_reduced_outs[i], axis_local[i], ReductionOperation::MEAN_SUM);
        }
    }

    // Allocate intermediate tensors
    for(unsigned int i = 0; i < _reduction_ops - (keep_dims ? 1 : 0); ++i)
    {
        _reduced_outs[i].allocator()->allocate();
    }

    // Configure reshape layer if we want to drop the dimensions
    if(!keep_dims)
    {
        TensorShape out_shape = input->info()->tensor_shape();

        // We have to sort the reduction axis vectors in order for remove_dimension
        // to work properly
        std::sort(axis_local.begin(), axis_local.begin() + _reduction_ops);
        for(unsigned int i = 0; i < _reduction_ops; ++i)
        {
            out_shape.remove_dimension(axis_local[i] - i);
        }
        auto_init_if_empty(*output->info(), input->info()->clone()->set_tensor_shape(out_shape));
        _reshape.configure(&_reduced_outs[_reduction_ops - 1], output);
    }
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, arm_compute::auto_init_if_empty(), Dimensions< T >::begin(), ICloneable< T >::clone(), CLReshapeLayer::configure(), ITensorInfo::data_type(), ITensor::info(), arm_compute::test::validation::info, MemoryGroupBase< TensorType >::manage(), arm_compute::MEAN_SUM, ITensorInfo::num_channels(), Dimensions< T >::num_dimensions(), ITensorInfo::num_dimensions(), ITensorInfo::quantization_info(), TensorShape::remove_dimension(), TensorShape::set(), ITensorInfo::tensor_shape(), and arm_compute::wrap_around().

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if hasn't been done

Implements IFunction.

Definition at line 144 of file CLReduceMean.cpp.

{
    MemoryGroupResourceScope scope_mg(_memory_group);

    for(unsigned int i = 0; i < _reduction_ops; ++i)
    {
        _reduction_kernels[i].run();
    }

    if(!_keep_dims)
    {
        _reshape.run();
    }
}

References ICLSimpleFunction::run().

◆ validate()

Status validate (const ITensorInfo *input,
                 const Coordinates &reduction_axis,
                 bool keep_dims,
                 const ITensorInfo *output
                )
static

Static function to check if given info will lead to a valid configuration of CLReduceMean.

Parameters
[in] input           Source tensor. Data type supported: QASYMM8/F16/F32
[in] reduction_axis  Reduction axis vector.
[in] keep_dims       If true, retains reduced dimensions with length 1.
[in] output          Destination tensor. Data type supported: Same as input
Returns
A status

Definition at line 100 of file CLReduceMean.cpp.

{
    ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, output);
    ARM_COMPUTE_RETURN_ERROR_ON_F16_UNSUPPORTED(input);
    ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::QASYMM8, DataType::F16, DataType::F32);
    ARM_COMPUTE_RETURN_ERROR_ON(reduction_axis.num_dimensions() > input->num_dimensions());

    TensorShape out_shape = input->tensor_shape();

    Coordinates        axis_sorted   = reduction_axis;
    const unsigned int reduction_ops = reduction_axis.num_dimensions();
    const int          input_dims    = input->num_dimensions();

    // Convert negative axis
    for(unsigned int i = 0; i < reduction_ops; ++i)
    {
        axis_sorted[i] = wrap_around(axis_sorted[i], input_dims);
    }

    std::sort(axis_sorted.begin(), axis_sorted.begin() + reduction_ops);
    for(unsigned int i = 0; i < reduction_ops; ++i)
    {
        ARM_COMPUTE_RETURN_ERROR_ON(axis_sorted[i] > 3);
        ARM_COMPUTE_RETURN_ERROR_ON(static_cast<unsigned int>(axis_sorted[i]) > input->num_dimensions() - 1);
        if(output->total_size() > 0 && keep_dims)
        {
            ARM_COMPUTE_RETURN_ERROR_ON(output->dimension(axis_sorted[i]) != 1);
        }
        if(keep_dims)
        {
            out_shape.set(axis_sorted[i], 1);
        }
        else
        {
            out_shape.remove_dimension(axis_sorted[i] - i);
        }
    }

    const TensorInfo out_info = input->clone()->set_tensor_shape(out_shape);
    ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES(output, &out_info);

    return Status{};
}

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_F16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, Dimensions< T >::begin(), ICloneable< T >::clone(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, Dimensions< T >::num_dimensions(), ITensorInfo::num_dimensions(), arm_compute::QASYMM8, ITensorInfo::tensor_shape(), ITensorInfo::total_size(), and arm_compute::wrap_around().


The documentation for this class was generated from the following files: CLReduceMean.h, CLReduceMean.cpp