Compute Library
 21.02
CLReduceMean Class Reference

Basic function to perform a reduce mean operation. More...

#include <CLReduceMean.h>

Collaboration diagram for CLReduceMean:

Public Member Functions

 CLReduceMean (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
void configure (ICLTensor *input, const Coordinates &reduction_axis, bool keep_dims, ICLTensor *output)
 Configure kernel. More...
 
void configure (const CLCompileContext &compile_context, ICLTensor *input, const Coordinates &reduction_axis, bool keep_dims, ICLTensor *output)
 Configure kernel. More...
 
void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const Coordinates &reduction_axis, bool keep_dims, const ITensorInfo *output)
 Static function to check if given info will lead to a valid configuration of CLReduceMean. More...
 

Detailed Description

Basic function to perform a reduce mean operation.

Definition at line 41 of file CLReduceMean.h.
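A reduce mean averages the input values along the given reduction axes. As a plain-C++ sketch of those semantics (this is not the Compute Library API; `reduce_mean_axis1` is an illustrative name), here is the operation for a row-major 2-D tensor reduced along axis 1:

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch (not the Compute Library API): average a row-major
// rows x cols tensor along axis 1. With keep_dims=true the result shape
// would be {rows, 1}; without it, {rows}. The values are identical.
std::vector<float> reduce_mean_axis1(const std::vector<float>& data,
                                     std::size_t rows, std::size_t cols)
{
    std::vector<float> out(rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r)
    {
        float sum = 0.0f;
        for (std::size_t c = 0; c < cols; ++c)
        {
            sum += data[r * cols + c];
        }
        out[r] = sum / static_cast<float>(cols); // MEAN_SUM: sum, then divide
    }
    return out;
}
```

For example, reducing a 2x3 tensor {1,2,3,4,5,6} along axis 1 yields {2, 5}.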

Constructor & Destructor Documentation

◆ CLReduceMean()

CLReduceMean ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default constructor.

Definition at line 101 of file CLReduceMean.cpp.

102  : _memory_group(std::move(memory_manager)), _reduction_kernels(), _reduced_outs(), _reshape(), _dequant(), _requant(), _reduction_ops(), _keep_dims(), _do_requant(), _input_no_quant(),
103  _output_no_quant()
104 {
105 }

Member Function Documentation

◆ configure() [1/2]

void configure ( ICLTensor * input,
const Coordinates & reduction_axis,
bool  keep_dims,
ICLTensor * output 
)

Configure kernel.

Note
Supported tensor rank: up to 4
Parameters
 [in]  input           Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32
 [in]  reduction_axis  Reduction axis vector.
 [in]  keep_dims       If true, retains reduced dimensions with length 1.
 [out] output          Destination tensor. Data type supported: Same as input

Definition at line 107 of file CLReduceMean.cpp.

References CLKernelLibrary::get().

108 {
109  configure(CLKernelLibrary::get().get_compile_context(), input, reduction_axis, keep_dims, output);
110 }

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
ICLTensor * input,
const Coordinates & reduction_axis,
bool  keep_dims,
ICLTensor * output 
)

Configure kernel.

Note
Supported tensor rank: up to 4
Parameters
 [in]  compile_context The compile context to be used.
 [in]  input           Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32
 [in]  reduction_axis  Reduction axis vector.
 [in]  keep_dims       If true, retains reduced dimensions with length 1.
 [out] output          Destination tensor. Data type supported: Same as input

Definition at line 112 of file CLReduceMean.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), Dimensions< T >::begin(), arm_compute::misc::shape_calculator::calculate_reduce_mean_shape(), ICloneable< T >::clone(), CLDequantizationLayer::configure(), CLQuantizationLayer::configure(), CLReshapeLayer::configure(), arm_compute::convert_negative_axis(), ITensorInfo::data_type(), arm_compute::F32, ITensor::info(), CLTensor::info(), arm_compute::test::validation::info, arm_compute::test::validation::input, arm_compute::is_data_type_quantized(), MemoryGroup::manage(), arm_compute::MEAN_SUM, ITensorInfo::num_channels(), Dimensions< T >::num_dimensions(), ITensorInfo::num_dimensions(), arm_compute::test::validation::output_shape, ITensorInfo::quantization_info(), TensorShape::remove_dimension(), TensorShape::set(), TensorInfo::set_data_type(), ITensorInfo::tensor_shape(), and CLReduceMean::validate().

113 {
114  // Perform validate step
115  ARM_COMPUTE_ERROR_THROW_ON(CLReduceMean::validate(input->info(), reduction_axis, keep_dims, output->info()));
116  // Output auto initialization if not yet initialized
117  const TensorShape output_shape = arm_compute::misc::shape_calculator::calculate_reduce_mean_shape(input->info(), reduction_axis, keep_dims);
118  auto_init_if_empty(*output->info(), input->info()->clone()->set_tensor_shape(output_shape));
119 
120  _do_requant = is_data_type_quantized(input->info()->data_type()) && input->info()->quantization_info() != output->info()->quantization_info();
121  _reduction_ops = reduction_axis.num_dimensions();
122  _reduction_kernels.resize(_reduction_ops);
123  _reduced_outs.resize(_reduction_ops - (keep_dims ? 1 : 0));
124  _keep_dims = keep_dims;
125 
126  ICLTensor *tmp_input = input;
127  ICLTensor *tmp_output = output;
128  if(_do_requant)
129  {
130  _memory_group.manage(&_input_no_quant);
131  _memory_group.manage(&_output_no_quant);
132  TensorInfo output_no_quant_info = input->info()->clone()->set_tensor_shape(output_shape);
133  output_no_quant_info.set_data_type(DataType::F32);
134  auto_init_if_empty(*_output_no_quant.info(), output_no_quant_info);
135  auto_init_if_empty(*_input_no_quant.info(), input->info()->clone()->set_data_type(DataType::F32));
136  _dequant.configure(compile_context, input, &_input_no_quant);
137  tmp_input = &_input_no_quant;
138  tmp_output = &_output_no_quant;
139  }
140 
141  Coordinates axis_local = reduction_axis;
142  const int input_dims = tmp_input->info()->num_dimensions();
143 
144  convert_negative_axis(axis_local, input_dims);
145 
146  // Perform reduction for every axis
147  for(int i = 0; i < _reduction_ops; ++i)
148  {
149  TensorShape out_shape = i == 0 ? tmp_input->info()->tensor_shape() : (&_reduced_outs[i - 1])->info()->tensor_shape();
150  out_shape.set(axis_local[i], 1);
151  auto in = (i == 0) ? tmp_input : (&_reduced_outs[i - 1]);
152 
153  if(i == _reduction_ops - 1 && keep_dims)
154  {
155  _reduction_kernels[i].configure(compile_context, in, tmp_output, axis_local[i], ReductionOperation::MEAN_SUM);
156  }
157  else
158  {
159  _reduced_outs[i].allocator()->init(TensorInfo(out_shape, tmp_input->info()->num_channels(), tmp_input->info()->data_type(), tmp_input->info()->quantization_info()));
160  _memory_group.manage(&_reduced_outs[i]);
161  _reduction_kernels[i].configure(compile_context, in, &_reduced_outs[i], axis_local[i], ReductionOperation::MEAN_SUM);
162  }
163  }
164 
165  // Allocate intermediate tensors
166  for(int i = 0; i < _reduction_ops - (keep_dims ? 1 : 0); ++i)
167  {
168  _reduced_outs[i].allocator()->allocate();
169  }
170 
171  // Configure reshape layer if we want to drop the dimensions
172  if(!_keep_dims)
173  {
174  TensorShape out_shape = tmp_input->info()->tensor_shape();
175 
176  // We have to sort the reduction axis vectors in order for remove_dimension
177  // to work properly
178  std::sort(axis_local.begin(), axis_local.begin() + _reduction_ops);
179  for(int i = 0; i < _reduction_ops; ++i)
180  {
181  out_shape.remove_dimension(axis_local[i] - i);
182  }
183  auto_init_if_empty(*tmp_output->info(), tmp_input->info()->clone()->set_tensor_shape(out_shape));
184  _reshape.configure(compile_context, &_reduced_outs[_reduction_ops - 1], tmp_output);
185  }
186  if(_do_requant)
187  {
188  _requant.configure(compile_context, &_output_no_quant, output);
189  _input_no_quant.allocator()->allocate();
190  _output_no_quant.allocator()->allocate();
191  }
192 }
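The destination shape logic in the listing above can be sketched in plain C++ (illustrative code, not the library's TensorShape class): with keep_dims each reduced axis becomes 1; without it, the axes are first sorted and then erased with a shrinking offset, exactly as the sorted remove_dimension(axis_local[i] - i) loop does.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of the destination-shape derivation in configure(). The axes are
// assumed already non-negative (convert_negative_axis has been applied).
std::vector<int> reduce_mean_shape(std::vector<int> shape,
                                   std::vector<int> axes, bool keep_dims)
{
    if (keep_dims)
    {
        for (int a : axes) shape[a] = 1; // reduced dimensions kept with length 1
        return shape;
    }
    // Sort so that erasing earlier axes shifts later ones left by exactly
    // one position each, compensated by the "- i" offset.
    std::sort(axes.begin(), axes.end());
    for (std::size_t i = 0; i < axes.size(); ++i)
    {
        shape.erase(shape.begin() + (axes[i] - static_cast<int>(i)));
    }
    return shape;
}
```

For a {4, 3, 2} input reduced over axes {0, 2}: keep_dims gives {1, 3, 1}, and dropping the dimensions gives {3}.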

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done

Implements IFunction.

Definition at line 199 of file CLReduceMean.cpp.

References ICLSimpleFunction::run(), and CLReshapeLayer::run().

200 {
201  MemoryGroupResourceScope scope_mg(_memory_group);
202 
203  if(_do_requant)
204  {
205  _dequant.run();
206  }
207  for(auto &kernel : _reduction_kernels)
208  {
209  kernel.run();
210  }
211  if(!_keep_dims)
212  {
213  _reshape.run();
214  }
215  if(_do_requant)
216  {
217  _requant.run();
218  }
219 }
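The requantization path in the listing above (dequantize, reduce, requantize, taken when input and output QuantizationInfo differ) can be sketched in plain C++. This is illustrative only; the scale/offset values and function names are made up, and the library performs these steps as OpenCL kernels:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Simple affine (de)quantization, as used for QASYMM8-style data.
float dequantize(std::uint8_t q, float scale, int offset)
{
    return (static_cast<int>(q) - offset) * scale;
}

std::uint8_t quantize(float v, float scale, int offset)
{
    return static_cast<std::uint8_t>(std::lround(v / scale) + offset);
}

// Mirrors the order run() follows when _do_requant is set:
// _dequant -> reduction kernels -> _requant.
std::uint8_t mean_requantized(const std::vector<std::uint8_t>& in,
                              float in_scale, int in_offset,
                              float out_scale, int out_offset)
{
    float sum = 0.0f;
    for (std::uint8_t q : in)
    {
        sum += dequantize(q, in_scale, in_offset); // _dequant.run()
    }
    const float mean = sum / static_cast<float>(in.size()); // reduction kernels
    return quantize(mean, out_scale, out_offset);           // _requant.run()
}
```

For example, with input scale 0.5/offset 0 and output scale 0.25/offset 0, the values {10, 20, 30} dequantize to {5, 10, 15}, average to 10, and requantize to 40.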

◆ validate()

Status validate ( const ITensorInfo * input,
const Coordinates & reduction_axis,
bool  keep_dims,
const ITensorInfo * output 
)
static

Static function to check if given info will lead to a valid configuration of CLReduceMean.

Parameters
 [in] input           Source tensor. Data type supported: QASYMM8/QASYMM8_SIGNED/F16/F32
 [in] reduction_axis  Reduction axis vector.
 [in] keep_dims       If true, retains reduced dimensions with length 1.
 [in] output          Destination tensor. Data type supported: Same as input
Returns
A status

Definition at line 194 of file CLReduceMean.cpp.

Referenced by CLReduceMean::configure(), and arm_compute::test::validation::DATA_TEST_CASE().

195 {
196  return validate_config(input, reduction_axis, keep_dims, output);
197 }
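As a plain-C++ sketch of the kind of checks such a validation step performs (illustrative only; the real validate_config() also checks data types and the inferred output shape, and the exact rules are in CLReduceMean.cpp): the rank must not exceed the supported maximum of 4, and every reduction axis must address an existing dimension, with negative axes counting from the back as in convert_negative_axis:

```cpp
#include <set>
#include <vector>

// Illustrative configuration check, not the library's validate().
// Supported tensor rank is up to 4; axes may be given as negative
// indices, which are converted to [0, rank) before duplicate detection.
bool valid_reduce_mean_config(int rank, const std::vector<int>& axes)
{
    if (rank < 1 || rank > 4)
    {
        return false; // rank outside the supported range
    }
    std::set<int> seen;
    for (int a : axes)
    {
        if (a < -rank || a >= rank)
        {
            return false; // axis addresses a non-existent dimension
        }
        seen.insert(a < 0 ? a + rank : a); // convert negative axis
    }
    return seen.size() == axes.size(); // reject duplicate axes
}
```

For example, rank 3 with axes {0, -1} is accepted (-1 maps to axis 2), while rank 5 or an axis equal to the rank is rejected.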

The documentation for this class was generated from the following files:

CLReduceMean.h
CLReduceMean.cpp