Compute Library
 19.08
CLSoftmaxLayer Class Reference

Basic function to compute a SoftmaxLayer. More...

#include <CLSoftmaxLayer.h>

Collaboration diagram for CLSoftmaxLayer:

Public Member Functions

 CLSoftmaxLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
void configure (const ICLTensor *input, ICLTensor *output, float beta=1.0f, size_t axis=1)
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *output, float beta=1.0f, size_t axis=1)
 Static function to check if given info will lead to a valid configuration of CLSoftmaxLayer. More...
 

Detailed Description

Basic function to compute a SoftmaxLayer.

Softmax is calculated by:

\[ out = exp((x - max(x)) * beta) / sum(exp((x - max(x)) * beta)) \]

This function runs the following kernels:

  1. CLLogits1DMaxKernel
  2. CLLogits1DShiftExpSumKernel
  3. CLLogits1DNormKernel
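The three kernels implement the numerically stable formula above stage by stage. As a rough, self-contained C++ sketch of the same arithmetic on a single row (illustrative only; the actual kernels run on OpenCL):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over one row, mirroring the three kernel stages:
// 1) find the row maximum, 2) shift, exponentiate and accumulate the sum,
// 3) normalize by the sum.
std::vector<float> softmax_row(const std::vector<float> &x, float beta = 1.0f)
{
    // Stage 1: CLLogits1DMaxKernel equivalent
    const float max_val = *std::max_element(x.begin(), x.end());

    // Stage 2: CLLogits1DShiftExpSumKernel equivalent
    std::vector<float> tmp(x.size());
    float sum = 0.0f;
    for(std::size_t i = 0; i < x.size(); ++i)
    {
        tmp[i] = std::exp((x[i] - max_val) * beta);
        sum += tmp[i];
    }

    // Stage 3: CLLogits1DNormKernel equivalent
    for(float &v : tmp)
    {
        v /= sum;
    }
    return tmp;
}
```

Subtracting max(x) before exponentiating does not change the result but keeps the arguments of exp non-positive, avoiding overflow.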

Definition at line 51 of file CLSoftmaxLayer.h.

Constructor & Destructor Documentation

◆ CLSoftmaxLayer()

CLSoftmaxLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Constructor.

Definition at line 38 of file CLSoftmaxLayer.cpp.

    : _memory_group(std::move(memory_manager)), _max_shift_exp_sum_kernel(), _norm_kernel(), _flatten_kernel_ptr(), _reshape_kernel(), _max(), _sum(), _tmp(), _input_flattened(), _output_flattened(),
      _needs_flattening(false)
{
}

Member Function Documentation

◆ configure()

void configure ( const ICLTensor * input,
                 ICLTensor *       output,
                 float             beta = 1.0f,
                 size_t            axis = 1
               )

Set the input and output tensors.

Parameters
[in]  input   Source tensor. Data types supported: QASYMM8/F16/F32
[out] output  Destination tensor. Data types supported: same as input
[in]  beta    (Optional) A scaling factor for the exponent. Defaults to 1.f
[in]  axis    (Optional) Reduction axis. It has the purpose of squashing the first axis dimensions together. For instance, given a [4x4x4x4] image, when axis is 2, the Softmax reduction will be applied on each of the [4x4] planes of the input image.
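For the flattening implied by axis, the shape arithmetic amounts to collapsing the first axis dimensions into the reduction extent and the remaining dimensions into a batch extent. A toy sketch of that arithmetic (the helper name is hypothetical; the library itself uses misc::shape_calculator::compute_softmax_shape()):

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Collapse an N-D shape into 2D for the softmax pipeline: the first `axis`
// dimensions become the reduction extent, the rest become the batch extent.
// Hypothetical helper for illustration only.
std::pair<std::size_t, std::size_t> flatten_for_softmax(const std::vector<std::size_t> &shape, std::size_t axis)
{
    const std::size_t reduce = std::accumulate(shape.begin(), shape.begin() + axis,
                                               std::size_t{1}, std::multiplies<std::size_t>());
    const std::size_t batch  = std::accumulate(shape.begin() + axis, shape.end(),
                                               std::size_t{1}, std::multiplies<std::size_t>());
    return {reduce, batch};
}
```

With a [4x4x4x4] input and axis = 2 this yields a [16 x 16] 2D tensor: sixteen [4x4] planes, each reduced independently, matching the description above.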

Definition at line 73 of file CLSoftmaxLayer.cpp.

{
    // Perform validation step
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);
    ARM_COMPUTE_ERROR_THROW_ON(CLSoftmaxLayer::validate(input->info(), output->info(), beta, axis));

    // Flattening is only unnecessary when the input is 2D and axis is 1
    _needs_flattening = axis != 1;

    // If we are dealing with a 4D tensor, we will:
    // - Flatten the input, so that we end up with a [width*height*depth] x batches 2D tensor
    // - Run the whole pipeline (reduction + normalization) on the flattened tensor
    // - Reshape the flattened output into the real output
    if(_needs_flattening)
    {
        // Add _input_flattened to the memory manager
        _memory_group.manage(&_input_flattened);

        // Configure _flatten_kernel and _input_flattened
        configure_reshape_input_kernel(input, output, axis);
    }

    // We want to deal with a 2D input. Either it is the flattened version of the original input (4D case)
    // or it is the original input itself (2D case)
    const ICLTensor *input_2D = (_needs_flattening ? &_input_flattened : input);

    // Create intermediate tensor shapes
    TensorInfo input_info    = input_2D->info()->clone()->reset_padding().set_is_resizable(true);
    DataType   tmp_data_type = is_data_type_quantized_asymmetric(input_2D->info()->data_type()) ? DataType::S32 : input_2D->info()->data_type();
    TensorInfo tensor_info_tmp(input_info.clone()->set_data_type(tmp_data_type));
    _tmp.allocator()->init(tensor_info_tmp);

    TensorShape max_sum_shape = input_2D->info()->tensor_shape();
    max_sum_shape.set(0, 1);
    _max.allocator()->init(input_info.clone()->set_tensor_shape(max_sum_shape));
    _sum.allocator()->init(input_info.clone()->set_tensor_shape(max_sum_shape).set_data_type(tmp_data_type));

    // Set GPU target to kernels
    _max_shift_exp_sum_kernel.set_target(CLScheduler::get().target());

    // Manage intermediate buffers
    _memory_group.manage(&_tmp);
    _memory_group.manage(&_max);
    _memory_group.manage(&_sum);

    // Configure kernels
    _max_shift_exp_sum_kernel.configure(input_2D, &_max, &_tmp, &_sum, beta);

    if(_needs_flattening)
    {
        // Add _output_flattened to the memory manager
        _memory_group.manage(&_output_flattened);

        // The normalization kernel stores the result in a flat output tensor
        _norm_kernel.configure(&_tmp, &_sum, &_output_flattened, beta);

        // Reshape the flat output into the requested (4D) output
        _reshape_kernel.configure(&_output_flattened, output);

        // Allocate the intermediate flat tensors
        _input_flattened.allocator()->allocate();
        _output_flattened.allocator()->allocate();
    }
    else
    {
        // Softmax 2D case
        _norm_kernel.configure(&_tmp, &_sum, output, beta);
    }

    // Allocate intermediate buffers
    _tmp.allocator()->allocate();
    _max.allocator()->allocate();
    _sum.allocator()->allocate();
}

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::test::validation::axis, ICloneable< T >::clone(), CLReshapeLayerKernel::configure(), CLLogits1DMaxShiftExpSumKernel::configure(), CLLogits1DNormKernel::configure(), ITensorInfo::data_type(), CLScheduler::get(), ITensor::info(), ITensorAllocator::init(), arm_compute::test::validation::input_info, arm_compute::is_data_type_quantized_asymmetric(), MemoryGroupBase< TensorType >::manage(), arm_compute::S32, TensorShape::set(), ICLKernel::set_target(), ITensorInfo::tensor_shape(), and CLSoftmaxLayer::validate().

Referenced by arm_compute::test::validation::DATA_TEST_CASE().

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
prepare() will be called on the first run if it has not been done already.

Implements IFunction.

Definition at line 192 of file CLSoftmaxLayer.cpp.

{
    MemoryGroupResourceScope scope_mg(_memory_group);

    if(_needs_flattening)
    {
        CLScheduler::get().enqueue(*_flatten_kernel_ptr, false);
    }

    CLScheduler::get().enqueue(_max_shift_exp_sum_kernel, false);
    CLScheduler::get().enqueue(_norm_kernel, !_needs_flattening);

    if(_needs_flattening)
    {
        CLScheduler::get().enqueue(_reshape_kernel, true);
    }
}

References CLScheduler::enqueue(), and CLScheduler::get().

◆ validate()

Status validate ( const ITensorInfo * input,
                  const ITensorInfo * output,
                  float               beta = 1.0f,
                  size_t              axis = 1
                )
static

Static function to check if given info will lead to a valid configuration of CLSoftmaxLayer.

Parameters
[in]  input   Source tensor. Data types supported: QASYMM8/F16/F32
[in]  output  Destination tensor. Data types supported: same as input
[in]  beta    (Optional) A scaling factor for the exponent. Defaults to 1.f
[in]  axis    (Optional) Reduction axis. It has the purpose of squashing the first axis dimensions together. For instance, given a [4x4x4x4] image, when axis is 2, the Softmax reduction will be applied on each of the [4x4] planes of the input image.
Returns
a status
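One implementation detail visible in the listing that follows: for quantized QASYMM8 inputs the intermediate exp/sum tensors use an S32 accumulator rather than the input data type. A toy sketch of that type-selection rule (the enum and function names here are made up for illustration):

```cpp
// Simplified stand-in for arm_compute::DataType.
enum class DType { QASYMM8, F16, F32, S32 };

// Quantized asymmetric inputs need a wider accumulator for the exp/sum
// intermediates; floating-point types accumulate in their own type.
DType softmax_tmp_type(DType input)
{
    return (input == DType::QASYMM8) ? DType::S32 : input;
}
```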

Definition at line 148 of file CLSoftmaxLayer.cpp.

{
    ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(input, output);
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(input->num_dimensions() > 4, "Only up to 4 dimensions are supported");
    ARM_COMPUTE_UNUSED(beta);

    // Create intermediate tensor info
    DataType   tmp_data_type = is_data_type_quantized_asymmetric(input->data_type()) ? DataType::S32 : input->data_type();
    TensorInfo tensor_info_tmp(input->clone()->set_data_type(tmp_data_type).set_is_resizable(true));

    TensorShape max_sum_shape = input->tensor_shape();
    max_sum_shape.set(0, 1);
    TensorInfo tensor_info_max(input->clone()->set_tensor_shape(max_sum_shape).set_is_resizable(true));
    TensorInfo tensor_info_sum(input->clone()->set_tensor_shape(max_sum_shape).set_data_type(tmp_data_type).set_quantization_info(QuantizationInfo()).set_is_resizable(true));

    const bool needs_flattening = (axis != 1);

    if(needs_flattening)
    {
        const TensorShape shape_flatten = misc::shape_calculator::compute_softmax_shape(input, axis);
        TensorInfo        tensor_info_flat(input->clone()->set_tensor_shape(shape_flatten).set_is_resizable(true));

        if(axis != 3)
        {
            ARM_COMPUTE_RETURN_ON_ERROR(CLFlattenLayerKernel::validate(input, &tensor_info_flat));
        }
        else
        {
            ARM_COMPUTE_RETURN_ON_ERROR(CLReshapeLayerKernel::validate(input, &tensor_info_flat));
        }
    }

    ARM_COMPUTE_RETURN_ON_ERROR(CLLogits1DMaxShiftExpSumKernel::validate(input, &tensor_info_max, &tensor_info_tmp, &tensor_info_sum));
    ARM_COMPUTE_RETURN_ON_ERROR(CLLogits1DNormKernel::validate(&tensor_info_tmp, &tensor_info_sum, output));

    return Status{};
}

References ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, ARM_COMPUTE_UNUSED, arm_compute::test::validation::axis, ICloneable< T >::clone(), arm_compute::misc::shape_calculator::compute_softmax_shape(), ITensorInfo::data_type(), arm_compute::is_data_type_quantized_asymmetric(), ITensorInfo::num_dimensions(), arm_compute::S32, TensorShape::set(), ITensorInfo::tensor_shape(), CLReshapeLayerKernel::validate(), CLFlattenLayerKernel::validate(), CLLogits1DMaxShiftExpSumKernel::validate(), and CLLogits1DNormKernel::validate().

Referenced by CLSoftmaxLayer::configure().


The documentation for this class was generated from the following files:

  • CLSoftmaxLayer.h
  • CLSoftmaxLayer.cpp