Compute Library
 21.02
CLLSTMLayer Class Reference

This function performs a single time step in a Long Short-Term Memory (LSTM) layer.

#include <CLLSTMLayer.h>


Public Member Functions

 CLLSTMLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor.
 
 CLLSTMLayer (const CLLSTMLayer &)=delete
 Prevent instances of this class from being copied.
 
CLLSTMLayer & operator= (const CLLSTMLayer &)=delete
 Prevent instances of this class from being copied.
 
 CLLSTMLayer (CLLSTMLayer &&)=delete
 Prevent instances of this class from being moved.
 
CLLSTMLayer & operator= (CLLSTMLayer &&)=delete
 Prevent instances of this class from being moved.
 
 ~CLLSTMLayer ()
 Default destructor.
 
void configure (const ICLTensor *input, const ICLTensor *input_to_forget_weights, const ICLTensor *input_to_cell_weights, const ICLTensor *input_to_output_weights, const ICLTensor *recurrent_to_forget_weights, const ICLTensor *recurrent_to_cell_weights, const ICLTensor *recurrent_to_output_weights, const ICLTensor *forget_gate_bias, const ICLTensor *cell_bias, const ICLTensor *output_gate_bias, const ICLTensor *output_state_in, ICLTensor *cell_state_in, ICLTensor *scratch_buffer, ICLTensor *output_state_out, ICLTensor *cell_state_out, ICLTensor *output, const LSTMParams< ICLTensor > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
 Initialize function's tensors.
 
void configure (const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *input_to_forget_weights, const ICLTensor *input_to_cell_weights, const ICLTensor *input_to_output_weights, const ICLTensor *recurrent_to_forget_weights, const ICLTensor *recurrent_to_cell_weights, const ICLTensor *recurrent_to_output_weights, const ICLTensor *forget_gate_bias, const ICLTensor *cell_bias, const ICLTensor *output_gate_bias, const ICLTensor *output_state_in, ICLTensor *cell_state_in, ICLTensor *scratch_buffer, ICLTensor *output_state_out, ICLTensor *cell_state_out, ICLTensor *output, const LSTMParams< ICLTensor > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
 Initialize function's tensors.
 
void run () override
 Run the kernels contained in the function.
 
void prepare () override
 Prepare the function for execution.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *input_to_forget_weights, const ITensorInfo *input_to_cell_weights, const ITensorInfo *input_to_output_weights, const ITensorInfo *recurrent_to_forget_weights, const ITensorInfo *recurrent_to_cell_weights, const ITensorInfo *recurrent_to_output_weights, const ITensorInfo *forget_gate_bias, const ITensorInfo *cell_bias, const ITensorInfo *output_gate_bias, const ITensorInfo *output_state_in, const ITensorInfo *cell_state_in, const ITensorInfo *scratch_buffer, const ITensorInfo *output_state_out, const ITensorInfo *cell_state_out, const ITensorInfo *output, const LSTMParams< ITensorInfo > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
 Static function to check if given info will lead to a valid configuration of CLLSTMLayer.
 

Detailed Description

This function performs a single time step in a Long Short-Term Memory (LSTM) layer.

Definition at line 55 of file CLLSTMLayer.h.
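
To make "a single time step" concrete, the following is a minimal scalar sketch of the arithmetic for one LSTM unit. It is not the library's implementation: the struct and function names are hypothetical, tanh is assumed for the cell activation (in CLLSTMLayer this is chosen through activation_info), and peephole, CIFG, projection, layer normalization and clipping are all left out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar parameters for one LSTM unit (illustrative names only,
// not Compute Library types).
struct ScalarLstmParams
{
    float w_i, r_i, b_i; // input gate: input weight, recurrent weight, bias
    float w_f, r_f, b_f; // forget gate
    float w_c, r_c, b_c; // cell candidate
    float w_o, r_o, b_o; // output gate
};

// State carried from one time step to the next.
struct ScalarLstmState
{
    float cell;   // cell state
    float output; // output state
};

inline float logistic(float x)
{
    return 1.0f / (1.0f + std::exp(-x));
}

// One time step: consumes the input x and the previous state, returns the next state.
inline ScalarLstmState lstm_step(const ScalarLstmParams &p, float x, const ScalarLstmState &prev)
{
    assert(std::isfinite(x));

    const float input_gate  = logistic(x * p.w_i + prev.output * p.r_i + p.b_i);
    const float forget_gate = logistic(x * p.w_f + prev.output * p.r_f + p.b_f);
    const float candidate   = std::tanh(x * p.w_c + prev.output * p.r_c + p.b_c); // assumed activation
    const float output_gate = logistic(x * p.w_o + prev.output * p.r_o + p.b_o);

    ScalarLstmState next{};
    next.cell   = input_gate * candidate + forget_gate * prev.cell;
    next.output = output_gate * std::tanh(next.cell);
    return next;
}
```

Running a sequence then amounts to calling lstm_step once per element and feeding the returned state back in, which mirrors how output_state_out and cell_state_out of one call become output_state_in and cell_state_in of the next.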

Constructor & Destructor Documentation

◆ CLLSTMLayer() [1/3]

CLLSTMLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default constructor.

Definition at line 51 of file CLLSTMLayer.cpp.

52  : _memory_group(std::move(memory_manager)), _fully_connected_input_gate(), _accum_input_gate1(), _subtract_input_gate(), _pixelwise_mul_input_gate(), _activation_input_gate(),
53  _fully_connected_forget_gate(), _accum_forget_gate1(), _pixelwise_mul_forget_gate(), _activation_forget_gate(), _fully_connected_cell_state(), _gemm_cell_state1(),
54  _transpose_cell_state(std::make_unique<CLTransposeKernel>()), _accum_cell_state1(), _accum_cell_state2(), _pixelwise_mul_cell_state1(), _activation_cell_state(), _cell_clip(),
55  _pixelwise_mul_cell_state2(), _fully_connected_output(), _pixelwise_mul_output_state1(), _accum_output1(), _activation_output(), _activation_output_state(), _pixelwise_mul_output_state2(),
56  _fully_connected_output_state(), _projection_clip(), _copy_cell_state(), _copy_output(), _concat_scratch_buffer(), _concat_inputs_forget_gate(), _concat_weights_forget_gate(),
57  _concat_weights_input_gate(), _concat_weights_output(), _ones_fill(), _mean_std_norm_input_gate(), _pixelwise_mul_input_gate_coeff(), _accum_input_gate_bias(), _mean_std_norm_forget_gate(),
58  _pixelwise_mul_forget_gate_coeff(), _accum_forget_gate_bias(), _mean_std_norm_cell_gate(), _pixelwise_mul_cell_gate_coeff(), _accum_cell_gate_bias(), _mean_std_norm_output_gate(),
59  _pixelwise_mul_output_gate_coeff(), _accum_output_gate_bias(), _input_gate_out1(), _input_gate_out2(), _input_gate_out3(), _input_gate_out4(), _forget_gate_out1(), _forget_gate_out2(),
60  _forget_gate_out3(), _forget_gate_out4(), _forget_gate_out5(), _forget_gate_out6(), _cell_state_out1(), _cell_state_out2(), _cell_state_out3(), _cell_state_out4(), _cell_state_out5(), _output1(),
61  _output2(), _output3(), _output4(), _cell_state_activation(), _output_state1(), _ones(), _input_layer_norm_out1(), _input_layer_norm_out2(), _forget_layer_norm_out1(), _forget_layer_norm_out2(),
62  _cell_layer_norm_out1(), _cell_layer_norm_out2(), _output_layer_norm_out1(), _output_layer_norm_out2(), _run_peephole_opt(false), _run_cifg_opt(false), _perform_cell_clipping(false),
63  _has_projection_weights(false), _perform_projection_clipping(false), _is_prepared(false), _is_layer_norm_lstm(false)
64 {
65 }

◆ CLLSTMLayer() [2/3]

CLLSTMLayer ( const CLLSTMLayer & )
delete

Prevent instances of this class from being copied.

◆ CLLSTMLayer() [3/3]

CLLSTMLayer ( CLLSTMLayer &&  )
delete

Prevent instances of this class from being moved.

◆ ~CLLSTMLayer()

~CLLSTMLayer ( )
default

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( const ICLTensor * input,
const ICLTensor * input_to_forget_weights,
const ICLTensor * input_to_cell_weights,
const ICLTensor * input_to_output_weights,
const ICLTensor * recurrent_to_forget_weights,
const ICLTensor * recurrent_to_cell_weights,
const ICLTensor * recurrent_to_output_weights,
const ICLTensor * forget_gate_bias,
const ICLTensor * cell_bias,
const ICLTensor * output_gate_bias,
const ICLTensor * output_state_in,
ICLTensor * cell_state_in,
ICLTensor * scratch_buffer,
ICLTensor * output_state_out,
ICLTensor * cell_state_out,
ICLTensor * output,
const LSTMParams< ICLTensor > & lstm_params,
const ActivationLayerInfo & activation_info,
float cell_threshold = 0.f,
float projection_threshold = 0.f
)

Initialize function's tensors.

Parameters
[in] input: Source tensor. Input is a 2D tensor with dimensions [input_size, batch_size]. Data types supported: F16/F32.
[in] input_to_forget_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_cell_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_output_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_forget_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_cell_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_output_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] forget_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] cell_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] output_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] output_state_in: 2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in] cell_state_in: 2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out] scratch_buffer: 2D tensor with dimensions [num_units * 4, batch_size] with CIFG or [num_units * 3, batch_size] without CIFG. Data type supported: Same as input.
[out] output_state_out: 2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[out] cell_state_out: 2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out] output: Destination tensor. Output is a 2D tensor with dimensions [output_size, batch_size]. Data types supported: Same as input.
[in] lstm_params: Weights tensors used in peephole optimization:
    input_to_input_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
    recurrent_to_input_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
    cell_to_input_weights: 1D weights tensor with dimensions [num_units]. Can be nullptr. Data type supported: Same as input.
    cell_to_forget_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    cell_to_output_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    input_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    projection_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
    projection_bias: 1D weights tensor with dimensions [output_size]. Data type supported: Same as input.
    input_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    forget_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    cell_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    output_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] activation_info: Contains activation information described in ActivationLayerInfo.
[in] cell_threshold: (Optional) The clipping threshold for the cell state, such that values are bounded within [-cell_threshold, cell_threshold]. If set to 0.0f, clipping is disabled.
[in] projection_threshold: (Optional) The clipping threshold for the output from the projection layer, such that values are bounded within [-projection_threshold, projection_threshold]. If set to 0.0f, clipping is disabled.
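
The shape constraints in the parameter list above can be collected in one place. The sketch below is a standalone helper, not part of Compute Library: LstmShapes and expected_lstm_shapes are hypothetical names, and the shapes simply restate the [dim0, dim1] pairs from the list. The scratch buffer is left out on purpose, because its size depends on whether CIFG is used.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// 2D shape written as [dim0, dim1], matching the notation of the parameter list.
using Shape2D = std::array<std::size_t, 2>;

// Hypothetical container for the expected shapes (not a Compute Library type).
struct LstmShapes
{
    Shape2D     input;                     // [input_size, batch_size]
    Shape2D     input_to_gate_weights;     // [input_size, num_units]
    Shape2D     recurrent_to_gate_weights; // [output_size, num_units]
    std::size_t gate_bias;                 // [num_units]
    Shape2D     output_state;              // [output_size, batch_size]
    Shape2D     cell_state;                // [num_units, batch_size]
    Shape2D     output;                    // [output_size, batch_size]
};

// Derive every tensor shape from the four sizes used in the parameter list.
inline LstmShapes expected_lstm_shapes(std::size_t input_size, std::size_t batch_size,
                                       std::size_t num_units, std::size_t output_size)
{
    assert(input_size > 0 && batch_size > 0 && num_units > 0 && output_size > 0);

    LstmShapes s{};
    s.input                     = Shape2D{ input_size, batch_size };
    s.input_to_gate_weights     = Shape2D{ input_size, num_units };
    s.recurrent_to_gate_weights = Shape2D{ output_size, num_units };
    s.gate_bias                 = num_units;
    s.output_state              = Shape2D{ output_size, batch_size };
    s.cell_state                = Shape2D{ num_units, batch_size };
    s.output                    = Shape2D{ output_size, batch_size };
    return s;
}
```

A helper like this can be used to pre-check user-provided dimensions before building tensor infos; the authoritative check remains CLLSTMLayer::validate().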

Definition at line 69 of file CLLSTMLayer.cpp.

References CLKernelLibrary::get().

76 {
78  recurrent_to_output_weights, forget_gate_bias, cell_bias, output_gate_bias, output_state_in, cell_state_in, scratch_buffer, output_state_out, cell_state_out, output, lstm_params, activation_info,
79  cell_threshold, projection_threshold);
80 }

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
const ICLTensor * input,
const ICLTensor * input_to_forget_weights,
const ICLTensor * input_to_cell_weights,
const ICLTensor * input_to_output_weights,
const ICLTensor * recurrent_to_forget_weights,
const ICLTensor * recurrent_to_cell_weights,
const ICLTensor * recurrent_to_output_weights,
const ICLTensor * forget_gate_bias,
const ICLTensor * cell_bias,
const ICLTensor * output_gate_bias,
const ICLTensor * output_state_in,
ICLTensor * cell_state_in,
ICLTensor * scratch_buffer,
ICLTensor * output_state_out,
ICLTensor * cell_state_out,
ICLTensor * output,
const LSTMParams< ICLTensor > & lstm_params,
const ActivationLayerInfo & activation_info,
float cell_threshold = 0.f,
float projection_threshold = 0.f
)

Initialize function's tensors.

Parameters
[in] compile_context: The compile context to be used.
[in] input: Source tensor. Input is a 2D tensor with dimensions [input_size, batch_size]. Data types supported: F16/F32.
[in] input_to_forget_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_cell_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_output_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_forget_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_cell_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_output_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] forget_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] cell_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] output_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] output_state_in: 2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in] cell_state_in: 2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out] scratch_buffer: 2D tensor with dimensions [num_units * 4, batch_size] with CIFG or [num_units * 3, batch_size] without CIFG. Data type supported: Same as input.
[out] output_state_out: 2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[out] cell_state_out: 2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out] output: Destination tensor. Output is a 2D tensor with dimensions [output_size, batch_size]. Data types supported: Same as input.
[in] lstm_params: Weights tensors used in peephole optimization:
    input_to_input_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
    recurrent_to_input_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
    cell_to_input_weights: 1D weights tensor with dimensions [num_units]. Can be nullptr. Data type supported: Same as input.
    cell_to_forget_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    cell_to_output_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    input_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    projection_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
    projection_bias: 1D weights tensor with dimensions [output_size]. Data type supported: Same as input.
    input_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    forget_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    cell_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    output_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] activation_info: Contains activation information described in ActivationLayerInfo.
[in] cell_threshold: (Optional) The clipping threshold for the cell state, such that values are bounded within [-cell_threshold, cell_threshold]. If set to 0.0f, clipping is disabled.
[in] projection_threshold: (Optional) The clipping threshold for the output from the projection layer, such that values are bounded within [-projection_threshold, projection_threshold]. If set to 0.0f, clipping is disabled.

lstm_res = PixelwiseMul(output, Activation(cell_state))

output_state = Clip(lstm_res * projection_weights + projection_bias, projection_threshold), if there is a projection
output_state = lstm_res, otherwise
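
The projection rule for output_state can be sketched in plain C++ for a single batch item. This is an illustration under stated assumptions, not the library's code path: clip_symmetric and project_output_state are hypothetical names, projection_weights is assumed to be stored row-major with one row of num_units values per output element, an empty weights vector stands for "no projection", and the threshold is assumed to be non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Clamp v to [-threshold, threshold]; a threshold of 0 disables clipping.
inline float clip_symmetric(float v, float threshold)
{
    if(threshold == 0.0f)
    {
        return v;
    }
    return std::max(-threshold, std::min(v, threshold));
}

// lstm_res:           num_units values for one batch item
// projection_weights: output_size rows of num_units values (assumed row-major layout);
//                     empty means there is no projection
// projection_bias:    output_size values
inline std::vector<float> project_output_state(const std::vector<float> &lstm_res,
                                               const std::vector<float> &projection_weights,
                                               const std::vector<float> &projection_bias,
                                               float                     projection_threshold)
{
    if(projection_weights.empty())
    {
        return lstm_res; // no projection: output_state is lstm_res unchanged
    }

    const std::size_t num_units   = lstm_res.size();
    const std::size_t output_size = projection_bias.size();
    assert(projection_weights.size() == output_size * num_units);

    std::vector<float> output_state(output_size);
    for(std::size_t o = 0; o < output_size; ++o)
    {
        float acc = projection_bias[o];
        for(std::size_t u = 0; u < num_units; ++u)
        {
            acc += projection_weights[o * num_units + u] * lstm_res[u];
        }
        output_state[o] = clip_symmetric(acc, projection_threshold);
    }
    return output_state;
}
```

With lstm_res = {1, 2}, weights rows {1, 1} and {2, 0}, bias {0.5, -10} and a threshold of 3, the unclipped results 3.5 and -8 are clamped to 3 and -3.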

Definition at line 82 of file CLLSTMLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::utils::info_helpers::build_lstm_params_tensor_info(), arm_compute::misc::shape_calculator::calculate_concatenate_shape(), LSTMParams< T >::cell_layer_norm_weights(), LSTMParams< T >::cell_to_forget_weights(), LSTMParams< T >::cell_to_input_weights(), LSTMParams< T >::cell_to_output_weights(), arm_compute::misc::shape_calculator::compute_transposed_shape(), CLMeanStdDevNormalizationLayer::configure(), CLFill::configure(), CLCopy::configure(), CLActivationLayer::configure(), CLConcatenateLayer::configure(), CLArithmeticAddition::configure(), CLGEMM::configure(), CLFullyConnectedLayer::configure(), CLArithmeticSubtraction::configure(), CLPixelWiseMultiplication::configure(), ITensorInfo::data_type(), TensorInfo::data_type(), Window::DimX, LSTMParams< T >::forget_layer_norm_weights(), LSTMParams< T >::has_cifg_opt(), LSTMParams< T >::has_peephole_opt(), LSTMParams< T >::has_projection(), ITensor::info(), CLTensor::info(), ITensorAllocator::init(), LSTMParams< T >::input_gate_bias(), LSTMParams< T >::input_layer_norm_weights(), LSTMParams< T >::input_to_input_weights(), ActivationLayerInfo::LOGISTIC, ActivationLayerInfo::LU_BOUNDED_RELU, MemoryGroup::manage(), LSTMParams< T >::output_layer_norm_weights(), LSTMParams< T >::projection_bias(), LSTMParams< T >::projection_weights(), LSTMParams< T >::recurrent_to_input_weights(), arm_compute::SATURATE, ITensorInfo::tensor_shape(), TensorInfo::tensor_shape(), arm_compute::TO_NEAREST_EVEN, LSTMParams< T >::use_layer_norm(), and CLLSTMLayer::validate().

89 {
94  output_state_in, cell_state_in,
95  scratch_buffer, output_state_out, cell_state_out, output);
96 
97  _is_layer_norm_lstm = lstm_params.use_layer_norm();
98 
99  // Set lstm parameters
100  LSTMParams<ITensorInfo> lstm_params_info{};
101  build_lstm_params_tensor_info(lstm_params, &lstm_params_info);
102 
103  // Validate
107  forget_gate_bias->info(), cell_bias->info(), output_gate_bias->info(),
108  output_state_in->info(), cell_state_in->info(),
109  scratch_buffer->info(), output_state_out->info(), cell_state_out->info(), output->info(),
110  lstm_params_info, activation_info, cell_threshold, projection_threshold));
111 
112  const TensorShape cell_state_shape = cell_state_in->info()->tensor_shape();
113  // Configure block that calculates the forget gate
114  // forget_gate = Activation(input * input_to_forget_weights + output_state_in * recurrent_to_forget_weights + PixelWiseMul(cell_state, cell_to_forget_weights) + forget_gate_bias)
115  // We optimize this as follows:
116  // forget_gate = Activation( (input,output_state_in) * (input_to_forget_weights,recurrent_to_forget_weights) + PixelWiseMul(cell_state, cell_to_forget_weights) + forget_gate_bias)
117  _forget_gate_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
118  _forget_gate_out3.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
119  _forget_gate_out5.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
120 
121  std::vector<const ICLTensor *> inputs_vector;
122  inputs_vector.emplace_back(input);
123  inputs_vector.emplace_back(output_state_in);
124  const TensorShape concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(inputs_vector, 0);
125  _forget_gate_out2.allocator()->init(TensorInfo(concat_shape, 1, input->info()->data_type()));
126 
127  _memory_group.manage(&_forget_gate_out2);
128  _concat_inputs_forget_gate.configure(compile_context, inputs_vector, &_forget_gate_out2, Window::DimX);
129 
130  std::vector<const ICLTensor *> weights_vector;
131 
132  weights_vector.emplace_back(input_to_forget_weights);
133  weights_vector.emplace_back(recurrent_to_forget_weights);
134  const TensorShape weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(weights_vector, 0);
135  _forget_gate_out6.allocator()->init(TensorInfo(weights_concat_shape, 1, input->info()->data_type()));
136 
137  _concat_weights_forget_gate.configure(compile_context, weights_vector, &_forget_gate_out6, Window::DimX);
138 
139  _memory_group.manage(&_forget_gate_out5);
140  _fully_connected_forget_gate.configure(compile_context, &_forget_gate_out2, &_forget_gate_out6, (_is_layer_norm_lstm) ? nullptr : forget_gate_bias, &_forget_gate_out5);
141  _memory_group.manage(&_forget_gate_out1);
142  _memory_group.manage(&_forget_gate_out3);
143  _forget_gate_out6.allocator()->allocate();
144 
145  CLTensor *forget_gate_out = &_forget_gate_out5;
146  if(lstm_params.has_peephole_opt())
147  {
148  _forget_gate_out4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
149 
150  _run_peephole_opt = true;
151  _memory_group.manage(&_forget_gate_out4);
152  _pixelwise_mul_forget_gate.configure(compile_context, cell_state_in, lstm_params.cell_to_forget_weights(), &_forget_gate_out4, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
153  _accum_forget_gate1.configure(compile_context, &_forget_gate_out5, &_forget_gate_out4, &_forget_gate_out3, ConvertPolicy::SATURATE);
154  _forget_gate_out4.allocator()->allocate();
155  _forget_gate_out5.allocator()->allocate();
156  forget_gate_out = &_forget_gate_out3;
157  }
158  else
159  {
160  _forget_gate_out3.allocator()->allocate();
161  }
162  if(_is_layer_norm_lstm)
163  {
164  _forget_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
165  _forget_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
166  _memory_group.manage(&_forget_layer_norm_out1);
167  _memory_group.manage(&_forget_layer_norm_out2);
168  _mean_std_norm_forget_gate.configure(compile_context, forget_gate_out);
169  _pixelwise_mul_forget_gate_coeff.configure(compile_context, forget_gate_out, lstm_params.forget_layer_norm_weights(), &_forget_layer_norm_out1, 1, ConvertPolicy::SATURATE,
170  RoundingPolicy::TO_NEAREST_EVEN);
171  // forget_gate_out is going to be reassigned, so allocate the tensor that it was assigned to before
172  forget_gate_out->allocator()->allocate();
173  _accum_forget_gate_bias.configure(compile_context, &_forget_layer_norm_out1, forget_gate_bias, &_forget_layer_norm_out2, ConvertPolicy::SATURATE);
174  _forget_layer_norm_out1.allocator()->allocate();
175  forget_gate_out = &_forget_layer_norm_out2;
176  }
177  _activation_forget_gate.configure(compile_context, forget_gate_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC));
178 
179  // Configure block that calculates the input gate
180  // input_gate = Activation(input * input_to_input_weights + output_state * recurrent_to_input_weights + PixelWiseMul(cell_state, cell_to_input_weights) + input_gate_bias), without CIFG
181  // input_gate = 1 - forget_gate, with CIFG
182  // We optimize this as follows:
183  // input_gate = Activation((input,output_state) * (input_to_input_weights,recurrent_to_input_weights) + PixelWiseMul(cell_state, cell_to_input_weights) + input_gate_bias), without CIFG
184  _input_gate_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
185  CLTensor *input_gate_out = &_input_gate_out1;
186  if(lstm_params.has_cifg_opt())
187  {
188  _memory_group.manage(&_input_gate_out1);
189  _ones.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
190  _ones_fill.configure(compile_context, &_ones, PixelValue(1, _ones.info()->data_type()));
191  _subtract_input_gate.configure(compile_context, &_ones, forget_gate_out, &_input_gate_out1, ConvertPolicy::SATURATE);
192  _ones.allocator()->allocate();
193  _run_cifg_opt = true;
194  }
195  else
196  {
197  _input_gate_out3.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
198  _input_gate_out4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
199 
200  std::vector<const ICLTensor *> lstm_weights;
201  lstm_weights.emplace_back(lstm_params.input_to_input_weights());
202  lstm_weights.emplace_back(lstm_params.recurrent_to_input_weights());
203  TensorShape lstm_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(lstm_weights, 0);
204  _input_gate_out2.allocator()->init(TensorInfo(lstm_weights_concat_shape, 1, input->info()->data_type()));
205 
206  _concat_weights_input_gate.configure(compile_context, lstm_weights, &_input_gate_out2, Window::DimX);
207 
208  _memory_group.manage(&_input_gate_out1);
209 
210  _memory_group.manage(&_input_gate_out3);
211  _fully_connected_input_gate.configure(compile_context, &_forget_gate_out2, &_input_gate_out2, (_is_layer_norm_lstm) ? nullptr : lstm_params.input_gate_bias(), &_input_gate_out3);
212  _input_gate_out2.allocator()->allocate();
213 
214  input_gate_out = &_input_gate_out3;
215  if(_run_peephole_opt)
216  {
217  _memory_group.manage(&_input_gate_out4);
218  _pixelwise_mul_input_gate.configure(compile_context, cell_state_in, lstm_params.cell_to_input_weights(), &_input_gate_out4, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
219  _accum_input_gate1.configure(compile_context, &_input_gate_out3, &_input_gate_out4, &_input_gate_out1, ConvertPolicy::SATURATE);
220  _input_gate_out3.allocator()->allocate();
221  _input_gate_out4.allocator()->allocate();
222  input_gate_out = &_input_gate_out1;
223  }
224  else
225  {
226  _input_gate_out1.allocator()->allocate();
227  }
228 
229  if(_is_layer_norm_lstm)
230  {
231  _input_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
232  _input_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
233  _memory_group.manage(&_input_layer_norm_out1);
234  _memory_group.manage(&_input_layer_norm_out2);
235  _mean_std_norm_input_gate.configure(compile_context, input_gate_out);
236  _pixelwise_mul_input_gate_coeff.configure(compile_context, input_gate_out, lstm_params.input_layer_norm_weights(), &_input_layer_norm_out1, 1, ConvertPolicy::SATURATE,
237  RoundingPolicy::TO_NEAREST_EVEN);
238  // input_gate_out is going to be reassigned, so allocate the tensor that it was assigned to before
239  input_gate_out->allocator()->allocate();
240  _accum_input_gate_bias.configure(compile_context, &_input_layer_norm_out1, lstm_params.input_gate_bias(), &_input_layer_norm_out2, ConvertPolicy::SATURATE);
241  _input_layer_norm_out1.allocator()->allocate();
242  input_gate_out = &_input_layer_norm_out2;
243  }
244  _activation_input_gate.configure(compile_context, input_gate_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC));
245  }
246 
247  // Configure block that calculates the cell state
248  // cell_state = Clip((PixelwiseMul(input_gate, Activation(input * input_to_cell_weights + output_state_in * recurrent_to_cell_weights + cell_bias)) + PixelwiseMul(forget_gate, cell_state)), cell_threshold)
249  TensorShape cell_state1_shape = compute_transposed_shape(*recurrent_to_output_weights->info());
250  _cell_state_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
251  _cell_state_out2.allocator()->init(TensorInfo(cell_state1_shape, 1, input->info()->data_type()));
252  _cell_state_out3.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
253  _cell_state_out4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
254  _cell_state_out5.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
255 
256  _memory_group.manage(&_cell_state_out1);
257  _fully_connected_cell_state.configure(compile_context, input, input_to_cell_weights, (_is_layer_norm_lstm) ? nullptr : cell_bias, &_cell_state_out1);
258  _memory_group.manage(&_cell_state_out2);
259  _transpose_cell_state->configure(compile_context, recurrent_to_cell_weights, &_cell_state_out2);
260  _memory_group.manage(&_cell_state_out3);
261  _gemm_cell_state1.configure(compile_context, output_state_in, &_cell_state_out2, nullptr, &_cell_state_out3, 1.f, 0.f);
262  _cell_state_out2.allocator()->allocate();
263  _memory_group.manage(&_cell_state_out4);
264  _accum_cell_state1.configure(compile_context, &_cell_state_out1, &_cell_state_out3, &_cell_state_out4, ConvertPolicy::SATURATE);
265  CLTensor *cell_state_out_ptr = &_cell_state_out4;
266  if(_is_layer_norm_lstm)
267  {
268  _cell_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
269  _cell_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
270  _memory_group.manage(&_cell_layer_norm_out1);
271  _memory_group.manage(&_cell_layer_norm_out2);
272  _mean_std_norm_cell_gate.configure(compile_context, cell_state_out_ptr);
273  _pixelwise_mul_cell_gate_coeff.configure(compile_context, cell_state_out_ptr, lstm_params.cell_layer_norm_weights(), &_cell_layer_norm_out1, 1, ConvertPolicy::SATURATE,
274  RoundingPolicy::TO_NEAREST_EVEN);
275  // cell_state_out_ptr is going to be reassigned, so allocate the tensor that it was assigned to before
276  cell_state_out_ptr->allocator()->allocate();
277  _accum_cell_gate_bias.configure(compile_context, &_cell_layer_norm_out1, cell_bias, &_cell_layer_norm_out2, ConvertPolicy::SATURATE);
278  _cell_layer_norm_out1.allocator()->allocate();
279  cell_state_out_ptr = &_cell_layer_norm_out2;
280  }
281  _activation_cell_state.configure(compile_context, cell_state_out_ptr, nullptr, activation_info);
282  _memory_group.manage(&_cell_state_out5);
283  _pixelwise_mul_cell_state1.configure(compile_context, cell_state_out_ptr, input_gate_out, &_cell_state_out5, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
284  cell_state_out_ptr->allocator()->allocate();
285  _pixelwise_mul_cell_state2.configure(compile_context, forget_gate_out, cell_state_in, &_cell_state_out3, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
286  _accum_cell_state2.configure(compile_context, &_cell_state_out5, &_cell_state_out3, &_cell_state_out1, ConvertPolicy::SATURATE);
287  _cell_state_out3.allocator()->allocate();
288  _cell_state_out5.allocator()->allocate();
289  // Perform clipping
290  if(cell_threshold != 0.f)
291  {
292  _perform_cell_clipping = true;
293  _cell_clip.configure(compile_context, &_cell_state_out1, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU, -cell_threshold, cell_threshold));
294  }
295 
296  // Configure block that calculates the output
297  // output_state_out = Activation(input * input_to_output_weights + output_state_in * recurrent_to_output_weights + PixelWiseMul(cell_state, cell_to_output_weights) + output_gate_bias)
298  // We optimize this as follows:
299  // output_state_out = Activation( (input,output_state_in) * (input_to_output_weights, recurrent_to_output_weights) + PixelWiseMul(cell_state, cell_to_output_weights) + output_gate_bias)
300  _output1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
301  _output4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
302  std::vector<const ICLTensor *> in_out_weights;
303  in_out_weights.emplace_back(input_to_output_weights);
304  in_out_weights.emplace_back(recurrent_to_output_weights);
305  TensorShape in_out_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(in_out_weights, 0);
306  _output2.allocator()->init(TensorInfo(in_out_weights_concat_shape, 1, input->info()->data_type()));
307 
308  _concat_weights_output.configure(compile_context, in_out_weights, &_output2, Window::DimX);
309 
310  _memory_group.manage(&_output1);
311  _memory_group.manage(&_output4);
312 
313  _fully_connected_output.configure(compile_context, &_forget_gate_out2, &_output2, (_is_layer_norm_lstm) ? nullptr : output_gate_bias, &_output4);
314 
315  _output2.allocator()->allocate();
316  _forget_gate_out2.allocator()->allocate();
317 
318  CLTensor *output_gate_out = &_output4;
319  if(lstm_params.has_peephole_opt())
320  {
321  _output3.allocator()->init(TensorInfo(_cell_state_out1.info()->tensor_shape(), 1, input->info()->data_type()));
322 
323  _memory_group.manage(&_output3);
324  _pixelwise_mul_output_state1.configure(compile_context, &_cell_state_out1, lstm_params.cell_to_output_weights(), &_output3, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
325  _accum_output1.configure(compile_context, &_output4, &_output3, &_output1, ConvertPolicy::SATURATE);
326  _output4.allocator()->allocate();
327  output_gate_out = &_output1;
328 
329  // Allocate intermediate buffers
330  _output3.allocator()->allocate();
331  }
332  else
333  {
334  _output1.allocator()->allocate();
335  }
336  if(_is_layer_norm_lstm)
337  {
338  _output_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
339  _output_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
340  _memory_group.manage(&_output_layer_norm_out1);
341  _memory_group.manage(&_output_layer_norm_out2);
342  _mean_std_norm_output_gate.configure(compile_context, output_gate_out);
343  _pixelwise_mul_output_gate_coeff.configure(compile_context, output_gate_out, lstm_params.output_layer_norm_weights(), &_output_layer_norm_out1, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
345  // output_gate_out is going to be reassigned, so allocate the tensor that it was assigned to before
346  output_gate_out->allocator()->allocate();
347  _accum_output_gate_bias.configure(compile_context, &_output_layer_norm_out1, output_gate_bias, &_output_layer_norm_out2, ConvertPolicy::SATURATE);
348  _output_layer_norm_out1.allocator()->allocate();
349  output_gate_out = &_output_layer_norm_out2;
350  }
351  _activation_output.configure(compile_context, output_gate_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC));
352 
353  // Configure block that calculates the output state
354  /** lstm_res = PixelwiseMul(output, Activation(cell_state))
355  *
356  *                      -- Clip(lstm_res * projection_weights + projection_bias, projection_threshold) , if there is a projection
357  *                     /
358  *  output_state =  --
359  *                     \
360  *                      -- lstm_res , otherwise
361  */
362  ICLTensor *output_state_out_tmp = lstm_params.has_projection() ? &_output_state1 : output_state_out;
363  _cell_state_activation.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
364  _output_state1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
365 
366  _memory_group.manage(&_cell_state_activation);
367  _activation_output_state.configure(compile_context, &_cell_state_out1, &_cell_state_activation, activation_info);
368  _pixelwise_mul_output_state2.configure(compile_context, &_cell_state_activation, output_gate_out, output_state_out_tmp, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
369  _cell_state_activation.allocator()->allocate();
370 
371  if(lstm_params.has_projection())
372  {
373  _has_projection_weights = true;
374  _fully_connected_output_state.configure(compile_context, output_state_out_tmp, lstm_params.projection_weights(), lstm_params.projection_bias(), output_state_out);
375  _output_state1.allocator()->allocate();
376  // Perform clipping
377  if(projection_threshold != 0.f)
378  {
379  _perform_projection_clipping = true;
380  _projection_clip.configure(compile_context, output_state_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU, -projection_threshold, projection_threshold));
381  }
382  }
383 
384  // Copy cell state and output
385  _copy_cell_state.configure(compile_context, &_cell_state_out1, cell_state_out);
386  _copy_output.configure(compile_context, output_state_out, output);
387 
388  // Vector for holding the tensors to store in scratch buffer
389  std::vector<const ICLTensor *> scratch_inputs;
390  if(!lstm_params.has_cifg_opt())
391  {
392  scratch_inputs.emplace_back(input_gate_out);
393  }
394  scratch_inputs.emplace_back(&_cell_state_out1);
395  scratch_inputs.emplace_back(forget_gate_out);
396  scratch_inputs.emplace_back(output_gate_out);
397  _concat_scratch_buffer.configure(compile_context, scratch_inputs, scratch_buffer, Window::DimX);
398  input_gate_out->allocator()->allocate();
399  _cell_state_out1.allocator()->allocate();
400  forget_gate_out->allocator()->allocate();
401  output_gate_out->allocator()->allocate();
402 }
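
The comments in the listing describe computing a gate as one product over concatenated operands, `(input, output_state_in) * (input_to_output_weights, recurrent_to_output_weights)`, instead of two separate products that are then summed. The following is a minimal sketch of why the two forms agree. It uses plain C++ containers rather than Compute Library tensors, assumes one weight row per unit, and all helper names (`matvec`, `concat`, `concat_rows`, `gate_separate`, `gate_fused`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>; // one row of weights per output unit

// y[u] = sum over k of w[u][k] * x[k]
Vec matvec(const Mat &w, const Vec &x)
{
    Vec y(w.size(), 0.f);
    for(std::size_t u = 0; u < w.size(); ++u)
    {
        for(std::size_t k = 0; k < x.size(); ++k)
        {
            y[u] += w[u][k] * x[k];
        }
    }
    return y;
}

// Concatenate two vectors end to end.
Vec concat(const Vec &a, const Vec &b)
{
    Vec out(a);
    out.insert(out.end(), b.begin(), b.end());
    return out;
}

// Concatenate two weight matrices along the input dimension, row by row.
Mat concat_rows(const Mat &a, const Mat &b)
{
    Mat out;
    for(std::size_t u = 0; u < a.size(); ++u)
    {
        out.push_back(concat(a[u], b[u]));
    }
    return out;
}

// Two products followed by an addition.
Vec gate_separate(const Mat &w_in, const Mat &w_rec, const Vec &x, const Vec &h)
{
    Vec a = matvec(w_in, x);
    const Vec b = matvec(w_rec, h);
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        a[i] += b[i];
    }
    return a;
}

// One product over the concatenated weights and the concatenated inputs.
Vec gate_fused(const Mat &w_in, const Mat &w_rec, const Vec &x, const Vec &h)
{
    return matvec(concat_rows(w_in, w_rec), concat(x, h));
}
```

Because the weights are constant between time steps, the weight concatenation only has to happen once, while the input concatenation is repeated on every step.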

◆ operator=() [1/2]

CLLSTMLayer& operator= ( const CLLSTMLayer & )
delete

Prevent instances of this class from being copied.

◆ operator=() [2/2]

CLLSTMLayer& operator= ( CLLSTMLayer &&  )
delete

Prevent instances of this class from being moved.

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need the backing memory of all the function's buffers to be available in order to execute.

Reimplemented from IFunction.

Definition at line 735 of file CLLSTMLayer.cpp.

References CLConcatenateLayer::run().

Referenced by CLLSTMLayer::run().

736 {
737  if(!_is_prepared)
738  {
739  _concat_weights_forget_gate.run();
740  if(!_run_cifg_opt)
741  {
742  _concat_weights_input_gate.run();
743  }
744  _concat_weights_output.run();
745  _is_prepared = true;
746  }
747 }
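
The listing above guards its one-off work with the `_is_prepared` flag, so repeated calls are cheap. A minimal sketch of that prepare-once pattern follows, using a hypothetical `ToyFunction` class that is not part of the library; the counters exist only to make the behaviour observable.

```cpp
#include <cassert>

// Hypothetical stand-in for a function object with a one-off prepare stage.
class ToyFunction
{
public:
    void prepare()
    {
        if(!_is_prepared)
        {
            // One-off work goes here, for example concatenating constant weights.
            ++_prepare_work_count;
            _is_prepared = true;
        }
    }

    void run()
    {
        prepare(); // does nothing after the first call
        ++_run_count;
    }

    int prepare_work_count() const { return _prepare_work_count; }
    int run_count() const { return _run_count; }

private:
    bool _is_prepared{ false };
    int  _prepare_work_count{ 0 };
    int  _run_count{ 0 };
};
```

Calling `run()` several times on a `ToyFunction` performs the one-off work once, which mirrors how `run()` can call `prepare()` unconditionally.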

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads.

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function does not wait for the kernels to finish executing; it is the user's responsibility to wait.
Calls prepare() on the first run if it has not been called already.
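
The practical consequence of the note above is that outputs should only be read after waiting on the queue. The sketch below models this with a hypothetical `ToyQueue` and `ToyLayer` written in plain C++; neither is a library type. The toy defers all work until `finish()`, which exaggerates real device behaviour, but the guarantee it illustrates is the same: results are only certain to be available after the wait. In the library itself, the wait is typically done through the scheduler (to the best of my knowledge `CLScheduler::get().sync()`), which is an assumption not stated on this page.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical command queue: enqueue() records work, finish() waits for it.
class ToyQueue
{
public:
    void enqueue(std::function<void()> task)
    {
        _pending.push_back(std::move(task));
    }

    // Returns only once everything enqueued so far has executed.
    void finish()
    {
        for(auto &task : _pending)
        {
            task();
        }
        _pending.clear();
    }

private:
    std::vector<std::function<void()>> _pending;
};

// Hypothetical layer: run() enqueues its work and returns immediately.
struct ToyLayer
{
    ToyQueue &queue;
    int      &output;

    void run()
    {
        queue.enqueue([this] { output = 42; });
    }
};
```

Reading `output` between `run()` and `finish()` in this model yields the old value, which is the situation the note warns about.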

Implements IFunction.

Definition at line 635 of file CLLSTMLayer.cpp.

References CLScheduler::enqueue(), CLScheduler::get(), CLLSTMLayer::prepare(), ICLSimpleFunction::run(), CLFill::run(), CLCopy::run(), CLActivationLayer::run(), CLConcatenateLayer::run(), CLArithmeticAddition::run(), CLGEMM::run(), CLFullyConnectedLayer::run(), CLArithmeticSubtraction::run(), and CLPixelWiseMultiplication::run().

636 {
637  prepare();
638 
639  MemoryGroupResourceScope scope_mg(_memory_group);
640 
641  _concat_inputs_forget_gate.run();
642 
643  _fully_connected_forget_gate.run();
644 
645  if(_run_peephole_opt)
646  {
647  _pixelwise_mul_forget_gate.run();
648  _accum_forget_gate1.run();
649  }
650  if(_is_layer_norm_lstm)
651  {
652  _mean_std_norm_forget_gate.run();
653  _pixelwise_mul_forget_gate_coeff.run();
654  _accum_forget_gate_bias.run();
655  }
656  _activation_forget_gate.run();
657 
658  if(_run_cifg_opt)
659  {
660  _ones_fill.run();
661  _subtract_input_gate.run();
662  }
663  else
664  {
665  _fully_connected_input_gate.run();
666 
667  if(_run_peephole_opt)
668  {
669  _pixelwise_mul_input_gate.run();
670  _accum_input_gate1.run();
671  }
672 
673  if(_is_layer_norm_lstm)
674  {
675  _mean_std_norm_input_gate.run();
676  _pixelwise_mul_input_gate_coeff.run();
677  _accum_input_gate_bias.run();
678  }
679  _activation_input_gate.run();
680  }
681 
682  _fully_connected_cell_state.run();
683  CLScheduler::get().enqueue(*_transpose_cell_state);
684  _gemm_cell_state1.run();
685  _accum_cell_state1.run();
686  if(_is_layer_norm_lstm)
687  {
688  _mean_std_norm_cell_gate.run();
689  _pixelwise_mul_cell_gate_coeff.run();
690  _accum_cell_gate_bias.run();
691  }
692  _activation_cell_state.run();
693  _pixelwise_mul_cell_state1.run();
694  _pixelwise_mul_cell_state2.run();
695  _accum_cell_state2.run();
696 
697  if(_perform_cell_clipping)
698  {
699  _cell_clip.run();
700  }
701 
702  _fully_connected_output.run();
703 
704  if(_run_peephole_opt)
705  {
706  _pixelwise_mul_output_state1.run();
707  _accum_output1.run();
708  }
709  if(_is_layer_norm_lstm)
710  {
711  _mean_std_norm_output_gate.run();
712  _pixelwise_mul_output_gate_coeff.run();
713  _accum_output_gate_bias.run();
714  }
715  _activation_output.run();
716 
717  _activation_output_state.run();
718  _pixelwise_mul_output_state2.run();
719 
720  if(_has_projection_weights)
721  {
722  _fully_connected_output_state.run();
723  if(_perform_projection_clipping)
724  {
725  _projection_clip.run();
726  }
727  }
728 
729  _copy_cell_state.run();
730  _copy_output.run();
731 
732  _concat_scratch_buffer.run();
733 }

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * input_to_forget_weights,
const ITensorInfo * input_to_cell_weights,
const ITensorInfo * input_to_output_weights,
const ITensorInfo * recurrent_to_forget_weights,
const ITensorInfo * recurrent_to_cell_weights,
const ITensorInfo * recurrent_to_output_weights,
const ITensorInfo * forget_gate_bias,
const ITensorInfo * cell_bias,
const ITensorInfo * output_gate_bias,
const ITensorInfo * output_state_in,
const ITensorInfo * cell_state_in,
const ITensorInfo * scratch_buffer,
const ITensorInfo * output_state_out,
const ITensorInfo * cell_state_out,
const ITensorInfo * output,
const LSTMParams< ITensorInfo > & lstm_params,
const ActivationLayerInfo & activation_info,
float  cell_threshold = 0.f,
float  projection_threshold = 0.f
)
static

Static function to check if given info will lead to a valid configuration of CLLSTMLayer.

Parameters
[in] input: Source tensor info. Input is a 2D tensor with dimensions [input_size, batch_size]. Data types supported: F16/F32.
[in] input_to_forget_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_cell_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_output_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_forget_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_cell_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_output_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] forget_gate_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] cell_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] output_gate_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] output_state_in: 2D weights tensor info with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in] cell_state_in: 2D tensor info with dimensions [num_units, batch_size]. Data type supported: Same as input.
[in] scratch_buffer: 2D tensor info with dimensions [num_units * 4, batch_size] without CIFG or [num_units * 3, batch_size] with CIFG. Data type supported: Same as input.
[in] output_state_out: 2D weights tensor info with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in] cell_state_out: 2D tensor info with dimensions [num_units, batch_size]. Data type supported: Same as input.
[in] output: Destination tensor info. Output is a 2D tensor with dimensions [output_size, batch_size]. Data types supported: Same as input.
[in] lstm_params: Weights tensors info used in peephole optimization:
  • input_to_input_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
  • recurrent_to_input_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
  • cell_to_input_weights: 1D weights tensor info with dimensions [num_units]. Can be nullptr. Data type supported: Same as input.
  • cell_to_forget_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • cell_to_output_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • input_gate_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • projection_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
  • projection_bias: 1D weights tensor info with dimensions [output_size]. Data type supported: Same as input.
  • input_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • forget_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • cell_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • output_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] activation_info: Contains activation information described in ActivationLayerInfo.
[in] cell_threshold: (Optional) The clipping threshold for the cell state, such that values are bounded within [-cell_clip, cell_clip]. If set to 0.0f then clipping is disabled.
[in] projection_threshold: (Optional) The clipping threshold for the output from the projection layer, such that values are bounded within [-proj_clip, proj_clip]. If set to 0.0f then clipping is disabled.
Returns
a status
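
The documented meaning of cell_threshold and projection_threshold can be illustrated with a small sketch: values are clamped to a symmetric range, and a threshold of 0 leaves them untouched. The helper `clip_state` below is hypothetical, operates on a plain `std::vector<float>` rather than a tensor, and assumes a non-negative threshold. It models the behaviour described in the parameter list, not the activation kernel the library uses to implement it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Clamp every value to [-threshold, threshold].
// A threshold of 0 disables clipping, as described for cell_threshold
// and projection_threshold. Assumes threshold >= 0.
std::vector<float> clip_state(std::vector<float> values, float threshold)
{
    if(threshold != 0.f)
    {
        for(float &v : values)
        {
            v = std::max(-threshold, std::min(threshold, v));
        }
    }
    return values;
}
```

For example, clipping `{ -3, -0.5, 0, 2 }` with a threshold of 1 gives `{ -1, -0.5, 0, 1 }`, while a threshold of 0 returns the input unchanged.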

Definition at line 404 of file CLLSTMLayer.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::misc::shape_calculator::calculate_concatenate_shape(), LSTMParams< T >::cell_layer_norm_weights(), LSTMParams< T >::cell_to_forget_weights(), LSTMParams< T >::cell_to_input_weights(), LSTMParams< T >::cell_to_output_weights(), arm_compute::misc::shape_calculator::compute_transposed_shape(), ITensorInfo::data_type(), ITensorInfo::dimension(), Window::DimX, arm_compute::F16, arm_compute::F32, LSTMParams< T >::forget_layer_norm_weights(), LSTMParams< T >::has_cifg_opt(), LSTMParams< T >::has_peephole_opt(), LSTMParams< T >::has_projection(), LSTMParams< T >::input_gate_bias(), LSTMParams< T >::input_layer_norm_weights(), LSTMParams< T >::input_to_input_weights(), ActivationLayerInfo::LOGISTIC, ActivationLayerInfo::LU_BOUNDED_RELU, ITensorInfo::num_dimensions(), LSTMParams< T >::output_layer_norm_weights(), LSTMParams< T >::projection_bias(), LSTMParams< T >::projection_weights(), LSTMParams< T >::recurrent_to_input_weights(), arm_compute::SATURATE, arm_compute::TO_NEAREST_EVEN, LSTMParams< T >::use_layer_norm(), CLMeanStdDevNormalizationLayer::validate(), CLCopy::validate(), CLActivationLayer::validate(), CLConcatenateLayer::validate(), CLArithmeticAddition::validate(), CLGEMM::validate(), CLFullyConnectedLayer::validate(), CLArithmeticSubtraction::validate(), and CLPixelWiseMultiplication::validate().

Referenced by CLLSTMLayer::configure(), and arm_compute::test::validation::DATA_TEST_CASE().

411 {
416  output_state_in, cell_state_in,
417  scratch_buffer, output_state_out, cell_state_out, output);
418 
419  // Check data types
425  output_state_in, cell_state_in,
426  scratch_buffer, output_state_out, cell_state_out, output);
427 
428  // Check dimensions
429  ARM_COMPUTE_RETURN_ERROR_ON(input->num_dimensions() > 2);
431  ARM_COMPUTE_RETURN_ERROR_ON(input_to_cell_weights->num_dimensions() > 2);
436  ARM_COMPUTE_RETURN_ERROR_ON(forget_gate_bias->num_dimensions() > 1);
437  ARM_COMPUTE_RETURN_ERROR_ON(cell_bias->num_dimensions() > 1);
438  ARM_COMPUTE_RETURN_ERROR_ON(output_gate_bias->num_dimensions() > 1);
439  ARM_COMPUTE_RETURN_ERROR_ON(output_state_in->num_dimensions() > 2);
440  ARM_COMPUTE_RETURN_ERROR_ON(cell_state_in->num_dimensions() > 2);
441  ARM_COMPUTE_RETURN_ERROR_ON(scratch_buffer->num_dimensions() > 2);
442  ARM_COMPUTE_RETURN_ERROR_ON(output_state_out->num_dimensions() > 2);
443  ARM_COMPUTE_RETURN_ERROR_ON(cell_state_out->num_dimensions() > 2);
444  ARM_COMPUTE_RETURN_ERROR_ON(output->num_dimensions() > 2);
445  ARM_COMPUTE_RETURN_ERROR_ON(cell_bias->dimension(0) * 4 != scratch_buffer->dimension(0)
446  && cell_bias->dimension(0) * 3 != scratch_buffer->dimension(0));
447 
448  const unsigned int num_batches = input->dimension(1);
449  const unsigned int num_cells = input_to_output_weights->dimension(1);
450 
451  if(lstm_params.use_layer_norm())
452  {
453  // If CIFG is used, input layer normalization weights tensor is omitted
454  if(lstm_params.has_cifg_opt())
455  {
456  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_layer_norm_weights() != nullptr);
457  }
458  else
459  {
460  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.input_layer_norm_weights());
461  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_layer_norm_weights()->num_dimensions() > 1);
462  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_layer_norm_weights()->dimension(0) != num_cells);
463  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, lstm_params.input_layer_norm_weights());
464  }
465 
466  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.forget_layer_norm_weights(), lstm_params.cell_layer_norm_weights(), lstm_params.output_layer_norm_weights());
467  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, lstm_params.forget_layer_norm_weights(), lstm_params.cell_layer_norm_weights(), lstm_params.output_layer_norm_weights());
468  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.forget_layer_norm_weights()->num_dimensions() > 1);
469  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_layer_norm_weights()->num_dimensions() > 1);
470  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.output_layer_norm_weights()->num_dimensions() > 1);
471  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.forget_layer_norm_weights()->dimension(0) != num_cells);
472  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_layer_norm_weights()->dimension(0) != num_cells);
473  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.output_layer_norm_weights()->dimension(0) != num_cells);
474  }
475 
476  // Check peephole optimization
477  if(lstm_params.has_peephole_opt())
478  {
479  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.cell_to_output_weights(), lstm_params.cell_to_forget_weights());
480  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_to_forget_weights()->num_dimensions() > 1);
481  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_to_output_weights()->num_dimensions() > 1);
482  }
483 
484  TensorShape units_out_transposed_shape = compute_transposed_shape(*recurrent_to_output_weights);
485  TensorShape num_units_transposed_shape = compute_transposed_shape(*forget_gate_bias);
486  const TensorInfo units_out_transposed_info = TensorInfo(units_out_transposed_shape, 1, input->data_type());
487  const TensorInfo num_units_transposed_info = TensorInfo(num_units_transposed_shape, 1, input->data_type());
488 
489  TensorInfo input_gate = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
490  TensorInfo forget_gate = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
491  TensorInfo output_gate_tmp = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
492  TensorInfo cell_state_tmp = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
493 
494  // Validate forget gate
495  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, input_to_forget_weights, (lstm_params.use_layer_norm()) ? nullptr : forget_gate_bias, &forget_gate));
496 
497  std::vector<const ITensorInfo *> inputs_vector;
498  inputs_vector.emplace_back(input);
499  inputs_vector.emplace_back(output_state_in);
500  const TensorShape concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(inputs_vector, 0);
501  TensorInfo forget_gate_concat = TensorInfo(concat_shape, 1, input->data_type());
502 
503  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(inputs_vector, &forget_gate_concat, Window::DimX));
504 
505  if(lstm_params.has_peephole_opt())
506  {
507  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(cell_state_in, lstm_params.cell_to_forget_weights(), &forget_gate, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
508  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&forget_gate, &forget_gate, &forget_gate, ConvertPolicy::SATURATE));
509  }
510  if(lstm_params.use_layer_norm())
511  {
513  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&forget_gate, lstm_params.forget_layer_norm_weights(), &forget_gate, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
516  }
518 
519  // Validate input gate
520  if(!lstm_params.has_cifg_opt())
521  {
522  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.input_to_input_weights(),
523  lstm_params.recurrent_to_input_weights(),
524  lstm_params.input_gate_bias());
525  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_to_input_weights()->num_dimensions() > 2);
526  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.recurrent_to_input_weights()->num_dimensions() > 2);
527  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_gate_bias()->num_dimensions() > 1);
528 
529  std::vector<const ITensorInfo *> lstm_weights;
530  lstm_weights.emplace_back(lstm_params.input_to_input_weights());
531  lstm_weights.emplace_back(lstm_params.recurrent_to_input_weights());
532  TensorShape lstm_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(lstm_weights, 0);
533  TensorInfo lstm_gate_concat = TensorInfo(lstm_weights_concat_shape, 1, input->data_type());
534  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(lstm_weights, &lstm_gate_concat, Window::DimX));
535 
536  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, lstm_params.input_to_input_weights(), (lstm_params.use_layer_norm()) ? nullptr : lstm_params.input_gate_bias(), &input_gate));
537 
538  if(lstm_params.has_peephole_opt())
539  {
540  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.cell_to_input_weights());
541  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_to_input_weights()->num_dimensions() > 1);
542  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(cell_state_in, lstm_params.cell_to_input_weights(), &input_gate, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
544  }
545 
546  if(lstm_params.use_layer_norm())
547  {
549  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&input_gate, lstm_params.input_layer_norm_weights(), &input_gate, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
550  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&input_gate, lstm_params.input_gate_bias(), &input_gate, ConvertPolicy::SATURATE));
551  }
553  }
554  else
555  {
557  }
558 
559  // Validate cell state
560  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, input_to_cell_weights, (lstm_params.use_layer_norm()) ? nullptr : cell_bias, &cell_state_tmp));
561  ARM_COMPUTE_RETURN_ON_ERROR(CLGEMM::validate(output_state_in, &units_out_transposed_info, nullptr, &cell_state_tmp, 1.f, 0.f, GEMMInfo()));
562  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&cell_state_tmp, &cell_state_tmp, &cell_state_tmp, ConvertPolicy::SATURATE));
563  if(lstm_params.use_layer_norm())
564  {
566  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&cell_state_tmp, lstm_params.cell_layer_norm_weights(), &cell_state_tmp, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
568  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&cell_state_tmp, cell_bias, &cell_state_tmp, ConvertPolicy::SATURATE));
569  }
570  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&cell_state_tmp, nullptr, activation_info));
573  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&cell_state_tmp, &cell_state_tmp, &cell_state_tmp, ConvertPolicy::SATURATE));
574  if(cell_threshold != 0.f)
575  {
577  cell_threshold)));
578  }
579 
580  std::vector<const ITensorInfo *> in_out_weights;
581  in_out_weights.emplace_back(input_to_output_weights);
582  in_out_weights.emplace_back(recurrent_to_output_weights);
583  TensorShape in_out_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(in_out_weights, 0);
584  TensorInfo in_out_gate_concat = TensorInfo(in_out_weights_concat_shape, 1, input->data_type());
585  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(in_out_weights, &in_out_gate_concat, Window::DimX));
586  // Validate output gate tmp
587  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, input_to_output_weights, (lstm_params.use_layer_norm()) ? nullptr : output_gate_bias, &output_gate_tmp));
588 
589  if(lstm_params.has_peephole_opt())
590  {
591  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&cell_state_tmp, lstm_params.cell_to_output_weights(), &output_gate_tmp, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
593  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&output_gate_tmp, &output_gate_tmp, &output_gate_tmp, ConvertPolicy::SATURATE));
594  }
595  if(lstm_params.use_layer_norm())
596  {
598  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&output_gate_tmp, lstm_params.output_layer_norm_weights(), &output_gate_tmp, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
601  }
603 
604  // Validate output state
605  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&cell_state_tmp, &cell_state_tmp, activation_info));
607  if(lstm_params.has_projection())
608  {
609  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(&output_gate_tmp, lstm_params.projection_weights(), lstm_params.projection_bias(), output_state_out));
610  if(projection_threshold != 0.f)
611  {
612  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(output_state_out, output_state_out,
613  ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU, -projection_threshold, projection_threshold)));
614  }
615  }
616 
617  // Validate copy kernel
618  ARM_COMPUTE_RETURN_ON_ERROR(CLCopy::validate(&cell_state_tmp, cell_state_out));
619  ARM_COMPUTE_RETURN_ON_ERROR(CLCopy::validate(output_state_out, output));
620 
621  // Validate scratch concatenation
622  std::vector<const ITensorInfo *> inputs_vector_info_raw;
623  if(!lstm_params.has_cifg_opt())
624  {
625  inputs_vector_info_raw.push_back(&input_gate);
626  }
627  inputs_vector_info_raw.push_back(&cell_state_tmp);
628  inputs_vector_info_raw.push_back(&forget_gate);
629  inputs_vector_info_raw.push_back(&output_gate_tmp);
630 
631  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(inputs_vector_info_raw, scratch_buffer, Window::DimX));
632  return Status{};
633 }
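
The listing above chains many sub-validations and returns as soon as one of them reports an error. The following is a minimal standalone sketch of that early-return pattern, not the library's implementation: `Status`, `SKETCH_RETURN_ON_ERROR`, `validate_gate`, `validate_copy` and `validate_all` are hypothetical names introduced here for illustration and are much simpler than `arm_compute::Status` and `ARM_COMPUTE_RETURN_ON_ERROR`.

```cpp
#include <string>

// Hypothetical stand-in for arm_compute::Status: an empty message means success.
struct Status
{
    std::string error;
    explicit operator bool() const { return error.empty(); }
};

// Hypothetical equivalent of ARM_COMPUTE_RETURN_ON_ERROR:
// evaluate the expression once and return its Status if it reports an error.
#define SKETCH_RETURN_ON_ERROR(expr) \
    do                               \
    {                                \
        const Status s_ = (expr);    \
        if(!s_)                      \
        {                            \
            return s_;               \
        }                            \
    } while(false)

// Illustrative sub-validators; real ones would inspect tensor metadata.
Status validate_gate(bool shapes_match)
{
    return shapes_match ? Status{} : Status{ "gate: shape mismatch" };
}

Status validate_copy(bool same_size)
{
    return same_size ? Status{} : Status{ "copy: size mismatch" };
}

// Mirrors the structure of the listing: each step short-circuits on failure,
// and an empty Status is returned only if every step succeeded.
Status validate_all(bool gate_ok, bool copy_ok)
{
    SKETCH_RETURN_ON_ERROR(validate_gate(gate_ok));
    SKETCH_RETURN_ON_ERROR(validate_copy(copy_ok));
    return Status{};
}
```

With this sketch, `validate_all(false, false)` reports only the gate error, because the copy check is never reached once an earlier step fails.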
static Status validate(const ITensorInfo *input, const ITensorInfo *output, const ActivationLayerInfo &act_info)
Static function to check if given info will lead to a valid configuration of CLActivationLayer.
static Status validate(const ITensorInfo *input, const ITensorInfo *output=nullptr, float epsilon=1e-8f)
Static function to check if given info will lead to a valid configuration of CLMeanStdDevNormalizatio...
#define ARM_COMPUTE_RETURN_ON_ERROR(status)
Checks if a status contains an error and returns it.
Definition: Error.h:204
1 channel, 1 F32 per channel
#define ARM_COMPUTE_RETURN_ERROR_ON(cond)
If the condition is true, an error is returned.
Definition: Error.h:296
1 channel, 1 F16 per channel
TensorShape compute_transposed_shape(const ITensorInfo &input)
Calculate the transposed shape of a tensor.
#define ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(...)
Definition: Validate.h:163
static constexpr size_t DimX
Alias for dimension 0 also known as X dimension.
Definition: Window.h:43
static Status validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
Static function to check if given info will lead to a valid configuration of CLFullyConnectedLayer.
static Status validate(const ITensorInfo *input1, const ITensorInfo *input2, const ITensorInfo *output, ConvertPolicy policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
Static function to check if given info will lead to a valid configuration of opencl::kernels::ClSatur...
static Status validate(const ITensorInfo *input, const ITensorInfo *output, Window *dst_window=nullptr)
Static function to check if given info will lead to a valid configuration of CLCopy.
Definition: CLCopy.cpp:68
static Status validate(const std::vector< const ITensorInfo *> &inputs_vector, const ITensorInfo *output, size_t axis)
Static function to check if given info will lead to a valid configuration of CLConcatenateLayer.
Rounds to nearest value; half rounds to nearest even.
#define ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(...)
Definition: Validate.h:545
#define ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(t, c,...)
Definition: Validate.h:792
static Status validate(const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
Static function to check if given info will lead to a valid configuration of CLGEMM.
Definition: CLGEMM.cpp:727
TensorShape calculate_concatenate_shape(const std::vector< T *> &input, size_t axis)
Calculate the concatenate output shape of the concatenate operation along a single axis...
static Status validate(const ITensorInfo *input1, const ITensorInfo *input2, const ITensorInfo *output, float scale, ConvertPolicy overflow_policy, RoundingPolicy rounding_policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
Static function to check if given info will lead to a valid configuration of CLPixelWiseMultiplicatio...
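
The listing calls `calculate_concatenate_shape(in_out_weights, 0)` to obtain the shape of the two output-gate weight tensors joined along dimension 0. As a rough sketch of what such a computation involves, the function below sums the extents along the chosen axis and requires the other dimension to match. `Shape2D` and `concat_shape` are hypothetical names for this example; the sketch is limited to two dimensions and is not the library's `TensorShape` logic.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D shape: index 0 is the X dimension, index 1 is Y.
using Shape2D = std::array<std::size_t, 2>;

// Sum the extents along `axis`; the other dimension must agree across inputs.
Shape2D concat_shape(const std::vector<Shape2D> &inputs, std::size_t axis)
{
    assert(!inputs.empty() && axis < 2);
    const std::size_t other = 1 - axis;

    Shape2D out = inputs.front();
    for(std::size_t i = 1; i < inputs.size(); ++i)
    {
        assert(inputs[i][other] == out[other]);
        out[axis] += inputs[i][axis];
    }
    return out;
}
```

Assuming, for illustration, weight shapes of `{4, 8}` and `{6, 8}` (X extents 4 and 6, with a shared Y extent of 8), concatenating along axis 0 gives `{10, 8}`.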

The documentation for this class was generated from the following files: