Compute Library
 23.08
CLLSTMLayer Class Reference

This function performs a single time step in a Long Short-Term Memory (LSTM) layer.

#include <CLLSTMLayer.h>

Public Member Functions

 CLLSTMLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor.
 
 CLLSTMLayer (const CLLSTMLayer &)=delete
 Prevent instances of this class from being copied.
 
CLLSTMLayer & operator= (const CLLSTMLayer &)=delete
 Prevent instances of this class from being copied.
 
 CLLSTMLayer (CLLSTMLayer &&)=delete
 Prevent instances of this class from being moved.
 
CLLSTMLayer & operator= (CLLSTMLayer &&)=delete
 Prevent instances of this class from being moved.
 
 ~CLLSTMLayer ()
 Default destructor.
 
void configure (const ICLTensor *input, const ICLTensor *input_to_forget_weights, const ICLTensor *input_to_cell_weights, const ICLTensor *input_to_output_weights, const ICLTensor *recurrent_to_forget_weights, const ICLTensor *recurrent_to_cell_weights, const ICLTensor *recurrent_to_output_weights, const ICLTensor *forget_gate_bias, const ICLTensor *cell_bias, const ICLTensor *output_gate_bias, const ICLTensor *output_state_in, ICLTensor *cell_state_in, ICLTensor *scratch_buffer, ICLTensor *output_state_out, ICLTensor *cell_state_out, ICLTensor *output, const LSTMParams< ICLTensor > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
 Initialize function's tensors.
 
void configure (const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *input_to_forget_weights, const ICLTensor *input_to_cell_weights, const ICLTensor *input_to_output_weights, const ICLTensor *recurrent_to_forget_weights, const ICLTensor *recurrent_to_cell_weights, const ICLTensor *recurrent_to_output_weights, const ICLTensor *forget_gate_bias, const ICLTensor *cell_bias, const ICLTensor *output_gate_bias, const ICLTensor *output_state_in, ICLTensor *cell_state_in, ICLTensor *scratch_buffer, ICLTensor *output_state_out, ICLTensor *cell_state_out, ICLTensor *output, const LSTMParams< ICLTensor > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
 Initialize function's tensors.
 
void run () override
 Run the kernels contained in the function.
 
void prepare () override
 Prepare the function for execution.
 
Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *input_to_forget_weights, const ITensorInfo *input_to_cell_weights, const ITensorInfo *input_to_output_weights, const ITensorInfo *recurrent_to_forget_weights, const ITensorInfo *recurrent_to_cell_weights, const ITensorInfo *recurrent_to_output_weights, const ITensorInfo *forget_gate_bias, const ITensorInfo *cell_bias, const ITensorInfo *output_gate_bias, const ITensorInfo *output_state_in, const ITensorInfo *cell_state_in, const ITensorInfo *scratch_buffer, const ITensorInfo *output_state_out, const ITensorInfo *cell_state_out, const ITensorInfo *output, const LSTMParams< ITensorInfo > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
 Static function to check if given info will lead to a valid configuration of CLLSTMLayer.
 

Detailed Description

This function performs a single time step in a Long Short-Term Memory (LSTM) layer.

Definition at line 61 of file CLLSTMLayer.h.

Constructor & Destructor Documentation

◆ CLLSTMLayer() [1/3]

CLLSTMLayer ( std::shared_ptr< IMemoryManager >  memory_manager = nullptr )

Default constructor.

Definition at line 42 of file CLLSTMLayer.cpp.

43  : _memory_group(std::move(memory_manager)), _fully_connected_input_gate(), _accum_input_gate1(), _subtract_input_gate(), _pixelwise_mul_input_gate(), _activation_input_gate(),
44  _fully_connected_forget_gate(), _accum_forget_gate1(), _pixelwise_mul_forget_gate(), _activation_forget_gate(), _fully_connected_cell_state(), _gemm_cell_state1(),
45  _transpose_cell_state(std::make_unique<opencl::kernels::ClTransposeKernel>()), _accum_cell_state1(), _accum_cell_state2(), _pixelwise_mul_cell_state1(), _activation_cell_state(), _cell_clip(),
46  _pixelwise_mul_cell_state2(), _fully_connected_output(), _pixelwise_mul_output_state1(), _accum_output1(), _activation_output(), _activation_output_state(), _pixelwise_mul_output_state2(),
47  _fully_connected_output_state(), _projection_clip(), _copy_cell_state(), _copy_output(), _concat_scratch_buffer(), _concat_inputs_forget_gate(), _concat_weights_forget_gate(),
48  _concat_weights_input_gate(), _concat_weights_output(), _ones_fill(), _mean_std_norm_input_gate(), _pixelwise_mul_input_gate_coeff(), _accum_input_gate_bias(), _mean_std_norm_forget_gate(),
49  _pixelwise_mul_forget_gate_coeff(), _accum_forget_gate_bias(), _mean_std_norm_cell_gate(), _pixelwise_mul_cell_gate_coeff(), _accum_cell_gate_bias(), _mean_std_norm_output_gate(),
50  _pixelwise_mul_output_gate_coeff(), _accum_output_gate_bias(), _input_gate_out1(), _input_gate_out2(), _input_gate_out3(), _input_gate_out4(), _forget_gate_out1(), _forget_gate_out2(),
51  _forget_gate_out3(), _forget_gate_out4(), _forget_gate_out5(), _forget_gate_out6(), _cell_state_out1(), _cell_state_out2(), _cell_state_out3(), _cell_state_out4(), _cell_state_out5(), _output1(),
52  _output2(), _output3(), _output4(), _cell_state_activation(), _output_state1(), _ones(), _input_layer_norm_out1(), _input_layer_norm_out2(), _forget_layer_norm_out1(), _forget_layer_norm_out2(),
53  _cell_layer_norm_out1(), _cell_layer_norm_out2(), _output_layer_norm_out1(), _output_layer_norm_out2(), _run_peephole_opt(false), _run_cifg_opt(false), _perform_cell_clipping(false),
54  _has_projection_weights(false), _perform_projection_clipping(false), _is_prepared(false), _is_layer_norm_lstm(false)
55 {
56 }

◆ CLLSTMLayer() [2/3]

CLLSTMLayer ( const CLLSTMLayer &  )
delete

Prevent instances of this class from being copied.

◆ CLLSTMLayer() [3/3]

CLLSTMLayer ( CLLSTMLayer &&  )
delete

Prevent instances of this class from being moved.

◆ ~CLLSTMLayer()

~CLLSTMLayer ( )
default

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( const CLCompileContext &  compile_context,
const ICLTensor *  input,
const ICLTensor *  input_to_forget_weights,
const ICLTensor *  input_to_cell_weights,
const ICLTensor *  input_to_output_weights,
const ICLTensor *  recurrent_to_forget_weights,
const ICLTensor *  recurrent_to_cell_weights,
const ICLTensor *  recurrent_to_output_weights,
const ICLTensor *  forget_gate_bias,
const ICLTensor *  cell_bias,
const ICLTensor *  output_gate_bias,
const ICLTensor *  output_state_in,
ICLTensor *  cell_state_in,
ICLTensor *  scratch_buffer,
ICLTensor *  output_state_out,
ICLTensor *  cell_state_out,
ICLTensor *  output,
const LSTMParams< ICLTensor > &  lstm_params,
const ActivationLayerInfo &  activation_info,
float  cell_threshold = 0.f,
float  projection_threshold = 0.f
)

Initialize function's tensors.

Parameters
[in]  compile_context  The compile context to be used.
[in]  input  Source tensor. Input is a 2D tensor with dimensions [input_size, batch_size]. Data types supported: F16/F32.
[in]  input_to_forget_weights  2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in]  input_to_cell_weights  2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in]  input_to_output_weights  2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in]  recurrent_to_forget_weights  2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in]  recurrent_to_cell_weights  2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in]  recurrent_to_output_weights  2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in]  forget_gate_bias  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in]  cell_bias  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in]  output_gate_bias  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in]  output_state_in  2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in]  cell_state_in  2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out]  scratch_buffer  2D tensor with dimensions [num_units * 3, batch_size] with CIFG or [num_units * 4, batch_size] without CIFG. Data type supported: Same as input.
[out]  output_state_out  2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[out]  cell_state_out  2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out]  output  Destination tensor. Output is a 2D tensor with dimensions [output_size, batch_size]. Data types supported: Same as input.
[in]  lstm_params  Weights tensors used in peephole optimization:
    input_to_input_weights  2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
    recurrent_to_input_weights  2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
    cell_to_input_weights  1D weights tensor with dimensions [num_units]. Can be nullptr. Data type supported: Same as input.
    cell_to_forget_weights  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    cell_to_output_weights  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    input_gate_bias  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    projection_weights  2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
    projection_bias  1D weights tensor with dimensions [output_size]. Data type supported: Same as input.
    input_layer_norm_weights  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    forget_layer_norm_weights  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    cell_layer_norm_weights  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
    output_layer_norm_weights  1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in]  activation_info  Contains activation information described in ActivationLayerInfo.
[in]  cell_threshold  (Optional) The clipping threshold for the cell state, such that values are bound within [-cell_clip, cell_clip]. If set to 0.0f then clipping is disabled.
[in]  projection_threshold  (Optional) The clipping threshold for the output from the projection layer, such that values are bound within [-proj_clip, proj_clip]. If set to 0.0f then clipping is disabled.
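
The dimensions listed for the parameters can be collected into a small helper when preparing tensors. The following is a minimal sketch in plain C++, not part of Compute Library: LstmDims, Shape2D and the *_shape functions are hypothetical names introduced only for illustration. The scratch buffer size follows the concatenation at the end of the configure() listing, where the input gate is stored only when CIFG is not used.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper types, not part of Compute Library.
struct LstmDims
{
    std::size_t input_size;
    std::size_t num_units;
    std::size_t output_size;
    std::size_t batch_size;
};

// A 2D shape written as [dim0, dim1], in the same order as the parameter list.
struct Shape2D
{
    std::size_t dim0;
    std::size_t dim1;
};

inline Shape2D input_shape(const LstmDims &d) { return {d.input_size, d.batch_size}; }
inline Shape2D input_weights_shape(const LstmDims &d) { return {d.input_size, d.num_units}; }
inline Shape2D recurrent_weights_shape(const LstmDims &d) { return {d.output_size, d.num_units}; }
inline Shape2D cell_state_shape(const LstmDims &d) { return {d.num_units, d.batch_size}; }
inline Shape2D output_state_shape(const LstmDims &d) { return {d.output_size, d.batch_size}; }

// One [num_units, batch_size] block per stored gate: cell, forget and output
// gates always, plus the input gate when CIFG is disabled.
inline Shape2D scratch_buffer_shape(const LstmDims &d, bool use_cifg)
{
    assert(d.num_units > 0 && d.batch_size > 0);
    const std::size_t blocks = use_cifg ? 3 : 4;
    return {d.num_units * blocks, d.batch_size};
}
```

For example, with num_units = 16 and batch_size = 2 this sketch yields a scratch buffer of [48, 2] with CIFG and [64, 2] without it.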

lstm_res = PixelwiseMul(output, Activation(cell_state))

                -- Clip(lstm_res * projection_weights + projection_bias, projection_threshold) , if there is a projection
               /
output_state = --
               \
                -- lstm_res , otherwise
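
The piecewise definition of output_state can be illustrated for a single batch element. This is a sketch in plain C++ rather than the library's implementation: Vec, Mat, Projection, clip and output_state are hypothetical names, the weights are assumed to be stored as one row per output element, and the activation is assumed to have been applied to the cell state already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>; // hypothetical layout: output_size rows of num_units values

// Hypothetical bundle of the optional projection tensors.
struct Projection
{
    Mat weights;
    Vec bias; // output_size values
};

// Bound v within [-threshold, threshold]; a threshold of 0 disables clipping.
inline float clip(float v, float threshold)
{
    if(threshold == 0.f)
    {
        return v;
    }
    return std::max(-threshold, std::min(threshold, v));
}

// projection == nullptr selects the "otherwise" branch of the definition.
inline Vec output_state(const Vec &output_gate, const Vec &activated_cell_state,
                        const Projection *projection, float projection_threshold)
{
    assert(output_gate.size() == activated_cell_state.size());

    // lstm_res = PixelwiseMul(output, Activation(cell_state))
    Vec lstm_res(output_gate.size());
    for(std::size_t u = 0; u < lstm_res.size(); ++u)
    {
        lstm_res[u] = output_gate[u] * activated_cell_state[u];
    }
    if(projection == nullptr)
    {
        return lstm_res;
    }

    // Clip(lstm_res * projection_weights + projection_bias, projection_threshold)
    assert(projection->weights.size() == projection->bias.size());
    Vec out(projection->bias);
    for(std::size_t o = 0; o < out.size(); ++o)
    {
        assert(projection->weights[o].size() == lstm_res.size());
        for(std::size_t u = 0; u < lstm_res.size(); ++u)
        {
            out[o] += lstm_res[u] * projection->weights[o][u];
        }
        out[o] = clip(out[o], projection_threshold);
    }
    return out;
}
```

Passing a null projection returns lstm_res unchanged, which mirrors the case where lstm_params carries no projection weights.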

Definition at line 73 of file CLLSTMLayer.cpp.

80 {
81  ARM_COMPUTE_ERROR_ON_NULLPTR(input,
82  input_to_forget_weights, input_to_cell_weights, input_to_output_weights,
83  recurrent_to_forget_weights, recurrent_to_cell_weights, recurrent_to_output_weights,
84  forget_gate_bias, cell_bias, output_gate_bias,
85  output_state_in, cell_state_in,
86  scratch_buffer, output_state_out, cell_state_out, output);
87 
88  ARM_COMPUTE_LOG_PARAMS(input, input_to_forget_weights, input_to_cell_weights, input_to_output_weights, recurrent_to_forget_weights, recurrent_to_cell_weights,
89  recurrent_to_output_weights, forget_gate_bias, cell_bias, output_gate_bias, output_state_in, cell_state_in, scratch_buffer, output_state_out, cell_state_out,
90  output, lstm_params, activation_info, cell_threshold, projection_threshold);
91 
92  _is_layer_norm_lstm = lstm_params.use_layer_norm();
93 
94  // Set lstm parameters
95  LSTMParams<ITensorInfo> lstm_params_info{};
96  build_lstm_params_tensor_info(lstm_params, &lstm_params_info);
97 
98  // Validate
99  ARM_COMPUTE_ERROR_THROW_ON(CLLSTMLayer::validate(input->info(), input_to_forget_weights->info(),
100  input_to_cell_weights->info(), input_to_output_weights->info(),
101  recurrent_to_forget_weights->info(), recurrent_to_cell_weights->info(), recurrent_to_output_weights->info(),
102  forget_gate_bias->info(), cell_bias->info(), output_gate_bias->info(),
103  output_state_in->info(), cell_state_in->info(),
104  scratch_buffer->info(), output_state_out->info(), cell_state_out->info(), output->info(),
105  lstm_params_info, activation_info, cell_threshold, projection_threshold));
106 
107  const TensorShape cell_state_shape = cell_state_in->info()->tensor_shape();
108  // Configure block that calculates the forget gate
109  // forget_gate = Activation(input * input_to_forget_weights + output_state_in * recurrent_to_forget_weights + PixelWiseMul(cell_state, cell_to_forget_weights) + forget_gate_bias)
110  // We optimize this as follows:
111  // forget_gate = Activation( (input,output_state_in) * (input_to_forget_weights,recurrent_to_forget_weights) + PixelWiseMul(cell_state, cell_to_forget_weights) + forget_gate_bias)
112  _forget_gate_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
113  _forget_gate_out3.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
114  _forget_gate_out5.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
115 
116  std::vector<const ICLTensor *> inputs_vector;
117  inputs_vector.emplace_back(input);
118  inputs_vector.emplace_back(output_state_in);
119  const TensorShape concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(inputs_vector, 0);
120  _forget_gate_out2.allocator()->init(TensorInfo(concat_shape, 1, input->info()->data_type()));
121 
122  _memory_group.manage(&_forget_gate_out2);
123  _concat_inputs_forget_gate.configure(compile_context, inputs_vector, &_forget_gate_out2, Window::DimX);
124 
125  std::vector<const ICLTensor *> weights_vector;
126 
127  weights_vector.emplace_back(input_to_forget_weights);
128  weights_vector.emplace_back(recurrent_to_forget_weights);
129  const TensorShape weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(weights_vector, 0);
130  _forget_gate_out6.allocator()->init(TensorInfo(weights_concat_shape, 1, input->info()->data_type()));
131 
132  _concat_weights_forget_gate.configure(compile_context, weights_vector, &_forget_gate_out6, Window::DimX);
133 
134  _memory_group.manage(&_forget_gate_out5);
135  _fully_connected_forget_gate.configure(compile_context, &_forget_gate_out2, &_forget_gate_out6, (_is_layer_norm_lstm) ? nullptr : forget_gate_bias, &_forget_gate_out5);
136  _memory_group.manage(&_forget_gate_out1);
137  _memory_group.manage(&_forget_gate_out3);
138  _forget_gate_out6.allocator()->allocate();
139 
140  CLTensor *forget_gate_out = &_forget_gate_out5;
141  if(lstm_params.has_peephole_opt())
142  {
143  _forget_gate_out4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
144 
145  _run_peephole_opt = true;
146  _memory_group.manage(&_forget_gate_out4);
147  _pixelwise_mul_forget_gate.configure(compile_context, cell_state_in, lstm_params.cell_to_forget_weights(), &_forget_gate_out4, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
148  _accum_forget_gate1.configure(compile_context, &_forget_gate_out5, &_forget_gate_out4, &_forget_gate_out3, ConvertPolicy::SATURATE);
149  _forget_gate_out4.allocator()->allocate();
150  _forget_gate_out5.allocator()->allocate();
151  forget_gate_out = &_forget_gate_out3;
152  }
153  else
154  {
155  _forget_gate_out3.allocator()->allocate();
156  }
157  if(_is_layer_norm_lstm)
158  {
159  _forget_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
160  _forget_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
161  _memory_group.manage(&_forget_layer_norm_out1);
162  _memory_group.manage(&_forget_layer_norm_out2);
163  _mean_std_norm_forget_gate.configure(compile_context, forget_gate_out);
164  _pixelwise_mul_forget_gate_coeff.configure(compile_context, forget_gate_out, lstm_params.forget_layer_norm_weights(), &_forget_layer_norm_out1, 1, ConvertPolicy::SATURATE,
165  RoundingPolicy::TO_NEAREST_EVEN);
166  // forget_gate_out is going to be reassigned, so allocate the tensor that it was assigned to before
167  forget_gate_out->allocator()->allocate();
168  _accum_forget_gate_bias.configure(compile_context, &_forget_layer_norm_out1, forget_gate_bias, &_forget_layer_norm_out2, ConvertPolicy::SATURATE);
169  _forget_layer_norm_out1.allocator()->allocate();
170  forget_gate_out = &_forget_layer_norm_out2;
171  }
172  _activation_forget_gate.configure(compile_context, forget_gate_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC));
173 
174  // Configure block that calculates the input gate
175  // input_gate = Activation(input * input_to_input_weights + output_state * recurrent_to_input_weights + PixelWiseMul(cell_state, cell_to_input_weights) + input_gate_bias), without CIFG
176  // input_gate = 1 - forget_gate, with CIFG
177  // We optimize this as follows:
178  // input_gate = Activation((input,output_state) * (input_to_input_weights,recurrent_to_input_weights) + PixelWiseMul(cell_state, cell_to_input_weights) + input_gate_bias), without CIFG
179  _input_gate_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
180  CLTensor *input_gate_out = &_input_gate_out1;
181  if(lstm_params.has_cifg_opt())
182  {
183  _memory_group.manage(&_input_gate_out1);
184  _ones.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
185  _ones_fill.configure(compile_context, &_ones, PixelValue(1, _ones.info()->data_type()));
186  _subtract_input_gate.configure(compile_context, &_ones, forget_gate_out, &_input_gate_out1, ConvertPolicy::SATURATE);
187  _ones.allocator()->allocate();
188  _run_cifg_opt = true;
189  }
190  else
191  {
192  _input_gate_out3.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
193  _input_gate_out4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
194 
195  std::vector<const ICLTensor *> lstm_weights;
196  lstm_weights.emplace_back(lstm_params.input_to_input_weights());
197  lstm_weights.emplace_back(lstm_params.recurrent_to_input_weights());
198  TensorShape lstm_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(lstm_weights, 0);
199  _input_gate_out2.allocator()->init(TensorInfo(lstm_weights_concat_shape, 1, input->info()->data_type()));
200 
201  _concat_weights_input_gate.configure(compile_context, lstm_weights, &_input_gate_out2, Window::DimX);
202 
203  _memory_group.manage(&_input_gate_out1);
204 
205  _memory_group.manage(&_input_gate_out3);
206  _fully_connected_input_gate.configure(compile_context, &_forget_gate_out2, &_input_gate_out2, (_is_layer_norm_lstm) ? nullptr : lstm_params.input_gate_bias(), &_input_gate_out3);
207  _input_gate_out2.allocator()->allocate();
208 
209  input_gate_out = &_input_gate_out3;
210  if(_run_peephole_opt)
211  {
212  _memory_group.manage(&_input_gate_out4);
213  _pixelwise_mul_input_gate.configure(compile_context, cell_state_in, lstm_params.cell_to_input_weights(), &_input_gate_out4, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
214  _accum_input_gate1.configure(compile_context, &_input_gate_out3, &_input_gate_out4, &_input_gate_out1, ConvertPolicy::SATURATE);
215  _input_gate_out3.allocator()->allocate();
216  _input_gate_out4.allocator()->allocate();
217  input_gate_out = &_input_gate_out1;
218  }
219  else
220  {
221  _input_gate_out1.allocator()->allocate();
222  }
223 
224  if(_is_layer_norm_lstm)
225  {
226  _input_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
227  _input_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
228  _memory_group.manage(&_input_layer_norm_out1);
229  _memory_group.manage(&_input_layer_norm_out2);
230  _mean_std_norm_input_gate.configure(compile_context, input_gate_out);
231  _pixelwise_mul_input_gate_coeff.configure(compile_context, input_gate_out, lstm_params.input_layer_norm_weights(), &_input_layer_norm_out1, 1, ConvertPolicy::SATURATE,
232  RoundingPolicy::TO_NEAREST_EVEN);
233  // input_gate_out is going to be reassigned, so allocate the tensor that it was assigned to before
234  input_gate_out->allocator()->allocate();
235  _accum_input_gate_bias.configure(compile_context, &_input_layer_norm_out1, lstm_params.input_gate_bias(), &_input_layer_norm_out2, ConvertPolicy::SATURATE);
236  _input_layer_norm_out1.allocator()->allocate();
237  input_gate_out = &_input_layer_norm_out2;
238  }
239  _activation_input_gate.configure(compile_context, input_gate_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC));
240  }
241 
242  // Configure block that calculates the cell state
243  // cell_state = Clip((PixelwiseMul(input_gate, Activation(input * input_to_cell_weights + output_state_in * recurrent_to_cell_weights + cell_bias)) + PixelwiseMul(forget_gate, cell_state)), cell_threshold)
244  TensorShape cell_state1_shape = compute_transposed_shape(*recurrent_to_output_weights->info());
245  _cell_state_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
246  _cell_state_out2.allocator()->init(TensorInfo(cell_state1_shape, 1, input->info()->data_type()));
247  _cell_state_out3.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
248  _cell_state_out4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
249  _cell_state_out5.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
250 
251  _memory_group.manage(&_cell_state_out1);
252  _fully_connected_cell_state.configure(compile_context, input, input_to_cell_weights, (_is_layer_norm_lstm) ? nullptr : cell_bias, &_cell_state_out1);
253  _memory_group.manage(&_cell_state_out2);
254  _transpose_cell_state->configure(compile_context, recurrent_to_cell_weights->info(), _cell_state_out2.info());
255  _recurrent_to_cell_weights = recurrent_to_cell_weights;
256  _memory_group.manage(&_cell_state_out3);
257  _gemm_cell_state1.configure(compile_context, output_state_in, &_cell_state_out2, nullptr, &_cell_state_out3, 1.f, 0.f);
258  _cell_state_out2.allocator()->allocate();
259  _memory_group.manage(&_cell_state_out4);
260  _accum_cell_state1.configure(compile_context, &_cell_state_out1, &_cell_state_out3, &_cell_state_out4, ConvertPolicy::SATURATE);
261  CLTensor *cell_state_out_ptr = &_cell_state_out4;
262  if(_is_layer_norm_lstm)
263  {
264  _cell_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
265  _cell_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
266  _memory_group.manage(&_cell_layer_norm_out1);
267  _memory_group.manage(&_cell_layer_norm_out2);
268  _mean_std_norm_cell_gate.configure(compile_context, cell_state_out_ptr);
269  _pixelwise_mul_cell_gate_coeff.configure(compile_context, cell_state_out_ptr, lstm_params.cell_layer_norm_weights(), &_cell_layer_norm_out1, 1, ConvertPolicy::SATURATE,
270  RoundingPolicy::TO_NEAREST_EVEN);
271  // cell_state_out_ptr is going to be reassigned, so allocate the tensor that it was assigned to before
272  cell_state_out_ptr->allocator()->allocate();
273  _accum_cell_gate_bias.configure(compile_context, &_cell_layer_norm_out1, cell_bias, &_cell_layer_norm_out2, ConvertPolicy::SATURATE);
274  _cell_layer_norm_out1.allocator()->allocate();
275  cell_state_out_ptr = &_cell_layer_norm_out2;
276  }
277  _activation_cell_state.configure(compile_context, cell_state_out_ptr, nullptr, activation_info);
278  _memory_group.manage(&_cell_state_out5);
279  _pixelwise_mul_cell_state1.configure(compile_context, cell_state_out_ptr, input_gate_out, &_cell_state_out5, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
280  cell_state_out_ptr->allocator()->allocate();
281  _pixelwise_mul_cell_state2.configure(compile_context, forget_gate_out, cell_state_in, &_cell_state_out3, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
282  _accum_cell_state2.configure(compile_context, &_cell_state_out5, &_cell_state_out3, &_cell_state_out1, ConvertPolicy::SATURATE);
283  _cell_state_out3.allocator()->allocate();
284  _cell_state_out5.allocator()->allocate();
285  // Perform clipping
286  if(cell_threshold != 0.f)
287  {
288  _perform_cell_clipping = true;
289  _cell_clip.configure(compile_context, &_cell_state_out1, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU, cell_threshold, -cell_threshold));
290  }
291 
292  // Configure block that calculates the output
293  // output_state_out = Activation(input * input_to_output_weights + output_state_in * recurrent_to_output_weights + PixelWiseMul(cell_state, cell_to_output_weights) + output_gate_bias)
294  // We optimize this as follows:
295  // output_state_out = Activation( (input,output_state_in) * (input_to_output_weights, recurrent_to_output_weights) + PixelWiseMul(cell_state, cell_to_output_weights) + output_gate_bias)
296  _output1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
297  _output4.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
298  std::vector<const ICLTensor *> in_out_weights;
299  in_out_weights.emplace_back(input_to_output_weights);
300  in_out_weights.emplace_back(recurrent_to_output_weights);
301  TensorShape in_out_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(in_out_weights, 0);
302  _output2.allocator()->init(TensorInfo(in_out_weights_concat_shape, 1, input->info()->data_type()));
303 
304  _concat_weights_output.configure(compile_context, in_out_weights, &_output2, Window::DimX);
305 
306  _memory_group.manage(&_output1);
307  _memory_group.manage(&_output4);
308 
309  _fully_connected_output.configure(compile_context, &_forget_gate_out2, &_output2, (_is_layer_norm_lstm) ? nullptr : output_gate_bias, &_output4);
310 
311  _output2.allocator()->allocate();
312  _forget_gate_out2.allocator()->allocate();
313 
314  CLTensor *output_gate_out = &_output4;
315  if(lstm_params.has_peephole_opt())
316  {
317  _output3.allocator()->init(TensorInfo(_cell_state_out1.info()->tensor_shape(), 1, input->info()->data_type()));
318 
319  _memory_group.manage(&_output3);
320  _pixelwise_mul_output_state1.configure(compile_context, &_cell_state_out1, lstm_params.cell_to_output_weights(), &_output3, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
321  _accum_output1.configure(compile_context, &_output4, &_output3, &_output1, ConvertPolicy::SATURATE);
322  _output4.allocator()->allocate();
323  output_gate_out = &_output1;
324 
325  // Allocate intermediate buffers
326  _output3.allocator()->allocate();
327  }
328  else
329  {
330  _output1.allocator()->allocate();
331  }
332  if(_is_layer_norm_lstm)
333  {
334  _output_layer_norm_out1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
335  _output_layer_norm_out2.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
336  _memory_group.manage(&_output_layer_norm_out1);
337  _memory_group.manage(&_output_layer_norm_out2);
338  _mean_std_norm_output_gate.configure(compile_context, output_gate_out);
339  _pixelwise_mul_output_gate_coeff.configure(compile_context, output_gate_out, lstm_params.output_layer_norm_weights(), &_output_layer_norm_out1, 1, ConvertPolicy::SATURATE,
340  RoundingPolicy::TO_NEAREST_EVEN);
341  // output_gate_out is going to be reassigned, so allocate the tensor that it was assigned to before
342  output_gate_out->allocator()->allocate();
343  _accum_output_gate_bias.configure(compile_context, &_output_layer_norm_out1, output_gate_bias, &_output_layer_norm_out2, ConvertPolicy::SATURATE);
344  _output_layer_norm_out1.allocator()->allocate();
345  output_gate_out = &_output_layer_norm_out2;
346  }
347  _activation_output.configure(compile_context, output_gate_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC));
348 
349  // Configure block that calculates the output state
350  /** lstm_res = PixelwiseMul(output, Activation(cell_state))
351  *
352  * -- Clip(lstm_res * projection_weights + projection_bias, projection_threshold) , if there is a projection
353  * /
354  * output_state = --
355  * \
356  * -- lstm_res , otherwise
357  */
358  ICLTensor *output_state_out_tmp = lstm_params.has_projection() ? &_output_state1 : output_state_out;
359  _cell_state_activation.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
360  _output_state1.allocator()->init(TensorInfo(cell_state_shape, 1, input->info()->data_type()));
361 
362  _memory_group.manage(&_cell_state_activation);
363  _activation_output_state.configure(compile_context, &_cell_state_out1, &_cell_state_activation, activation_info);
364  _pixelwise_mul_output_state2.configure(compile_context, &_cell_state_activation, output_gate_out, output_state_out_tmp, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN);
365  _cell_state_activation.allocator()->allocate();
366 
367  if(lstm_params.has_projection())
368  {
369  _has_projection_weights = true;
370  _fully_connected_output_state.configure(compile_context, output_state_out_tmp, lstm_params.projection_weights(), lstm_params.projection_bias(), output_state_out);
371  _output_state1.allocator()->allocate();
372  // Perform clipping
373  if(projection_threshold != 0.f)
374  {
375  _perform_projection_clipping = true;
376  _projection_clip.configure(compile_context, output_state_out, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU, -projection_threshold, projection_threshold));
377  }
378  }
379 
380  // Copy cell state and output
381  _copy_cell_state.configure(compile_context, &_cell_state_out1, cell_state_out);
382  _copy_output.configure(compile_context, output_state_out, output);
383 
384  // Vector for holding the tensors to store in scratch buffer
385  std::vector<const ICLTensor *> scratch_inputs;
386  if(!lstm_params.has_cifg_opt())
387  {
388  scratch_inputs.emplace_back(input_gate_out);
389  }
390  scratch_inputs.emplace_back(&_cell_state_out1);
391  scratch_inputs.emplace_back(forget_gate_out);
392  scratch_inputs.emplace_back(output_gate_out);
393  _concat_scratch_buffer.configure(compile_context, scratch_inputs, scratch_buffer, Window::DimX);
394  input_gate_out->allocator()->allocate();
395  _cell_state_out1.allocator()->allocate();
396  forget_gate_out->allocator()->allocate();
397  output_gate_out->allocator()->allocate();
398 }
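
The "We optimize this as follows" comments in the listing replace two products, input * input_to_*_weights and output_state_in * recurrent_to_*_weights, with one product over concatenated operands. The sketch below, in plain C++ for a single batch element, shows why both forms give the same result. All names (Vec, Mat, mat_vec, concat, concat_rows, gate_separate, gate_fused) are hypothetical, and the weights are assumed to be stored as one row per unit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>; // hypothetical layout: one row of weights per unit

// y[u] = sum over k of w[u][k] * x[k]
inline Vec mat_vec(const Mat &w, const Vec &x)
{
    Vec y(w.size(), 0.f);
    for(std::size_t u = 0; u < w.size(); ++u)
    {
        assert(w[u].size() == x.size());
        for(std::size_t k = 0; k < x.size(); ++k)
        {
            y[u] += w[u][k] * x[k];
        }
    }
    return y;
}

// Append b to a.
inline Vec concat(const Vec &a, const Vec &b)
{
    Vec out(a);
    out.insert(out.end(), b.begin(), b.end());
    return out;
}

// Concatenate two weight matrices row by row, so each unit sees (input, output_state_in).
inline Mat concat_rows(const Mat &a, const Mat &b)
{
    assert(a.size() == b.size());
    Mat out;
    for(std::size_t u = 0; u < a.size(); ++u)
    {
        out.push_back(concat(a[u], b[u]));
    }
    return out;
}

// Two separate products followed by an addition.
inline Vec gate_separate(const Mat &w_in, const Vec &x, const Mat &w_rec, const Vec &h)
{
    Vec acc = mat_vec(w_in, x);
    const Vec rec = mat_vec(w_rec, h);
    for(std::size_t u = 0; u < acc.size(); ++u)
    {
        acc[u] += rec[u];
    }
    return acc;
}

// One product over the concatenated input and the concatenated weights.
inline Vec gate_fused(const Mat &w_in, const Vec &x, const Mat &w_rec, const Vec &h)
{
    return mat_vec(concat_rows(w_in, w_rec), concat(x, h));
}
```

In the listing the concatenated input (_forget_gate_out2) is built once and reused by the forget, input and output gates, so only the weight concatenation is repeated per gate.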

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, arm_compute::utils::info_helpers::build_lstm_params_tensor_info(), arm_compute::misc::shape_calculator::calculate_concatenate_shape(), LSTMParams< T >::cell_layer_norm_weights(), LSTMParams< T >::cell_to_forget_weights(), LSTMParams< T >::cell_to_input_weights(), LSTMParams< T >::cell_to_output_weights(), arm_compute::misc::shape_calculator::compute_transposed_shape(), CLMeanStdDevNormalizationLayer::configure(), CLFill::configure(), CLCopy::configure(), CLActivationLayer::configure(), CLArithmeticAddition::configure(), CLConcatenateLayer::configure(), CLFullyConnectedLayer::configure(), CLPixelWiseMultiplication::configure(), CLGEMM::configure(), CLArithmeticSubtraction::configure(), TensorInfo::data_type(), Window::DimX, arm_compute::test::validation::forget_gate_bias, LSTMParams< T >::forget_layer_norm_weights(), LSTMParams< T >::has_cifg_opt(), LSTMParams< T >::has_peephole_opt(), LSTMParams< T >::has_projection(), ITensor::info(), CLTensor::info(), ITensorAllocator::init(), arm_compute::test::validation::input, LSTMParams< T >::input_gate_bias(), LSTMParams< T >::input_layer_norm_weights(), arm_compute::test::validation::input_to_cell_weights, arm_compute::test::validation::input_to_forget_weights, LSTMParams< T >::input_to_input_weights(), arm_compute::test::validation::input_to_output_weights, MemoryGroup::manage(), arm_compute::test::validation::output_gate_bias, LSTMParams< T >::output_layer_norm_weights(), LSTMParams< T >::projection_bias(), LSTMParams< T >::projection_weights(), arm_compute::test::validation::recurrent_to_cell_weights, arm_compute::test::validation::recurrent_to_forget_weights, LSTMParams< T >::recurrent_to_input_weights(), arm_compute::test::validation::recurrent_to_output_weights, arm_compute::SATURATE, ITensorInfo::tensor_shape(), TensorInfo::tensor_shape(), arm_compute::TO_NEAREST_EVEN, 
LSTMParams< T >::use_layer_norm(), and CLLSTMLayer::validate().

◆ configure() [2/2]

void configure ( const ICLTensor * input,
const ICLTensor * input_to_forget_weights,
const ICLTensor * input_to_cell_weights,
const ICLTensor * input_to_output_weights,
const ICLTensor * recurrent_to_forget_weights,
const ICLTensor * recurrent_to_cell_weights,
const ICLTensor * recurrent_to_output_weights,
const ICLTensor * forget_gate_bias,
const ICLTensor * cell_bias,
const ICLTensor * output_gate_bias,
const ICLTensor * output_state_in,
ICLTensor * cell_state_in,
ICLTensor * scratch_buffer,
ICLTensor * output_state_out,
ICLTensor * cell_state_out,
ICLTensor * output,
const LSTMParams< ICLTensor > & lstm_params,
const ActivationLayerInfo & activation_info,
float cell_threshold = 0.f,
float projection_threshold = 0.f
)

Initialize function's tensors.

Valid data layouts:

  • All

Valid data type configurations:

src0 - src13    dst0 - dst3
F16             F16
F32             F32

Parameters
[in] input: Source tensor. Input is a 2D tensor with dimensions [input_size, batch_size]. Data types supported: F16/F32.
[in] input_to_forget_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_cell_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_output_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_forget_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_cell_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_output_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] forget_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] cell_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] output_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] output_state_in: 2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in] cell_state_in: 2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out] scratch_buffer: 2D tensor with dimensions [num_units * 4, batch_size] without CIFG or [num_units * 3, batch_size] with CIFG. Data type supported: Same as input.
[out] output_state_out: 2D weights tensor with dimensions [output_size, batch_size]. Data type supported: Same as input.
[out] cell_state_out: 2D tensor with dimensions [num_units, batch_size]. Data type supported: Same as input.
[out] output: Destination tensor. Output is a 2D tensor with dimensions [output_size, batch_size]. Data types supported: Same as input.
[in] lstm_params: Optional weights tensors for the input gate (when CIFG is not used), peephole optimization, projection and layer normalization:
  • input_to_input_weights: 2D weights tensor with dimensions [input_size, num_units]. Data type supported: Same as input.
  • recurrent_to_input_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
  • cell_to_input_weights: 1D weights tensor with dimensions [num_units]. Can be nullptr. Data type supported: Same as input.
  • cell_to_forget_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
  • cell_to_output_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
  • input_gate_bias: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
  • projection_weights: 2D weights tensor with dimensions [output_size, num_units]. Data type supported: Same as input.
  • projection_bias: 1D weights tensor with dimensions [output_size]. Data type supported: Same as input.
  • input_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
  • forget_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
  • cell_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
  • output_layer_norm_weights: 1D weights tensor with dimensions [num_units]. Data type supported: Same as input.
[in] activation_info: Contains activation information described in ActivationLayerInfo.
[in] cell_threshold: (Optional) The clipping threshold for the cell state, such that values are bound within [-cell_clip, cell_clip]. If set to 0.0f then clipping is disabled.
[in] projection_threshold: (Optional) The clipping threshold for the output from the projection layer, such that values are bound within [-proj_clip, proj_clip]. If set to 0.0f then clipping is disabled.
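
The two thresholds behave like a symmetric clamp that is skipped when the threshold is 0. The following is a minimal sketch in plain C++ of that documented behaviour. The helper name clip_symmetric is hypothetical and is not part of Compute Library, which performs the clipping on the GPU rather than on a host-side vector.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not a Compute Library API: bounds every value to
// [-threshold, threshold]. A threshold of 0.f means "clipping disabled",
// matching the documented meaning of cell_threshold / projection_threshold.
inline void clip_symmetric(std::vector<float> &values, float threshold)
{
    assert(threshold >= 0.f); // assumption: callers pass a non-negative bound
    if(threshold == 0.f)
    {
        return; // clipping disabled, values are left untouched
    }
    for(float &v : values)
    {
        v = std::max(-threshold, std::min(v, threshold));
    }
}
```

With a threshold of 1.f, the values {-3, 0.5, 2} become {-1, 0.5, 1}; with a threshold of 0.f they are returned unchanged.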

Definition at line 60 of file CLLSTMLayer.cpp.

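67 {
68  configure(CLKernelLibrary::get().get_compile_context(), input, input_to_forget_weights, input_to_cell_weights, input_to_output_weights, recurrent_to_forget_weights, recurrent_to_cell_weights,
69  recurrent_to_output_weights, forget_gate_bias, cell_bias, output_gate_bias, output_state_in, cell_state_in, scratch_buffer, output_state_out, cell_state_out, output, lstm_params, activation_info,
70  cell_threshold, projection_threshold);
71 }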

References arm_compute::test::validation::forget_gate_bias, CLKernelLibrary::get(), arm_compute::test::validation::input, arm_compute::test::validation::input_to_cell_weights, arm_compute::test::validation::input_to_forget_weights, arm_compute::test::validation::input_to_output_weights, arm_compute::test::validation::output_gate_bias, arm_compute::test::validation::recurrent_to_cell_weights, arm_compute::test::validation::recurrent_to_forget_weights, and arm_compute::test::validation::recurrent_to_output_weights.

◆ operator=() [1/2]

CLLSTMLayer& operator= ( CLLSTMLayer &&  )
delete

Prevent instances of this class from being moved.

◆ operator=() [2/2]

CLLSTMLayer& operator= ( const CLLSTMLayer & )
delete

Prevent instances of this class from being copied.

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need the backing memory of all the function's buffers to be available in order to execute.

Reimplemented from IFunction.

Definition at line 736 of file CLLSTMLayer.cpp.

737 {
738  if(!_is_prepared)
739  {
740  _concat_weights_forget_gate.run();
741  if(!_run_cifg_opt)
742  {
743  _concat_weights_input_gate.run();
744  }
745  _concat_weights_output.run();
746  _is_prepared = true;
747  }
748 }

References CLConcatenateLayer::run().

Referenced by CLLSTMLayer::run().
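
The listing above follows a run-once pattern: the weight concatenations execute on the first call only, guarded by _is_prepared, so calling prepare() from every run() is cheap. A stripped-down sketch of the same pattern is shown here. OneShotPrepare is a hypothetical stand-in, not a Compute Library class, and it merely counts invocations instead of running OpenCL kernels.

```cpp
#include <cassert>

// Hypothetical stand-in for a function object with a one-off prepare stage.
// It only records how often the expensive step would have executed.
struct OneShotPrepare
{
    bool is_prepared{ false };
    int  weight_concat_runs{ 0 };

    void prepare()
    {
        if(!is_prepared)
        {
            ++weight_concat_runs; // stands in for the weight concatenations
            is_prepared = true;
        }
    }

    void run()
    {
        prepare(); // no-op after the first call
        // per-time-step work would follow here
    }
};
```

Calling run() three times on one OneShotPrepare object leaves weight_concat_runs at 1.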

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
prepare() is called on the first run if it has not been called already.

Implements IFunction.

Definition at line 631 of file CLLSTMLayer.cpp.

632 {
633  prepare();
634 
635  MemoryGroupResourceScope scope_mg(_memory_group);
636 
637  _concat_inputs_forget_gate.run();
638 
639  _fully_connected_forget_gate.run();
640 
641  if(_run_peephole_opt)
642  {
643  _pixelwise_mul_forget_gate.run();
644  _accum_forget_gate1.run();
645  }
646  if(_is_layer_norm_lstm)
647  {
648  _mean_std_norm_forget_gate.run();
649  _pixelwise_mul_forget_gate_coeff.run();
650  _accum_forget_gate_bias.run();
651  }
652  _activation_forget_gate.run();
653 
654  if(_run_cifg_opt)
655  {
656  _ones_fill.run();
657  _subtract_input_gate.run();
658  }
659  else
660  {
661  _fully_connected_input_gate.run();
662 
663  if(_run_peephole_opt)
664  {
665  _pixelwise_mul_input_gate.run();
666  _accum_input_gate1.run();
667  }
668 
669  if(_is_layer_norm_lstm)
670  {
671  _mean_std_norm_input_gate.run();
672  _pixelwise_mul_input_gate_coeff.run();
673  _accum_input_gate_bias.run();
674  }
675  _activation_input_gate.run();
676  }
677 
678  _fully_connected_cell_state.run();
679  ITensorPack pack;
680  pack.add_tensor(TensorType::ACL_SRC, _recurrent_to_cell_weights);
681  pack.add_tensor(TensorType::ACL_DST, &_cell_state_out2);
682  CLScheduler::get().enqueue_op(*_transpose_cell_state,
683  pack,
684  false);
685  _gemm_cell_state1.run();
686  _accum_cell_state1.run();
687  if(_is_layer_norm_lstm)
688  {
689  _mean_std_norm_cell_gate.run();
690  _pixelwise_mul_cell_gate_coeff.run();
691  _accum_cell_gate_bias.run();
692  }
693  _activation_cell_state.run();
694  _pixelwise_mul_cell_state1.run();
695  _pixelwise_mul_cell_state2.run();
696  _accum_cell_state2.run();
697 
698  if(_perform_cell_clipping)
699  {
700  _cell_clip.run();
701  }
702 
703  _fully_connected_output.run();
704 
705  if(_run_peephole_opt)
706  {
707  _pixelwise_mul_output_state1.run();
708  _accum_output1.run();
709  }
710  if(_is_layer_norm_lstm)
711  {
712  _mean_std_norm_output_gate.run();
713  _pixelwise_mul_output_gate_coeff.run();
714  _accum_output_gate_bias.run();
715  }
716  _activation_output.run();
717 
718  _activation_output_state.run();
719  _pixelwise_mul_output_state2.run();
720 
721  if(_has_projection_weights)
722  {
723  _fully_connected_output_state.run();
724  if(_perform_projection_clipping)
725  {
726  _projection_clip.run();
727  }
728  }
729 
730  _copy_cell_state.run();
731  _copy_output.run();
732 
733  _concat_scratch_buffer.run();
734 }

References arm_compute::ACL_DST, arm_compute::ACL_SRC, ITensorPack::add_tensor(), CLScheduler::enqueue_op(), CLScheduler::get(), arm_compute::test::validation::pack, CLLSTMLayer::prepare(), ICLSimpleFunction::run(), CLFill::run(), CLCopy::run(), CLFullyConnectedLayer::run(), CLActivationLayer::run(), CLGEMM::run(), CLConcatenateLayer::run(), CLPixelWiseMultiplication::run(), CLArithmeticAddition::run(), and CLArithmeticSubtraction::run().
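
In the CIFG branch of the listing above, a tensor of ones is filled and a subtraction runs in place of the input gate's fully connected layer. Reading this as the usual coupled input-forget definition, input gate = 1 - forget gate, the step can be sketched in plain C++ as below. The helper cifg_input_gate is hypothetical and operates on a host-side vector rather than on OpenCL tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating the coupled input-forget gate (CIFG):
// the input gate is derived from the forget gate instead of being computed
// from its own weights. Not a Compute Library API.
inline std::vector<float> cifg_input_gate(const std::vector<float> &forget_gate)
{
    std::vector<float> input_gate(forget_gate.size());
    for(std::size_t i = 0; i < forget_gate.size(); ++i)
    {
        // The forget gate is a logistic output, so it lies in [0, 1].
        assert(forget_gate[i] >= 0.f && forget_gate[i] <= 1.f);
        input_gate[i] = 1.f - forget_gate[i];
    }
    return input_gate;
}
```

A forget gate of {0, 0.25, 1} therefore yields an input gate of {1, 0.75, 0}.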

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * input_to_forget_weights,
const ITensorInfo * input_to_cell_weights,
const ITensorInfo * input_to_output_weights,
const ITensorInfo * recurrent_to_forget_weights,
const ITensorInfo * recurrent_to_cell_weights,
const ITensorInfo * recurrent_to_output_weights,
const ITensorInfo * forget_gate_bias,
const ITensorInfo * cell_bias,
const ITensorInfo * output_gate_bias,
const ITensorInfo * output_state_in,
const ITensorInfo * cell_state_in,
const ITensorInfo * scratch_buffer,
const ITensorInfo * output_state_out,
const ITensorInfo * cell_state_out,
const ITensorInfo * output,
const LSTMParams< ITensorInfo > & lstm_params,
const ActivationLayerInfo & activation_info,
float cell_threshold = 0.f,
float projection_threshold = 0.f
)
static

Static function to check if given info will lead to a valid configuration of CLLSTMLayer.

Parameters
[in] input: Source tensor info. Input is a 2D tensor with dimensions [input_size, batch_size]. Data types supported: F16/F32.
[in] input_to_forget_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_cell_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] input_to_output_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_forget_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_cell_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] recurrent_to_output_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
[in] forget_gate_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] cell_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] output_gate_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] output_state_in: 2D weights tensor info with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in] cell_state_in: 2D tensor info with dimensions [num_units, batch_size]. Data type supported: Same as input.
[in] scratch_buffer: 2D tensor info with dimensions [num_units * 4, batch_size] without CIFG or [num_units * 3, batch_size] with CIFG. Data type supported: Same as input.
[in] output_state_out: 2D weights tensor info with dimensions [output_size, batch_size]. Data type supported: Same as input.
[in] cell_state_out: 2D tensor info with dimensions [num_units, batch_size]. Data type supported: Same as input.
[in] output: Destination tensor info. Output is a 2D tensor with dimensions [output_size, batch_size]. Data types supported: Same as input.
[in] lstm_params: Optional weights tensor infos for the input gate (when CIFG is not used), peephole optimization, projection and layer normalization:
  • input_to_input_weights: 2D weights tensor info with dimensions [input_size, num_units]. Data type supported: Same as input.
  • recurrent_to_input_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
  • cell_to_input_weights: 1D weights tensor info with dimensions [num_units]. Can be nullptr. Data type supported: Same as input.
  • cell_to_forget_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • cell_to_output_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • input_gate_bias: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • projection_weights: 2D weights tensor info with dimensions [output_size, num_units]. Data type supported: Same as input.
  • projection_bias: 1D weights tensor info with dimensions [output_size]. Data type supported: Same as input.
  • input_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • forget_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • cell_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
  • output_layer_norm_weights: 1D weights tensor info with dimensions [num_units]. Data type supported: Same as input.
[in] activation_info: Contains activation information described in ActivationLayerInfo.
[in] cell_threshold: (Optional) The clipping threshold for the cell state, such that values are bound within [-cell_clip, cell_clip]. If set to 0.0f then clipping is disabled.
[in] projection_threshold: (Optional) The clipping threshold for the output from the projection layer, such that values are bound within [-proj_clip, proj_clip]. If set to 0.0f then clipping is disabled.
Returns
a status
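
Among other checks, validate() verifies that input and output_state_in can be concatenated along dimension 0, which requires both to share the same batch size. The shape arithmetic can be sketched as follows. Shape2D and concat_dim0 are hypothetical names used for illustration only and are not the library's TensorShape or calculate_concatenate_shape().

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 2D shape type, ordered [dim0, dim1] like the shapes in the
// parameter list above. Not a Compute Library type.
using Shape2D = std::array<std::size_t, 2>;

// Concatenate two 2D shapes along dimension 0; dimension 1 must match.
inline Shape2D concat_dim0(const Shape2D &a, const Shape2D &b)
{
    assert(a[1] == b[1]); // both tensors need the same batch size
    return Shape2D{ { a[0] + b[0], a[1] } };
}
```

For example, an input of shape [16, 4] concatenated with an output state of shape [8, 4] gives [24, 4].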

Definition at line 400 of file CLLSTMLayer.cpp.

407 {
412  output_state_in, cell_state_in,
413  scratch_buffer, output_state_out, cell_state_out, output);
414 
415  // Check data types
421  output_state_in, cell_state_in,
422  scratch_buffer, output_state_out, cell_state_out, output);
423 
424  // Check dimensions
425  ARM_COMPUTE_RETURN_ERROR_ON(input->num_dimensions() > 2);
427  ARM_COMPUTE_RETURN_ERROR_ON(input_to_cell_weights->num_dimensions() > 2);
432  ARM_COMPUTE_RETURN_ERROR_ON(forget_gate_bias->num_dimensions() > 1);
433  ARM_COMPUTE_RETURN_ERROR_ON(cell_bias->num_dimensions() > 1);
434  ARM_COMPUTE_RETURN_ERROR_ON(output_gate_bias->num_dimensions() > 1);
435  ARM_COMPUTE_RETURN_ERROR_ON(output_state_in->num_dimensions() > 2);
436  ARM_COMPUTE_RETURN_ERROR_ON(cell_state_in->num_dimensions() > 2);
437  ARM_COMPUTE_RETURN_ERROR_ON(scratch_buffer->num_dimensions() > 2);
438  ARM_COMPUTE_RETURN_ERROR_ON(output_state_out->num_dimensions() > 2);
439  ARM_COMPUTE_RETURN_ERROR_ON(cell_state_out->num_dimensions() > 2);
440  ARM_COMPUTE_RETURN_ERROR_ON(output->num_dimensions() > 2);
441  ARM_COMPUTE_RETURN_ERROR_ON(cell_bias->dimension(0) * 4 != scratch_buffer->dimension(0)
442  && cell_bias->dimension(0) * 3 != scratch_buffer->dimension(0));
443 
444  const unsigned int num_batches = input->dimension(1);
445  const unsigned int num_cells = input_to_output_weights->dimension(1);
446 
447  if(lstm_params.use_layer_norm())
448  {
449  // If CIFG is used, input layer normalization weights tensor is omitted
450  if(lstm_params.has_cifg_opt())
451  {
452  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_layer_norm_weights() != nullptr);
453  }
454  else
455  {
456  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.input_layer_norm_weights());
457  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_layer_norm_weights()->num_dimensions() > 1);
458  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_layer_norm_weights()->dimension(0) != num_cells);
459  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, lstm_params.input_layer_norm_weights());
460  }
461 
462  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.forget_layer_norm_weights(), lstm_params.cell_layer_norm_weights(), lstm_params.output_layer_norm_weights());
463  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(input, lstm_params.forget_layer_norm_weights(), lstm_params.cell_layer_norm_weights(), lstm_params.output_layer_norm_weights());
464  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.forget_layer_norm_weights()->num_dimensions() > 1);
465  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_layer_norm_weights()->num_dimensions() > 1);
466  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.output_layer_norm_weights()->num_dimensions() > 1);
467  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.forget_layer_norm_weights()->dimension(0) != num_cells);
468  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_layer_norm_weights()->dimension(0) != num_cells);
469  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.output_layer_norm_weights()->dimension(0) != num_cells);
470  }
471 
472  // Check peephole optimization
473  if(lstm_params.has_peephole_opt())
474  {
475  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.cell_to_output_weights(), lstm_params.cell_to_forget_weights());
476  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_to_forget_weights()->num_dimensions() > 1);
477  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_to_output_weights()->num_dimensions() > 1);
478  }
479 
480  TensorShape units_out_transposed_shape = compute_transposed_shape(*recurrent_to_output_weights);
481  TensorShape num_units_transposed_shape = compute_transposed_shape(*forget_gate_bias);
482  const TensorInfo units_out_transposed_info = TensorInfo(units_out_transposed_shape, 1, input->data_type());
483  const TensorInfo num_units_transposed_info = TensorInfo(num_units_transposed_shape, 1, input->data_type());
484 
485  TensorInfo input_gate = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
486  TensorInfo forget_gate = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
487  TensorInfo output_gate_tmp = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
488  TensorInfo cell_state_tmp = TensorInfo(TensorShape(num_cells, num_batches), 1, input->data_type());
489 
490  // Validate forget gate
491  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, input_to_forget_weights, (lstm_params.use_layer_norm()) ? nullptr : forget_gate_bias, &forget_gate));
492 
493  std::vector<const ITensorInfo *> inputs_vector;
494  inputs_vector.emplace_back(input);
495  inputs_vector.emplace_back(output_state_in);
496  const TensorShape concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(inputs_vector, 0);
497  TensorInfo forget_gate_concat = TensorInfo(concat_shape, 1, input->data_type());
498 
499  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(inputs_vector, &forget_gate_concat, Window::DimX));
500 
501  if(lstm_params.has_peephole_opt())
502  {
503  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(cell_state_in, lstm_params.cell_to_forget_weights(), &forget_gate, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
504  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&forget_gate, &forget_gate, &forget_gate, ConvertPolicy::SATURATE));
505  }
506  if(lstm_params.use_layer_norm())
507  {
509  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&forget_gate, lstm_params.forget_layer_norm_weights(), &forget_gate, 1, ConvertPolicy::SATURATE,
512  }
513  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&forget_gate, &forget_gate, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC)));
514 
515  // Validate input gate
516  if(!lstm_params.has_cifg_opt())
517  {
518  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.input_to_input_weights(),
519  lstm_params.recurrent_to_input_weights(),
520  lstm_params.input_gate_bias());
521  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_to_input_weights()->num_dimensions() > 2);
522  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.recurrent_to_input_weights()->num_dimensions() > 2);
523  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.input_gate_bias()->num_dimensions() > 1);
524 
525  std::vector<const ITensorInfo *> lstm_weights;
526  lstm_weights.emplace_back(lstm_params.input_to_input_weights());
527  lstm_weights.emplace_back(lstm_params.recurrent_to_input_weights());
528  TensorShape lstm_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(lstm_weights, 0);
529  TensorInfo lstm_gate_concat = TensorInfo(lstm_weights_concat_shape, 1, input->data_type());
530  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(lstm_weights, &lstm_gate_concat, Window::DimX));
531 
532  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, lstm_params.input_to_input_weights(), (lstm_params.use_layer_norm()) ? nullptr : lstm_params.input_gate_bias(), &input_gate));
533 
534  if(lstm_params.has_peephole_opt())
535  {
536  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(lstm_params.cell_to_input_weights());
537  ARM_COMPUTE_RETURN_ERROR_ON(lstm_params.cell_to_input_weights()->num_dimensions() > 1);
538  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(cell_state_in, lstm_params.cell_to_input_weights(), &input_gate, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
540  }
541 
542  if(lstm_params.use_layer_norm())
543  {
545  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&input_gate, lstm_params.input_layer_norm_weights(), &input_gate, 1, ConvertPolicy::SATURATE, RoundingPolicy::TO_NEAREST_EVEN));
546  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&input_gate, lstm_params.input_gate_bias(), &input_gate, ConvertPolicy::SATURATE));
547  }
548  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&input_gate, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC)));
549  }
550  else
551  {
553  }
554 
555  // Validate cell state
556  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, input_to_cell_weights, (lstm_params.use_layer_norm()) ? nullptr : cell_bias, &cell_state_tmp));
557  ARM_COMPUTE_RETURN_ON_ERROR(CLGEMM::validate(output_state_in, &units_out_transposed_info, nullptr, &cell_state_tmp, 1.f, 0.f, GEMMInfo()));
558  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&cell_state_tmp, &cell_state_tmp, &cell_state_tmp, ConvertPolicy::SATURATE));
559  if(lstm_params.use_layer_norm())
560  {
562  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&cell_state_tmp, lstm_params.cell_layer_norm_weights(), &cell_state_tmp, 1, ConvertPolicy::SATURATE,
564  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&cell_state_tmp, cell_bias, &cell_state_tmp, ConvertPolicy::SATURATE));
565  }
566  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&cell_state_tmp, nullptr, activation_info));
569  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&cell_state_tmp, &cell_state_tmp, &cell_state_tmp, ConvertPolicy::SATURATE));
570  if(cell_threshold != 0.f)
571  {
572  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&cell_state_tmp, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU, cell_threshold,
573  -cell_threshold)));
574  }
575 
576  std::vector<const ITensorInfo *> in_out_weights;
577  in_out_weights.emplace_back(input_to_output_weights);
578  in_out_weights.emplace_back(recurrent_to_output_weights);
579  TensorShape in_out_weights_concat_shape = arm_compute::misc::shape_calculator::calculate_concatenate_shape(in_out_weights, 0);
580  TensorInfo in_out_gate_concat = TensorInfo(in_out_weights_concat_shape, 1, input->data_type());
581  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(in_out_weights, &in_out_gate_concat, Window::DimX));
582  // Validate output gate tmp
583  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(input, input_to_output_weights, (lstm_params.use_layer_norm()) ? nullptr : output_gate_bias, &output_gate_tmp));
584 
585  if(lstm_params.has_peephole_opt())
586  {
587  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&cell_state_tmp, lstm_params.cell_to_output_weights(), &output_gate_tmp, 1, ConvertPolicy::SATURATE,
589  ARM_COMPUTE_RETURN_ON_ERROR(CLArithmeticAddition::validate(&output_gate_tmp, &output_gate_tmp, &output_gate_tmp, ConvertPolicy::SATURATE));
590  }
591  if(lstm_params.use_layer_norm())
592  {
594  ARM_COMPUTE_RETURN_ON_ERROR(CLPixelWiseMultiplication::validate(&output_gate_tmp, lstm_params.output_layer_norm_weights(), &output_gate_tmp, 1, ConvertPolicy::SATURATE,
597  }
598  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&output_gate_tmp, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LOGISTIC)));
599 
600  // Validate output state
601  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(&cell_state_tmp, &cell_state_tmp, activation_info));
603  if(lstm_params.has_projection())
604  {
605  ARM_COMPUTE_RETURN_ON_ERROR(CLFullyConnectedLayer::validate(&output_gate_tmp, lstm_params.projection_weights(), lstm_params.projection_bias(), output_state_out));
606  if(projection_threshold != 0.f)
607  {
608  ARM_COMPUTE_RETURN_ON_ERROR(CLActivationLayer::validate(output_state_out, output_state_out,
609  ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LU_BOUNDED_RELU, -projection_threshold, projection_threshold)));
610  }
611  }
612 
613  // Validate copy kernel
614  ARM_COMPUTE_RETURN_ON_ERROR(CLCopy::validate(&cell_state_tmp, cell_state_out));
615  ARM_COMPUTE_RETURN_ON_ERROR(CLCopy::validate(output_state_out, output));
616 
617  // Validate scratch concatenation
618  std::vector<const ITensorInfo *> inputs_vector_info_raw;
619  if(!lstm_params.has_cifg_opt())
620  {
621  inputs_vector_info_raw.push_back(&input_gate);
622  }
623  inputs_vector_info_raw.push_back(&cell_state_tmp);
624  inputs_vector_info_raw.push_back(&forget_gate);
625  inputs_vector_info_raw.push_back(&output_gate_tmp);
626 
627  ARM_COMPUTE_RETURN_ON_ERROR(CLConcatenateLayer::validate(inputs_vector_info_raw, scratch_buffer, Window::DimX));
628  return Status{};
629 }

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::misc::shape_calculator::calculate_concatenate_shape(), LSTMParams< T >::cell_layer_norm_weights(), LSTMParams< T >::cell_to_forget_weights(), LSTMParams< T >::cell_to_input_weights(), LSTMParams< T >::cell_to_output_weights(), arm_compute::misc::shape_calculator::compute_transposed_shape(), ITensorInfo::dimension(), Window::DimX, arm_compute::F16, arm_compute::F32, arm_compute::test::validation::forget_gate_bias, LSTMParams< T >::forget_layer_norm_weights(), LSTMParams< T >::has_cifg_opt(), LSTMParams< T >::has_peephole_opt(), LSTMParams< T >::has_projection(), arm_compute::test::validation::input, LSTMParams< T >::input_gate_bias(), LSTMParams< T >::input_layer_norm_weights(), arm_compute::test::validation::input_to_cell_weights, arm_compute::test::validation::input_to_forget_weights, LSTMParams< T >::input_to_input_weights(), arm_compute::test::validation::input_to_output_weights, ITensorInfo::num_dimensions(), arm_compute::test::validation::output_gate_bias, LSTMParams< T >::output_layer_norm_weights(), LSTMParams< T >::projection_bias(), LSTMParams< T >::projection_weights(), arm_compute::test::validation::recurrent_to_cell_weights, arm_compute::test::validation::recurrent_to_forget_weights, LSTMParams< T >::recurrent_to_input_weights(), arm_compute::test::validation::recurrent_to_output_weights, arm_compute::SATURATE, arm_compute::TO_NEAREST_EVEN, LSTMParams< T >::use_layer_norm(), CLMeanStdDevNormalizationLayer::validate(), CLCopy::validate(), CLFullyConnectedLayer::validate(), CLActivationLayer::validate(), CLGEMM::validate(), CLConcatenateLayer::validate(), CLPixelWiseMultiplication::validate(), CLArithmeticAddition::validate(), and CLArithmeticSubtraction::validate().

Referenced by CLLSTMLayer::configure().
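
The scratch-buffer check in the listing accepts a first dimension of either 4 or 3 times the number of units, and the final concatenation includes the input gate only when CIFG is disabled. A small sketch of the resulting size rule is given below, using a hypothetical helper scratch_buffer_width that is not part of Compute Library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: first dimension the scratch buffer needs for a given
// configuration. With CIFG the input gate is not stored, so three gate
// buffers are concatenated instead of four.
inline std::size_t scratch_buffer_width(std::size_t num_units, bool use_cifg)
{
    assert(num_units > 0);
    const std::size_t num_gate_buffers = use_cifg ? 3 : 4;
    return num_units * num_gate_buffers;
}
```

With 8 units this gives a width of 32 without CIFG and 24 with CIFG.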


The documentation for this class was generated from the following files:
arm_compute::test::validation::recurrent_to_forget_weights
auto recurrent_to_forget_weights
Definition: LSTMLayerQuantized.cpp:477
ARM_COMPUTE_ERROR_THROW_ON
#define ARM_COMPUTE_ERROR_THROW_ON(status)
Definition: Error.h:456
arm_compute::CLFullyConnectedLayer::configure
void configure(const CLCompileContext &compile_context, const ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
Set the input and output tensors.
Definition: CLFullyConnectedLayer.cpp:67
arm_compute::CLGEMM::validate
static Status validate(const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
Static function to check if given info will lead to a valid configuration of CLGEMM.
Definition: CLGEMM.cpp:94
arm_compute::CLLSTMLayer::validate
static Status validate(const ITensorInfo *input, const ITensorInfo *input_to_forget_weights, const ITensorInfo *input_to_cell_weights, const ITensorInfo *input_to_output_weights, const ITensorInfo *recurrent_to_forget_weights, const ITensorInfo *recurrent_to_cell_weights, const ITensorInfo *recurrent_to_output_weights, const ITensorInfo *forget_gate_bias, const ITensorInfo *cell_bias, const ITensorInfo *output_gate_bias, const ITensorInfo *output_state_in, const ITensorInfo *cell_state_in, const ITensorInfo *scratch_buffer, const ITensorInfo *output_state_out, const ITensorInfo *cell_state_out, const ITensorInfo *output, const LSTMParams< ITensorInfo > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
Static function to check if given info will lead to a valid configuration of CLLSTMLayer.
Definition: CLLSTMLayer.cpp:400
ARM_COMPUTE_RETURN_ERROR_ON
#define ARM_COMPUTE_RETURN_ERROR_ON(cond)
If the condition is true, an error is returned.
Definition: Error.h:297
arm_compute::ACL_DST
@ ACL_DST
Definition: Types.h:55
arm_compute::CLConcatenateLayer::configure
void configure(std::vector< const ICLTensor * > &inputs_vector, ICLTensor *output, size_t axis)
Initialise the kernel's inputs vector and output.
Definition: CLConcatenateLayer.cpp:54
arm_compute::CLArithmeticAddition::configure
void configure(ICLTensor *input1, ICLTensor *input2, ICLTensor *output, ConvertPolicy policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
Initialise the kernel's inputs, output and conversion policy.
Definition: CLElementwiseOperations.cpp:53
arm_compute::ConvertPolicy::SATURATE
@ SATURATE
Saturate.
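To make the effect of a saturating conversion policy concrete, the sketch below contrasts a saturating 8-bit addition with a wrap-around one. Both functions are hypothetical helpers written for illustration only; they are not the kernels that the library dispatches to, and the 8-bit unsigned type is an assumption chosen to keep the example small.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper: the sum is clamped to the largest representable value
// instead of overflowing.
inline std::uint8_t add_saturate_u8(std::uint8_t a, std::uint8_t b)
{
    const int sum     = static_cast<int>(a) + static_cast<int>(b);
    const int max_val = std::numeric_limits<std::uint8_t>::max();
    return static_cast<std::uint8_t>(std::min(sum, max_val));
}

// Hypothetical helper shown for contrast: the sum is reduced modulo 256.
inline std::uint8_t add_wrap_u8(std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>(static_cast<int>(a) + static_cast<int>(b));
}
```

Adding 200 and 100 yields 255 with the saturating helper and 44 with the wrap-around one. Results that fit in the type are identical in both cases.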
arm_compute::CLFill::configure
void configure(ICLTensor *tensor, const PixelValue &constant_value, Window *window=nullptr)
Initialize the kernel's tensor and filling value.
Definition: CLFill.cpp:52
arm_compute::utils::info_helpers::build_lstm_params_tensor_info
void build_lstm_params_tensor_info(const LSTMParams< T > &lstm_params, LSTMParams< ITensorInfo > *lstm_params_info)
Build LSTMParams<ITensorInfo> object by extracting the metadata from each tensor.
Definition: InfoHelpers.h:71
arm_compute::test::validation::recurrent_to_output_weights
auto recurrent_to_output_weights
Definition: LSTMLayerQuantized.cpp:479
arm_compute::CLActivationLayer::validate
static Status validate(const ITensorInfo *input, const ITensorInfo *output, const ActivationLayerInfo &act_info)
Static function to check if given info will lead to a valid configuration of CLActivationLayer.
Definition: CLActivationLayer.cpp:69
arm_compute::CLTensorAllocator::allocate
void allocate() override
Allocate OpenCL memory of the size specified by TensorInfo.
Definition: CLTensorAllocator.cpp:127
arm_compute::test::validation::pack
ITensorPack pack
Definition: Im2Col.cpp:188
arm_compute::CLArithmeticAddition::validate
static Status validate(const ITensorInfo *input1, const ITensorInfo *input2, const ITensorInfo *output, ConvertPolicy policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
Static function to check if given info will lead to a valid configuration of opencl::kernels::ClSatur...
Definition: CLElementwiseOperations.cpp:68
arm_compute::CLTensor::allocator
CLTensorAllocator * allocator()
Return a pointer to the tensor's allocator.
Definition: CLTensor.cpp:61
arm_compute::CLLSTMLayer::configure
void configure(const ICLTensor *input, const ICLTensor *input_to_forget_weights, const ICLTensor *input_to_cell_weights, const ICLTensor *input_to_output_weights, const ICLTensor *recurrent_to_forget_weights, const ICLTensor *recurrent_to_cell_weights, const ICLTensor *recurrent_to_output_weights, const ICLTensor *forget_gate_bias, const ICLTensor *cell_bias, const ICLTensor *output_gate_bias, const ICLTensor *output_state_in, ICLTensor *cell_state_in, ICLTensor *scratch_buffer, ICLTensor *output_state_out, ICLTensor *cell_state_out, ICLTensor *output, const LSTMParams< ICLTensor > &lstm_params, const ActivationLayerInfo &activation_info, float cell_threshold=0.f, float projection_threshold=0.f)
Initialize function's tensors.
Definition: CLLSTMLayer.cpp:60
arm_compute::CLScheduler::get
static CLScheduler & get()
Access the scheduler singleton.
Definition: CLScheduler.cpp:103
arm_compute::CLGEMM::run
void run() override
Run the kernels contained in the function.
Definition: CLGEMM.cpp:99
arm_compute::CLConcatenateLayer::run
void run() override
Run the kernels contained in the function.
Definition: CLConcatenateLayer.cpp:84
arm_compute::CLGEMM::configure
void configure(const CLCompileContext &compile_context, const ICLTensor *a, const ICLTensor *b, const ICLTensor *c, ICLTensor *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
Initialise the kernel's inputs and output.
Definition: CLGEMM.cpp:68
arm_compute::CLCopy::validate
static Status validate(const ITensorInfo *input, const ITensorInfo *output, Window *dst_window=nullptr)
Static function to check if given info will lead to a valid configuration of CLCopy.
Definition: CLCopy.cpp:71
arm_compute::CLFullyConnectedLayer::validate
static Status validate(const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, FullyConnectedLayerInfo fc_info=FullyConnectedLayerInfo())
Static function to check if given info will lead to a valid configuration of CLFullyConnectedLayer.
Definition: CLFullyConnectedLayer.cpp:108
arm_compute::CLArithmeticSubtraction::run
void run() override
Run the kernels contained in the function.
Definition: CLElementwiseOperations.cpp:119
arm_compute::DataType::F16
@ F16
16-bit floating-point number
arm_compute::CLActivationLayer::run
void run() override
Run the kernels contained in the function.
Definition: CLActivationLayer.cpp:74
ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR
#define ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(...)
Definition: Validate.h:163
arm_compute::CLScheduler::enqueue_op
void enqueue_op(ICLKernel &kernel, ITensorPack &tensors, bool flush=true)
Schedule the execution of the passed kernel if possible.
Definition: CLScheduler.cpp:211
arm_compute::CLTensor::info
TensorInfo * info() const override
Interface to be implemented by the child class to return the tensor's metadata.
Definition: CLTensor.cpp:41
arm_compute::ACL_SRC
@ ACL_SRC
Definition: Types.h:44
arm_compute::DataType::F32
@ F32
32-bit floating-point number
arm_compute::CLCopy::run
void run() override
Run the kernels contained in the function.
Definition: CLCopy.cpp:76
arm_compute::test::validation::input_to_forget_weights
auto input_to_forget_weights
Definition: LSTMLayerQuantized.cpp:473
ARM_COMPUTE_LOG_PARAMS
#define ARM_COMPUTE_LOG_PARAMS(...)
Definition: Log.h:35
arm_compute::test::validation::input_to_output_weights
auto input_to_output_weights
Definition: LSTMLayerQuantized.cpp:475
arm_compute::CLMeanStdDevNormalizationLayer::configure
void configure(ICLTensor *input, ICLTensor *output=nullptr, float epsilon=1e-8f)
Initialise the function's input and outputs.
Definition: CLMeanStdDevNormalizationLayer.cpp:33
arm_compute::TensorInfo::tensor_shape
const TensorShape & tensor_shape() const override
Size for each dimension of the tensor.
Definition: TensorInfo.h:235
arm_compute::test::validation::input_to_cell_weights
auto input_to_cell_weights
Definition: LSTMLayerQuantized.cpp:474
arm_compute::test::validation::recurrent_to_cell_weights
auto recurrent_to_cell_weights
Definition: LSTMLayerQuantized.cpp:478
arm_compute::ICLSimpleFunction::run
void run() override final
Run the kernels contained in the function.
Definition: ICLSimpleFunction.cpp:43
arm_compute::test::validation::input
auto input
Definition: LSTMLayerQuantized.cpp:486
arm_compute::CLArithmeticSubtraction::validate
static Status validate(const ITensorInfo *input1, const ITensorInfo *input2, const ITensorInfo *output, ConvertPolicy policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
Static function to check if given info will lead to a valid configuration of opencl::kernels::ClSatur...
Definition: CLElementwiseOperations.cpp:114