Compute Library 19.11
NEGEMM Class Reference

Basic function to execute GEMM on NEON. More...

#include <NEGEMM.h>


Public Member Functions

 NEGEMM (std::shared_ptr< IMemoryManager > memory_manager=nullptr, IWeightsManager *weights_manager=nullptr)
 Constructor. More...
 
 NEGEMM (const NEGEMM &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEGEMM (NEGEMM &&)=default
 Default move constructor. More...
 
NEGEMM& operator= (const NEGEMM &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEGEMM& operator= (NEGEMM &&)=default
 Default move assignment operator. More...
 
void configure (const ITensor *a, const ITensor *b, const ITensor *c, ITensor *d, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
 Initialise the kernel's inputs and output. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *output, float alpha, float beta, const GEMMInfo &gemm_info=GEMMInfo())
 Static function to check if given info will lead to a valid configuration of NEGEMM. More...
 

Detailed Description

Basic function to execute GEMM on NEON.

This function calls the following NEON kernels:

If optimized assembly is available:

  1. NEGEMMAssemblyDispatch
  2. NEActivationLayer (if alpha != 1.0)

Else:

  1. NEGEMMInterleave4x4Kernel (if the output tensor is a matrix)
  2. NEGEMMTranspose1xWKernel (if the output tensor is a matrix)
  3. NEGEMMMatrixMultiplyKernel

In both cases:

  1. NEGEMMMatrixAdditionKernel (if c != nullptr and beta != 0.0 and B is not reshaped only once)

Else:

  1. NEArithmeticAdditionKernel (if c != nullptr and B is reshaped only once and optimized assembly is not in place)

In both cases:

  1. NEActivationLayer (if an activation is specified in GEMMInfo)

Definition at line 59 of file NEGEMM.h.
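
A minimal usage sketch (not part of the generated reference; shapes and scalars are illustrative). NEGEMM computes d = alpha * A * B + beta * C; note that TensorShape takes the width (number of columns) first, so an M x K matrix is TensorShape(K, M):

#include "arm_compute/runtime/NEON/functions/NEGEMM.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    Tensor a, b, c, d;
    a.allocator()->init(TensorInfo(TensorShape(16U, 4U), 1, DataType::F32)); // A: 4 x 16
    b.allocator()->init(TensorInfo(TensorShape(8U, 16U), 1, DataType::F32)); // B: 16 x 8
    c.allocator()->init(TensorInfo(TensorShape(8U, 4U), 1, DataType::F32));  // C: 4 x 8
    d.allocator()->init(TensorInfo(TensorShape(8U, 4U), 1, DataType::F32));  // D: 4 x 8

    NEGEMM gemm;
    gemm.configure(&a, &b, &c, &d, 1.0f, 1.0f); // alpha = 1, beta = 1

    // Backing memory is allocated only after configure()
    a.allocator()->allocate();
    b.allocator()->allocate();
    c.allocator()->allocate();
    d.allocator()->allocate();

    // ... fill a, b and c with data here ...

    gemm.run(); // the first call also triggers prepare()
    return 0;
}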

Constructor & Destructor Documentation

◆ NEGEMM() [1/3]

NEGEMM ( std::shared_ptr< IMemoryManager > memory_manager = nullptr,
         IWeightsManager * weights_manager = nullptr
       )

Constructor.

Definition at line 44 of file NEGEMM.cpp.

    : _memory_group(memory_manager), _weights_manager(weights_manager), _interleave_kernel(), _transpose_kernel(), _mm_kernel(), _asm_glue(memory_manager, weights_manager), _ma_kernel(),
      _alpha_scale_func(nullptr), _add_bias_kernel(), _activation_func(), _tmp_a(), _tmp_b(), _tmp_d(), _original_b(nullptr), _run_vector_matrix_multiplication(false), _run_alpha_scale(false),
      _run_addition(false), _run_bias_addition(false), _run_activation(false), _reshape_b_only_on_first_run(false), _is_prepared(false)
{
}

◆ NEGEMM() [2/3]

NEGEMM ( const NEGEMM & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMM() [3/3]

NEGEMM ( NEGEMM &&  )
default

Default move constructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor *a,
                 const ITensor *b,
                 const ITensor *c,
                 ITensor *d,
                 float alpha,
                 float beta,
                 const GEMMInfo &gemm_info = GEMMInfo()
               )

Initialise the kernel's inputs and output.

Note
GEMM: General Matrix Multiply - [alpha * A * B + beta * C].
GEMM: The tensors a, b, c, d must have the same data type. You should not mix data types when calling this function.
Parameters
    [in]  a          First input tensor (Matrix A or Vector A). Data type supported: F16/F32
    [in]  b          Second input tensor (Matrix B). Data type supported: same as a
    [in]  c          Third input tensor (Matrix C). It can be a nullptr if just the multiplication between a and b is needed. Data type supported: same as a
    [out] d          Output tensor. Data type supported: same as a
    [in]  alpha      Weight of the matrix product
    [in]  beta       Weight of matrix C
    [in]  gemm_info  (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should happen only for the first run
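
For instance (a sketch, reusing the tensors from the example under Detailed Description), passing nullptr for c computes just the scaled product d = 0.5 * A * B:

NEGEMM gemm;
// c == nullptr: only the multiplication between a and b is performed
gemm.configure(&a, &b, nullptr, &d, 0.5f, 0.0f);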

Definition at line 51 of file NEGEMM.cpp.

{
    ARM_COMPUTE_ERROR_THROW_ON(NEGEMM::validate(a->info(), b->info(), (c != nullptr) ? c->info() : nullptr, d->info(), alpha, beta, gemm_info));

    const bool is_c_bias      = gemm_info.reshape_b_only_on_first_run();
    bool       run_optimised  = bool(NEGEMMAssemblyDispatch::validate(a->info(), b->info(), (is_c_bias && c != nullptr) ? c->info() : nullptr, d->info(), gemm_info));

    // Check if we need to reshape the matrix B only on the first run
    _is_prepared                      = false;
    _reshape_b_only_on_first_run      = gemm_info.reshape_b_only_on_first_run();
    _run_vector_matrix_multiplication = a->info()->dimension(1) < 2;
    _original_b                       = b;
    _run_alpha_scale                  = alpha != 1.f;
    _run_bias_addition                = c != nullptr && gemm_info.reshape_b_only_on_first_run();
    _run_addition                     = beta != 0 && c != nullptr && !gemm_info.reshape_b_only_on_first_run();
    _run_activation                   = gemm_info.activation_info().enabled() && (!run_optimised || (run_optimised && !NEGEMMAssemblyDispatch::is_activation_supported(gemm_info.activation_info())));

    if(run_optimised)
    {
        const ITensor *c_to_use = is_c_bias ? c : nullptr;
        if(MEMInfo::get_policy() == MemoryPolicy::MINIMIZE)
        {
            GEMMInfo gemm_info_ntb = gemm_info;
            gemm_info_ntb.set_pretranpose_B(false);
            _asm_glue.configure(a, b, c_to_use, d, gemm_info_ntb);
        }
        else
        {
            _asm_glue.configure(a, b, c_to_use, d, gemm_info);
        }
        ARM_COMPUTE_ERROR_ON(!_asm_glue.is_configured());

        // Scale product by alpha
        if(_run_alpha_scale)
        {
            _alpha_scale_func.configure(d, nullptr, ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::LINEAR, alpha, 0.f));
        }
    }
    else
    {
        // Pick output tensor in case bias addition should be performed
        ITensor *gemm_output_to_use = d;
        if(_run_bias_addition)
        {
            gemm_output_to_use = &_tmp_d;
            _memory_group.manage(&_tmp_d);
        }

        // Select between GEMV and GEMM
        if(_run_vector_matrix_multiplication)
        {
            // Configure the matrix multiply kernel
            _mm_kernel.configure(a, b, gemm_output_to_use, alpha, false);
        }
        else
        {
            TensorShape shape_tmp_a = a->info()->tensor_shape();
            TensorShape shape_tmp_b = b->info()->tensor_shape();

            // A is interleaved in blocks of 4 rows
            shape_tmp_a.set(0, a->info()->dimension(0) * 4);
            shape_tmp_a.set(1, std::ceil(a->info()->dimension(1) / 4.0f));

            // B is transposed in blocks of (16 / element size) columns
            const unsigned int transpose_w = 16 / data_size_from_type(b->info()->data_type());
            shape_tmp_b.set(0, b->info()->dimension(1) * transpose_w);
            shape_tmp_b.set(1, std::ceil(b->info()->dimension(0) / static_cast<float>(transpose_w)));

            TensorInfo info_a = a->info()->clone()->set_tensor_shape(shape_tmp_a).set_is_resizable(true);
            TensorInfo info_b = b->info()->clone()->set_tensor_shape(shape_tmp_b).set_is_resizable(true);

            _tmp_a.allocator()->init(info_a);
            _tmp_b.allocator()->init(info_b);

            // Manage intermediate buffers
            _memory_group.manage(&_tmp_a);
            if(!_reshape_b_only_on_first_run)
            {
                _memory_group.manage(&_tmp_b);
            }

            int m = a->info()->dimension(1);
            int n = b->info()->dimension(0);
            int k = a->info()->dimension(0);

            // Configure interleave kernel
            _interleave_kernel.configure(a, &_tmp_a);

            // Configure transpose kernel
            _transpose_kernel.configure(b, &_tmp_b);

            // Configure matrix multiplication kernel
            _mm_kernel.configure(&_tmp_a, &_tmp_b, gemm_output_to_use, alpha, true, GEMMReshapeInfo(m, n, k));

            // Allocate once all the configure methods have been called
            _tmp_a.allocator()->allocate();
            if(!_reshape_b_only_on_first_run)
            {
                _tmp_b.allocator()->allocate();
            }
        }

        if(_run_bias_addition)
        {
            _add_bias_kernel.configure(gemm_output_to_use, c, d, ConvertPolicy::SATURATE);
            _tmp_d.allocator()->allocate();
        }
    }

    // Configure matrix addition kernel
    if(_run_addition)
    {
        _ma_kernel.configure(c, d, beta);
    }

    // Configure activation
    const ActivationLayerInfo &activation = gemm_info.activation_info();
    if(_run_activation)
    {
        _activation_func.configure(d, nullptr, activation);
    }
}

References GEMMInfo::activation_info(), TensorAllocator::allocate(), Tensor::allocator(), arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::test::validation::b, ICloneable< T >::clone(), NEGEMMMatrixAdditionKernel::configure(), NEActivationLayer::configure(), NEGEMMInterleave4x4Kernel::configure(), NEGEMMMatrixMultiplyKernel::configure(), NEArithmeticAdditionKernel::configure(), NEGEMMAssemblyDispatch::configure(), NEGEMMTranspose1xWKernel::configure(), arm_compute::data_size_from_type(), ITensorInfo::dimension(), ActivationLayerInfo::enabled(), MEMInfo::get_policy(), ITensor::info(), TensorAllocator::init(), NEGEMMAssemblyDispatch::is_activation_supported(), NEGEMMAssemblyDispatch::is_configured(), ActivationLayerInfo::LINEAR, MemoryGroup::manage(), arm_compute::MINIMIZE, GEMMInfo::reshape_b_only_on_first_run(), arm_compute::SATURATE, TensorShape::set(), GEMMInfo::set_pretranpose_B(), ITensorInfo::tensor_shape(), NEGEMMAssemblyDispatch::validate(), and NEGEMM::validate().

Referenced by NERNNLayer::configure(), NEWinogradConvolutionLayer::configure(), and NELSTMLayer::configure().

◆ operator=() [1/2]

NEGEMM& operator= ( const NEGEMM & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEGEMM& operator= ( NEGEMM &&  )
default

Default move assignment operator.

◆ prepare()

void prepare ( )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
The prepare stage might not need all the function's buffers' backing memory to be available in order to execute.

Reimplemented from IFunction.
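
For example (a sketch, not from the reference; gemm and num_iterations come from surrounding code), calling prepare() explicitly keeps the one-off reshape of B out of a timed loop:

gemm.prepare(); // one-off work, e.g. pre-transposing B, happens here

for(int i = 0; i < num_iterations; ++i)
{
    gemm.run(); // prepare() has already run, so run() does only the GEMM itself
}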

Definition at line 335 of file NEGEMM.cpp.

{
    if(!_is_prepared)
    {
        if(_asm_glue.is_configured())
        {
            if(!_weights_manager || !_weights_manager->are_weights_managed(_original_b))
            {
                ARM_COMPUTE_ERROR_ON(!_original_b->is_used());
            }

            _asm_glue.prepare();
        }
        else if(_reshape_b_only_on_first_run && !_run_vector_matrix_multiplication && !_asm_glue.is_configured())
        {
            if(!_weights_manager || !_weights_manager->are_weights_managed(_original_b))
            {
                ARM_COMPUTE_ERROR_ON(!_original_b->is_used());
            }

            _tmp_b.allocator()->allocate();
            NEScheduler::get().schedule(&_transpose_kernel, Window::DimY);
            _original_b->mark_as_unused();
        }

        _is_prepared = true;
    }
}

References TensorAllocator::allocate(), Tensor::allocator(), IWeightsManager::are_weights_managed(), ARM_COMPUTE_ERROR_ON, Window::DimY, Scheduler::get(), NEGEMMAssemblyDispatch::is_configured(), ITensor::is_used(), ITensor::mark_as_unused(), NEGEMMAssemblyDispatch::prepare(), and IScheduler::schedule().

Referenced by NERNNLayer::prepare(), NEFullyConnectedLayer::prepare(), NEGEMMConvolutionLayer::prepare(), and NEGEMM::run().

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it hasn't been done already.

Implements IFunction.
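
A small sketch of the thread-count note in practice (assuming a default build, where NEScheduler resolves to the CPPScheduler singleton, and a gemm function configured as above):

#include "arm_compute/runtime/NEON/NEScheduler.h"

// Cap the number of worker threads used to schedule NEON kernels
NEScheduler::get().set_num_threads(4);
gemm.run();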

Definition at line 285 of file NEGEMM.cpp.

{
    prepare();

    MemoryGroupResourceScope scope_mg(_memory_group);

    if(_asm_glue.is_configured())
    {
        _asm_glue.run();
        if(_run_alpha_scale)
        {
            _alpha_scale_func.run();
        }
    }
    else
    {
        if(!_run_vector_matrix_multiplication)
        {
            // Run interleave kernel
            NEScheduler::get().schedule(&_interleave_kernel, Window::DimY);

            if(!_reshape_b_only_on_first_run)
            {
                // Run transpose kernel
                NEScheduler::get().schedule(&_transpose_kernel, Window::DimY);
            }
        }

        NEScheduler::get().schedule(&_mm_kernel, _run_vector_matrix_multiplication ? Window::DimX : Window::DimY);

        // Run bias addition kernel
        if(_run_bias_addition)
        {
            NEScheduler::get().schedule(&_add_bias_kernel, Window::DimY);
        }
    }

    // Run matrix addition kernel
    if(_run_addition)
    {
        NEScheduler::get().schedule(&_ma_kernel, Window::DimY);
    }

    // Run activation function
    if(_run_activation)
    {
        _activation_func.run();
    }
}

References Window::DimX, Window::DimY, Scheduler::get(), NEGEMMAssemblyDispatch::is_configured(), NEGEMM::prepare(), INESimpleFunctionNoBorder::run(), NEGEMMAssemblyDispatch::run(), and IScheduler::schedule().

Referenced by NEWinogradConvolutionLayer::run(), NERNNLayer::run(), NELSTMLayer::run(), NEFullyConnectedLayer::run(), and NEGEMMConvolutionLayer::run().

◆ validate()

Status validate ( const ITensorInfo *a,
                  const ITensorInfo *b,
                  const ITensorInfo *c,
                  const ITensorInfo *output,
                  float alpha,
                  float beta,
                  const GEMMInfo &gemm_info = GEMMInfo()
                )
static

Static function to check if given info will lead to a valid configuration of NEGEMM.

Parameters
    [in]  a          First input tensor info (Matrix or Vector A). Data types supported: F16/F32
    [in]  b          Second input tensor info (Matrix B). Data type supported: same as a.
    [in]  c          Third input tensor info (Matrix C). It can be a nullptr if just the multiplication between a and b is needed. Data type supported: same as a.
    [out] output     Output tensor info. Data type supported: same as a
    [in]  alpha      Weight of the matrix product
    [in]  beta       Weight of matrix C
    [in]  gemm_info  (Optional) Specifies if the matrix A and/or matrix B have been reshaped and if the reshape of matrix B should happen only for the first run
Returns
a status
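
A sketch of the validate-before-configure pattern: the check runs on TensorInfo metadata only, so no tensor memory is needed. The helper name and shapes are illustrative, not part of the library:

#include "arm_compute/runtime/NEON/functions/NEGEMM.h"
#include <iostream>

using namespace arm_compute;

bool gemm_config_is_valid()
{
    const TensorInfo a_info(TensorShape(16U, 4U), 1, DataType::F32); // A: 4 x 16
    const TensorInfo b_info(TensorShape(8U, 16U), 1, DataType::F32); // B: 16 x 8
    const TensorInfo d_info(TensorShape(8U, 4U), 1, DataType::F32);  // D: 4 x 8

    // c == nullptr and beta == 0: validate the pure product A * B
    const Status status = NEGEMM::validate(&a_info, &b_info, nullptr, &d_info, 1.0f, 0.0f);
    if(!bool(status))
    {
        std::cerr << status.error_description() << std::endl; // reason for the rejection
    }
    return bool(status);
}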

Definition at line 172 of file NEGEMM.cpp.

{
    ARM_COMPUTE_UNUSED(alpha);
    const bool is_c_bias = gemm_info.reshape_b_only_on_first_run();

    ARM_COMPUTE_RETURN_ERROR_ON_CPU_F16_UNSUPPORTED(a);
    ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(a, 1, DataType::F16, DataType::F32);
    ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(a, b);
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->dimension(0) != b->dimension(1), "The product AB is defined only if the number of columns in A is equal to the number of rows in B");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.is_a_reshaped(), "Matrix A already reshaped is not supported");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.is_b_reshaped(), "Matrix B already reshaped is not supported");

    if(c != nullptr && !is_c_bias)
    {
        ARM_COMPUTE_RETURN_ERROR_ON(gemm_info.depth_output_gemm3d() != 0);
        ARM_COMPUTE_RETURN_ERROR_ON(gemm_info.reinterpret_input_as_3d());
        ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(c, output);
        ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->dimension(1) != c->dimension(1), "The C matrix must have the same number of rows as the matrix A");
        ARM_COMPUTE_RETURN_ERROR_ON_MSG(b->dimension(0) != c->dimension(0), "The C matrix must have the same number of columns as the matrix B");
    }

    if(output->total_size() != 0)
    {
        ARM_COMPUTE_RETURN_ERROR_ON(b->dimension(0) != output->dimension(0));
        if(gemm_info.depth_output_gemm3d() != 0)
        {
            if(gemm_info.reinterpret_input_as_3d())
            {
                ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1));
                ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(2) != output->dimension(2));
            }
            else
            {
                ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1) * output->dimension(2));
            }
        }
        else
        {
            ARM_COMPUTE_RETURN_ERROR_ON(a->dimension(1) != output->dimension(1));
        }
    }

    // Check if we need to run the optimized assembly kernel
    const bool run_optimised = bool(NEGEMMAssemblyDispatch::validate(a, b, is_c_bias ? c : nullptr, output, gemm_info));

    if(!run_optimised)
    {
        ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.reinterpret_input_as_3d(), "NEGEMM cannot reinterpret the input tensor as 3D");
        ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.depth_output_gemm3d() != 0, "NEGEMM cannot reinterpret the output tensor as 3D");

        // Check if the first input tensor is a vector.
        const bool run_vector_matrix_multiplication = a->dimension(1) < 2;
        // Check if we need to reshape the matrix A and matrix B
        const bool run_interleave_transpose = !run_vector_matrix_multiplication && !(gemm_info.reshape_b_only_on_first_run());

        // Arguments used by GEMMReshapeInfo
        // If we pass the matrix A and matrix B reshaped to NEGEMMMatrixMultiplyKernel, we need to pass m, n, k, mult_transpose1xW_width and mult_interleave4x4_height to GEMMReshapeInfo
        // in order to know how the matrices have been reshaped
        const int m                         = a->dimension(1);
        const int n                         = b->dimension(0);
        const int k                         = a->dimension(0);
        int       mult_transpose1xW_width   = 1;
        int       mult_interleave4x4_height = 1;

        const GEMMReshapeInfo reshape_info = GEMMReshapeInfo(m, n, k, mult_transpose1xW_width, mult_interleave4x4_height, gemm_info.depth_output_gemm3d());

        const ITensorInfo *matrix_a_info = a;
        const ITensorInfo *matrix_b_info = b;

        TensorInfo tmp_a_info{};
        TensorInfo tmp_b_info{};
        TensorInfo tmp_output_info = *output->clone();

        if(run_interleave_transpose)
        {
            matrix_a_info = &tmp_a_info;
            matrix_b_info = &tmp_b_info;

            // Validate interleave kernel
            auto_init_if_empty(tmp_a_info, a->clone()->set_tensor_shape(compute_interleaved_shape(*a, mult_interleave4x4_height, gemm_info.reinterpret_input_as_3d())));
            ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMInterleave4x4Kernel::validate(a, &tmp_a_info));

            // Validate transpose kernel
            auto_init_if_empty(tmp_b_info, b->clone()->set_tensor_shape(compute_transpose1xW_with_element_size_shape(*b, mult_transpose1xW_width)));
            ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMTranspose1xWKernel::validate(b, &tmp_b_info));
        }

        // Validate matrix multiply
        auto_init_if_empty(tmp_output_info, matrix_a_info->clone()->set_tensor_shape(compute_mm_shape(*matrix_a_info, *matrix_b_info, run_interleave_transpose, reshape_info)));
        ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMMatrixMultiplyKernel::validate(matrix_a_info, matrix_b_info, &tmp_output_info, alpha, run_interleave_transpose, reshape_info));

        if(c != nullptr && gemm_info.reshape_b_only_on_first_run())
        {
            ARM_COMPUTE_RETURN_ON_ERROR(NEArithmeticAdditionKernel::validate(&tmp_output_info, c, output, ConvertPolicy::SATURATE));
        }
    }

    // Validate matrix addition kernel
    if(beta != 0 && c != nullptr && !is_c_bias)
    {
        ARM_COMPUTE_RETURN_ON_ERROR(NEGEMMMatrixAdditionKernel::validate(c, output, beta));
    }

    // Validate activation
    const ActivationLayerInfo &activation = gemm_info.activation_info();
    if(activation.enabled())
    {
        ARM_COMPUTE_RETURN_ON_ERROR(NEActivationLayer::validate(output, nullptr, activation));
    }

    return Status{};
}

References GEMMInfo::activation_info(), arm_compute::test::validation::alpha, ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_CPU_F16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ON_ERROR, ARM_COMPUTE_UNUSED, arm_compute::auto_init_if_empty(), arm_compute::test::validation::b, ICloneable< T >::clone(), TensorInfo::clone(), arm_compute::misc::shape_calculator::compute_interleaved_shape(), arm_compute::misc::shape_calculator::compute_mm_shape(), arm_compute::misc::shape_calculator::compute_transpose1xW_with_element_size_shape(), GEMMInfo::depth_output_gemm3d(), ITensorInfo::dimension(), ActivationLayerInfo::enabled(), arm_compute::F16, arm_compute::F32, GEMMInfo::is_a_reshaped(), GEMMInfo::is_b_reshaped(), GEMMInfo::reinterpret_input_as_3d(), GEMMInfo::reshape_b_only_on_first_run(), arm_compute::SATURATE, ITensorInfo::total_size(), NEGEMMInterleave4x4Kernel::validate(), NEGEMMMatrixAdditionKernel::validate(), NEActivationLayer::validate(), NEGEMMMatrixMultiplyKernel::validate(), NEArithmeticAdditionKernel::validate(), NEGEMMAssemblyDispatch::validate(), and NEGEMMTranspose1xWKernel::validate().

Referenced by NEGEMM::configure(), and NELSTMLayer::validate().


The documentation for this class was generated from the following files:

  • NEGEMM.h
  • NEGEMM.cpp