Compute Library
 20.02.1
NEGEMMLowpMatrixAReductionKernel Class Reference

NEON kernel used to compute the row-vectors of sums of all the entries in each row of Matrix A.

#include <NEGEMMLowpReductionKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel.
 
void configure (const ITensor *mtx_a, ITensor *vector_sum_row, int32_t num_mtx_a_cols, bool is_interleaved4x4) override
 Initialise the kernel's input and output.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from INEGEMMLowpReductionKernel
 INEGEMMLowpReductionKernel ()
 Constructor.
 
 INEGEMMLowpReductionKernel (const INEGEMMLowpReductionKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
INEGEMMLowpReductionKernel & operator= (const INEGEMMLowpReductionKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 INEGEMMLowpReductionKernel (INEGEMMLowpReductionKernel &&)=default
 Allow instances of this class to be moved.
 
INEGEMMLowpReductionKernel & operator= (INEGEMMLowpReductionKernel &&)=default
 Allow instances of this class to be moved.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 

Static Public Member Functions

static Status validate (const ITensorInfo *mtx_a, const ITensorInfo *vector_sum_row, int32_t num_mtx_a_cols, bool is_interleaved4x4)
 Static function to check if given info will lead to a valid configuration of NEGEMMLowpMatrixAReductionKernel.
 

Detailed Description

NEON kernel used to compute the row-vectors of sums of all the entries in each row of Matrix A.

Note
This stage is needed to handle the offset of the matrix product. See https://github.com/google/gemmlowp/blob/master/doc/low-precision.md

Definition at line 69 of file NEGEMMLowpReductionKernel.h.
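
The role of the row sums can be seen by expanding the quantized product. The following is a minimal sketch in standard C++, not Compute Library code: the helpers `row_sums`, `col_sums`, `gemm_direct` and `gemm_corrected` and the zero-point names `a_offset` and `b_offset` are hypothetical, and it assumes the convention real = scale * (q - zero_point). Expanding sum_p (a[i][p] - a_offset) * (b[p][j] - b_offset) gives the raw integer product minus b_offset times the sum of row i of A, minus a_offset times the sum of column j of B, plus K * a_offset * b_offset. The sketch checks that both forms agree.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Compute Library): per-row sums of a
// row-major M x K matrix of unsigned 8-bit quantized values.
std::vector<int32_t> row_sums(const std::vector<uint8_t> &a, std::size_t m, std::size_t k)
{
    std::vector<int32_t> sums(m, 0);
    for(std::size_t i = 0; i < m; ++i)
    {
        for(std::size_t p = 0; p < k; ++p)
        {
            sums[i] += static_cast<int32_t>(a[i * k + p]);
        }
    }
    return sums;
}

// Hypothetical helper: per-column sums of a row-major K x N matrix.
std::vector<int32_t> col_sums(const std::vector<uint8_t> &b, std::size_t k, std::size_t n)
{
    std::vector<int32_t> sums(n, 0);
    for(std::size_t p = 0; p < k; ++p)
    {
        for(std::size_t j = 0; j < n; ++j)
        {
            sums[j] += static_cast<int32_t>(b[p * n + j]);
        }
    }
    return sums;
}

// Reference result: subtract the zero points before multiplying.
std::vector<int32_t> gemm_direct(const std::vector<uint8_t> &a, const std::vector<uint8_t> &b,
                                 std::size_t m, std::size_t k, std::size_t n,
                                 int32_t a_offset, int32_t b_offset)
{
    std::vector<int32_t> c(m * n, 0);
    for(std::size_t i = 0; i < m; ++i)
    {
        for(std::size_t j = 0; j < n; ++j)
        {
            for(std::size_t p = 0; p < k; ++p)
            {
                c[i * n + j] += (static_cast<int32_t>(a[i * k + p]) - a_offset)
                                * (static_cast<int32_t>(b[p * n + j]) - b_offset);
            }
        }
    }
    return c;
}

// Same result from the raw integer product plus offset corrections that
// only need the row sums of A and the column sums of B.
std::vector<int32_t> gemm_corrected(const std::vector<uint8_t> &a, const std::vector<uint8_t> &b,
                                    std::size_t m, std::size_t k, std::size_t n,
                                    int32_t a_offset, int32_t b_offset)
{
    const std::vector<int32_t> ra = row_sums(a, m, k);
    const std::vector<int32_t> cb = col_sums(b, k, n);
    std::vector<int32_t>       c(m * n, 0);
    for(std::size_t i = 0; i < m; ++i)
    {
        for(std::size_t j = 0; j < n; ++j)
        {
            int32_t acc = 0;
            for(std::size_t p = 0; p < k; ++p)
            {
                acc += static_cast<int32_t>(a[i * k + p]) * static_cast<int32_t>(b[p * n + j]);
            }
            c[i * n + j] = acc - b_offset * ra[i] - a_offset * cb[j]
                           + static_cast<int32_t>(k) * a_offset * b_offset;
        }
    }
    return c;
}
```

Because the row sums depend only on A, they can be computed once and reused for every output column, which is presumably why the reduction is a separate stage.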

Member Function Documentation

◆ configure()

void configure ( const ITensor * mtx_a,
ITensor * vector_sum_row,
int32_t num_mtx_a_cols,
bool is_interleaved4x4
)
override virtual

Initialise the kernel's input and output.

Parameters
[in]   mtx_a              Input tensor. Data type supported: QASYMM8/QASYMM8_SIGNED
[out]  vector_sum_row     Output row-vector of sums of all the entries in each row of mtx_a. Data type supported: S32
[in]   num_mtx_a_cols     Number of matrix A columns
[in]   is_interleaved4x4  True if the matrix A has been interleaved4x4

Implements INEGEMMLowpReductionKernel.

Definition at line 105 of file NEGEMMLowpReductionKernel.cpp.

{
    // Perform validate step
    ARM_COMPUTE_ERROR_ON_NULLPTR(mtx_a, vector_sum_row);
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments_matrix_a_reduction(mtx_a->info(), vector_sum_row->info()));

    _input       = mtx_a;
    _output      = vector_sum_row;
    _k           = num_mtx_a_cols;
    _is_reshaped = is_interleaved4x4;

    // Configure kernel window
    auto win_config = validate_and_configure_window_matrix_a_reduction(_input->info(), _output->info(), _is_reshaped);
    ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
    INEKernel::configure(win_config.second);
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, and ITensor::info().

Referenced by NEGEMMLowpMatrixMultiplyCore::configure().
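
To illustrate what the is_interleaved4x4 flag implies for the reduction, here is a small sketch in standard C++. It is not the kernel's implementation. It assumes that "interleaved 4x4" means blocks of four consecutive rows are stored element by element (a00 a10 a20 a30 a01 a11 a21 a31 ...), and that the number of rows is a multiple of 4. The helpers `interleave4x4` and `row_sums_interleaved4x4` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: interleave blocks of 4 rows of a row-major M x K matrix.
// Assumes M is a multiple of 4. Within a block the storage order is
// a(0,0) a(1,0) a(2,0) a(3,0) a(0,1) a(1,1) a(2,1) a(3,1) ...
std::vector<uint8_t> interleave4x4(const std::vector<uint8_t> &a, std::size_t m, std::size_t k)
{
    std::vector<uint8_t> out(m * k);
    for(std::size_t blk = 0; blk < m / 4; ++blk)
    {
        for(std::size_t p = 0; p < k; ++p)
        {
            for(std::size_t r = 0; r < 4; ++r)
            {
                out[blk * 4 * k + p * 4 + r] = a[(blk * 4 + r) * k + p];
            }
        }
    }
    return out;
}

// Hypothetical helper: row sums read directly from the interleaved buffer.
// Element r of every group of 4 belongs to row (blk * 4 + r) of the original matrix.
std::vector<int32_t> row_sums_interleaved4x4(const std::vector<uint8_t> &ai, std::size_t m, std::size_t k)
{
    std::vector<int32_t> sums(m, 0);
    for(std::size_t blk = 0; blk < m / 4; ++blk)
    {
        for(std::size_t p = 0; p < k; ++p)
        {
            for(std::size_t r = 0; r < 4; ++r)
            {
                sums[blk * 4 + r] += static_cast<int32_t>(ai[blk * 4 * k + p * 4 + r]);
            }
        }
    }
    return sums;
}
```

Under this assumed layout the four row sums of a block advance together, one group of four bytes at a time, so the reduction can keep four accumulators in a single vector register.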

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 72 of file NEGEMMLowpReductionKernel.h.

{
    return "NEGEMMLowpMatrixAReductionKernel";
}

◆ run()

void run ( const Window & window,
const ThreadInfo & info
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.

Implements ICPPKernel.

Definition at line 252 of file NEGEMMLowpReductionKernel.cpp.

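{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);

    switch(_input->info()->data_type())
    {
        case DataType::QASYMM8:
            run_internal<uint8_t>(window);
            break;
        case DataType::QASYMM8_SIGNED:
        case DataType::QSYMM8_PER_CHANNEL:
            run_internal<int8_t>(window);
            break;
        default:
            ARM_COMPUTE_ERROR("Unsupported data type");
    }
}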

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::test::validation::info, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, and IKernel::window().
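
The dispatch in run() selects a typed implementation from a runtime data type. The pattern can be sketched in standard C++ as follows. This is an illustration only: `QuantType`, `reduce_rows` and `reduce_rows_dispatch` are hypothetical stand-ins, not library symbols, and the sketch throws std::runtime_error for unsupported types.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the library's data type enumeration.
enum class QuantType
{
    Unsigned8,
    Signed8,
    Other
};

// Typed reduction: interpret the raw bytes as T and accumulate each row of a
// row-major M x K matrix into a 32-bit sum.
template <typename T>
std::vector<int32_t> reduce_rows(const void *data, std::size_t m, std::size_t k)
{
    const T             *ptr = static_cast<const T *>(data);
    std::vector<int32_t> sums(m, 0);
    for(std::size_t i = 0; i < m; ++i)
    {
        for(std::size_t p = 0; p < k; ++p)
        {
            sums[i] += static_cast<int32_t>(ptr[i * k + p]);
        }
    }
    return sums;
}

// Runtime dispatch: choose the element type once, outside the inner loops.
std::vector<int32_t> reduce_rows_dispatch(QuantType type, const void *data, std::size_t m, std::size_t k)
{
    switch(type)
    {
        case QuantType::Unsigned8:
            return reduce_rows<uint8_t>(data, m, k);
        case QuantType::Signed8:
            return reduce_rows<int8_t>(data, m, k);
        default:
            throw std::runtime_error("Unsupported data type");
    }
}
```

The same bytes give different sums depending on the element type, which is why the signedness has to be resolved before accumulating.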

◆ validate()

Status validate ( const ITensorInfo * mtx_a,
const ITensorInfo * vector_sum_row,
int32_t num_mtx_a_cols,
bool is_interleaved4x4
)
static

Static function to check if given info will lead to a valid configuration of NEGEMMLowpMatrixAReductionKernel.

Parameters
[in]  mtx_a              Input tensor. Data type supported: QASYMM8/QASYMM8_SIGNED
[in]  vector_sum_row     Output row-vector of sums of all the entries in each row of mtx_a. Data type supported: S32
[in]  num_mtx_a_cols     Number of matrix A columns
[in]  is_interleaved4x4  True if the matrix A has been interleaved4x4
Returns
a status

Definition at line 122 of file NEGEMMLowpReductionKernel.cpp.

{
    ARM_COMPUTE_UNUSED(num_mtx_a_cols);
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments_matrix_a_reduction(mtx_a, vector_sum_row));
    ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window_matrix_a_reduction(mtx_a->clone().get(), vector_sum_row->clone().get(), is_interleaved4x4).first);

    return Status{};
}

References ARM_COMPUTE_RETURN_ON_ERROR, ARM_COMPUTE_UNUSED, and ICloneable< T >::clone().

Referenced by NEGEMMLowpMatrixMultiplyCore::validate().
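
The idea of a static validate step that inspects only tensor metadata can be sketched as below. The types here are simplified, hypothetical stand-ins (`ElemType`, `TensorDesc`, `CheckResult`), not the library's ITensorInfo or Status. The data type checks follow the parameter list above. The shape check (one output element per row of mtx_a) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-ins; not the library's ITensorInfo or Status.
enum class ElemType
{
    QASYMM8,
    QASYMM8_SIGNED,
    S32,
    F32
};

struct TensorDesc
{
    ElemType    type;
    std::size_t width;  // number of columns
    std::size_t height; // number of rows
};

struct CheckResult
{
    bool        ok;
    std::string message;
};

// Checks metadata only; no tensor memory is touched, so this can run before
// any buffers are allocated.
CheckResult validate_row_reduction(const TensorDesc &mtx_a, const TensorDesc &vector_sum_row)
{
    if(mtx_a.type != ElemType::QASYMM8 && mtx_a.type != ElemType::QASYMM8_SIGNED)
    {
        return { false, "mtx_a must be QASYMM8 or QASYMM8_SIGNED" };
    }
    if(vector_sum_row.type != ElemType::S32)
    {
        return { false, "vector_sum_row must be S32" };
    }
    // Assumption for this sketch: one output element per row of mtx_a.
    if(vector_sum_row.width != mtx_a.height)
    {
        return { false, "vector_sum_row must hold one sum per row of mtx_a" };
    }
    return { true, "" };
}
```

A caller would typically run such a check first and only configure the kernel when it reports success.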


The documentation for this class was generated from the following files: NEGEMMLowpReductionKernel.h and NEGEMMLowpReductionKernel.cpp