Compute Library
 19.08
NEGEMMMatrixMultiplyKernel Class Reference

NEON kernel to multiply two input matrices "A" and "B".

#include <NEGEMMMatrixMultiplyKernel.h>

Public Member Functions

const char * name () const override
 Name of the kernel.
 
 NEGEMMMatrixMultiplyKernel ()
 Constructor.
 
 NEGEMMMatrixMultiplyKernel (const NEGEMMMatrixMultiplyKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
NEGEMMMatrixMultiplyKernel & operator= (const NEGEMMMatrixMultiplyKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 NEGEMMMatrixMultiplyKernel (NEGEMMMatrixMultiplyKernel &&)=default
 Allow instances of this class to be moved.
 
NEGEMMMatrixMultiplyKernel & operator= (NEGEMMMatrixMultiplyKernel &&)=default
 Allow instances of this class to be moved.
 
void configure (const ITensor *input0, const ITensor *input1, ITensor *output, float alpha, bool is_interleaved, const GEMMReshapeInfo &reshape_info=GEMMReshapeInfo())
 Initialise the kernel's input and output.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input0, const ITensorInfo *input1, const ITensorInfo *output, float alpha, bool is_interleaved, const GEMMReshapeInfo &reshape_info)
 Static function to check if given info will lead to a valid configuration of NEGEMMMatrixMultiplyKernel.
 

Detailed Description

NEON kernel to multiply two input matrices "A" and "B".

All elements of the output matrix/vector are multiplied by alpha after the matrix multiplication.

Note
If the output tensor is a matrix, the implementation assumes that the input tensors input0 and input1 are both matrices that have been reshaped with NEGEMMInterleave4x4Kernel and NEGEMMTranspose1xWKernel respectively.
If the output tensor is a vector and the data type is F32, the implementation assumes that the first input tensor input0 is a vector and the second input tensor input1 is a matrix. The implementation also assumes that neither tensor has been reshaped.
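
As a point of reference, the arithmetic described above (output = alpha * A * B) can be sketched in scalar C++. This is not the library's NEON implementation: the function name reference_gemm and the row-major std::vector<float> layout are assumptions made purely for illustration, and no tensor reshaping is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference: computes out = alpha * (A * B).
// A is m x k and B is k x n, both stored row-major in flat vectors.
std::vector<float> reference_gemm(const std::vector<float> &a,
                                  const std::vector<float> &b,
                                  std::size_t m, std::size_t k, std::size_t n,
                                  float alpha)
{
    assert(a.size() == m * k);
    assert(b.size() == k * n);

    std::vector<float> out(m * n, 0.0f);
    for(std::size_t row = 0; row < m; ++row)
    {
        for(std::size_t col = 0; col < n; ++col)
        {
            float acc = 0.0f;
            for(std::size_t i = 0; i < k; ++i)
            {
                acc += a[row * k + i] * b[i * n + col];
            }
            // Alpha is applied once per output element, after the accumulation.
            out[row * n + col] = alpha * acc;
        }
    }
    return out;
}
```

Calling the sketch with m = 1 corresponds to the vector-by-matrix case mentioned in the note.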

Definition at line 39 of file NEGEMMMatrixMultiplyKernel.h.

Constructor & Destructor Documentation

◆ NEGEMMMatrixMultiplyKernel() [1/3]

Constructor.

Definition at line 957 of file NEGEMMMatrixMultiplyKernel.cpp.

NEGEMMMatrixMultiplyKernel::NEGEMMMatrixMultiplyKernel()
    : _input0(nullptr), _input1(nullptr), _output(nullptr), _alpha(1.0f)
{
}

◆ NEGEMMMatrixMultiplyKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMMMatrixMultiplyKernel() [3/3]

Allow instances of this class to be moved.
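
The copy/move policy listed above can be illustrated with a small stand-alone class. PointerHoldingKernel is a hypothetical stand-in, not a library type; it only mirrors the pattern of deleting copy operations and defaulting move operations for a class that stores non-owning pointers.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a kernel that stores a non-owning pointer.
class PointerHoldingKernel
{
public:
    PointerHoldingKernel() = default;
    // Copying is disabled, as in the class documented here.
    PointerHoldingKernel(const PointerHoldingKernel &) = delete;
    PointerHoldingKernel &operator=(const PointerHoldingKernel &) = delete;
    // Moving is allowed; a defaulted move simply copies the raw pointer value.
    PointerHoldingKernel(PointerHoldingKernel &&) = default;
    PointerHoldingKernel &operator=(PointerHoldingKernel &&) = default;

    void set_input(const float *input) { _input = input; }
    const float *input() const { return _input; }

private:
    const float *_input{ nullptr };
};

// Compile-time checks of the policy.
static_assert(!std::is_copy_constructible<PointerHoldingKernel>::value, "copy construction is disabled");
static_assert(!std::is_copy_assignable<PointerHoldingKernel>::value, "copy assignment is disabled");
static_assert(std::is_move_constructible<PointerHoldingKernel>::value, "move construction is allowed");
static_assert(std::is_move_assignable<PointerHoldingKernel>::value, "move assignment is allowed");
```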

Member Function Documentation

◆ configure()

void configure ( const ITensor * input0,
const ITensor * input1,
ITensor * output,
float alpha,
bool is_interleaved,
const GEMMReshapeInfo & reshape_info = GEMMReshapeInfo()
)

Initialise the kernel's input and output.

Note
If the output tensor is a matrix, the input matrices input0 and input1 should be the outputs of NEGEMMInterleave4x4Kernel and NEGEMMTranspose1xWKernel respectively. These two kernels change the layout of the original matrices to be more cache-friendly.
Parameters
[in] input0: Input tensor containing the interleaved Matrix A or the vector A. Data types supported: F16/F32.
[in] input1: Input tensor containing the transposed Matrix B if the first input tensor A is not a vector. If the output tensor is a vector, input1 must contain the matrix B not reshaped. Data type supported: same as input0.
[out] output: Output tensor to store the result of matrix multiplication. Data type supported: same as input0.
[in] alpha: Weight of the matrix product.
[in] is_interleaved: (Optional) True if input0 and input1 have been reshaped using NEGEMMInterleave4x4Kernel and NEGEMMTranspose1xWKernel respectively.
[in] reshape_info: (Optional) GEMM reshape info. If is_interleaved = true, this object must contain the information needed to understand how matrix A and matrix B have been reshaped.
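
To give a feel for what "reshaped" means for input0 and input1, the sketch below models the two layouts on flat row-major buffers. The layouts shown are an assumption based on the kernel names (a 4x4 interleave, and a 1xW transpose with W = 4 for F32), not something stated on this page; the function names are hypothetical and edge handling (padding of partial blocks) is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed 4x4 interleave: values from four consecutive rows are placed
// column by column on one output row (a00 a10 a20 a30 a01 a11 ...).
// The number of rows must be a multiple of 4 in this sketch.
std::vector<float> interleave4x4_sketch(const std::vector<float> &in, std::size_t rows, std::size_t cols)
{
    assert(rows % 4 == 0 && in.size() == rows * cols);
    std::vector<float>  out(in.size());
    const std::size_t   out_width = cols * 4;
    for(std::size_t r = 0; r < rows; ++r)
    {
        for(std::size_t c = 0; c < cols; ++c)
        {
            out[(r / 4) * out_width + c * 4 + (r % 4)] = in[r * cols + c];
        }
    }
    return out;
}

// Assumed 1x4 transpose: each 1x4 block of a row is kept contiguous, and the
// blocks belonging to the same column group are laid out one row after another.
// The number of columns must be a multiple of 4 in this sketch.
std::vector<float> transpose1x4_sketch(const std::vector<float> &in, std::size_t rows, std::size_t cols)
{
    assert(cols % 4 == 0 && in.size() == rows * cols);
    std::vector<float>  out(in.size());
    const std::size_t   out_width = rows * 4;
    for(std::size_t r = 0; r < rows; ++r)
    {
        for(std::size_t c = 0; c < cols; ++c)
        {
            out[(c / 4) * out_width + r * 4 + (c % 4)] = in[r * cols + c];
        }
    }
    return out;
}
```

With both inputs laid out this way, the values a multiply kernel needs for one output block sit next to each other in memory, which is the cache-friendliness the note refers to.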

Definition at line 962 of file NEGEMMMatrixMultiplyKernel.cpp.

void NEGEMMMatrixMultiplyKernel::configure(const ITensor *input0, const ITensor *input1, ITensor *output, float alpha, bool is_interleaved, const GEMMReshapeInfo &reshape_info)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input0, input1, output);

    // Output tensor auto initialization if not yet initialized
    TensorShape tensor_shape{ input0->info()->tensor_shape() };
    tensor_shape.set(0, is_interleaved ? reshape_info.n() : input1->info()->dimension(0));
    tensor_shape.set(1, is_interleaved ? reshape_info.m() : input0->info()->dimension(1));

    auto_init_if_empty(*output->info(), input0->info()->clone()->set_tensor_shape(tensor_shape));

    // Perform validate step
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input0->info(), input1->info(), output->info(), alpha, is_interleaved, reshape_info));

    _input0 = input0;
    _input1 = input1;
    _output = output;
    _alpha  = alpha;

    // Configure kernel window
    auto win_config = validate_and_configure_window(input0->info(), input1->info(), output->info());
    ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
    INEKernel::configure(win_config.second);
}
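
The output-shape rule used in the listing above can be isolated in a few lines. The sketch below is stand-alone and does not use the library's TensorShape or GEMMReshapeInfo; ReshapeInfoSketch and infer_output_shape are hypothetical names that only mirror the two tensor_shape.set() calls.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for the m() and n() accessors of the reshape info.
struct ReshapeInfoSketch
{
    int m; // number of matrix A rows
    int n; // number of matrix B columns
};

// Returns {dimension 0, dimension 1} of the output, i.e. {width, height}.
// When the inputs were reshaped, their own dimensions no longer describe the
// original matrices, so the sizes come from the reshape info instead.
std::pair<std::size_t, std::size_t> infer_output_shape(std::size_t input0_dim1,
                                                       std::size_t input1_dim0,
                                                       bool is_interleaved,
                                                       const ReshapeInfoSketch &reshape_info)
{
    const std::size_t width  = is_interleaved ? static_cast<std::size_t>(reshape_info.n) : input1_dim0;
    const std::size_t height = is_interleaved ? static_cast<std::size_t>(reshape_info.m) : input0_dim1;
    return { width, height };
}
```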

References arm_compute::test::validation::alpha, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), ICloneable< T >::clone(), ITensorInfo::dimension(), ITensor::info(), GEMMReshapeInfo::m(), GEMMReshapeInfo::n(), TensorShape::set(), ITensorInfo::tensor_shape(), and arm_compute::validate_and_configure_window().

Referenced by NEGEMM::configure().

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 42 of file NEGEMMMatrixMultiplyKernel.h.

    {
        return "NEGEMMMatrixMultiplyKernel";
    }

◆ operator=() [1/2]

NEGEMMMatrixMultiplyKernel& operator= ( const NEGEMMMatrixMultiplyKernel & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window,
const ThreadInfo & info
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window().
Note
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in] window: Region on which to execute the kernel. (Must be a region of the window returned by window().)
[in] info: Info about executing thread and CPU.

Implements ICPPKernel.

Definition at line 996 of file NEGEMMMatrixMultiplyKernel.cpp.

void NEGEMMMatrixMultiplyKernel::run(const Window &window, const ThreadInfo &info)
{

    const bool multiply_alpha = !(helpers::float_ops::is_one(_alpha));

    // Check if the output tensor is a vector. If so, the kernel runs the vector-matrix multiplication
    if((_output->info()->dimension(1) == 1))
    {
        switch(_input0->info()->data_type())
        {
            case DataType::F32:
            {
                multiply_alpha ? vector_matrix_multiply_f32<true>(_input0, _input1, _output, window, info, _alpha) :
                                 vector_matrix_multiply_f32<false>(_input0, _input1, _output, window, info, _alpha);
                break;
            }
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
            case DataType::F16:
            {
                multiply_alpha ? vector_matrix_multiply_f16<true>(_input0, _input1, _output, window, info, _alpha) :
                                 vector_matrix_multiply_f16<false>(_input0, _input1, _output, window, info, _alpha);
                break;
            }
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
            default:
            {
                ARM_COMPUTE_ERROR("Data type not supported");
                break;
            }
        }
    }
    else
    {
        switch(_input0->info()->data_type())
        {
            case DataType::F32:
            {
                multiply_alpha ? matrix_matrix_multiply_f32<true>(_input0, _input1, _output, window, _alpha) :
                                 matrix_matrix_multiply_f32<false>(_input0, _input1, _output, window, _alpha);
                break;
            }
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
            case DataType::F16:
            {
                multiply_alpha ? matrix_matrix_multiply_f16<true>(_input0, _input1, _output, window, _alpha) :
                                 matrix_matrix_multiply_f16<false>(_input0, _input1, _output, window, _alpha);
                break;
            }
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
            default:
            {
                ARM_COMPUTE_ERROR("Data type not supported");
                break;
            }
        }
    }
}
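
The listing above decides once, at run time, whether alpha differs from 1 and then picks a template instantiation, so the inner loops carry no per-element branch. A minimal stand-alone sketch of that pattern follows. The names is_one_sketch, scale_row and finalize_row are hypothetical, and the tolerance comparison is an assumption about how a check such as helpers::float_ops::is_one might behave (its default epsilon of 0.00001f is reused here).

```cpp
#include <cmath>
#include <vector>

// Assumed tolerance-based comparison against 1.0f.
bool is_one_sketch(float a, float epsilon = 0.00001f)
{
    return std::fabs(a - 1.0f) <= epsilon;
}

// multiply_alpha is a compile-time constant for each instantiation, so the
// compiler can remove the multiplication entirely when it is false.
template <bool multiply_alpha>
void scale_row(std::vector<float> &row, float alpha)
{
    for(float &value : row)
    {
        if(multiply_alpha)
        {
            value *= alpha;
        }
    }
}

// Run-time decision made once per call, mirroring the ternary dispatch above.
void finalize_row(std::vector<float> &row, float alpha)
{
    const bool multiply_alpha = !is_one_sketch(alpha);
    multiply_alpha ? scale_row<true>(row, alpha) : scale_row<false>(row, alpha);
}
```

The same idea extends to the vector/matrix and F32/F16 choices: each combination maps to its own specialised routine, selected before any element is processed.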

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, ITensor::info(), arm_compute::test::validation::info, arm_compute::helpers::float_ops::is_one(), and IKernel::window().

◆ validate()

Status validate ( const ITensorInfo * input0,
const ITensorInfo * input1,
const ITensorInfo * output,
float alpha,
bool is_interleaved,
const GEMMReshapeInfo & reshape_info
)
static

Static function to check if given info will lead to a valid configuration of NEGEMMMatrixMultiplyKernel.

Parameters
[in] input0: Input tensor containing the interleaved Matrix A or the vector A. Data types supported: F16/F32.
[in] input1: Input tensor containing the transposed Matrix B if the first input tensor A is not a vector. If the output tensor is a vector, input1 must contain the matrix B not reshaped. Data type supported: same as input0.
[in] output: Output tensor to store the result of matrix multiplication. Data type supported: same as input0.
[in] alpha: Weight of the matrix product.
[in] is_interleaved: (Optional) True if input0 and input1 have been reshaped using NEGEMMInterleave4x4Kernel and NEGEMMTranspose1xWKernel respectively.
[in] reshape_info: (Optional) GEMM reshape info. If is_interleaved = true, this object must contain the information needed to understand how matrix A and matrix B have been reshaped.
Returns
a status

Definition at line 987 of file NEGEMMMatrixMultiplyKernel.cpp.

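Status NEGEMMMatrixMultiplyKernel::validate(const ITensorInfo *input0, const ITensorInfo *input1, const ITensorInfo *output, float alpha, bool is_interleaved, const GEMMReshapeInfo &reshape_info)
{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input0, input1, output, alpha, is_interleaved, reshape_info));
    ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(input0->clone().get(), input1->clone().get(), output->clone().get()).first);

    return Status{};
}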
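
The relationship between validate() and configure() follows a common pattern: a static check that works on metadata only and reports problems through a status object, and a configuring call that runs the same check and raises an error on failure. The sketch below illustrates that pattern in isolation. All types and checks in it (StatusSketch, InfoSketch, the shape rules for non-reshaped inputs) are hypothetical and are not the library's actual validation logic.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical status object: ok == true means the configuration is valid.
struct StatusSketch
{
    bool        ok;
    std::string description;
};

// Hypothetical tensor metadata: only the 2D shape is modelled.
struct InfoSketch
{
    std::size_t width;
    std::size_t height;
};

// Metadata-only check that never throws; illustrative rules for non-reshaped inputs.
StatusSketch validate_sketch(const InfoSketch &input0, const InfoSketch &input1, const InfoSketch &output)
{
    if(input0.width != input1.height)
    {
        return StatusSketch{ false, "inner dimensions do not match" };
    }
    if(output.width != input1.width || output.height != input0.height)
    {
        return StatusSketch{ false, "unexpected output shape" };
    }
    return StatusSketch{ true, "" };
}

// Configuring call: reuses the check and turns a failed status into an exception.
void configure_sketch(const InfoSketch &input0, const InfoSketch &input1, const InfoSketch &output)
{
    const StatusSketch status = validate_sketch(input0, input1, output);
    if(!status.ok)
    {
        throw std::runtime_error(status.description);
    }
}
```

Separating the two lets a caller probe whether a configuration would be accepted before committing any resources to it.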

References arm_compute::test::validation::alpha, ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::validate_and_configure_window().

Referenced by NEGEMM::validate().


The documentation for this class was generated from the following files: