Compute Library
 23.11
CpuGemmAssemblyDispatch Class Reference

Assembly kernel glue. More...

#include <CpuGemmAssemblyDispatch.h>

Collaboration diagram for CpuGemmAssemblyDispatch (diagram not reproduced)

Data Structures

class  IFallback
 

Public Member Functions

 CpuGemmAssemblyDispatch ()
 Constructor. More...
 
 ~CpuGemmAssemblyDispatch ()=default
 Default destructor. More...
 
 ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuGemmAssemblyDispatch)
 
void configure (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, ITensorInfo *d, const AsmGemmInfo &info)
 If supported, create a Compute Library function, otherwise fall back to the arm_gemm function. More...
 
bool is_configured () const
 Was the function successfully configured? More...
 
bool isVarWeightsKernel () const
 Indicates if the convolution executes in variable weights mode. More...
 
void prepare (ITensorPack &tensors) override
 Prepare the function for executing. More...
 
void run (ITensorPack &tensors) override
 Run the kernels contained in the function. More...
 
experimental::MemoryRequirements workspace () const override
 Return the memory requirements required by the workspace. More...
 
- Public Member Functions inherited from INEOperator
 INEOperator (IRuntimeContext *ctx=nullptr)
 Constructor. More...
 
 INEOperator (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 INEOperator (INEOperator &&)=default
 Default move constructor. More...
 
INEOperator & operator= (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
INEOperator & operator= (INEOperator &&)=default
 Default move assignment operator. More...
 
 ~INEOperator ()
 Default destructor. More...
 
- Public Member Functions inherited from IOperator
virtual ~IOperator ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *d, const AsmGemmInfo &info)
 Indicates whether or not this function can be used to process the given parameters. More...
 
static Status has_opt_impl (arm_compute::WeightFormat &weight_format, const ITensorInfo *a, const ITensorInfo *b, const ITensorInfo *c, const ITensorInfo *d, const AsmGemmInfo &info)
 Indicates whether or not there is an optimal assembly implementation that can be used to process the given parameters. More...
 
static bool is_activation_supported (const ActivationLayerInfo &activation)
 Checks if activation is supported by the gemm assembly dispatcher. More...
 

Detailed Description

Assembly kernel glue.

Definition at line 70 of file CpuGemmAssemblyDispatch.h.
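
A minimal end-to-end sketch of how this operator is typically driven: configure first, check is_configured(), then feed tensors through an ITensorPack. This class is an internal (non-public) operator in the arm_compute::cpu namespace; the include path and the ACL_SRC_0/ACL_SRC_1/ACL_DST slot assignments below are assumptions based on the conventions used by CpuGemm, not guarantees of this class's contract.

#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/ITensorPack.h"
#include "src/cpu/operators/internal/CpuGemmAssemblyDispatch.h" // internal header; path is an assumption

using namespace arm_compute;
using namespace arm_compute::cpu;

void run_f32_gemm(const ITensor *a, const ITensor *b, ITensor *d)
{
    // Plain 2D gemm: a: [K=3, M=4], b: [N=5, K=3], d: [N=5, M=4]
    TensorInfo a_info(TensorShape(3U, 4U), 1, DataType::F32);
    TensorInfo b_info(TensorShape(5U, 3U), 1, DataType::F32);
    TensorInfo d_info(TensorShape(5U, 4U), 1, DataType::F32);

    AsmGemmInfo asm_info{}; // default GEMM meta-data

    CpuGemmAssemblyDispatch gemm;
    gemm.configure(&a_info, &b_info, nullptr, &d_info, asm_info);
    if (!gemm.is_configured())
    {
        return; // no assembly kernel for this combination: fall back to another path
    }

    ITensorPack pack;
    pack.add_const_tensor(TensorType::ACL_SRC_0, a);
    pack.add_const_tensor(TensorType::ACL_SRC_1, b);
    pack.add_tensor(TensorType::ACL_DST, d);

    gemm.prepare(pack); // one-off pre-processing (e.g. reshaping b)
    gemm.run(pack);     // can be called repeatedly once prepared
}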

Constructor & Destructor Documentation

◆ CpuGemmAssemblyDispatch()

Constructor.

Definition at line 822 of file CpuGemmAssemblyDispatch.cpp.

CpuGemmAssemblyDispatch::CpuGemmAssemblyDispatch() : _arm_gemm(nullptr)
{
}

◆ ~CpuGemmAssemblyDispatch()

Default destructor.

Member Function Documentation

◆ ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE()

ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE ( CpuGemmAssemblyDispatch  )

◆ configure()

void configure ( const ITensorInfo *  a,
                 const ITensorInfo *  b,
                 const ITensorInfo *  c,
                 ITensorInfo *        d,
                 const AsmGemmInfo &  info
               )

If supported, create a Compute Library function, otherwise fall back to the arm_gemm function.

Note
Configuring "batches": the shapes of a, b and d are arranged as follows (lowest dimension <-> highest dimension):

a: [K, M, Batch, Multi]
b: [N, K, Multi]
d: [N, M, Batch, Multi]

"Batch" means that "Batch" MxK slices of tensor a are each multiplied with a single KxN slice of b. "Multi" means "Multi" independent multiplications of a with b.

E.g. the following are some example input shape configurations:

(1) Normal 2D gemm: a: [K=3, M=4] b: [N=5, K=3] d: [N=5, M=4]

(2) Batches of a sharing b (e.g. gemm-based batched convolution where b is the shared weights): a: [K=3, M=4, Batch=9] b: [N=5, K=3] d: [N=5, M=4, Batch=9]

(3) "Batches" of independent gemm (e.g. batched matmul): a: [K=3, M=4, Batch=1, Multi=7] b: [N=5, K=3, Multi=7] d: [N=5, M=4, Batch=1, Multi=7]

(4) "Batches" of independent gemm where b is also shared: a: [K=3, M=4, Batch=4, Multi=7] b: [N=5, K=3, Multi=7] d: [N=5, M=4, Batch=4, Multi=7]

Parameters
[in]   a     Input tensor (Matrix A)
[in]   b     Input tensor (Matrix B)
[in]   c     Input tensor (Matrix C) used to pass the bias for quantized calculations
[out]  d     Output tensor to store the result of matrix multiplication. Data type supported: same as input0.
[in]   info  GEMM meta-data
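
For instance, the batched-matmul layout of example (3) maps onto TensorShape with the lowest dimension first; a minimal sketch:

// "Batches" of independent gemm, example (3): Multi=7 independent gemms, Batch=1
TensorInfo a_info(TensorShape(3U, 4U, 1U, 7U), 1, DataType::F32); // a: [K=3, M=4, Batch=1, Multi=7]
TensorInfo b_info(TensorShape(5U, 3U, 7U), 1, DataType::F32);     // b: [N=5, K=3, Multi=7]
TensorInfo d_info(TensorShape(5U, 4U, 1U, 7U), 1, DataType::F32); // d: [N=5, M=4, Batch=1, Multi=7]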

Definition at line 974 of file CpuGemmAssemblyDispatch.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(a, b, d);
    arm_gemm::Activation act = assembly_utils::map_to_arm_gemm_activation(info.activation_info);

    //If we don't support a combination of data types, silently return: it is the caller's responsibility to check if configure() was successful via is_configured()
    if (!CpuGemmAssemblyDispatch::validate(a, b, c, d, info))
    {
        return;
    }

    switch (a->data_type())
    {
        case DataType::F32:
            create_arm_gemm<float, float>(_arm_gemm, a, b, c, d, act, info);
            break;
#ifdef __aarch64__
        case DataType::U8:
        case DataType::QASYMM8:
            if (d->data_type() == DataType::S32)
            {
                create_arm_gemm<uint8_t, uint32_t>(_arm_gemm, a, b, c, d, act, info);
            }
            else
            {
                create_arm_gemm_quant<uint8_t, uint8_t>(_arm_gemm, a, b, c, d, act, info);
            }
            break;
        case DataType::S8:
        case DataType::QASYMM8_SIGNED:
            if (d->data_type() == DataType::S32)
            {
                create_arm_gemm<int8_t, int32_t>(_arm_gemm, a, b, c, d, act, info);
            }
            else
            {
                create_arm_gemm_quant<int8_t, int8_t>(_arm_gemm, a, b, c, d, act, info);
            }
            break;
#endif /* __aarch64__ */
#if defined(ARM_COMPUTE_ENABLE_BF16)
        case DataType::BFLOAT16:
            create_arm_gemm<bfloat16, float>(_arm_gemm, a, b, c, d, act, info);
            break;
#endif /* defined(ARM_COMPUTE_ENABLE_BF16) */
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
        case DataType::F16:
            create_arm_gemm<float16_t, float16_t>(_arm_gemm, a, b, c, d, act, info);
            break;
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
        default:
            break;
    }
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, arm_compute::test::validation::b, arm_compute::BFLOAT16, ITensorInfo::data_type(), arm_compute::F16, arm_compute::F32, arm_compute::test::validation::info, arm_compute::assembly_utils::map_to_arm_gemm_activation(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::S32, arm_compute::S8, arm_compute::U8, and CpuGemmAssemblyDispatch::validate().

◆ has_opt_impl()

Status has_opt_impl ( arm_compute::WeightFormat &  weight_format,
                      const ITensorInfo *          a,
                      const ITensorInfo *          b,
                      const ITensorInfo *          c,
                      const ITensorInfo *          d,
                      const AsmGemmInfo &          info
                    )
static

Indicates whether or not there is an optimal assembly implementation that can be used to process the given parameters.

This method has the same use as NEGEMMConvolutionLayer::has_opt_impl, with the only caveat that the value of arm_compute::WeightFormat needs to be passed via the parameter info.

Returns
a status.
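
A sketch of querying for a fixed-format kernel before reordering weights. The AsmGemmInfo::weight_format field name is an assumption taken from the internal header; the flow mirrors the fixed-format convolution use case:

AsmGemmInfo asm_info{};
asm_info.weight_format = arm_compute::WeightFormat::ANY; // accept any fixed-format kernel

arm_compute::WeightFormat selected_wf = arm_compute::WeightFormat::UNSPECIFIED;
Status status = CpuGemmAssemblyDispatch::has_opt_impl(selected_wf, &a_info, &b_info, nullptr, &d_info, asm_info);
if (bool(status))
{
    // selected_wf now holds the weight layout the chosen kernel expects;
    // reorder the b tensor into that format before configure()/run().
}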

Definition at line 826 of file CpuGemmAssemblyDispatch.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(a, b, d);
    ARM_COMPUTE_UNUSED(c);
    arm_gemm::Activation act = assembly_utils::map_to_arm_gemm_activation(info.activation_info);
    Params               p   = extract_parameters(a, b, d, info);
    const CPUInfo       &ci  = NEScheduler::get().cpu_info();
    unsigned int         num_threads = NEScheduler::get().num_threads();
    arm_gemm::GemmConfig cfg;
    cfg.weight_format                           = assembly_utils::map_to_arm_gemm_weight_format(info.weight_format);
    arm_gemm::WeightFormat arm_gemm_expected_wf = assembly_utils::map_to_arm_gemm_weight_format(expected_weight_format);
    arm_gemm::GemmArgs     args(&ci, p.M, p.N, p.K, p.sections, p.batches, p.multis, p.indirect, act, num_threads,
                                info.fixed_format, info.fast_mode, &cfg);
    // TODO: Incorporate info.transpose_b COMPMID-6595
    switch (a->data_type())
    {
        case DataType::F32:
            ARM_COMPUTE_RETURN_ERROR_ON_MSG(
                !(arm_gemm::has_opt_gemm<float, float, arm_gemm::Nothing>(arm_gemm_expected_wf, args, {})),
                "We could not find an optimized kernel for F32 input");
            break;
#ifdef __aarch64__
        case DataType::U8:
        case DataType::QASYMM8:
            if (d->data_type() == DataType::S32)
            {
                ARM_COMPUTE_RETURN_ERROR_ON_MSG(
                    !(arm_gemm::has_opt_gemm<uint8_t, uint32_t, arm_gemm::Nothing>(arm_gemm_expected_wf, args, {})),
                    "We could not find an optimized kernel for U8/QASYMM8 input and U32 output");
            }
            else
            {
                ARM_COMPUTE_RETURN_ERROR_ON_MSG(
                    !(arm_gemm::has_opt_gemm<uint8_t, uint8_t, arm_gemm::Requantize32>(arm_gemm_expected_wf, args, {})),
                    "We could not find an optimized kernel for U8 input and U8 output");
            }
            break;
        case DataType::S8:
        case DataType::QASYMM8_SIGNED:
            if (d->data_type() == DataType::S32)
            {
                ARM_COMPUTE_RETURN_ERROR_ON_MSG(
                    !(arm_gemm::has_opt_gemm<int8_t, int32_t, arm_gemm::Nothing>(arm_gemm_expected_wf, args, {})),
                    "We could not find an optimized kernel for S8/QASYMM8_SIGNED input and S32 output");
            }
            else
            {
                ARM_COMPUTE_RETURN_ERROR_ON_MSG(
                    !(arm_gemm::has_opt_gemm<int8_t, int8_t, arm_gemm::Requantize32>(arm_gemm_expected_wf, args, {})),
                    "We could not find an optimized kernel for S8 input and S8 output");
            }
            break;
#endif /* __aarch64__ */
#if defined(ARM_COMPUTE_ENABLE_BF16)
        case DataType::BFLOAT16:
        {
            ARM_COMPUTE_RETURN_ERROR_ON_MSG(
                !(arm_gemm::has_opt_gemm<bfloat16, float, arm_gemm::Nothing>(arm_gemm_expected_wf, args, {})),
                "We could not find an optimized kernel for BFLOAT16 input and F32 output");
            break;
        }
#endif /* defined(ARM_COMPUTE_ENABLE_BF16) */
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
        case DataType::F16:
            ARM_COMPUTE_RETURN_ERROR_ON_MSG(
                !(arm_gemm::has_opt_gemm<float16_t, float16_t, arm_gemm::Nothing>(arm_gemm_expected_wf, args, {})),
                "We could not find an optimized kernel for F16 input and F16 output");
            break;
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
        default:
            ARM_COMPUTE_RETURN_ERROR_ON_MSG(true, "Unsupported type. Could not find a kernel");
            break;
    }
    expected_weight_format = assembly_utils::map_to_arm_compute_weight_format(arm_gemm_expected_wf);

    return Status{};
}

References GemmTuner::args, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_UNUSED, arm_compute::test::validation::b, arm_compute::BFLOAT16, ci, IScheduler::cpu_info(), ITensorInfo::data_type(), arm_compute::F16, arm_compute::F32, Scheduler::get(), arm_compute::test::validation::info, arm_compute::assembly_utils::map_to_arm_compute_weight_format(), arm_compute::assembly_utils::map_to_arm_gemm_activation(), arm_compute::assembly_utils::map_to_arm_gemm_weight_format(), IScheduler::num_threads(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::S32, arm_compute::S8, arm_compute::U8, and GemmConfig::weight_format.

Referenced by CpuGemmAssemblyDispatch::validate().

◆ is_activation_supported()

bool is_activation_supported ( const ActivationLayerInfo &  activation )
static

Checks if activation is supported by the gemm assembly dispatcher.

Parameters
[in]  activation  Activation to check

Returns
True if the activation is supported, false otherwise
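
For example, to decide whether an activation can be fused into the assembly gemm rather than run as a separate layer; a minimal sketch:

const ActivationLayerInfo act(ActivationLayerInfo::ActivationFunction::RELU);
if (CpuGemmAssemblyDispatch::is_activation_supported(act))
{
    // fuse: pass the activation through the AsmGemmInfo meta-data
}
else
{
    // run the activation as a separate operator after the gemm
}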

Definition at line 968 of file CpuGemmAssemblyDispatch.cpp.

References arm_compute::assembly_utils::map_to_arm_gemm_activation(), Activation::None, and Activation::type.

Referenced by CpuGemmLowpMatrixMultiplyCore::configure().

◆ is_configured()

bool is_configured ( ) const

Was the function successfully configured?

Returns
True if the function is configured and ready to run

Definition at line 1036 of file CpuGemmAssemblyDispatch.cpp.

{
    return _arm_gemm && _arm_gemm->is_configured();
}

◆ isVarWeightsKernel()

bool isVarWeightsKernel ( ) const
inline

Indicates if the convolution executes in variable weights mode.

Similar to CpuGemm::isVarWeightsKernel

Definition at line 182 of file CpuGemmAssemblyDispatch.h.

{
    return _arm_gemm && _arm_gemm->isVarWeightsKernel();
}

◆ prepare()

void prepare ( ITensorPack &  constants )
override virtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Parameters
[in]  constants  Vector that contains the constant tensors.

Note
The prepare stage might not need all the function's buffers' backing memory to be available in order to execute.
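
Typical pattern, as a sketch: prepare() is called once while the constant b tensor is valid, after which run() can be invoked repeatedly. update_inputs() is a hypothetical helper that points the pack at fresh input/output buffers:

gemm.prepare(pack); // one-off work on the constants (e.g. pretransposing b)
for (int i = 0; i < num_iterations; ++i)
{
    update_inputs(pack, i); // hypothetical: rebind ACL_SRC_0/ACL_DST for this iteration
    gemm.run(pack);
}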

Reimplemented from INEOperator.

Definition at line 1030 of file CpuGemmAssemblyDispatch.cpp.

{
    ARM_COMPUTE_ERROR_ON(_arm_gemm == nullptr);
    _arm_gemm->prepare(tensors);
}

References ARM_COMPUTE_ERROR_ON.

◆ run()

void run ( ITensorPack &  tensors )
override virtual

Run the kernels contained in the function.

Parameters
[in]  tensors  Vector that contains the tensors to operate on.

Reimplemented from INEOperator.

Definition at line 1041 of file CpuGemmAssemblyDispatch.cpp.

{
    ARM_COMPUTE_ERROR_ON(_arm_gemm == nullptr);
    _arm_gemm->run(tensors);
}

References ARM_COMPUTE_ERROR_ON.

◆ validate()

Status validate ( const ITensorInfo *  a,
                  const ITensorInfo *  b,
                  const ITensorInfo *  c,
                  const ITensorInfo *  d,
                  const AsmGemmInfo &  info
                )
static

Indicates whether or not this function can be used to process the given parameters.

Parameters
[in]  a     Input tensor info (Matrix A)
[in]  b     Input tensor info (Matrix B)
[in]  c     Input tensor info (Matrix C) used to pass the bias for quantized calculations
[in]  d     Output tensor info to store the result of matrix multiplication. Data type supported: same as input0.
[in]  info  GEMM meta-data

Returns
a status.
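
Since configure() fails silently, a caller can check support up front; a minimal sketch:

const Status status = CpuGemmAssemblyDispatch::validate(&a_info, &b_info, nullptr, &d_info, asm_info);
if (!bool(status))
{
    // status.error_description() explains the rejection, e.g. an unsupported
    // data-type combination; dispatch to a non-assembly gemm path instead.
}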

Definition at line 909 of file CpuGemmAssemblyDispatch.cpp.

{
    ARM_COMPUTE_UNUSED(c, info);
    ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(a, b, d);
    ARM_COMPUTE_RETURN_ERROR_ON_CPU_F16_UNSUPPORTED(a);
    ARM_COMPUTE_RETURN_ERROR_ON_CPU_BF16_UNSUPPORTED(a);
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(!(info.reshape_b_only_on_first_run),
                                    "Assembly kernel will not be executed when reshape_b_only_on_first_run is false");

#ifndef __aarch64__
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->element_size() == 1, "8bit integer types only supported for aarch64");
#endif /* __aarch64__ */
    ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(a, 1, DataType::U8, DataType::QASYMM8,
                                                         DataType::QASYMM8_SIGNED, DataType::S8, DataType::BFLOAT16,
                                                         DataType::F16, DataType::F32);
    ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(b, 1, DataType::U8, DataType::QASYMM8,
                                                         DataType::QASYMM8_SIGNED, DataType::QSYMM8_PER_CHANNEL,
                                                         DataType::S8, DataType::BFLOAT16, DataType::F16, DataType::F32);
    if (is_data_type_quantized_per_channel(b->data_type()))
    {
        ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN(a, 1, DataType::QASYMM8_SIGNED, DataType::S8);
    }
    else if (is_fixed_format_fast_math(info.weight_format))
    {
        ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN(a, 1, DataType::F32);
        ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN(b, 1, DataType::BFLOAT16);
    }
    else
    {
        ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(a, b);
    }
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->data_type() == DataType::F32 && d->data_type() != DataType::F32,
                                    "Only F32 output supported for F32 input");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->data_type() == DataType::F16 && d->data_type() != DataType::F16,
                                    "Only F16 output supported for F16 input");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->data_type() == DataType::BFLOAT16 && d->data_type() != DataType::F32,
                                    "Only F32 output supported for BFLOAT16 input");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->data_type() == DataType::U8 && d->data_type() != DataType::U32,
                                    "Only U32 output supported for U8 input");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->data_type() == DataType::S8 && d->data_type() != DataType::S32,
                                    "Only S32 output supported for S8 input");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(a->data_type() == DataType::QASYMM8 &&
                                        (d->data_type() != DataType::QASYMM8 && d->data_type() != DataType::S32),
                                    "Only QASYMM8/S32 output supported for QASYMM8 input");
    arm_compute::WeightFormat expected_weight_format = arm_compute::WeightFormat::UNSPECIFIED;
    const Status ret = CpuGemmAssemblyDispatch::has_opt_impl(expected_weight_format, a, b, c, d, info);
    if ((bool)ret && expected_weight_format != arm_compute::WeightFormat::ANY)
    {
        // Correctness check: if the format expected by the kernel is
        // not "any", make sure that the one found matches the format
        // intended by the caller.
        ARM_COMPUTE_RETURN_ERROR_ON_MSG((expected_weight_format != info.weight_format),
                                        "The format expected by the kernel does not correspond with the one requested by the user.");
    }
    return ret;
}

References arm_compute::ANY, ARM_COMPUTE_RETURN_ERROR_ON_CPU_BF16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_CPU_F16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_UNUSED, arm_compute::test::validation::b, arm_compute::BFLOAT16, ITensorInfo::data_type(), ITensorInfo::element_size(), arm_compute::F16, arm_compute::F32, CpuGemmAssemblyDispatch::has_opt_impl(), arm_compute::test::validation::info, arm_compute::is_data_type_quantized_per_channel(), arm_compute::is_fixed_format_fast_math(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, arm_compute::S32, arm_compute::S8, arm_compute::U32, arm_compute::U8, and arm_compute::UNSPECIFIED.

Referenced by CpuGemm::configure(), CpuGemmAssemblyDispatch::configure(), CpuMatMul::validate(), CpuGemmDirectConv2d::validate(), CpuGemm::validate(), and CpuGemmLowpMatrixMultiplyCore::validate().

◆ workspace()

experimental::MemoryRequirements workspace ( ) const
override virtual

Return the memory requirements required by the workspace.

Reimplemented from INEOperator.
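
The returned vector describes the auxiliary buffers the fallback needs. A sketch of consuming it, assuming the experimental::MemoryInfo members (slot, size, alignment) of this release's experimental API:

for (const experimental::MemoryInfo &req : gemm.workspace())
{
    // Allocate req.size bytes (aligned to req.alignment) and add the buffer to
    // the ITensorPack under slot req.slot before calling run().
}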

Definition at line 1047 of file CpuGemmAssemblyDispatch.cpp.

{
    ARM_COMPUTE_ERROR_ON(_arm_gemm == nullptr);
    return _arm_gemm->workspace();
}

References ARM_COMPUTE_ERROR_ON.


The documentation for this class was generated from the following files:
CpuGemmAssemblyDispatch.h
CpuGemmAssemblyDispatch.cpp