Compute Library
 22.08
CpuGemmDirectConv2d Class Reference

#include <CpuGemmDirectConv2d.h>

Collaboration diagram for CpuGemmDirectConv2d:

Public Member Functions

 CpuGemmDirectConv2d ()
 
 ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuGemmDirectConv2d)
 
 ~CpuGemmDirectConv2d ()
 
void configure (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, ITensorInfo *dst, const Conv2dInfo &info)
 Set the input and output tensors. More...
 
void run (ITensorPack &tensors) override
 Run the kernels contained in the function. More...
 
void prepare (ITensorPack &constants) override
 Prepare the function for executing. More...
 
experimental::MemoryRequirements workspace () const override
 Return the memory requirements required by the workspace. More...
 
- Public Member Functions inherited from INEOperator
 INEOperator (IRuntimeContext *ctx=nullptr)
 Constructor. More...
 
 INEOperator (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 INEOperator (INEOperator &&)=default
 Default move constructor. More...
 
INEOperator & operator= (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
INEOperator & operator= (INEOperator &&)=default
 Default move assignment operator. More...
 
 ~INEOperator ()
 Default destructor. More...
 
- Public Member Functions inherited from IOperator
virtual ~IOperator ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, const Conv2dInfo &info)
 Static function to check if given info will lead to a valid configuration of CpuGemmDirectConv2d. More...
 

Detailed Description

Definition at line 41 of file CpuGemmDirectConv2d.h.

Constructor & Destructor Documentation

◆ CpuGemmDirectConv2d()

CpuGemmDirectConv2d ( )

Definition at line 97 of file CpuGemmDirectConv2d.cpp.

97 CpuGemmDirectConv2d::CpuGemmDirectConv2d()
98  : _gemm_asm_func(std::make_unique<CpuGemmAssemblyDispatch>()),
99  _activation_func(std::make_unique<CpuActivation>()),
100  _weights_permute_func(std::make_unique<CpuPermute>()),
101  _aux_mem(AuxTensorIdx::Count),
102  _perm_weights(),
103  _run_activation(false),
104  _is_prepared(false)
105 {
106 }

◆ ~CpuGemmDirectConv2d()

~CpuGemmDirectConv2d ( )
default

Member Function Documentation

◆ ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE()

ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE ( CpuGemmDirectConv2d  )

◆ configure()

void configure ( const ITensorInfo *  src,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
ITensorInfo *  dst,
const Conv2dInfo &  info 
)

Set the input and output tensors.

Valid data layouts:

  • All

Valid data type configurations:

src0            src1            src2      dst
QASYMM8         QASYMM8         S32       QASYMM8
QASYMM8_SIGNED  QASYMM8_SIGNED  S32       QASYMM8_SIGNED
F16             F16             F16       F16
F32             F32             F32       F32
BFLOAT16        BFLOAT16        BFLOAT16  BFLOAT16

Parameters
    [in]  src      Source tensor info. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: QASYMM8/QASYMM8_SIGNED/BFLOAT16/F16/F32.
    [in]  weights  Weights tensor info. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data types supported: QASYMM8/QASYMM8_SIGNED/QSYMM8_PER_CHANNEL/BFLOAT16/F16/F32.
    [in]  biases   Biases tensor info. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: should match the input data type, except for inputs of QASYMM8/QASYMM8_SIGNED type, where biases should be of S32 type.
    [in]  dst      Destination tensor info. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: same as input.
    [in]  info     Contains padding and stride information described in PadStrideInfo.
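The valid configurations listed above are a closed set, so a caller can pre-check a (src0, src1, src2, dst) tuple against that table before calling configure(). A minimal sketch of such a guard (the `DT` enum and `is_valid_config` helper are hypothetical illustrations, not part of the library):

```cpp
#include <array>

// Hypothetical mirror of the data types accepted by CpuGemmDirectConv2d.
enum class DT { QASYMM8, QASYMM8_SIGNED, S32, F16, F32, BFLOAT16 };

struct Config { DT src0, src1, src2, dst; };

// The five rows of the "Valid data type configurations" table above.
constexpr std::array<Config, 5> kValidConfigs{{
    { DT::QASYMM8,        DT::QASYMM8,        DT::S32,      DT::QASYMM8 },
    { DT::QASYMM8_SIGNED, DT::QASYMM8_SIGNED, DT::S32,      DT::QASYMM8_SIGNED },
    { DT::F16,            DT::F16,            DT::F16,      DT::F16 },
    { DT::F32,            DT::F32,            DT::F32,      DT::F32 },
    { DT::BFLOAT16,       DT::BFLOAT16,       DT::BFLOAT16, DT::BFLOAT16 },
}};

// Returns true when (src0, src1, src2, dst) matches a documented row.
bool is_valid_config(DT src0, DT src1, DT src2, DT dst)
{
    for(const Config &c : kValidConfigs)
    {
        if(c.src0 == src0 && c.src1 == src1 && c.src2 == src2 && c.dst == dst)
        {
            return true;
        }
    }
    return false;
}
```

For full checking, including shapes and layout, use the static validate() member below rather than a table lookup like this.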

Definition at line 110 of file CpuGemmDirectConv2d.cpp.

References Conv2dInfo::act_info, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_LOG_PARAMS, ITensorInfo::data_type(), ActivationLayerInfo::enabled(), arm_compute::is_data_type_quantized(), arm_compute::offset_int_vec(), AsmGemmInfo::output_stage, arm_compute::experimental::Prepare, ITensorInfo::total_size(), arm_compute::UNSPECIFIED, CpuGemmDirectConv2d::validate(), WeightsInfo::weight_format(), and Conv2dInfo::weights_info.

111 {
112  ARM_COMPUTE_ERROR_ON_NULLPTR(src, weights, dst);
113  ARM_COMPUTE_ERROR_THROW_ON(CpuGemmDirectConv2d::validate(src,
114  weights,
115  biases != nullptr ? biases : nullptr,
116  dst,
117  info));
118  ARM_COMPUTE_LOG_PARAMS(src, weights, biases, dst, info);
119 
120  _run_activation = info.act_info.enabled() && !_gemm_asm_func->is_activation_supported(info.act_info);
121  _is_prepared = false;
122 
123  _weights_permute_func->configure(weights, &_perm_weights, PermutationVector{ 3, 0, 1, 2 });
124 
125  // Configure assembly dispatch
126  cpu::AsmGemmInfo asm_info = init_assembly_metadata(info, false);
127  if(is_data_type_quantized(src->data_type()))
128  {
129  asm_info.output_stage = calculate_output_stage_metadata(src, weights, dst, info.act_info);
130  }
131  _gemm_asm_func->configure(src, &_perm_weights, biases, dst, asm_info);
132 
133  // Configure activation
134  if(_run_activation)
135  {
136  _activation_func->configure(dst, nullptr, info.act_info);
137  }
138 
139  // Add auxiliary memory requirements of the assembly dispatch
140  auto asm_mem_req = _gemm_asm_func->workspace();
141  _aux_mem[AsmGemmWorkspace] = asm_mem_req[AsmGemmWorkspace];
142  _aux_mem[Pretranspose] = asm_mem_req[Pretranspose];
143 
144  if(_aux_mem[Pretranspose].size > 0)
145  {
146  // Release permuted weights at the end of prepare as they are further transposed by the assembly dispatch
147  _aux_mem[PermutedWeights] = MemoryInfo(offset_int_vec(PermutedWeights), MemoryLifetime::Prepare, weights->total_size());
148  }
149  else
150  {
151  // We must permute weights if they are WeightFormat::UNSPECIFIED
152  if(info.weights_info.weight_format() == WeightFormat::UNSPECIFIED)
153  _aux_mem[PermutedWeights] = MemoryInfo(offset_int_vec(PermutedWeights), MemoryLifetime::Persistent, weights->total_size());
154  }
155 }

◆ prepare()

void prepare ( ITensorPack &  constants)
override virtual

Prepare the function for executing.

Any one off pre-processing step required by the function is handled here

Parameters
    [in]  constants  Vector that contains the constant tensors.
Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from INEOperator.

Definition at line 206 of file CpuGemmDirectConv2d.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), ARM_COMPUTE_ERROR_ON_NULLPTR, CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), arm_compute::offset_int_vec(), and arm_compute::utils::cast::polymorphic_cast().

Referenced by CpuGemmDirectConv2d::run().

207 {
208  if(!_is_prepared)
209  {
210  // If we are using fixed-format kernel the weights are already reshaped
211  if(_gemm_asm_func && _gemm_asm_func->isVarWeightsKernel())
212  {
213  _gemm_asm_func->prepare(tensors);
214  _is_prepared = true;
215  return;
216  }
217  const ITensor *weights = tensors.get_const_tensor(ACL_SRC_1);
218  ITensor *weights_aux = utils::cast::polymorphic_cast<ITensor *>(tensors.get_tensor(offset_int_vec(PermutedWeights)));
219  ARM_COMPUTE_ERROR_ON_NULLPTR(weights, weights_aux);
220 
221  CpuAuxTensorHandler permuted_weights(_perm_weights, *weights_aux);
222  ITensorPack permute_tensors{ { ACL_SRC, weights }, { ACL_DST, permuted_weights.get() } };
223  _weights_permute_func->run(permute_tensors);
224 
225  tensors.add_const_tensor(ACL_SRC_1, permuted_weights.get());
226  // Call prepare of assembly dispatch
227  _gemm_asm_func->prepare(tensors);
228 
229  _is_prepared = true;
230  }
231 }

◆ run()

void run ( ITensorPack &  tensors)
override virtual

Run the kernels contained in the function.

Parameters
    [in]  tensors  Vector that contains the tensors to operate on.

Reimplemented from INEOperator.

Definition at line 193 of file CpuGemmDirectConv2d.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, ITensorPack::get_tensor(), arm_compute::test::validation::pack, and CpuGemmDirectConv2d::prepare().

194 {
195  prepare(tensors);
196 
197  _gemm_asm_func->run(tensors);
198  if(_run_activation)
199  {
200  ITensor *io = tensors.get_tensor(ACL_DST);
201  ITensorPack pack{ { ACL_SRC, io }, { ACL_DST, io } };
202  _activation_func->run(pack);
203  }
204 }
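As the listing shows, run() first calls prepare(), then the assembly GEMM, and only applies the standalone activation, in place on the destination tensor, when `_run_activation` is set (i.e. when the assembly path could not fuse the activation). A reduced model of that tail step (`relu_inplace` and `run_flow` are hypothetical stand-ins; ReLU is chosen purely for illustration):

```cpp
#include <vector>

// Stand-in for the separate CpuActivation pass that runs only when the
// assembly kernel could not fuse the activation itself.
void relu_inplace(std::vector<float> &dst)
{
    for(float &v : dst)
    {
        if(v < 0.0f)
        {
            v = 0.0f;
        }
    }
}

// Sketch of the run() control flow: dst already holds the GEMM output,
// and the activation reads and writes the same buffer (src == dst).
void run_flow(std::vector<float> &dst, bool run_activation)
{
    if(run_activation)
    {
        relu_inplace(dst);
    }
}
```

Using the destination tensor as both ACL_SRC and ACL_DST of the activation, as the listing does, avoids allocating an extra intermediate buffer.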

◆ validate()

Status validate ( const ITensorInfo *  src,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
const ITensorInfo *  dst,
const Conv2dInfo &  info 
)
static

Static function to check if given info will lead to a valid configuration of CpuGemmDirectConv2d.

Similar to CpuGemmDirectConv2d::configure()

Returns
a status

Definition at line 156 of file CpuGemmDirectConv2d.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::BFLOAT16, ITensorInfo::data_layout(), ITensorInfo::data_type(), Conv2dInfo::dilation, ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, arm_compute::is_data_type_quantized_asymmetric(), arm_compute::NHWC, ITensorInfo::num_dimensions(), Conv2dInfo::num_groups, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::QSYMM8_PER_CHANNEL, arm_compute::S32, ITensorInfo::tensor_shape(), arm_compute::utils::cast::U, and CpuGemmAssemblyDispatch::validate().

Referenced by CpuGemmDirectConv2d::configure(), CpuConv2d::get_convolution_method(), and CpuConv2d::validate().

157 {
158  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(src, weights, dst);
159  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(src, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::BFLOAT16, DataType::F16, DataType::F32);
160  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(weights, 1, DataType::QASYMM8, DataType::QASYMM8_SIGNED, DataType::QSYMM8_PER_CHANNEL, DataType::BFLOAT16, DataType::F16, DataType::F32);
161  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_LAYOUT(src, weights, dst);
162  ARM_COMPUTE_RETURN_ERROR_ON_MSG(info.num_groups > 1, "Grouping (num_groups != 1) is not supported on Neon");
163  ARM_COMPUTE_RETURN_ERROR_ON_MSG(src->data_layout() != DataLayout::NHWC, "Data layout supported is NHWC");
164  const DataType data_type = src->data_type();
165  const TensorShape i_shape = src->tensor_shape();
166  const TensorShape w_shape = weights->tensor_shape();
167  ARM_COMPUTE_RETURN_ERROR_ON(w_shape[0] != i_shape[0]);
168  ARM_COMPUTE_RETURN_ERROR_ON(info.dilation != Size2D(1U, 1U));
169  ARM_COMPUTE_RETURN_ERROR_ON(weights->num_dimensions() > 4);
170  // Validate biases
171  if(biases != nullptr)
172  {
173  if(is_data_type_quantized_asymmetric(data_type))
174  {
175  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(biases, 1, DataType::S32);
176  }
177  else if(data_type == DataType::BFLOAT16)
178  {
179  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(biases, 1, DataType::F32);
180  }
181  else
182  {
183  ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(src, biases);
184  }
185  ARM_COMPUTE_RETURN_ERROR_ON(biases->dimension(0) != weights->dimension(3));
186  ARM_COMPUTE_RETURN_ERROR_ON(biases->num_dimensions() > 1);
187  }
188 
189  cpu::AsmGemmInfo asm_info = init_assembly_metadata(info, false);
190  ARM_COMPUTE_RETURN_ON_ERROR(CpuGemmAssemblyDispatch::validate(src, weights, biases, dst, asm_info));
191  return Status{};
192 }
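The ARM_COMPUTE_RETURN_* macros give validate() its early-return shape: the first failed condition is converted into an error Status and returned immediately. A simplified self-contained model of that pattern (this `Status` struct and `RETURN_ERROR_ON_MSG` macro are reduced stand-ins for the library's definitions):

```cpp
#include <string>

// Reduced stand-in for arm_compute::Status: an error flag plus message.
struct Status
{
    bool        ok{ true };
    std::string msg;
};

// Simplified model of ARM_COMPUTE_RETURN_ERROR_ON_MSG: early-return on failure.
#define RETURN_ERROR_ON_MSG(cond, message) \
    do { if(cond) { return Status{ false, message }; } } while(false)

// Mirrors the first two checks of validate(): grouping and data layout.
Status validate_model(int num_groups, bool is_nhwc)
{
    RETURN_ERROR_ON_MSG(num_groups > 1, "Grouping (num_groups != 1) is not supported on Neon");
    RETURN_ERROR_ON_MSG(!is_nhwc, "Data layout supported is NHWC");
    return Status{}; // all checks passed
}
```

Because each macro returns on failure, the checks read top-to-bottom as a list of preconditions, and the first violated one determines the reported error.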

◆ workspace()

experimental::MemoryRequirements workspace ( ) const
override virtual

Return the memory requirements required by the workspace.

Reimplemented from INEOperator.

Definition at line 233 of file CpuGemmDirectConv2d.cpp.

234 {
235  return _aux_mem;
236 }
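workspace() simply exposes the _aux_mem table filled in by configure(): one slot per AuxTensorIdx, each with a size and a lifetime — Prepare when the assembly dispatch pretransposes the weights again (so the permuted copy can be released after prepare()), Persistent otherwise. A toy model of that decision (the enums, `MemoryInfo` struct, and `plan_aux_mem` are illustrative, not the library's types):

```cpp
#include <cstddef>
#include <vector>

// Illustrative lifetimes, mirroring MemoryLifetime in spirit.
enum Lifetime { Temporary, Prepare, Persistent };

struct MemoryInfo
{
    Lifetime lifetime{ Temporary };
    size_t   size{ 0 };
};

// Illustrative slot indices; Count sizes the table, as in the constructor.
enum AuxTensorIdx { AsmGemmWorkspace, PermutedWeights, Count };

// Mirrors the tail of configure(): the permuted-weights buffer only needs to
// outlive prepare() when the assembly dispatch keeps its own pretransposed copy.
std::vector<MemoryInfo> plan_aux_mem(size_t weights_size, bool asm_pretransposes)
{
    std::vector<MemoryInfo> aux(Count);
    aux[PermutedWeights] = MemoryInfo{ asm_pretransposes ? Prepare : Persistent, weights_size };
    return aux;
}
```

A memory manager consuming this table can then release Prepare-lifetime buffers right after the prepare() stage while keeping Persistent ones alive across every run().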

The documentation for this class was generated from the following files:

CpuGemmDirectConv2d.h
CpuGemmDirectConv2d.cpp