Compute Library
 23.08
CpuWinogradConv2d Class Reference

#include <CpuWinogradConv2d.h>

Collaboration diagram for CpuWinogradConv2d:

Public Member Functions

 CpuWinogradConv2d ()
 Constructor. More...
 
 ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuWinogradConv2d)
 
 ~CpuWinogradConv2d ()
 Destructor. More...
 
void configure (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, ITensorInfo *dst, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Set the input and output tensors. More...
 
void run (ITensorPack &tensors) override
 Run the kernels contained in the function. More...
 
void prepare (ITensorPack &constants) override
 Prepare the function for executing. More...
 
experimental::MemoryRequirements workspace () const override
 Return the memory requirements required by the workspace. More...
 
- Public Member Functions inherited from INEOperator
 INEOperator (IRuntimeContext *ctx=nullptr)
 Constructor. More...
 
 INEOperator (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 INEOperator (INEOperator &&)=default
 Default move constructor. More...
 
INEOperator & operator= (const INEOperator &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
INEOperator & operator= (INEOperator &&)=default
 Default move assignment operator. More...
 
 ~INEOperator ()
 Default destructor. More...
 
- Public Member Functions inherited from IOperator
virtual ~IOperator ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Static function to check if given info will lead to a valid configuration of CpuWinogradConv2d. More...
 

Detailed Description

Definition at line 42 of file CpuWinogradConv2d.h.

Constructor & Destructor Documentation

◆ CpuWinogradConv2d()

Constructor.

Definition at line 134 of file CpuWinogradConv2d.cpp.

136  : _gemm_function(std::make_unique<CpuGemm>()),
137  _activation_func(std::make_unique<CpuActivation>()),
138  _transform_input_kernel(nullptr),
139  _transform_output_kernel(nullptr),
140  _permute_input(std::make_unique<CpuPermute>()),
141  _permute_output(std::make_unique<CpuPermute>()),
142  _permute_weights(std::make_unique<CpuPermute>()),
143  _aux_mem(AuxTensorIdx::Count),
144  _conv_args{ nullptr },
145  _winograd_impl{},
146  _data_layout(),
147  _winograd_transformed_input{},
148  _winograd_transformed_output{},
149  _winograd_transformed_weights{},
150  _input_workspace(),
151  _output_workspace(),
152  _weights_hwio(),
153  _input_nhwc(),
154  _output_nhwc(),
155  _is_prepared{ false },
156  _run_activation{ false }
157 {
158 }

◆ ~CpuWinogradConv2d()

~CpuWinogradConv2d ( )
default

Destructor.

Member Function Documentation

◆ ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE()

ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE ( CpuWinogradConv2d  )

◆ configure()

void configure ( const ITensorInfo *  src,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
ITensorInfo *  dst,
const PadStrideInfo &  conv_info,
const ActivationLayerInfo &  act_info = ActivationLayerInfo(),
bool  enable_fast_math = false 
)

Set the input and output tensors.

Valid data layouts:

  • NHWC
  • NCHW

Valid data type configurations:

src0  src1  src2  dst
F16   F16   F16   F16
F32   F32   F32   F32
Parameters
[in]  src               Source tensor info. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]  weights           Weights tensor info. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input. Currently only 3x3 and 5x5 kernels are supported.
[in]  biases            Biases tensor info. Shared biases are supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as weights.
[out] dst               Destination tensor info. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info         Contains padding and stride information described in PadStrideInfo. Currently only unit strides are supported.
[in]  act_info          (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. If enabled, the function may dispatch the fastest implementation available, which can reduce accuracy. Default is false.

Definition at line 162 of file CpuWinogradConv2d.cpp.

164 {
166  ARM_COMPUTE_ERROR_THROW_ON(validate(src, weights, biases, dst, conv_info, act_info, enable_fast_math));
167  ARM_COMPUTE_LOG_PARAMS(src, weights, biases, dst, conv_info, act_info, enable_fast_math);
168  ARM_COMPUTE_UNUSED(biases);
169  const DataType data_type = src->data_type();
170  uint32_t nthreads = NEScheduler::get().num_threads();
171  _data_layout = src->data_layout();
172  const Tensor4DShape kernel_shape{ internal_get_shape(weights) };
173 
174  bool success = get_winograd_kernel_implementation(src, weights, dst, conv_info, act_info, enable_fast_math, &_winograd_impl, _conv_args);
175 
176  ARM_COMPUTE_EXIT_ON_MSG_VAR(!success, "Unsupported kernel size: %d x %d.\n", kernel_shape.n_rows, kernel_shape.n_cols);
177  ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL(arm_compute::logging::LogLevel::INFO, "Using input transform: %s\n", _winograd_impl.input_transform->get_name().c_str());
178  ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL(arm_compute::logging::LogLevel::INFO, "Using weight transform: %s\n", _winograd_impl.weight_transform->get_name().c_str());
179  ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL(arm_compute::logging::LogLevel::INFO, "Using output transform: %s\n", _winograd_impl.output_transform->get_name().c_str());
180 
181  const bool has_impl = ((_winograd_impl.input_transform != nullptr) && (_winograd_impl.output_transform != nullptr) && (_winograd_impl.gemm_args != nullptr));
182  if(has_impl)
183  {
184  // Determine how much working space is required, allocate it.
185  const size_t input_workspace_size = _winograd_impl.input_transform->get_working_space_size(*_conv_args, nthreads);
186  const size_t output_workspace_size = _winograd_impl.output_transform->get_working_space_size(*_conv_args, nthreads);
187 
188  TensorInfo input_workspace_info(TensorShape(input_workspace_size), 1, DataType::U8);
189  TensorInfo output_workspace_info(TensorShape(output_workspace_size), 1, DataType::U8);
190  _input_workspace = input_workspace_info;
191  _output_workspace = output_workspace_info;
192 
193  const auto &wds = _winograd_impl.winograd_spec;
194 
195  // Preparing winograd transformed input tensor
196  const size_t data_type_size = src->element_size();
197  const uint32_t m = _winograd_impl.gemm_args->_Msize; // Total number of tiles
198  const uint32_t k = _winograd_impl.gemm_args->_Ksize; // Input channels
199  const uint32_t n = _winograd_impl.gemm_args->_Nsize; // Output channels
200  const uint32_t n_gemms = _winograd_impl.gemm_args->_nmulti;
201  const uint32_t n_batches = _winograd_impl.gemm_args->_nbatches;
202  constexpr size_t storage_alignment = 64;
203 
204  const TensorShape a_shape(k, m, n_batches, n_gemms);
205  Strides a_strides(data_type_size);
206  a_strides.set(1, data_type_size * _winograd_impl.winograd_spec.input_ld_row);
207  a_strides.set(2, data_type_size * _winograd_impl.winograd_spec.input_ld_batch);
208  a_strides.set(3, data_type_size * _winograd_impl.winograd_spec.input_ld_matrix);
209 
210  const TensorShape b_shape(n, k, n_gemms);
211  Strides b_strides(data_type_size);
212  b_strides.set(1, data_type_size * _winograd_impl.winograd_spec.weight_ld_row);
213  b_strides.set(2, data_type_size * _winograd_impl.winograd_spec.weight_ld_matrix);
214 
215  const TensorShape d_shape(n, m, n_batches, n_gemms);
216  Strides d_strides(data_type_size);
217  d_strides.set(1, data_type_size * _winograd_impl.winograd_spec.output_ld_row);
218  d_strides.set(2, data_type_size * _winograd_impl.winograd_spec.output_ld_batch);
219  d_strides.set(3, data_type_size * _winograd_impl.winograd_spec.output_ld_matrix);
220 
221  TensorInfo a_info{};
222  TensorInfo b_info{};
223  TensorInfo d_info{};
224  a_info.init(a_shape, 1, data_type, a_strides, 0, wds.input_matrix_size_bytes);
225  b_info.init(b_shape, 1, data_type, b_strides, 0, wds.weight_matrix_size_bytes);
226  d_info.init(d_shape, 1, data_type, d_strides, 0, wds.output_matrix_size_bytes);
227 
228  _winograd_transformed_input = a_info;
229  _winograd_transformed_weights = b_info;
230  _winograd_transformed_output = d_info;
231 
232  PermutationVector weights_permutation_vector(3U, 0U, 1U, 2U);
233 
234  // Configure the kernel to transform the input tensor from NCHW -> NHWC
235  if(_data_layout == DataLayout::NCHW)
236  {
237  _permute_input->configure(src, &_input_nhwc, PermutationVector(2U, 0U, 1U));
238  weights_permutation_vector = PermutationVector(3U, 2U, 0U, 1U);
239  }
240 
241  // Re-order a weight tensor from [Output feature map x Input feature map x Height x Width] to [Height x Width x Input feature map x Output feature map]
242  _permute_weights->configure(weights, &_weights_hwio, weights_permutation_vector);
243 
244  // Reorder the convoluted output to ACL's ordering NCHW
245  if(_data_layout == DataLayout::NCHW)
246  {
247  // configure and allocate dst tensor to be used to convert from winograd domain to spatial domain when calling to reshape_output()
248  TensorInfo info(TensorShape(dst->dimension(2), dst->dimension(0),
249  dst->dimension(1), dst->dimension(3)),
250  1, dst->data_type());
251  _output_nhwc = info;
252  _permute_output->configure(&_output_nhwc, dst, PermutationVector(1U, 2U, 0U));
253  }
254 
255  // Configure input transform kernel
256  _transform_input_kernel = std::make_unique<CpuWinogradConv2dTransformInputKernel>(_winograd_impl, *_conv_args, nthreads);
257 
258  // Configure GEMM function
259  _gemm_function->configure(&_winograd_transformed_input, &_winograd_transformed_weights, nullptr, &_winograd_transformed_output, 1.0f, 0.f);
260 
261  // Configure output transform kernel
262  _transform_output_kernel = std::make_unique<CpuWinogradConv2dTransformOutputKernel>(_winograd_impl, *_conv_args, nthreads);
263 
264  //Configure Activation Layer
265  _run_activation = act_info.enabled() && !fuse_function_supported(act_info);
266  if(_run_activation)
267  {
268  _activation_func->configure(dst, nullptr, act_info);
269  }
270 
271  auto asm_mem_req = _gemm_function->workspace();
272  _aux_mem[GemmWorkspace] = asm_mem_req[GemmWorkspace];
273  _aux_mem[Pretranspose] = asm_mem_req[Pretranspose];
274  _aux_mem[InterleavedLHS] = asm_mem_req[InterleavedLHS];
275  _aux_mem[TransposedRHS] = asm_mem_req[TransposedRHS];
276  _aux_mem[TempResult] = asm_mem_req[TempResult];
277 
278  // Request temporary memory. Overlap memory needed for Input/Output transformations as they run on different non-overlapping time-steps.
279  _aux_mem[TransformedInput] = MemoryInfo(offset_int_vec(TransformedInput), MemoryLifetime::Temporary, wds.input_matrix_size_bytes, storage_alignment);
280  _aux_mem[TransformedOutput] = MemoryInfo(offset_int_vec(TransformedOutput), MemoryLifetime::Temporary, wds.output_matrix_size_bytes, storage_alignment);
281  _aux_mem[WorkspaceIO] = MemoryInfo(offset_int_vec(WorkspaceIO), MemoryLifetime::Temporary, std::max(input_workspace_size, output_workspace_size));
282  _aux_mem[PermutedWeights] = MemoryInfo(offset_int_vec(PermutedWeights), MemoryLifetime::Prepare, _weights_hwio.total_size());
283  _aux_mem[TransformedWeights] = MemoryInfo(offset_int_vec(TransformedWeights), MemoryLifetime::Persistent, wds.weight_matrix_size_bytes, storage_alignment);
284  if(_data_layout == DataLayout::NCHW)
285  {
286  _aux_mem[PermutedInput].merge(offset_int_vec(PermutedInput), src->total_size());
287  _aux_mem[PermutedOutput].merge(offset_int_vec(PermutedOutput), dst->total_size());
288  }
289  }
290 }

References arm_compute::test::validation::act_info, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ARM_COMPUTE_EXIT_ON_MSG_VAR, ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL, ARM_COMPUTE_LOG_PARAMS, ARM_COMPUTE_UNUSED, arm_compute::test::validation::conv_info, arm_compute::test::validation::data_type, arm_compute::test::validation::dst, Scheduler::get(), arm_compute::logging::INFO, arm_compute::test::validation::info, TensorInfo::init(), arm_compute::test::validation::k, arm_compute::test::validation::m, arm_compute::test::validation::n, arm_compute::NCHW, IScheduler::num_threads(), arm_compute::offset_int_vec(), arm_compute::experimental::Prepare, Dimensions< T >::set(), arm_compute::test::validation::src, TensorInfo::total_size(), arm_compute::utils::cast::U, arm_compute::U8, and CpuWinogradConv2d::validate().

◆ prepare()

void prepare ( ITensorPack &  constants)
overridevirtual

Prepare the function for executing.

Any one off pre-processing step required by the function is handled here

Parameters
[in]  constants  Vector that contains the constant tensors.
Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from INEOperator.

Definition at line 374 of file CpuWinogradConv2d.cpp.

375 {
376  if(!_is_prepared)
377  {
378  const ITensor *weights = tensors.get_const_tensor(ACL_SRC_1);
379  ITensor *weights_aux = utils::cast::polymorphic_cast<ITensor *>(tensors.get_tensor(offset_int_vec(PermutedWeights)));
380 
381  CpuAuxTensorHandler permuted_weights(_weights_hwio, *weights_aux);
382  ITensorPack permute_tensors{ { ACL_SRC, weights }, { ACL_DST, permuted_weights.get() } };
383  _permute_weights->run(permute_tensors);
384  const int element_size_in_bytes = permuted_weights.get()->info()->element_size();
385  // Weights were in OHWI format; after the permutation above, "permuted_weights" is in HWIO format.
386  const unsigned int height_idx = 3; // H in HWIO
387  const unsigned int width_idx = 2; // W in HWIO
388  const unsigned int channel_idx = 1; // I in HWIO
389 
390  const int permuted_weight_row_stride = permuted_weights.get()->info()->strides_in_bytes()[height_idx] / element_size_in_bytes;
391  const int permuted_weight_col_stride = permuted_weights.get()->info()->strides_in_bytes()[width_idx] / element_size_in_bytes;
392  const int permuted_weight_channel_stride = permuted_weights.get()->info()->strides_in_bytes()[channel_idx] / element_size_in_bytes;
393 
394  // Wrap the winograd-domain transformed weight TensorInfo in Auxiliary tensor and allocate the required memory.
395  ITensor *weights_transf = utils::cast::polymorphic_cast<ITensor *>(tensors.get_tensor(offset_int_vec(TransformedWeights)));
396  ARM_COMPUTE_ERROR_ON_NULLPTR(weights_transf);
397  CpuAuxTensorHandler winograd_transformed_weights(_winograd_transformed_weights, *weights_transf);
398 
399  const void *permuted_weights_ptr;
400  void *win_wght_transf_ptr;
401 
402  permuted_weights_ptr = reinterpret_cast<const void *>(permuted_weights.get()->buffer() + permuted_weights.get()->info()->offset_first_element_in_bytes());
403  win_wght_transf_ptr = reinterpret_cast<void *>(winograd_transformed_weights.get()->buffer() + winograd_transformed_weights.get()->info()->offset_first_element_in_bytes());
404 
405  // Prepare Weights
406  _winograd_impl.weight_transform->execute(
407  *_conv_args,
408  permuted_weights_ptr,
409  permuted_weight_row_stride,
410  permuted_weight_col_stride,
411  permuted_weight_channel_stride,
412  win_wght_transf_ptr,
413  _winograd_impl.winograd_spec,
414  0, 1 // Thread 1 of 1
415  );
416  ITensorPack gemm_pack = tensors;
417  gemm_pack.add_const_tensor(ACL_SRC_1, winograd_transformed_weights.get());
418  _gemm_function->prepare(gemm_pack);
419  _is_prepared = true;
420  }
421 }

References arm_compute::ACL_DST, arm_compute::ACL_SRC, arm_compute::ACL_SRC_1, ITensorPack::add_const_tensor(), ARM_COMPUTE_ERROR_ON_NULLPTR, ITensor::buffer(), ITensorInfo::element_size(), CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), ITensor::info(), ITensorInfo::offset_first_element_in_bytes(), arm_compute::offset_int_vec(), and ITensorInfo::strides_in_bytes().

Referenced by CpuWinogradConv2d::run().

◆ run()

void run ( ITensorPack &  tensors)
overridevirtual

Run the kernels contained in the function.

Parameters
[in]  tensors  Vector that contains the tensors to operate on.

Reimplemented from INEOperator.

Definition at line 316 of file CpuWinogradConv2d.cpp.

317 {
318  prepare(tensors);
319  auto src = tensors.get_const_tensor(ACL_SRC_0);
320  auto biases = tensors.get_const_tensor(ACL_SRC_2);
321  auto output = tensors.get_tensor(ACL_DST);
322  Window win;
323 
324  const uint32_t nthreads = NEScheduler::get().num_threads();
325 
326  // The Winograd transform implementation does fine-grain threading inside the transforms. Just pass thread_id and nthreads.
327  win.set(Window::DimX, Window::Dimension(0, nthreads, 1));
328 
329  // Wrap the winograd-domain tensorInfos created in configuration in tensors and allocate the required memory.
330  CpuAuxTensorHandler input_nhwc(offset_int_vec(PermutedInput), _input_nhwc, tensors, true);
331  CpuAuxTensorHandler winograd_input_transformed(offset_int_vec(TransformedInput), _winograd_transformed_input, tensors, true);
332  CpuAuxTensorHandler input_workspace(offset_int_vec(WorkspaceIO), _input_workspace, tensors, true);
333  const bool is_nchw = _data_layout == DataLayout::NCHW;
334  if(is_nchw)
335  {
336  //Bring channels to the front as Winograd code expects the tensor to be in the format NHWC
337  ITensorPack pack{ { ACL_SRC, src }, { ACL_DST, input_nhwc.get() } };
338  _permute_input->run(pack);
339  }
340 
341  CpuAuxTensorHandler winograd_output_transformed(offset_int_vec(TransformedOutput), _winograd_transformed_output, tensors, true);
342  CpuAuxTensorHandler output_workspace(offset_int_vec(WorkspaceIO), _output_workspace, tensors, true);
343  CpuAuxTensorHandler output_nhwc(offset_int_vec(PermutedOutput), _output_nhwc, tensors, true);
344 
345  ITensorPack transform_input_pack{ { ACL_SRC, is_nchw ? input_nhwc.get() : src }, { ACL_DST, winograd_input_transformed.get() }, { ACL_INT, input_workspace.get() } };
346  NEScheduler::get().schedule_op(_transform_input_kernel.get(), Window::DimX, win, transform_input_pack);
347 
348  CpuAuxTensorHandler winograd_weights_transformed(offset_int_vec(TransformedWeights), _winograd_transformed_weights, tensors, true);
349 
350  // Run 16 GEMMs in multiple threads, each kernel runs one or more GEMMs
351  ITensorPack gemm_pack = tensors;
352  gemm_pack.add_const_tensor(ACL_SRC, winograd_input_transformed.get());
353  gemm_pack.add_const_tensor(ACL_SRC_1, winograd_weights_transformed.get());
354  gemm_pack.add_const_tensor(ACL_BIAS, nullptr);
355  gemm_pack.add_tensor(ACL_DST, winograd_output_transformed.get());
356  _gemm_function->run(gemm_pack);
357 
358  // Output transform
359  ITensorPack transform_output_pack{ { ACL_SRC_0, winograd_output_transformed.get() }, { ACL_DST, is_nchw ? output_nhwc.get() : output }, { ACL_SRC_1, biases }, { ACL_INT, output_workspace.get() } };
360  NEScheduler::get().schedule_op(_transform_output_kernel.get(), Window::DimX, win, transform_output_pack);
361  if(is_nchw)
362  {
363  // Reorder the convoluted output to ACL's ordering NCHW
364  ITensorPack pack{ { ACL_SRC, output_nhwc.get() }, { ACL_DST, output } };
365  _permute_output->run(pack);
366  }
367  if(_run_activation)
368  {
369  ITensorPack pack{ { ACL_SRC, output }, { ACL_DST, output } };
370  _activation_func->run(pack);
371  }
372 }

References arm_compute::ACL_BIAS, arm_compute::ACL_DST, arm_compute::ACL_INT, arm_compute::ACL_SRC, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, arm_compute::ACL_SRC_2, ITensorPack::add_const_tensor(), ITensorPack::add_tensor(), Window::DimX, Scheduler::get(), CpuAuxTensorHandler::get(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), arm_compute::NCHW, IScheduler::num_threads(), arm_compute::offset_int_vec(), arm_compute::test::validation::pack, CpuWinogradConv2d::prepare(), IScheduler::schedule_op(), Window::set(), and arm_compute::test::validation::src.

◆ validate()

Status validate ( const ITensorInfo *  src,
const ITensorInfo *  weights,
const ITensorInfo *  biases,
const ITensorInfo *  dst,
const PadStrideInfo &  conv_info,
const ActivationLayerInfo &  act_info = ActivationLayerInfo(),
bool  enable_fast_math = false 
)
static

Static function to check if given info will lead to a valid configuration of CpuWinogradConv2d.

Similar to CpuWinogradConv2d::configure()

Returns
a status

Definition at line 291 of file CpuWinogradConv2d.cpp.

293 {
294  ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(src, weights, dst);
295  ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(src, weights, biases, dst, conv_info));
296 
297  // Disable winograd for fp16 if fast math is false.
298  if(!enable_fast_math)
299  {
300  ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(src, 1, DataType::F32);
301  }
302 
302 
303  const Tensor4DShape kernel_shape{ internal_get_shape(weights) };
304  arm_conv::winograd::WinogradImpl winograd_impl{};
305 
306  std::unique_ptr<arm_conv::ConvolutionArgs> conv_args;
307  const bool success = get_winograd_kernel_implementation(src, weights, dst, conv_info, act_info, enable_fast_math, &winograd_impl, conv_args);
308 
309  ARM_COMPUTE_RETURN_ERROR_ON_MSG_VAR(success == false, "Unsupported kernel size: %d x %d.\n", kernel_shape.n_rows, kernel_shape.n_cols);
310  ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL(arm_compute::logging::LogLevel::INFO, "Using input transform: %s\n", winograd_impl.input_transform->get_name().c_str());
311  ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL(arm_compute::logging::LogLevel::INFO, "Using weight transform: %s\n", winograd_impl.weight_transform->get_name().c_str());
312  ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL(arm_compute::logging::LogLevel::INFO, "Using output transform: %s\n", winograd_impl.output_transform->get_name().c_str());
313  return Status{};
314 }

References arm_compute::test::validation::act_info, ARM_COMPUTE_LOG_MSG_WITH_FORMAT_ACL, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MSG_VAR, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, arm_compute::test::validation::conv_info, arm_compute::test::validation::dst, arm_compute::F32, arm_compute::logging::INFO, arm_compute::test::validation::src, and arm_compute::cpu::kernels::validate_arguments().

Referenced by CpuWinogradConv2d::configure(), CpuConv2d::get_convolution_method(), NEWinogradConvolutionLayer::validate(), and CpuConv2d::validate().

◆ workspace()

experimental::MemoryRequirements workspace ( ) const
overridevirtual

Return the memory requirements required by the workspace.

Reimplemented from INEOperator.

Definition at line 422 of file CpuWinogradConv2d.cpp.

423 {
424  return _aux_mem;
425 }

The documentation for this class was generated from the following files: CpuWinogradConv2d.h and CpuWinogradConv2d.cpp.