Compute Library
CpuConvertFullyConnectedWeightsKernel Class Reference

Interface to convert the 2D Fully Connected weights from NCHW to NHWC or vice versa.

#include <CpuConvertFullyConnectedWeightsKernel.h>

Public Member Functions

 CpuConvertFullyConnectedWeightsKernel ()=default
 ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (CpuConvertFullyConnectedWeightsKernel)
void configure (const ITensorInfo *src, ITensorInfo *dst, const TensorShape &original_input_shape, DataLayout data_layout)
 Set the src and dst tensor.
void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
const char * name () const override
 Name of the kernel.
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
virtual void run (const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
virtual ~IKernel ()=default
 Destructor.
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
virtual BorderSize border_size () const
 The size of the border for that kernel.
const Window & window () const
 The maximum window the kernel can be executed on.
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *dst, const TensorShape &original_input_shape, DataLayout data_layout)
 Static function to check if given info will lead to a valid configuration.

Detailed Description

Interface to convert the 2D Fully Connected weights from NCHW to NHWC or vice versa.

This function can be applied to the 2D weights used by a Fully Connected layer if:
  • The layer follows a Convolution layer.
  • The data layout used by the network does not match the one the model was trained in.
This function assumes the weights are already reshaped (transposed).
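
To make the conversion concrete, the following is a minimal standalone sketch (not the library's implementation) of what reordering the rows of a transposed 2D weight matrix from NCHW to NHWC flattening order means. The function name `reorder_rows_nchw_to_nhwc`, the row-major `std::vector<float>` storage, and the parameter names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Compute Library).
// `weights` holds C*H*W rows of `num_outputs` values each, stored row-major.
// Row r belongs to flattened input element r in NCHW order: r = (c*H + h)*W + w.
// The result holds the same rows arranged in NHWC order:   r' = (h*W + w)*C + c.
std::vector<float> reorder_rows_nchw_to_nhwc(const std::vector<float> &weights,
                                             std::size_t C, std::size_t H, std::size_t W,
                                             std::size_t num_outputs)
{
    assert(weights.size() == C * H * W * num_outputs);
    std::vector<float> out(weights.size());
    for(std::size_t c = 0; c < C; ++c)
    {
        for(std::size_t h = 0; h < H; ++h)
        {
            for(std::size_t w = 0; w < W; ++w)
            {
                const std::size_t src_row = (c * H + h) * W + w; // NCHW flattening
                const std::size_t dst_row = (h * W + w) * C + c; // NHWC flattening
                for(std::size_t o = 0; o < num_outputs; ++o)
                {
                    out[dst_row * num_outputs + o] = weights[src_row * num_outputs + o];
                }
            }
        }
    }
    return out;
}
```

With C = 2, H = 1, W = 2 and a single output, the rows {0, 1, 2, 3} come out in the order {0, 2, 1, 3}.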

Definition at line 44 of file CpuConvertFullyConnectedWeightsKernel.h.

Constructor & Destructor Documentation

◆ CpuConvertFullyConnectedWeightsKernel()

CpuConvertFullyConnectedWeightsKernel ( ) = default

Member Function Documentation


◆ ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE()

ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE ( CpuConvertFullyConnectedWeightsKernel )

◆ configure()

void configure ( const ITensorInfo * src,
                 ITensorInfo * dst,
                 const TensorShape & original_input_shape,
                 DataLayout data_layout
               )

Set the src and dst tensor.

Parameters
  [in] src                   Source weights tensor info to convert. Must be 2 dimensional. Data types supported: All.
  [in] dst                   The converted weights tensor info. Shape and Data Type: Same as src.
  [in] original_input_shape  Shape of the original src tensor (the one entering fully connected layer).
  [in] data_layout           The data layout the weights have been trained in.

Definition at line 37 of file CpuConvertFullyConnectedWeightsKernel.cpp.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), arm_compute::CHANNEL, ICloneable< T >::clone(), arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::NCHW, arm_compute::NHWC, CpuConvertFullyConnectedWeightsKernel::validate(), and arm_compute::WIDTH.

void CpuConvertFullyConnectedWeightsKernel::configure(const ITensorInfo *src, ITensorInfo *dst, const TensorShape &original_input_shape, DataLayout data_layout)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(src, dst);

    // Output tensor auto initialisation if not yet initialized
    auto_init_if_empty(*dst, *src->clone());

    ARM_COMPUTE_ERROR_THROW_ON(CpuConvertFullyConnectedWeightsKernel::validate(src, dst, original_input_shape, data_layout));

    // The layout the network runs in is the opposite of the one the weights were trained in
    const DataLayout input_data_layout = (data_layout == DataLayout::NCHW) ? DataLayout::NHWC : DataLayout::NCHW;

    const int width_idx   = get_data_layout_dimension_index(input_data_layout, DataLayoutDimension::WIDTH);
    const int height_idx  = get_data_layout_dimension_index(input_data_layout, DataLayoutDimension::HEIGHT);
    const int channel_idx = get_data_layout_dimension_index(input_data_layout, DataLayoutDimension::CHANNEL);

    const unsigned int num_elems_per_input_plane = original_input_shape[width_idx] * original_input_shape[height_idx];
    const unsigned int num_channels              = original_input_shape[channel_idx];

    _factor1 = (data_layout == DataLayout::NCHW) ? num_elems_per_input_plane : num_channels;
    _factor2 = (data_layout == DataLayout::NCHW) ? num_channels : num_elems_per_input_plane;

    // Configure kernel window
    Window win = calculate_max_window(*src, Steps());
    ICpuKernel::configure(win);
}
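
The factor computation above can be mirrored in a small standalone sketch. The `Layout` enum, the `DimIndex` and `Factors` structs, and the `compute_factors` function are hypothetical stand-ins. The dimension-index convention used here (NCHW stores W, H, C at indices 0, 1, 2; NHWC stores C, W, H at indices 0, 1, 2) is an assumption about how get_data_layout_dimension_index() behaves rather than something stated on this page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for DataLayout and the kernel's two private factors.
enum class Layout { NCHW, NHWC };
struct DimIndex { std::size_t width; std::size_t height; std::size_t channel; };
struct Factors { unsigned int factor1; unsigned int factor2; };

// Assumed index convention: shapes are stored fastest-changing dimension first.
inline DimIndex dim_index(Layout layout)
{
    return (layout == Layout::NCHW) ? DimIndex{ 0, 1, 2 } : DimIndex{ 1, 2, 0 };
}

// `original_input_shape` is expressed in the layout the network runs in,
// which is the opposite of `trained_layout`.
inline Factors compute_factors(const std::array<unsigned int, 3> &original_input_shape,
                               Layout trained_layout)
{
    const Layout   runtime_layout = (trained_layout == Layout::NCHW) ? Layout::NHWC : Layout::NCHW;
    const DimIndex idx            = dim_index(runtime_layout);

    const unsigned int plane    = original_input_shape[idx.width] * original_input_shape[idx.height];
    const unsigned int channels = original_input_shape[idx.channel];
    assert(plane > 0 && channels > 0); // factor1 is later used as a divisor

    return (trained_layout == Layout::NCHW) ? Factors{ plane, channels } : Factors{ channels, plane };
}
```

Under these assumptions, weights trained in NCHW for a runtime input with C = 5, W = 4, H = 3 give factor1 = 12 (elements per plane) and factor2 = 5 (channels). Weights trained in NHWC give the two values swapped.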

◆ name()

const char * name ( ) const

Name of the kernel.

Returns
  Kernel name

Implements ICPPKernel.

Definition at line 107 of file CpuConvertFullyConnectedWeightsKernel.cpp.

const char *CpuConvertFullyConnectedWeightsKernel::name() const
{
    return "CpuConvertFullyConnectedWeightsKernel";
}

◆ run_op()

void run_op ( ITensorPack & tensors,
              const Window & window,
              const ThreadInfo & info
            )

Execute the kernel on the passed window.

Note
  • If is_parallelisable() returns false, the passed window must be equal to window().
  • The window has to be a region within the window returned by the window() method.
  • The width of the window has to be a multiple of num_elems_processed_per_iteration().

Parameters
  [in] tensors  A vector containing the tensors to operate on.
  [in] window   Region on which to execute the kernel. (Must be a region of the window returned by window())
  [in] info     Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 84 of file CpuConvertFullyConnectedWeightsKernel.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::test::validation::dst, arm_compute::execute_window_loop(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), arm_compute::test::validation::input, Iterator::ptr(), arm_compute::test::validation::src, and IKernel::window().

void CpuConvertFullyConnectedWeightsKernel::run_op(ITensorPack &tensors, const Window &window, const ThreadInfo &info)
{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(ICpuKernel::window(), window);

    const auto src = tensors.get_const_tensor(TensorType::ACL_SRC);
    auto       dst = tensors.get_tensor(TensorType::ACL_DST);

    const unsigned int dst_stride_x = dst->info()->strides_in_bytes().x();
    const unsigned int dst_stride_y = dst->info()->strides_in_bytes().y();
    const unsigned int element_size = src->info()->element_size();

    Iterator input(src, window);
    Iterator output(dst, window);

    // Only the input iterator is advanced by the loop; the destination offset is computed from id
    execute_window_loop(window, [&](const Coordinates &id)
    {
        memcpy(output.ptr() + id.x() * dst_stride_x + (id.y() % _factor1 * _factor2 + id.y() / _factor1) * dst_stride_y, input.ptr(), element_size);
    },
    input);
}
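
The destination-row expression in the loop above, (y % _factor1) * _factor2 + y / _factor1, can be checked in isolation. The sketch below is standalone; `remap_row` and `build_row_permutation` are hypothetical names, not library functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: destination row for source row y, using the same
// arithmetic as the memcpy offset in the kernel loop.
inline unsigned int remap_row(unsigned int y, unsigned int factor1, unsigned int factor2)
{
    assert(factor1 > 0 && y < factor1 * factor2);
    return (y % factor1) * factor2 + y / factor1;
}

// Enumerate the destination row of every source row.
inline std::vector<unsigned int> build_row_permutation(unsigned int factor1, unsigned int factor2)
{
    std::vector<unsigned int> perm(factor1 * factor2);
    for(unsigned int y = 0; y < perm.size(); ++y)
    {
        perm[y] = remap_row(y, factor1, factor2);
    }
    return perm;
}
```

For factor1 = 3 and factor2 = 2, source rows 0 to 5 map to destination rows 0, 2, 4, 1, 3, 5. Writing y = a * factor1 + b with b < factor1, the destination row is b * factor2 + a, so the mapping transposes a factor2 x factor1 grid of row indices and every destination row is written exactly once.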

◆ validate()

static Status validate ( const ITensorInfo * src,
                         const ITensorInfo * dst,
                         const TensorShape & original_input_shape,
                         DataLayout data_layout
                       )

Static function to check if given info will lead to a valid configuration.

Parameters
  Similar to CpuConvertFullyConnectedWeightsKernel::configure()

Returns
  a status

Definition at line 65 of file CpuConvertFullyConnectedWeightsKernel.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ITensorInfo::data_type(), ITensorInfo::dimension(), ITensorInfo::num_dimensions(), ITensorInfo::total_size(), TensorShape::total_size_lower(), and arm_compute::UNKNOWN.

Referenced by CpuConvertFullyConnectedWeightsKernel::configure(), and CpuConvertFullyConnectedWeights::validate().

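Status CpuConvertFullyConnectedWeightsKernel::validate(const ITensorInfo *src, const ITensorInfo *dst, const TensorShape &original_input_shape, DataLayout data_layout)
{
    ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(src);
    ARM_COMPUTE_RETURN_ERROR_ON(src->data_type() == DataType::UNKNOWN);
    ARM_COMPUTE_RETURN_ERROR_ON(src->num_dimensions() != 2);
    ARM_COMPUTE_RETURN_ERROR_ON(src->dimension(1) != original_input_shape.total_size_lower(3));
    ARM_COMPUTE_RETURN_ERROR_ON(data_layout == DataLayout::UNKNOWN);

    // Checks performed when dst is configured
    if((dst != nullptr) && (dst->total_size() != 0))
    {
        ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(src, dst);
        ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES(src, dst);
    }

    return Status{};
}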
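
The row-count check in the listing above (src->dimension(1) compared against original_input_shape.total_size_lower(3)) can be illustrated with a standalone sketch. It assumes total_size_lower(n) returns the product of the first n dimensions, and it uses std::vector in place of TensorShape. The free functions `total_size_lower` and `weights_match_input` below are hypothetical and written only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in: product of the first `dimension` entries of `shape`.
inline std::size_t total_size_lower(const std::vector<std::size_t> &shape, std::size_t dimension)
{
    assert(dimension <= shape.size());
    return std::accumulate(shape.begin(), shape.begin() + static_cast<std::ptrdiff_t>(dimension),
                           std::size_t{ 1 }, std::multiplies<std::size_t>());
}

// True when a 2D weights shape {num_outputs, num_rows} has one row per element
// of a single (unbatched) input of shape `original_input_shape`.
inline bool weights_match_input(const std::vector<std::size_t> &weights_shape,
                                const std::vector<std::size_t> &original_input_shape)
{
    if(weights_shape.size() != 2)
    {
        return false; // weights must be 2 dimensional
    }
    if(original_input_shape.size() < 3)
    {
        return false; // simplification in this sketch: require three spatial/channel dims
    }
    return weights_shape[1] == total_size_lower(original_input_shape, 3);
}
```

For an input of shape {4, 3, 5}, the weights need 4 * 3 * 5 = 60 rows; a fourth (batch) dimension on the input does not change the required row count.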

The documentation for this class was generated from the following files:
  • CpuConvertFullyConnectedWeightsKernel.h
  • CpuConvertFullyConnectedWeightsKernel.cpp