Compute Library
 21.11
ClConvertFullyConnectedWeightsKernel Class Reference

#include <ClConvertFullyConnectedWeightsKernel.h>


Public Member Functions

 ClConvertFullyConnectedWeightsKernel ()
 
 ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE (ClConvertFullyConnectedWeightsKernel)
 
void configure (const CLCompileContext &compile_context, const ITensorInfo *src, ITensorInfo *dst, const TensorShape &original_src_shape, DataLayout data_layout)
 Set the src and dst tensor. More...
 
void run_op (ITensorPack &tensors, const Window &window, ::cl::CommandQueue &queue) override
 
- Public Member Functions inherited from ICLKernel
 ICLKernel ()
 Constructor. More...
 
cl::Kernel & kernel ()
 Returns a reference to the OpenCL kernel of this object. More...
 
CLKernelType type () const
 Returns the CL kernel type. More...
 
template<typename T >
void add_1D_array_argument (unsigned int &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
 Add the passed 1D array's parameters to the object's kernel's arguments starting from the index idx. More...
 
void add_1D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx. More...
 
void add_1D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true. More...
 
void add_2D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx. More...
 
void add_2D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true. More...
 
void add_3D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 3D tensor's parameters to the object's kernel's arguments starting from the index idx. More...
 
void add_4D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 4D tensor's parameters to the object's kernel's arguments starting from the index idx. More...
 
virtual void run (const Window &window, cl::CommandQueue &queue)
 Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue. More...
 
virtual void run_op (ITensorPack &tensors, const Window &window, cl::CommandQueue &queue)
 Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue. More...
 
template<typename T >
void add_argument (unsigned int &idx, T value)
 Add the passed parameters to the object's kernel's arguments starting from the index idx. More...
 
void set_lws_hint (const cl::NDRange &lws_hint)
 Set the Local-Workgroup-Size hint. More...
 
cl::NDRange lws_hint () const
 Return the Local-Workgroup-Size hint. More...
 
void set_wbsm_hint (const cl_int &wbsm_hint)
 Set the workgroup batch size modifier hint. More...
 
cl_int wbsm_hint () const
 Return the workgroup batch size modifier hint. More...
 
const std::string & config_id () const
 Get the configuration ID. More...
 
void set_target (GPUTarget target)
 Set the targeted GPU architecture. More...
 
void set_target (cl::Device &device)
 Set the targeted GPU architecture according to the CL device. More...
 
GPUTarget get_target () const
 Get the targeted GPU architecture. More...
 
size_t get_max_workgroup_size ()
 Get the maximum workgroup size for the device the CLKernelLibrary uses. More...
 
template<unsigned int dimension_size>
void add_tensor_argument (unsigned &idx, const ICLTensor *tensor, const Window &window)
 
template<typename T , unsigned int dimension_size>
void add_array_argument (unsigned &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
 Add the passed array's parameters to the object's kernel's arguments starting from the index idx. More...
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor. More...
 
virtual ~IKernel ()=default
 Destructor. More...
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable. More...
 
virtual BorderSize border_size () const
 The size of the border for that kernel. More...
 
const Window & window () const
 The maximum window the kernel can be executed on. More...
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *src, const ITensorInfo *dst, const TensorShape &original_src_shape, DataLayout data_layout)
 Static function to check if given info will lead to a valid configuration. More...
 
- Static Public Member Functions inherited from ICLKernel
static constexpr unsigned int num_arguments_per_1D_array ()
 Returns the number of arguments enqueued per 1D array object. More...
 
static constexpr unsigned int num_arguments_per_1D_tensor ()
 Returns the number of arguments enqueued per 1D tensor object. More...
 
static constexpr unsigned int num_arguments_per_2D_tensor ()
 Returns the number of arguments enqueued per 2D tensor object. More...
 
static constexpr unsigned int num_arguments_per_3D_tensor ()
 Returns the number of arguments enqueued per 3D tensor object. More...
 
static constexpr unsigned int num_arguments_per_4D_tensor ()
 Returns the number of arguments enqueued per 4D tensor object. More...
 
static cl::NDRange gws_from_window (const Window &window)
 Get the global work size given an execution window. More...
 

Detailed Description

Definition at line 45 of file ClConvertFullyConnectedWeightsKernel.h.

Constructor & Destructor Documentation

◆ ClConvertFullyConnectedWeightsKernel()

Definition at line 43 of file ClConvertFullyConnectedWeightsKernel.cpp.

References arm_compute::ELEMENTWISE.

44 {
46 }
Elementwise CL kernel type.
Definition: CLTypes.h:84

Member Function Documentation

◆ ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE()

ARM_COMPUTE_DISALLOW_COPY_ALLOW_MOVE ( ClConvertFullyConnectedWeightsKernel  )

◆ configure()

void configure ( const CLCompileContext & compile_context,
const ITensorInfo * src,
ITensorInfo * dst,
const TensorShape & original_src_shape,
DataLayout  data_layout 
)

Set the src and dst tensor.

Parameters
[in]   compile_context      The compile context to be used.
[in]   src                  Source weights tensor info to convert. Must be 2-dimensional. Data types supported: All.
[out]  dst                  The converted weights tensor info. Shape and data type: same as src.
[in]   original_src_shape   Shape of the original src tensor (the one entering the fully connected layer).
[in]   data_layout          The data layout the weights have been trained in.

Definition at line 48 of file ClConvertFullyConnectedWeightsKernel.cpp.

References CLBuildOptions::add_option(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), arm_compute::CHANNEL, ICloneable< T >::clone(), arm_compute::create_kernel(), ITensorInfo::element_size(), arm_compute::get_cl_unsigned_type_from_element_size(), arm_compute::get_data_layout_dimension_index(), arm_compute::get_padding_info(), arm_compute::has_padding_changed(), arm_compute::HEIGHT, arm_compute::NCHW, arm_compute::NHWC, CLBuildOptions::options(), arm_compute::test::validation::src, arm_compute::support::cpp11::to_string(), ClConvertFullyConnectedWeightsKernel::validate(), and arm_compute::WIDTH.

50 {
52 
53  // Output tensor auto initialisation if not yet initialized
54  auto_init_if_empty(*dst, *src->clone());
55 
56  auto padding_info = get_padding_info({ src, dst });
57 
59 
61 
62  const int width_idx = get_data_layout_dimension_index(src_data_layout, DataLayoutDimension::WIDTH);
63  const int height_idx = get_data_layout_dimension_index(src_data_layout, DataLayoutDimension::HEIGHT);
64  const int channel_idx = get_data_layout_dimension_index(src_data_layout, DataLayoutDimension::CHANNEL);
65 
66  const unsigned int num_elems_per_src_plane = original_src_shape[width_idx] * original_src_shape[height_idx];
67  const unsigned int num_channels = original_src_shape[channel_idx];
68 
69  const unsigned int factor_1 = (data_layout == DataLayout::NCHW) ? num_elems_per_src_plane : num_channels;
70  const unsigned int factor_2 = (data_layout == DataLayout::NCHW) ? num_channels : num_elems_per_src_plane;
71 
72  // Set build options
73  CLBuildOptions build_opts;
74  build_opts.add_option("-DDATA_TYPE=" + get_cl_unsigned_type_from_element_size(src->element_size()));
75  build_opts.add_option("-DFACTOR_1=" + support::cpp11::to_string(factor_1));
76  build_opts.add_option("-DFACTOR_2=" + support::cpp11::to_string(factor_2));
77 
78  // Create kernel
79  _kernel = create_kernel(compile_context, "convert_fc_weights", build_opts.options());
80 
81  // Configure kernel window
82  Window win = calculate_max_window(*src, Steps());
83  ICLKernel::configure_internal(win);
84 
86 }
Window calculate_max_window(const ValidRegion &valid_region, const Steps &steps, bool skip_border, BorderSize border_size)
std::string to_string(T &&value)
Convert integer and float values to string.
#define ARM_COMPUTE_ERROR_ON(cond)
If the condition is true then an error message is printed and an exception thrown.
Definition: Error.h:466
#define ARM_COMPUTE_ERROR_THROW_ON(status)
Definition: Error.h:455
SimpleTensor< float > src
Definition: DFT.cpp:155
cl::Kernel create_kernel(const CLCompileContext &ctx, const std::string &kernel_name, const std::set< std::string > &build_opts=std::set< std::string >())
Creates an opencl kernel using a compile context.
Definition: CLHelpers.cpp:391
static Status validate(const ITensorInfo *src, const ITensorInfo *dst, const TensorShape &original_src_shape, DataLayout data_layout)
Static function to check if given info will lead to a valid configuration.
bool auto_init_if_empty(ITensorInfo &info, const TensorShape &shape, int num_channels, DataType data_type, QuantizationInfo quantization_info=QuantizationInfo())
Auto initialize the tensor info (shape, number of channels and data type) if the current assignment i...
bool has_padding_changed(const std::unordered_map< const ITensorInfo *, PaddingSize > &padding_map)
Check if the previously stored padding info has changed after configuring a kernel.
Definition: Utils.cpp:533
Num samples, channels, height, width.
size_t get_data_layout_dimension_index(const DataLayout &data_layout, const DataLayoutDimension &data_layout_dimension)
Get the index of the given dimension.
Definition: Helpers.inl:193
Num samples, height, width, channels.
std::string get_cl_unsigned_type_from_element_size(size_t element_size)
Translates the element size to an unsigned integer data type.
Definition: CLHelpers.cpp:105
std::unordered_map< const ITensorInfo *, PaddingSize > get_padding_info(std::initializer_list< const ITensorInfo *> infos)
Stores padding information before configuring a kernel.
Definition: Utils.cpp:518
#define ARM_COMPUTE_ERROR_ON_NULLPTR(...)
Definition: Validate.h:157
DataLayout
[DataLayout enum definition]
Definition: Types.h:113
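
The factor arithmetic in the configure() listing above can be reproduced outside the library. The following is a minimal standalone sketch, not the library's code: Layout, Dim, dimension_index and compute_factors are hypothetical stand-ins, and src_layout is passed in explicitly because the statement that defines src_data_layout is not visible in the listing. The index mapping assumes shapes are stored fastest-varying dimension first (NCHW as [W, H, C], NHWC as [C, W, H]).

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical stand-ins for arm_compute::DataLayout and DataLayoutDimension.
enum class Layout { NCHW, NHWC };
enum class Dim { WIDTH, HEIGHT, CHANNEL };

// Assumed index mapping: shapes list the fastest-varying dimension first,
// so NCHW is stored as [W, H, C] and NHWC as [C, W, H].
std::size_t dimension_index(Layout layout, Dim dim)
{
    if(layout == Layout::NCHW)
    {
        return dim == Dim::WIDTH ? 0 : (dim == Dim::HEIGHT ? 1 : 2);
    }
    return dim == Dim::CHANNEL ? 0 : (dim == Dim::WIDTH ? 1 : 2);
}

// Mirrors the arithmetic shown in configure(): returns {factor_1, factor_2}.
// src_layout is an explicit parameter here (an assumption of this sketch).
std::pair<unsigned int, unsigned int> compute_factors(const std::array<unsigned int, 3> &original_src_shape,
                                                      Layout data_layout, Layout src_layout)
{
    const unsigned int plane = original_src_shape[dimension_index(src_layout, Dim::WIDTH)]
                               * original_src_shape[dimension_index(src_layout, Dim::HEIGHT)];
    const unsigned int channels = original_src_shape[dimension_index(src_layout, Dim::CHANNEL)];

    // NCHW-trained weights use the plane size first, NHWC-trained weights the channel count.
    return (data_layout == Layout::NCHW) ? std::make_pair(plane, channels)
                                         : std::make_pair(channels, plane);
}
```

With a 7 x 7 x 512 feature map, the plane size is 49 and the channel count is 512; which of the two becomes FACTOR_1 depends only on data_layout.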

◆ run_op()

void run_op ( ITensorPack & tensors,
const Window & window,
::cl::CommandQueue &  queue 
)
override

Definition at line 108 of file ClConvertFullyConnectedWeightsKernel.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC, ICLKernel::add_2D_tensor_argument(), ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, arm_compute::test::validation::dst, arm_compute::enqueue(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), ICLKernel::lws_hint(), arm_compute::test::validation::src, and IKernel::window().

109 {
112 
113  const auto src = utils::cast::polymorphic_downcast<const ICLTensor *>(tensors.get_const_tensor(TensorType::ACL_SRC));
114  auto dst = utils::cast::polymorphic_downcast<ICLTensor *>(tensors.get_tensor(TensorType::ACL_DST));
116 
117  unsigned int idx = 0;
119  add_2D_tensor_argument(idx, dst, window);
120  enqueue(queue, *this, window, lws_hint());
121 }
const Window & window() const
The maximum window the kernel can be executed on.
Definition: IKernel.cpp:28
void enqueue(cl::CommandQueue &queue, ICLKernel &kernel, const Window &window, const cl::NDRange &lws_hint=CLKernelLibrary::get().default_ndrange(), bool use_dummy_work_items=false)
Add the kernel to the command queue with the given window.
Definition: ICLKernel.cpp:32
cl::NDRange lws_hint() const
Return the Local-Workgroup-Size hint.
Definition: ICLKernel.h:318
#define ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(k)
Definition: Validate.h:915
void add_2D_tensor_argument(unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx...
Definition: ICLKernel.h:190
#define ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(f, s)
Definition: Validate.h:201
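
run_op() receives its tensors through an ITensorPack and looks them up by slot (ACL_SRC, ACL_DST) instead of keeping them as members. The sketch below illustrates that pattern only. TensorPack, Tensor, SlotId and this free-standing run_op are hypothetical stand-ins, not the library's types, and the copy at the end is a placeholder for the enqueued OpenCL work.

```cpp
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins; the real types are ITensorPack and ICLTensor.
enum SlotId { SRC = 0, DST = 1 };

struct Tensor
{
    std::vector<int> data;
};

class TensorPack
{
public:
    void add(SlotId id, Tensor *tensor) { _slots[id] = tensor; }

    // Returns nullptr when the slot has not been filled.
    Tensor *get(SlotId id) const
    {
        auto it = _slots.find(id);
        return it == _slots.end() ? nullptr : it->second;
    }

private:
    std::map<SlotId, Tensor *> _slots;
};

// A stateless "kernel": tensors are resolved from the pack at run time.
void run_op(const TensorPack &pack)
{
    Tensor *src = pack.get(SRC);
    Tensor *dst = pack.get(DST);
    if(src == nullptr || dst == nullptr)
    {
        throw std::invalid_argument("pack must provide SRC and DST");
    }
    dst->data = src->data; // placeholder for the real work
}
```

Because the kernel object holds no tensor pointers, the same configured instance can be run against different packs.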

◆ validate()

Status validate ( const ITensorInfo * src,
const ITensorInfo * dst,
const TensorShape & original_src_shape,
DataLayout  data_layout 
)
static

Static function to check if given info will lead to a valid configuration.

Similar to ClConvertFullyConnectedWeightsKernel::configure()

Returns
a status

Definition at line 88 of file ClConvertFullyConnectedWeightsKernel.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_F16_UNSUPPORTED, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES, ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR, ITensorInfo::data_type(), ITensorInfo::dimension(), ITensorInfo::num_dimensions(), ITensorInfo::total_size(), TensorShape::total_size_lower(), and arm_compute::UNKNOWN.

Referenced by ClConvertFullyConnectedWeightsKernel::configure(), and ClConvertFullyConnectedWeights::validate().

90 {
94  ARM_COMPUTE_RETURN_ERROR_ON(src->num_dimensions() != 2);
95  ARM_COMPUTE_RETURN_ERROR_ON(src->dimension(1) != original_src_shape.total_size_lower(3));
97 
98  // Checks performed when dst is configured
99  if(dst->total_size() != 0)
100  {
103  }
104 
105  return Status{};
106 }
#define ARM_COMPUTE_RETURN_ERROR_ON_F16_UNSUPPORTED(tensor)
Definition: CLValidate.h:35
#define ARM_COMPUTE_RETURN_ERROR_ON(cond)
If the condition is true, an error is returned.
Definition: Error.h:296
#define ARM_COMPUTE_RETURN_ERROR_ON_NULLPTR(...)
Definition: Validate.h:159
#define ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_SHAPES(...)
Definition: Validate.h:439
#define ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES(...)
Definition: Validate.h:541
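
The two shape checks visible in the validate() listing (lines 94 and 95) can be illustrated in isolation. This is a minimal sketch under stated assumptions: Shape, total_size_lower and shapes_are_compatible are hypothetical helpers, and total_size_lower(shape, n) is taken to be the product of the first n dimensions.

```cpp
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>; // hypothetical stand-in for TensorShape

// Product of the first `dimension` entries (entries beyond the shape count as 1).
std::size_t total_size_lower(const Shape &shape, std::size_t dimension)
{
    std::size_t total = 1;
    for(std::size_t i = 0; i < dimension && i < shape.size(); ++i)
    {
        total *= shape[i];
    }
    return total;
}

// True when the weights are 2D and their second dimension matches the
// flattened size of the tensor that enters the fully connected layer.
bool shapes_are_compatible(const Shape &weights, const Shape &original_src_shape)
{
    return weights.size() == 2 && weights[1] == total_size_lower(original_src_shape, 3);
}
```

For example, an original shape of 7 x 7 x 512 flattens to 25088 elements, so a weights shape whose second dimension is 25088 passes, while one with 4096 does not.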

The documentation for this class was generated from the following files: