Compute Library 21.05
ICLSimpleKernel Class Reference

Interface for simple OpenCL kernels having 1 tensor input and 1 tensor output.

#include <ICLSimpleKernel.h>


Public Member Functions

 ICLSimpleKernel ()
 Constructor.
 
 ICLSimpleKernel (const ICLSimpleKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
ICLSimpleKernel & operator= (const ICLSimpleKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 ICLSimpleKernel (ICLSimpleKernel &&)=default
 Allow instances of this class to be moved.
 
ICLSimpleKernel & operator= (ICLSimpleKernel &&)=default
 Allow instances of this class to be moved.
 
 ~ICLSimpleKernel ()=default
 Default destructor.
 
void configure (const ICLTensor *input, ICLTensor *output, unsigned int num_elems_processed_per_iteration, bool border_undefined=false, const BorderSize &border_size=BorderSize())
 Configure the kernel.
 
Public Member Functions inherited from ICLKernel
 ICLKernel ()
 Constructor.
 
cl::Kernel & kernel ()
 Returns a reference to the OpenCL kernel of this object.
 
template<typename T >
void add_1D_array_argument (unsigned int &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
 Add the passed 1D array's parameters to the object's kernel's arguments starting from the index idx.
 
void add_1D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx.
 
void add_1D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.
 
void add_2D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx.
 
void add_2D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.
 
void add_3D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 3D tensor's parameters to the object's kernel's arguments starting from the index idx.
 
void add_4D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 4D tensor's parameters to the object's kernel's arguments starting from the index idx.
 
virtual void run (const Window &window, cl::CommandQueue &queue)
 Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.
 
virtual void run_op (ITensorPack &tensors, const Window &window, cl::CommandQueue &queue)
 Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.
 
template<typename T >
void add_argument (unsigned int &idx, T value)
 Add the passed parameters to the object's kernel's arguments starting from the index idx.
 
void set_lws_hint (const cl::NDRange &lws_hint)
 Set the Local-Workgroup-Size hint.
 
cl::NDRange lws_hint () const
 Return the Local-Workgroup-Size hint.
 
void set_wbsm_hint (const cl_int &wbsm_hint)
 Set the workgroup batch size modifier hint.
 
cl_int wbsm_hint () const
 Return the workgroup batch size modifier hint.
 
const std::string & config_id () const
 Get the configuration ID.
 
void set_target (GPUTarget target)
 Set the targeted GPU architecture.
 
void set_target (cl::Device &device)
 Set the targeted GPU architecture according to the CL device.
 
GPUTarget get_target () const
 Get the targeted GPU architecture.
 
size_t get_max_workgroup_size ()
 Get the maximum workgroup size for the device the CLKernelLibrary uses.
 
template<unsigned int dimension_size>
void add_tensor_argument (unsigned &idx, const ICLTensor *tensor, const Window &window)
 
template<typename T , unsigned int dimension_size>
void add_array_argument (unsigned &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
 Add the passed array's parameters to the object's kernel's arguments starting from the index idx.
 
Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.
 

Additional Inherited Members

Static Public Member Functions inherited from ICLKernel
static constexpr unsigned int num_arguments_per_1D_array ()
 Returns the number of arguments enqueued per 1D array object.
 
static constexpr unsigned int num_arguments_per_1D_tensor ()
 Returns the number of arguments enqueued per 1D tensor object.
 
static constexpr unsigned int num_arguments_per_2D_tensor ()
 Returns the number of arguments enqueued per 2D tensor object.
 
static constexpr unsigned int num_arguments_per_3D_tensor ()
 Returns the number of arguments enqueued per 3D tensor object.
 
static constexpr unsigned int num_arguments_per_4D_tensor ()
 Returns the number of arguments enqueued per 4D tensor object.
 
static cl::NDRange gws_from_window (const Window &window)
 Get the global work size given an execution window.
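
The inherited add_*_argument helpers above take idx by non-const reference and add arguments "starting from the index idx", which suggests a running-index pattern: each call binds its arguments and leaves idx pointing at the next free slot. The following is a minimal, library-free sketch of that pattern. KernelArgsSketch, add_tensor_sketch, add_argument_sketch and the count of three arguments per tensor are all hypothetical and are not taken from ICLKernel.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a kernel's argument table (not ICLKernel).
class KernelArgsSketch
{
public:
    // Hypothetical number of arguments one tensor contributes in this sketch.
    static constexpr unsigned int args_per_tensor_sketch()
    {
        return 3;
    }

    // Binds three illustrative arguments for a tensor and advances idx past them.
    void add_tensor_sketch(unsigned int &idx, const std::string &name)
    {
        set_arg(idx++, name + ".buffer");
        set_arg(idx++, name + ".stride");
        set_arg(idx++, name + ".offset");
    }

    // Binds a single scalar argument and advances idx by one.
    template <typename T>
    void add_argument_sketch(unsigned int &idx, T value)
    {
        set_arg(idx++, std::to_string(value));
    }

    const std::vector<std::string> &args() const
    {
        return _args;
    }

private:
    // Stores a textual description of the argument at position idx.
    void set_arg(unsigned int idx, const std::string &value)
    {
        if(idx >= _args.size())
        {
            _args.resize(idx + 1);
        }
        _args[idx] = value;
    }

    std::vector<std::string> _args;
};

// Binds input, output and one scalar in sequence; returns the next free index.
inline unsigned int bind_input_output_sketch(KernelArgsSketch &kernel)
{
    unsigned int idx = 0;
    kernel.add_tensor_sketch(idx, "input");
    kernel.add_tensor_sketch(idx, "output");
    kernel.add_argument_sketch(idx, 42);
    return idx;
}
```

Because every helper advances the same idx, the caller never hard-codes argument positions; in this sketch the scalar lands at index 2 * args_per_tensor_sketch().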
 

Detailed Description

Interface for simple OpenCL kernels having 1 tensor input and 1 tensor output.

Definition at line 34 of file ICLSimpleKernel.h.

Constructor & Destructor Documentation

◆ ICLSimpleKernel() [1/3]

Constructor.

Definition at line 33 of file ICLSimpleKernel.cpp.

ICLSimpleKernel::ICLSimpleKernel()
    : _input(nullptr), _output(nullptr)
{
}

◆ ICLSimpleKernel() [2/3]

ICLSimpleKernel ( const ICLSimpleKernel & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ ICLSimpleKernel() [3/3]

ICLSimpleKernel ( ICLSimpleKernel &&  )
default

Allow instances of this class to be moved.

◆ ~ICLSimpleKernel()

~ICLSimpleKernel ( )
default

Default destructor.
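
The copy and move rules documented above can be illustrated without the library. The sketch below uses hypothetical stand-in types (TensorSketch, SimpleKernelSketch) that only mirror the declared special member functions and the two pointer members initialised by the constructor; it is not the library's implementation.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a tensor; not the library's ICLTensor.
struct TensorSketch
{
    int id;
};

// Hypothetical kernel holding two non-owning pointers, mirroring _input/_output.
class SimpleKernelSketch
{
public:
    SimpleKernelSketch() : _input(nullptr), _output(nullptr) {}
    SimpleKernelSketch(const SimpleKernelSketch &) = delete;            // no copies
    SimpleKernelSketch &operator=(const SimpleKernelSketch &) = delete; // no copies
    SimpleKernelSketch(SimpleKernelSketch &&) = default;                // moves allowed
    SimpleKernelSketch &operator=(SimpleKernelSketch &&) = default;     // moves allowed
    ~SimpleKernelSketch() = default;

    // Stores the pointers; the kernel does not take ownership of the tensors.
    void configure(const TensorSketch *input, TensorSketch *output)
    {
        _input  = input;
        _output = output;
    }

    const TensorSketch *input() const { return _input; }
    TensorSketch *output() const { return _output; }

private:
    const TensorSketch *_input;
    TensorSketch       *_output;
};

// Compile-time checks of the copy/move contract.
static_assert(!std::is_copy_constructible<SimpleKernelSketch>::value, "copy construction is deleted");
static_assert(!std::is_copy_assignable<SimpleKernelSketch>::value, "copy assignment is deleted");
static_assert(std::is_move_constructible<SimpleKernelSketch>::value, "move construction is allowed");
static_assert(std::is_move_assignable<SimpleKernelSketch>::value, "move assignment is allowed");

// A move-constructed kernel refers to the same tensors as its source.
inline bool move_keeps_pointers()
{
    TensorSketch src{1};
    TensorSketch dst{2};
    SimpleKernelSketch a;
    a.configure(&src, &dst);
    SimpleKernelSketch b(std::move(a));
    return b.input() == &src && b.output() == &dst;
}
```

Note that a defaulted move of raw pointers simply copies the pointer values, so the tensors themselves must outlive whichever kernel object ends up using them.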

Member Function Documentation

◆ configure()

void configure ( const ICLTensor * input,
ICLTensor * output,
unsigned int num_elems_processed_per_iteration,
bool border_undefined = false,
const BorderSize & border_size = BorderSize()
)

Configure the kernel.

Parameters
[in]   input                              Source tensor.
[out]  output                             Destination tensor.
[in]   num_elems_processed_per_iteration  Number of processed elements per iteration.
[in]   border_undefined                   (Optional) True if the border mode is undefined. False if it's replicate or constant.
[in]   border_size                        (Optional) Size of the border.

Definition at line 38 of file ICLSimpleKernel.cpp.

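void ICLSimpleKernel::configure(const ICLTensor *input, ICLTensor *output, unsigned int num_elems_processed_per_iteration, bool border_undefined, const BorderSize &border_size)
{
    _input  = input;
    _output = output;

    // Configure kernel window
    Window                 win = calculate_max_window(*input->info(), Steps(num_elems_processed_per_iteration), border_undefined, border_size);
    AccessWindowHorizontal output_access(output->info(), 0, num_elems_processed_per_iteration);

    update_window_and_padding(win,
                              AccessWindowHorizontal(input->info(), 0, num_elems_processed_per_iteration),
                              output_access);

    output_access.set_valid_region(win, input->info()->valid_region(), border_undefined, border_size);

    ICLKernel::configure_internal(win);
}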

References IKernel::border_size(), arm_compute::calculate_max_window(), ITensor::info(), arm_compute::test::validation::input, num_elems_processed_per_iteration, and arm_compute::update_window_and_padding().
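
To make the role of num_elems_processed_per_iteration more concrete, here is a simplified one-dimensional model of how a step size can affect an execution window and the padding a row needs. It assumes the window end is the row width rounded up to a multiple of the step; borders, higher dimensions and valid regions are ignored. All names ending in _sketch are hypothetical and are not library functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round value up to the next multiple of step (step > 0).
inline std::size_t ceil_to_multiple_sketch(std::size_t value, std::size_t step)
{
    return ((value + step - 1) / step) * step;
}

// Hypothetical 1D window: iterate x from start to end (exclusive) in increments of step.
struct WindowXSketch
{
    std::size_t start;
    std::size_t end;
    std::size_t step;
};

// Assumed model: the window covers the whole row, rounded up to a full step.
inline WindowXSketch max_window_x_sketch(std::size_t width, std::size_t num_elems_processed_per_iteration)
{
    return WindowXSketch{ 0, ceil_to_multiple_sketch(width, num_elems_processed_per_iteration), num_elems_processed_per_iteration };
}

// Elements of right-hand padding needed so the last iteration stays inside the row buffer.
inline std::size_t required_right_padding_sketch(std::size_t width, const WindowXSketch &win)
{
    return win.end - width;
}

// Number of iterations along x for the given window.
inline std::size_t num_iterations_sketch(const WindowXSketch &win)
{
    return (win.end - win.start) / win.step;
}
```

Under this model, a row of 10 elements processed 4 at a time gives a window of [0, 12) with 3 iterations and 2 elements of right padding, while a row of 16 elements needs no padding.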

◆ operator=() [1/2]

ICLSimpleKernel& operator= ( const ICLSimpleKernel & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

ICLSimpleKernel& operator= ( ICLSimpleKernel &&  )
default

Allow instances of this class to be moved.


The documentation for this class was generated from the following files:
ICLSimpleKernel.h
ICLSimpleKernel.cpp