Compute Library
 21.02
CLAccumulateWeightedKernel Class Reference

Interface for the accumulate weighted kernel.

#include <CLAccumulateKernel.h>


Public Member Functions

void configure (const ICLTensor *input, float alpha, ICLTensor *accum)
 Set the input and accumulation images, and the scale value.

void configure (const CLCompileContext &compile_context, const ICLTensor *input, float alpha, ICLTensor *accum)
 Set the input and accumulation images, and the scale value.
 
- Public Member Functions inherited from ICLSimple2DKernel
void run (const Window &window, cl::CommandQueue &queue) override
 Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.

- Public Member Functions inherited from ICLSimpleKernel
 ICLSimpleKernel ()
 Constructor.

 ICLSimpleKernel (const ICLSimpleKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

ICLSimpleKernel & operator= (const ICLSimpleKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 ICLSimpleKernel (ICLSimpleKernel &&)=default
 Allow instances of this class to be moved.

ICLSimpleKernel & operator= (ICLSimpleKernel &&)=default
 Allow instances of this class to be moved.

 ~ICLSimpleKernel ()=default
 Default destructor.

void configure (const ICLTensor *input, ICLTensor *output, unsigned int num_elems_processed_per_iteration, bool border_undefined=false, const BorderSize &border_size=BorderSize())
 Configure the kernel.
 
- Public Member Functions inherited from ICLKernel
 ICLKernel ()
 Constructor.

cl::Kernel & kernel ()
 Returns a reference to the OpenCL kernel of this object.

template<typename T >
void add_1D_array_argument (unsigned int &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
 Add the passed 1D array's parameters to the object's kernel's arguments starting from the index idx.

void add_1D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx.

void add_1D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.

void add_2D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx.

void add_2D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.

void add_3D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 3D tensor's parameters to the object's kernel's arguments starting from the index idx.

void add_4D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
 Add the passed 4D tensor's parameters to the object's kernel's arguments starting from the index idx.

virtual void run_op (ITensorPack &tensors, const Window &window, cl::CommandQueue &queue)
 Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.
 
template<typename T >
void add_argument (unsigned int &idx, T value)
 Add the passed parameters to the object's kernel's arguments starting from the index idx.

void set_lws_hint (const cl::NDRange &lws_hint)
 Set the Local-Workgroup-Size hint.

cl::NDRange lws_hint () const
 Return the Local-Workgroup-Size hint.

void set_wbsm_hint (const cl_int &wbsm_hint)
 Set the workgroup batch size modifier hint.

cl_int wbsm_hint () const
 Return the workgroup batch size modifier hint.

const std::string & config_id () const
 Get the configuration ID.

void set_target (GPUTarget target)
 Set the targeted GPU architecture.

void set_target (cl::Device &device)
 Set the targeted GPU architecture according to the CL device.

GPUTarget get_target () const
 Get the targeted GPU architecture.

size_t get_max_workgroup_size ()
 Get the maximum workgroup size for the device the CLKernelLibrary uses.

template<unsigned int dimension_size>
void add_tensor_argument (unsigned &idx, const ICLTensor *tensor, const Window &window)

template<typename T , unsigned int dimension_size>
void add_array_argument (unsigned &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
 Add the passed array's parameters to the object's kernel's arguments starting from the index idx.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.

virtual ~IKernel ()=default
 Destructor.

virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.

virtual BorderSize border_size () const
 The size of the border for that kernel.

const Window & window () const
 The maximum window the kernel can be executed on.
 

Additional Inherited Members

- Static Public Member Functions inherited from ICLKernel
static constexpr unsigned int num_arguments_per_1D_array ()
 Returns the number of arguments enqueued per 1D array object.

static constexpr unsigned int num_arguments_per_1D_tensor ()
 Returns the number of arguments enqueued per 1D tensor object.

static constexpr unsigned int num_arguments_per_2D_tensor ()
 Returns the number of arguments enqueued per 2D tensor object.

static constexpr unsigned int num_arguments_per_3D_tensor ()
 Returns the number of arguments enqueued per 3D tensor object.

static constexpr unsigned int num_arguments_per_4D_tensor ()
 Returns the number of arguments enqueued per 4D tensor object.

static cl::NDRange gws_from_window (const Window &window)
 Get the global work size given an execution window.
 

Detailed Description

Interface for the accumulate weighted kernel.

Weighted accumulation is computed:

\[ accum(x,y) = (1 - \alpha)*accum(x,y) + \alpha*input(x,y) \]

where \( 0 \le \alpha \le 1 \). Conceptually, the rounding is defined as follows, with output(x,y) denoting the accumulation image accum(x,y):

\[ output(x,y)= uint8( (1 - \alpha) * float32( int32( output(x,y) ) ) + \alpha * float32( int32( input(x,y) ) ) ) \]

Definition at line 67 of file CLAccumulateKernel.h.
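
The two formulas above can be sketched as a CPU reference in C++. This is a minimal illustration under stated assumptions, not the OpenCL kernel itself: the function name `accumulate_weighted_reference` is hypothetical, the images are modelled as flat `std::vector<std::uint8_t>` buffers of equal size, and the final conversion to `uint8` is assumed to truncate toward zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical CPU reference for one weighted-accumulation step.
// accum is updated in place: accum = (1 - alpha) * accum + alpha * input.
inline void accumulate_weighted_reference(const std::vector<std::uint8_t> &input,
                                          float alpha,
                                          std::vector<std::uint8_t> &accum)
{
    if (alpha < 0.0f || alpha > 1.0f)
    {
        throw std::invalid_argument("alpha must be in the range [0, 1]");
    }
    if (input.size() != accum.size())
    {
        throw std::invalid_argument("input and accum must have the same size");
    }

    for (std::size_t i = 0; i < accum.size(); ++i)
    {
        // Widen U8 -> int32 -> float32, as in the conceptual rounding formula.
        const float acc = static_cast<float>(static_cast<std::int32_t>(accum[i]));
        const float in  = static_cast<float>(static_cast<std::int32_t>(input[i]));
        const float blended = (1.0f - alpha) * acc + alpha * in;

        // A convex combination of U8 values stays within the U8 range.
        assert(blended >= 0.0f && blended < 256.0f);

        // Assumption: the float -> uint8 conversion truncates toward zero.
        accum[i] = static_cast<std::uint8_t>(blended);
    }
}
```

With alpha = 0.25, an accumulator value of 100 and an input value of 200 give 0.75 * 100 + 0.25 * 200 = 125. An accumulator value of 0 and an input value of 255 give 63.75, which this sketch stores as 63.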

Member Function Documentation

◆ configure() [1/2]

void configure ( const ICLTensor * input,
float alpha,
ICLTensor * accum
)

Set the input and accumulation images, and the scale value.

Parameters
[in]      input  Source tensor. Data types supported: U8.
[in]      alpha  Scalar value in the range [0, 1.0]. Data types supported: F32.
[in,out]  accum  Accumulated tensor. Data types supported: U8.

Definition at line 58 of file CLAccumulateKernel.cpp.

References CLAccumulateKernel::configure(), and CLKernelLibrary::get().

{
    configure(CLKernelLibrary::get().get_compile_context(), input, alpha, accum);
}

void configure (const ICLTensor *input, float alpha, ICLTensor *accum)
 Set the input and accumulation images, and the scale value.

static CLKernelLibrary & get ()
 Access the KernelLibrary singleton.
◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
const ICLTensor * input,
float alpha,
ICLTensor * accum
)

Set the input and accumulation images, and the scale value.

Parameters
[in]      compile_context  The compile context to be used.
[in]      input            Source tensor. Data types supported: U8.
[in]      alpha            Scalar value in the range [0, 1.0]. Data types supported: F32.
[in,out]  accum            Accumulated tensor. Data types supported: U8.

Definition at line 63 of file CLAccumulateKernel.cpp.

References ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ICLSimpleKernel::configure(), arm_compute::create_kernel(), ICLKernel::num_arguments_per_2D_tensor(), num_elems_processed_per_iteration, and arm_compute::U8.

{
    // (Source lines 65-66, which use ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, are not shown in this listing)
    ARM_COMPUTE_ERROR_ON(alpha < 0.0 || alpha > 1.0);

    // Create kernel
    _kernel = create_kernel(compile_context, "accumulate_weighted");

    // Set static kernel arguments
    unsigned int idx = 2 * num_arguments_per_2D_tensor(); // Skip the input and output parameters
    _kernel.setArg(idx++, alpha);

    // Configure kernel window
    // (Source line 77, which calls ICLSimpleKernel::configure(), is not shown in this listing)
}

arm_compute::U8
 1 channel, 1 U8 per channel

#define ARM_COMPUTE_ERROR_ON(cond)
 If the condition is true then an error message is printed and an exception thrown. Definition: Error.h:466

void configure (const ICLTensor *input, ICLTensor *output, unsigned int num_elems_processed_per_iteration, bool border_undefined=false, const BorderSize &border_size=BorderSize())
 Configure the kernel.

cl::Kernel create_kernel (const CLCompileContext &ctx, const std::string &kernel_name, const std::set< std::string > &build_opts=std::set< std::string >())
 Creates an OpenCL kernel using a compile context. Definition: CLHelpers.cpp:403

static constexpr unsigned int num_arguments_per_2D_tensor ()
 Returns the number of arguments enqueued per 2D tensor object. Definition: ICLKernel.h:206

#define ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(t, c,...)
 Definition: Validate.h:790

unsigned int num_elems_processed_per_iteration
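
The index arithmetic in the listing above (skip the argument slots reserved for the input and accumulation tensors, then place alpha in the next slot) can be sketched on its own. Everything here is illustrative: `StaticArgLayout` and `plan_accumulate_weighted_args` are hypothetical names, and the per-tensor argument count of 6 is an assumption standing in for whatever `ICLKernel::num_arguments_per_2D_tensor()` actually returns.

```cpp
#include <cassert>
#include <stdexcept>

// Assumption: a 2D tensor occupies 6 kernel argument slots.
// The real value is provided by ICLKernel::num_arguments_per_2D_tensor().
constexpr unsigned int kArgsPer2DTensor = 6;

// Hypothetical description of where each group of arguments lands.
struct StaticArgLayout
{
    unsigned int input_first; // first slot used by the input tensor
    unsigned int accum_first; // first slot used by the accumulation tensor
    unsigned int alpha_index; // slot used by the static alpha argument
    unsigned int next_free;   // first slot after alpha
};

inline StaticArgLayout plan_accumulate_weighted_args(float alpha)
{
    // Same range check as the ARM_COMPUTE_ERROR_ON in the listing above.
    if (alpha < 0.0f || alpha > 1.0f)
    {
        throw std::invalid_argument("alpha must be in the range [0, 1]");
    }

    // Skip the slots reserved for the input and accumulation tensors.
    unsigned int idx = 2 * kArgsPer2DTensor;

    StaticArgLayout layout{0, kArgsPer2DTensor, 0, 0};
    layout.alpha_index = idx++; // alpha takes the first slot after both tensors
    layout.next_free   = idx;

    assert(layout.alpha_index == layout.accum_first + kArgsPer2DTensor);
    return layout;
}
```

Under the assumed count of 6, the input tensor uses slots 0 to 5, the accumulation tensor uses slots 6 to 11, and alpha lands in slot 12. Setting alpha once at configure time works because it does not change between runs, whereas the tensor slots are filled when the kernel is enqueued.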

The documentation for this class was generated from the following files: