24.02.1
Interface for filling the border of a kernel.
#include <CLFillBorderKernel.h>
Public Member Functions
CLFillBorderKernel ()
Default constructor.
CLFillBorderKernel (const CLFillBorderKernel &)=delete
Prevent instances of this class from being copied (As this class contains pointers)
CLFillBorderKernel & | operator= (const CLFillBorderKernel &)=delete
Prevent instances of this class from being copied (As this class contains pointers)
CLFillBorderKernel (CLFillBorderKernel &&)=default
Allow instances of this class to be moved.
CLFillBorderKernel & | operator= (CLFillBorderKernel &&)=default
Allow instances of this class to be moved.
~CLFillBorderKernel ()=default
Default destructor.
void | configure (const CLCompileContext &compile_context, ICLTensor *tensor, BorderSize border_size, BorderMode border_mode, const PixelValue &constant_border_value=PixelValue())
Initialise the kernel's input, output and border mode.
void | configure (ICLTensor *tensor, BorderSize border_size, BorderMode border_mode, const PixelValue &constant_border_value=PixelValue())
Initialise the kernel's input, output and border mode.
void | configure (const CLCompileContext &compile_context, ITensorInfo *tensor, BorderSize border_size, BorderMode border_mode, const PixelValue &constant_border_value=PixelValue())
Initialise the kernel's input, output and border mode.
template<class T >
void | set_constant_border (unsigned int idx, const PixelValue &constant_border_value)
Function to set the constant value on fill border kernel depending on type.
void | run_op (ITensorPack &tensors, const Window &window, cl::CommandQueue &queue) override
Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.
void | run (const Window &window, cl::CommandQueue &queue) override
Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.
bool | is_parallelisable () const override
Indicates whether or not the kernel is parallelisable.
Public Member Functions inherited from ICLKernel
ICLKernel ()
Constructor.
cl::Kernel & | kernel ()
Returns a reference to the OpenCL kernel of this object.
CLKernelType | type () const
Returns the CL kernel type.
template<typename T >
void | add_1D_array_argument (unsigned int &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
Add the passed 1D array's parameters to the object's kernel's arguments starting from the index idx.
void | add_1D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx.
void | add_1D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.
void | add_2D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx.
void | add_2D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.
void | add_3D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 3D tensor's parameters to the object's kernel's arguments starting from the index idx.
void | add_4D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 4D tensor's parameters to the object's kernel's arguments starting from the index idx.
void | add_5D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
Add the passed 5D tensor's parameters to the object's kernel's arguments starting from the index idx.
void | add_3d_tensor_nhw_argument (unsigned int &idx, const ICLTensor *tensor)
Add the passed NHW 3D tensor's parameters to the object's kernel's arguments by passing strides, dimensions and the offset to the first valid element in bytes.
void | add_4d_tensor_nhwc_argument (unsigned int &idx, const ICLTensor *tensor)
Add the passed NHWC 4D tensor's parameters to the object's kernel's arguments by passing strides, dimensions and the offset to the first valid element in bytes.
template<typename T >
void | add_argument (unsigned int &idx, T value)
Add the passed parameters to the object's kernel's arguments starting from the index idx.
void | set_lws_hint (const cl::NDRange &lws_hint)
Set the Local-Workgroup-Size hint.
cl::NDRange | lws_hint () const
Return the Local-Workgroup-Size hint.
void | set_wbsm_hint (const cl_int &wbsm_hint)
Set the workgroup batch size modifier hint.
cl_int | wbsm_hint () const
Return the workgroup batch size modifier hint.
const std::string & | config_id () const
Get the configuration ID.
void | set_target (GPUTarget target)
Set the targeted GPU architecture.
void | set_target (cl::Device &device)
Set the targeted GPU architecture according to the CL device.
GPUTarget | get_target () const
Get the targeted GPU architecture.
size_t | get_max_workgroup_size ()
Get the maximum workgroup size for the device the CLKernelLibrary uses.
cl::NDRange | get_cached_gws () const
Get the cached gws used to enqueue this kernel.
void | cache_gws (const cl::NDRange &gws)
Cache the latest gws used to enqueue this kernel.
template<unsigned int dimension_size>
void | add_tensor_argument (unsigned &idx, const ICLTensor *tensor, const Window &window)
template<typename T , unsigned int dimension_size>
void | add_array_argument (unsigned &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
Add the passed array's parameters to the object's kernel's arguments starting from the index idx.
Public Member Functions inherited from IKernel
IKernel ()
Constructor.
virtual | ~IKernel ()=default
Destructor.
virtual BorderSize | border_size () const
The size of the border for that kernel.
const Window & | window () const
The maximum window the kernel can be executed on.
bool | is_window_configured () const
Function to check if the embedded window of this kernel has been configured.
Additional Inherited Members | |
Static Public Member Functions inherited from ICLKernel
constexpr static unsigned int | num_arguments_per_3d_tensor_nhw ()
Returns the number of arguments enqueued per NHW 3D Tensor object.
constexpr static unsigned int | num_arguments_per_4d_tensor_nhwc ()
Returns the number of arguments enqueued per NHWC 4D Tensor object.
constexpr static unsigned int | num_arguments_per_1D_array ()
Returns the number of arguments enqueued per 1D array object.
constexpr static unsigned int | num_arguments_per_1D_tensor ()
Returns the number of arguments enqueued per 1D tensor object.
constexpr static unsigned int | num_arguments_per_2D_tensor ()
Returns the number of arguments enqueued per 2D tensor object.
constexpr static unsigned int | num_arguments_per_3D_tensor ()
Returns the number of arguments enqueued per 3D tensor object.
constexpr static unsigned int | num_arguments_per_4D_tensor ()
Returns the number of arguments enqueued per 4D tensor object.
static cl::NDRange | gws_from_window (const Window &window, bool use_dummy_work_items)
Get the global work size given an execution window.
Interface for filling the border of a kernel.
Definition at line 37 of file CLFillBorderKernel.h.
CLFillBorderKernel ()
Default constructor.
Definition at line 41 of file CLFillBorderKernel.cpp.
References arm_compute::ELEMENTWISE.
CLFillBorderKernel (const CLFillBorderKernel &) | delete
Prevent instances of this class from being copied (As this class contains pointers)
CLFillBorderKernel (CLFillBorderKernel &&) | default
Allow instances of this class to be moved.
~CLFillBorderKernel () | default
Default destructor.
void configure (const CLCompileContext &compile_context, ICLTensor *tensor, BorderSize border_size, BorderMode border_mode, const PixelValue &constant_border_value=PixelValue())
Initialise the kernel's input, output and border mode.
[in] | compile_context | The compile context to be used. |
[in,out] | tensor | Tensor to process. Data types supported: U8/QASYMM8/S8/QASYMM8_SIGNED/U16/S16/U32/S32/F16/F32. |
[in] | border_size | Size of the border to fill in elements. |
[in] | border_mode | Border mode to use for the convolution. |
[in] | constant_border_value | (Optional) Constant value to use for borders if border_mode is set to CONSTANT. |
Definition at line 67 of file CLFillBorderKernel.cpp.
References IKernel::border_size(), and tensor.
Referenced by CLFillBorderKernel::configure(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::configure(), ClWinogradConv2d::configure(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::configure(), and arm_compute::test::validation::DATA_TEST_CASE().
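To make the effect of border_size, border_mode and constant_border_value concrete, the following is a minimal, self-contained sketch of what filling a border means on a small 2D image. It does not call the library: SketchBorderMode and fill_border_sketch are hypothetical names introduced for illustration, the replicate mode is assumed to copy the nearest edge element, and the real kernel fills the padding region of the tensor in place on the GPU rather than returning a new buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's border mode enum (illustration only).
enum class SketchBorderMode { Undefined, Constant, Replicate };

// Copies a rows x cols row-major image into a buffer padded by `border`
// elements on every side, then fills the padding according to `mode`.
std::vector<int> fill_border_sketch(const std::vector<int>& image,
                                    std::size_t rows, std::size_t cols,
                                    std::size_t border, SketchBorderMode mode,
                                    int constant_value = 0)
{
    assert(rows > 0 && cols > 0 && image.size() == rows * cols);
    const std::size_t padded_rows = rows + 2 * border;
    const std::size_t padded_cols = cols + 2 * border;

    // Constant mode: start from the constant. Other modes: start from 0,
    // which is only a placeholder for "unspecified" in Undefined mode.
    const int initial = (mode == SketchBorderMode::Constant) ? constant_value : 0;
    std::vector<int> out(padded_rows * padded_cols, initial);

    for (std::size_t r = 0; r < padded_rows; ++r) {
        for (std::size_t c = 0; c < padded_cols; ++c) {
            const bool inside = r >= border && r < border + rows &&
                                c >= border && c < border + cols;
            if (!inside && mode != SketchBorderMode::Replicate) {
                continue; // border element keeps its initial value
            }
            // Clamp the padded coordinate onto the nearest valid image element.
            const std::size_t src_r = r < border ? 0 : std::min(r - border, rows - 1);
            const std::size_t src_c = c < border ? 0 : std::min(c - border, cols - 1);
            out[r * padded_cols + c] = image[src_r * cols + src_c];
        }
    }
    return out;
}
```

For a 2x2 image {1, 2, 3, 4} with a border of one element, the constant mode with value 9 surrounds the image with 9s, while the replicate mode repeats each edge element outwards.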
void configure (const CLCompileContext &compile_context, ITensorInfo *tensor, BorderSize border_size, BorderMode border_mode, const PixelValue &constant_border_value=PixelValue())
Initialise the kernel's input, output and border mode.
[in] | compile_context | The compile context to be used. |
[in,out] | tensor | Tensor to process. Data types supported: U8/QASYMM8/S8/QASYMM8_SIGNED/U16/S16/U32/S32/F16/F32. |
[in] | border_size | Size of the border to fill in elements. |
[in] | border_mode | Border mode to use for the convolution. |
[in] | constant_border_value | (Optional) Constant value to use for borders if border_mode is set to CONSTANT. |
Definition at line 77 of file CLFillBorderKernel.cpp.
References CLBuildOptions::add_option(), ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON, IKernel::border_size(), BorderSize::bottom, arm_compute::CONSTANT, arm_compute::create_kernel(), Window::DimX, Window::DimY, Window::DimZ, dt, BorderSize::empty(), arm_compute::F16, arm_compute::F32, arm_compute::get_cl_type_from_data_type(), arm_compute::get_padding_info(), arm_compute::has_padding_changed(), kernel_name, BorderSize::left, BorderSize::limit(), arm_compute::lower_string(), ICLKernel::num_arguments_per_3D_tensor(), CLBuildOptions::options(), arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, BorderSize::right, arm_compute::S16, arm_compute::S32, arm_compute::S8, Window::set(), arm_compute::string_from_border_mode(), arm_compute::string_from_data_type(), tensor, arm_compute::support::cpp11::to_string(), BorderSize::top, arm_compute::U16, arm_compute::U32, arm_compute::U8, arm_compute::UNDEFINED, and Window::use_tensor_dimensions().
void configure (ICLTensor *tensor, BorderSize border_size, BorderMode border_mode, const PixelValue &constant_border_value=PixelValue())
Initialise the kernel's input, output and border mode.
[in,out] | tensor | Tensor to process. Data types supported: U8/QASYMM8/S8/QASYMM8_SIGNED/U16/S16/U32/S32/F16/F32. |
[in] | border_size | Size of the border to fill in elements. |
[in] | border_mode | Border mode to use for the convolution. |
[in] | constant_border_value | (Optional) Constant value to use for borders if border_mode is set to CONSTANT. |
Definition at line 59 of file CLFillBorderKernel.cpp.
References IKernel::border_size(), CLFillBorderKernel::configure(), CLKernelLibrary::get(), and tensor.
bool is_parallelisable () const | override virtual
Indicates whether or not the kernel is parallelisable.
If the kernel is parallelisable then the window returned by window() can be split into sub-windows which can then be run in parallel.
If the kernel is not parallelisable then only the window returned by window() can be passed to run().
Reimplemented from IKernel.
Definition at line 46 of file CLFillBorderKernel.cpp.
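The idea of splitting a window into sub-windows can be illustrated with a one-dimensional sketch. The names SketchRange and split_range_sketch are hypothetical and are not part of the library; the library's Window is multi-dimensional and has its own splitting logic, so this only shows the general principle of dividing an execution range into non-overlapping pieces that together cover the full range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical half-open range [start, end) standing in for one window dimension.
struct SketchRange {
    std::size_t start;
    std::size_t end;
};

// Splits `full` into at most `parts` contiguous, non-overlapping sub-ranges.
// Earlier sub-ranges receive one extra element when the split is uneven.
std::vector<SketchRange> split_range_sketch(SketchRange full, std::size_t parts)
{
    assert(parts > 0 && full.start <= full.end);
    const std::size_t total = full.end - full.start;
    const std::size_t base  = total / parts;
    const std::size_t extra = total % parts;

    std::vector<SketchRange> out;
    std::size_t cursor = full.start;
    for (std::size_t i = 0; i < parts; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        if (len == 0) {
            continue; // more parts requested than elements available
        }
        out.push_back({cursor, cursor + len});
        cursor += len;
    }
    return out;
}
```

A caller that treats a kernel as parallelisable could hand each sub-range to a separate worker; a caller that does not would pass the full range in a single call.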
CLFillBorderKernel & operator= (CLFillBorderKernel &&) | default
Allow instances of this class to be moved.
CLFillBorderKernel & operator= (const CLFillBorderKernel &) | delete
Prevent instances of this class from being copied (As this class contains pointers)
void run (const Window &window, cl::CommandQueue &queue) | override virtual
Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.
[in] | window | Region on which to execute the kernel. (Must be a valid region of the window returned by window()). |
[in,out] | queue | Command queue on which to enqueue the kernel. |
Reimplemented from ICLKernel.
Definition at line 208 of file CLFillBorderKernel.cpp.
References ICLKernel::add_3D_tensor_argument(), ARM_COMPUTE_ERROR_ON_MISMATCHING_WINDOWS, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, Window::collapse_if_possible(), Window::DimZ, arm_compute::enqueue(), Window::first_slice_window_3D(), ICLKernel::lws_hint(), arm_compute::test::validation::reference::slice(), Window::slide_window_slice_3D(), and IKernel::window().
Referenced by arm_compute::test::validation::DATA_TEST_CASE().
void run_op (ITensorPack &tensors, const Window &window, cl::CommandQueue &queue) | override virtual
Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.
[in] | tensors | A vector containing the tensors to operate on. |
[in] | window | Region on which to execute the kernel. (Must be a valid region of the window returned by window()). |
[in,out] | queue | Command queue on which to enqueue the kernel. |
Reimplemented from ICLKernel.
Definition at line 183 of file CLFillBorderKernel.cpp.
References arm_compute::ACL_SRC, ICLKernel::add_3D_tensor_argument(), ARM_COMPUTE_ERROR_ON_MISMATCHING_WINDOWS, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, Window::collapse_if_possible(), Window::DimZ, arm_compute::enqueue(), Window::first_slice_window_3D(), ITensorPack::get_const_tensor(), ICLKernel::lws_hint(), arm_compute::test::validation::reference::slice(), Window::slide_window_slice_3D(), tensor, and IKernel::window().
template<class T >
void set_constant_border (unsigned int idx, const PixelValue &constant_border_value)
Function to set the constant value on fill border kernel depending on type.
[in] | idx | Index of the kernel argument to set. |
[in] | constant_border_value | Constant value to use for borders if border_mode is set to CONSTANT. |
Definition at line 52 of file CLFillBorderKernel.cpp.
References PixelValue::get().
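A type-dependent setter of this kind can be sketched as a function template that reads the stored value as the requested type and writes it into an argument slot. Everything below is a hypothetical stand-in written for illustration: SketchPixelValue, SketchKernelArgs and set_constant_border_sketch are not library types, and the sketch assumes only that the value object exposes a typed get(), as the reference to PixelValue::get() above suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical value holder: stores a number and converts it on request.
struct SketchPixelValue {
    double stored;

    template <typename T>
    void get(T& out) const { out = static_cast<T>(stored); }
};

// Hypothetical kernel argument list: each slot holds the raw bytes of one argument.
struct SketchKernelArgs {
    std::vector<std::vector<unsigned char>> slots;

    template <typename T>
    void set(unsigned int idx, const T& value)
    {
        if (slots.size() <= idx) {
            slots.resize(idx + 1);
        }
        slots[idx].resize(sizeof(T));
        std::memcpy(slots[idx].data(), &value, sizeof(T));
    }
};

// Reads the constant as type T and stores it at argument index `idx`,
// so the argument size matches the element type chosen by the caller.
template <typename T>
void set_constant_border_sketch(SketchKernelArgs& args, unsigned int idx,
                                const SketchPixelValue& constant_border_value)
{
    T typed{};
    constant_border_value.get(typed);
    args.set(idx, typed);
    assert(args.slots[idx].size() == sizeof(T));
}
```

The caller selects T from the tensor's data type, for example std::uint8_t for an 8-bit unsigned tensor or float for a 32-bit floating-point tensor, so the same constant is passed with the width the kernel expects.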