Interface for the pixelwise multiplication kernel.

#include <CLPixelWiseMultiplicationKernel.h>

Public Member Functions
	CLPixelWiseMultiplicationKernel ()
	Default constructor.

	CLPixelWiseMultiplicationKernel (const CLPixelWiseMultiplicationKernel &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

CLPixelWiseMultiplicationKernel &	operator= (const CLPixelWiseMultiplicationKernel &)=delete
	Prevent instances of this class from being copied (as this class contains pointers).

	CLPixelWiseMultiplicationKernel (CLPixelWiseMultiplicationKernel &&)=default
	Allow instances of this class to be moved.

CLPixelWiseMultiplicationKernel &	operator= (CLPixelWiseMultiplicationKernel &&)=default
	Allow instances of this class to be moved.

void	configure (ITensorInfo *input1, ITensorInfo *input2, ITensorInfo *output, float scale, ConvertPolicy overflow_policy, RoundingPolicy rounding_policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
	Initialise the kernel's input, output and border mode.

void	configure (const CLCompileContext &compile_context, ITensorInfo *input1, ITensorInfo *input2, ITensorInfo *output, float scale, ConvertPolicy overflow_policy, RoundingPolicy rounding_policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
	Initialise the kernel's input, output and border mode.

void	run_op (ITensorPack &tensors, const Window &window, cl::CommandQueue &queue) override
	Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.

BorderSize	border_size () const override
	The size of the border for that kernel.

Public Member Functions inherited from ICLKernel
	ICLKernel ()
	Constructor.

cl::Kernel &	kernel ()
	Returns a reference to the OpenCL kernel of this object.

template<typename T >
void	add_1D_array_argument (unsigned int &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
	Add the passed 1D array's parameters to the object's kernel's arguments starting from the index idx.

void	add_1D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
	Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx.

void	add_1D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
	Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.

void	add_2D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
	Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx.

void	add_2D_tensor_argument_if (bool cond, unsigned int &idx, const ICLTensor *tensor, const Window &window)
	Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx if the condition is true.

void	add_3D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
	Add the passed 3D tensor's parameters to the object's kernel's arguments starting from the index idx.

void	add_4D_tensor_argument (unsigned int &idx, const ICLTensor *tensor, const Window &window)
	Add the passed 4D tensor's parameters to the object's kernel's arguments starting from the index idx.

virtual void	run (const Window &window, cl::CommandQueue &queue)
	Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.

template<typename T >
void	add_argument (unsigned int &idx, T value)
	Add the passed parameters to the object's kernel's arguments starting from the index idx.

void	set_lws_hint (const cl::NDRange &lws_hint)
	Set the Local-Workgroup-Size hint.

cl::NDRange	lws_hint () const
	Return the Local-Workgroup-Size hint.

void	set_wbsm_hint (const cl_int &wbsm_hint)
	Set the workgroup batch size modifier hint.

cl_int	wbsm_hint () const
	Return the workgroup batch size modifier hint.

const std::string &	config_id () const
	Get the configuration ID.

void	set_target (GPUTarget target)
	Set the targeted GPU architecture.

void	set_target (cl::Device &device)
	Set the targeted GPU architecture according to the CL device.

GPUTarget	get_target () const
	Get the targeted GPU architecture.

size_t	get_max_workgroup_size ()
	Get the maximum workgroup size for the device the CLKernelLibrary uses.

template<unsigned int dimension_size>
void	add_tensor_argument (unsigned &idx, const ICLTensor *tensor, const Window &window)

template<typename T , unsigned int dimension_size>
void	add_array_argument (unsigned &idx, const ICLArray< T > *array, const Strides &strides, unsigned int num_dimensions, const Window &window)
	Add the passed array's parameters to the object's kernel's arguments starting from the index idx.

Public Member Functions inherited from IKernel
	IKernel ()
	Constructor.

virtual	~IKernel ()=default
	Destructor.

virtual bool	is_parallelisable () const
	Indicates whether or not the kernel is parallelisable.

const Window &	window () const
	The maximum window the kernel can be executed on.

Static Public Member Functions
static Status	validate (const ITensorInfo *input1, const ITensorInfo *input2, const ITensorInfo *output, float scale, ConvertPolicy overflow_policy, RoundingPolicy rounding_policy, const ActivationLayerInfo &act_info=ActivationLayerInfo())
	Static function to check if given info will lead to a valid configuration of CLPixelWiseMultiplicationKernel.

Static Public Member Functions inherited from ICLKernel
static constexpr unsigned int	num_arguments_per_1D_array ()
	Returns the number of arguments enqueued per 1D array object.

static constexpr unsigned int	num_arguments_per_1D_tensor ()
	Returns the number of arguments enqueued per 1D tensor object.

static constexpr unsigned int	num_arguments_per_2D_tensor ()
	Returns the number of arguments enqueued per 2D tensor object.

static constexpr unsigned int	num_arguments_per_3D_tensor ()
	Returns the number of arguments enqueued per 3D tensor object.

static constexpr unsigned int	num_arguments_per_4D_tensor ()
	Returns the number of arguments enqueued per 4D tensor object.

static cl::NDRange	gws_from_window (const Window &window)
	Get the global work size given an execution window.

Detailed Description

Interface for the pixelwise multiplication kernel.

Definition at line 36 of file CLPixelWiseMultiplicationKernel.h.

Constructor & Destructor Documentation

◆ CLPixelWiseMultiplicationKernel() [1/3]

CLPixelWiseMultiplicationKernel ( )

Default constructor.

Definition at line 143 of file CLPixelWiseMultiplicationKernel.cpp.

     : _input1(nullptr), _input2(nullptr), _output(nullptr)
 {
 }

◆ CLPixelWiseMultiplicationKernel() [2/3]

CLPixelWiseMultiplicationKernel ( const CLPixelWiseMultiplicationKernel & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ CLPixelWiseMultiplicationKernel() [3/3]

CLPixelWiseMultiplicationKernel ( CLPixelWiseMultiplicationKernel && )

default

Allow instances of this class to be moved.

Member Function Documentation

◆ border_size()

BorderSize border_size ( ) const

override virtual

The size of the border for that kernel.

Returns: The width in number of elements of the border.

Reimplemented from IKernel.

Definition at line 319 of file CLPixelWiseMultiplicationKernel.cpp.

References ARM_COMPUTE_CREATE_ERROR, ARM_COMPUTE_RETURN_ERROR_ON, ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_RETURN_ERROR_ON_MSG, arm_compute::auto_init_if_empty(), Window::broadcast_if_dimension_le_one(), TensorShape::broadcast_shape(), ITensorInfo::broadcast_shape_and_valid_region(), arm_compute::calculate_max_window(), ITensorInfo::data_type(), ITensorInfo::dimension(), ActivationLayerInfo::enabled(), arm_compute::F16, arm_compute::F32, arm_compute::detail::have_different_dimensions(), arm_compute::is_data_type_float(), ITensorInfo::num_channels(), arm_compute::RUNTIME_ERROR, AccessWindowRectangle::set_valid_region(), ITensorInfo::tensor_shape(), TensorShape::total_size(), ITensorInfo::total_size(), arm_compute::U, and arm_compute::update_window_and_padding().

 {
     const unsigned int replicateSize = _output->dimension(0) - std::min(_input1->dimension(0), _input2->dimension(0));
     const unsigned int border        = std::min<unsigned int>(num_elems_processed_per_iteration - 1U, replicateSize);
     return BorderSize{ 0, border, 0, 0 };
 }
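
The border computation above can be tried out in isolation. The following is a minimal sketch, not library code: kElemsPerIteration is a hypothetical stand-in for num_elems_processed_per_iteration (its real value is not shown on this page), and right_border is an illustrative free function that takes plain widths instead of tensor infos.

```cpp
#include <algorithm>

// Hypothetical stand-in for num_elems_processed_per_iteration; the value 16
// is an assumption made only for this illustration.
constexpr unsigned int kElemsPerIteration = 16;

// Right-hand border required when the narrower input is broadcast along X.
// Mirrors the arithmetic of border_size(): the border never exceeds one
// vector width minus one, and is zero when no replication is needed.
unsigned int right_border(unsigned int in1_width, unsigned int in2_width, unsigned int out_width)
{
    const unsigned int replicate_size = out_width - std::min(in1_width, in2_width);
    return std::min(kElemsPerIteration - 1U, replicate_size);
}
```

With equal widths (for example 32, 32, 32) the result is 0. When one input has width 1 and the output width is 32, the result is capped at kElemsPerIteration - 1.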

◆ configure() [1/2]

void configure	(	ITensorInfo *	input1,
		ITensorInfo *	input2,
		ITensorInfo *	output,
		float	scale,
		ConvertPolicy	overflow_policy,
		RoundingPolicy	rounding_policy,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`
	)

Initialise the kernel's input, output and border mode.

Valid configurations (Input1,Input2) -> Output :

(U8,U8) -> U8
(U8,U8) -> S16
(U8,S16) -> S16
(S16,U8) -> S16
(S16,S16) -> S16
(F16,F16) -> F16
(F32,F32) -> F32
(QASYMM8,QASYMM8) -> QASYMM8
(QASYMM8_SIGNED,QASYMM8_SIGNED) -> QASYMM8_SIGNED
(QSYMM16,QSYMM16) -> QSYMM16
(QSYMM16,QSYMM16) -> S32

Parameters

[in]	input1	An input tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[in]	input2	An input tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[out]	output	The output tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[in]	scale	Scale to apply after multiplication. Scale must be positive and its value must be either 1/255 or 1/2^n where n is between 0 and 15.
[in]	overflow_policy	Overflow policy. Supported overflow policies: Wrap, Saturate
[in]	rounding_policy	Rounding policy. Supported rounding modes: to zero, to nearest even.
[in]	act_info	(Optional) Activation layer information in case of a fused activation.
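
The table of valid configurations can be expressed as a small lookup. This is a sketch of the documented table only: the DataType enum below is a local stand-in for the library's enumeration, and is_documented_combination is a hypothetical helper. It does not reproduce the additional checks (shapes, scale, policies) that validate() performs.

```cpp
#include <algorithm>
#include <array>

// Local stand-in for the library's data type enumeration (illustration only).
enum class DataType { U8, S16, S32, F16, F32, QASYMM8, QASYMM8_SIGNED, QSYMM16 };

struct Combination
{
    DataType input1;
    DataType input2;
    DataType output;
};

// The (Input1,Input2) -> Output rows listed in the documentation.
static const std::array<Combination, 11> kValidCombinations = {{
    { DataType::U8, DataType::U8, DataType::U8 },
    { DataType::U8, DataType::U8, DataType::S16 },
    { DataType::U8, DataType::S16, DataType::S16 },
    { DataType::S16, DataType::U8, DataType::S16 },
    { DataType::S16, DataType::S16, DataType::S16 },
    { DataType::F16, DataType::F16, DataType::F16 },
    { DataType::F32, DataType::F32, DataType::F32 },
    { DataType::QASYMM8, DataType::QASYMM8, DataType::QASYMM8 },
    { DataType::QASYMM8_SIGNED, DataType::QASYMM8_SIGNED, DataType::QASYMM8_SIGNED },
    { DataType::QSYMM16, DataType::QSYMM16, DataType::QSYMM16 },
    { DataType::QSYMM16, DataType::QSYMM16, DataType::S32 },
}};

// Returns true if the triple appears in the documented table.
bool is_documented_combination(DataType in1, DataType in2, DataType out)
{
    return std::any_of(kValidCombinations.begin(), kValidCombinations.end(),
                       [=](const Combination &c)
                       { return c.input1 == in1 && c.input2 == in2 && c.output == out; });
}
```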

Definition at line 148 of file CLPixelWiseMultiplicationKernel.cpp.

References CLKernelLibrary::get().

 {
     configure(CLKernelLibrary::get().get_compile_context(), input1, input2, output, scale, overflow_policy, rounding_policy, act_info);
 }

◆ configure() [2/2]

void configure	(	const CLCompileContext &	compile_context,
		ITensorInfo *	input1,
		ITensorInfo *	input2,
		ITensorInfo *	output,
		float	scale,
		ConvertPolicy	overflow_policy,
		RoundingPolicy	rounding_policy,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`
	)

Initialise the kernel's input, output and border mode.

Valid configurations (Input1,Input2) -> Output :

(U8,U8) -> U8
(U8,U8) -> S16
(U8,S16) -> S16
(S16,U8) -> S16
(S16,S16) -> S16
(F16,F16) -> F16
(F32,F32) -> F32
(QASYMM8,QASYMM8) -> QASYMM8
(QASYMM8_SIGNED,QASYMM8_SIGNED) -> QASYMM8_SIGNED
(QSYMM16,QSYMM16) -> QSYMM16
(QSYMM16,QSYMM16) -> S32

Parameters

[in]	compile_context	The compile context to be used.
[in]	input1	An input tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[in]	input2	An input tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[out]	output	The output tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[in]	scale	Scale to apply after multiplication. Scale must be positive and its value must be either 1/255 or 1/2^n where n is between 0 and 15.
[in]	overflow_policy	Overflow policy. Supported overflow policies: Wrap, Saturate
[in]	rounding_policy	Rounding policy. Supported rounding modes: to zero, to nearest even.
[in]	act_info	(Optional) Activation layer information in case of a fused activation.
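
To make the scale, overflow_policy and rounding_policy parameters concrete, the scalar sketch below shows one plausible reading for the (U8,U8) -> U8 case: multiply, apply the scale, round, then convert with wrap or saturate. The enums are local stand-ins, mul_u8 is a hypothetical helper, and the order of operations is an assumption; none of this is taken from the OpenCL kernel source.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Local stand-ins for the library enumerations (illustration only).
enum class ConvertPolicy { WRAP, SATURATE };
enum class RoundingPolicy { TO_ZERO, TO_NEAREST_EVEN };

// Scalar reference for one U8 x U8 -> U8 element (assumed semantics).
uint8_t mul_u8(uint8_t a, uint8_t b, float scale, ConvertPolicy overflow, RoundingPolicy rounding)
{
    const double scaled = static_cast<double>(a) * static_cast<double>(b) * static_cast<double>(scale);
    // std::nearbyint rounds halfway cases to even under the default
    // floating-point rounding mode (FE_TONEAREST).
    const double rounded = (rounding == RoundingPolicy::TO_ZERO) ? std::trunc(scaled) : std::nearbyint(scaled);
    const long   value   = static_cast<long>(rounded);
    if(overflow == ConvertPolicy::SATURATE)
    {
        // Clamp to the representable U8 range.
        return static_cast<uint8_t>(std::min(255L, std::max(0L, value)));
    }
    // WRAP: keep only the low 8 bits.
    return static_cast<uint8_t>(value & 0xFF);
}
```

For example, 200 * 2 with scale 1 saturates to 255 but wraps to 144, and 3 * 5 with scale 0.5 gives 7 when rounding to zero and 8 when rounding to nearest even.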

Definition at line 154 of file CLPixelWiseMultiplicationKernel.cpp.

References ActivationLayerInfo::a(), ActivationLayerInfo::activation(), CLBuildOptions::add_option(), CLBuildOptions::add_option_if(), CLBuildOptions::add_option_if_else(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ActivationLayerInfo::b(), arm_compute::create_kernel(), ITensorInfo::data_type(), ITensorInfo::element_size(), ActivationLayerInfo::enabled(), arm_compute::F32, arm_compute::float_to_string_with_full_precision(), arm_compute::get_cl_type_from_data_type(), arm_compute::is_data_type_float(), arm_compute::is_data_type_quantized(), arm_compute::is_data_type_quantized_asymmetric(), kernel_name, arm_compute::lower_string(), ICLKernel::num_arguments_per_3D_tensor(), UniformQuantizationInfo::offset, CLBuildOptions::options(), ITensorInfo::quantization_info(), arm_compute::S32, UniformQuantizationInfo::scale, arm_compute::string_from_activation_func(), arm_compute::support::cpp11::to_string(), arm_compute::TO_ZERO, QuantizationInfo::uniform(), arm_compute::validate_arguments(), and arm_compute::WRAP.

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(input1, input2, output);
     ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input1, input2, output,
                                                   scale, overflow_policy, rounding_policy, act_info));
 
     // Configure kernel window
     auto win_config = validate_and_configure_window(input1, input2, output);
     ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
 
     _input1 = input1;
     _input2 = input2;
     _output = output;
 
     int scale_int = -1;
     // Extract sign, exponent and mantissa
     int   exponent            = 0;
     float normalized_mantissa = std::frexp(scale, &exponent);
     // Use int scaling if factor is equal to 1/2^n for 0 <= n <= 15
     // frexp returns 0.5 as mantissa which means that the exponent will be in the range of -14 <= e <= 1
     // Moreover, it will be at most 1 (equal to 1 - n) as we deal with 1/2^n
     if((normalized_mantissa == 0.5f) && (-14 <= exponent) && (exponent <= 1))
     {
         // Store the positive exponent. We know that we compute 1/2^n
         // Additionally we need to subtract 1 to compensate that frexp used a mantissa of 0.5
         scale_int = std::abs(exponent - 1);
     }
 
     std::string acc_type;
     // Check if it has float inputs and output
     if(is_data_type_float(input1->data_type()) || is_data_type_float(input2->data_type()))
     {
         scale_int = -1;
         acc_type  = (input1->data_type() == DataType::F32 || input2->data_type() == DataType::F32) ? "float" : "half";
     }
     else
     {
         if(input1->element_size() == 2 || input2->element_size() == 2)
         {
             // Use 32-bit accumulator for 16-bit input
             acc_type = "int";
         }
         else
         {
             // Use 16-bit accumulator for 8-bit input
             acc_type = "ushort";
         }
     }
 
     const bool is_quantized = is_data_type_quantized(input1->data_type());
 
     // Set kernel build options
     std::string    kernel_name = "pixelwise_mul";
     CLBuildOptions build_opts;
     build_opts.add_option("-DDATA_TYPE_IN1=" + get_cl_type_from_data_type(input1->data_type()));
     build_opts.add_option("-DDATA_TYPE_IN2=" + get_cl_type_from_data_type(input2->data_type()));
     build_opts.add_option("-DDATA_TYPE_OUT=" + get_cl_type_from_data_type(output->data_type()));
     build_opts.add_option("-DVEC_SIZE=" + support::cpp11::to_string(num_elems_processed_per_iteration));
     if(is_quantized && (output->data_type() != DataType::S32))
     {
         const UniformQuantizationInfo iq1_info = input1->quantization_info().uniform();
         const UniformQuantizationInfo iq2_info = input2->quantization_info().uniform();
         const UniformQuantizationInfo oq_info  = output->quantization_info().uniform();
 
         build_opts.add_option_if(is_data_type_quantized_asymmetric(input1->data_type()),
                                  "-DOFFSET_IN1=" + support::cpp11::to_string(iq1_info.offset));
         build_opts.add_option_if(is_data_type_quantized_asymmetric(input2->data_type()),
                                  "-DOFFSET_IN2=" + support::cpp11::to_string(iq2_info.offset));
         build_opts.add_option_if(is_data_type_quantized_asymmetric(output->data_type()),
                                  "-DOFFSET_OUT=" + support::cpp11::to_string(oq_info.offset));
         build_opts.add_option("-DSCALE_IN1=" + float_to_string_with_full_precision(iq1_info.scale));
         build_opts.add_option("-DSCALE_IN2=" + float_to_string_with_full_precision(iq2_info.scale));
         build_opts.add_option("-DSCALE_OUT=" + float_to_string_with_full_precision(oq_info.scale));
         kernel_name += "_quantized";
     }
     else
     {
         kernel_name += (scale_int >= 0) ? "_int" : "_float";
         build_opts.add_option_if_else(overflow_policy == ConvertPolicy::WRAP || is_data_type_float(output->data_type()), "-DWRAP", "-DSATURATE");
         build_opts.add_option_if_else(rounding_policy == RoundingPolicy::TO_ZERO, "-DROUND=_rtz", "-DROUND=_rte");
         build_opts.add_option("-DACC_DATA_TYPE=" + acc_type);
         if(act_info.enabled())
         {
             build_opts.add_option("-DACTIVATION_TYPE=" + lower_string(string_from_activation_func(act_info.activation())));
             build_opts.add_option("-DA_VAL=" + float_to_string_with_full_precision(act_info.a()));
             build_opts.add_option("-DB_VAL=" + float_to_string_with_full_precision(act_info.b()));
         }
     }
 
     // Create kernel
     _kernel = create_kernel(compile_context, kernel_name, build_opts.options());
 
     // Set scale argument
     unsigned int idx = 3 * num_arguments_per_3D_tensor(); // Skip the inputs and output parameters
 
     if(scale_int >= 0 && !is_quantized)
     {
         _kernel.setArg(idx++, scale_int);
     }
     else
     {
         _kernel.setArg(idx++, scale);
     }
 
     ICLKernel::configure_internal(win_config.second);
 }
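
The scale decoding step in the listing above can be isolated as follows. This is a minimal sketch: power_of_two_shift is a hypothetical free function that mirrors the std::frexp test used in configure(), returning n when scale equals 1/2^n for 0 <= n <= 15 and -1 otherwise.

```cpp
#include <cmath>
#include <cstdlib>

// Returns n when scale == 1/2^n with 0 <= n <= 15, otherwise -1.
int power_of_two_shift(float scale)
{
    int         exponent            = 0;
    const float normalized_mantissa = std::frexp(scale, &exponent);
    // frexp yields scale = mantissa * 2^exponent with mantissa in [0.5, 1).
    // For scale = 1/2^n the mantissa is exactly 0.5 and exponent = 1 - n,
    // so n in [0, 15] corresponds to exponent in [-14, 1].
    if(normalized_mantissa == 0.5f && exponent >= -14 && exponent <= 1)
    {
        return std::abs(exponent - 1);
    }
    return -1;
}
```

A scale of 1 maps to 0, 0.5 maps to 1, and 1/32768 maps to 15. Values such as 1/255, 2 or 1/65536 return -1, which in the listing above selects the floating-point path.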

◆ operator=() [1/2]

CLPixelWiseMultiplicationKernel& operator= ( const CLPixelWiseMultiplicationKernel & )

delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

CLPixelWiseMultiplicationKernel& operator= ( CLPixelWiseMultiplicationKernel && )

default

Allow instances of this class to be moved.

◆ run_op()

void run_op	(	ITensorPack &	tensors,
		const Window &	window,
		cl::CommandQueue &	queue
	)

override virtual

Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue.

Note: The queue is not flushed by this method, and therefore the kernel will not have been executed by the time this method returns.

Parameters

[in]	tensors	A vector containing the tensors to operate on.
[in]	window	Region on which to execute the kernel. (Must be a valid region of the window returned by window()).
[in,out]	queue	Command queue on which to enqueue the kernel.

Reimplemented from ICLKernel.

Definition at line 272 of file CLPixelWiseMultiplicationKernel.cpp.

References arm_compute::ACL_DST, arm_compute::ACL_SRC_0, arm_compute::ACL_SRC_1, ICLKernel::add_3D_tensor_argument(), ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, Window::broadcast_if_dimension_le_one(), Window::collapse_if_possible(), TensorShape::collapsed_from(), Window::DimZ, arm_compute::test::validation::dst, arm_compute::enqueue(), Window::first_slice_window_3D(), ITensorPack::get_const_tensor(), ITensorPack::get_tensor(), ICLKernel::lws_hint(), Dimensions< T >::num_dimensions(), arm_compute::test::validation::reference::slice(), Window::slide_window_slice_3D(), TensorShape::total_size(), and IKernel::window().

 {
     ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
     ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(ICLKernel::window(), window);
 
     const auto src_0 = utils::cast::polymorphic_downcast<const ICLTensor *>(tensors.get_const_tensor(TensorType::ACL_SRC_0));
     const auto src_1 = utils::cast::polymorphic_downcast<const ICLTensor *>(tensors.get_const_tensor(TensorType::ACL_SRC_1));
     auto       dst   = utils::cast::polymorphic_downcast<ICLTensor *>(tensors.get_tensor(TensorType::ACL_DST));
 
     const TensorShape &in_shape1 = src_0->info()->tensor_shape();
     const TensorShape &in_shape2 = src_1->info()->tensor_shape();
     const TensorShape &out_shape = dst->info()->tensor_shape();
 
     bool can_collapse = true;
     if(std::min(in_shape1.total_size(), in_shape2.total_size()) > 1)
     {
         can_collapse = (std::min(in_shape1.num_dimensions(), in_shape2.num_dimensions()) > Window::DimZ);
         for(size_t d = Window::DimZ; can_collapse && (d < out_shape.num_dimensions()); ++d)
         {
             can_collapse = (in_shape1[d] == in_shape2[d]);
         }
     }
 
     bool   has_collapsed = false;
     Window collapsed     = can_collapse ? window.collapse_if_possible(ICLKernel::window(), Window::DimZ, &has_collapsed) : window;
 
     const TensorShape &in_shape1_collapsed = has_collapsed ? in_shape1.collapsed_from(Window::DimZ) : in_shape1;
     const TensorShape &in_shape2_collapsed = has_collapsed ? in_shape2.collapsed_from(Window::DimZ) : in_shape2;
 
     Window slice        = collapsed.first_slice_window_3D();
     Window slice_input1 = slice.broadcast_if_dimension_le_one(in_shape1_collapsed);
     Window slice_input2 = slice.broadcast_if_dimension_le_one(in_shape2_collapsed);
 
     do
     {
         unsigned int idx = 0;
         add_3D_tensor_argument(idx, src_0, slice_input1);
         add_3D_tensor_argument(idx, src_1, slice_input2);
         add_3D_tensor_argument(idx, dst, slice);
         enqueue(queue, *this, slice, lws_hint());
 
         ARM_COMPUTE_UNUSED(collapsed.slide_window_slice_3D(slice_input1));
         ARM_COMPUTE_UNUSED(collapsed.slide_window_slice_3D(slice_input2));
     }
     while(collapsed.slide_window_slice_3D(slice));
 }
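
The collapse decision at the start of run_op() can be sketched with plain vectors. Shape, kDimZ, total_size, dim and can_collapse below are hypothetical stand-ins: Shape replaces TensorShape, kDimZ assumes that Window::DimZ is index 2, and dim() assumes that dimensions beyond a shape's rank read as 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

using Shape = std::vector<std::size_t>; // stand-in for TensorShape

constexpr std::size_t kDimZ = 2; // assumed index of the Z dimension

// Product of all dimensions (1 for an empty shape).
std::size_t total_size(const Shape &s)
{
    return std::accumulate(s.begin(), s.end(), std::size_t{ 1 }, std::multiplies<std::size_t>());
}

// Assumption: dimensions past the end of a shape read as 1.
std::size_t dim(const Shape &s, std::size_t d)
{
    return d < s.size() ? s[d] : 1;
}

// Mirrors the can_collapse logic: collapsing from Z upwards is allowed when
// one input is a single element, or when both inputs have more than two
// dimensions and agree on every dimension from Z upwards.
bool can_collapse(const Shape &in1, const Shape &in2, const Shape &out)
{
    if(std::min(total_size(in1), total_size(in2)) <= 1)
    {
        return true;
    }
    bool ok = std::min(in1.size(), in2.size()) > kDimZ;
    for(std::size_t d = kDimZ; ok && d < out.size(); ++d)
    {
        ok = (dim(in1, d) == dim(in2, d));
    }
    return ok;
}
```

Two inputs of shape {8, 8, 3, 2} can be collapsed, whereas broadcasting {8, 8, 1, 2} against {8, 8, 3, 2} cannot, because the inputs differ in the Z dimension.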

◆ validate()

Status validate	(	const ITensorInfo *	input1,
		const ITensorInfo *	input2,
		const ITensorInfo *	output,
		float	scale,
		ConvertPolicy	overflow_policy,
		RoundingPolicy	rounding_policy,
		const ActivationLayerInfo &	act_info = `ActivationLayerInfo()`
	)

static

Static function to check if given info will lead to a valid configuration of CLPixelWiseMultiplicationKernel.

Valid configurations (Input1,Input2) -> Output :

(U8,U8) -> U8
(U8,U8) -> S16
(U8,S16) -> S16
(S16,U8) -> S16
(S16,S16) -> S16
(F16,F16) -> F16
(F32,F32) -> F32
(QASYMM8,QASYMM8) -> QASYMM8
(QASYMM8_SIGNED,QASYMM8_SIGNED) -> QASYMM8_SIGNED
(QSYMM16,QSYMM16) -> QSYMM16
(QSYMM16,QSYMM16) -> S32

Parameters

[in]	input1	An input tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[in]	input2	An input tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[in]	output	The output tensor info. Data types supported: U8/QASYMM8/QASYMM8_SIGNED/S16/QSYMM16/F16/F32.
[in]	scale	Scale to apply after multiplication. Scale must be positive and its value must be either 1/255 or 1/2^n where n is between 0 and 15.
[in]	overflow_policy	Overflow policy. Supported overflow policies: Wrap, Saturate
[in]	rounding_policy	Rounding policy. Supported rounding modes: to zero, to nearest even.
[in]	act_info	(Optional) Activation layer information in case of a fused activation.

Returns: a status

Definition at line 262 of file CLPixelWiseMultiplicationKernel.cpp.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::validate_arguments().

Referenced by CLPixelWiseMultiplication::validate().

 {
     ARM_COMPUTE_ERROR_ON_NULLPTR(input1, input2, output);
     ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input1, input2, output, scale, overflow_policy, rounding_policy, act_info));
     ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(input1->clone().get(), input2->clone().get(), output->clone().get()).first);
 
     return Status{};
 }

The documentation for this class was generated from the following files:

src/core/CL/kernels/CLPixelWiseMultiplicationKernel.h
src/core/CL/kernels/CLPixelWiseMultiplicationKernel.cpp
