Compute Library
 21.02
NEAccumulateSquaredKernel Class Reference

Interface for the accumulate squared kernel.

#include <NEAccumulateKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel.

 NEAccumulateSquaredKernel ()
 Default constructor.

 NEAccumulateSquaredKernel (const NEAccumulateSquaredKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

NEAccumulateSquaredKernel & operator= (const NEAccumulateSquaredKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 NEAccumulateSquaredKernel (NEAccumulateSquaredKernel &&)=default
 Allow instances of this class to be moved.

NEAccumulateSquaredKernel & operator= (NEAccumulateSquaredKernel &&)=default
 Allow instances of this class to be moved.

 ~NEAccumulateSquaredKernel ()=default
 Default destructor.

void configure (const ITensor *input, uint32_t shift, ITensor *accum)
 Set the input and accumulation tensors and the shift value.

void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from ICPPSimpleKernel
 ICPPSimpleKernel ()
 Constructor.

 ICPPSimpleKernel (const ICPPSimpleKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

ICPPSimpleKernel & operator= (const ICPPSimpleKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 ICPPSimpleKernel (ICPPSimpleKernel &&)=default
 Allow instances of this class to be moved.

ICPPSimpleKernel & operator= (ICPPSimpleKernel &&)=default
 Allow instances of this class to be moved.

 ~ICPPSimpleKernel ()=default
 Default destructor.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.

virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.

virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.

virtual ~IKernel ()=default
 Destructor.

virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.

virtual BorderSize border_size () const
 The size of the border for that kernel.

const Window & window () const
 The maximum window the kernel can be executed on.
 

Detailed Description

Interface for the accumulate squared kernel.

The accumulation of squares is computed:

\[ accum(x,y) = saturate_{int16} ( (uint16) accum(x,y) + (((uint16)(input(x,y)^2)) >> (shift)) ) \]

Where \( 0 \le shift \le 15 \)
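
The formula above can be read as a per-pixel scalar operation. The following is a minimal sketch of that reading, not the library's NEON implementation: the function name accumulate_squared_scalar is hypothetical, and the sum is assumed to be clamped to the largest int16 value (32767) after the accumulator is reinterpreted as uint16.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar reference for one pixel; not part of Compute Library.
inline int16_t accumulate_squared_scalar(int16_t accum, uint8_t input, uint32_t shift)
{
    assert(shift <= 15); // documented range of the shift value

    // input^2 is at most 255 * 255 = 65025, so it fits in 16 bits.
    const uint32_t squared = static_cast<uint32_t>(input) * input;
    const uint32_t shifted = squared >> shift;

    // Reinterpret the accumulator as uint16, as in the formula, and add in
    // 32 bits so the sum cannot wrap before it is clamped.
    const uint32_t sum = static_cast<uint32_t>(static_cast<uint16_t>(accum)) + shifted;

    // saturate_int16: clamp to the largest value an int16 can hold.
    const uint32_t max_s16 = static_cast<uint32_t>(std::numeric_limits<int16_t>::max());
    return static_cast<int16_t>(std::min(sum, max_s16));
}
```

For example, with accum = 10, input = 4 and shift = 1, the sketch computes 10 + (16 >> 1) = 18; with accum = 0, input = 255 and shift = 0, the sum 65025 is clamped to 32767.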

Definition at line 149 of file NEAccumulateKernel.h.

Constructor & Destructor Documentation

◆ NEAccumulateSquaredKernel() [1/3]

Default constructor.

Definition at line 321 of file NEAccumulateKernel.cpp.

322  : _shift(0)
323 {
324 }

◆ NEAccumulateSquaredKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEAccumulateSquaredKernel() [3/3]

Allow instances of this class to be moved.

◆ ~NEAccumulateSquaredKernel()

Default destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor * input,
uint32_t shift,
ITensor * accum
)

Set the input and accumulation tensors and the shift value.

Parameters
[in]      input  Source tensor. Data type supported: U8.
[in]      shift  Shift value in the range [0, 15].
[in,out]  accum  Accumulated tensor. Data type supported: S16.
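
The parameter constraints can be illustrated without the library. The sketch below is an assumption-laden stand-in: accumulate_squared_buffer is a hypothetical function that uses flat std::vector buffers in place of ITensor objects, throws std::invalid_argument where the library uses its ARM_COMPUTE_ERROR_ON macros, and applies the accumulation in a plain scalar loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for configure() plus a full pass over the data,
// using flat std::vector buffers instead of ITensor objects.
inline void accumulate_squared_buffer(const std::vector<uint8_t> &input,
                                      uint32_t                    shift,
                                      std::vector<int16_t>       &accum)
{
    // Mirrors set_shape_if_empty(): an empty accumulator takes the input's size.
    if(accum.empty())
    {
        accum.assign(input.size(), 0);
    }
    if(accum.size() != input.size())
    {
        throw std::invalid_argument("input and accum must have the same number of elements");
    }
    if(shift > 15)
    {
        throw std::invalid_argument("shift must be in the range [0, 15]");
    }
    assert(accum.size() == input.size()); // invariant for the loop below

    for(std::size_t i = 0; i < input.size(); ++i)
    {
        // Square the U8 input, shift it, add it to the accumulator read as
        // uint16, then clamp to the int16 maximum.
        const uint32_t squared = static_cast<uint32_t>(input[i]) * input[i];
        const uint32_t sum     = static_cast<uint16_t>(accum[i]) + (squared >> shift);
        accum[i]               = static_cast<int16_t>(std::min<uint32_t>(sum, 32767u));
    }
}
```

Calling the function repeatedly with successive input frames accumulates into the same buffer, which is how an accumulation tensor marked [in,out] is meant to be used.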

Definition at line 326 of file NEAccumulateKernel.cpp.

References ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_MISMATCHING_SHAPES, ARM_COMPUTE_ERROR_ON_NULLPTR, ITensor::info(), num_elems_processed_per_iteration, arm_compute::S16, arm_compute::set_format_if_unknown(), arm_compute::set_shape_if_empty(), ITensorInfo::tensor_shape(), and arm_compute::U8.

327 {
329 
330  set_shape_if_empty(*accum->info(), input->info()->tensor_shape());
331 
332  set_format_if_unknown(*accum->info(), Format::S16);
333 
337  ARM_COMPUTE_ERROR_ON(shift > 15);
338 
339  _shift = shift;
340 
341  constexpr unsigned int num_elems_processed_per_iteration = 16;
342  INESimpleKernel::configure(input, accum, num_elems_processed_per_iteration);
343 }

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 152 of file NEAccumulateKernel.h.

References NEAccumulateKernel::configure(), arm_compute::test::validation::info, arm_compute::test::validation::input, NEAccumulateKernel::operator=(), NEAccumulateKernel::run(), and IKernel::window().

153  {
154  return "NEAccumulateSquaredKernel";
155  }

◆ operator=() [1/2]

NEAccumulateSquaredKernel& operator= ( const NEAccumulateSquaredKernel & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window,
const ThreadInfo & info
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.
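
The width requirement in the note above can be sketched in one dimension. This is a simplified illustration, not the library's Window or execute_window_loop(): block_offsets is a hypothetical helper, and the step of 16 used in the usage note is the value that configure() passes as num_elems_processed_per_iteration for this kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical one-dimensional stand-in for a window row; not a Compute Library type.
// Returns the offset of the first element of every block in [start, end).
inline std::vector<std::size_t> block_offsets(std::size_t start, std::size_t end, std::size_t step)
{
    assert(step > 0);
    // The width of the region has to be a whole number of blocks.
    if(end < start || (end - start) % step != 0)
    {
        throw std::invalid_argument("window width must be a multiple of the step");
    }

    std::vector<std::size_t> offsets;
    for(std::size_t x = start; x < end; x += step)
    {
        offsets.push_back(x); // one kernel iteration would start here
    }
    return offsets;
}
```

With a row of 48 elements and a step of 16, the helper yields the offsets 0, 16 and 32, one per iteration. A row of 40 elements is rejected because it is not a multiple of 16.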

Reimplemented from ICPPKernel.

Definition at line 345 of file NEAccumulateKernel.cpp.

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::execute_window_loop(), arm_compute::test::validation::input, Iterator::ptr(), and IKernel::window().

346 {
350  Iterator input(_input, window);
351  Iterator accum(_output, window);
352 
353  execute_window_loop(window, [&](const Coordinates &)
354  {
355  acc_sq_v16_u8(input.ptr(), _shift, accum.ptr());
356  },
357  input, accum);
358 }

The documentation for this class was generated from the following files: