Compute Library
NEFastCornersKernel Class Reference

Neon kernel to perform fast corners.

#include <NEFastCornersKernel.h>

Public Member Functions

const char * name () const override
 Name of the kernel.
 NEFastCornersKernel ()
 Constructor.
 NEFastCornersKernel (const NEFastCornersKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
NEFastCornersKernel & operator= (const NEFastCornersKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 NEFastCornersKernel (NEFastCornersKernel &&)=default
 Allow instances of this class to be moved.
NEFastCornersKernel & operator= (NEFastCornersKernel &&)=default
 Allow instances of this class to be moved.
 ~NEFastCornersKernel ()=default
 Default destructor.
void configure (const IImage *input, IImage *output, uint8_t threshold, bool non_max_suppression, bool border_undefined)
 Initialise the kernel.
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
BorderSize border_size () const override
 The size of the border for that kernel.
Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations that do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
virtual ~IKernel ()=default
 Destructor.
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
const Window & window () const
 The maximum window the kernel can be executed on.

Detailed Description

Neon kernel to perform fast corners.

Definition at line 38 of file NEFastCornersKernel.h.

Constructor & Destructor Documentation

◆ NEFastCornersKernel() [1/3]


Definition at line 40 of file NEFastCornersKernel.cpp.

Referenced by NEFastCornersKernel::name().

NEFastCornersKernel::NEFastCornersKernel()
    : INEKernel(), _input(nullptr), _output(nullptr), _threshold(0), _non_max_suppression(false)
{
}

◆ NEFastCornersKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEFastCornersKernel() [3/3]

Allow instances of this class to be moved.

◆ ~NEFastCornersKernel()

~NEFastCornersKernel ( )

Default destructor.

Referenced by NEFastCornersKernel::name().

Member Function Documentation

◆ border_size()

BorderSize border_size ( ) const

The size of the border for that kernel.

Returns the width of the border, in number of elements.

Reimplemented from IKernel.

Definition at line 356 of file NEFastCornersKernel.cpp.

Referenced by NEFastCornersKernel::configure(), and NEFastCornersKernel::name().

{
    return BorderSize(3);
}
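
The border of 3 matches the radius of the Bresenham circle that the kernel samples around each pixel. As a minimal sketch (not library code), the 16 circle offsets can be listed explicitly and their largest extent checked. The names `Offset`, `kCircle`, `circle_extent` and `rows_read` are hypothetical, and the numbering (pixel 1 directly above the centre, continuing clockwise) is an assumption chosen to agree with the positions of pixels 1, 5, 9 and 13 read in run().

```cpp
#include <algorithm>
#include <array>
#include <cstdlib>

// Offset of one circle pixel relative to the candidate pixel.
struct Offset
{
    int dx;
    int dy;
};

// The 16 pixels of a Bresenham circle of radius 3, numbered 1..16
// (index 0..15 here), starting above the centre and going clockwise.
const std::array<Offset, 16> kCircle = { {
    { 0, -3 }, { 1, -3 }, { 2, -2 }, { 3, -1 },
    { 3, 0 }, { 3, 1 }, { 2, 2 }, { 1, 3 },
    { 0, 3 }, { -1, 3 }, { -2, 2 }, { -3, 1 },
    { -3, 0 }, { -3, -1 }, { -2, -2 }, { -1, -3 }
} };

// Largest absolute offset along either axis: the border needed on each side.
int circle_extent()
{
    int extent = 0;
    for(const Offset &o : kCircle)
    {
        extent = std::max(extent, std::max(std::abs(o.dx), std::abs(o.dy)));
    }
    return extent;
}

// Number of image rows touched when testing one pixel.
int rows_read()
{
    return 2 * circle_extent() + 1;
}
```

With these offsets the extent is 3 and seven rows are touched, which is consistent with BorderSize(3) above and with the seven row pointers set up in run().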

◆ configure()

void configure ( const IImage * input,
                 IImage * output,
                 uint8_t threshold,
                 bool non_max_suppression,
                 bool border_undefined )

Initialise the kernel.

Parameters
[in]  input                Source image. Data type supported: U8.
[out] output               Output image. Data type supported: U8.
[in]  threshold            Threshold on difference between intensity of the central pixel and pixels on Bresenham's circle of radius 3.
[in]  non_max_suppression  True if non-maxima suppression is applied, false otherwise.
[in]  border_undefined     True if the border mode is undefined. False if it's replicate or constant. Only true is implemented: configure() raises a "Not implemented" error when this is false.
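
To illustrate how the threshold is applied to the circle pixels, here is a scalar sketch of a segment test. It is not the Neon implementation, which uses table lookups over precomputed permutations. The function name `has_contiguous_arc` is hypothetical, and the arc length of 9 is an assumption (the common FAST-9 variant); this page does not state the arc length used by the kernel.

```cpp
#include <array>
#include <cstdint>

// Returns true if at least `arc` contiguous circle pixels are all brighter
// than p0 + t or all darker than p0 - t. `circle` holds the 16 pixels in
// order around the circle. The default arc length of 9 is an assumption.
bool has_contiguous_arc(uint8_t p0, uint8_t t, const std::array<uint8_t, 16> &circle, int arc = 9)
{
    // Integer arithmetic, so no clamping to [0, 255] is needed here.
    const int hi = p0 + t;
    const int lo = p0 - t;

    int run_bright = 0;
    int run_dark   = 0;

    // Visit 16 + arc - 1 positions so that runs wrapping past the last
    // circle pixel are also counted.
    for(int i = 0; i < 16 + arc - 1; ++i)
    {
        const int v = circle[i % 16];
        run_bright  = (v > hi) ? run_bright + 1 : 0;
        run_dark    = (v < lo) ? run_dark + 1 : 0;
        if(run_bright >= arc || run_dark >= arc)
        {
            return true;
        }
    }
    return false;
}
```

A larger threshold makes both conditions harder to satisfy, so fewer pixels are reported as corners.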

Definition at line 361 of file NEFastCornersKernel.cpp.

References ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_MSG, ARM_COMPUTE_ERROR_ON_TENSOR_NOT_2D, NEFastCornersKernel::border_size(), arm_compute::calculate_max_window(), ITensor::info(), arm_compute::test::validation::input, BorderSize::left, non_max_suppression(), num_elems_processed_per_iteration, arm_compute::test::validation::reference::threshold(), BorderSize::top, arm_compute::U8, arm_compute::update_window_and_padding(), and ITensorInfo::valid_region().

Referenced by NEFastCornersKernel::name().

{
    ARM_COMPUTE_ERROR_ON_MSG(border_undefined == false, "Not implemented");

    _input = input;
    _output = output;
    _threshold = threshold;
    _non_max_suppression = non_max_suppression;

    constexpr unsigned int num_elems_processed_per_iteration = 1;
    constexpr unsigned int num_elems_read_per_iteration = 8;
    constexpr unsigned int num_elems_written_per_iteration = 1;
    constexpr unsigned int num_rows_read_per_iteration = 7;

    // Configure kernel window
    Window win = calculate_max_window(*input->info(), Steps(num_elems_processed_per_iteration), border_undefined, border_size());
    AccessWindowHorizontal output_access(output->info(), 0, num_elems_written_per_iteration);
    AccessWindowRectangle input_access(input->info(), -border_size().left, -border_size().top, num_elems_read_per_iteration, num_rows_read_per_iteration);

    update_window_and_padding(win, input_access, output_access);

    output_access.set_valid_region(win, input->info()->valid_region(), border_undefined, border_size());

    INEKernel::configure(win);
}
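
Because the border is undefined, the execution window skips the outer 3 pixels on every side. The sketch below shows the resulting region for a given image size. It is an illustration under that assumption, not the library's calculate_max_window(); `Region` and `shrink_by_border` are hypothetical names.

```cpp
#include <algorithm>

// Rectangle of pixels that can be processed, in image coordinates.
struct Region
{
    int x;
    int y;
    int width;
    int height;
};

// Hypothetical helper: shrink an image of width x height by `border`
// pixels on every side, clamping to an empty region for tiny images.
Region shrink_by_border(int width, int height, int border)
{
    const int w = std::max(0, width - 2 * border);
    const int h = std::max(0, height - 2 * border);
    return Region{ border, border, w, h };
}
```

For a 640x480 image and a border of 3, this gives a 634x474 region starting at (3, 3).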

◆ name()

◆ operator=() [1/2]

NEFastCornersKernel& operator= ( const NEFastCornersKernel & )

Prevent instances of this class from being copied (As this class contains pointers)

Referenced by NEFastCornersKernel::name().

◆ operator=() [2/2]

NEFastCornersKernel& operator= ( NEFastCornersKernel &&  )

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window,
           const ThreadInfo & info )

Execute the kernel on the passed window.

If is_parallelisable() returns false, then the passed window must be equal to window().
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().

Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 391 of file NEFastCornersKernel.cpp.

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::test::validation::b, arm_compute::execute_window_loop(), Iterator::offset(), Iterator::ptr(), ITensor::ptr_to_element(), and IKernel::window().

Referenced by NEFastCornersKernel::name().

{
    std::array<uint8x8x2_t, PERMUTATIONS> perm_index{ {} };
    /*
        We use a LUT loaded with 7 rows of uint8_t from the input image [-3,-3]...[+3,+3] to retrieve the texels in the Bresenham circle radius 3 and put them in one neon register uint8x16_t.
        The three lines below set up the neon index registers to get these texels out from the table
    */
    const uint8x8x4_t circle_index_r = create_circle_index_register();
    /*
        We put the 16 texels (circle) in a LUT to easily generate all the permutations. The for block below sets up the indices for each permutation.
    */
    for(size_t k = 0; k < PERMUTATIONS; ++k)
    {
        perm_index[k] = create_permutation_index(k);
    }

    Iterator in(_input, window);
    Iterator out(_output, window);

    const std::array<uint8_t *const __restrict, 7> in_row
    {
        _input->ptr_to_element(Coordinates(-3, -3)),
        _input->ptr_to_element(Coordinates(-3, -2)),
        _input->ptr_to_element(Coordinates(-3, -1)),
        _input->ptr_to_element(Coordinates(-3, 0)),
        _input->ptr_to_element(Coordinates(-3, 1)),
        _input->ptr_to_element(Coordinates(-3, 2)),
        _input->ptr_to_element(Coordinates(-3, 3))
    };

    auto is_rejected = [](uint8_t p, uint8_t q, uint8_t a, uint8_t b)
    {
        const bool p_is_in_ab = (a <= p) && (p <= b);
        const bool q_is_in_ab = (a <= q) && (q <= b);
        return p_is_in_ab && q_is_in_ab;
    };

    execute_window_loop(window, [&](const Coordinates &)
    {
        const size_t in_offset = in.offset();
        const uint8_t p0 = *in.ptr();
        const uint8_t b = std::min(p0 + _threshold, 255);
        const uint8_t a = std::max(p0 - _threshold, 0);
        uint8_t score = 0;
        /*
            Fast check to discard points which cannot be corners and avoid the expensive computation of the potential 16 permutations

            pixels 1 and 9 are examined, if both I1 and I9 are within [Ip - t, Ip + t], then candidate p is not a corner.
        */
        const uint8_t p1 = (in_offset + in_row[0])[3];
        const uint8_t p9 = (in_offset + in_row[6])[3];

        if(!is_rejected(p1, p9, a, b))
        {
            /* pixels 5 and 13 are further examined to check whether three of them are brighter than Ip + t or darker than Ip - t */
            const uint8_t p5 = (in_offset + in_row[3])[6];
            const uint8_t p13 = (in_offset + in_row[3])[0];

            if(!is_rejected(p5, p13, a, b))
            {
                /* at this stage we use the full test with the 16 permutations to classify the point as corner or not */
                const uint8x8x2_t tbl_circle_texel = create_circle_tbl(in_row, in_offset, circle_index_r);

                if(point_is_fast_corner(p0, _threshold, tbl_circle_texel, perm_index))
                {
                    if(_non_max_suppression)
                    {
                        score = get_point_score(p0, _threshold, tbl_circle_texel, perm_index);
                    }
                    else
                    {
                        score = 1;
                    }
                }
            }
        }

        *out.ptr() = score;
    },
    in, out);
}
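
The two early-rejection checks in run() can be restated as a standalone scalar function. This is a sketch that mirrors the is_rejected lambda and the clamped band [a, b] from the listing above. The names `in_band` and `passes_quick_checks` are hypothetical, and the four circle pixels are passed in directly rather than read from the image rows.

```cpp
#include <algorithm>
#include <cstdint>

// True if v lies inside the "similar to centre" band [a, b].
bool in_band(uint8_t v, uint8_t a, uint8_t b)
{
    return (a <= v) && (v <= b);
}

// Returns true if the candidate survives both quick checks and therefore
// still needs the full circle test. p1/p9 are the pixels directly above and
// below the centre, p5/p13 the pixels directly right and left of it.
bool passes_quick_checks(uint8_t p0, uint8_t t, uint8_t p1, uint8_t p9, uint8_t p5, uint8_t p13)
{
    // Band around the centre intensity, clamped to the uint8_t range.
    const uint8_t b = static_cast<uint8_t>(std::min(p0 + t, 255));
    const uint8_t a = static_cast<uint8_t>(std::max(p0 - t, 0));

    // Both vertical neighbours similar to the centre: cannot be a corner.
    if(in_band(p1, a, b) && in_band(p9, a, b))
    {
        return false;
    }
    // Both horizontal neighbours similar to the centre: cannot be a corner.
    if(in_band(p5, a, b) && in_band(p13, a, b))
    {
        return false;
    }
    return true;
}
```

Only candidates for which this returns true go on to the table-lookup test over the full circle, which keeps the expensive path off most pixels in flat image regions.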

The documentation for this class was generated from the following files: