Compute Library
 21.02
NEHOGDetectorKernel Class Reference

Neon kernel that performs HOG detection using a linear SVM.

#include <NEHOGDetectorKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel.
 
 NEHOGDetectorKernel ()
 Default constructor.
 
 NEHOGDetectorKernel (const NEHOGDetectorKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
NEHOGDetectorKernel & operator= (const NEHOGDetectorKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 NEHOGDetectorKernel (NEHOGDetectorKernel &&)=delete
 Prevent instances of this class from being moved (as this class contains non-movable objects).
 
NEHOGDetectorKernel & operator= (NEHOGDetectorKernel &&)=delete
 Prevent instances of this class from being moved (as this class contains non-movable objects).
 
 ~NEHOGDetectorKernel ()=default
 Default destructor.
 
void configure (const ITensor *input, const IHOG *hog, IDetectionWindowArray *detection_windows, const Size2D &detection_window_stride, float threshold=0.0f, uint16_t idx_class=0)
 Initialise the kernel's input, HOG data-object, detection window, the stride of the detection window, the threshold and index of the object to detect.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 

Detailed Description

Neon kernel that performs HOG detection using a linear SVM.

Definition at line 37 of file NEHOGDetectorKernel.h.

Constructor & Destructor Documentation

◆ NEHOGDetectorKernel() [1/3]

Default constructor.

Definition at line 38 of file NEHOGDetectorKernel.cpp.

Referenced by NEHOGDetectorKernel::name().

39  : _input(nullptr), _detection_windows(), _hog_descriptor(nullptr), _bias(0.0f), _threshold(0.0f), _idx_class(0), _num_bins_per_descriptor_x(0), _num_blocks_per_descriptor_y(0), _block_stride_width(0),
40  _block_stride_height(0), _detection_window_width(0), _detection_window_height(0), _max_num_detection_windows(0), _mutex()
41 {
42 }

◆ NEHOGDetectorKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEHOGDetectorKernel() [3/3]

Prevent instances of this class from being moved (As this class contains non movable objects)
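
The deleted copy and move operations can be illustrated with a minimal sketch. `DetectorLike` and its members are hypothetical stand-ins, not library types, and `std::mutex` is only assumed to play the role of the kernel's `_mutex` member. A class that holds a non-owning pointer and a mutex is typically made neither copyable nor movable:

```cpp
#include <mutex>
#include <type_traits>

// Hypothetical stand-in for a kernel that holds a non-owning pointer and a mutex.
class DetectorLike
{
public:
    DetectorLike() = default;
    DetectorLike(const DetectorLike &) = delete;            // no copy construction
    DetectorLike &operator=(const DetectorLike &) = delete; // no copy assignment
    DetectorLike(DetectorLike &&) = delete;                 // no move construction
    DetectorLike &operator=(DetectorLike &&) = delete;      // no move assignment
    ~DetectorLike() = default;

    // Store a non-owning pointer under the lock.
    void set_weights(const float *weights)
    {
        std::lock_guard<std::mutex> guard(_mutex);
        _weights = weights;
    }

    bool is_configured() const { return _weights != nullptr; }

private:
    const float *_weights{ nullptr }; // non-owning pointer
    std::mutex   _mutex{};            // std::mutex is neither copyable nor movable
};

// Compile-time confirmation that all four operations are unavailable.
static_assert(!std::is_copy_constructible<DetectorLike>::value, "copy construction is disabled");
static_assert(!std::is_copy_assignable<DetectorLike>::value, "copy assignment is disabled");
static_assert(!std::is_move_constructible<DetectorLike>::value, "move construction is disabled");
static_assert(!std::is_move_assignable<DetectorLike>::value, "move assignment is disabled");
```

Such an object has to be constructed in place and passed by reference or pointer.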

◆ ~NEHOGDetectorKernel()

~NEHOGDetectorKernel ( )
default

Default destructor.

Referenced by NEHOGDetectorKernel::name().

Member Function Documentation

◆ configure()

void configure ( const ITensor * input,
const IHOG * hog,
IDetectionWindowArray * detection_windows,
const Size2D & detection_window_stride,
float  threshold = 0.0f,
uint16_t  idx_class = 0 
)

Initialise the kernel's input, HOG data-object, detection window, the stride of the detection window, the threshold and index of the object to detect.

Parameters
[in]   input                    Input tensor which stores the HOG descriptor obtained with NEHOGOrientationBinningKernel. Data type supported: F32. Number of channels supported: equal to the number of histogram bins per block
[in]   hog                      HOG data object used by NEHOGOrientationBinningKernel and NEHOGBlockNormalizationKernel
[out]  detection_windows        Array of DetectionWindow. This array stores all the detected objects
[in]   detection_window_stride  Distance in pixels between 2 consecutive detection windows in x and y directions. It must be a multiple of hog->info()->block_stride()
[in]   threshold                (Optional) Threshold for the distance between features and SVM classifying plane
[in]   idx_class                (Optional) Index of the class used for evaluating which class the detection window belongs to
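
configure() asserts a relationship between the channel count of `input`, the HOG geometry and the descriptor length. The following self-contained sketch shows that arithmetic. The `Size` and `HogGeometry` structs and the three helper functions are hypothetical stand-ins, not the library's `Size2D` or `HOGInfo`:

```cpp
#include <cstddef>

// Hypothetical plain-struct stand-ins; not the library's Size2D / HOGInfo types.
struct Size
{
    std::size_t width;
    std::size_t height;
};

struct HogGeometry
{
    Size        detection_window; // detection window size in pixels
    Size        block_size;       // block size in pixels
    Size        block_stride;     // block stride in pixels
    std::size_t bins_per_block;   // equals the number of channels of the input tensor
};

// Number of float values in one row of blocks inside a detection window.
inline std::size_t bins_per_descriptor_row(const HogGeometry &g)
{
    return ((g.detection_window.width - g.block_size.width) / g.block_stride.width + 1) * g.bins_per_block;
}

// Number of block rows inside a detection window.
inline std::size_t block_rows_per_descriptor(const HogGeometry &g)
{
    return (g.detection_window.height - g.block_size.height) / g.block_stride.height + 1;
}

// Descriptor length expected by the size check: all SVM weights plus one trailing bias term.
inline std::size_t expected_descriptor_size(const HogGeometry &g)
{
    return bins_per_descriptor_row(g) * block_rows_per_descriptor(g) + 1;
}
```

For a 64x128 window with 16x16 blocks, an 8x8 block stride and 36 bins per block, this gives 7 * 36 = 252 values per row, 15 rows, and 252 * 15 + 1 = 3781 descriptor entries. The last entry is the bias.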

Definition at line 44 of file NEHOGDetectorKernel.cpp.

References ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_NOT_IN, HOGInfo::block_size(), HOGInfo::block_stride(), IHOG::descriptor(), HOGInfo::descriptor_size(), HOGInfo::detection_window_size(), Window::DimX, Window::DimY, arm_compute::F32, arm_compute::floor_to_multiple(), Size2D::height, IHOG::info(), ITensor::info(), arm_compute::test::validation::input, IArray< T >::max_num_values(), ITensorInfo::num_channels(), Window::set(), ValidRegion::shape, arm_compute::test::validation::reference::threshold(), arm_compute::update_window_and_padding(), arm_compute::test::validation::valid_region, ITensorInfo::valid_region(), and Size2D::width.

Referenced by NEHOGDetectorKernel::name().

45 {
47  ARM_COMPUTE_ERROR_ON(hog == nullptr);
48  ARM_COMPUTE_ERROR_ON(detection_windows == nullptr);
49  ARM_COMPUTE_ERROR_ON((detection_window_stride.width % hog->info()->block_stride().width) != 0);
50  ARM_COMPUTE_ERROR_ON((detection_window_stride.height % hog->info()->block_stride().height) != 0);
51 
52  const Size2D &detection_window_size = hog->info()->detection_window_size();
53  const Size2D &block_size = hog->info()->block_size();
54  const Size2D &block_stride = hog->info()->block_stride();
55 
56  _input = input;
57  _detection_windows = detection_windows;
58  _threshold = threshold;
59  _idx_class = idx_class;
60  _hog_descriptor = hog->descriptor();
61  _bias = _hog_descriptor[hog->info()->descriptor_size() - 1];
62  _num_bins_per_descriptor_x = ((detection_window_size.width - block_size.width) / block_stride.width + 1) * input->info()->num_channels();
63  _num_blocks_per_descriptor_y = (detection_window_size.height - block_size.height) / block_stride.height + 1;
64  _block_stride_width = block_stride.width;
65  _block_stride_height = block_stride.height;
66  _detection_window_width = detection_window_size.width;
67  _detection_window_height = detection_window_size.height;
68  _max_num_detection_windows = detection_windows->max_num_values();
69 
70  ARM_COMPUTE_ERROR_ON((_num_bins_per_descriptor_x * _num_blocks_per_descriptor_y + 1) != hog->info()->descriptor_size());
71 
72  // Get the number of blocks along the x and y directions of the input tensor
73  const ValidRegion &valid_region = input->info()->valid_region();
74  const size_t num_blocks_x = valid_region.shape[0];
75  const size_t num_blocks_y = valid_region.shape[1];
76 
77  // Get the number of blocks along the x and y directions of the detection window
78  const size_t num_blocks_per_detection_window_x = detection_window_size.width / block_stride.width;
79  const size_t num_blocks_per_detection_window_y = detection_window_size.height / block_stride.height;
80 
81  const size_t window_step_x = detection_window_stride.width / block_stride.width;
82  const size_t window_step_y = detection_window_stride.height / block_stride.height;
83 
84  // Configure kernel window
85  Window win;
86  win.set(Window::DimX, Window::Dimension(0, floor_to_multiple(num_blocks_x - num_blocks_per_detection_window_x, window_step_x) + window_step_x, window_step_x));
87  win.set(Window::DimY, Window::Dimension(0, floor_to_multiple(num_blocks_y - num_blocks_per_detection_window_y, window_step_y) + window_step_y, window_step_y));
88 
89  constexpr unsigned int num_elems_read_per_iteration = 1;
90  const unsigned int num_rows_read_per_iteration = _num_blocks_per_descriptor_y;
91 
92  update_window_and_padding(win, AccessWindowRectangle(input->info(), 0, 0, num_elems_read_per_iteration, num_rows_read_per_iteration));
93 
94  INEKernel::configure(win);
95 }
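
The kernel window set up at the end of the listing above iterates over detection-window origins in units of blocks. The sketch below reproduces that computation for one dimension with plain integers. `floor_multiple`, `Range` and `detection_range` are hypothetical names used only for illustration. The added assertions are assumptions about sensible inputs, not checks taken from the library:

```cpp
#include <cassert>
#include <cstddef>

// Largest multiple of `divisor` that is smaller than or equal to `value`
// (local re-implementation of the behaviour documented for floor_to_multiple).
inline std::size_t floor_multiple(std::size_t value, std::size_t divisor)
{
    assert(divisor != 0);
    return (value / divisor) * divisor;
}

// Hypothetical stand-in for one window dimension: [start, end) visited with `step`.
struct Range
{
    std::size_t start;
    std::size_t end;
    std::size_t step;
};

// Range of detection-window origins along one axis, expressed in blocks.
inline Range detection_range(std::size_t num_blocks,       // valid blocks along this axis of the input
                             std::size_t window_px,        // detection window extent in pixels
                             std::size_t block_stride_px,  // block stride in pixels
                             std::size_t window_stride_px) // detection window stride in pixels
{
    assert(block_stride_px != 0 && window_stride_px != 0);
    assert(window_stride_px % block_stride_px == 0); // stride must be a multiple of the block stride

    const std::size_t blocks_per_window = window_px / block_stride_px;
    const std::size_t step              = window_stride_px / block_stride_px;

    assert(num_blocks >= blocks_per_window); // otherwise the unsigned subtraction would wrap

    return Range{ 0, floor_multiple(num_blocks - blocks_per_window, step) + step, step };
}
```

With 79 blocks along an axis, a 64-pixel window, an 8-pixel block stride and a 16-pixel window stride, the window spans 8 blocks and the step is 2 blocks. The range is then [0, 72) with step 2, which gives 36 candidate positions.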

◆ name()

◆ operator=() [1/2]

NEHOGDetectorKernel& operator= ( const NEHOGDetectorKernel & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

Referenced by NEHOGDetectorKernel::name().

◆ operator=() [2/2]

NEHOGDetectorKernel& operator= ( NEHOGDetectorKernel &&  )
delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ run()

void run ( const Window & window,
const ThreadInfo & info 
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 97 of file NEHOGDetectorKernel.cpp.

References ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::test::validation::b, arm_compute::data_size_from_type(), ITensorInfo::data_type(), Window::DimY, arm_compute::execute_window_loop(), DetectionWindow::height, DetectionWindow::idx_class, ITensor::info(), IArray< T >::num_values(), IArray< T >::push_back(), DetectionWindow::score, ITensorInfo::strides_in_bytes(), DetectionWindow::width, IKernel::window(), DetectionWindow::x, and DetectionWindow::y.

Referenced by NEHOGDetectorKernel::name().

98 {
99  ARM_COMPUTE_UNUSED(info);
102  ARM_COMPUTE_ERROR_ON(_hog_descriptor == nullptr);
103 
104  const size_t in_step_y = _input->info()->strides_in_bytes()[Window::DimY] / data_size_from_type(_input->info()->data_type());
105 
106  Iterator in(_input, window);
107 
108  execute_window_loop(window, [&](const Coordinates & id)
109  {
110  const auto *in_row_ptr = reinterpret_cast<const float *>(in.ptr());
111 
112  // Init score_f32 with 0
113  float32x4_t score_f32 = vdupq_n_f32(0.0f);
114 
115  // Init score with bias
116  float score = _bias;
117 
118  // Compute Linear SVM
119  for(size_t yb = 0; yb < _num_blocks_per_descriptor_y; ++yb, in_row_ptr += in_step_y)
120  {
121  int32_t xb = 0;
122 
123  const int32_t offset_y = yb * _num_bins_per_descriptor_x;
124 
125  for(; xb < static_cast<int32_t>(_num_bins_per_descriptor_x) - 16; xb += 16)
126  {
127  // Load descriptor values
128  const float32x4x4_t a_f32 =
129  {
130  {
131  vld1q_f32(&in_row_ptr[xb + 0]),
132  vld1q_f32(&in_row_ptr[xb + 4]),
133  vld1q_f32(&in_row_ptr[xb + 8]),
134  vld1q_f32(&in_row_ptr[xb + 12])
135  }
136  };
137 
138  // Load detector values
139  const float32x4x4_t b_f32 =
140  {
141  {
142  vld1q_f32(&_hog_descriptor[xb + 0 + offset_y]),
143  vld1q_f32(&_hog_descriptor[xb + 4 + offset_y]),
144  vld1q_f32(&_hog_descriptor[xb + 8 + offset_y]),
145  vld1q_f32(&_hog_descriptor[xb + 12 + offset_y])
146  }
147  };
148 
149  // Multiply accumulate
150  score_f32 = vmlaq_f32(score_f32, a_f32.val[0], b_f32.val[0]);
151  score_f32 = vmlaq_f32(score_f32, a_f32.val[1], b_f32.val[1]);
152  score_f32 = vmlaq_f32(score_f32, a_f32.val[2], b_f32.val[2]);
153  score_f32 = vmlaq_f32(score_f32, a_f32.val[3], b_f32.val[3]);
154  }
155 
156  for(; xb < static_cast<int32_t>(_num_bins_per_descriptor_x); ++xb)
157  {
158  const float a = in_row_ptr[xb];
159  const float b = _hog_descriptor[xb + offset_y];
160 
161  score += a * b;
162  }
163  }
164 
165  score += vgetq_lane_f32(score_f32, 0);
166  score += vgetq_lane_f32(score_f32, 1);
167  score += vgetq_lane_f32(score_f32, 2);
168  score += vgetq_lane_f32(score_f32, 3);
169 
170  if(score > _threshold)
171  {
172  if(_detection_windows->num_values() < _max_num_detection_windows)
173  {
174  DetectionWindow win;
175  win.x = (id.x() * _block_stride_width);
176  win.y = (id.y() * _block_stride_height);
177  win.width = _detection_window_width;
178  win.height = _detection_window_height;
179  win.idx_class = _idx_class;
180  win.score = score;
181 
183  _detection_windows->push_back(win);
184  lock.unlock();
185  }
186  }
187  },
188  in);
189 }
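
The scoring loop in the listing above can be read as a dot product between the descriptor under the detection window and the SVM weights, plus a bias. The following portable sketch mirrors its structure without Neon intrinsics: a four-element array stands in for the `float32x4_t` accumulator, and the strict `xb + 16 < bins_per_row` bound matches the listing, so a final chunk of exactly 16 values is handled by the scalar tail. `svm_score`, `Detection` and `make_detection` are hypothetical names, and the unsigned loop index is a simplification of the listing's `int32_t`:

```cpp
#include <cstddef>
#include <cstdint>

// Scalar reference of the linear SVM score for one detection window.
inline float svm_score(const float *in,           // first value of the window's top block row
                       std::size_t  in_step_y,    // distance between block rows, in floats
                       const float *weights,      // SVM weights, row-major, without the bias
                       std::size_t  bins_per_row, // values per block row of the descriptor
                       std::size_t  rows,         // block rows per descriptor
                       float        bias)
{
    float lanes[4] = { 0.0f, 0.0f, 0.0f, 0.0f }; // emulates one 4-lane vector accumulator
    float score    = bias;

    for(std::size_t yb = 0; yb < rows; ++yb, in += in_step_y)
    {
        const float *w  = weights + yb * bins_per_row;
        std::size_t  xb = 0;

        // Chunks of 16 values, accumulated lane-wise like four 4-wide multiply-accumulates.
        for(; xb + 16 < bins_per_row; xb += 16)
        {
            for(std::size_t k = 0; k < 16; ++k)
            {
                lanes[k % 4] += in[xb + k] * w[xb + k];
            }
        }

        // Scalar tail for the remaining values.
        for(; xb < bins_per_row; ++xb)
        {
            score += in[xb] * w[xb];
        }
    }

    // Horizontal sum of the four lanes.
    score += lanes[0];
    score += lanes[1];
    score += lanes[2];
    score += lanes[3];
    return score;
}

// Hypothetical stand-in for the detection record filled in when score > threshold.
struct Detection
{
    std::uint16_t x, y, width, height, idx_class;
    float         score;
};

// Converts a window origin given in blocks into pixel coordinates.
inline Detection make_detection(std::size_t id_x, std::size_t id_y,
                                std::size_t block_stride_w, std::size_t block_stride_h,
                                std::size_t window_w, std::size_t window_h,
                                std::uint16_t idx_class, float score)
{
    return Detection{ static_cast<std::uint16_t>(id_x * block_stride_w),
                      static_cast<std::uint16_t>(id_y * block_stride_h),
                      static_cast<std::uint16_t>(window_w),
                      static_cast<std::uint16_t>(window_h),
                      idx_class, score };
}
```

A window is reported only when the returned score exceeds the configured threshold and the output array still has room. In the listing the append is followed by `lock.unlock()`, which suggests it is guarded by the kernel's mutex.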

The documentation for this class was generated from the following files: