Compute Library
 22.05
CPPBoxWithNonMaximaSuppressionLimit Class Reference

Basic function to run CPPBoxWithNonMaximaSuppressionLimitKernel. More...

#include <CPPBoxWithNonMaximaSuppressionLimit.h>

Collaboration diagram for CPPBoxWithNonMaximaSuppressionLimit:

Public Member Functions

 CPPBoxWithNonMaximaSuppressionLimit (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
 CPPBoxWithNonMaximaSuppressionLimit (const CPPBoxWithNonMaximaSuppressionLimit &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CPPBoxWithNonMaximaSuppressionLimit & operator= (const CPPBoxWithNonMaximaSuppressionLimit &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
void configure (const ITensor *scores_in, const ITensor *boxes_in, const ITensor *batch_splits_in, ITensor *scores_out, ITensor *boxes_out, ITensor *classes, ITensor *batch_splits_out=nullptr, ITensor *keeps=nullptr, ITensor *keeps_size=nullptr, const BoxNMSLimitInfo info=BoxNMSLimitInfo())
 Configure the BoxWithNonMaximaSuppressionLimit CPP kernel. More...
 
void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *scores_in, const ITensorInfo *boxes_in, const ITensorInfo *batch_splits_in, const ITensorInfo *scores_out, const ITensorInfo *boxes_out, const ITensorInfo *classes, const ITensorInfo *batch_splits_out=nullptr, const ITensorInfo *keeps=nullptr, const ITensorInfo *keeps_size=nullptr, const BoxNMSLimitInfo info=BoxNMSLimitInfo())
 Static function to check if given info will lead to a valid configuration of CPPBoxWithNonMaximaSuppressionLimit. More...
 

Detailed Description

Basic function to run CPPBoxWithNonMaximaSuppressionLimitKernel.

Definition at line 39 of file CPPBoxWithNonMaximaSuppressionLimit.h.
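
A typical call sequence for this function, sketched as a minimal example. The tensor names are placeholders for `arm_compute::Tensor` objects already shaped and allocated according to the parameter documentation below, so this fragment is illustrative rather than compilable on its own.

```cpp
// Sketch only: scores_in is [count, num_classes], boxes_in is
// [count, num_classes * 4]; all tensors are assumed pre-configured.
CPPBoxWithNonMaximaSuppressionLimit box_nms;

box_nms.configure(&scores_in, &boxes_in,
                  nullptr,                   // batch_splits_in: single image
                  &scores_out, &boxes_out, &classes,
                  nullptr, nullptr, nullptr, // optional outputs unused
                  BoxNMSLimitInfo());

box_nms.run(); // executes CPPBoxWithNonMaximaSuppressionLimitKernel
```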

Constructor & Destructor Documentation

◆ CPPBoxWithNonMaximaSuppressionLimit() [1/2]

CPPBoxWithNonMaximaSuppressionLimit ( std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Constructor.

Definition at line 112 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

113  : _memory_group(std::move(memory_manager)),
114  _box_with_nms_limit_kernel(),
115  _scores_in(),
116  _boxes_in(),
117  _batch_splits_in(),
118  _scores_out(),
119  _boxes_out(),
120  _classes(),
121  _batch_splits_out(),
122  _keeps(),
123  _scores_in_f32(),
124  _boxes_in_f32(),
125  _batch_splits_in_f32(),
126  _scores_out_f32(),
127  _boxes_out_f32(),
128  _classes_f32(),
129  _batch_splits_out_f32(),
130  _keeps_f32(),
131  _is_qasymm8(false)
132 {
133 }

◆ CPPBoxWithNonMaximaSuppressionLimit() [2/2]

CPPBoxWithNonMaximaSuppressionLimit ( const CPPBoxWithNonMaximaSuppressionLimit & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

Member Function Documentation

◆ configure()

void configure ( const ITensor * scores_in,
const ITensor * boxes_in,
const ITensor * batch_splits_in,
ITensor * scores_out,
ITensor * boxes_out,
ITensor * classes,
ITensor * batch_splits_out = nullptr,
ITensor * keeps = nullptr,
ITensor * keeps_size = nullptr,
const BoxNMSLimitInfo  info = BoxNMSLimitInfo() 
)

Configure the BoxWithNonMaximaSuppressionLimit CPP kernel.

Parameters
[in]  scores_in        The scores input tensor of size [count, num_classes]. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32
[in]  boxes_in         The boxes input tensor of size [count, num_classes * 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[in]  batch_splits_in  The batch splits input tensor of size [batch_size]. Data types supported: Same as scores_in
Note
Can be a nullptr. If not a nullptr, scores_in and boxes_in have items from multiple images.
Parameters
[out] scores_out       The scores output tensor of size [N]. Data types supported: Same as scores_in
[out] boxes_out        The boxes output tensor of size [N, 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[out] classes          The classes output tensor of size [N]. Data types supported: Same as scores_in
[out] batch_splits_out (Optional) The batch splits output tensor. Data types supported: Same as scores_in
[out] keeps            (Optional) The keeps output tensor of size [N]. Data types supported: Same as scores_in
[in]  keeps_size       (Optional) Number of filtered indices per class tensor of size [num_classes]. Data types supported: U32.
[in]  info             (Optional) BoxNMSLimitInfo information.
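
The boxes tensors use QASYMM16 with a fixed 0.125 scale and 0 offset whenever the scores are quantized. A minimal, self-contained sketch of what that affine mapping means (illustrative only, not the library's own quantize/dequantize helpers):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative only: affine quantization with scale 0.125 and offset 0,
// i.e. q = round(real / scale) and real = scale * q.
constexpr float kScale = 0.125f;

uint16_t quantize_qasymm16(float real)
{
    // Clamp to the representable range of an unsigned 16-bit value.
    const float q = std::round(real / kScale);
    return static_cast<uint16_t>(std::min(std::max(q, 0.0f), 65535.0f));
}

float dequantize_qasymm16(uint16_t q)
{
    return kScale * static_cast<float>(q);
}
```

Because the scale is a power of two and the offset is zero, coordinates that are multiples of 0.125 round-trip exactly; anything finer is snapped to the nearest eighth of a pixel.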

Definition at line 135 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_LOG_PARAMS, ICloneable< T >::clone(), CPPBoxWithNonMaximaSuppressionLimitKernel::configure(), ITensorInfo::data_type(), arm_compute::F32, ITensor::info(), TensorAllocator::init(), MemoryGroup::manage(), arm_compute::QASYMM8, and arm_compute::QASYMM8_SIGNED.

Referenced by NEGenerateProposalsLayer::configure(), and CLGenerateProposalsLayer::configure().

138 {
139  ARM_COMPUTE_ERROR_ON_NULLPTR(scores_in, boxes_in, scores_out, boxes_out, classes);
140  ARM_COMPUTE_LOG_PARAMS(scores_in, boxes_in, batch_splits_in, scores_out, boxes_out, classes, batch_splits_out, keeps, keeps_size, info);
141 
142  _is_qasymm8 = scores_in->info()->data_type() == DataType::QASYMM8 || scores_in->info()->data_type() == DataType::QASYMM8_SIGNED;
143 
144  _scores_in = scores_in;
145  _boxes_in = boxes_in;
146  _batch_splits_in = batch_splits_in;
147  _scores_out = scores_out;
148  _boxes_out = boxes_out;
149  _classes = classes;
150  _batch_splits_out = batch_splits_out;
151  _keeps = keeps;
152 
153  if(_is_qasymm8)
154  {
155  // Manage intermediate buffers
156  _memory_group.manage(&_scores_in_f32);
157  _memory_group.manage(&_boxes_in_f32);
158  _memory_group.manage(&_scores_out_f32);
159  _memory_group.manage(&_boxes_out_f32);
160  _memory_group.manage(&_classes_f32);
161  _scores_in_f32.allocator()->init(scores_in->info()->clone()->set_data_type(DataType::F32));
162  _boxes_in_f32.allocator()->init(boxes_in->info()->clone()->set_data_type(DataType::F32));
163  if(batch_splits_in != nullptr)
164  {
165  _memory_group.manage(&_batch_splits_in_f32);
166  _batch_splits_in_f32.allocator()->init(batch_splits_in->info()->clone()->set_data_type(DataType::F32));
167  }
168  _scores_out_f32.allocator()->init(scores_out->info()->clone()->set_data_type(DataType::F32));
169  _boxes_out_f32.allocator()->init(boxes_out->info()->clone()->set_data_type(DataType::F32));
170  _classes_f32.allocator()->init(classes->info()->clone()->set_data_type(DataType::F32));
171  if(batch_splits_out != nullptr)
172  {
173  _memory_group.manage(&_batch_splits_out_f32);
174  _batch_splits_out_f32.allocator()->init(batch_splits_out->info()->clone()->set_data_type(DataType::F32));
175  }
176  if(keeps != nullptr)
177  {
178  _memory_group.manage(&_keeps_f32);
179  _keeps_f32.allocator()->init(keeps->info()->clone()->set_data_type(DataType::F32));
180  }
181 
182  _box_with_nms_limit_kernel.configure(&_scores_in_f32, &_boxes_in_f32, (batch_splits_in != nullptr) ? &_batch_splits_in_f32 : nullptr,
183  &_scores_out_f32, &_boxes_out_f32, &_classes_f32,
184  (batch_splits_out != nullptr) ? &_batch_splits_out_f32 : nullptr, (keeps != nullptr) ? &_keeps_f32 : nullptr,
185  keeps_size, info);
186  }
187  else
188  {
189  _box_with_nms_limit_kernel.configure(scores_in, boxes_in, batch_splits_in, scores_out, boxes_out, classes, batch_splits_out, keeps, keeps_size, info);
190  }
191 
192  if(_is_qasymm8)
193  {
194  _scores_in_f32.allocator()->allocate();
195  _boxes_in_f32.allocator()->allocate();
196  if(_batch_splits_in != nullptr)
197  {
198  _batch_splits_in_f32.allocator()->allocate();
199  }
200  _scores_out_f32.allocator()->allocate();
201  _boxes_out_f32.allocator()->allocate();
202  _classes_f32.allocator()->allocate();
203  if(batch_splits_out != nullptr)
204  {
205  _batch_splits_out_f32.allocator()->allocate();
206  }
207  if(keeps != nullptr)
208  {
209  _keeps_f32.allocator()->allocate();
210  }
211  }
212 }

◆ operator=()

CPPBoxWithNonMaximaSuppressionLimit & operator= ( const CPPBoxWithNonMaximaSuppressionLimit & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For CPU kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done
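
The thread count mentioned in the note above can be set before calling run(); a minimal sketch, assuming `box_nms` is an already-configured instance of this class:

```cpp
// Limit the CPU scheduler to 4 threads instead of the
// std::thread::hardware_concurrency() default, then execute.
CPPScheduler::get().set_num_threads(4);
box_nms.run();
```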

Implements IFunction.

Definition at line 235 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

References Window::DimY, Scheduler::get(), and IScheduler::schedule().

Referenced by NEGenerateProposalsLayer::run(), and CLGenerateProposalsLayer::run().

236 {
237  // Acquire all the temporaries
238  MemoryGroupResourceScope scope_mg(_memory_group);
239 
240  if(_is_qasymm8)
241  {
242  dequantize_tensor(_scores_in, &_scores_in_f32);
243  dequantize_tensor(_boxes_in, &_boxes_in_f32);
244  if(_batch_splits_in != nullptr)
245  {
246  dequantize_tensor(_batch_splits_in, &_batch_splits_in_f32);
247  }
248  }
249 
250  Scheduler::get().schedule(&_box_with_nms_limit_kernel, Window::DimY);
251 
252  if(_is_qasymm8)
253  {
254  quantize_tensor(&_scores_out_f32, _scores_out);
255  quantize_tensor(&_boxes_out_f32, _boxes_out);
256  quantize_tensor(&_classes_f32, _classes);
257  if(_batch_splits_out != nullptr)
258  {
259  quantize_tensor(&_batch_splits_out_f32, _batch_splits_out);
260  }
261  if(_keeps != nullptr)
262  {
263  quantize_tensor(&_keeps_f32, _keeps);
264  }
265  }
266 }

◆ validate()

static Status validate ( const ITensorInfo * scores_in,
const ITensorInfo * boxes_in,
const ITensorInfo * batch_splits_in,
const ITensorInfo * scores_out,
const ITensorInfo * boxes_out,
const ITensorInfo * classes,
const ITensorInfo * batch_splits_out = nullptr,
const ITensorInfo * keeps = nullptr,
const ITensorInfo * keeps_size = nullptr,
const BoxNMSLimitInfo  info = BoxNMSLimitInfo() 
)
static

Static function to check if given info will lead to a valid configuration of CPPBoxWithNonMaximaSuppressionLimit.

Parameters
[in] scores_in        The scores input tensor of size [count, num_classes]. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32
[in] boxes_in         The boxes input tensor of size [count, num_classes * 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[in] batch_splits_in  The batch splits input tensor of size [batch_size]. Data types supported: Same as scores_in
Note
Can be a nullptr. If not a nullptr, scores_in and boxes_in have items from multiple images.
Parameters
[in] scores_out       The scores output tensor of size [N]. Data types supported: Same as scores_in
[in] boxes_out        The boxes output tensor of size [N, 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[in] classes          The classes output tensor of size [N]. Data types supported: Same as scores_in
[in] batch_splits_out (Optional) The batch splits output tensor. Data types supported: Same as scores_in
[in] keeps            (Optional) The keeps output tensor of size [N]. Data types supported: Same as scores_in
[in] keeps_size       (Optional) Number of filtered indices per class tensor of size [num_classes]. Data types supported: U32.
[in] info             (Optional) BoxNMSLimitInfo information.
Returns
a status
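
The returned Status can be checked before any tensor memory is committed; a hedged sketch, where the `*_info` names are placeholders for the ITensorInfo objects describing the intended tensors:

```cpp
// Sketch only: validate the intended configuration up front.
const Status status = CPPBoxWithNonMaximaSuppressionLimit::validate(
    &scores_info, &boxes_info, nullptr /* batch_splits_in */,
    &scores_out_info, &boxes_out_info, &classes_info);

if(!bool(status))
{
    std::cerr << status.error_description() << std::endl;
}
```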

The documentation for this class was generated from the following files:
CPPBoxWithNonMaximaSuppressionLimit.h
CPPBoxWithNonMaximaSuppressionLimit.cpp