Compute Library
 21.02
CPPBoxWithNonMaximaSuppressionLimit Class Reference

Basic function to run CPPBoxWithNonMaximaSuppressionLimitKernel. More...

#include <CPPBoxWithNonMaximaSuppressionLimit.h>


Public Member Functions

 CPPBoxWithNonMaximaSuppressionLimit (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
 CPPBoxWithNonMaximaSuppressionLimit (const CPPBoxWithNonMaximaSuppressionLimit &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CPPBoxWithNonMaximaSuppressionLimit & operator= (const CPPBoxWithNonMaximaSuppressionLimit &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
void configure (const ITensor *scores_in, const ITensor *boxes_in, const ITensor *batch_splits_in, ITensor *scores_out, ITensor *boxes_out, ITensor *classes, ITensor *batch_splits_out=nullptr, ITensor *keeps=nullptr, ITensor *keeps_size=nullptr, const BoxNMSLimitInfo info=BoxNMSLimitInfo())
 Configure the BoxWithNonMaximaSuppressionLimit CPP kernel. More...
 
void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *scores_in, const ITensorInfo *boxes_in, const ITensorInfo *batch_splits_in, const ITensorInfo *scores_out, const ITensorInfo *boxes_out, const ITensorInfo *classes, const ITensorInfo *batch_splits_out=nullptr, const ITensorInfo *keeps=nullptr, const ITensorInfo *keeps_size=nullptr, const BoxNMSLimitInfo info=BoxNMSLimitInfo())
 Static function to check if given info will lead to a valid configuration of CPPBoxWithNonMaximaSuppressionLimit. More...
 

Detailed Description

Basic function to run CPPBoxWithNonMaximaSuppressionLimitKernel.

Definition at line 39 of file CPPBoxWithNonMaximaSuppressionLimit.h.

Constructor & Destructor Documentation

◆ CPPBoxWithNonMaximaSuppressionLimit() [1/2]

CPPBoxWithNonMaximaSuppressionLimit ( std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Constructor.

Definition at line 110 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

    : _memory_group(std::move(memory_manager)),
      _box_with_nms_limit_kernel(),
      _scores_in(),
      _boxes_in(),
      _batch_splits_in(),
      _scores_out(),
      _boxes_out(),
      _classes(),
      _batch_splits_out(),
      _keeps(),
      _scores_in_f32(),
      _boxes_in_f32(),
      _batch_splits_in_f32(),
      _scores_out_f32(),
      _boxes_out_f32(),
      _classes_f32(),
      _batch_splits_out_f32(),
      _keeps_f32(),
      _is_qasymm8(false)
{
}

◆ CPPBoxWithNonMaximaSuppressionLimit() [2/2]

Prevent instances of this class from being copied (As this class contains pointers)

Member Function Documentation

◆ configure()

void configure ( const ITensor * scores_in,
const ITensor * boxes_in,
const ITensor * batch_splits_in,
ITensor * scores_out,
ITensor * boxes_out,
ITensor * classes,
ITensor * batch_splits_out = nullptr,
ITensor * keeps = nullptr,
ITensor * keeps_size = nullptr,
const BoxNMSLimitInfo  info = BoxNMSLimitInfo() 
)

Configure the BoxWithNonMaximaSuppressionLimit CPP kernel.

Parameters
[in]  scores_in        The scores input tensor of size [count, num_classes]. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32
[in]  boxes_in         The boxes input tensor of size [count, num_classes * 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[in]  batch_splits_in  The batch splits input tensor of size [batch_size]. Data types supported: Same as scores_in
Note
Can be a nullptr. If not a nullptr, scores_in and boxes_in have items from multiple images.
Parameters
[out] scores_out       The scores output tensor of size [N]. Data types supported: Same as scores_in
[out] boxes_out        The boxes output tensor of size [N, 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[out] classes          The classes output tensor of size [N]. Data types supported: Same as scores_in
[out] batch_splits_out (Optional) The batch splits output tensor. Data types supported: Same as scores_in
[out] keeps            (Optional) The keeps output tensor of size [N]. Data types supported: Same as scores_in
[in]  keeps_size       (Optional) Number of filtered indices per class tensor of size [num_classes]. Data types supported: U32.
[in]  info             (Optional) BoxNMSLimitInfo information.

Definition at line 133 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ICloneable< T >::clone(), CPPBoxWithNonMaximaSuppressionLimitKernel::configure(), ITensorInfo::data_type(), arm_compute::F32, ITensor::info(), TensorAllocator::init(), MemoryGroup::manage(), arm_compute::QASYMM8, and arm_compute::QASYMM8_SIGNED.

Referenced by NEGenerateProposalsLayer::configure(), and CLGenerateProposalsLayer::configure().

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(scores_in, boxes_in, scores_out, boxes_out, classes);

    _is_qasymm8 = scores_in->info()->data_type() == DataType::QASYMM8 || scores_in->info()->data_type() == DataType::QASYMM8_SIGNED;

    _scores_in        = scores_in;
    _boxes_in         = boxes_in;
    _batch_splits_in  = batch_splits_in;
    _scores_out       = scores_out;
    _boxes_out        = boxes_out;
    _classes          = classes;
    _batch_splits_out = batch_splits_out;
    _keeps            = keeps;

    if(_is_qasymm8)
    {
        // Manage intermediate buffers
        _memory_group.manage(&_scores_in_f32);
        _memory_group.manage(&_boxes_in_f32);
        _memory_group.manage(&_scores_out_f32);
        _memory_group.manage(&_boxes_out_f32);
        _memory_group.manage(&_classes_f32);
        _scores_in_f32.allocator()->init(scores_in->info()->clone()->set_data_type(DataType::F32));
        _boxes_in_f32.allocator()->init(boxes_in->info()->clone()->set_data_type(DataType::F32));
        if(batch_splits_in != nullptr)
        {
            _memory_group.manage(&_batch_splits_in_f32);
            _batch_splits_in_f32.allocator()->init(batch_splits_in->info()->clone()->set_data_type(DataType::F32));
        }
        _scores_out_f32.allocator()->init(scores_out->info()->clone()->set_data_type(DataType::F32));
        _boxes_out_f32.allocator()->init(boxes_out->info()->clone()->set_data_type(DataType::F32));
        _classes_f32.allocator()->init(classes->info()->clone()->set_data_type(DataType::F32));
        if(batch_splits_out != nullptr)
        {
            _memory_group.manage(&_batch_splits_out_f32);
            _batch_splits_out_f32.allocator()->init(batch_splits_out->info()->clone()->set_data_type(DataType::F32));
        }
        if(keeps != nullptr)
        {
            _memory_group.manage(&_keeps_f32);
            _keeps_f32.allocator()->init(keeps->info()->clone()->set_data_type(DataType::F32));
        }

        _box_with_nms_limit_kernel.configure(&_scores_in_f32, &_boxes_in_f32, (batch_splits_in != nullptr) ? &_batch_splits_in_f32 : nullptr,
                                             &_scores_out_f32, &_boxes_out_f32, &_classes_f32,
                                             (batch_splits_out != nullptr) ? &_batch_splits_out_f32 : nullptr, (keeps != nullptr) ? &_keeps_f32 : nullptr,
                                             keeps_size, info);
    }
    else
    {
        _box_with_nms_limit_kernel.configure(scores_in, boxes_in, batch_splits_in, scores_out, boxes_out, classes, batch_splits_out, keeps, keeps_size, info);
    }

    if(_is_qasymm8)
    {
        _scores_in_f32.allocator()->allocate();
        _boxes_in_f32.allocator()->allocate();
        if(_batch_splits_in != nullptr)
        {
            _batch_splits_in_f32.allocator()->allocate();
        }
        _scores_out_f32.allocator()->allocate();
        _boxes_out_f32.allocator()->allocate();
        _classes_f32.allocator()->allocate();
        if(batch_splits_out != nullptr)
        {
            _batch_splits_out_f32.allocator()->allocate();
        }
        if(keeps != nullptr)
        {
            _keeps_f32.allocator()->allocate();
        }
    }
}
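A minimal usage sketch of the configure()/run() flow (shapes, the class/box counts, and the dimension ordering are illustrative only; error handling and tensor filling are omitted):

```cpp
#include "arm_compute/runtime/CPP/functions/CPPBoxWithNonMaximaSuppressionLimit.h"
#include "arm_compute/runtime/Tensor.h"

#include <initializer_list>

using namespace arm_compute;

int main()
{
    // Illustrative setup: 100 candidate boxes, 5 classes, F32 data,
    // outputs sized for the worst case where every box survives.
    Tensor scores_in, boxes_in, scores_out, boxes_out, classes;
    scores_in.allocator()->init(TensorInfo(TensorShape(5U, 100U), 1, DataType::F32));
    boxes_in.allocator()->init(TensorInfo(TensorShape(20U, 100U), 1, DataType::F32));
    scores_out.allocator()->init(TensorInfo(TensorShape(100U), 1, DataType::F32));
    boxes_out.allocator()->init(TensorInfo(TensorShape(4U, 100U), 1, DataType::F32));
    classes.allocator()->init(TensorInfo(TensorShape(100U), 1, DataType::F32));

    // Single-image case: batch_splits_in and the optional outputs stay nullptr.
    CPPBoxWithNonMaximaSuppressionLimit box_nms;
    box_nms.configure(&scores_in, &boxes_in, nullptr,
                      &scores_out, &boxes_out, &classes);

    for(Tensor *t : { &scores_in, &boxes_in, &scores_out, &boxes_out, &classes })
    {
        t->allocator()->allocate();
    }

    // ... fill scores_in / boxes_in with detection candidates ...
    box_nms.run();
    return 0;
}
```

Note that configure() is called before the tensors are allocated, matching the usual Compute Library pattern of shape/type negotiation first, allocation second.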

◆ operator=()

Prevent instances of this class from being copied (As this class contains pointers)

◆ run()

void run ( )
override virtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads
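For example, the worker-thread count can be pinned before calling run() (a sketch; the thread count is arbitrary):

```cpp
#include "arm_compute/runtime/CPP/CPPScheduler.h"

// Override the std::thread::hardware_concurrency() default noted above.
arm_compute::CPPScheduler::get().set_num_threads(2);
```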

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on first run if it hasn't been done

Implements IFunction.

Definition at line 231 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

References Window::DimY, Scheduler::get(), and IScheduler::schedule().

Referenced by NEGenerateProposalsLayer::run(), and CLGenerateProposalsLayer::run().

{
    // Acquire all the temporaries
    MemoryGroupResourceScope scope_mg(_memory_group);

    if(_is_qasymm8)
    {
        dequantize_tensor(_scores_in, &_scores_in_f32);
        dequantize_tensor(_boxes_in, &_boxes_in_f32);
        if(_batch_splits_in != nullptr)
        {
            dequantize_tensor(_batch_splits_in, &_batch_splits_in_f32);
        }
    }

    Scheduler::get().schedule(&_box_with_nms_limit_kernel, Window::DimY);

    if(_is_qasymm8)
    {
        quantize_tensor(&_scores_out_f32, _scores_out);
        quantize_tensor(&_boxes_out_f32, _boxes_out);
        quantize_tensor(&_classes_f32, _classes);
        if(_batch_splits_out != nullptr)
        {
            quantize_tensor(&_batch_splits_out_f32, _batch_splits_out);
        }
        if(_keeps != nullptr)
        {
            quantize_tensor(&_keeps_f32, _keeps);
        }
    }
}

◆ validate()

static Status validate ( const ITensorInfo * scores_in,
const ITensorInfo * boxes_in,
const ITensorInfo * batch_splits_in,
const ITensorInfo * scores_out,
const ITensorInfo * boxes_out,
const ITensorInfo * classes,
const ITensorInfo * batch_splits_out = nullptr,
const ITensorInfo * keeps = nullptr,
const ITensorInfo * keeps_size = nullptr,
const BoxNMSLimitInfo  info = BoxNMSLimitInfo() 
)
static

Static function to check if given info will lead to a valid configuration of CPPBoxWithNonMaximaSuppressionLimit.

Parameters
[in] scores_in        The scores input tensor of size [count, num_classes]. Data types supported: QASYMM8/QASYMM8_SIGNED/F16/F32
[in] boxes_in         The boxes input tensor of size [count, num_classes * 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[in] batch_splits_in  The batch splits input tensor of size [batch_size]. Data types supported: Same as scores_in
Note
Can be a nullptr. If not a nullptr, scores_in and boxes_in have items from multiple images.
Parameters
[in] scores_out       The scores output tensor of size [N]. Data types supported: Same as scores_in
[in] boxes_out        The boxes output tensor of size [N, 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8/QASYMM8_SIGNED, otherwise same as scores_in
[in] classes          The classes output tensor of size [N]. Data types supported: Same as scores_in
[in] batch_splits_out (Optional) The batch splits output tensor. Data types supported: Same as scores_in
[in] keeps            (Optional) The keeps output tensor of size [N]. Data types supported: Same as scores_in
[in] keeps_size       (Optional) Number of filtered indices per class tensor of size [num_classes]. Data types supported: U32.
[in] info             (Optional) BoxNMSLimitInfo information.
Returns
a status
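A sketch of the usual validate-before-configure pattern (shapes are illustrative; the returned Status carries an error description when the configuration is rejected):

```cpp
#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/runtime/CPP/functions/CPPBoxWithNonMaximaSuppressionLimit.h"

using namespace arm_compute;

bool can_configure()
{
    // Illustrative shapes: 100 candidate boxes, 5 classes, F32 data.
    const TensorInfo scores_in(TensorShape(5U, 100U), 1, DataType::F32);
    const TensorInfo boxes_in(TensorShape(20U, 100U), 1, DataType::F32);
    const TensorInfo scores_out(TensorShape(100U), 1, DataType::F32);
    const TensorInfo boxes_out(TensorShape(4U, 100U), 1, DataType::F32);
    const TensorInfo classes(TensorShape(100U), 1, DataType::F32);

    const Status status = CPPBoxWithNonMaximaSuppressionLimit::validate(
        &scores_in, &boxes_in, nullptr, &scores_out, &boxes_out, &classes);
    return status.error_code() == ErrorCode::OK;
}
```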

The documentation for this class was generated from the following files:

CPPBoxWithNonMaximaSuppressionLimit.h
CPPBoxWithNonMaximaSuppressionLimit.cpp