Compute Library
 19.11
CPPBoxWithNonMaximaSuppressionLimit Class Reference

Basic function to run CPPBoxWithNonMaximaSuppressionLimitKernel. More...

#include <CPPBoxWithNonMaximaSuppressionLimit.h>

Collaboration diagram for CPPBoxWithNonMaximaSuppressionLimit:

Public Member Functions

 CPPBoxWithNonMaximaSuppressionLimit (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
 CPPBoxWithNonMaximaSuppressionLimit (const CPPBoxWithNonMaximaSuppressionLimit &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CPPBoxWithNonMaximaSuppressionLimit & operator= (const CPPBoxWithNonMaximaSuppressionLimit &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
void configure (const ITensor *scores_in, const ITensor *boxes_in, const ITensor *batch_splits_in, ITensor *scores_out, ITensor *boxes_out, ITensor *classes, ITensor *batch_splits_out=nullptr, ITensor *keeps=nullptr, ITensor *keeps_size=nullptr, const BoxNMSLimitInfo info=BoxNMSLimitInfo())
 Configure the BoxWithNonMaximaSuppressionLimit CPP kernel. More...
 
void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *scores_in, const ITensorInfo *boxes_in, const ITensorInfo *batch_splits_in, const ITensorInfo *scores_out, const ITensorInfo *boxes_out, const ITensorInfo *classes, const ITensorInfo *batch_splits_out=nullptr, const ITensorInfo *keeps=nullptr, const ITensorInfo *keeps_size=nullptr, const BoxNMSLimitInfo info=BoxNMSLimitInfo())
 Static function to check if given info will lead to a valid configuration of CPPBoxWithNonMaximaSuppressionLimit. More...
 

Detailed Description

Basic function to run CPPBoxWithNonMaximaSuppressionLimitKernel.

Definition at line 39 of file CPPBoxWithNonMaximaSuppressionLimit.h.
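
To make the behaviour concrete, here is a minimal, library-independent C++ sketch of the operation the underlying kernel performs for one class: score filtering, greedy non-maxima suppression, then a cap on the number of kept detections. All names here (`Box`, `nms_with_limit`, the thresholds) are illustrative and are not part of the Compute Library API.

```cpp
#include <algorithm>
#include <vector>

struct Box { float x1, y1, x2, y2; };

// Intersection-over-union of two axis-aligned boxes.
static float iou(const Box &a, const Box &b)
{
    const float ix1 = std::max(a.x1, b.x1), iy1 = std::max(a.y1, b.y1);
    const float ix2 = std::min(a.x2, b.x2), iy2 = std::min(a.y2, b.y2);
    const float iw  = std::max(0.f, ix2 - ix1), ih = std::max(0.f, iy2 - iy1);
    const float inter  = iw * ih;
    const float area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (area_a + area_b - inter);
}

// Greedy NMS over one class, then keep at most `detections_limit`
// highest-scoring survivors. Returns the kept indices.
static std::vector<int> nms_with_limit(const std::vector<Box> &boxes,
                                       const std::vector<float> &scores,
                                       float score_thresh, float nms_thresh,
                                       int detections_limit)
{
    // Candidates above the score threshold, sorted by descending score.
    std::vector<int> order;
    for(int i = 0; i < static_cast<int>(boxes.size()); ++i)
    {
        if(scores[i] > score_thresh) order.push_back(i);
    }
    std::sort(order.begin(), order.end(), [&](int a, int b) { return scores[a] > scores[b]; });

    // Greedy suppression: drop any box overlapping a kept box too much.
    std::vector<int> keep;
    for(int idx : order)
    {
        bool suppressed = false;
        for(int k : keep)
        {
            if(iou(boxes[idx], boxes[k]) > nms_thresh) { suppressed = true; break; }
        }
        if(!suppressed) keep.push_back(idx);
    }

    // The "limit" step: cap the number of detections.
    if(static_cast<int>(keep.size()) > detections_limit)
        keep.resize(detections_limit);
    return keep;
}
```

In the real function, this runs per class over the [count, num_classes] score tensor, and the limit is taken from BoxNMSLimitInfo.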

Constructor & Destructor Documentation

◆ CPPBoxWithNonMaximaSuppressionLimit() [1/2]

CPPBoxWithNonMaximaSuppressionLimit ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Constructor.

Definition at line 96 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

    : _memory_group(std::move(memory_manager)),
      _box_with_nms_limit_kernel(),
      _scores_in(),
      _boxes_in(),
      _batch_splits_in(),
      _scores_out(),
      _boxes_out(),
      _classes(),
      _batch_splits_out(),
      _keeps(),
      _scores_in_f32(),
      _boxes_in_f32(),
      _batch_splits_in_f32(),
      _scores_out_f32(),
      _boxes_out_f32(),
      _classes_f32(),
      _batch_splits_out_f32(),
      _keeps_f32(),
      _is_qasymm8(false)
{
}

◆ CPPBoxWithNonMaximaSuppressionLimit() [2/2]

Prevent instances of this class from being copied (As this class contains pointers)

Member Function Documentation

◆ configure()

void configure ( const ITensor * scores_in,
const ITensor * boxes_in,
const ITensor * batch_splits_in,
ITensor * scores_out,
ITensor * boxes_out,
ITensor * classes,
ITensor * batch_splits_out = nullptr,
ITensor * keeps = nullptr,
ITensor * keeps_size = nullptr,
const BoxNMSLimitInfo  info = BoxNMSLimitInfo()
)

Configure the BoxWithNonMaximaSuppressionLimit CPP kernel.

Parameters
  [in]  scores_in        The scores input tensor of size [count, num_classes]. Data types supported: QASYMM8/F16/F32
  [in]  boxes_in         The boxes input tensor of size [count, num_classes * 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8, otherwise same as scores_in
  [in]  batch_splits_in  The batch splits input tensor of size [batch_size]. Data types supported: Same as scores_in
Note
  Can be a nullptr. If not a nullptr, scores_in and boxes_in have items from multiple images.
Parameters
  [out] scores_out       The scores output tensor of size [N]. Data types supported: Same as scores_in
  [out] boxes_out        The boxes output tensor of size [N, 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8, otherwise same as scores_in
  [out] classes          The classes output tensor of size [N]. Data types supported: Same as scores_in
  [out] batch_splits_out (Optional) The batch splits output tensor. Data types supported: Same as scores_in
  [out] keeps            (Optional) The keeps output tensor of size [N]. Data types supported: Same as scores_in
  [in]  keeps_size       (Optional) Number of filtered indices per class tensor of size [num_classes]. Data types supported: Same as scores_in
  [in]  info             (Optional) BoxNMSLimitInfo information.

Definition at line 119 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(scores_in, boxes_in, scores_out, boxes_out, classes);

    _is_qasymm8 = scores_in->info()->data_type() == DataType::QASYMM8;

    _scores_in        = scores_in;
    _boxes_in         = boxes_in;
    _batch_splits_in  = batch_splits_in;
    _scores_out       = scores_out;
    _boxes_out        = boxes_out;
    _classes          = classes;
    _batch_splits_out = batch_splits_out;
    _keeps            = keeps;

    if(_is_qasymm8)
    {
        // Manage intermediate buffers
        _memory_group.manage(&_scores_in_f32);
        _memory_group.manage(&_boxes_in_f32);
        _memory_group.manage(&_scores_out_f32);
        _memory_group.manage(&_boxes_out_f32);
        _memory_group.manage(&_classes_f32);
        _scores_in_f32.allocator()->init(scores_in->info()->clone()->set_data_type(DataType::F32));
        _boxes_in_f32.allocator()->init(boxes_in->info()->clone()->set_data_type(DataType::F32));
        if(batch_splits_in != nullptr)
        {
            _memory_group.manage(&_batch_splits_in_f32);
            _batch_splits_in_f32.allocator()->init(batch_splits_in->info()->clone()->set_data_type(DataType::F32));
        }
        _scores_out_f32.allocator()->init(scores_out->info()->clone()->set_data_type(DataType::F32));
        _boxes_out_f32.allocator()->init(boxes_out->info()->clone()->set_data_type(DataType::F32));
        _classes_f32.allocator()->init(classes->info()->clone()->set_data_type(DataType::F32));
        if(batch_splits_out != nullptr)
        {
            _memory_group.manage(&_batch_splits_out_f32);
            _batch_splits_out_f32.allocator()->init(batch_splits_out->info()->clone()->set_data_type(DataType::F32));
        }
        if(keeps != nullptr)
        {
            _memory_group.manage(&_keeps_f32);
            _keeps_f32.allocator()->init(keeps->info()->clone()->set_data_type(DataType::F32));
        }

        _box_with_nms_limit_kernel.configure(&_scores_in_f32, &_boxes_in_f32, (batch_splits_in != nullptr) ? &_batch_splits_in_f32 : nullptr,
                                             &_scores_out_f32, &_boxes_out_f32, &_classes_f32,
                                             (batch_splits_out != nullptr) ? &_batch_splits_out_f32 : nullptr, (keeps != nullptr) ? &_keeps_f32 : nullptr,
                                             keeps_size, info);
    }
    else
    {
        _box_with_nms_limit_kernel.configure(scores_in, boxes_in, batch_splits_in, scores_out, boxes_out, classes, batch_splits_out, keeps, keeps_size, info);
    }

    if(_is_qasymm8)
    {
        _scores_in_f32.allocator()->allocate();
        _boxes_in_f32.allocator()->allocate();
        if(_batch_splits_in != nullptr)
        {
            _batch_splits_in_f32.allocator()->allocate();
        }
        _scores_out_f32.allocator()->allocate();
        _boxes_out_f32.allocator()->allocate();
        _classes_f32.allocator()->allocate();
        if(batch_splits_out != nullptr)
        {
            _batch_splits_out_f32.allocator()->allocate();
        }
        if(keeps != nullptr)
        {
            _keeps_f32.allocator()->allocate();
        }
    }
}

References TensorAllocator::allocate(), Tensor::allocator(), ARM_COMPUTE_ERROR_ON_NULLPTR, ICloneable< T >::clone(), CPPBoxWithNonMaximaSuppressionLimitKernel::configure(), ITensorInfo::data_type(), arm_compute::F32, ITensor::info(), arm_compute::test::validation::info, TensorAllocator::init(), MemoryGroup::manage(), and arm_compute::QASYMM8.

Referenced by NEGenerateProposalsLayer::configure(), and CLGenerateProposalsLayer::configure().
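
As the code above shows, the QASYMM8 path wraps the kernel in a dequantize-before / quantize-after round trip through F32 intermediates. A rough, library-independent sketch of what that conversion means for box coordinates (QASYMM16 with scale 0.125 and offset 0, as stated in the parameter table), assuming the usual asymmetric fixed-point formula real = scale * (q - offset). The helper names below are illustrative; they are not the library's `dequantize_tensor`/`quantize_tensor`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Dequantize one asymmetric fixed-point value: real = scale * (q - offset).
static float dequantize(uint16_t q, float scale, int offset)
{
    return scale * (static_cast<int>(q) - offset);
}

// Quantize back: q = round(real / scale) + offset, clamped to the uint16_t range.
static uint16_t quantize_u16(float v, float scale, int offset)
{
    const int q = static_cast<int>(std::lround(v / scale)) + offset;
    return static_cast<uint16_t>(std::min(std::max(q, 0), 65535));
}
```

With scale 0.125 and offset 0, QASYMM16 can represent coordinates on a 1/8-pixel grid up to 65535 * 0.125 = 8191.875, which is why a coordinate such as 100.5 survives the round trip exactly.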

◆ operator=()

Prevent instances of this class from being copied (As this class contains pointers)

◆ run()

void run ( ) override

Run the kernels contained in the function.

For NEON kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it has not been done yet.

Implements IFunction.

Definition at line 217 of file CPPBoxWithNonMaximaSuppressionLimit.cpp.

{
    // Acquire all the temporaries
    MemoryGroupResourceScope scope_mg(_memory_group);

    if(_is_qasymm8)
    {
        dequantize_tensor(_scores_in, &_scores_in_f32);
        dequantize_tensor(_boxes_in, &_boxes_in_f32);
        if(_batch_splits_in != nullptr)
        {
            dequantize_tensor(_batch_splits_in, &_batch_splits_in_f32);
        }
    }

    Scheduler::get().schedule(&_box_with_nms_limit_kernel, Window::DimY);

    if(_is_qasymm8)
    {
        quantize_tensor(&_scores_out_f32, _scores_out);
        quantize_tensor(&_boxes_out_f32, _boxes_out);
        quantize_tensor(&_classes_f32, _classes);
        if(_batch_splits_out != nullptr)
        {
            quantize_tensor(&_batch_splits_out_f32, _batch_splits_out);
        }
        if(_keeps != nullptr)
        {
            quantize_tensor(&_keeps_f32, _keeps);
        }
    }
}

References Window::DimY, Scheduler::get(), and IScheduler::schedule().

Referenced by NEGenerateProposalsLayer::run().
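
The `MemoryGroupResourceScope scope_mg(_memory_group)` line in run() relies on RAII: the managed temporaries are acquired when the scope object is constructed and released automatically when it goes out of scope at the end of the function. A minimal sketch of that acquire/release pattern (types here are illustrative, not the library's):

```cpp
// Stand-in for a memory group that tracks how many scopes hold its resources.
struct Group
{
    int acquired = 0;
    void acquire() { ++acquired; }
    void release() { --acquired; }
};

// RAII scope: acquire on construction, release on destruction, even on
// early return or exception.
struct ResourceScope
{
    explicit ResourceScope(Group &g) : _g(g) { _g.acquire(); }
    ~ResourceScope() { _g.release(); }
    ResourceScope(const ResourceScope &) = delete;
    ResourceScope &operator=(const ResourceScope &) = delete;
    Group &_g;
};
```

This guarantees the intermediate F32 buffers are only held for the duration of a single run() call.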

◆ validate()

static Status validate ( const ITensorInfo * scores_in,
const ITensorInfo * boxes_in,
const ITensorInfo * batch_splits_in,
const ITensorInfo * scores_out,
const ITensorInfo * boxes_out,
const ITensorInfo * classes,
const ITensorInfo * batch_splits_out = nullptr,
const ITensorInfo * keeps = nullptr,
const ITensorInfo * keeps_size = nullptr,
const BoxNMSLimitInfo  info = BoxNMSLimitInfo()
)
static

Static function to check if given info will lead to a valid configuration of CPPBoxWithNonMaximaSuppressionLimit.

Parameters
  [in]  scores_in        The scores input tensor of size [count, num_classes]. Data types supported: QASYMM8/F16/F32
  [in]  boxes_in         The boxes input tensor of size [count, num_classes * 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8, otherwise same as scores_in
  [in]  batch_splits_in  The batch splits input tensor of size [batch_size]. Data types supported: Same as scores_in
Note
  Can be a nullptr. If not a nullptr, scores_in and boxes_in have items from multiple images.
Parameters
  [in]  scores_out       The scores output tensor of size [N]. Data types supported: Same as scores_in
  [in]  boxes_out        The boxes output tensor of size [N, 4]. Data types supported: QASYMM16 with 0.125 scale and 0 offset if scores_in is QASYMM8, otherwise same as scores_in
  [in]  classes          The classes output tensor of size [N]. Data types supported: Same as scores_in
  [in]  batch_splits_out (Optional) The batch splits output tensor. Data types supported: Same as scores_in
  [in]  keeps            (Optional) The keeps output tensor of size [N]. Data types supported: Same as scores_in
  [in]  keeps_size       (Optional) Number of filtered indices per class tensor of size [num_classes]. Data types supported: Same as scores_in
  [in]  info             (Optional) BoxNMSLimitInfo information.
Returns
  a status

The documentation for this class was generated from the following files:

  CPPBoxWithNonMaximaSuppressionLimit.h
  CPPBoxWithNonMaximaSuppressionLimit.cpp