Compute Library 21.02
CLCannyEdge Class Reference

Basic function to execute Canny edge detection on OpenCL. More...

#include <CLCannyEdge.h>


Public Member Functions

 CLCannyEdge (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Constructor. More...
 
 CLCannyEdge (const CLCannyEdge &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CLCannyEdge & operator= (const CLCannyEdge &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 ~CLCannyEdge ()
 Default destructor. More...
 
void configure (ICLTensor *input, ICLTensor *output, int32_t upper_thr, int32_t lower_thr, int32_t gradient_size, int32_t norm_type, BorderMode border_mode, uint8_t constant_border_value=0)
 Initialise the function's source, destination, thresholds, gradient size, normalization type and border mode. More...
 
void configure (const CLCompileContext &compile_context, ICLTensor *input, ICLTensor *output, int32_t upper_thr, int32_t lower_thr, int32_t gradient_size, int32_t norm_type, BorderMode border_mode, uint8_t constant_border_value=0)
 Initialise the function's source, destination, thresholds, gradient size, normalization type and border mode. More...
 
virtual void run () override
 Run the kernels contained in the function. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 
virtual void prepare ()
 Prepare the function for executing. More...
 

Detailed Description

Basic function to execute Canny edge detection on OpenCL.

This function calls the following OpenCL kernels and functions:

  1. CLFillBorderKernel (if border_mode == REPLICATE or border_mode == CONSTANT)
  2. CLSobel3x3 (if gradient_size == 3) or CLSobel5x5 (if gradient_size == 5) or CLSobel7x7 (if gradient_size == 7)
  3. CLGradientKernel
  4. CLEdgeNonMaxSuppressionKernel
  5. CLEdgeTraceKernel
Deprecated:
This function is deprecated and is intended to be removed in the 21.05 release.

Definition at line 55 of file CLCannyEdge.h.
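
As a point of orientation, a minimal end-to-end usage sketch follows. The tensor shape, threshold values and border mode are illustrative assumptions, not values prescribed by the library:

#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/runtime/CL/CLTensor.h"
#include "arm_compute/runtime/CL/functions/CLCannyEdge.h"

using namespace arm_compute;

int main()
{
    // Initialise the default OpenCL context and command queue
    CLScheduler::get().default_init();

    // U8 source and destination images (the shape is an assumption)
    CLTensor src, dst;
    src.allocator()->init(TensorInfo(TensorShape(640U, 480U), 1, DataType::U8));
    dst.allocator()->init(TensorInfo(TensorShape(640U, 480U), 1, DataType::U8));

    // 3x3 Sobel, L2 norm (norm_type == 2), hysteresis thresholds 100/50
    CLCannyEdge canny;
    canny.configure(&src, &dst, 100, 50, 3, 2, BorderMode::REPLICATE);

    src.allocator()->allocate();
    dst.allocator()->allocate();

    // ... fill src with image data ...

    canny.run();
    CLScheduler::get().sync(); // run() does not block; wait for completion
    return 0;
}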

Constructor & Destructor Documentation

◆ CLCannyEdge() [1/2]

CLCannyEdge (std::shared_ptr< IMemoryManager > memory_manager = nullptr)

Constructor.

Definition at line 41 of file CLCannyEdge.cpp.

    : _memory_group(std::move(memory_manager)),
      _sobel(),
      _gradient(std::make_unique<CLGradientKernel>()),
      _border_mag_gradient(std::make_unique<CLFillBorderKernel>()),
      _non_max_suppr(std::make_unique<CLEdgeNonMaxSuppressionKernel>()),
      _edge_trace(std::make_unique<CLEdgeTraceKernel>()),
      _gx(),
      _gy(),
      _mag(),
      _phase(),
      _nonmax(),
      _visited(),
      _recorded(),
      _l1_list_counter(),
      _l1_stack(),
      _output(nullptr)
{
}
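
Passing a memory manager lets several functions share backing memory for their intermediate tensors. A sketch, under the assumption that the usual MemoryManagerOnDemand setup from the library's runtime is used:

#include "arm_compute/runtime/BlobLifetimeManager.h"
#include "arm_compute/runtime/MemoryManagerOnDemand.h"
#include "arm_compute/runtime/PoolManager.h"
#include "arm_compute/runtime/CL/functions/CLCannyEdge.h"

using namespace arm_compute;

// Build an on-demand memory manager; the function's intermediate tensors
// (_gx, _gy, _mag, ...) are then served from its pools at run time.
auto lifetime_mgr = std::make_shared<BlobLifetimeManager>();
auto pool_mgr     = std::make_shared<PoolManager>();
auto memory_mgr   = std::make_shared<MemoryManagerOnDemand>(lifetime_mgr, pool_mgr);

CLCannyEdge canny(memory_mgr);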

◆ CLCannyEdge() [2/2]

CLCannyEdge (const CLCannyEdge &) = delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ ~CLCannyEdge()

~CLCannyEdge () = default

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure (ICLTensor * input,
                ICLTensor * output,
                int32_t     upper_thr,
                int32_t     lower_thr,
                int32_t     gradient_size,
                int32_t     norm_type,
                BorderMode  border_mode,
                uint8_t     constant_border_value = 0
               )

Initialise the function's source, destination, thresholds, gradient size, normalization type and border mode.

Parameters
    [in,out]  input                  Source tensor. Data types supported: U8. (Written to only for border_mode != UNDEFINED)
    [out]     output                 Destination tensor. Data types supported: U8.
    [in]      upper_thr              Upper threshold used for the hysteresis.
    [in]      lower_thr              Lower threshold used for the hysteresis.
    [in]      gradient_size          Gradient size (3, 5 or 7).
    [in]      norm_type              Normalization type. If 1, L1-Norm, otherwise L2-Norm.
    [in]      border_mode            Border mode to use for the convolution.
    [in]      constant_border_value  (Optional) Constant value to use for borders if border_mode is set to CONSTANT.
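
For reference (these are the standard Canny definitions rather than text from this page), the two normalization choices correspond to the gradient magnitudes mag = |Gx| + |Gy| for the L1-Norm (norm_type == 1) and mag = sqrt(Gx^2 + Gy^2) for the L2-Norm (norm_type == 2).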

Definition at line 63 of file CLCannyEdge.cpp.

References CLKernelLibrary::get().

{
    configure(CLKernelLibrary::get().get_compile_context(), input, output, upper_thr, lower_thr, gradient_size, norm_type, border_mode, constant_border_value);
}
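For illustration, a typical call might look as follows; src, dst and the numeric values are assumptions for the example:

// 3x3 Sobel, L2 gradient norm (norm_type == 2), replicated borders;
// hysteresis keeps pixels above 100 and traces connected ones down to 50.
canny.configure(&src, &dst, 100, 50, 3, 2, BorderMode::REPLICATE);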

◆ configure() [2/2]

void configure (const CLCompileContext & compile_context,
                ICLTensor *              input,
                ICLTensor *              output,
                int32_t                  upper_thr,
                int32_t                  lower_thr,
                int32_t                  gradient_size,
                int32_t                  norm_type,
                BorderMode               border_mode,
                uint8_t                  constant_border_value = 0
               )

Initialise the function's source, destination, thresholds, gradient size, normalization type and border mode.

Parameters
    [in]      compile_context        The compile context to be used.
    [in,out]  input                  Source tensor. Data types supported: U8. (Written to only for border_mode != UNDEFINED)
    [out]     output                 Destination tensor. Data types supported: U8.
    [in]      upper_thr              Upper threshold used for the hysteresis.
    [in]      lower_thr              Lower threshold used for the hysteresis.
    [in]      gradient_size          Gradient size (3, 5 or 7).
    [in]      norm_type              Normalization type. If 1, L1-Norm, otherwise L2-Norm.
    [in]      border_mode            Border mode to use for the convolution.
    [in]      constant_border_value  (Optional) Constant value to use for borders if border_mode is set to CONSTANT.

Definition at line 69 of file CLCannyEdge.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_VAR, ITensorInfo::dimension(), ITensor::info(), arm_compute::test::validation::info, ITensorAllocator::init(), TensorInfo::init(), MemoryGroup::manage(), arm_compute::S16, arm_compute::S32, TensorShape::set(), arm_compute::test::validation::shape, ITensorInfo::tensor_shape(), arm_compute::U16, arm_compute::U32, arm_compute::U8, and arm_compute::UNDEFINED.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);
    ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::U8);
    ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(output, 1, DataType::U8);
    ARM_COMPUTE_ERROR_ON((1 != norm_type) && (2 != norm_type));
    ARM_COMPUTE_ERROR_ON((gradient_size != 3) && (gradient_size != 5) && (gradient_size != 7));
    ARM_COMPUTE_ERROR_ON((lower_thr < 0) || (lower_thr >= upper_thr));

    _output = output;

    const unsigned int L1_hysteresis_stack_size = 8;
    const TensorShape  shape                    = input->info()->tensor_shape();

    TensorInfo gradient_info;
    TensorInfo info;

    // Initialize images
    if(gradient_size < 7)
    {
        gradient_info.init(shape, 1, arm_compute::DataType::S16);
        info.init(shape, 1, arm_compute::DataType::U16);
    }
    else
    {
        gradient_info.init(shape, 1, arm_compute::DataType::S32);
        info.init(shape, 1, arm_compute::DataType::U32);
    }

    _gx.allocator()->init(gradient_info);
    _gy.allocator()->init(gradient_info);
    _mag.allocator()->init(info);
    _nonmax.allocator()->init(info);

    TensorInfo info_u8(shape, 1, arm_compute::DataType::U8);
    _phase.allocator()->init(info_u8);
    _l1_list_counter.allocator()->init(info_u8);

    TensorInfo info_u32(shape, 1, arm_compute::DataType::U32);
    _visited.allocator()->init(info_u32);
    _recorded.allocator()->init(info_u32);

    TensorShape shape_l1_stack = input->info()->tensor_shape();
    shape_l1_stack.set(0, input->info()->dimension(0) * L1_hysteresis_stack_size);
    TensorInfo info_s32(shape_l1_stack, 1, arm_compute::DataType::S32);
    _l1_stack.allocator()->init(info_s32);

    // Manage intermediate buffers
    _memory_group.manage(&_gx);
    _memory_group.manage(&_gy);

    // Configure/Init sobelNxN
    if(gradient_size == 3)
    {
        auto k = std::make_unique<CLSobel3x3>();
        k->configure(compile_context, input, &_gx, &_gy, border_mode, constant_border_value);
        _sobel = std::move(k);
    }
    else if(gradient_size == 5)
    {
        auto k = std::make_unique<CLSobel5x5>();
        k->configure(compile_context, input, &_gx, &_gy, border_mode, constant_border_value);
        _sobel = std::move(k);
    }
    else if(gradient_size == 7)
    {
        auto k = std::make_unique<CLSobel7x7>();
        k->configure(compile_context, input, &_gx, &_gy, border_mode, constant_border_value);
        _sobel = std::move(k);
    }
    else
    {
        ARM_COMPUTE_ERROR_VAR("Gradient size %d not supported", gradient_size);
    }

    // Manage intermediate buffers
    _memory_group.manage(&_mag);
    _memory_group.manage(&_phase);

    // Configure gradient
    _gradient->configure(compile_context, &_gx, &_gy, &_mag, &_phase, norm_type);

    // Allocate intermediate buffers
    _gx.allocator()->allocate();
    _gy.allocator()->allocate();

    // Manage intermediate buffers
    _memory_group.manage(&_nonmax);

    // Configure non-maxima suppression
    _non_max_suppr->configure(compile_context, &_mag, &_phase, &_nonmax, lower_thr, border_mode == BorderMode::UNDEFINED);

    // Allocate intermediate buffers
    _phase.allocator()->allocate();

    // Fill border around magnitude image as non-maxima suppression will access
    // it. If border mode is undefined filling the border is a nop.
    _border_mag_gradient->configure(compile_context, &_mag, _non_max_suppr->border_size(), border_mode, constant_border_value);

    // Allocate intermediate buffers
    _mag.allocator()->allocate();

    // Manage intermediate buffers
    _memory_group.manage(&_visited);
    _memory_group.manage(&_recorded);
    _memory_group.manage(&_l1_stack);
    _memory_group.manage(&_l1_list_counter);

    // Configure edge tracing
    _edge_trace->configure(compile_context, &_nonmax, output, upper_thr, lower_thr, &_visited, &_recorded, &_l1_stack, &_l1_list_counter);

    // Allocate intermediate buffers
    _visited.allocator()->allocate();
    _recorded.allocator()->allocate();
    _l1_stack.allocator()->allocate();
    _l1_list_counter.allocator()->allocate();
    _nonmax.allocator()->allocate();
}
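
Note how the listing interleaves MemoryGroup::manage() and allocate() calls: manage() opens an intermediate tensor's lifetime before the kernels consuming it are configured, and allocate() closes it once no later kernel needs the tensor, letting the memory group overlap the backing storage of tensors with disjoint lifetimes. The pattern, with illustrative names, is:

_memory_group.manage(&tmp);           // lifetime of tmp starts here
producer->configure(..., &tmp, ...);  // kernels using tmp are configured
consumer->configure(&tmp, ...);
tmp.allocator()->allocate();          // lifetime ends; the group may reuse
                                      // this storage for another tensor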

◆ operator=()

CLCannyEdge & operator= (const CLCannyEdge &) = delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ run()

virtual void run () override

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
Will call prepare() on the first run if it has not been done already.

Implements IFunction.

Definition at line 190 of file CLCannyEdge.cpp.

References ICLTensor::clear(), CLScheduler::enqueue(), and CLScheduler::get().

{
    MemoryGroupResourceScope scope_mg(_memory_group);

    // Run sobel
    _sobel->run();

    // Run phase and magnitude calculation
    CLScheduler::get().enqueue(*_gradient, false);

    // Fill border before non-maxima suppression. Nop for border mode undefined.
    CLScheduler::get().enqueue(*_border_mag_gradient, false);

    // Run non-maxima suppression
    _nonmax.clear(CLScheduler::get().queue());
    CLScheduler::get().enqueue(*_non_max_suppr, false);

    // Clear temporary structures and run edge trace
    _output->clear(CLScheduler::get().queue());
    _visited.clear(CLScheduler::get().queue());
    _recorded.clear(CLScheduler::get().queue());
    _l1_list_counter.clear(CLScheduler::get().queue());
    _l1_stack.clear(CLScheduler::get().queue());
    CLScheduler::get().enqueue(*_edge_trace, true);
}
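
Since the final enqueue of the edge-trace kernel flushes the queue without waiting, a caller that reads the output on the host should synchronise first; a minimal sketch, with canny and dst as in the earlier examples:

canny.run();               // enqueue Sobel, gradient, suppression, trace
CLScheduler::get().sync(); // block until the command queue has drained
// dst can now be mapped and read on the host (e.g. dst.map()/dst.unmap())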

The documentation for this class was generated from the following files:

  • CLCannyEdge.h
  • CLCannyEdge.cpp