Compute Library
 19.08
NEChannelCombineKernel Class Reference

Interface for the channel combine kernel. More...

#include <NEChannelCombineKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel. More...
 
 NEChannelCombineKernel ()
 Default constructor. More...
 
 NEChannelCombineKernel (const NEChannelCombineKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
NEChannelCombineKernel & operator= (const NEChannelCombineKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 NEChannelCombineKernel (NEChannelCombineKernel &&)=default
 Allow instances of this class to be moved. More...
 
NEChannelCombineKernel & operator= (NEChannelCombineKernel &&)=default
 Allow instances of this class to be moved. More...
 
 ~NEChannelCombineKernel ()=default
 Default destructor. More...
 
void configure (const ITensor *plane0, const ITensor *plane1, const ITensor *plane2, const ITensor *plane3, ITensor *output)
 Configure function's inputs and outputs. More...
 
void configure (const IImage *plane0, const IImage *plane1, const IImage *plane2, IMultiImage *output)
 Configure function's inputs and outputs. More...
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window. More...
 
bool is_parallelisable () const override
 Indicates whether or not the kernel is parallelisable. More...
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor. More...
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor. More...
 
virtual ~IKernel ()=default
 Destructor. More...
 
virtual BorderSize border_size () const
 The size of the border for that kernel. More...
 
const Window & window () const
 The maximum window the kernel can be executed on. More...
 

Detailed Description

Interface for the channel combine kernel.

Definition at line 39 of file NEChannelCombineKernel.h.

Constructor & Destructor Documentation

◆ NEChannelCombineKernel() [1/3]

Default constructor.

Definition at line 46 of file NEChannelCombineKernel.cpp.

47  : _func(nullptr), _planes{ { nullptr } }, _output(nullptr), _output_multi(nullptr), _x_subsampling{ { 1, 1, 1 } }, _y_subsampling{ { 1, 1, 1 } }, _num_elems_processed_per_iteration(8),
48 _is_parallelizable(true)
49 {
50 }

◆ NEChannelCombineKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEChannelCombineKernel() [3/3]

Allow instances of this class to be moved.
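
The copy and move rules listed above can be checked at compile time. The following is a minimal sketch of the same pattern on a hypothetical class, `PointerHolder`, which is not part of the library and only stands in for a kernel that stores non-owning pointers. The helper `pointer_survives_move` is likewise an assumption made for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a kernel that keeps non-owning pointers.
class PointerHolder
{
public:
    PointerHolder() = default;
    PointerHolder(const PointerHolder &) = delete;            // copy construction disabled
    PointerHolder &operator=(const PointerHolder &) = delete; // copy assignment disabled
    PointerHolder(PointerHolder &&) = default;                // move construction allowed
    PointerHolder &operator=(PointerHolder &&) = default;     // move assignment allowed
    ~PointerHolder() = default;

    void set(const int *p)
    {
        assert(p != nullptr); // reject null input, similar in spirit to a nullptr check macro
        _ptr = p;
    }
    const int *get() const
    {
        return _ptr;
    }

private:
    const int *_ptr{ nullptr };
};

// Compile-time checks of the copy/move policy.
static_assert(!std::is_copy_constructible<PointerHolder>::value, "copy construction must be disabled");
static_assert(!std::is_copy_assignable<PointerHolder>::value, "copy assignment must be disabled");
static_assert(std::is_move_constructible<PointerHolder>::value, "move construction must be enabled");
static_assert(std::is_move_assignable<PointerHolder>::value, "move assignment must be enabled");

// Moves a holder and reports whether the stored pointer travelled with it.
inline bool pointer_survives_move(const int *p)
{
    PointerHolder src;
    src.set(p);
    PointerHolder dst(std::move(src));
    return dst.get() == p;
}
```

Note that a defaulted move of a raw pointer member simply copies the pointer value; the moved-from object is not reset.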

◆ ~NEChannelCombineKernel()

~NEChannelCombineKernel ( )
default

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( const ITensor * plane0,
const ITensor * plane1,
const ITensor * plane2,
const ITensor * plane3,
ITensor * output 
)

Configure function's inputs and outputs.

Parameters
[in]  plane0  The 2D plane that forms channel 0. Data type supported: U8
[in]  plane1  The 2D plane that forms channel 1. Data type supported: U8
[in]  plane2  The 2D plane that forms channel 2. Data type supported: U8
[in]  plane3  The 2D plane that forms channel 3. Data type supported: U8. Only used when the output format is RGBA8888; may be nullptr otherwise.
[out] output  The single planar output tensor. Formats supported: RGB888/RGBA8888/UYVY422/YUYV422
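
As a rough illustration of what combining planes into a single planar RGB888 output means, the scalar sketch below interleaves three U8 planes. The function name `combine_rgb888` and the use of `std::vector` are assumptions made for this example; the kernel itself works on ITensor objects and processes several pixels per iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model only: interleave three equally sized U8 planes into one RGB888 buffer.
// Output layout per pixel i: [plane0[i], plane1[i], plane2[i]].
std::vector<std::uint8_t> combine_rgb888(const std::vector<std::uint8_t> &plane0,
                                         const std::vector<std::uint8_t> &plane1,
                                         const std::vector<std::uint8_t> &plane2)
{
    // All planes must hold the same number of pixels.
    assert(plane0.size() == plane1.size() && plane1.size() == plane2.size());

    std::vector<std::uint8_t> out(plane0.size() * 3);
    for(std::size_t i = 0; i < plane0.size(); ++i)
    {
        out[3 * i + 0] = plane0[i];
        out[3 * i + 1] = plane1[i];
        out[3 * i + 2] = plane2[i];
    }
    return out;
}
```

An RGBA8888 variant would follow the same shape with a fourth plane and a stride of 4 bytes per pixel.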

Definition at line 52 of file NEChannelCombineKernel.cpp.

53 {
54  ARM_COMPUTE_ERROR_ON_NULLPTR(plane0, plane1, plane2, output);
55  ARM_COMPUTE_ERROR_ON(plane0 == output);
56  ARM_COMPUTE_ERROR_ON(plane1 == output);
57  ARM_COMPUTE_ERROR_ON(plane2 == output);
58 
63 
67 
68  const Format output_format = output->info()->format();
69 
70  // Check if horizontal dimension of Y plane is even and validate horizontal sub-sampling dimensions for U and V planes
71  if(Format::YUYV422 == output_format || Format::UYVY422 == output_format)
72  {
73  // Validate Y plane of input and output
74  ARM_COMPUTE_ERROR_ON_TENSORS_NOT_EVEN(output_format, plane0, output);
75 
76  // Validate U and V plane of the input
77  ARM_COMPUTE_ERROR_ON_TENSORS_NOT_SUBSAMPLED(output_format, plane0->info()->tensor_shape(), plane1, plane2);
78  }
79 
80  _planes[0] = plane0;
81  _planes[1] = plane1;
82  _planes[2] = plane2;
83  _planes[3] = nullptr;
84 
85  // Validate the last input tensor only for RGBA format
86  if(Format::RGBA8888 == output_format)
87  {
90 
93 
94  _planes[3] = plane3;
95  }
96 
97  _output = output;
98  _output_multi = nullptr;
99 
100  // Half the processed elements for U and V channels due to horizontal sub-sampling of 2
101  if(Format::YUYV422 == output_format || Format::UYVY422 == output_format)
102  {
103  _x_subsampling[1] = 2;
104  _x_subsampling[2] = 2;
105  }
106 
107  _num_elems_processed_per_iteration = 8;
108  _is_parallelizable = true;
109 
110  // Select function and number of elements to process given the output format
111  switch(output_format)
112  {
113  case Format::RGB888:
114  _func = &NEChannelCombineKernel::combine_3C;
115  break;
116  case Format::RGBA8888:
117  _func = &NEChannelCombineKernel::combine_4C;
118  break;
119  case Format::UYVY422:
120  _num_elems_processed_per_iteration = 16;
121  _func = &NEChannelCombineKernel::combine_YUV_1p<true>;
122  break;
123  case Format::YUYV422:
124  _num_elems_processed_per_iteration = 16;
125  _func = &NEChannelCombineKernel::combine_YUV_1p<false>;
126  break;
127  default:
128  ARM_COMPUTE_ERROR("Not supported format.");
129  break;
130  }
131 
132  Window win = calculate_max_window(*plane0->info(), Steps(_num_elems_processed_per_iteration));
133 
134  AccessWindowHorizontal output_access(output->info(), 0, _num_elems_processed_per_iteration);
135  AccessWindowHorizontal plane0_access(plane0->info(), 0, _num_elems_processed_per_iteration / _x_subsampling[1], 1.f / _x_subsampling[0]);
136  AccessWindowHorizontal plane1_access(plane1->info(), 0, _num_elems_processed_per_iteration / _x_subsampling[1], 1.f / _x_subsampling[1]);
137  AccessWindowHorizontal plane2_access(plane2->info(), 0, _num_elems_processed_per_iteration / _x_subsampling[1], 1.f / _x_subsampling[2]);
138  AccessWindowHorizontal plane3_access(plane3 == nullptr ? nullptr : plane3->info(), 0, _num_elems_processed_per_iteration);
139 
141  win,
142  plane0_access,
143  plane1_access,
144  plane2_access,
145  plane3_access,
146  output_access);
147 
149  plane1->info()->valid_region(),
150  plane2->info()->valid_region());
151 
152  if(plane3 != nullptr)
153  {
155  }
156 
157  output_access.set_valid_region(win, ValidRegion(valid_region.anchor, output->info()->tensor_shape()));
158 
159  INEKernel::configure(win);
160 }

References ValidRegion::anchor, ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_FORMAT_NOT_IN, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_TENSOR_NOT_2D, ARM_COMPUTE_ERROR_ON_TENSORS_NOT_EVEN, ARM_COMPUTE_ERROR_ON_TENSORS_NOT_SUBSAMPLED, arm_compute::calculate_max_window(), ITensorInfo::format(), ITensor::info(), arm_compute::intersect_valid_regions(), arm_compute::RGB888, arm_compute::RGBA8888, ITensorInfo::tensor_shape(), arm_compute::U8, arm_compute::update_window_and_padding(), arm_compute::UYVY422, arm_compute::test::validation::valid_region, ITensorInfo::valid_region(), and arm_compute::YUYV422.
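
For the two 4:2:2 output formats, each 32-bit macro pixel carries two luma samples and one shared U/V pair: UYVY422 stores the bytes as U0, Y0, V0, Y1 and YUYV422 as Y0, U0, Y1, V0. The sketch below packs one row in scalar code. The function `pack_yuv422` and its `std::vector` interface are hypothetical; only the boolean template parameter is modelled on the way `configure()` selects `combine_YUV_1p<true>` for UYVY422 and `combine_YUV_1p<false>` for YUYV422.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model only: pack one row of Y, U and V samples into 4:2:2 macro pixels.
// y holds W luma samples (W even); u and v hold W / 2 chroma samples each.
template <bool is_uyvy>
std::vector<std::uint8_t> pack_yuv422(const std::vector<std::uint8_t> &y,
                                      const std::vector<std::uint8_t> &u,
                                      const std::vector<std::uint8_t> &v)
{
    assert(y.size() % 2 == 0);                                   // luma width must be even
    assert(u.size() == y.size() / 2 && v.size() == y.size() / 2); // chroma is sub-sampled by 2

    // UYVY puts chroma at even byte offsets, YUYV puts luma there.
    constexpr std::size_t luma_shift   = is_uyvy ? 1 : 0;
    constexpr std::size_t chroma_shift = is_uyvy ? 0 : 1;

    std::vector<std::uint8_t> out(y.size() * 2);
    for(std::size_t i = 0; i < u.size(); ++i)
    {
        std::uint8_t *px = &out[4 * i];
        px[0 + luma_shift]   = y[2 * i];
        px[2 + luma_shift]   = y[2 * i + 1];
        px[0 + chroma_shift] = u[i];
        px[2 + chroma_shift] = v[i];
    }
    return out;
}
```

Because two luma samples share one chroma pair, the chroma planes advance at half the rate of the luma plane, which is what the horizontal sub-sampling factor of 2 set in `configure()` expresses.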

◆ configure() [2/2]

void configure ( const IImage * plane0,
const IImage * plane1,
const IImage * plane2,
IMultiImage * output 
)

Configure function's inputs and outputs.

Parameters
[in]  plane0  The 2D plane that forms channel 0. Data type supported: U8
[in]  plane1  The 2D plane that forms channel 1. Data type supported: U8
[in]  plane2  The 2D plane that forms channel 2. Data type supported: U8
[out] output  The multi planar output tensor. Formats supported: NV12/NV21/IYUV/YUV444
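
To make the two-plane case concrete, the sketch below models an NV12/NV21 combine in scalar code: the Y plane is copied as is, and the already sub-sampled U and V planes are interleaved into a single chroma plane (UV order for NV12, VU order for NV21). The struct `TwoPlaneImage`, the function `combine_two_plane`, and the flat `std::vector` storage are assumptions for illustration and do not reflect the library's IMultiImage interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for a two-plane 4:2:0 image.
struct TwoPlaneImage
{
    std::vector<std::uint8_t> luma;   // width * height bytes
    std::vector<std::uint8_t> chroma; // (width / 2) * (height / 2) interleaved pairs
};

// Scalar model only: build an NV12 (UV) or NV21 (VU) image from three U8 planes.
TwoPlaneImage combine_two_plane(const std::vector<std::uint8_t> &y,
                                const std::vector<std::uint8_t> &u,
                                const std::vector<std::uint8_t> &v,
                                std::size_t width, std::size_t height, bool is_nv21)
{
    // The luma plane must have even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);
    assert(y.size() == width * height);
    // The chroma planes must be sub-sampled by 2 in both directions.
    assert(u.size() == (width / 2) * (height / 2));
    assert(v.size() == u.size());

    TwoPlaneImage out{ y, std::vector<std::uint8_t>(u.size() * 2) };
    for(std::size_t i = 0; i < u.size(); ++i)
    {
        out.chroma[2 * i + 0] = is_nv21 ? v[i] : u[i];
        out.chroma[2 * i + 1] = is_nv21 ? u[i] : v[i];
    }
    return out;
}
```

The three-plane formats (IYUV and YUV444) need no interleaving; under the same model each input plane would simply be copied to the matching output plane.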

Definition at line 162 of file NEChannelCombineKernel.cpp.

163 {
164  ARM_COMPUTE_ERROR_ON_NULLPTR(plane0, plane1, plane2, output);
168 
173 
177 
178  const Format output_format = output->info()->format();
179 
180  // Validate shape of Y plane to be even and shape of sub-sampling dimensions for U and V planes
181  // Perform validation only for formats which require sub-sampling.
182  if(Format::YUV444 != output_format)
183  {
184  // Validate Y plane of input and output
185  ARM_COMPUTE_ERROR_ON_TENSORS_NOT_EVEN(output_format, plane0, output->plane(0));
186 
187  // Validate U and V plane of the input
188  ARM_COMPUTE_ERROR_ON_TENSORS_NOT_SUBSAMPLED(output_format, plane0->info()->tensor_shape(), plane1, plane2);
189 
190  // Validate second plane U (NV12 and NV21 have a UV88 combined plane while IYUV has only the U plane)
191  // MultiImage generates the correct tensor shape but also check in case the tensor shape of planes was changed to a wrong size
192  ARM_COMPUTE_ERROR_ON_TENSORS_NOT_SUBSAMPLED(output_format, plane0->info()->tensor_shape(), output->plane(1));
193 
194  // Validate the last plane V of format IYUV
195  if(Format::IYUV == output_format)
196  {
197  // Validate V plane of the output
198  ARM_COMPUTE_ERROR_ON_TENSORS_NOT_SUBSAMPLED(output_format, plane0->info()->tensor_shape(), output->plane(2));
199  }
200  }
201 
202  _planes[0] = plane0;
203  _planes[1] = plane1;
204  _planes[2] = plane2;
205  _planes[3] = nullptr;
206  _output = nullptr;
207  _output_multi = output;
208 
209  bool has_two_planes = false;
210  unsigned int num_elems_written_plane1 = 8;
211 
212  _num_elems_processed_per_iteration = 8;
213  _is_parallelizable = true;
214 
215  switch(output_format)
216  {
217  case Format::NV12:
218  case Format::NV21:
219  _x_subsampling = { { 1, 2, 2 } };
220  _y_subsampling = { { 1, 2, 2 } };
221  _func = &NEChannelCombineKernel::combine_YUV_2p;
222  has_two_planes = true;
223  num_elems_written_plane1 = 16;
224  break;
225  case Format::IYUV:
226  _is_parallelizable = false;
227  _x_subsampling = { { 1, 2, 2 } };
228  _y_subsampling = { { 1, 2, 2 } };
229  _func = &NEChannelCombineKernel::combine_YUV_3p;
230  break;
231  case Format::YUV444:
232  _is_parallelizable = false;
233  _x_subsampling = { { 1, 1, 1 } };
234  _y_subsampling = { { 1, 1, 1 } };
235  _func = &NEChannelCombineKernel::combine_YUV_3p;
236  break;
237  default:
238  ARM_COMPUTE_ERROR("Not supported format.");
239  break;
240  }
241 
242  const unsigned int y_step = *std::max_element(_y_subsampling.begin(), _y_subsampling.end());
243 
244  Window win = calculate_max_window(*plane0->info(), Steps(_num_elems_processed_per_iteration, y_step));
245  AccessWindowRectangle output_plane0_access(output->plane(0)->info(), 0, 0, _num_elems_processed_per_iteration, 1, 1.f, 1.f / _y_subsampling[0]);
246  AccessWindowRectangle output_plane1_access(output->plane(1)->info(), 0, 0, num_elems_written_plane1, 1, 1.f / _x_subsampling[1], 1.f / _y_subsampling[1]);
247  AccessWindowRectangle output_plane2_access(has_two_planes ? nullptr : output->plane(2)->info(), 0, 0, _num_elems_processed_per_iteration, 1, 1.f / _x_subsampling[2], 1.f / _y_subsampling[2]);
248 
250  AccessWindowHorizontal(plane0->info(), 0, _num_elems_processed_per_iteration),
251  AccessWindowRectangle(plane1->info(), 0, 0, _num_elems_processed_per_iteration, 1, 1.f / _x_subsampling[1], 1.f / _y_subsampling[1]),
252  AccessWindowRectangle(plane2->info(), 0, 0, _num_elems_processed_per_iteration, 1, 1.f / _x_subsampling[2], 1.f / _y_subsampling[2]),
253  output_plane0_access,
254  output_plane1_access,
255  output_plane2_access);
256 
257  ValidRegion plane0_valid_region = plane0->info()->valid_region();
258  ValidRegion output_plane1_region = has_two_planes ? intersect_valid_regions(plane1->info()->valid_region(), plane2->info()->valid_region()) : plane2->info()->valid_region();
259 
260  output_plane0_access.set_valid_region(win, ValidRegion(plane0_valid_region.anchor, output->plane(0)->info()->tensor_shape()));
261  output_plane1_access.set_valid_region(win, ValidRegion(output_plane1_region.anchor, output->plane(1)->info()->tensor_shape()));
262  output_plane2_access.set_valid_region(win, ValidRegion(plane2->info()->valid_region().anchor, output->plane(2)->info()->tensor_shape()));
263 
264  INEKernel::configure(win);
265 }

References ValidRegion::anchor, ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_FORMAT_NOT_IN, ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_ON_TENSOR_NOT_2D, ARM_COMPUTE_ERROR_ON_TENSORS_NOT_EVEN, ARM_COMPUTE_ERROR_ON_TENSORS_NOT_SUBSAMPLED, arm_compute::calculate_max_window(), MultiImageInfo::format(), IMultiImage::info(), ITensor::info(), arm_compute::intersect_valid_regions(), arm_compute::IYUV, arm_compute::NV12, arm_compute::NV21, IMultiImage::plane(), ITensorInfo::tensor_shape(), arm_compute::U8, arm_compute::update_window_and_padding(), ITensorInfo::valid_region(), and arm_compute::YUV444.

◆ is_parallelisable()

bool is_parallelisable ( ) const
override virtual

Indicates whether or not the kernel is parallelisable.

If the kernel is parallelisable then the window returned by window() can be split into sub-windows which can then be run in parallel.

If the kernel is not parallelisable then only the window returned by window() can be passed to run()

Returns
True if the kernel is parallelisable

Reimplemented from IKernel.

Definition at line 267 of file NEChannelCombineKernel.cpp.

268 {
269  return _is_parallelizable;
270 }
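
The practical effect of this flag on a caller can be sketched as follows: a parallelisable kernel may have its window split across threads, while a non-parallelisable one (in the listing above, the IYUV and YUV444 configurations) must receive the whole window in one call. The function `plan_row_ranges` and the row-range representation are hypothetical and much simpler than the library's Window class and scheduler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open row range [first, second).
using RowRange = std::pair<std::size_t, std::size_t>;

// Scalar model only: decide which row ranges to hand to run().
// A non-parallelisable kernel always gets the full range in one piece.
std::vector<RowRange> plan_row_ranges(std::size_t rows, std::size_t threads, bool parallelisable)
{
    assert(threads > 0);

    std::vector<RowRange> ranges;
    if(!parallelisable || threads == 1 || rows == 0)
    {
        ranges.emplace_back(std::size_t{ 0 }, rows);
        return ranges;
    }

    // Ceiling division so that every row is covered.
    const std::size_t chunk = (rows + threads - 1) / threads;
    for(std::size_t start = 0; start < rows; start += chunk)
    {
        ranges.emplace_back(start, std::min(start + chunk, rows));
    }
    return ranges;
}
```

A real split would also have to respect the window step, so that each sub-window stays a multiple of the number of elements processed per iteration.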

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 42 of file NEChannelCombineKernel.h.

43  {
44  return "NEChannelCombineKernel";
45  }

◆ operator=() [1/2]

NEChannelCombineKernel& operator= ( const NEChannelCombineKernel & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

NEChannelCombineKernel& operator= ( NEChannelCombineKernel &&  )
default

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window,
const ThreadInfo & info 
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.

Implements ICPPKernel.

Definition at line 272 of file NEChannelCombineKernel.cpp.

273 {
277  ARM_COMPUTE_ERROR_ON(_func == nullptr);
278 
279  (this->*_func)(window);
280 }

References ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, arm_compute::test::validation::info, and IKernel::window().
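
The listing above dispatches through a pointer to member function that `configure()` selected earlier. The sketch below shows the same technique on a hypothetical class, `MiniKernel`; its modes, member names, and the use of an exception for the unconfigured case are assumptions for illustration and are unrelated to the library's error macros.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical kernel that picks its implementation once, at configure time.
class MiniKernel
{
public:
    enum class Mode
    {
        Sum,
        Max
    };

    // Select the member function that run() will call later.
    void configure(Mode mode)
    {
        switch(mode)
        {
            case Mode::Sum:
                _func = &MiniKernel::run_sum;
                break;
            case Mode::Max:
                _func = &MiniKernel::run_max;
                break;
        }
    }

    // Dispatch through the stored pointer to member function.
    int run(const std::vector<int> &data) const
    {
        if(_func == nullptr)
        {
            throw std::logic_error("kernel not configured");
        }
        return (this->*_func)(data);
    }

private:
    using Func = int (MiniKernel::*)(const std::vector<int> &) const;

    int run_sum(const std::vector<int> &data) const
    {
        return std::accumulate(data.begin(), data.end(), 0);
    }
    int run_max(const std::vector<int> &data) const
    {
        return data.empty() ? 0 : *std::max_element(data.begin(), data.end());
    }

    Func _func{ nullptr };
};
```

Selecting the function once keeps the per-call path free of format checks, at the cost of requiring that `configure()` runs before `run()`.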


The documentation for this class was generated from the following files: