Compute Library
 19.08
NEMinMaxLayerKernel Class Reference

Interface for the kernel to perform a min/max search on a 3D tensor.

#include <NEMinMaxLayerKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel.

 NEMinMaxLayerKernel ()
 Default constructor.

 NEMinMaxLayerKernel (const NEMinMaxLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

NEMinMaxLayerKernel & operator= (const NEMinMaxLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 NEMinMaxLayerKernel (NEMinMaxLayerKernel &&)=default
 Allow instances of this class to be moved.

NEMinMaxLayerKernel & operator= (NEMinMaxLayerKernel &&)=default
 Allow instances of this class to be moved.

 ~NEMinMaxLayerKernel ()=default
 Default destructor.

void configure (const ITensor *input, ITensor *output)
 Initialise the kernel's input and outputs.

void reset ()
 Resets global minimum and maximum.

void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.

Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.

Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.

virtual ~IKernel ()=default
 Destructor.

virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.

virtual BorderSize border_size () const
 The size of the border for that kernel.

const Window & window () const
 The maximum window the kernel can be executed on.

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *output)
 Static function to check if the given info will lead to a valid configuration of NEMinMaxLayerKernel.
 

Detailed Description

Interface for the kernel to perform a min/max search on a 3D tensor.

Definition at line 38 of file NEMinMaxLayerKernel.h.

Constructor & Destructor Documentation

◆ NEMinMaxLayerKernel() [1/3]

Default constructor.

Definition at line 89 of file NEMinMaxLayerKernel.cpp.

    : _input(nullptr), _output(nullptr), _mtx()
{
}

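◆ NEMinMaxLayerKernel() [2/3]

NEMinMaxLayerKernel ( const NEMinMaxLayerKernel & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ NEMinMaxLayerKernel() [3/3]

NEMinMaxLayerKernel ( NEMinMaxLayerKernel && )
default

Allow instances of this class to be moved.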

◆ ~NEMinMaxLayerKernel()

~NEMinMaxLayerKernel ( )
default

Default destructor.

Member Function Documentation

◆ configure()

void configure ( const ITensor * input,
ITensor * output
)

Initialise the kernel's input and outputs.

Note
output[0] = minimum
output[1] = maximum
Parameters
[in] input  Input tensor with at least 3 dimensions. Dimensions above the third are interpreted as batches. Data type supported: F32.
[out] output  Output tensor with shape [2, batches, ...] which stores the minimum and maximum value for each 3D input tensor. The dimensions from the second onwards must match the batched dimensions of the input tensor. Data type supported: F32.

Definition at line 94 of file NEMinMaxLayerKernel.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), output->info()));

    _input  = input;
    _output = output;

    auto win_config = validate_and_configure_window(input->info(), output->info());

    ARM_COMPUTE_ERROR_THROW_ON(std::get<0>(win_config));

    INEKernel::configure(std::get<1>(win_config));
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ITensor::info(), and arm_compute::validate_and_configure_window().
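
The output layout described in the note and parameters above can be illustrated without the library. The following is a minimal portable sketch, not the kernel's implementation: `min_max_per_batch` and `elems_per_batch` are hypothetical names, and the input is assumed to be a densely packed float buffer in which each consecutive run of `elems_per_batch` values (width * height * channels) is one 3D tensor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Compute Library: computes the minimum and
// maximum of every batch in a densely packed float buffer.
std::vector<float> min_max_per_batch(const std::vector<float> &input, std::size_t elems_per_batch)
{
    assert(elems_per_batch > 0 && input.size() % elems_per_batch == 0);
    const std::size_t batches = input.size() / elems_per_batch;

    // The result mirrors the documented shape [2, batches]:
    // out[2 * b + 0] = minimum, out[2 * b + 1] = maximum of batch b.
    std::vector<float> out(2 * batches);
    for(std::size_t b = 0; b < batches; ++b)
    {
        const auto first = input.begin() + static_cast<std::ptrdiff_t>(b * elems_per_batch);
        const auto last  = first + static_cast<std::ptrdiff_t>(elems_per_batch);
        const auto mm    = std::minmax_element(first, last);
        out[2 * b + 0]   = *mm.first;
        out[2 * b + 1]   = *mm.second;
    }
    return out;
}
```

With this layout, a caller reads the pair for batch `b` at offsets `2 * b` and `2 * b + 1`, which matches the `output[0] = minimum`, `output[1] = maximum` convention along the first output dimension.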

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 41 of file NEMinMaxLayerKernel.h.

{
    return "NEMinMaxLayerKernel";
}

◆ operator=() [1/2]

NEMinMaxLayerKernel& operator= ( const NEMinMaxLayerKernel & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

NEMinMaxLayerKernel& operator= ( NEMinMaxLayerKernel &&  )
default

Allow instances of this class to be moved.

◆ reset()

void reset ( )

Resets global minimum and maximum.

Definition at line 192 of file NEMinMaxLayerKernel.cpp.

{
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);

    // Lane 0 holds the running minimum, lane 1 the running maximum
    float32x2_t reset_values = vdup_n_f32(0.0f);
    reset_values             = vset_lane_f32(std::numeric_limits<float>::max(), reset_values, 0);
    reset_values             = vset_lane_f32(std::numeric_limits<float>::lowest(), reset_values, 1);

    Window window_output;
    window_output.use_tensor_dimensions(_output->info()->tensor_shape());
    window_output.set(Window::DimX, Window::Dimension(0, 1, 1));

    Iterator output(_output, window_output);

    execute_window_loop(window_output, [&](const Coordinates &)
    {
        vst1_f32(reinterpret_cast<float *>(output.ptr()), reset_values);
    },
    output);
}

References ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, Window::DimX, arm_compute::execute_window_loop(), ITensor::info(), arm_compute::support::cpp11::lowest(), Iterator::ptr(), Window::set(), ITensorInfo::tensor_shape(), and Window::use_tensor_dimensions().
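
The reset values are the identity elements of the two reductions: any input is less than or equal to `numeric_limits<float>::max()` and greater than or equal to `numeric_limits<float>::lowest()`, so the first accumulated value replaces both. Note that `lowest()` is used rather than `min()`, because for floating-point types `min()` is the smallest positive normal value. A portable sketch of the same idea follows; `reset_min_max` and `accumulate` are hypothetical scalar stand-ins, not library functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical scalar stand-in for reset(): `out` holds [min, max] pairs.
void reset_min_max(std::vector<float> &out)
{
    assert(out.size() % 2 == 0);
    for(std::size_t i = 0; i < out.size(); i += 2)
    {
        out[i + 0] = std::numeric_limits<float>::max();    // identity for min
        out[i + 1] = std::numeric_limits<float>::lowest(); // identity for max
    }
}

// Folds one value into a [min, max] pair.
void accumulate(float *pair, float value)
{
    pair[0] = std::min(pair[0], value);
    pair[1] = std::max(pair[1], value);
}
```

Calling the reset before each new search matters because the accumulated pair only ever tightens; without it, results from a previous input would leak into the next one.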

◆ run()

void run ( const Window & window,
const ThreadInfo & info
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false, the passed window must be equal to window().
Note
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in] window  Region on which to execute the kernel. (Must be a region of the window returned by window().)
[in] info  Info about the executing thread and CPU.
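
Because run() may be called on several sub-windows of window(), each call only sees part of the data and its partial result has to be merged into the shared output. The sketch below illustrates one way to do that under a lock. It is an assumption-laden illustration, not the library's code: `SharedMinMax` and `split_range` are hypothetical names, and the locking strategy is only suggested by the `_mtx` member and the `update_min_max` call, whose body is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical accumulator; the names are illustrative, not the library's.
struct SharedMinMax
{
    float      min = std::numeric_limits<float>::max();
    float      max = std::numeric_limits<float>::lowest();
    std::mutex mtx;

    // Merge a partial result computed on one sub-window.
    void update(float local_min, float local_max)
    {
        std::lock_guard<std::mutex> lock(mtx);
        min = std::min(min, local_min);
        max = std::max(max, local_max);
    }
};

// Split [0, size) into at most `parts` contiguous sub-ranges that cover it exactly.
std::vector<std::pair<std::size_t, std::size_t>> split_range(std::size_t size, std::size_t parts)
{
    assert(parts > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    const std::size_t step = (size + parts - 1) / parts;
    for(std::size_t start = 0; start < size; start += step)
    {
        ranges.emplace_back(start, std::min(start + step, size));
    }
    return ranges;
}
```

Since min and max are associative and commutative, the order in which sub-windows report their partial results does not affect the final pair; the lock only has to protect the read-modify-write of the shared values.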

Implements ICPPKernel.

Definition at line 117 of file NEMinMaxLayerKernel.cpp.

{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);

    const int x_start = window.x().start();
    const int x_end   = window.x().end();

    Window window_output;
    window_output.use_tensor_dimensions(_output->info()->tensor_shape());
    window_output.set(Window::DimX, Window::Dimension(0, 1, 1));

    // Handle X dimension manually to split into two loops
    // First one will use vector operations, second one processes the left over pixels
    Window window_input(window);
    window_input.set(Window::DimX, Window::Dimension(0, 1, 1));
    window_input.set(3, Window::Dimension(0, 1, 1));

    Iterator input(_input, window_input);
    Iterator output(_output, window_output);

    execute_window_loop(window_output, [&](const Coordinates & id_batch)
    {
        float32x2_t carry_min = vdup_n_f32(std::numeric_limits<float>::max());
        float32x2_t carry_max = vdup_n_f32(std::numeric_limits<float>::lowest());

        float carry_min_scalar = std::numeric_limits<float>::max();
        float carry_max_scalar = std::numeric_limits<float>::lowest();

        execute_window_loop(window_input, [&](const Coordinates &)
        {
            int        x      = x_start;
            const auto in_ptr = reinterpret_cast<const float *>(input.ptr() + id_batch[1] * _input->info()->strides_in_bytes()[3]);

            // Vector loop
            for(; x <= x_end - 8; x += 8)
            {
                const float32x4x2_t pixels   = vld2q_f32(in_ptr + x);
                const float32x4_t   tmp_min1 = vminq_f32(pixels.val[0], pixels.val[1]);
                const float32x4_t   tmp_max1 = vmaxq_f32(pixels.val[0], pixels.val[1]);
                const float32x2_t   tmp_min2 = vmin_f32(vget_high_f32(tmp_min1), vget_low_f32(tmp_min1));
                const float32x2_t   tmp_max2 = vmax_f32(vget_high_f32(tmp_max1), vget_low_f32(tmp_max1));
                carry_min                    = vmin_f32(tmp_min2, carry_min);
                carry_max                    = vmax_f32(tmp_max2, carry_max);
            }

            // Process leftover pixels
            for(; x < x_end; ++x)
            {
                const float pixel = in_ptr[x];
                carry_min_scalar  = std::min(pixel, carry_min_scalar);
                carry_max_scalar  = std::max(pixel, carry_max_scalar);
            }
        },
        input);

        // Reduce result
        carry_min = vpmin_f32(carry_min, carry_min);
        carry_max = vpmax_f32(carry_max, carry_max);
        carry_min = vpmin_f32(carry_min, carry_min);
        carry_max = vpmax_f32(carry_max, carry_max);

        // Extract max/min values
        const float min_i = std::min(vget_lane_f32(carry_min, 0), carry_min_scalar);
        const float max_i = std::max(vget_lane_f32(carry_max, 0), carry_max_scalar);

        auto out_ptr = reinterpret_cast<float *>(output.ptr());

        // Perform reduction of local min/max values
        update_min_max(out_ptr, min_i, max_i);
    },
    output);
}

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, Window::DimX, Window::Dimension::end(), arm_compute::execute_window_loop(), ITensor::info(), arm_compute::test::validation::info, arm_compute::support::cpp11::lowest(), Iterator::ptr(), Window::set(), Window::Dimension::start(), ITensorInfo::strides_in_bytes(), ITensorInfo::tensor_shape(), Window::use_tensor_dimensions(), IKernel::window(), and Window::x().
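
The listing above splits each row into a vector loop that consumes 8 floats per step and a scalar loop for the remaining elements, then folds the carries together. The portable sketch below shows the same structure without NEON intrinsics. It is a simplification under stated assumptions: `min_max_row` is a hypothetical name, two-element arrays stand in for the `float32x2_t` carries, and the assignment of elements to lanes is simplified compared to the de-interleaving load in the kernel (this does not change the result, since min and max are order-independent).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>

// Portable sketch of the two-loop structure; returns {min, max} of in[0..n).
std::pair<float, float> min_max_row(const float *in, std::size_t n)
{
    assert(in != nullptr || n == 0);

    std::array<float, 2> carry_min;
    std::array<float, 2> carry_max;
    carry_min.fill(std::numeric_limits<float>::max());
    carry_max.fill(std::numeric_limits<float>::lowest());

    float tail_min = std::numeric_limits<float>::max();
    float tail_max = std::numeric_limits<float>::lowest();

    std::size_t x = 0;

    // "Vector" loop: 8 elements per step, folded into two carry lanes.
    // `x + 8 <= n` avoids the unsigned underflow that `x <= n - 8` would risk.
    for(; x + 8 <= n; x += 8)
    {
        for(std::size_t i = 0; i < 8; ++i)
        {
            carry_min[i % 2] = std::min(carry_min[i % 2], in[x + i]);
            carry_max[i % 2] = std::max(carry_max[i % 2], in[x + i]);
        }
    }

    // Leftover loop: fewer than 8 elements remain.
    for(; x < n; ++x)
    {
        tail_min = std::min(tail_min, in[x]);
        tail_max = std::max(tail_max, in[x]);
    }

    // Final reduction across the carry lanes and the scalar tail.
    return { std::min({ carry_min[0], carry_min[1], tail_min }),
             std::max({ carry_max[0], carry_max[1], tail_max }) };
}
```

Keeping separate vector and scalar carries, as the kernel does, means the leftover elements never have to be padded or loaded past the end of the row.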

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * output
)
static

Static function to check if the given info will lead to a valid configuration of NEMinMaxLayerKernel.

Parameters
[in] input  Input tensor info. Data type supported: F32.
[in] output  Output tensor info with shape [2, batches, ...] which stores the minimum and maximum values for each 3D input tensor. The dimensions from the second onwards must match the batched dimensions of the input tensor. Data type supported: F32.
Returns
A status.

Definition at line 109 of file NEMinMaxLayerKernel.cpp.

{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, output));
    ARM_COMPUTE_RETURN_ON_ERROR(std::get<0>(validate_and_configure_window(input->clone().get(), output->clone().get())));

    return Status{};
}

References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::validate_and_configure_window().
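
The shape relation stated in the parameters (output shape [2, batches, ...], with the batch dimensions taken from the input dimensions above the third) can be checked before any tensor is allocated. The following is a minimal sketch of such a check, not the body of `validate_arguments`, which is not shown on this page. `CheckResult`, `dim_or_one` and `check_min_max_shapes` are hypothetical names, and treating missing trailing dimensions as size 1 is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a status object: an empty message means success.
struct CheckResult
{
    std::string error;
    bool ok() const { return error.empty(); }
};

// Missing trailing dimensions are treated as size 1 (assumption of this sketch).
std::size_t dim_or_one(const std::vector<std::size_t> &shape, std::size_t i)
{
    return i < shape.size() ? shape[i] : 1;
}

// Checks that output = [2, batches, ...] where "batches, ..." are the
// input dimensions above the third.
CheckResult check_min_max_shapes(const std::vector<std::size_t> &input_shape,
                                 const std::vector<std::size_t> &output_shape)
{
    if(input_shape.size() < 3)
    {
        return CheckResult{ "input needs at least 3 dimensions" };
    }
    if(output_shape.empty() || output_shape[0] != 2)
    {
        return CheckResult{ "output dimension 0 must be 2 (min, max)" };
    }
    for(std::size_t i = 0; 3 + i < input_shape.size() || 1 + i < output_shape.size(); ++i)
    {
        if(dim_or_one(input_shape, 3 + i) != dim_or_one(output_shape, 1 + i))
        {
            return CheckResult{ "output batch dimensions must match input batch dimensions" };
        }
    }
    return CheckResult{};
}
```

Running a check of this kind on tensor metadata only, as validate() does with ITensorInfo, lets a caller reject a bad configuration without touching tensor memory.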


The documentation for this class was generated from the following files: