Compute Library
 21.02
NEMinMaxLayerKernel Class Reference

Interface for the kernel to perform min max search on a 3D tensor.

#include <NEMinMaxLayerKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel.

 NEMinMaxLayerKernel ()
 Default constructor.

 NEMinMaxLayerKernel (const NEMinMaxLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

NEMinMaxLayerKernel & operator= (const NEMinMaxLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 NEMinMaxLayerKernel (NEMinMaxLayerKernel &&)=delete
 Prevent instances of this class from being moved (as this class contains non-movable objects).

NEMinMaxLayerKernel & operator= (NEMinMaxLayerKernel &&)=delete
 Prevent instances of this class from being moved (as this class contains non-movable objects).

 ~NEMinMaxLayerKernel ()=default
 Default destructor.

void configure (const ITensor *input, ITensor *output)
 Initialise the kernel's input and outputs.

void reset ()
 Resets global minimum and maximum.

void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.

virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases we simply narrow the interface down to the legacy version.

virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.

Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.

virtual ~IKernel ()=default
 Destructor.

virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.

virtual BorderSize border_size () const
 The size of the border for that kernel.

const Window & window () const
 The maximum window the kernel can be executed on.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *output)
 Static function to check if given info will lead to a valid configuration of NEMinMaxLayerKernel.
 

Detailed Description

Interface for the kernel to perform min max search on a 3D tensor.

Definition at line 38 of file NEMinMaxLayerKernel.h.

Constructor & Destructor Documentation

◆ NEMinMaxLayerKernel() [1/3]

Default constructor.

Definition at line 91 of file NEMinMaxLayerKernel.cpp.

Referenced by NEMinMaxLayerKernel::name().

92  : _input(nullptr), _output(nullptr), _mtx()
93 {
94 }

◆ NEMinMaxLayerKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEMinMaxLayerKernel() [3/3]

Prevent instances of this class from being moved (As this class contains non movable objects)
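
The deleted copy and move operations can be illustrated with a small stand-alone sketch. It is not library code: MinMaxHolder is a hypothetical class, and it assumes the kernel's _mtx member behaves like a std::mutex, which can be neither copied nor moved.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// Hypothetical class mirroring the pattern described above: raw pointers
// plus a mutex, with copy and move operations explicitly deleted.
class MinMaxHolder
{
public:
    MinMaxHolder() : _input(nullptr), _output(nullptr), _mtx() {}
    MinMaxHolder(const MinMaxHolder &) = delete;
    MinMaxHolder &operator=(const MinMaxHolder &) = delete;
    MinMaxHolder(MinMaxHolder &&) = delete;
    MinMaxHolder &operator=(MinMaxHolder &&) = delete;
    ~MinMaxHolder() = default;

    // Returns true once both pointers have been set (never, in this sketch).
    bool is_configured()
    {
        std::lock_guard<std::mutex> lock(_mtx);
        return _input != nullptr && _output != nullptr;
    }

private:
    const float *_input;
    float       *_output;
    std::mutex   _mtx; // assumption: stands in for the kernel's _mtx member
};

// Compile-time confirmation that the type can be neither copied nor moved.
static_assert(!std::is_copy_constructible<MinMaxHolder>::value, "copy construction is deleted");
static_assert(!std::is_move_constructible<MinMaxHolder>::value, "move construction is deleted");
```

Because such an object cannot be copied or moved, code that needs to pass it around would typically hold it by reference or through a smart pointer.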

◆ ~NEMinMaxLayerKernel()

~NEMinMaxLayerKernel ( )
default

Default destructor.

Referenced by NEMinMaxLayerKernel::name().

Member Function Documentation

◆ configure()

void configure ( const ITensor * input, ITensor * output )

Initialise the kernel's input and outputs.

Note
output[0] = minimum
output[1] = maximum
Parameters
[in]  input   Input tensor with at least 3 dimensions. The dimensions above the third will be interpreted as batches. Data type supported: F32.
[out] output  Output tensor with shape [2, batches, ...] which stores the minimum and maximum value for each 3D input tensor. The dimensions above the second must match the batched dimensions of the input tensor. Data type supported: F32.
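
The output layout described above can be pictured with a scalar sketch. This is not the library's implementation: min_max_per_batch and its flat std::vector<float> arguments are hypothetical stand-ins for ITensor, and the sketch assumes each batch is one contiguous block of floats.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Compute Library: computes one [min, max]
// pair per batch, where each batch is a contiguous block of
// `elems_per_batch` floats (width * height * channels).
std::vector<float> min_max_per_batch(const std::vector<float> &input, std::size_t elems_per_batch)
{
    assert(elems_per_batch > 0 && input.size() % elems_per_batch == 0);
    const std::size_t batches = input.size() / elems_per_batch;

    std::vector<float> output(2 * batches);
    for(std::size_t b = 0; b < batches; ++b)
    {
        const auto first = input.begin() + static_cast<std::ptrdiff_t>(b * elems_per_batch);
        const auto last  = first + static_cast<std::ptrdiff_t>(elems_per_batch);
        const auto mm    = std::minmax_element(first, last);
        output[2 * b + 0] = *mm.first;  // output[0] = minimum of this batch
        output[2 * b + 1] = *mm.second; // output[1] = maximum of this batch
    }
    return output;
}
```

With two batches of four elements each, the result holds four floats: the minimum and maximum of the first batch, followed by those of the second.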

Definition at line 96 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ITensor::info(), arm_compute::test::validation::input, and arm_compute::validate_arguments().

Referenced by NEMinMaxLayerKernel::name().

 97 {
 98     ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);
 99     ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), output->info()));
100 
101     _input = input;
102     _output = output;
103 
104     auto win_config = validate_and_configure_window(input->info(), output->info());
105 
106     ARM_COMPUTE_ERROR_THROW_ON(std::get<0>(win_config));
107 
108     INEKernel::configure(std::get<1>(win_config));
109 }

◆ name()

◆ operator=() [1/2]

NEMinMaxLayerKernel& operator= ( const NEMinMaxLayerKernel & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

Referenced by NEMinMaxLayerKernel::name().

◆ operator=() [2/2]

NEMinMaxLayerKernel& operator= ( NEMinMaxLayerKernel &&  )
delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ reset()

void reset ( )

Resets global minimum and maximum.

Definition at line 194 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, Window::DimX, arm_compute::execute_window_loop(), ITensor::info(), arm_compute::support::cpp11::lowest(), Iterator::ptr(), Window::set(), ITensorInfo::tensor_shape(), and Window::use_tensor_dimensions().

Referenced by NEMinMaxLayerKernel::name().

195 {
196     ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
197 
198     float32x2_t reset_values = vdup_n_f32(0.0f);
199     reset_values = vset_lane_f32(std::numeric_limits<float>::max(), reset_values, 0);
200     reset_values = vset_lane_f32(std::numeric_limits<float>::lowest(), reset_values, 1);
201 
202     Window window_output;
203     window_output.use_tensor_dimensions(_output->info()->tensor_shape());
204     window_output.set(Window::DimX, Window::Dimension(0, 1, 1));
205 
206     Iterator output(_output, window_output);
207 
208     execute_window_loop(window_output, [&](const Coordinates &)
209     {
210         vst1_f32(reinterpret_cast<float *>(output.ptr()), reset_values);
211     },
212     output);
213 }
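
The effect of the listing above can be sketched without NEON intrinsics. The function below is a hypothetical scalar stand-in, not library code; it assumes the output is a flat buffer holding one [min, max] pair per batch.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical scalar stand-in for reset(): `output` holds one [min, max]
// pair per batch, stored contiguously.
void reset_min_max(std::vector<float> &output)
{
    assert(output.size() % 2 == 0);
    for(std::size_t i = 0; i < output.size(); i += 2)
    {
        // Minimum slot starts at the largest finite float, so any finite
        // input value will replace it.
        output[i + 0] = std::numeric_limits<float>::max();
        // Maximum slot starts at the most negative finite float.
        output[i + 1] = std::numeric_limits<float>::lowest();
    }
}
```

Seeding the pair this way means the first value merged into it always wins both comparisons, so no separate "first element" case is needed.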

◆ run()

void run ( const Window & window, const ThreadInfo & info )
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.

Reimplemented from ICPPKernel.

Definition at line 119 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, Window::DimX, Window::Dimension::end(), arm_compute::execute_window_loop(), ITensor::info(), arm_compute::test::validation::input, arm_compute::support::cpp11::lowest(), Iterator::ptr(), Window::set(), Window::Dimension::start(), ITensorInfo::strides_in_bytes(), ITensorInfo::tensor_shape(), Window::use_tensor_dimensions(), IKernel::window(), Window::x(), x_end, and x_start.

Referenced by NEMinMaxLayerKernel::name().

120 {
121     ARM_COMPUTE_UNUSED(info);
122     ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
123     ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);
124 
125     const int x_start = window.x().start();
126     const int x_end = window.x().end();
127 
128     Window window_output;
129     window_output.use_tensor_dimensions(_output->info()->tensor_shape());
130     window_output.set(Window::DimX, Window::Dimension(0, 1, 1));
131 
132     // Handle X dimension manually to split into two loops
133     // First one will use vector operations, second one processes the left over pixels
134     Window window_input(window);
135     window_input.set(Window::DimX, Window::Dimension(0, 1, 1));
136     window_input.set(3, Window::Dimension(0, 1, 1));
137 
138     Iterator input(_input, window_input);
139     Iterator output(_output, window_output);
140 
141     execute_window_loop(window_output, [&](const Coordinates & id_batch)
142     {
143         float32x2_t carry_min = vdup_n_f32(std::numeric_limits<float>::max());
144         float32x2_t carry_max = vdup_n_f32(std::numeric_limits<float>::lowest());
145 
146         float carry_min_scalar = std::numeric_limits<float>::max();
147         float carry_max_scalar = std::numeric_limits<float>::lowest();
148 
149         execute_window_loop(window_input, [&](const Coordinates &)
150         {
151             int x = x_start;
152             const auto in_ptr = reinterpret_cast<const float *>(input.ptr() + id_batch[1] * _input->info()->strides_in_bytes()[3]);
153 
154             // Vector loop
155             for(; x <= x_end - 8; x += 8)
156             {
157                 const float32x4x2_t pixels = vld2q_f32(in_ptr + x);
158                 const float32x4_t tmp_min1 = vminq_f32(pixels.val[0], pixels.val[1]);
159                 const float32x4_t tmp_max1 = vmaxq_f32(pixels.val[0], pixels.val[1]);
160                 const float32x2_t tmp_min2 = vmin_f32(vget_high_f32(tmp_min1), vget_low_f32(tmp_min1));
161                 const float32x2_t tmp_max2 = vmax_f32(vget_high_f32(tmp_max1), vget_low_f32(tmp_max1));
162                 carry_min = vmin_f32(tmp_min2, carry_min);
163                 carry_max = vmax_f32(tmp_max2, carry_max);
164             }
165 
166             // Process leftover pixels
167             for(; x < x_end; ++x)
168             {
169                 const float pixel = in_ptr[x];
170                 carry_min_scalar = std::min(pixel, carry_min_scalar);
171                 carry_max_scalar = std::max(pixel, carry_max_scalar);
172             }
173         },
174         input);
175 
176         // Reduce result
177         carry_min = vpmin_f32(carry_min, carry_min);
178         carry_max = vpmax_f32(carry_max, carry_max);
179         carry_min = vpmin_f32(carry_min, carry_min);
180         carry_max = vpmax_f32(carry_max, carry_max);
181 
182         // Extract max/min values
183         const float min_i = std::min(vget_lane_f32(carry_min, 0), carry_min_scalar);
184         const float max_i = std::max(vget_lane_f32(carry_max, 0), carry_max_scalar);
185 
186         auto out_ptr = reinterpret_cast<float *>(output.ptr());
187 
188         // Perform reduction of local min/max values
189         update_min_max(out_ptr, min_i, max_i);
190     },
191     output);
192 }
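
The structure of the listing above (a main loop that consumes 8 elements per step, a tail loop for the remainder, then a merge into the shared result) can be sketched in portable C++. This is a scalar approximation rather than the library's code: scan_row and merge_min_max are hypothetical names, and the assumption that update_min_max() takes a mutex before writing the output is inferred from the _mtx member, not stated on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <mutex>
#include <utility>

// Scalar sketch of the row scan: blocks of 8 elements first (where the
// kernel uses NEON vector operations), then the leftover elements.
std::pair<float, float> scan_row(const float *row, int x_start, int x_end)
{
    assert(row != nullptr && x_start <= x_end);
    float lo = std::numeric_limits<float>::max();
    float hi = std::numeric_limits<float>::lowest();

    int x = x_start;
    // Main loop: only runs while a full block of 8 elements remains.
    for(; x <= x_end - 8; x += 8)
    {
        for(int k = 0; k < 8; ++k)
        {
            lo = std::min(lo, row[x + k]);
            hi = std::max(hi, row[x + k]);
        }
    }
    // Tail loop: fewer than 8 elements left.
    for(; x < x_end; ++x)
    {
        lo = std::min(lo, row[x]);
        hi = std::max(hi, row[x]);
    }
    return { lo, hi };
}

// Assumption: merges a thread-local result into the shared [min, max] pair
// while holding a mutex, as update_min_max() is presumed to do.
void merge_min_max(std::mutex &mtx, float *out, float lo, float hi)
{
    std::lock_guard<std::mutex> lock(mtx);
    out[0] = std::min(out[0], lo);
    out[1] = std::max(out[1], hi);
}
```

Keeping a local result per thread and merging once at the end means the lock is taken once per batch rather than once per element.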

◆ validate()

Status validate ( const ITensorInfo * input, const ITensorInfo * output )
static

Static function to check if given info will lead to a valid configuration of NEMinMaxLayerKernel.

Parameters
[in]  input   Input tensor info. Data type supported: F32.
[in]  output  Output tensor info with shape [2, batches, ...] which stores the minimum and maximum values for each 3D input tensor. The dimensions above the second must match the batched dimensions of the input tensor. Data type supported: F32.
Returns
a status
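
The shape constraint stated in the parameters can be expressed as a small check. This is a sketch only: output_shape_matches and dim_or_one are hypothetical helpers, not the library's validate_arguments(), and the sketch assumes shapes are listed innermost dimension first with missing trailing dimensions treated as 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: dimensions past the end of a shape count as 1.
std::size_t dim_or_one(const std::vector<std::size_t> &shape, std::size_t i)
{
    return i < shape.size() ? shape[i] : 1;
}

// Hypothetical shape check: the input needs at least 3 dimensions, the
// output's first dimension must be 2 (min and max), and input dimensions
// 3, 4, ... must equal output dimensions 1, 2, ...
bool output_shape_matches(const std::vector<std::size_t> &input_shape,
                          const std::vector<std::size_t> &output_shape)
{
    if(input_shape.size() < 3 || dim_or_one(output_shape, 0) != 2)
    {
        return false;
    }
    const std::size_t n = std::max(input_shape.size(), output_shape.size() + 2);
    for(std::size_t i = 3; i < n; ++i)
    {
        if(dim_or_one(input_shape, i) != dim_or_one(output_shape, i - 2))
        {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, an input of shape [4, 4, 3, 5] pairs with an output of shape [2, 5], while an output of shape [2, 6] would be rejected.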

Definition at line 111 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::validate_arguments().

Referenced by NEMinMaxLayerKernel::name().

112 {
113     ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, output));
114     ARM_COMPUTE_RETURN_ON_ERROR(std::get<0>(validate_and_configure_window(input->clone().get(), output->clone().get())));
115 
116     return Status{};
117 }

The documentation for this class was generated from the following files: