Compute Library
 21.08
NEMinMaxLayerKernel Class Reference

Interface for the kernel to perform min max search on a 3D tensor.

#include <NEMinMaxLayerKernel.h>

Public Member Functions

const char * name () const override
 Name of the kernel.
 
 NEMinMaxLayerKernel ()
 Default constructor.
 
 NEMinMaxLayerKernel (const NEMinMaxLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
NEMinMaxLayerKernel & operator= (const NEMinMaxLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).
 
 NEMinMaxLayerKernel (NEMinMaxLayerKernel &&)=delete
 Prevent instances of this class from being moved (as this class contains non-movable objects).
 
NEMinMaxLayerKernel & operator= (NEMinMaxLayerKernel &&)=delete
 Prevent instances of this class from being moved (as this class contains non-movable objects).
 
 ~NEMinMaxLayerKernel ()=default
 Default destructor.
 
void configure (const ITensor *input, ITensor *output)
 Initialise the kernel's input and outputs.
 
void reset ()
 Resets global minimum and maximum.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
virtual void run_nd (const Window &window, const ThreadInfo &info, const Window &thread_locator)
 Legacy compatibility layer for implementations which do not support thread_locator. In these cases the interface is simply narrowed down to the legacy version.
 
virtual void run_op (ITensorPack &tensors, const Window &window, const ThreadInfo &info)
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 
bool is_window_configured () const
 Function to check if the embedded window of this kernel has been configured.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *output)
 Static function to check if given info will lead to a valid configuration of NEMinMaxLayerKernel.
 

Detailed Description

Interface for the kernel to perform min max search on a 3D tensor.

Definition at line 38 of file NEMinMaxLayerKernel.h.

Constructor & Destructor Documentation

◆ NEMinMaxLayerKernel() [1/3]

Default constructor.

Definition at line 89 of file NEMinMaxLayerKernel.cpp.

Referenced by NEMinMaxLayerKernel::name().

NEMinMaxLayerKernel::NEMinMaxLayerKernel()
    : _input(nullptr), _output(nullptr), _mtx()
{
}

◆ NEMinMaxLayerKernel() [2/3]

NEMinMaxLayerKernel ( const NEMinMaxLayerKernel & )
delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ NEMinMaxLayerKernel() [3/3]

NEMinMaxLayerKernel ( NEMinMaxLayerKernel && )
delete

Prevent instances of this class from being moved (as this class contains non-movable objects).

◆ ~NEMinMaxLayerKernel()

~NEMinMaxLayerKernel ( )
default

Default destructor.

Referenced by NEMinMaxLayerKernel::name().

Member Function Documentation

◆ configure()

void configure ( const ITensor * input,
                 ITensor * output
               )

Initialise the kernel's input and outputs.

Note
output[0] = minimum
output[1] = maximum
Parameters
[in]   input   Input tensor with at least 3 dimensions. Dimensions beyond the third are interpreted as batches. Data type supported: F32.
[out]  output  Output tensor with shape [2, batches, ...] which stores the minimum and maximum value for each 3D input tensor. The dimensions from the second onward must match the batch dimensions of the input tensor. Data type supported: F32.
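
The output layout described above can be illustrated with a small, portable sketch. It does not use Compute Library: `min_max_per_batch` and `elems_per_batch` are hypothetical names, and the input is assumed to be a densely packed float buffer in which each batch is one contiguous 3D tensor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Compute Library.
// `data` holds `batches` contiguous 3D tensors of `elems_per_batch` floats each.
// The result is a flat buffer laid out as [2, batches]:
//   out[2 * b]     = minimum of batch b
//   out[2 * b + 1] = maximum of batch b
std::vector<float> min_max_per_batch(const std::vector<float> &data, std::size_t elems_per_batch)
{
    assert(elems_per_batch > 0 && data.size() % elems_per_batch == 0);

    const std::size_t  batches = data.size() / elems_per_batch;
    std::vector<float> out(2 * batches);

    for(std::size_t b = 0; b < batches; ++b)
    {
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(b * elems_per_batch);
        const auto last  = first + static_cast<std::ptrdiff_t>(elems_per_batch);
        const auto mm    = std::minmax_element(first, last);

        out[2 * b]     = *mm.first;  // output[0] = minimum
        out[2 * b + 1] = *mm.second; // output[1] = maximum
    }
    return out;
}
```

Calling `min_max_per_batch(buffer, width * height * channels)` on a buffer holding N batches yields 2 * N floats, one (min, max) pair per batch.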

Definition at line 94 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, and ITensor::info().

Referenced by NEMinMaxLayerKernel::name().

void NEMinMaxLayerKernel::configure(const ITensor *input, ITensor *output)
{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), output->info()));

    _input  = input;
    _output = output;

    auto win_config = validate_and_configure_window(input->info(), output->info());

    ARM_COMPUTE_ERROR_THROW_ON(std::get<0>(win_config));

    INEKernel::configure(std::get<1>(win_config));
}

◆ name()

const char* name ( ) const
override

Name of the kernel.

◆ operator=() [1/2]

NEMinMaxLayerKernel& operator= ( const NEMinMaxLayerKernel & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

Referenced by NEMinMaxLayerKernel::name().

◆ operator=() [2/2]

NEMinMaxLayerKernel& operator= ( NEMinMaxLayerKernel &&  )
delete

Prevent instances of this class from being moved (As this class contains non movable objects)

◆ reset()

void reset ( )

Resets global minimum and maximum.

Definition at line 192 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, Window::DimX, arm_compute::execute_window_loop(), ITensor::info(), arm_compute::support::cpp11::lowest(), Iterator::ptr(), Window::set(), ITensorInfo::tensor_shape(), and Window::use_tensor_dimensions().

Referenced by NEMinMaxLayerKernel::name().

void NEMinMaxLayerKernel::reset()
{
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);

    // Lane 0 holds the running minimum, lane 1 the running maximum
    float32x2_t reset_values = vdup_n_f32(0.0f);
    reset_values             = vset_lane_f32(std::numeric_limits<float>::max(), reset_values, 0);
    reset_values             = vset_lane_f32(std::numeric_limits<float>::lowest(), reset_values, 1);

    Window window_output;
    window_output.use_tensor_dimensions(_output->info()->tensor_shape());
    window_output.set(Window::DimX, Window::Dimension(0, 1, 1));

    Iterator output(_output, window_output);

    execute_window_loop(window_output, [&](const Coordinates &)
    {
        vst1_f32(reinterpret_cast<float *>(output.ptr()), reset_values);
    },
    output);
}
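
The effect of the listing above can be sketched without NEON or the library's Window and Iterator types. In this minimal sketch, `reset_min_max` is a hypothetical name and a flat `std::vector<float>` laid out as [2, batches] stands in for the output tensor; only the choice of reset values is taken from the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for reset(): `out` is a flat [2, batches] buffer.
// Each pair is set to the identity elements of min and max, so the first
// real value merged in afterwards replaces both entries.
void reset_min_max(std::vector<float> &out)
{
    assert(out.size() % 2 == 0);

    for(std::size_t i = 0; i < out.size(); i += 2)
    {
        out[i]     = std::numeric_limits<float>::max();    // slot 0: running minimum
        out[i + 1] = std::numeric_limits<float>::lowest(); // slot 1: running maximum
    }
}
```

Starting the minimum at the largest float and the maximum at the lowest float means that any later `std::min` or `std::max` against real data overwrites the placeholder.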

◆ run()

void run ( const Window & window,
           const ThreadInfo & info
         )
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false then the passed window must be equal to window()
Note
The window has to be a region within the window returned by the window() method
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.
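
Because run() may be invoked on sub-windows of window(), several threads can produce partial results for the same output pair. The run() listing on this page ends by calling update_min_max, whose body is not shown here. The sketch below is an assumption about how such a merge could be made thread-safe with a mutex; `MinMaxAccumulator` and its members are hypothetical and are not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <mutex>

// Hypothetical accumulator, not the library's class: each thread reduces its
// own sub-window to a local (min, max) and then merges it into shared state.
class MinMaxAccumulator
{
public:
    // Merge one thread's local result; the lock serialises concurrent merges.
    void merge(float local_min, float local_max)
    {
        assert(local_min <= local_max);
        std::lock_guard<std::mutex> lock(_mtx);
        _min = std::min(_min, local_min);
        _max = std::max(_max, local_max);
    }

    float min() const { return _min; }
    float max() const { return _max; }

private:
    std::mutex _mtx;
    float      _min = std::numeric_limits<float>::max();
    float      _max = std::numeric_limits<float>::lowest();
};
```

Holding the lock only for the final merge keeps contention low, since the expensive per-element reduction runs without synchronisation.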

Reimplemented from ICPPKernel.

Definition at line 117 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, Window::DimX, Window::Dimension::end(), arm_compute::execute_window_loop(), ITensor::info(), arm_compute::support::cpp11::lowest(), Iterator::ptr(), Window::set(), Window::Dimension::start(), ITensorInfo::strides_in_bytes(), ITensorInfo::tensor_shape(), Window::use_tensor_dimensions(), IKernel::window(), and Window::x().

Referenced by NEMinMaxLayerKernel::name().

void NEMinMaxLayerKernel::run(const Window &window, const ThreadInfo &info)
{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);

    const int x_start = window.x().start();
    const int x_end   = window.x().end();

    Window window_output;
    window_output.use_tensor_dimensions(_output->info()->tensor_shape());
    window_output.set(Window::DimX, Window::Dimension(0, 1, 1));

    // Handle X dimension manually to split into two loops
    // First one will use vector operations, second one processes the leftover pixels
    Window window_input(window);
    window_input.set(Window::DimX, Window::Dimension(0, 1, 1));
    window_input.set(3, Window::Dimension(0, 1, 1));

    Iterator input(_input, window_input);
    Iterator output(_output, window_output);

    execute_window_loop(window_output, [&](const Coordinates & id_batch)
    {
        float32x2_t carry_min = vdup_n_f32(std::numeric_limits<float>::max());
        float32x2_t carry_max = vdup_n_f32(std::numeric_limits<float>::lowest());

        float carry_min_scalar = std::numeric_limits<float>::max();
        float carry_max_scalar = std::numeric_limits<float>::lowest();

        execute_window_loop(window_input, [&](const Coordinates &)
        {
            int        x      = x_start;
            const auto in_ptr = reinterpret_cast<const float *>(input.ptr() + id_batch[1] * _input->info()->strides_in_bytes()[3]);

            // Vector loop
            for(; x <= x_end - 8; x += 8)
            {
                const float32x4x2_t pixels   = vld2q_f32(in_ptr + x);
                const float32x4_t   tmp_min1 = vminq_f32(pixels.val[0], pixels.val[1]);
                const float32x4_t   tmp_max1 = vmaxq_f32(pixels.val[0], pixels.val[1]);
                const float32x2_t   tmp_min2 = vmin_f32(vget_high_f32(tmp_min1), vget_low_f32(tmp_min1));
                const float32x2_t   tmp_max2 = vmax_f32(vget_high_f32(tmp_max1), vget_low_f32(tmp_max1));
                carry_min                    = vmin_f32(tmp_min2, carry_min);
                carry_max                    = vmax_f32(tmp_max2, carry_max);
            }

            // Process leftover pixels
            for(; x < x_end; ++x)
            {
                const float pixel = in_ptr[x];
                carry_min_scalar  = std::min(pixel, carry_min_scalar);
                carry_max_scalar  = std::max(pixel, carry_max_scalar);
            }
        },
        input);

        // Reduce result
        carry_min = vpmin_f32(carry_min, carry_min);
        carry_max = vpmax_f32(carry_max, carry_max);
        carry_min = vpmin_f32(carry_min, carry_min);
        carry_max = vpmax_f32(carry_max, carry_max);

        // Extract max/min values
        const float min_i = std::min(vget_lane_f32(carry_min, 0), carry_min_scalar);
        const float max_i = std::max(vget_lane_f32(carry_max, 0), carry_max_scalar);

        auto out_ptr = reinterpret_cast<float *>(output.ptr());

        // Perform reduction of local min/max values
        update_min_max(out_ptr, min_i, max_i);
    },
    output);
}
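
The loop structure of the listing above (a main loop over blocks of 8 elements followed by a scalar loop for the leftovers) can be sketched in portable C++. This is an illustration under stated assumptions, not the kernel itself: `row_min_max` is a hypothetical name, and plain arrays of 8 lanes stand in for the NEON registers.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>

// Portable sketch (no NEON): reduce row[x_start, x_end) to (min, max) with a
// blocked main loop and a scalar tail, mirroring the two loops in run().
std::pair<float, float> row_min_max(const float *row, int x_start, int x_end)
{
    assert(x_start <= x_end);

    // Per-lane running results; stand-ins for the vector carry registers.
    float lane_min[8];
    float lane_max[8];
    std::fill(lane_min, lane_min + 8, std::numeric_limits<float>::max());
    std::fill(lane_max, lane_max + 8, std::numeric_limits<float>::lowest());

    // Main loop: consume 8 elements per iteration.
    int x = x_start;
    for(; x <= x_end - 8; x += 8)
    {
        for(int lane = 0; lane < 8; ++lane)
        {
            lane_min[lane] = std::min(lane_min[lane], row[x + lane]);
            lane_max[lane] = std::max(lane_max[lane], row[x + lane]);
        }
    }

    // Tail loop: fewer than 8 elements remain.
    float tail_min = std::numeric_limits<float>::max();
    float tail_max = std::numeric_limits<float>::lowest();
    for(; x < x_end; ++x)
    {
        tail_min = std::min(tail_min, row[x]);
        tail_max = std::max(tail_max, row[x]);
    }

    // Horizontal reduction of the lanes, then combine with the tail result.
    const float blocked_min = *std::min_element(lane_min, lane_min + 8);
    const float blocked_max = *std::max_element(lane_max, lane_max + 8);
    return { std::min(blocked_min, tail_min), std::max(blocked_max, tail_max) };
}
```

Keeping separate accumulators for the blocked part and the tail, and combining them only at the end, matches the way the listing keeps vector and scalar carries apart until the final extraction.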

◆ validate()

Status validate ( const ITensorInfo * input,
                  const ITensorInfo * output
                )
static

Static function to check if given info will lead to a valid configuration of NEMinMaxLayerKernel.

Parameters
[in]  input   Input tensor info. Data type supported: F32.
[in]  output  Output tensor info with shape [2, batches, ...] which stores the minimum and maximum values for each 3D input tensor. The dimensions from the second onward must match the batch dimensions of the input tensor. Data type supported: F32.
Returns
A status.

Definition at line 109 of file NEMinMaxLayerKernel.cpp.

References ARM_COMPUTE_RETURN_ON_ERROR, and ICloneable< T >::clone().

Referenced by NEMinMaxLayerKernel::name().

Status NEMinMaxLayerKernel::validate(const ITensorInfo *input, const ITensorInfo *output)
{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, output));
    ARM_COMPUTE_RETURN_ON_ERROR(std::get<0>(validate_and_configure_window(input->clone().get(), output->clone().get())));

    return Status{};
}
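
The body of validate_arguments is not shown on this page. As a rough illustration of the shape constraints stated in the parameter descriptions, the sketch below checks them on plain vectors of dimension sizes. `CheckResult`, `check_min_max_shapes`, and `dim_or_one` are hypothetical names, and treating missing trailing dimensions as size 1 is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified result type; the library's Status is richer.
struct CheckResult
{
    bool        ok;
    std::string message;
};

// Missing trailing dimensions are treated as size 1 (assumption of this sketch).
static std::size_t dim_or_one(const std::vector<std::size_t> &shape, std::size_t idx)
{
    return idx < shape.size() ? shape[idx] : 1;
}

// Shapes list the innermost dimension first, e.g. {W, H, C, N} and {2, N}.
CheckResult check_min_max_shapes(const std::vector<std::size_t> &in_shape,
                                 const std::vector<std::size_t> &out_shape)
{
    if(in_shape.size() < 3)
    {
        return { false, "input must have at least 3 dimensions" };
    }
    if(out_shape.empty() || out_shape[0] != 2)
    {
        return { false, "output dimension 0 must be 2 (min, max)" };
    }

    // Input batch dimensions start at index 3, output batch dimensions at index 1.
    const std::size_t in_batch_dims  = in_shape.size() - 3;
    const std::size_t out_batch_dims = out_shape.size() - 1;
    const std::size_t n              = std::max(in_batch_dims, out_batch_dims);
    for(std::size_t i = 0; i < n; ++i)
    {
        if(dim_or_one(in_shape, 3 + i) != dim_or_one(out_shape, 1 + i))
        {
            return { false, "output batch dimensions must match the input" };
        }
    }
    return { true, "" };
}
```

Running such a check before configuring anything follows the same pattern as the static validate() above: shapes are rejected up front, before any tensor memory is touched.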

The documentation for this class was generated from the following files: