Compute Library
 21.02
GCLogits1DShiftExpSumKernel Class Reference

Interface for shifting the logits values around the max value and exponentiating the result.

#include <GCSoftmaxLayerKernel.h>


Public Member Functions

 GCLogits1DShiftExpSumKernel ()
 Default constructor.
 
 GCLogits1DShiftExpSumKernel (const GCLogits1DShiftExpSumKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
GCLogits1DShiftExpSumKernel & operator= (const GCLogits1DShiftExpSumKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
 GCLogits1DShiftExpSumKernel (GCLogits1DShiftExpSumKernel &&)=default
 Allow instances of this class to be moved.
 
GCLogits1DShiftExpSumKernel & operator= (GCLogits1DShiftExpSumKernel &&)=default
 Allow instances of this class to be moved.
 
void configure (const IGCTensor *input, const IGCTensor *max, IGCTensor *output, IGCTensor *sum)
 Set the input and output tensors.
 
void run (const Window &window) override
 Enqueue the OpenGL ES shader to process the given window.
 
Public Member Functions inherited from IGCKernel
 IGCKernel ()
 Constructor.
 
GCKernel & kernel ()
 Returns a reference to the GLES kernel of this object.
 
void add_1D_tensor_argument (unsigned int &idx, const IGCTensor *tensor, const unsigned int binding_point, const Window &window)
 Add the passed 1D tensor's parameters to the object's kernel's arguments starting from the index idx.
 
void add_2D_tensor_argument (unsigned int &idx, const IGCTensor *tensor, const unsigned int binding_point, const Window &window)
 Add the passed 2D tensor's parameters to the object's kernel's arguments starting from the index idx.
 
void add_3D_tensor_argument (unsigned int &idx, const IGCTensor *tensor, const unsigned int binding_point, const Window &window)
 Add the passed 3D tensor's parameters to the object's kernel's arguments starting from the index idx.
 
unsigned int num_arguments_per_1D_tensor () const
 Returns the number of arguments enqueued per 1D tensor object.
 
unsigned int num_arguments_per_2D_tensor () const
 Returns the number of arguments enqueued per 2D tensor object.
 
unsigned int num_arguments_per_3D_tensor () const
 Returns the number of arguments enqueued per 3D tensor object.
 
void set_lws_hint (gles::NDRange &lws_hint)
 Set the Local-Workgroup-Size hint.
 
void set_target (GPUTarget target)
 Set the targeted GPU architecture.
 
GPUTarget get_target () const
 Get the targeted GPU architecture.
 
Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 

Detailed Description

Interface for shifting the logits values around the max value and exponentiating the result.

Definition at line 46 of file GCSoftmaxLayerKernel.h.
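
As an illustration of the operation described above, the following is a minimal CPU sketch for a single row of logits. It is not taken from Compute Library: the `ShiftExpSumResult` struct and the `shift_exp_sum` function are hypothetical names, and the sketch assumes the row maximum has already been computed elsewhere, in the same way that `configure()` receives `max` as an input tensor. Subtracting the maximum before exponentiating keeps every exponent at or below zero, which avoids overflow.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for one row of logits; not part of Compute Library.
struct ShiftExpSumResult
{
    std::vector<float> shifted_exp; // exp(x_i - max) per element (role of the "output" tensor)
    float              sum;         // sum of the values above (role of the "sum" tensor)
};

// max_value is assumed to be the maximum of the row, computed beforehand.
ShiftExpSumResult shift_exp_sum(const std::vector<float> &logits, float max_value)
{
    assert(!logits.empty());

    ShiftExpSumResult result{ std::vector<float>(logits.size()), 0.0f };
    for(std::size_t i = 0; i < logits.size(); ++i)
    {
        // Shift around the maximum, then exponentiate
        result.shifted_exp[i] = std::exp(logits[i] - max_value);
        // Accumulate the row sum
        result.sum += result.shifted_exp[i];
    }
    return result;
}
```

Calling `shift_exp_sum({1.0f, 2.0f, 3.0f}, 3.0f)` yields one exponentiated value per input element plus their sum; dividing each element by that sum would give the softmax of the row.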

Constructor & Destructor Documentation

◆ GCLogits1DShiftExpSumKernel() [1/3]

Default constructor.

Definition at line 105 of file GCSoftmaxLayerKernel.cpp.

GCLogits1DShiftExpSumKernel::GCLogits1DShiftExpSumKernel()
    : _input(nullptr), _max(nullptr), _output(nullptr), _sum(nullptr)
{
}

◆ GCLogits1DShiftExpSumKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ GCLogits1DShiftExpSumKernel() [3/3]

Allow instances of this class to be moved.

Member Function Documentation

◆ configure()

void configure ( const IGCTensor * input,
                 const IGCTensor * max,
                 IGCTensor *       output,
                 IGCTensor *       sum
               )

Set the input and output tensors.

Parameters
[in]   input   Source tensor. Data types supported: F16/F32
[in]   max     Max values tensor. Data types supported: same as input
[out]  output  Destination tensor. Data types supported: same as input
[out]  sum     Sum of 1D logits tensor. Data types supported: same as input

Definition at line 110 of file GCSoftmaxLayerKernel.cpp.

References ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_MISMATCHING_DATA_TYPES, ARM_COMPUTE_ERROR_ON_MISMATCHING_SHAPES, ARM_COMPUTE_ERROR_ON_NULLPTR, arm_compute::auto_init_if_empty(), arm_compute::calculate_max_window(), arm_compute::ceil_to_multiple(), GCKernelLibrary::create_kernel(), ITensorInfo::data_type(), ITensorInfo::dimension(), arm_compute::F16, arm_compute::F32, GCKernelLibrary::get(), ITensor::info(), arm_compute::test::validation::input, IGCKernel::num_arguments_per_3D_tensor(), num_elems_processed_per_iteration, sum(), ITensorInfo::tensor_shape(), arm_compute::support::cpp11::to_string(), arm_compute::update_window_and_padding(), and ITensorInfo::valid_region().

Referenced by GCSoftmaxLayer::configure().

void GCLogits1DShiftExpSumKernel::configure(const IGCTensor *input, const IGCTensor *max, IGCTensor *output, IGCTensor *sum)
{
    // ... (one line missing from this listing)
    ARM_COMPUTE_ERROR_ON_NULLPTR(max, sum, output);

    // Output auto initialization if not yet initialized
    auto_init_if_empty(*sum->info(), max->info()->tensor_shape(), 1, input->info()->data_type());
    auto_init_if_empty(*output->info(), input->info()->tensor_shape(), 1, input->info()->data_type());

    ARM_COMPUTE_ERROR_ON_MISMATCHING_DATA_TYPES(input, output, max, sum);
    // ... (two lines missing from this listing)

    _input  = input;
    _max    = max;
    _output = output;
    _sum    = sum;

    // Set build options
    std::set<std::string> build_opts;
    std::string           dt_name = (input->info()->data_type() == DataType::F32) ? "DATA_TYPE_FP32" : "DATA_TYPE_FP16";
    build_opts.insert("#define " + dt_name);
    build_opts.emplace("#define LOCAL_SIZE_X " + support::cpp11::to_string(1));
    build_opts.emplace("#define LOCAL_SIZE_Y " + support::cpp11::to_string(1));
    build_opts.emplace("#define LOCAL_SIZE_Z " + support::cpp11::to_string(1));
    build_opts.insert("#define SOFTMAX_LAYER_SHIFT_EXP_SUM");

    // Tell the kernel that the width is not a multiple of 8
    if((input->info()->dimension(0) % 8) != 0)
    {
        build_opts.insert("#define NON_MULTIPLE_OF_8");
    }

    // Create kernel
    _kernel = static_cast<GCKernel>(GCKernelLibrary::get().create_kernel("softmax_layer_shift_exp_sum", build_opts));

    // Set fixed arguments
    unsigned int idx = 4 * num_arguments_per_3D_tensor(); // Skip the input and output parameters
    _kernel.set_argument(idx++, input->info()->dimension(0));

    // Configure window
    // The kernel loops over all elements in steps of 8
    const unsigned int num_elems_processed_per_iteration = ceil_to_multiple(input->info()->dimension(0), 8);
    unsigned int       num_elems_written_per_iteration   = 1;
    if(input->info()->data_type() == DataType::F16)
    {
        num_elems_written_per_iteration = 2;
    }

    Window win = calculate_max_window(*input->info(), Steps(num_elems_processed_per_iteration));

    // ... (one line missing from this listing)
    AccessWindowHorizontal max_access(max->info(), 0, num_elems_written_per_iteration);
    // ... (one line missing from this listing)
    AccessWindowHorizontal sum_access(sum->info(), 0, num_elems_written_per_iteration);

    update_window_and_padding(win, input_access, max_access, output_access, sum_access);

    output_access.set_valid_region(win, input->info()->valid_region());
    sum_access.set_valid_region(win, ValidRegion(Coordinates(), sum->info()->tensor_shape()));

    IGCKernel::configure(win);
}
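
The build-option and window-step decisions made in configure() depend only on the data type and on the row width, so they can be sketched in isolation. The snippet below is a standalone approximation, not library code: the `DataType` enum is a local stand-in for arm_compute::DataType, `make_build_options` is a hypothetical helper, and `ceil_to_multiple` is re-implemented here on the assumption that it rounds up to the next multiple of the divisor.

```cpp
#include <cassert>
#include <set>
#include <string>

// Local stand-in for arm_compute::DataType (assumption for this sketch).
enum class DataType
{
    F16,
    F32
};

// Smallest multiple of divisor that is larger than or equal to value.
unsigned int ceil_to_multiple(unsigned int value, unsigned int divisor)
{
    assert(divisor > 0);
    return ((value + divisor - 1) / divisor) * divisor;
}

// Hypothetical helper collecting the same shader defines that configure() builds.
std::set<std::string> make_build_options(DataType data_type, unsigned int width)
{
    std::set<std::string> build_opts;
    build_opts.insert(data_type == DataType::F32 ? "#define DATA_TYPE_FP32" : "#define DATA_TYPE_FP16");
    build_opts.insert("#define LOCAL_SIZE_X " + std::to_string(1));
    build_opts.insert("#define LOCAL_SIZE_Y " + std::to_string(1));
    build_opts.insert("#define LOCAL_SIZE_Z " + std::to_string(1));
    build_opts.insert("#define SOFTMAX_LAYER_SHIFT_EXP_SUM");

    // The shader needs a separate path when the width is not a multiple of 8
    if((width % 8) != 0)
    {
        build_opts.insert("#define NON_MULTIPLE_OF_8");
    }
    return build_opts;
}
```

For a row of width 10, `ceil_to_multiple(10, 8)` gives a window step of 16 and the option set contains `NON_MULTIPLE_OF_8`; for a width of 16 the step stays at 16 and that define is absent.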

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window )
override virtual

Enqueue the OpenGL ES shader to process the given window.

Parameters
[in]  window  Region on which to execute the kernel. (Must be a valid region of the window returned by window()).

Implements IGCKernel.

Definition at line 174 of file GCSoftmaxLayerKernel.cpp.

References IGCKernel::add_3D_tensor_argument(), ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, Window::collapse_if_possible(), Window::DimZ, arm_compute::enqueue(), Window::first_slice_window_3D(), arm_compute::test::validation::reference::slice(), Window::slide_window_slice_3D(), and IKernel::window().

void GCLogits1DShiftExpSumKernel::run(const Window &window)
{
    // ... (two lines missing from this listing)

    Window window_collapsed = window.collapse_if_possible(IGCKernel::window(), Window::DimZ);
    Window slice            = window_collapsed.first_slice_window_3D();

    _kernel.use();

    do
    {
        unsigned int idx     = 0;
        unsigned int binding = 1; // SSBO binding starts from 1.
        // Set inputs
        add_3D_tensor_argument(idx, _input, binding++, slice);
        add_3D_tensor_argument(idx, _max, binding++, slice);
        add_3D_tensor_argument(idx, _output, binding++, slice);
        add_3D_tensor_argument(idx, _sum, binding++, slice);
        _kernel.update_shader_params();
        enqueue(*this, slice);
    }
    while(window_collapsed.slide_window_slice_3D(slice));
}
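
The argument bookkeeping inside the loop can be illustrated with a small mock. This is a sketch under assumptions, not library code: `ArgumentRecorder` and `bind_softmax_tensors` are hypothetical, and the mock assumes that each add_3D_tensor_argument() call advances `idx` by num_arguments_per_3D_tensor(), which is what the `4 * num_arguments_per_3D_tensor()` offset used for the fixed argument in configure() suggests. The per-tensor argument count is passed in because its real value is not stated on this page.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical recorder standing in for IGCKernel::add_3D_tensor_argument().
struct ArgumentRecorder
{
    unsigned int args_per_3d_tensor;                          // stand-in for num_arguments_per_3D_tensor()
    std::vector<std::pair<unsigned int, unsigned int>> calls; // (first argument index, binding point)

    void add_3D_tensor_argument(unsigned int &idx, unsigned int binding_point)
    {
        calls.emplace_back(idx, binding_point);
        idx += args_per_3d_tensor; // assumed behaviour: idx moves past this tensor's arguments
    }
};

// Mirrors the body of the do/while loop: input, max, output and sum, in that order.
// Returns the index of the first argument slot after the four tensors.
unsigned int bind_softmax_tensors(ArgumentRecorder &recorder)
{
    assert(recorder.args_per_3d_tensor > 0);

    unsigned int idx     = 0;
    unsigned int binding = 1; // SSBO binding starts from 1
    for(int tensor = 0; tensor < 4; ++tensor)
    {
        recorder.add_3D_tensor_argument(idx, binding++);
    }
    return idx;
}
```

With an illustrative count of 3 arguments per tensor, the four tensors occupy indices 0 to 11 with binding points 1 to 4, and the returned index 12 is where a fixed argument such as the row width would sit.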

The documentation for this class was generated from the following files: