#ifndef ARM_COMPUTE_CLSOFTMAXLAYERKERNEL_H
#define ARM_COMPUTE_CLSOFTMAXLAYERKERNEL_H

static const unsigned int _grid_size;
static const unsigned int _serial_vector_size;
static const unsigned int _parallel_vector_size;
Interface for max, shifting, exponentiating and summing the logits.
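The four steps named in this description (max, shift, exponentiate, sum) can be illustrated with a scalar CPU reference. This is only a sketch of the arithmetic, not the OpenCL kernel itself; the struct `MaxShiftExpSum` and the function `max_shift_exp_sum` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type for this sketch; it is not part of the library.
struct MaxShiftExpSum
{
    float              max;     // largest logit in the row
    std::vector<float> shifted; // exp(x - max) for each logit x
    float              sum;     // sum of the exponentiated values
};

// Scalar reference of the four steps on a single row of logits.
inline MaxShiftExpSum max_shift_exp_sum(const std::vector<float> &logits)
{
    assert(!logits.empty());

    MaxShiftExpSum r{};
    // 1. Max: find the largest logit.
    r.max = *std::max_element(logits.begin(), logits.end());
    r.shifted.reserve(logits.size());
    r.sum = 0.f;
    for(float x : logits)
    {
        // 2. Shift by the max, 3. exponentiate.
        const float e = std::exp(x - r.max);
        r.shifted.push_back(e);
        // 4. Accumulate the sum.
        r.sum += e;
    }
    return r;
}
```

Subtracting the maximum before exponentiating keeps every exponent at or below zero, so `std::exp` cannot overflow for large logits.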
const Window & window() const
The maximum window the kernel can be executed on.
DATA_TYPE sum(__global const DATA_TYPE *input)
Calculate sum of a vector.
CLLogits1DMaxShiftExpSumKernel()
Default constructor.
Store the tensor's metadata.
CLLogits1DMaxShiftExpSumKernel & operator=(const CLLogits1DMaxShiftExpSumKernel &)=delete
Prevent instances of this class from being copied (As this class contains pointers) ...
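A minimal sketch of the idiom follows, using a hypothetical class `PointerHoldingKernel` in place of the real kernel. The listing only shows the deleted copy assignment; the deleted copy constructor and the defaulted move operations are assumptions made for this sketch.

```cpp
#include <type_traits>

// Hypothetical stand-in for a kernel that keeps non-owning tensor pointers.
class PointerHoldingKernel
{
public:
    PointerHoldingKernel() = default;
    // Copying would duplicate raw pointers without clear ownership, so forbid it.
    PointerHoldingKernel(const PointerHoldingKernel &) = delete;
    PointerHoldingKernel &operator=(const PointerHoldingKernel &) = delete;
    // Assumption for this sketch: moving is still allowed.
    PointerHoldingKernel(PointerHoldingKernel &&) = default;
    PointerHoldingKernel &operator=(PointerHoldingKernel &&) = default;

    // True once both pointers have been set (never the case in this sketch).
    bool is_configured() const
    {
        return _input != nullptr && _output != nullptr;
    }

private:
    const void *_input{ nullptr };
    void       *_output{ nullptr };
};

// Compile-time checks that the copy operations are really unavailable.
static_assert(!std::is_copy_constructible<PointerHoldingKernel>::value, "copy construction must be deleted");
static_assert(!std::is_copy_assignable<PointerHoldingKernel>::value, "copy assignment must be deleted");
```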
Common interface for all the OpenCL kernels.
Copyright (c) 2017-2021 Arm Limited.
Interface for calculating the final step of the Softmax Layer where each logit value is multiplied by...
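The description above is truncated in this listing. Assuming the standard softmax definition, the final step scales each exponentiated value by the reciprocal of the row sum. The scalar sketch below illustrates that assumption only; `normalize_by_sum` is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scalar reference: scale each exponentiated value by 1 / sum.
inline std::vector<float> normalize_by_sum(const std::vector<float> &exp_values, float sum)
{
    assert(sum > 0.f);

    // Compute the reciprocal once so the loop uses a multiplication per element.
    const float inv_sum = 1.f / sum;

    std::vector<float> out;
    out.reserve(exp_values.size());
    for(float e : exp_values)
    {
        out.push_back(e * inv_sum);
    }
    return out;
}
```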
void run(const Window &window, cl::CommandQueue &queue) override
Enqueue the OpenCL kernel to process the given window on the passed OpenCL command queue...
Interface for OpenCL tensor.
static ParallelReductionInfo is_parallel_reduction(size_t size)
Checks if the given size is eligible for parallel reduction.
static Status validate(const ITensorInfo *input, const ITensorInfo *max, const ITensorInfo *output, const ITensorInfo *sum)
Static function to check if given info will lead to a valid configuration of CLLogits1DMaxShiftExpSum...
std::tuple< bool, unsigned int > ParallelReductionInfo
Info for whether a parallel reduction will be run and the vector size of the execution.
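One way such a check could be written is sketched below. The constant values and the eligibility threshold are assumptions chosen for illustration; this listing declares `_grid_size`, `_serial_vector_size` and `_parallel_vector_size` but does not show their values or the body of `is_parallel_reduction`.

```cpp
#include <cstddef>
#include <tuple>

// Same alias as in the listing: (run in parallel?, vector size to use).
using ParallelReductionInfo = std::tuple<bool, unsigned int>;

// Hypothetical values; the real constants are not shown in this listing.
constexpr unsigned int grid_size            = 64;
constexpr unsigned int serial_vector_size   = 8;
constexpr unsigned int parallel_vector_size = 4;

// Assumed rule: go parallel only when the row is large enough to give every
// work-item in the grid at least one serial vector of elements.
inline ParallelReductionInfo is_parallel_reduction_sketch(std::size_t size)
{
    const std::size_t threshold = static_cast<std::size_t>(grid_size) * serial_vector_size;
    const bool        parallel  = (size >= threshold) && (grid_size > 1);
    const unsigned int vector_size = parallel ? parallel_vector_size : serial_vector_size;
    return std::make_tuple(parallel, vector_size);
}
```

A caller can unpack the result with `std::get<0>` and `std::get<1>`, or with `std::tie`.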
Descriptor used by the softmax kernels.
Describe a multidimensional execution window.
void configure(const ICLTensor *input, ICLTensor *max, ICLTensor *output, ICLTensor *sum, const SoftmaxKernelInfo &info)
Set the input and output tensors.