Compute Library
 19.08
NEHeightConcatenateLayerKernel Class Reference

Interface for the height concatenate kernel.

#include <NEHeightConcatenateLayerKernel.h>

Public Member Functions

const char * name () const override
 Name of the kernel.

 NEHeightConcatenateLayerKernel ()
 Default constructor.

 NEHeightConcatenateLayerKernel (const NEHeightConcatenateLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

NEHeightConcatenateLayerKernel & operator= (const NEHeightConcatenateLayerKernel &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 NEHeightConcatenateLayerKernel (NEHeightConcatenateLayerKernel &&)=default
 Allow instances of this class to be moved.

NEHeightConcatenateLayerKernel & operator= (NEHeightConcatenateLayerKernel &&)=default
 Allow instances of this class to be moved.

 ~NEHeightConcatenateLayerKernel ()=default
 Default destructor.

void configure (const ITensor *input, unsigned int height_offset, ITensor *output)
 Initialise the kernel's inputs and output.

void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
- Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.

- Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.

virtual ~IKernel ()=default
 Destructor.

virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.

virtual BorderSize border_size () const
 The size of the border for that kernel.

const Window & window () const
 The maximum window the kernel can be executed on.
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, unsigned int height_offset, const ITensorInfo *output)
 Static function to check if given info will lead to a valid configuration of NEHeightConcatenateLayerKernel.
 

Detailed Description

Interface for the height concatenate kernel.

The input tensor is copied into the output tensor, starting at a given offset along the Y (height) axis.

Definition at line 38 of file NEHeightConcatenateLayerKernel.h.
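
As an illustration of what concatenating along the height means, the following is a minimal sketch and not the library's implementation. It assumes a hypothetical row-major `Tensor2D` type and the helper names `concatenate_height` and `concatenate_two`, none of which exist in Compute Library; the real kernel works on `ITensor` objects with byte strides and vectorised loads and stores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal 2D tensor: row-major, `width` elements per row.
struct Tensor2D
{
    std::size_t        width;
    std::size_t        height;
    std::vector<float> data; // size == width * height

    Tensor2D(std::size_t w, std::size_t h) : width(w), height(h), data(w * h, 0.f) {}

    float &at(std::size_t x, std::size_t y) { return data[y * width + x]; }
    float at(std::size_t x, std::size_t y) const { return data[y * width + x]; }
};

// Copy every row of `input` into `output`, starting at row `height_offset`.
void concatenate_height(const Tensor2D &input, std::size_t height_offset, Tensor2D &output)
{
    assert(input.width == output.width);
    assert(height_offset + input.height <= output.height);

    for(std::size_t y = 0; y < input.height; ++y)
    {
        for(std::size_t x = 0; x < input.width; ++x)
        {
            output.at(x, y + height_offset) = input.at(x, y);
        }
    }
}

// Concatenate two inputs: the second one starts where the first one ends.
Tensor2D concatenate_two(const Tensor2D &a, const Tensor2D &b)
{
    Tensor2D out(a.width, a.height + b.height);
    concatenate_height(a, 0, out);
    concatenate_height(b, a.height, out);
    return out;
}
```

In this sketch each input needs its own call with its own offset, which mirrors how one kernel instance is configured per input tensor with a distinct `height_offset`.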

Constructor & Destructor Documentation

◆ NEHeightConcatenateLayerKernel() [1/3]

Default constructor.

Definition at line 77 of file NEHeightConcatenateLayerKernel.cpp.

    : _input(nullptr), _output(nullptr), _height_offset(0)
{
}

◆ NEHeightConcatenateLayerKernel() [2/3]

Prevent instances of this class from being copied (as this class contains pointers).

◆ NEHeightConcatenateLayerKernel() [3/3]

Allow instances of this class to be moved.

◆ ~NEHeightConcatenateLayerKernel()

Default destructor.
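
The combination of deleted copy operations and defaulted move operations can be sketched in isolation. The class below is a hypothetical stand-in (`PointerHoldingKernel` and `DummyTensor` are illustrative names, not library types) that stores the same three members as the constructor's initialiser list. Note that a defaulted move of raw pointers simply copies the pointer values; the sketch makes no claim about the state of the moved-from object.

```cpp
#include <cassert>
#include <type_traits>
#include <utility> // std::move, used when moving a kernel

// Hypothetical stand-in for a tensor type; the class below only stores pointers to it.
struct DummyTensor
{
};

// Sketch of a kernel-like class that holds non-owning pointers.
class PointerHoldingKernel
{
public:
    PointerHoldingKernel() : _input(nullptr), _output(nullptr), _height_offset(0) {}

    // Copying is disabled because the class contains pointers.
    PointerHoldingKernel(const PointerHoldingKernel &) = delete;
    PointerHoldingKernel &operator=(const PointerHoldingKernel &) = delete;

    // Moving is allowed.
    PointerHoldingKernel(PointerHoldingKernel &&) = default;
    PointerHoldingKernel &operator=(PointerHoldingKernel &&) = default;

    ~PointerHoldingKernel() = default;

    void configure(const DummyTensor *input, unsigned int height_offset, DummyTensor *output)
    {
        assert(input != nullptr && output != nullptr);
        _input         = input;
        _output        = output;
        _height_offset = height_offset;
    }

    bool is_configured() const { return _input != nullptr && _output != nullptr; }
    unsigned int height_offset() const { return _height_offset; }

private:
    const DummyTensor *_input;
    DummyTensor       *_output;
    unsigned int       _height_offset;
};

// Compile-time checks of the copy/move policy.
static_assert(!std::is_copy_constructible<PointerHoldingKernel>::value, "copy construction is deleted");
static_assert(!std::is_copy_assignable<PointerHoldingKernel>::value, "copy assignment is deleted");
static_assert(std::is_move_constructible<PointerHoldingKernel>::value, "move construction is allowed");
static_assert(std::is_move_assignable<PointerHoldingKernel>::value, "move assignment is allowed");
```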

Member Function Documentation

◆ configure()

void configure (const ITensor *input, unsigned int height_offset, ITensor *output)

Initialise the kernel's inputs and output.

Parameters
[in]      input          Input tensor. Data types supported: U8/S8/QASYMM8/U16/S16/F16/U32/S32/F32.
[in]      height_offset  The starting offset on the Y axis for the output tensor.
[in,out]  output         Output tensor. Data types supported: same as input.

Definition at line 82 of file NEHeightConcatenateLayerKernel.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(input, output);
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(input->info(), height_offset, output->info()));

    _input         = input;
    _output        = output;
    _height_offset = height_offset;

    // Configure kernel window
    auto win_config = validate_and_configure_window(input->info(), output->info());
    ARM_COMPUTE_ERROR_THROW_ON(std::get<0>(win_config));

    INEKernel::configure(std::get<1>(win_config));

    // Set output valid region
    output->info()->set_valid_region(ValidRegion(Coordinates(), output->info()->tensor_shape()));
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ITensor::info(), ITensorInfo::set_valid_region(), ITensorInfo::tensor_shape(), and arm_compute::validate_and_configure_window().

◆ name()

const char * name () const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 41 of file NEHeightConcatenateLayerKernel.h.

{
    return "NEHeightConcatenateLayerKernel";
}

◆ operator=() [1/2]

Prevent instances of this class from being copied (as this class contains pointers).

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run (const Window &window, const ThreadInfo &info)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false, then the passed window must be equal to window().
Note
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window().)
[in]  info    Info about executing thread and CPU.

Implements ICPPKernel.

Definition at line 108 of file NEHeightConcatenateLayerKernel.cpp.

{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);

    // Offset output pointer to the correct position
    uint8_t *output_ptr = _output->buffer() + _output->info()->offset_first_element_in_bytes() + _height_offset * _output->info()->strides_in_bytes()[Window::DimY];

    // Create iterators
    Iterator                       input(_input, window);
    Iterator                       output(_output, window);
    const DataType                 dt           = _input->info()->data_type();
    const UniformQuantizationInfo &input_qinfo  = _input->info()->quantization_info().uniform();
    const UniformQuantizationInfo &output_qinfo = _output->info()->quantization_info().uniform();
    if(dt == DataType::QASYMM8 && input_qinfo != output_qinfo)
    {
        // Quantization parameters differ: requantize while copying
        execute_window_loop(window, [&](const Coordinates &)
        {
            vst1q_u8(output_ptr + output.offset(), vquantize(vdequantize(vld1q_u8(input.ptr()), input_qinfo), output_qinfo));
        },
        input, output);
    }
    else
    {
        // Same quantization parameters or non-quantized data: plain vector copy
        execute_window_loop(window, [&](const Coordinates &)
        {
            const auto in_ptr  = input.ptr();
            const auto out_ptr = output_ptr + output.offset();

            wrapper::vstore(out_ptr, wrapper::vloadq(in_ptr));
        },
        input, output);
    }
}
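
The output pointer computation at the top of run() can be illustrated on its own. The following is a rough sketch under stated assumptions: `Layout2D`, `row_start` and `element_ptr` are hypothetical names introduced here, and the layout is reduced to two dimensions. It only shows the byte arithmetic: skip the leading offset of the allocation, then skip `height_offset` rows using the Y stride.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified description of how a 2D tensor is laid out in memory.
struct Layout2D
{
    std::size_t offset_first_element_in_bytes; // bytes between the start of the allocation and the first element
    std::size_t stride_x_in_bytes;             // bytes between horizontally adjacent elements
    std::size_t stride_y_in_bytes;             // bytes between the starts of consecutive rows
};

// Pointer to the first element of row `height_offset`.
inline std::uint8_t *row_start(std::uint8_t *buffer, const Layout2D &layout, unsigned int height_offset)
{
    return buffer + layout.offset_first_element_in_bytes + height_offset * layout.stride_y_in_bytes;
}

// Pointer to element (x, y), expressed relative to the first element.
inline std::uint8_t *element_ptr(std::uint8_t *buffer, const Layout2D &layout, unsigned int x, unsigned int y)
{
    return row_start(buffer, layout, y) + x * layout.stride_x_in_bytes;
}
```

Because the Y stride is given in bytes and may include row padding, the row offset cannot in general be derived from the tensor width alone.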

References ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensor::buffer(), ITensorInfo::data_type(), Window::DimY, arm_compute::execute_window_loop(), ITensor::info(), arm_compute::test::validation::info, Iterator::offset(), ITensorInfo::offset_first_element_in_bytes(), Iterator::ptr(), arm_compute::QASYMM8, ITensorInfo::quantization_info(), ITensorInfo::strides_in_bytes(), QuantizationInfo::uniform(), arm_compute::vdequantize(), arm_compute::wrapper::vloadq(), arm_compute::vquantize(), arm_compute::wrapper::vstore(), and IKernel::window().
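
When the data type is QASYMM8 and the input and output quantization parameters differ, run() dequantizes each value with the input parameters and quantizes it again with the output parameters. A scalar sketch of that round trip is given below. It assumes the usual affine scheme (real value = scale × (quantized value − offset)) and round-to-nearest; `UniformQInfo`, `dequantize`, `quantize` and `requantize` are hypothetical names, and the library's vector routines may round differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for uniform (per-layer) quantization parameters.
struct UniformQInfo
{
    float scale;
    int   offset;
};

// Map an 8-bit quantized value to its real value.
inline float dequantize(std::uint8_t q, const UniformQInfo &qi)
{
    return qi.scale * static_cast<float>(static_cast<int>(q) - qi.offset);
}

// Map a real value to an 8-bit quantized value, saturating to [0, 255].
// Round-to-nearest is an assumption of this sketch.
inline std::uint8_t quantize(float v, const UniformQInfo &qi)
{
    const int q = static_cast<int>(std::lround(v / qi.scale)) + qi.offset;
    return static_cast<std::uint8_t>(std::min(255, std::max(0, q)));
}

// Re-express a value quantized with `in` using the parameters `out`.
inline std::uint8_t requantize(std::uint8_t q, const UniformQInfo &in, const UniformQInfo &out)
{
    return quantize(dequantize(q, in), out);
}
```

When both parameter sets are equal the round trip is unnecessary, which is why the kernel falls back to a plain copy in that case.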

◆ validate()

Status validate (const ITensorInfo *input, unsigned int height_offset, const ITensorInfo *output)
static

Static function to check if given info will lead to a valid configuration of NEHeightConcatenateLayerKernel.

Parameters
[in]  input          Input tensor info. Data types supported: U8/S8/QASYMM8/U16/S16/F16/U32/S32/F32.
[in]  height_offset  The starting offset on the Y axis for the output tensor.
[in]  output         Output tensor info. Data types supported: same as input.
Returns
A status: an empty Status if the configuration is valid, otherwise the error reported by the failing check.

Definition at line 101 of file NEHeightConcatenateLayerKernel.cpp.

{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(input, height_offset, output));
    ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(input->clone().get(), output->clone().get()).first);
    return Status{};
}

References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::validate_and_configure_window().
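
The body of validate_arguments() is not shown on this page, so the checks it performs are not documented here. Purely as an illustration of the kind of shape check a height concatenation needs, the sketch below assumes two conditions: the widths match, and the input fits into the output when placed at `height_offset`. `Shape2D`, `SimpleStatus` and `check_height_concatenate` are hypothetical names, not library API.

```cpp
#include <cassert>
#include <string>

// Hypothetical 2D shape (the library uses TensorShape / ITensorInfo instead).
struct Shape2D
{
    unsigned int width;
    unsigned int height;
};

// Hypothetical result type: `ok == true` plays the role of an empty Status.
struct SimpleStatus
{
    bool        ok;
    std::string message;
};

// Assumed checks for placing `input` into `output` at row `height_offset`.
SimpleStatus check_height_concatenate(const Shape2D &input, unsigned int height_offset, const Shape2D &output)
{
    if(input.width != output.width)
    {
        return { false, "input and output widths differ" };
    }
    // Written to avoid unsigned overflow in height_offset + input.height.
    if(height_offset > output.height || input.height > output.height - height_offset)
    {
        return { false, "input does not fit into the output at the given height offset" };
    }
    return { true, "" };
}
```

Running such a check before configure() lets a caller reject an invalid configuration without the error being raised from inside the kernel, which is the purpose of the static validate() function.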


The documentation for this class was generated from the following files: