Compute Library
 19.08
NEGEMMMatrixAccumulateBiasesKernel Class Reference

NEON kernel to add a bias to each row of the input tensor.

#include <NEGEMMMatrixAccumulateBiasesKernel.h>


Public Member Functions

const char * name () const override
 Name of the kernel.
 
 NEGEMMMatrixAccumulateBiasesKernel ()
 Default constructor.
 
 NEGEMMMatrixAccumulateBiasesKernel (const NEGEMMMatrixAccumulateBiasesKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
NEGEMMMatrixAccumulateBiasesKernel & operator= (const NEGEMMMatrixAccumulateBiasesKernel &)=delete
 Prevent instances of this class from being copied (As this class contains pointers)
 
 NEGEMMMatrixAccumulateBiasesKernel (NEGEMMMatrixAccumulateBiasesKernel &&)=default
 Allow instances of this class to be moved.
 
NEGEMMMatrixAccumulateBiasesKernel & operator= (NEGEMMMatrixAccumulateBiasesKernel &&)=default
 Allow instances of this class to be moved.
 
 ~NEGEMMMatrixAccumulateBiasesKernel ()=default
 Default destructor.
 
void configure (ITensor *accum, const ITensor *biases)
 Set the accumulate buffer and the biases of the kernel.
 
void run (const Window &window, const ThreadInfo &info) override
 Execute the kernel on the passed window.
 
Public Member Functions inherited from ICPPKernel
virtual ~ICPPKernel ()=default
 Default destructor.
 
Public Member Functions inherited from IKernel
 IKernel ()
 Constructor.
 
virtual ~IKernel ()=default
 Destructor.
 
virtual bool is_parallelisable () const
 Indicates whether or not the kernel is parallelisable.
 
virtual BorderSize border_size () const
 The size of the border for that kernel.
 
const Window & window () const
 The maximum window the kernel can be executed on.
 

Static Public Member Functions

static Status validate (const ITensorInfo *accum, const ITensorInfo *biases)
 Static function to check if given info will lead to a valid configuration of NEGEMMMatrixAccumulateBiasesKernel.
 

Detailed Description

NEON kernel to add a bias to each row of the input tensor.

Definition at line 33 of file NEGEMMMatrixAccumulateBiasesKernel.h.
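
As a rough illustration of what adding a bias to each row means, the scalar sketch below performs the same arithmetic on a plain buffer. It does not use the library: add_bias_to_rows and its parameters are hypothetical names chosen for this example, and the row-major std::vector<float> layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference, not part of Compute Library.
// accum is a row-major matrix of rows x cols floats; biases holds cols floats.
// Every row of accum receives the same bias vector, in place.
void add_bias_to_rows(std::vector<float> &accum, const std::vector<float> &biases,
                      std::size_t rows, std::size_t cols)
{
    assert(accum.size() == rows * cols);
    assert(biases.size() == cols);

    for(std::size_t r = 0; r < rows; ++r)
    {
        for(std::size_t c = 0; c < cols; ++c)
        {
            accum[r * cols + c] += biases[c];
        }
    }
}
```

For a 2 x 3 matrix, each of the two rows is incremented by the same three bias values; the NEON kernel documented here performs that addition in vectorised form.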

Constructor & Destructor Documentation

◆ NEGEMMMatrixAccumulateBiasesKernel() [1/3]

Default constructor.

Definition at line 79 of file NEGEMMMatrixAccumulateBiasesKernel.cpp.

    : _accum(nullptr), _biases(nullptr)
{
}

◆ NEGEMMMatrixAccumulateBiasesKernel() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ NEGEMMMatrixAccumulateBiasesKernel() [3/3]

Allow instances of this class to be moved.

◆ ~NEGEMMMatrixAccumulateBiasesKernel()

Default destructor.

Member Function Documentation

◆ configure()

void configure ( ITensor * accum,
const ITensor * biases 
)

Set the accumulate buffer and the biases of the kernel.

Parameters
[in,out]  accum   The accumulate tensor to convert. Data type supported: F32
[in]      biases  The shared biases tensor to append. It must be a 1D tensor. Data type supported: Same as accum

Definition at line 84 of file NEGEMMMatrixAccumulateBiasesKernel.cpp.

{
    ARM_COMPUTE_ERROR_ON_NULLPTR(accum, biases);

    // Perform validate step
    ARM_COMPUTE_ERROR_THROW_ON(validate_arguments(accum->info(), biases->info()));

    _biases = biases;
    _accum  = accum;

    // Configure kernel window
    auto win_config = validate_and_configure_window(accum->info(), biases->info());
    ARM_COMPUTE_ERROR_THROW_ON(win_config.first);
    INEKernel::configure(win_config.second);
}

References ARM_COMPUTE_ERROR_ON_NULLPTR, ARM_COMPUTE_ERROR_THROW_ON, ITensor::info(), and arm_compute::validate_and_configure_window().

Referenced by NEFullyConnectedLayer::configure().

◆ name()

const char* name ( ) const
inline override virtual

Name of the kernel.

Returns
Kernel name

Implements ICPPKernel.

Definition at line 36 of file NEGEMMMatrixAccumulateBiasesKernel.h.

{
    return "NEGEMMMatrixAccumulateBiasesKernel";
}

◆ operator=() [1/2]

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Allow instances of this class to be moved.

◆ run()

void run ( const Window & window,
const ThreadInfo & info 
)
override virtual

Execute the kernel on the passed window.

Warning
If is_parallelisable() returns false, the passed window must be equal to window().
Note
The window has to be a region within the window returned by the window() method.
The width of the window has to be a multiple of num_elems_processed_per_iteration().
Parameters
[in]  window  Region on which to execute the kernel. (Must be a region of the window returned by window())
[in]  info    Info about executing thread and CPU.
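
The width constraint in the note above can be checked with simple integer arithmetic. The sketch below is only an illustration: round_up_to_multiple and is_width_valid are hypothetical helpers, not library functions, and the step of 16 used in the usage note is an assumption read off the run() listing, where each iteration handles 4 x 4 floats (or 8 x 2 half-precision values).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: smallest multiple of step that is >= value.
inline std::size_t round_up_to_multiple(std::size_t value, std::size_t step)
{
    assert(step > 0);
    return ((value + step - 1) / step) * step;
}

// Hypothetical check mirroring the note: the window width must cover a whole
// number of iterations.
inline bool is_width_valid(std::size_t width, std::size_t step)
{
    assert(step > 0);
    return width % step == 0;
}
```

With a step of 16, a width of 20 would not satisfy the note, while the rounded-up width of 32 would.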

Implements ICPPKernel.

Definition at line 108 of file NEGEMMMatrixAccumulateBiasesKernel.cpp.

{
    ARM_COMPUTE_UNUSED(info);
    ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL(this);
    ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW(INEKernel::window(), window);

    // Biases window: same X range as the execution window, a single row in Y
    Window win_biases;
    win_biases.set(Window::DimX, Window::Dimension(window.x().start(), window.x().end(), window.x().step()));
    win_biases.set(Window::DimY, Window::Dimension(0, 1, 1));

    Iterator in0_out(_accum, window);
    Iterator in1(_biases, win_biases);

    switch(_accum->info()->data_type())
    {
        case DataType::F32:
        {
            execute_window_loop(window, [&](const Coordinates &)
            {
                // 16 floats per iteration: load, add lane by lane, store back in place
                const float32x4x4_t accum  = vld4q_f32(reinterpret_cast<const float *>(in0_out.ptr()));
                const float32x4x4_t biases = vld4q_f32(reinterpret_cast<const float *>(in1.ptr()));
                const float32x4x4_t res =
                {
                    {
                        vaddq_f32(accum.val[0], biases.val[0]),
                        vaddq_f32(accum.val[1], biases.val[1]),
                        vaddq_f32(accum.val[2], biases.val[2]),
                        vaddq_f32(accum.val[3], biases.val[3])
                    }
                };

                vst4q_f32(reinterpret_cast<float *>(in0_out.ptr()), res);
            },
            in0_out, in1);
            break;
        }
#ifdef __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
        case DataType::F16:
        {
            execute_window_loop(window, [&](const Coordinates &)
            {
                // 16 half-precision values per iteration
                const float16x8x2_t accum  = vld2q_f16(reinterpret_cast<const float16_t *>(in0_out.ptr()));
                const float16x8x2_t biases = vld2q_f16(reinterpret_cast<const float16_t *>(in1.ptr()));
                const float16x8x2_t res =
                {
                    {
                        vaddq_f16(accum.val[0], biases.val[0]),
                        vaddq_f16(accum.val[1], biases.val[1])
                    }
                };

                vst2q_f16(reinterpret_cast<float16_t *>(in0_out.ptr()), res);
            },
            in0_out, in1);
            break;
        }
#endif /* __ARM_FEATURE_FP16_VECTOR_ARITHMETIC */
        default:
            ARM_COMPUTE_ERROR("Data type not supported");
            break;
    }
}

References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON_INVALID_SUBWINDOW, ARM_COMPUTE_ERROR_ON_UNCONFIGURED_KERNEL, ARM_COMPUTE_UNUSED, ITensorInfo::data_type(), Window::DimX, Window::DimY, Window::Dimension::end(), arm_compute::execute_window_loop(), arm_compute::F16, arm_compute::F32, ITensor::info(), arm_compute::test::validation::info, Iterator::ptr(), Window::set(), Window::Dimension::start(), Window::Dimension::step(), IKernel::window(), and Window::x().
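
The F32 branch of run() loads with vld4q_f32 and stores with vst4q_f32. The load de-interleaves 16 consecutive floats into four 4-lane vectors and the store re-interleaves them; since both operands are loaded the same way, the net effect should be a plain element-wise addition over 16 contiguous floats per iteration. The portable sketch below emulates that behaviour without NEON so it can be checked on any host. Float4x4, load4_deinterleaved, store4_interleaved and accumulate_16 are hypothetical stand-ins, not library or intrinsic names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for float32x4x4_t: four vectors of four lanes each.
struct Float4x4
{
    std::array<std::array<float, 4>, 4> val;
};

// Emulates the de-interleaving load of vld4q_f32: val[k][i] = ptr[4 * i + k].
Float4x4 load4_deinterleaved(const float *ptr)
{
    Float4x4 out{};
    for(std::size_t i = 0; i < 4; ++i)
    {
        for(std::size_t k = 0; k < 4; ++k)
        {
            out.val[k][i] = ptr[4 * i + k];
        }
    }
    return out;
}

// Emulates the interleaving store of vst4q_f32: ptr[4 * i + k] = val[k][i].
void store4_interleaved(float *ptr, const Float4x4 &in)
{
    for(std::size_t i = 0; i < 4; ++i)
    {
        for(std::size_t k = 0; k < 4; ++k)
        {
            ptr[4 * i + k] = in.val[k][i];
        }
    }
}

// One emulated iteration of the F32 branch: accum[0..15] += biases[0..15].
void accumulate_16(float *accum, const float *biases)
{
    assert(accum != nullptr && biases != nullptr);

    const Float4x4 a = load4_deinterleaved(accum);
    const Float4x4 b = load4_deinterleaved(biases);

    Float4x4 res{};
    for(std::size_t k = 0; k < 4; ++k)
    {
        for(std::size_t i = 0; i < 4; ++i)
        {
            res.val[k][i] = a.val[k][i] + b.val[k][i];
        }
    }
    store4_interleaved(accum, res);
}
```

Because the de-interleave and re-interleave steps cancel out, the same result could be obtained with four plain 4-lane loads and stores; the choice of vld4q_f32 in the listing does not change which bias is added to which element.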

◆ validate()

Status validate ( const ITensorInfo * accum,
const ITensorInfo * biases 
)
static

Static function to check if given info will lead to a valid configuration of NEGEMMMatrixAccumulateBiasesKernel.

Parameters
[in]  accum   The accumulate tensor to convert. Data type supported: F32
[in]  biases  The shared biases tensor to append. It must be a 1D tensor. Data type supported: Same as accum
Returns
a status

Definition at line 100 of file NEGEMMMatrixAccumulateBiasesKernel.cpp.

{
    ARM_COMPUTE_RETURN_ON_ERROR(validate_arguments(accum, biases));
    ARM_COMPUTE_RETURN_ON_ERROR(validate_and_configure_window(accum->clone().get(), biases->clone().get()).first);

    return Status{};
}

References ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), and arm_compute::validate_and_configure_window().

Referenced by NEFullyConnectedLayer::validate().
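
validate() works on tensor metadata only, so a configuration can be rejected before any buffer is allocated. The sketch below shows the general shape of such a check using simplified, hypothetical stand-ins (ElemType, TensorDesc, CheckResult, check_accumulate_biases) rather than the library's ITensorInfo and Status. The 1D and same-data-type rules come from the parameter descriptions above; the width-match rule is an assumption made for this example, and the real validate_arguments may check more or different conditions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-ins for ITensorInfo and Status.
enum class ElemType
{
    F16,
    F32
};

struct TensorDesc
{
    ElemType    type;
    std::size_t width;  // size of dimension 0
    std::size_t height; // size of dimension 1 (1 for a 1D tensor)
};

struct CheckResult
{
    bool        ok;
    std::string message;
};

// Metadata-only check in the spirit of validate(): only shapes and types
// are compared, no tensor memory is touched.
CheckResult check_accumulate_biases(const TensorDesc &accum, const TensorDesc &biases)
{
    if(biases.height != 1)
    {
        return { false, "biases must be a 1D tensor" };
    }
    if(biases.type != accum.type)
    {
        return { false, "biases must have the same data type as accum" };
    }
    // Assumption for this sketch: one bias value per column of accum.
    if(biases.width != accum.width)
    {
        return { false, "biases width must match accum width" };
    }
    return { true, "" };
}
```

A caller would typically run such a check first and only proceed to configuration when it reports success, which mirrors how NEFullyConnectedLayer::validate() is listed as calling this kernel's validate().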


The documentation for this class was generated from the following files: