Functions | |
arm_cmsis_nn_status | arm_nn_vec_mat_mul_result_acc_s16 (const int16_t *lhs, const int8_t *rhs, const int64_t *effective_bias, int16_t *dst, const int32_t dst_multiplier, const int32_t dst_shift, const int32_t rhs_cols, const int32_t rhs_rows, const int32_t batches, const int32_t batch_offset) |
Multiplies a matrix by a "batched" vector (i.e. a matrix whose batch dimension is made up of mutually independent input vectors) and accumulates the result into the passed result buffer. | |
arm_cmsis_nn_status | arm_nn_vec_mat_mult_t_per_ch_s8 (const int8_t *lhs, const int8_t *rhs, const int32_t *kernel_sum, const int32_t *bias, int8_t *dst, const int32_t lhs_offset, const int32_t dst_offset, const int32_t *dst_multiplier, const int32_t *dst_shift, const int32_t rhs_cols, const int32_t rhs_rows, const int32_t activation_min, const int32_t activation_max, const int32_t address_offset, const int32_t rhs_offset) |
s8 Vector by Matrix (transposed) multiplication using per channel quantization for output | |
arm_cmsis_nn_status | arm_nn_vec_mat_mult_t_s16 (const int16_t *lhs, const int8_t *rhs, const int64_t *bias, int16_t *dst, const int32_t dst_multiplier, const int32_t dst_shift, const int32_t rhs_cols, const int32_t rhs_rows, const int32_t activation_min, const int32_t activation_max) |
s16 Vector by s8 Matrix (transposed) multiplication | |
arm_cmsis_nn_status | arm_nn_vec_mat_mult_t_s16_s16 (const int16_t *lhs, const int16_t *rhs, const int64_t *bias, int16_t *dst, const int32_t dst_multiplier, const int32_t dst_shift, const int32_t rhs_cols, const int32_t rhs_rows, const int32_t activation_min, const int32_t activation_max) |
s16 Vector by s16 Matrix (transposed) multiplication | |
arm_cmsis_nn_status | arm_nn_vec_mat_mult_t_s4 (const int8_t *lhs, const int8_t *packed_rhs, const int32_t *bias, int8_t *dst, const int32_t lhs_offset, const int32_t dst_offset, const int32_t dst_multiplier, const int32_t dst_shift, const int32_t rhs_cols, const int32_t rhs_rows, const int32_t activation_min, const int32_t activation_max) |
s4 Vector by Matrix (transposed) multiplication | |
arm_cmsis_nn_status | arm_nn_vec_mat_mult_t_s8 (const int8_t *lhs, const int8_t *rhs, const int32_t *kernel_sum, const int32_t *bias, int8_t *dst, const int32_t lhs_offset, const int32_t dst_offset, const int32_t dst_multiplier, const int32_t dst_shift, const int32_t rhs_cols, const int32_t rhs_rows, const int32_t activation_min, const int32_t activation_max, const int32_t address_offset, const int32_t rhs_offset) |
s8 Vector by Matrix (transposed) multiplication | |
arm_cmsis_nn_status | arm_nn_vec_mat_mult_t_svdf_s8 (const int8_t *lhs, const int8_t *rhs, int16_t *dst, const int32_t lhs_offset, const int32_t scatter_offset, const int32_t dst_multiplier, const int32_t dst_shift, const int32_t rhs_cols, const int32_t rhs_rows, const int32_t activation_min, const int32_t activation_max) |
s8 Vector by Matrix (transposed) multiplication with s16 output | |
Support functions for Fully Connected
arm_cmsis_nn_status arm_nn_vec_mat_mul_result_acc_s16 | ( | const int16_t * | lhs, |
const int8_t * | rhs, | ||
const int64_t * | effective_bias, | ||
int16_t * | dst, | ||
const int32_t | dst_multiplier, | ||
const int32_t | dst_shift, | ||
const int32_t | rhs_cols, | ||
const int32_t | rhs_rows, | ||
const int32_t | batches, | ||
const int32_t | batch_offset | ||
) |
Multiplies a matrix by a "batched" vector (i.e. a matrix whose batch dimension is made up of mutually independent input vectors) and accumulates the result into the passed result buffer.
[in] | lhs | Batched vector |
[in] | rhs | Weights - input matrix (H(Rows)xW(Columns)) |
[in] | effective_bias | Bias + lhs_offset * kernel_sum term precalculated into a constant vector. |
[out] | dst | Output |
[in] | dst_multiplier | Multiplier for quantization |
[in] | dst_shift | Shift for quantization |
[in] | rhs_cols | Vector/matrix column length |
[in] | rhs_rows | Row count of matrix |
[in] | batches | Batch size |
[in] | batch_offset | Number of timesteps between consecutive batches in input, see arm_nn_lstm_step_s16. Note that the output is always stored with sequential batches. |
Returns: ARM_CMSIS_NN_SUCCESS
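The effective_bias description above relies on a simple algebraic identity: adding lhs_offset to every input element before the multiply gives the same accumulator as adding lhs_offset times the row sum of the weights once. The following is a minimal sketch of that identity, not the library's implementation. The helper names (fold_effective_bias, row_acc_direct, row_acc_folded) and the int32_t bias type are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of CMSIS-NN): for each row of rhs, folds
 * bias[r] + lhs_offset * kernel_sum[r] into effective_bias[r]. */
static void fold_effective_bias(const int8_t *rhs, const int32_t *bias, int32_t lhs_offset,
                                int32_t rhs_cols, int32_t rhs_rows, int64_t *effective_bias)
{
    assert(rhs_cols > 0 && rhs_rows > 0);
    for (int32_t r = 0; r < rhs_rows; r++)
    {
        int64_t kernel_sum = 0; /* sum of the weights in row r */
        for (int32_t c = 0; c < rhs_cols; c++)
        {
            kernel_sum += rhs[r * rhs_cols + c];
        }
        effective_bias[r] = (int64_t)bias[r] + (int64_t)lhs_offset * kernel_sum;
    }
}

/* Accumulator for one row with the offset applied to every input element. */
static int64_t row_acc_direct(const int16_t *lhs, const int8_t *rhs_row, int32_t bias,
                              int32_t lhs_offset, int32_t rhs_cols)
{
    int64_t acc = bias;
    for (int32_t c = 0; c < rhs_cols; c++)
    {
        acc += (int64_t)(lhs[c] + lhs_offset) * rhs_row[c];
    }
    return acc;
}

/* Same accumulator, starting from the precomputed effective bias. */
static int64_t row_acc_folded(const int16_t *lhs, const int8_t *rhs_row, int64_t effective_bias,
                              int32_t rhs_cols)
{
    int64_t acc = effective_bias;
    for (int32_t c = 0; c < rhs_cols; c++)
    {
        acc += (int64_t)lhs[c] * rhs_row[c];
    }
    return acc;
}
```

Because the weights are constant, the folded term can be computed once ahead of time, which removes one addition per element from the inner loop.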
arm_cmsis_nn_status arm_nn_vec_mat_mult_t_per_ch_s8 | ( | const int8_t * | lhs, |
const int8_t * | rhs, | ||
const int32_t * | kernel_sum, | ||
const int32_t * | bias, | ||
int8_t * | dst, | ||
const int32_t | lhs_offset, | ||
const int32_t | dst_offset, | ||
const int32_t * | dst_multiplier, | ||
const int32_t * | dst_shift, | ||
const int32_t | rhs_cols, | ||
const int32_t | rhs_rows, | ||
const int32_t | activation_min, | ||
const int32_t | activation_max, | ||
const int32_t | address_offset, | ||
const int32_t | rhs_offset | ||
) |
s8 Vector by Matrix (transposed) multiplication using per channel quantization for output
[in] | lhs | Input left-hand side vector |
[in] | rhs | Input right-hand side matrix (transposed) |
[in] | kernel_sum | Kernel sums of the kernels (rhs). See arm_vector_sum_s8 for more info. |
[in] | bias | Input bias |
[out] | dst | Output vector |
[in] | lhs_offset | Offset to be added to the input values of the left-hand side vector. Range: -127 to 128 |
[in] | dst_offset | Offset to be added to the output values. Range: -127 to 128 |
[in] | dst_multiplier | Output multipliers |
[in] | dst_shift | Output shifts |
[in] | rhs_cols | Number of columns in the right-hand side input matrix |
[in] | rhs_rows | Number of rows in the right-hand side input matrix |
[in] | activation_min | Minimum value to clamp the output to. Range: int8 |
[in] | activation_max | Maximum value to clamp the output to. Range: int8 |
[in] | address_offset | Memory position offset for dst. First output is stored at 'dst', the second at 'dst + address_offset' and so on. Default value is typically 1. |
[in] | rhs_offset | Offset to be added to the input values of the right-hand side matrix. Range: -127 to 128 |
Returns: ARM_CMSIS_NN_SUCCESS
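To illustrate how the per-channel dst_multiplier and dst_shift arrays and the address_offset stride fit together, here is a plain C reference sketch. It is not the library's implementation: kernel_sum and rhs_offset are left out, and requantize_sketch is a hypothetical helper that assumes the multiplier is a Q31 fixed-point value, so the real scale is multiplier * 2^(shift - 31). The library's own rounding behaviour may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scales acc by multiplier * 2^(shift - 31), rounding
 * half up. Assumes >> on a negative int64_t is an arithmetic shift, which
 * holds on mainstream compilers but is implementation-defined in C. */
static int32_t requantize_sketch(int32_t acc, int32_t multiplier, int32_t shift)
{
    const int32_t total_shift = 31 - shift;
    assert(total_shift >= 1 && total_shift <= 62);
    const int64_t prod = (int64_t)acc * multiplier;
    const int64_t half = (int64_t)1 << (total_shift - 1);
    return (int32_t)((prod + half) >> total_shift);
}

/* Reference sketch: one output per row of the row-major rhs matrix, each row
 * using its own multiplier and shift. */
static void vec_mat_mult_t_per_ch_sketch(const int8_t *lhs, const int8_t *rhs, const int32_t *bias,
                                         int8_t *dst, int32_t lhs_offset, int32_t dst_offset,
                                         const int32_t *dst_multiplier, const int32_t *dst_shift,
                                         int32_t rhs_cols, int32_t rhs_rows,
                                         int32_t activation_min, int32_t activation_max,
                                         int32_t address_offset)
{
    for (int32_t r = 0; r < rhs_rows; r++)
    {
        int32_t acc = bias ? bias[r] : 0;
        for (int32_t c = 0; c < rhs_cols; c++)
        {
            acc += (lhs[c] + lhs_offset) * rhs[r * rhs_cols + c];
        }
        /* Per-channel requantization: row r uses entry r of each array. */
        acc = requantize_sketch(acc, dst_multiplier[r], dst_shift[r]) + dst_offset;
        if (acc < activation_min) { acc = activation_min; }
        if (acc > activation_max) { acc = activation_max; }
        /* Consecutive outputs are address_offset elements apart. */
        dst[r * address_offset] = (int8_t)acc;
    }
}
```

With address_offset set to 1 the outputs are contiguous; a larger value leaves the elements in between untouched, so dst must hold at least (rhs_rows - 1) * address_offset + 1 elements.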
arm_cmsis_nn_status arm_nn_vec_mat_mult_t_s16 | ( | const int16_t * | lhs, |
const int8_t * | rhs, | ||
const int64_t * | bias, | ||
int16_t * | dst, | ||
const int32_t | dst_multiplier, | ||
const int32_t | dst_shift, | ||
const int32_t | rhs_cols, | ||
const int32_t | rhs_rows, | ||
const int32_t | activation_min, | ||
const int32_t | activation_max | ||
) |
s16 Vector by s8 Matrix (transposed) multiplication
[in] | lhs | Input left-hand side vector |
[in] | rhs | Input right-hand side matrix (transposed) |
[in] | bias | Input bias |
[out] | dst | Output vector |
[in] | dst_multiplier | Output multiplier |
[in] | dst_shift | Output shift |
[in] | rhs_cols | Number of columns in the right-hand side input matrix |
[in] | rhs_rows | Number of rows in the right-hand side input matrix |
[in] | activation_min | Minimum value to clamp the output to. Range: int16 |
[in] | activation_max | Maximum value to clamp the output to. Range: int16 |
Returns: ARM_CMSIS_NN_SUCCESS
arm_cmsis_nn_status arm_nn_vec_mat_mult_t_s16_s16 | ( | const int16_t * | lhs, |
const int16_t * | rhs, | ||
const int64_t * | bias, | ||
int16_t * | dst, | ||
const int32_t | dst_multiplier, | ||
const int32_t | dst_shift, | ||
const int32_t | rhs_cols, | ||
const int32_t | rhs_rows, | ||
const int32_t | activation_min, | ||
const int32_t | activation_max | ||
) |
s16 Vector by s16 Matrix (transposed) multiplication
[in] | lhs | Input left-hand side vector |
[in] | rhs | Input right-hand side matrix (transposed) |
[in] | bias | Input bias |
[out] | dst | Output vector |
[in] | dst_multiplier | Output multiplier |
[in] | dst_shift | Output shift |
[in] | rhs_cols | Number of columns in the right-hand side input matrix |
[in] | rhs_rows | Number of rows in the right-hand side input matrix |
[in] | activation_min | Minimum value to clamp the output to. Range: int16 |
[in] | activation_max | Maximum value to clamp the output to. Range: int16 |
Returns: ARM_CMSIS_NN_SUCCESS
arm_cmsis_nn_status arm_nn_vec_mat_mult_t_s4 | ( | const int8_t * | lhs, |
const int8_t * | packed_rhs, | ||
const int32_t * | bias, | ||
int8_t * | dst, | ||
const int32_t | lhs_offset, | ||
const int32_t | dst_offset, | ||
const int32_t | dst_multiplier, | ||
const int32_t | dst_shift, | ||
const int32_t | rhs_cols, | ||
const int32_t | rhs_rows, | ||
const int32_t | activation_min, | ||
const int32_t | activation_max | ||
) |
s4 Vector by Matrix (transposed) multiplication
[in] | lhs | Input left-hand side vector |
[in] | packed_rhs | Input right-hand side matrix (transposed) |
[in] | bias | Input bias |
[out] | dst | Output vector |
[in] | lhs_offset | Offset to be added to the input values of the left-hand side vector. Range: -127 to 128 |
[in] | dst_offset | Offset to be added to the output values. Range: -127 to 128 |
[in] | dst_multiplier | Output multiplier |
[in] | dst_shift | Output shift |
[in] | rhs_cols | Number of columns in the right-hand side input matrix |
[in] | rhs_rows | Number of rows in the right-hand side input matrix |
[in] | activation_min | Minimum value to clamp the output to. Range: int8 |
[in] | activation_max | Maximum value to clamp the output to. Range: int8 |
Returns: ARM_CMSIS_NN_SUCCESS
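The packed_rhs parameter carries 4-bit weights in an int8_t buffer. The exact packing layout is not described on this page, so the sketch below assumes two signed 4-bit values per byte with the first value in the low nibble. The helper names sign_extend_s4 and unpack_s4 are hypothetical and only show how such a buffer could be expanded to int8_t values.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extends the low 4 bits of v to an int8_t in the range -8..7. */
static int8_t sign_extend_s4(uint8_t v)
{
    v &= 0x0F;
    return (int8_t)((v ^ 0x08) - 0x08);
}

/* Unpacks count 4-bit values from packed into out.
 * Assumption: element 2*i is in the low nibble of byte i, element 2*i + 1
 * in the high nibble. */
static void unpack_s4(const int8_t *packed, int32_t count, int8_t *out)
{
    assert(count >= 0);
    for (int32_t i = 0; i < count; i++)
    {
        const uint8_t byte = (uint8_t)packed[i / 2];
        const uint8_t nibble = (i % 2 == 0) ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);
        out[i] = sign_extend_s4(nibble);
    }
}
```

Under this assumption a matrix with rhs_rows * rhs_cols weights occupies half as many bytes (rounded up) as its s8 counterpart.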
arm_cmsis_nn_status arm_nn_vec_mat_mult_t_s8 | ( | const int8_t * | lhs, |
const int8_t * | rhs, | ||
const int32_t * | kernel_sum, | ||
const int32_t * | bias, | ||
int8_t * | dst, | ||
const int32_t | lhs_offset, | ||
const int32_t | dst_offset, | ||
const int32_t | dst_multiplier, | ||
const int32_t | dst_shift, | ||
const int32_t | rhs_cols, | ||
const int32_t | rhs_rows, | ||
const int32_t | activation_min, | ||
const int32_t | activation_max, | ||
const int32_t | address_offset, | ||
const int32_t | rhs_offset | ||
) |
s8 Vector by Matrix (transposed) multiplication
[in] | lhs | Input left-hand side vector |
[in] | rhs | Input right-hand side matrix (transposed) |
[in] | kernel_sum | Kernel sums of the kernels (rhs). See arm_vector_sum_s8 for more info. |
[in] | bias | Input bias |
[out] | dst | Output vector |
[in] | lhs_offset | Offset to be added to the input values of the left-hand side vector. Range: -127 to 128 |
[in] | dst_offset | Offset to be added to the output values. Range: -127 to 128 |
[in] | dst_multiplier | Output multiplier |
[in] | dst_shift | Output shift |
[in] | rhs_cols | Number of columns in the right-hand side input matrix |
[in] | rhs_rows | Number of rows in the right-hand side input matrix |
[in] | activation_min | Minimum value to clamp the output to. Range: int8 |
[in] | activation_max | Maximum value to clamp the output to. Range: int8 |
[in] | address_offset | Memory position offset for dst. First output is stored at 'dst', the second at 'dst + address_offset' and so on. Default value is typically 1. |
[in] | rhs_offset | Offset to be added to the input values of the right-hand side matrix. Range: -127 to 128 |
Returns: ARM_CMSIS_NN_SUCCESS
arm_cmsis_nn_status arm_nn_vec_mat_mult_t_svdf_s8 | ( | const int8_t * | lhs, |
const int8_t * | rhs, | ||
int16_t * | dst, | ||
const int32_t | lhs_offset, | ||
const int32_t | scatter_offset, | ||
const int32_t | dst_multiplier, | ||
const int32_t | dst_shift, | ||
const int32_t | rhs_cols, | ||
const int32_t | rhs_rows, | ||
const int32_t | activation_min, | ||
const int32_t | activation_max | ||
) |
s8 Vector by Matrix (transposed) multiplication with s16 output
[in] | lhs | Input left-hand side vector |
[in] | rhs | Input right-hand side matrix (transposed) |
[out] | dst | Output vector |
[in] | lhs_offset | Offset to be added to the input values of the left-hand side vector. Range: -127 to 128 |
[in] | scatter_offset | Address offset for dst. First output is stored at 'dst', the second at 'dst + scatter_offset' and so on. |
[in] | dst_multiplier | Output multiplier |
[in] | dst_shift | Output shift |
[in] | rhs_cols | Number of columns in the right-hand side input matrix |
[in] | rhs_rows | Number of rows in the right-hand side input matrix |
[in] | activation_min | Minimum value to clamp the output to. Range: int16 |
[in] | activation_max | Maximum value to clamp the output to. Range: int16 |
Returns: ARM_CMSIS_NN_SUCCESS