Content | |
GetBufferSizeFC | |
Functions | |
arm_cmsis_nn_status | arm_batch_matmul_s16 (const cmsis_nn_context *ctx, const cmsis_nn_bmm_params *bmm_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_lhs_dims, const int16_t *input_lhs, const cmsis_nn_dims *input_rhs_dims, const int16_t *input_rhs, const cmsis_nn_dims *output_dims, int16_t *output) |
Batch matmul function with 16 bit input and output. | |
arm_cmsis_nn_status | arm_batch_matmul_s8 (const cmsis_nn_context *ctx, const cmsis_nn_bmm_params *bmm_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_lhs_dims, const int8_t *input_lhs, const cmsis_nn_dims *input_rhs_dims, const int8_t *input_rhs, const cmsis_nn_dims *output_dims, int8_t *output) |
Batch matmul function with 8 bit input and output. | |
arm_cmsis_nn_status | arm_fully_connected_per_channel_s8 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_channel_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input_data, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int32_t *bias_data, const cmsis_nn_dims *output_dims, int8_t *output_data) |
Basic s8 Fully Connected function using per channel quantization. | |
arm_cmsis_nn_status | arm_fully_connected_s16 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int16_t *input, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int64_t *bias, const cmsis_nn_dims *output_dims, int16_t *output) |
Basic s16 Fully Connected function. | |
arm_cmsis_nn_status | arm_fully_connected_s4 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int32_t *bias, const cmsis_nn_dims *output_dims, int8_t *output) |
Basic s4 Fully Connected function. | |
arm_cmsis_nn_status | arm_fully_connected_s8 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int32_t *bias, const cmsis_nn_dims *output_dims, int8_t *output) |
Basic s8 Fully Connected function. | |
arm_cmsis_nn_status | arm_fully_connected_wrapper_s8 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input_data, const cmsis_nn_dims *filter_dims, const int8_t *filter_data, const cmsis_nn_dims *bias_dims, const int32_t *bias_data, const cmsis_nn_dims *output_dims, int8_t *output_data) |
s8 Fully Connected layer wrapper function | |
arm_cmsis_nn_status | arm_vector_sum_s8 (int32_t *vector_sum_buf, const int32_t vector_cols, const int32_t vector_rows, const int8_t *vector_data, const int32_t lhs_offset, const int32_t rhs_offset, const int32_t *bias_data) |
Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s32 bias_data. | |
arm_cmsis_nn_status | arm_vector_sum_s8_s64 (int64_t *vector_sum_buf, const int32_t vector_cols, const int32_t vector_rows, const int8_t *vector_data, const int32_t lhs_offset, const int64_t *bias_data) |
Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s64 bias_data. | |
Collection of fully-connected and matrix multiplication functions.
A fully-connected layer is essentially a matrix-vector multiplication with bias. The matrix holds the weights, and the input and output vectors hold the activation values. Supported {weight, activation} precisions include {8-bit, 8-bit} and {8-bit, 16-bit}.
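As a rough illustration of the matrix-vector view described above, the following is a minimal reference sketch in plain C. It is not the library's implementation: the function name `fc_reference_s8`, the row-major `[out_ch][accum_depth]` weight layout, and the raw 32-bit accumulator output (no offsets, requantization, or clamping) are all assumptions made for the example.

```c
#include <stdint.h>

/* Reference fully-connected accumulation (illustrative only):
 *   acc_out[c] = bias[c] + sum_i input[i] * weights[c][i]
 * Assumption: weights are row-major, one row of accum_depth values
 * per output channel. bias may be NULL, in which case it is skipped. */
void fc_reference_s8(const int8_t *input, const int8_t *weights,
                     const int32_t *bias, int32_t accum_depth,
                     int32_t out_ch, int32_t *acc_out)
{
    for (int32_t c = 0; c < out_ch; c++) {
        int32_t acc = bias ? bias[c] : 0;
        for (int32_t i = 0; i < accum_depth; i++) {
            acc += (int32_t)input[i] * (int32_t)weights[c * accum_depth + i];
        }
        acc_out[c] = acc;
    }
}
```

In a quantized kernel the accumulator would still need to be rescaled and clamped to the output type; that step is deliberately left out of this sketch.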
arm_cmsis_nn_status arm_batch_matmul_s16 | ( | const cmsis_nn_context * | ctx, |
const cmsis_nn_bmm_params * | bmm_params, | ||
const cmsis_nn_per_tensor_quant_params * | quant_params, | ||
const cmsis_nn_dims * | input_lhs_dims, | ||
const int16_t * | input_lhs, | ||
const cmsis_nn_dims * | input_rhs_dims, | ||
const int16_t * | input_rhs, | ||
const cmsis_nn_dims * | output_dims, | ||
int16_t * | output | ||
) |
Batch matmul function with 16 bit input and output.
[in] | ctx | Temporary scratch buffer. The caller is expected to clear the buffer, if applicable, for security reasons. Optional function arm_fully_connected_s8_get_buffer_size() provides the buffer size if an additional buffer is required. |
[in] | bmm_params | Batch matmul parameters. Adjoint flags are currently unused. |
[in] | quant_params | Quantization parameters |
[in] | input_lhs_dims | Input lhs tensor dimensions. This should be NHWC where LHS.C = RHS.C |
[in] | input_lhs | Pointer to input tensor |
[in] | input_rhs_dims | Input rhs tensor dimensions. This is expected to be transposed, so it should be NHWC where LHS.C = RHS.C |
[in] | input_rhs | Pointer to transposed input tensor |
[in] | output_dims | Output tensor dimensions |
[out] | output | Pointer to the output tensor |
ARM_CMSIS_NN_SUCCESS
arm_cmsis_nn_status arm_batch_matmul_s8 | ( | const cmsis_nn_context * | ctx, |
const cmsis_nn_bmm_params * | bmm_params, | ||
const cmsis_nn_per_tensor_quant_params * | quant_params, | ||
const cmsis_nn_dims * | input_lhs_dims, | ||
const int8_t * | input_lhs, | ||
const cmsis_nn_dims * | input_rhs_dims, | ||
const int8_t * | input_rhs, | ||
const cmsis_nn_dims * | output_dims, | ||
int8_t * | output | ||
) |
Batch matmul function with 8 bit input and output.
[in] | ctx | Temporary scratch buffer. The caller is expected to clear the buffer, if applicable, for security reasons. Optional function arm_fully_connected_s8_get_buffer_size() provides the buffer size if an additional buffer is required. |
[in] | bmm_params | Batch matmul parameters. Adjoint flags are currently unused. |
[in] | quant_params | Quantization parameters |
[in] | input_lhs_dims | Input lhs tensor dimensions. This should be NHWC where lhs C = rhs C |
[in] | input_lhs | Pointer to input tensor |
[in] | input_rhs_dims | Input rhs tensor dimensions. This is expected to be transposed, so it should be NHWC where lhs C = rhs C |
[in] | input_rhs | Pointer to transposed input tensor |
[in] | output_dims | Output tensor dimensions |
[out] | output | Pointer to the output tensor |
ARM_CMSIS_NN_SUCCESS
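The batch matmul parameter tables above state that the RHS tensor is expected in transposed form, with the innermost (C) dimension shared by both operands. A minimal sketch of what that layout implies for a single batch is shown below. The function name `matmul_rhs_transposed_s8` is hypothetical, and the sketch returns raw 32-bit accumulators without the offsets and requantization a quantized kernel would apply.

```c
#include <stdint.h>

/* Matrix multiply with an already-transposed RHS (illustrative only):
 *   out[i][j] = sum_k lhs[i][k] * rhs_t[j][k]
 * lhs   : rows_lhs x depth, row-major
 * rhs_t : rows_rhs x depth, row-major (each row is one column of the
 *         untransposed RHS)
 * out   : rows_lhs x rows_rhs, row-major */
void matmul_rhs_transposed_s8(const int8_t *lhs, const int8_t *rhs_t,
                              int32_t rows_lhs, int32_t rows_rhs,
                              int32_t depth, int32_t *out)
{
    for (int32_t i = 0; i < rows_lhs; i++) {
        for (int32_t j = 0; j < rows_rhs; j++) {
            int32_t acc = 0;
            for (int32_t k = 0; k < depth; k++) {
                acc += (int32_t)lhs[i * depth + k] * (int32_t)rhs_t[j * depth + k];
            }
            out[i * rows_rhs + j] = acc;
        }
    }
}
```

Because both operands are walked along their contiguous innermost dimension, every dot product reads memory sequentially, which is the usual reason for storing the RHS transposed.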
arm_cmsis_nn_status arm_fully_connected_per_channel_s8 | ( | const cmsis_nn_context * | ctx, |
const cmsis_nn_fc_params * | fc_params, | ||
const cmsis_nn_per_channel_quant_params * | quant_params, | ||
const cmsis_nn_dims * | input_dims, | ||
const int8_t * | input_data, | ||
const cmsis_nn_dims * | filter_dims, | ||
const int8_t * | filter_data, | ||
const cmsis_nn_dims * | bias_dims, | ||
const int32_t * | bias_data, | ||
const cmsis_nn_dims * | output_dims, | ||
int8_t * | output_data | ||
) |
Basic s8 Fully Connected function using per channel quantization.
[in,out] | ctx | Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. Optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons. |
[in] | fc_params | Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127]. |
[in] | quant_params | Per-channel quantization info. It contains the multiplier and shift values to be applied to each output channel. |
[in] | input_dims | Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as N x (H * W * C_IN). |
[in] | input_data | Input (activation) data pointer. Data type: int8 |
[in] | filter_dims | Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H and W: not used. |
[in] | filter_data | Filter data pointer. Data type: int8 |
[in] | bias_dims | Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used. |
[in] | bias_data | Bias data pointer. Data type: int32 |
[in] | output_dims | Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H and W: not used. |
[in,out] | output_data | Output data pointer. Data type: int8 |
ARM_CMSIS_NN_ARG_ERROR if argument constraints fail, or ARM_CMSIS_NN_SUCCESS on successful completion.
arm_cmsis_nn_status arm_fully_connected_s16 | ( | const cmsis_nn_context * | ctx, |
const cmsis_nn_fc_params * | fc_params, | ||
const cmsis_nn_per_tensor_quant_params * | quant_params, | ||
const cmsis_nn_dims * | input_dims, | ||
const int16_t * | input_data, | ||
const cmsis_nn_dims * | filter_dims, | ||
const int8_t * | filter_data, | ||
const cmsis_nn_dims * | bias_dims, | ||
const int64_t * | bias_data, | ||
const cmsis_nn_dims * | output_dims, | ||
int16_t * | output_data | ||
) |
Basic s16 Fully Connected function.
[in,out] | ctx | Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. Optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons. |
[in] | fc_params | Fully Connected layer parameters. fc_params->input_offset: 0. fc_params->filter_offset: 0. fc_params->output_offset: 0. |
[in] | quant_params | Per-tensor quantization info. It contains the multiplier and shift value to be applied to the output tensor. |
[in] | input_dims | Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as N x (H * W * C_IN). |
[in] | input_data | Input (activation) data pointer. Data type: int16 |
[in] | filter_dims | Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H and W: not used. |
[in] | filter_data | Filter data pointer. Data type: int8 |
[in] | bias_dims | Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used. |
[in] | bias_data | Bias data pointer. Data type: int64 |
[in] | output_dims | Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H and W: not used. |
[in,out] | output_data | Output data pointer. Data type: int16 |
ARM_CMSIS_NN_SUCCESS
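The dimension rules in the table above (input flattened to N x (H * W * C_IN), filter accumulation depth equal to H * W * C_IN) can be sketched as follows for the 16-bit activation, 8-bit weight, 64-bit bias combination. The `shape4` struct and both function names are hypothetical stand-ins introduced for this example, not library types, and the sketch stops at the 64-bit accumulator without requantizing to int16.

```c
#include <stdint.h>

/* Hypothetical 4-D shape used only for this sketch. */
typedef struct {
    int32_t n, h, w, c;
} shape4;

/* Accumulation depth: the input is viewed as N rows of H * W * C values. */
int32_t accum_depth_of(const shape4 *in_dims)
{
    return in_dims->h * in_dims->w * in_dims->c;
}

/* Reference accumulation for int16 activations and int8 weights.
 * Assumption: weights are row-major [out_ch][accum_depth].
 * acc_out holds N * out_ch values; bias may be NULL. */
void fc_reference_s16(const shape4 *in_dims, const int16_t *input,
                      const int8_t *weights, const int64_t *bias,
                      int32_t out_ch, int64_t *acc_out)
{
    const int32_t depth = accum_depth_of(in_dims);
    for (int32_t b = 0; b < in_dims->n; b++) {
        for (int32_t c = 0; c < out_ch; c++) {
            int64_t acc = bias ? bias[c] : 0;
            for (int32_t i = 0; i < depth; i++) {
                acc += (int64_t)input[b * depth + i] * (int64_t)weights[c * depth + i];
            }
            acc_out[b * out_ch + c] = acc;
        }
    }
}
```

All three offsets are zero for the s16 variant according to the fc_params row, which is why no offset terms appear in the inner loop.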
arm_cmsis_nn_status arm_fully_connected_s4 | ( | const cmsis_nn_context * | ctx, |
const cmsis_nn_fc_params * | fc_params, | ||
const cmsis_nn_per_tensor_quant_params * | quant_params, | ||
const cmsis_nn_dims * | input_dims, | ||
const int8_t * | input_data, | ||
const cmsis_nn_dims * | filter_dims, | ||
const int8_t * | filter_data, | ||
const cmsis_nn_dims * | bias_dims, | ||
const int32_t * | bias_data, | ||
const cmsis_nn_dims * | output_dims, | ||
int8_t * | output_data | ||
) |
Basic s4 Fully Connected function.
[in,out] | ctx | Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. Optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons. |
[in] | fc_params | Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127]. |
[in] | quant_params | Per-tensor quantization info. It contains the multiplier and shift value to be applied to the output tensor. |
[in] | input_dims | Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as N x (H * W * C_IN). |
[in] | input_data | Input (activation) data pointer. Data type: int8 |
[in] | filter_dims | Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H and W: not used. |
[in] | filter_data | Filter data pointer. Data type: int8_t holding packed 4-bit weights, e.g. four sequential weights [0x1, 0x2, 0x3, 0x4] packed as [0x21, 0x43]. |
[in] | bias_dims | Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used. |
[in] | bias_data | Bias data pointer. Data type: int32 |
[in] | output_dims | Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H and W: not used. |
[in,out] | output_data | Output data pointer. Data type: int8 |
ARM_CMSIS_NN_SUCCESS
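The filter_data row above gives the packing convention by example: [0x1, 0x2, 0x3, 0x4] becomes [0x21, 0x43], so the first weight of each pair sits in the low nibble. A minimal sketch of packing and unpacking under that convention follows. The helper names are hypothetical, the sketch assumes an even number of weights, and treating each nibble as a signed two's-complement value in [-8, 7] is an assumption based on the "s4" naming.

```c
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 4-bit two's-complement nibble (0..15) to int8_t. */
static int8_t sign_extend_s4(uint8_t nibble)
{
    return (int8_t)((nibble ^ 0x08) - 0x08);
}

/* Pack pairs of 4-bit values: first value in the low nibble, second in
 * the high nibble. count must be even in this sketch. */
void pack_s4(const int8_t *vals, size_t count, int8_t *packed)
{
    for (size_t i = 0; i + 1 < count; i += 2) {
        uint8_t lo = (uint8_t)vals[i] & 0x0F;
        uint8_t hi = (uint8_t)vals[i + 1] & 0x0F;
        packed[i / 2] = (int8_t)(uint8_t)(lo | (hi << 4));
    }
}

/* Inverse of pack_s4: expands count / 2 bytes into count signed values. */
void unpack_s4(const int8_t *packed, size_t count, int8_t *vals)
{
    for (size_t i = 0; i + 1 < count; i += 2) {
        uint8_t byte = (uint8_t)packed[i / 2];
        vals[i] = sign_extend_s4(byte & 0x0F);
        vals[i + 1] = sign_extend_s4((uint8_t)(byte >> 4));
    }
}
```

Packing halves the weight storage, at the cost of a mask, shift, and sign extension per weight when the values are read back.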
arm_cmsis_nn_status arm_fully_connected_s8 | ( | const cmsis_nn_context * | ctx, |
const cmsis_nn_fc_params * | fc_params, | ||
const cmsis_nn_per_tensor_quant_params * | quant_params, | ||
const cmsis_nn_dims * | input_dims, | ||
const int8_t * | input_data, | ||
const cmsis_nn_dims * | filter_dims, | ||
const int8_t * | filter_data, | ||
const cmsis_nn_dims * | bias_dims, | ||
const int32_t * | bias_data, | ||
const cmsis_nn_dims * | output_dims, | ||
int8_t * | output_data | ||
) |
Basic s8 Fully Connected function.
[in,out] | ctx | Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. Optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons. |
[in] | fc_params | Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127]. |
[in] | quant_params | Per-tensor quantization info. It contains the multiplier and shift value to be applied to the output tensor. |
[in] | input_dims | Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as N x (H * W * C_IN). |
[in] | input_data | Input (activation) data pointer. Data type: int8 |
[in] | filter_dims | Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H and W: not used. |
[in] | filter_data | Filter data pointer. Data type: int8 |
[in] | bias_dims | Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used. |
[in] | bias_data | Bias data pointer. Data type: int32 |
[in] | output_dims | Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H and W: not used. |
[in,out] | output_data | Output data pointer. Data type: int8 |
ARM_CMSIS_NN_ARG_ERROR if argument constraints fail, or ARM_CMSIS_NN_SUCCESS on successful completion.
arm_cmsis_nn_status arm_fully_connected_wrapper_s8 | ( | const cmsis_nn_context * | ctx, |
const cmsis_nn_fc_params * | fc_params, | ||
const cmsis_nn_quant_params * | quant_params, | ||
const cmsis_nn_dims * | input_dims, | ||
const int8_t * | input_data, | ||
const cmsis_nn_dims * | filter_dims, | ||
const int8_t * | filter_data, | ||
const cmsis_nn_dims * | bias_dims, | ||
const int32_t * | bias_data, | ||
const cmsis_nn_dims * | output_dims, | ||
int8_t * | output_data | ||
) |
s8 Fully Connected layer wrapper function
[in,out] | ctx | Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. Optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons. |
[in] | fc_params | Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127]. |
[in] | quant_params | Per-channel or per-tensor quantization info. Check the struct definition for details. It contains the multiplier and shift value(s) to be applied to each output channel. |
[in] | input_dims | Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as N x (H * W * C_IN). |
[in] | input_data | Input (activation) data pointer. Data type: int8 |
[in] | filter_dims | Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H and W: not used. |
[in] | filter_data | Filter data pointer. Data type: int8 |
[in] | bias_dims | Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used. |
[in] | bias_data | Bias data pointer. Data type: int32 |
[in] | output_dims | Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H and W: not used. |
[in,out] | output_data | Output data pointer. Data type: int8 |
ARM_CMSIS_NN_ARG_ERROR if argument constraints fail, or ARM_CMSIS_NN_SUCCESS on successful completion.
arm_cmsis_nn_status arm_vector_sum_s8 | ( | int32_t * | vector_sum_buf, |
const int32_t | vector_cols, | ||
const int32_t | vector_rows, | ||
const int8_t * | vector_data, | ||
const int32_t | lhs_offset, | ||
const int32_t | rhs_offset, | ||
const int32_t * | bias_data | ||
) |
Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s32 bias_data.
[in,out] | vector_sum_buf | Buffer for vector sums |
[in] | vector_cols | Number of vector columns |
[in] | vector_rows | Number of vector rows |
[in] | vector_data | Vector of weights data |
[in] | lhs_offset | Constant multiplied with each sum |
[in] | rhs_offset | Constant added to each vector element before sum |
[in] | bias_data | Vector of bias data, added to each sum. |
ARM_CMSIS_NN_SUCCESS
- Successful operation
arm_cmsis_nn_status arm_vector_sum_s8_s64 | ( | int64_t * | vector_sum_buf, |
const int32_t | vector_cols, | ||
const int32_t | vector_rows, | ||
const int8_t * | vector_data, | ||
const int32_t | lhs_offset, | ||
const int64_t * | bias_data | ||
) |
Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s64 bias_data.
[in,out] | vector_sum_buf | Buffer for vector sums |
[in] | vector_cols | Number of vector columns |
[in] | vector_rows | Number of vector rows |
[in] | vector_data | Vector of weights data |
[in] | lhs_offset | Constant multiplied with each sum |
[in] | bias_data | Vector of bias data, added to each sum. |
ARM_CMSIS_NN_SUCCESS
- Successful operation
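The vector sum description above can be read as: for each row, add rhs_offset to every element, sum the row, multiply the sum by lhs_offset, then add the bias entry if one is supplied. The sketch below follows that reading literally for the 32-bit case. The function name `vector_sum_reference_s8` is hypothetical, and the library's actual order of operations may differ while producing the same result.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference row sums (illustrative only):
 *   sums[r] = lhs_offset * sum_c (data[r][c] + rhs_offset) + bias[r]
 * data is row-major with `rows` rows of `cols` elements each.
 * bias may be NULL, in which case no bias is added. */
void vector_sum_reference_s8(int32_t *sums, int32_t cols, int32_t rows,
                             const int8_t *data, int32_t lhs_offset,
                             int32_t rhs_offset, const int32_t *bias)
{
    for (int32_t r = 0; r < rows; r++) {
        int32_t sum = 0;
        for (int32_t c = 0; c < cols; c++) {
            sum += (int32_t)data[r * cols + c] + rhs_offset;
        }
        sum *= lhs_offset;
        if (bias != NULL) {
            sum += bias[r];
        }
        sums[r] = sum;
    }
}
```

A precomputed term of this shape is useful because sum_i (x_i + a) * w_i = sum_i x_i * w_i + a * sum_i w_i: the second term depends only on the weights and the offset, so it can be calculated once and reused for every input.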