CMSIS-NN  
CMSIS NN Software Library
 
Fully-connected Layer Functions

Content

 GetBufferSizeFC
 

Functions

arm_cmsis_nn_status arm_batch_matmul_s16 (const cmsis_nn_context *ctx, const cmsis_nn_bmm_params *bmm_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_lhs_dims, const int16_t *input_lhs, const cmsis_nn_dims *input_rhs_dims, const int16_t *input_rhs, const cmsis_nn_dims *output_dims, int16_t *output)
 Batch matmul function with 16 bit input and output.
 
arm_cmsis_nn_status arm_batch_matmul_s8 (const cmsis_nn_context *ctx, const cmsis_nn_bmm_params *bmm_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_lhs_dims, const int8_t *input_lhs, const cmsis_nn_dims *input_rhs_dims, const int8_t *input_rhs, const cmsis_nn_dims *output_dims, int8_t *output)
 Batch matmul function with 8 bit input and output.
 
arm_cmsis_nn_status arm_fully_connected_per_channel_s8 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_channel_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input_data, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int32_t *bias_data, const cmsis_nn_dims *output_dims, int8_t *output_data)
 Basic s8 Fully Connected function using per channel quantization.
 
arm_cmsis_nn_status arm_fully_connected_s16 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int16_t *input, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int64_t *bias, const cmsis_nn_dims *output_dims, int16_t *output)
 Basic s16 Fully Connected function.
 
arm_cmsis_nn_status arm_fully_connected_s4 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int32_t *bias, const cmsis_nn_dims *output_dims, int8_t *output)
 Basic s4 Fully Connected function.
 
arm_cmsis_nn_status arm_fully_connected_s8 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_per_tensor_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input, const cmsis_nn_dims *filter_dims, const int8_t *kernel, const cmsis_nn_dims *bias_dims, const int32_t *bias, const cmsis_nn_dims *output_dims, int8_t *output)
 Basic s8 Fully Connected function.
 
arm_cmsis_nn_status arm_fully_connected_wrapper_s8 (const cmsis_nn_context *ctx, const cmsis_nn_fc_params *fc_params, const cmsis_nn_quant_params *quant_params, const cmsis_nn_dims *input_dims, const int8_t *input_data, const cmsis_nn_dims *filter_dims, const int8_t *filter_data, const cmsis_nn_dims *bias_dims, const int32_t *bias_data, const cmsis_nn_dims *output_dims, int8_t *output_data)
 s8 Fully Connected layer wrapper function.
 
arm_cmsis_nn_status arm_vector_sum_s8 (int32_t *vector_sum_buf, const int32_t vector_cols, const int32_t vector_rows, const int8_t *vector_data, const int32_t lhs_offset, const int32_t rhs_offset, const int32_t *bias_data)
 Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s32 bias_data.
 
arm_cmsis_nn_status arm_vector_sum_s8_s64 (int64_t *vector_sum_buf, const int32_t vector_cols, const int32_t vector_rows, const int8_t *vector_data, const int32_t lhs_offset, const int64_t *bias_data)
 Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s64 bias_data.
 

Description

Collection of fully-connected and matrix multiplication functions.

A fully-connected layer is essentially a matrix-vector multiplication with a bias added. The matrix holds the weights, and the input and output vectors hold the activation values. Supported {weight, activation} precisions include {8-bit, 8-bit} and {8-bit, 16-bit}; arm_fully_connected_s4() additionally takes packed 4-bit weights with 8-bit activations.
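As a rough illustration of the matrix-vector view described above, the following plain-C sketch computes the 32-bit accumulators of one fully-connected layer with 8-bit weights and activations. It is not the library's implementation: the name fc_ref_s8 is hypothetical, the row-major [output depth][accumulation depth] weight layout is an assumption, and requantization of the accumulators to the output type is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference, not a CMSIS-NN function.
 * acc_out[b][o] = bias[o] + sum_i (input[b][i] + input_offset) * weights[o][i]
 * Assumed layout: weights stored row-major as [out_ch][accum_depth].
 * bias may be NULL, in which case it is treated as zero. */
void fc_ref_s8(const int8_t *input, const int8_t *weights, const int32_t *bias,
               int32_t input_offset, int32_t batches, int32_t accum_depth,
               int32_t out_ch, int32_t *acc_out)
{
    assert(input != NULL && weights != NULL && acc_out != NULL);
    for (int32_t b = 0; b < batches; ++b) {
        for (int32_t o = 0; o < out_ch; ++o) {
            int32_t acc = (bias != NULL) ? bias[o] : 0;
            for (int32_t i = 0; i < accum_depth; ++i) {
                acc += (input[b * accum_depth + i] + input_offset) *
                       weights[o * accum_depth + i];
            }
            acc_out[b * out_ch + o] = acc;
        }
    }
}
```

For an input of {1, 2}, weights {1, 2, 3, 4} (two output channels), a bias of {10, 20} and a zero input offset, the sketch yields the accumulators {15, 31}.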

Function Documentation

◆ arm_batch_matmul_s16()

arm_cmsis_nn_status arm_batch_matmul_s16 (const cmsis_nn_context *ctx,
const cmsis_nn_bmm_params *bmm_params,
const cmsis_nn_per_tensor_quant_params *quant_params,
const cmsis_nn_dims *input_lhs_dims,
const int16_t *input_lhs,
const cmsis_nn_dims *input_rhs_dims,
const int16_t *input_rhs,
const cmsis_nn_dims *output_dims,
int16_t *output
)

Batch matmul function with 16 bit input and output.

Parameters
[in] ctx: Temporary scratch buffer. The caller is expected to clear the buffer, if applicable, for security reasons. The optional function arm_fully_connected_s8_get_buffer_size() provides the buffer size if an additional buffer is required.
[in] bmm_params: Batch matmul parameters. Adjoint flags are currently unused.
[in] quant_params: Quantization parameters.
[in] input_lhs_dims: Input LHS tensor dimensions. This should be NHWC where LHS.C = RHS.C.
[in] input_lhs: Pointer to the LHS input tensor.
[in] input_rhs_dims: Input RHS tensor dimensions. The RHS is expected to be transposed, so this should be NHWC where LHS.C = RHS.C.
[in] input_rhs: Pointer to the transposed RHS input tensor.
[in] output_dims: Output tensor dimensions.
[out] output: Pointer to the output tensor.
Returns
The function returns ARM_CMSIS_NN_SUCCESS
  1. Supported framework: TensorFlow Lite Micro
  2. Performs row * row matrix multiplication with the RHS transposed.

◆ arm_batch_matmul_s8()

arm_cmsis_nn_status arm_batch_matmul_s8 (const cmsis_nn_context *ctx,
const cmsis_nn_bmm_params *bmm_params,
const cmsis_nn_per_tensor_quant_params *quant_params,
const cmsis_nn_dims *input_lhs_dims,
const int8_t *input_lhs,
const cmsis_nn_dims *input_rhs_dims,
const int8_t *input_rhs,
const cmsis_nn_dims *output_dims,
int8_t *output
)

Batch matmul function with 8 bit input and output.

Parameters
[in] ctx: Temporary scratch buffer. The caller is expected to clear the buffer, if applicable, for security reasons. The optional function arm_fully_connected_s8_get_buffer_size() provides the buffer size if an additional buffer is required.
[in] bmm_params: Batch matmul parameters. Adjoint flags are currently unused.
[in] quant_params: Quantization parameters.
[in] input_lhs_dims: Input LHS tensor dimensions. This should be NHWC where LHS.C = RHS.C.
[in] input_lhs: Pointer to the LHS input tensor.
[in] input_rhs_dims: Input RHS tensor dimensions. The RHS is expected to be transposed, so this should be NHWC where LHS.C = RHS.C.
[in] input_rhs: Pointer to the transposed RHS input tensor.
[in] output_dims: Output tensor dimensions.
[out] output: Pointer to the output tensor.
Returns
The function returns ARM_CMSIS_NN_SUCCESS
  1. Supported framework: TensorFlow Lite Micro
  2. Performs row * row matrix multiplication with the RHS transposed.
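The "row * row" note for the two batch matmul functions can be sketched in plain C as follows. This is an illustrative single-batch reference under stated assumptions, not the library's code: the name bmm_row_by_row_ref is hypothetical, both operands are assumed to be stored row-major with a shared inner dimension k, and the raw 32-bit accumulators are returned without offsets or requantization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-batch reference, not a CMSIS-NN function.
 * lhs   : [lhs_rows][k]
 * rhs_t : the RHS already transposed, stored as [rhs_rows][k]
 * acc   : [lhs_rows][rhs_rows], acc[i][j] = dot(lhs row i, rhs_t row j)
 * Because the RHS is transposed, both operands are read along rows. */
void bmm_row_by_row_ref(const int8_t *lhs, const int8_t *rhs_t,
                        int32_t lhs_rows, int32_t rhs_rows, int32_t k,
                        int32_t *acc)
{
    assert(lhs != NULL && rhs_t != NULL && acc != NULL);
    for (int32_t i = 0; i < lhs_rows; ++i) {
        for (int32_t j = 0; j < rhs_rows; ++j) {
            int32_t sum = 0;
            for (int32_t c = 0; c < k; ++c) {
                sum += lhs[i * k + c] * rhs_t[j * k + c];
            }
            acc[i * rhs_rows + j] = sum;
        }
    }
}
```

With A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], passing B transposed as {5, 7, 6, 8} gives the ordinary product A * B = [[19, 22], [43, 50]].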

◆ arm_fully_connected_per_channel_s8()

arm_cmsis_nn_status arm_fully_connected_per_channel_s8 (const cmsis_nn_context *ctx,
const cmsis_nn_fc_params *fc_params,
const cmsis_nn_per_channel_quant_params *quant_params,
const cmsis_nn_dims *input_dims,
const int8_t *input_data,
const cmsis_nn_dims *filter_dims,
const int8_t *filter_data,
const cmsis_nn_dims *bias_dims,
const int32_t *bias_data,
const cmsis_nn_dims *output_dims,
int8_t *output_data
)

Basic s8 Fully Connected function using per channel quantization.

Parameters
[in,out] ctx: Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. The optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons.
[in] fc_params: Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127].
[in] quant_params: Per-channel quantization info. It contains the multiplier and shift values to be applied to each output channel.
[in] input_dims: Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as an N x (H * W * C_IN) matrix.
[in] input_data: Input (activation) data pointer. Data type: int8.
[in] filter_dims: Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H & W: not used.
[in] filter_data: Filter data pointer. Data type: int8.
[in] bias_dims: Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used.
[in] bias_data: Bias data pointer. Data type: int32.
[in] output_dims: Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H & W: not used.
[in,out] output_data: Output data pointer. Data type: int8.
Returns
The function returns ARM_CMSIS_NN_ARG_ERROR if argument constraints fail, or ARM_CMSIS_NN_SUCCESS on successful completion.
  • Supported framework: TensorFlow Lite
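To make "multiplier and shift values applied to each output channel" more concrete, here is a simplified fixed-point sketch. It assumes the common convention that the multiplier is a Q31 fraction and the shift is a power-of-two exponent, so the scaled value is roughly acc * multiplier * 2^shift / 2^31. The function names are hypothetical, the result is not guaranteed to be bit-exact with the library, and clamping to the full int8 range stands in for whatever activation limits the real function applies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round(acc * multiplier * 2^shift / 2^31).
 * Assumes an arithmetic right shift for negative values, which common
 * compilers provide, and that the result fits in int32_t. */
int32_t requantize_sketch(int32_t acc, int32_t multiplier, int32_t shift)
{
    assert(shift >= -31 && shift <= 30);
    const int64_t prod = (int64_t)acc * multiplier;
    const int32_t total_shift = 31 - shift;
    /* Shift one bit less than needed, add 1, then drop the last bit:
     * this rounds to nearest with a single rounding step. */
    return (int32_t)(((prod >> (total_shift - 1)) + 1) >> 1);
}

/* Hypothetical per-channel step: each output channel o uses its own
 * multiplier[o] and shift[o]; the output offset is then added and the
 * value is clamped to the int8 range (an assumption of this sketch). */
void requantize_per_channel_sketch(const int32_t *acc, const int32_t *multiplier,
                                   const int32_t *shift, int32_t out_ch,
                                   int32_t output_offset, int8_t *out)
{
    for (int32_t o = 0; o < out_ch; ++o) {
        int32_t v = requantize_sketch(acc[o], multiplier[o], shift[o]) + output_offset;
        if (v < INT8_MIN) {
            v = INT8_MIN;
        }
        if (v > INT8_MAX) {
            v = INT8_MAX;
        }
        out[o] = (int8_t)v;
    }
}
```

A per-tensor variant would be the same loop with a single multiplier and shift shared by all channels.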

◆ arm_fully_connected_s16()

arm_cmsis_nn_status arm_fully_connected_s16 (const cmsis_nn_context *ctx,
const cmsis_nn_fc_params *fc_params,
const cmsis_nn_per_tensor_quant_params *quant_params,
const cmsis_nn_dims *input_dims,
const int16_t *input_data,
const cmsis_nn_dims *filter_dims,
const int8_t *filter_data,
const cmsis_nn_dims *bias_dims,
const int64_t *bias_data,
const cmsis_nn_dims *output_dims,
int16_t *output_data
)

Basic s16 Fully Connected function.

Parameters
[in,out] ctx: Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. The optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons.
[in] fc_params: Fully Connected layer parameters. fc_params->input_offset: 0. fc_params->filter_offset: 0. fc_params->output_offset: 0.
[in] quant_params: Per-tensor quantization info. It contains the multiplier and shift value to be applied to the output tensor.
[in] input_dims: Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as an N x (H * W * C_IN) matrix.
[in] input_data: Input (activation) data pointer. Data type: int16.
[in] filter_dims: Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H & W: not used.
[in] filter_data: Filter data pointer. Data type: int8.
[in] bias_dims: Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used.
[in] bias_data: Bias data pointer. Data type: int64.
[in] output_dims: Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H & W: not used.
[in,out] output_data: Output data pointer. Data type: int16.
Returns
The function returns ARM_CMSIS_NN_SUCCESS
  • Supported framework: TensorFlow Lite

◆ arm_fully_connected_s4()

arm_cmsis_nn_status arm_fully_connected_s4 (const cmsis_nn_context *ctx,
const cmsis_nn_fc_params *fc_params,
const cmsis_nn_per_tensor_quant_params *quant_params,
const cmsis_nn_dims *input_dims,
const int8_t *input_data,
const cmsis_nn_dims *filter_dims,
const int8_t *filter_data,
const cmsis_nn_dims *bias_dims,
const int32_t *bias_data,
const cmsis_nn_dims *output_dims,
int8_t *output_data
)

Basic s4 Fully Connected function.

Parameters
[in,out] ctx: Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. The optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons.
[in] fc_params: Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127].
[in] quant_params: Per-tensor quantization info. It contains the multiplier and shift value to be applied to the output tensor.
[in] input_dims: Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as an N x (H * W * C_IN) matrix.
[in] input_data: Input (activation) data pointer. Data type: int8.
[in] filter_dims: Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H & W: not used.
[in] filter_data: Filter data pointer. Data type: int8_t holding packed 4-bit weights, two per byte with the first weight in the low nibble; e.g. four sequential weights [0x1, 0x2, 0x3, 0x4] are packed as [0x21, 0x43].
[in] bias_dims: Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used.
[in] bias_data: Bias data pointer. Data type: int32.
[in] output_dims: Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H & W: not used.
[in,out] output_data: Output data pointer. Data type: int8.
Returns
The function returns ARM_CMSIS_NN_SUCCESS
  • Supported framework: TensorFlow Lite
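The packing example given for filter_data ([0x1, 0x2, 0x3, 0x4] packed as [0x21, 0x43]) implies that the first weight of each pair sits in the low nibble. A minimal sketch of packing and unpacking under that reading is shown below. The helper names pack_s4 and unpack_s4 are hypothetical, the weight count is assumed to be even, and treating the nibbles as signed two's-complement values in [-8, 7] is an assumption based on the "s4" name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 4-bit two's-complement value held in the low nibble. */
static int8_t sign_extend_4(uint8_t nibble)
{
    return (int8_t)(((nibble & 0x0F) ^ 0x08) - 0x08);
}

/* Hypothetical helper: pack `count` 4-bit weights (count must be even).
 * The first weight of each pair goes into the low nibble. */
void pack_s4(const int8_t *weights, size_t count, int8_t *packed)
{
    assert(count % 2 == 0);
    for (size_t i = 0; i < count; i += 2) {
        const uint8_t lo = (uint8_t)weights[i] & 0x0F;
        const uint8_t hi = (uint8_t)weights[i + 1] & 0x0F;
        packed[i / 2] = (int8_t)(uint8_t)(lo | (hi << 4));
    }
}

/* Hypothetical helper: inverse of pack_s4, with sign extension. */
void unpack_s4(const int8_t *packed, size_t count, int8_t *weights)
{
    assert(count % 2 == 0);
    for (size_t i = 0; i < count; i += 2) {
        const uint8_t byte = (uint8_t)packed[i / 2];
        weights[i] = sign_extend_4(byte & 0x0F);
        weights[i + 1] = sign_extend_4((uint8_t)(byte >> 4));
    }
}
```

Packing {1, 2, 3, 4} with this sketch reproduces the documented bytes 0x21 and 0x43, and unpacking them returns the original four weights.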

◆ arm_fully_connected_s8()

arm_cmsis_nn_status arm_fully_connected_s8 (const cmsis_nn_context *ctx,
const cmsis_nn_fc_params *fc_params,
const cmsis_nn_per_tensor_quant_params *quant_params,
const cmsis_nn_dims *input_dims,
const int8_t *input_data,
const cmsis_nn_dims *filter_dims,
const int8_t *filter_data,
const cmsis_nn_dims *bias_dims,
const int32_t *bias_data,
const cmsis_nn_dims *output_dims,
int8_t *output_data
)

Basic s8 Fully Connected function.

Parameters
[in,out] ctx: Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. The optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons.
[in] fc_params: Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127].
[in] quant_params: Per-tensor quantization info. It contains the multiplier and shift value to be applied to the output tensor.
[in] input_dims: Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as an N x (H * W * C_IN) matrix.
[in] input_data: Input (activation) data pointer. Data type: int8.
[in] filter_dims: Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H & W: not used.
[in] filter_data: Filter data pointer. Data type: int8.
[in] bias_dims: Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used.
[in] bias_data: Bias data pointer. Data type: int32.
[in] output_dims: Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H & W: not used.
[in,out] output_data: Output data pointer. Data type: int8.
Returns
The function returns ARM_CMSIS_NN_ARG_ERROR if argument constraints fail, or ARM_CMSIS_NN_SUCCESS on successful completion.
  • Supported framework: TensorFlow Lite
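The relationships among input_dims, filter_dims, bias_dims and output_dims stated in the parameter list can be checked with a few lines of C. The sketch below uses a hypothetical dims4 struct as a stand-in for the library's dimension type, and both function names are made up for illustration; whether the library itself performs these exact checks is not stated in the source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an NHWC dimension struct. */
typedef struct {
    int32_t n;
    int32_t h;
    int32_t w;
    int32_t c;
} dims4;

/* Accumulation depth: the input is viewed as N x (H * W * C_IN). */
int32_t fc_accum_depth(const dims4 *input_dims)
{
    return input_dims->h * input_dims->w * input_dims->c;
}

/* Checks the documented relationships:
 *   filter_dims.n == H * W * C_IN of the input (accumulation depth)
 *   filter_dims.c == output_dims.c             (output depth, C_OUT)
 *   output_dims.n == input_dims.n              (batches)
 *   bias_dims.c   == output_dims.c, when a bias is supplied */
bool fc_dims_consistent(const dims4 *input_dims, const dims4 *filter_dims,
                        const dims4 *bias_dims, const dims4 *output_dims)
{
    return filter_dims->n == fc_accum_depth(input_dims) &&
           filter_dims->c == output_dims->c &&
           output_dims->n == input_dims->n &&
           (bias_dims == NULL || bias_dims->c == output_dims->c);
}
```

For example, a 2 x 4 x 4 x 3 input has an accumulation depth of 48, so a layer with 10 outputs would use filter dimensions with N = 48 and C = 10.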

◆ arm_fully_connected_wrapper_s8()

arm_cmsis_nn_status arm_fully_connected_wrapper_s8 (const cmsis_nn_context *ctx,
const cmsis_nn_fc_params *fc_params,
const cmsis_nn_quant_params *quant_params,
const cmsis_nn_dims *input_dims,
const int8_t *input_data,
const cmsis_nn_dims *filter_dims,
const int8_t *filter_data,
const cmsis_nn_dims *bias_dims,
const int32_t *bias_data,
const cmsis_nn_dims *output_dims,
int8_t *output_data
)

s8 Fully Connected layer wrapper function.

Parameters
[in,out] ctx: Function context (e.g. temporary buffer). Check the function definition file to see if an additional buffer is required. The optional function {API}_get_buffer_size() provides the buffer size if an additional buffer is required. The caller is expected to clear the buffer, if applicable, for security reasons.
[in] fc_params: Fully Connected layer parameters. Range of fc_params->input_offset: [-127, 128]. fc_params->filter_offset: 0. Range of fc_params->output_offset: [-128, 127].
[in] quant_params: Per-channel or per-tensor quantization info. Check the struct definition for details. It contains the multiplier and shift value(s) to be applied to each output channel.
[in] input_dims: Input (activation) tensor dimensions. Format: [N, H, W, C_IN]. The input is treated as an N x (H * W * C_IN) matrix.
[in] input_data: Input (activation) data pointer. Data type: int8.
[in] filter_dims: Two-dimensional filter dimensions. Format: [N, C]. N: accumulation depth, equal to (H * W * C_IN) from input_dims. C: output depth, equal to C_OUT in output_dims. H & W: not used.
[in] filter_data: Filter data pointer. Data type: int8.
[in] bias_dims: Bias tensor dimensions. Format: [C_OUT]. N, H, W: not used.
[in] bias_data: Bias data pointer. Data type: int32.
[in] output_dims: Output tensor dimensions. Format: [N, C_OUT]. N: batches. C_OUT: output depth. H & W: not used.
[in,out] output_data: Output data pointer. Data type: int8.
Returns
The function returns ARM_CMSIS_NN_ARG_ERROR if argument constraints fail, or ARM_CMSIS_NN_SUCCESS on successful completion.
  • Supported framework: TensorFlow Lite

◆ arm_vector_sum_s8()

arm_cmsis_nn_status arm_vector_sum_s8 (int32_t *vector_sum_buf,
const int32_t vector_cols,
const int32_t vector_rows,
const int8_t *vector_data,
const int32_t lhs_offset,
const int32_t rhs_offset,
const int32_t *bias_data
)

Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s32 bias_data.

Parameters
[in,out] vector_sum_buf: Buffer for vector sums.
[in] vector_cols: Number of vector columns.
[in] vector_rows: Number of vector rows.
[in] vector_data: Vector of weights data.
[in] lhs_offset: Constant multiplied with each sum.
[in] rhs_offset: Constant added to each vector element before the sum.
[in] bias_data: Vector of bias data, added to each sum.
Returns
The function returns ARM_CMSIS_NN_SUCCESS - Successful operation
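Read literally, the parameter descriptions above amount to the following plain-C reference. It is a sketch under assumptions rather than the library's code: the name vector_sum_ref is hypothetical, vector_data is assumed to be row-major with vector_cols elements per row, and a NULL bias_data is taken to mean "no bias". A plausible motivation, not stated in the source, is the identity sum((x + lhs_offset) * w) = sum(x * w) + lhs_offset * sum(w), which lets the offset term be computed once per weight row instead of on every inference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference, not a CMSIS-NN function.
 * For each row r:
 *   vector_sum_buf[r] = lhs_offset * sum_c (vector_data[r][c] + rhs_offset)
 *                       + (bias_data ? bias_data[r] : 0) */
void vector_sum_ref(int32_t *vector_sum_buf, int32_t vector_cols,
                    int32_t vector_rows, const int8_t *vector_data,
                    int32_t lhs_offset, int32_t rhs_offset,
                    const int32_t *bias_data)
{
    assert(vector_sum_buf != NULL && vector_data != NULL);
    for (int32_t r = 0; r < vector_rows; ++r) {
        int32_t sum = 0;
        for (int32_t c = 0; c < vector_cols; ++c) {
            sum += vector_data[r * vector_cols + c] + rhs_offset;
        }
        sum *= lhs_offset;
        if (bias_data != NULL) {
            sum += bias_data[r];
        }
        vector_sum_buf[r] = sum;
    }
}
```

For rows {1, 2, 3} and {-1, -2, -3} with lhs_offset = 2, rhs_offset = 1 and bias {10, 20}, the sketch produces {28, 14}.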

◆ arm_vector_sum_s8_s64()

arm_cmsis_nn_status arm_vector_sum_s8_s64 (int64_t *vector_sum_buf,
const int32_t vector_cols,
const int32_t vector_rows,
const int8_t *vector_data,
const int32_t lhs_offset,
const int64_t *bias_data
)

Calculate the sum of each row in vector_data, multiply by lhs_offset and optionally add s64 bias_data.

Parameters
[in,out] vector_sum_buf: Buffer for vector sums.
[in] vector_cols: Number of vector columns.
[in] vector_rows: Number of vector rows.
[in] vector_data: Vector of weights data.
[in] lhs_offset: Constant multiplied with each sum.
[in] bias_data: Vector of bias data, added to each sum.
Returns
The function returns ARM_CMSIS_NN_SUCCESS - Successful operation