CMSIS-NN
Version 3.0.0
CMSIS NN Software Library
Functions
void arm_q7_to_q15_no_shift (const q7_t *pSrc, q15_t *pDst, uint32_t blockSize)
    Converts the elements of the Q7 vector to a Q15 vector without left-shift.
void arm_q7_to_q15_reordered_no_shift (const q7_t *pSrc, q15_t *pDst, uint32_t blockSize)
    Converts the elements of the Q7 vector to a reordered Q15 vector without left-shift.
void arm_q7_to_q15_reordered_with_offset (const q7_t *src, q15_t *dst, uint32_t block_size, q15_t offset)
    Converts the elements of the Q7 vector to a reordered Q15 vector with an added offset.
void arm_q7_to_q15_with_offset (const q7_t *src, q15_t *dst, uint32_t block_size, q15_t offset)
    Converts the elements of the Q7 vector to a Q15 vector with an added offset.
Perform data type conversion in-between neural network operations
void arm_q7_to_q15_no_shift (const q7_t *pSrc, q15_t *pDst, uint32_t blockSize)
Converts the elements of the q7 vector to q15 vector without left-shift.
[in] | *pSrc | points to the Q7 input vector |
[out] | *pDst | points to the Q15 output vector |
[in] | blockSize | length of the input vector |
The equation used for the conversion process is:
pDst[n] = (q15_t) pSrc[n]; 0 <= n < blockSize.
References arm_nn_read_q7x4_ia() and arm_nn_write_q15x2_ia().
Referenced by arm_avepool_q7_HWC(), arm_convolve_HWC_q7_basic(), and arm_convolve_HWC_q7_basic_nonsquare().
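The conversion equation for arm_q7_to_q15_no_shift amounts to a sign-extending copy. A minimal scalar sketch in C, assuming q7_t and q15_t are the usual 8-bit and 16-bit signed typedefs; the name ref_q7_to_q15_no_shift is hypothetical and is not the library routine, which the References line suggests handles inputs in groups of four:

```c
#include <stdint.h>

/* Local stand-ins for the CMSIS fixed-point typedefs (assumed here). */
typedef int8_t q7_t;
typedef int16_t q15_t;

/* Hypothetical scalar reference: pDst[n] = (q15_t) pSrc[n], 0 <= n < blockSize. */
static void ref_q7_to_q15_no_shift(const q7_t *pSrc, q15_t *pDst, uint32_t blockSize)
{
    for (uint32_t n = 0; n < blockSize; n++) {
        pDst[n] = (q15_t)pSrc[n]; /* sign-extend only, no left shift */
    }
}
```

Because there is no left shift, the numeric value of each element is unchanged; only its storage width grows from 8 to 16 bits.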
void arm_q7_to_q15_reordered_no_shift (const q7_t *pSrc, q15_t *pDst, uint32_t blockSize)
Converts the elements of the q7 vector to reordered q15 vector without left-shift.
[in] | *pSrc | points to the Q7 input vector |
[out] | *pDst | points to the Q15 output vector |
[in] | blockSize | length of the input vector |
This function does the q7 to q15 expansion with re-ordering:
|   A1   |   A2   |   A3   |   A4   |
 0      7 8     15 16    23 24    31
is converted into:
|       A1       |       A3       |   and   |       A2       |       A4       |
 0             15 16            31           0             15 16            31
This looks strange but is natural considering how sign-extension is done at assembly level.
The expansion of the other operand will follow the same rule, so that the end results are the same.
The tail (i.e., the last (N % 4) elements) will still be in original order.
References arm_nn_read_q7x4_ia().
Referenced by arm_convolve_1x1_HWC_q7_fast_nonsquare(), arm_convolve_HWC_q7_fast(), arm_convolve_HWC_q7_fast_nonsquare(), arm_fully_connected_q7(), and arm_fully_connected_q7_opt().
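The re-ordering described for arm_q7_to_q15_reordered_no_shift can be sketched in portable scalar C. This is an illustration under assumptions, not the library code: the name ref_q7_to_q15_reordered_no_shift is hypothetical, the typedefs are local stand-ins, and the output order A1, A3, A2, A4 within each group of four is read off the diagram (the order of the two 32-bit output words is assumed to be as drawn).

```c
#include <stdint.h>

/* Local stand-ins for the CMSIS fixed-point typedefs (assumed here). */
typedef int8_t q7_t;
typedef int16_t q15_t;

/* Hypothetical scalar reference for the reordered expansion.
 * Each full group of four inputs A1 A2 A3 A4 is written as A1 A3 A2 A4;
 * the last (blockSize % 4) elements keep their original order. */
static void ref_q7_to_q15_reordered_no_shift(const q7_t *pSrc, q15_t *pDst, uint32_t blockSize)
{
    uint32_t n = 0;

    /* Full groups of four: swap the two middle elements while widening. */
    for (; n + 4 <= blockSize; n += 4) {
        pDst[n + 0] = (q15_t)pSrc[n + 0]; /* A1 */
        pDst[n + 1] = (q15_t)pSrc[n + 2]; /* A3 */
        pDst[n + 2] = (q15_t)pSrc[n + 1]; /* A2 */
        pDst[n + 3] = (q15_t)pSrc[n + 3]; /* A4 */
    }

    /* Tail: remaining elements are widened in original order. */
    for (; n < blockSize; n++) {
        pDst[n] = (q15_t)pSrc[n];
    }
}
```

A dot product is unaffected by this permutation as long as both operands are permuted the same way, which is why the other operand has to follow the same rule.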
void arm_q7_to_q15_reordered_with_offset (const q7_t *src, q15_t *dst, uint32_t block_size, q15_t offset)
Converts the elements of the q7 vector to reordered q15 vector with an added offset.
References arm_nn_read_q7x4_ia() and arm_nn_write_q15x2_ia().
void arm_q7_to_q15_with_offset (const q7_t *src, q15_t *dst, uint32_t block_size, q15_t offset)
Converts the elements of the q7 vector to a q15 vector with an added offset.
[in] | src | pointer to the q7 input vector |
[out] | dst | pointer to the q15 output vector |
[in] | block_size | length of the input vector |
[in] | offset | q7 offset to be added to each input vector element. |
The equation used for the conversion process is:
dst[n] = (q15_t) src[n] + offset; 0 <= n < block_size.
References arm_nn_read_q7x4_ia() and arm_nn_write_q15x2_ia().
Referenced by arm_convolve_s8() and arm_depthwise_conv_s8_opt().
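The equation for arm_q7_to_q15_with_offset can be illustrated with a scalar loop. A minimal sketch, assuming q7_t and q15_t are the usual 8-bit and 16-bit signed typedefs; the name ref_q7_to_q15_with_offset is hypothetical and is not the library routine:

```c
#include <stdint.h>

/* Local stand-ins for the CMSIS fixed-point typedefs (assumed here). */
typedef int8_t q7_t;
typedef int16_t q15_t;

/* Hypothetical scalar reference:
 * dst[n] = (q15_t) src[n] + offset, 0 <= n < block_size. */
static void ref_q7_to_q15_with_offset(const q7_t *src, q15_t *dst, uint32_t block_size, q15_t offset)
{
    for (uint32_t n = 0; n < block_size; n++) {
        /* Widen first, then add, so the sum is formed at 16-bit precision. */
        dst[n] = (q15_t)((q15_t)src[n] + offset);
    }
}
```

With an offset of 128, for instance, the input range [-128, 127] maps to [0, 255], which no longer fits in 8 bits; this is why the output is a q15 vector.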