ArmNN
 25.11
TosaRescaleOperatorUtils.hpp File Reference

Functions

void CreateRawRescaleTosaOperator (const std::string &inputName, const std::string &outputName, const std::vector< int32_t > &multipliers, const std::vector< int32_t > &shifts, int32_t input_zp, int32_t output_zp, bool input_unsigned, bool output_unsigned, bool double_round, bool scale32, bool per_channel, TosaSerializationOperator **op)
 Creates a raw rescale TOSA operator.
void ComputeMultiplierAndShiftTosaScale32 (double scale, int32_t &multiplier, int32_t &shift)
 Taken from mlir/lib/Dialect/Tosa/Utils/QuantUtils.cpp in the LLVM project. From a scale value, generates multiplier and shift values where the mantissa is in [-1.0, -0.5] or [0.5, 1.0], such that multiplier = mantissa * 2^shift, for 32-bit scaling.
void ComputeMultiplierAndShiftTosaScale16 (double scale, int32_t &multiplier, int32_t &shift)
 Taken from mlir/lib/Dialect/Tosa/Utils/QuantUtils.cpp in the LLVM project. From a scale value, generates multiplier and shift values where the mantissa is in [-1.0, -0.5] or [0.5, 1.0], such that multiplier = mantissa * 2^shift, for 16-bit scaling.
void CreateRescaleTosaOperator (const std::string &inputName, const std::string &outputName, double scale, int32_t input_zp, int32_t output_zp, bool input_unsigned, bool output_unsigned, bool double_round, bool scale32, TosaSerializationOperator **op)
 Creates a TOSA rescale operator.
void CreateRescaleTosaOperatorForWeights (const std::string &inputName, const std::string &outputName, int32_t input_zp, int32_t output_zp, bool input_unsigned, bool output_unsigned, bool double_round, bool scale32, double input_scale, double output_scale, const std::vector< float > &weight_scales, TosaSerializationOperator **op)
 Creates a TOSA rescale operator for weight tensors.

Function Documentation

◆ ComputeMultiplierAndShiftTosaScale16()

void ComputeMultiplierAndShiftTosaScale16 ( double scale,
int32_t & multiplier,
int32_t & shift )
inline

Taken from mlir/lib/Dialect/Tosa/Utils/QuantUtils.cpp in the LLVM project. From a scale value, generates multiplier and shift values where the mantissa is in [-1.0, -0.5] or [0.5, 1.0], such that multiplier = mantissa * 2^shift, for 16-bit scaling.

Definition at line 137 of file TosaRescaleOperatorUtils.hpp.

{
    const double mantissa = std::frexp(scale, &shift);
    auto shiftedM = std::round(mantissa * (int64_t(1) << 15));

    // Can't be greater than 1.0.
    if (!(shiftedM <= (int64_t(1) << 15)))
    {
        throw armnn::Exception("Shifted mantissa exceeds 16 signed bits");
    }

    if (shiftedM == (int64_t(1) << 15))
    {
        shiftedM /= 2;
        shift++;
    }

    // TOSA expects right shift to be positive and embed (1 << 15) into right
    // shift bits.
    shift = (-shift) + 15;

    if (!(shiftedM <= std::numeric_limits<int32_t>::max()))
    {
        throw armnn::Exception("Shifted mantissa exceeds 32-bit signed output type");
    }

    multiplier = static_cast<int32_t>(shiftedM);

    // Shifting tops out at 62 bits. Right shift to make 62 bits the max.
    // The limit of 62 on shift allows the shift to be decomposed as
    // two right shifts of 31.
    if (shift > 62)
    {
        // Shifting the multiplier by more than 31-bits is unnecessary.
        multiplier = multiplier >> std::min<int32_t>(31, shift - 62);
        shift = 62;
    }
}

Referenced by CreateRescaleTosaOperator(), and CreateRescaleTosaOperatorForWeights().
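
The 16-bit decomposition can be tried outside ArmNN with a small standalone sketch. DecomposeScale16 and Scale16Parts are hypothetical names introduced for illustration; the sketch assumes a positive scale, uses a plain assert instead of armnn::Exception, and omits the range checks of the real function. It is meant to show the rounding carry and the clamp of shift to 62, not to replace the library code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type for this sketch (not part of ArmNN).
struct Scale16Parts
{
    int32_t multiplier;
    int32_t shift;
};

// Standalone mirror of the 16-bit decomposition shown in the listing above.
// Assumption: scale is positive; error handling is reduced to an assert.
inline Scale16Parts DecomposeScale16(double scale)
{
    assert(scale > 0.0);

    int exponent = 0;
    // scale == mantissa * 2^exponent, with mantissa in [0.5, 1.0).
    const double mantissa = std::frexp(scale, &exponent);
    int64_t shifted = std::llround(mantissa * 32768.0); // mantissa * 2^15

    // Rounding can carry the mantissa up to exactly 2^15; renormalise.
    if (shifted == 32768)
    {
        shifted /= 2;
        ++exponent;
    }

    // Express the result as a positive right shift.
    int32_t shift = 15 - exponent;
    int32_t multiplier = static_cast<int32_t>(shifted);

    // Clamp the shift to 62, moving the excess into the multiplier.
    if (shift > 62)
    {
        multiplier >>= std::min<int32_t>(31, shift - 62);
        shift = 62;
    }
    return {multiplier, shift};
}
```

With this sketch, DecomposeScale16(0.5) gives multiplier 16384 and shift 15. A scale of 2^-60 would need a shift of 74, so the clamp moves 12 bits into the multiplier and the result is multiplier 4 with shift 62, which still represents 2^-60 exactly.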

◆ ComputeMultiplierAndShiftTosaScale32()

void ComputeMultiplierAndShiftTosaScale32 ( double scale,
int32_t & multiplier,
int32_t & shift )
inline

Taken from mlir/lib/Dialect/Tosa/Utils/QuantUtils.cpp in the LLVM project. From a scale value, generates multiplier and shift values where the mantissa is in [-1.0, -0.5] or [0.5, 1.0], such that multiplier = mantissa * 2^shift, for 32-bit scaling.

Definition at line 94 of file TosaRescaleOperatorUtils.hpp.

{
    const double mantissa = std::frexp(scale, &shift);
    auto shiftedM = std::round(mantissa * (int64_t(1) << 31));

    // Can't be greater than 1.0.
    if (!(shiftedM <= (int64_t(1) << 31)))
    {
        throw armnn::Exception("Shifted mantissa exceeds 32 signed bits");
    }

    if (shiftedM == (int64_t(1) << 31))
    {
        shiftedM /= 2;
        shift++;
    }

    // TOSA expects right shift to be positive, and embed (1 << 31) into right
    // shift bits.
    shift = (-shift) + 31;

    if (!(shiftedM <= std::numeric_limits<int32_t>::max()))
    {
        throw armnn::Exception("Shifted mantissa exceeds 32-bit signed output type");
    }

    multiplier = static_cast<int32_t>(shiftedM);

    // Shifting tops out at 47 bits. Right shift to make 47 bits the max.
    int32_t maxShiftValue = 47;
    if (shift > maxShiftValue)
    {
        multiplier = multiplier >> std::min<int32_t>(31, shift - maxShiftValue);
        shift = maxShiftValue;
    }
}

Referenced by ConvertReduceToTosaOperator(), CreateRescaleTosaOperator(), and CreateRescaleTosaOperatorForWeights().
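
One way to check the 32-bit decomposition is to rebuild the scale from the returned pair, since the result should satisfy scale ≈ multiplier * 2^(-shift). The sketch below is a standalone approximation under stated assumptions: DecomposeScale32, Scale32Parts and Reconstruct are hypothetical names, the scale is assumed positive, and the exception checks of the real function are replaced by an assert.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type for this sketch (not part of ArmNN).
struct Scale32Parts
{
    int32_t multiplier;
    int32_t shift;
};

// Standalone mirror of the 32-bit decomposition shown in the listing above.
// Assumption: scale is positive; error handling is reduced to an assert.
inline Scale32Parts DecomposeScale32(double scale)
{
    assert(scale > 0.0);

    int exponent = 0;
    // scale == mantissa * 2^exponent, with mantissa in [0.5, 1.0).
    const double mantissa = std::frexp(scale, &exponent);
    int64_t shifted = std::llround(mantissa * static_cast<double>(int64_t(1) << 31));

    // Rounding can carry the mantissa up to exactly 2^31; renormalise.
    if (shifted == (int64_t(1) << 31))
    {
        shifted /= 2;
        ++exponent;
    }

    int32_t shift = 31 - exponent;
    int32_t multiplier = static_cast<int32_t>(shifted);

    // Clamp the shift to 47, as the listing does.
    if (shift > 47)
    {
        multiplier >>= std::min<int32_t>(31, shift - 47);
        shift = 47;
    }
    return {multiplier, shift};
}

// Rebuilds the real-valued scale represented by a (multiplier, shift) pair.
inline double Reconstruct(const Scale32Parts& parts)
{
    return std::ldexp(static_cast<double>(parts.multiplier), -parts.shift);
}
```

For example, DecomposeScale32(0.5) yields multiplier 1073741824 (2^30) with shift 31, and Reconstruct returns exactly 0.5. A scale such as 0.1, which has no exact binary representation, is reproduced only approximately.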

◆ CreateRawRescaleTosaOperator()

void CreateRawRescaleTosaOperator ( const std::string & inputName,
const std::string & outputName,
const std::vector< int32_t > & multipliers,
const std::vector< int32_t > & shifts,
int32_t input_zp,
int32_t output_zp,
bool input_unsigned,
bool output_unsigned,
bool double_round,
bool scale32,
bool per_channel,
TosaSerializationOperator ** op )
inline

Creates a raw rescale TOSA operator.

This inline function creates a raw rescale operator for TOSA that adjusts the quantization parameters for an input tensor. It checks that op is not null, that multipliers is not empty and has the same size as shifts, and that multipliers holds exactly one value when per_channel is false and more than one value when per_channel is true. If any check fails, an exception is thrown.

Parameters
inputName: The name of the input tensor.
outputName: The name of the output tensor.
multipliers: A vector of multiplier values for scaling.
shifts: A vector of shift values corresponding to the multipliers.
input_zp: The zero point for the input tensor.
output_zp: The zero point for the output tensor.
input_unsigned: Indicates if the input tensor is unsigned.
output_unsigned: Indicates if the output tensor is unsigned.
double_round: If true, applies double rounding during quantization.
scale32: If true, performs 32-bit scaling; otherwise, 16-bit scaling is used.
per_channel: Determines whether per-channel quantization is applied.
op: Pointer to store the created TosaSerializationOperator.

Definition at line 32 of file TosaRescaleOperatorUtils.hpp.

{
    if (!op)
    {
        throw armnn::Exception("CreateRawRescaleTosaOperator: nullptr op.");
    }

    if (multipliers.empty())
    {
        throw armnn::Exception("CreateRawRescaleTosaOperator: multipliers is empty.");
    }

    if (multipliers.size() != shifts.size())
    {
        throw armnn::Exception("CreateRawRescaleTosaOperator: multipliers and shift not same size.");
    }

    if (multipliers.size() == 1 && per_channel)
    {
        throw armnn::Exception("CreateRawRescaleTosaOperator: \
            multipliers must be greater than 1 if per_channel is true.");
    }

    if (multipliers.size() > 1 && !per_channel)
    {
        throw armnn::Exception("CreateRawRescaleTosaOperator: \
            multipliers size must be 1 if per_channel is false.");
    }

    TosaRescaleAttribute attribute(input_zp,
                                   output_zp,
                                   multipliers,
                                   shifts,
                                   scale32,
                                   double_round,
                                   per_channel,
                                   input_unsigned,
                                   output_unsigned);

    // op
    *op = new TosaSerializationOperator(Op_RESCALE, Attribute_RescaleAttribute, &attribute, {inputName}, {outputName});
    if (!(*op))
    {
        throw armnn::Exception("CreateRescaleTosaOperator: failed to created operator");
    }
}

Referenced by ConvertReduceToTosaOperator(), CreateRescaleTosaOperator(), and CreateRescaleTosaOperatorForWeights().
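
The size rules that tie multipliers, shifts and per_channel together can be summarised in a standalone sketch. ValidateRescaleVectors and IsValidRescaleInput are hypothetical helpers written for illustration; they throw std::invalid_argument rather than armnn::Exception and do not construct any TOSA operator.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper mirroring the vector checks in the listing above.
// Throws std::invalid_argument in place of armnn::Exception.
inline void ValidateRescaleVectors(const std::vector<int32_t>& multipliers,
                                   const std::vector<int32_t>& shifts,
                                   bool per_channel)
{
    if (multipliers.empty())
    {
        throw std::invalid_argument("multipliers is empty");
    }
    if (multipliers.size() != shifts.size())
    {
        throw std::invalid_argument("multipliers and shifts differ in size");
    }
    if (multipliers.size() == 1 && per_channel)
    {
        throw std::invalid_argument("per_channel requires more than one multiplier");
    }
    if (multipliers.size() > 1 && !per_channel)
    {
        throw std::invalid_argument("a single multiplier is required when per_channel is false");
    }
}

// Convenience wrapper that reports the outcome as a bool.
inline bool IsValidRescaleInput(const std::vector<int32_t>& multipliers,
                                const std::vector<int32_t>& shifts,
                                bool per_channel)
{
    try
    {
        ValidateRescaleVectors(multipliers, shifts, per_channel);
        return true;
    }
    catch (const std::invalid_argument&)
    {
        return false;
    }
}
```

Under these rules a single multiplier with per_channel set to false is accepted, as are two or more multipliers with per_channel set to true. The mixed combinations, an empty vector, and mismatched sizes are all rejected.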

◆ CreateRescaleTosaOperator()

void CreateRescaleTosaOperator ( const std::string & inputName,
const std::string & outputName,
double scale,
int32_t input_zp,
int32_t output_zp,
bool input_unsigned,
bool output_unsigned,
bool double_round,
bool scale32,
TosaSerializationOperator ** op )
inline

Creates a TOSA rescale operator.

This inline function computes the multiplier and shift values based on the given scale using either 32-bit or 16-bit scaling. It then creates a raw rescale operator that adjusts the quantization parameters for the input tensor.

Parameters
inputName: The name of the input tensor.
outputName: The name of the output tensor.
scale: The scale factor used to compute the multiplier and shift.
input_zp: The zero point for the input tensor.
output_zp: The zero point for the output tensor.
input_unsigned: Indicates if the input tensor is unsigned.
output_unsigned: Indicates if the output tensor is unsigned.
double_round: If true, uses double rounding for quantization.
scale32: If true, performs 32-bit scaling; otherwise, 16-bit scaling is used.
op: Pointer to a variable that will store the created TosaSerializationOperator.

Definition at line 197 of file TosaRescaleOperatorUtils.hpp.

{
    int32_t multiplier;
    int32_t shift;

    if (scale32)
    {
        ComputeMultiplierAndShiftTosaScale32(scale, multiplier, shift);
    }
    else
    {
        ComputeMultiplierAndShiftTosaScale16(scale, multiplier, shift);
    }

    const std::vector<int32_t> multipliers{multiplier};
    const std::vector<int32_t> shifts{shift};

    CreateRawRescaleTosaOperator(inputName,
                                 outputName,
                                 multipliers,
                                 shifts,
                                 input_zp,
                                 output_zp,
                                 input_unsigned,
                                 output_unsigned,
                                 double_round,
                                 scale32,
                                 false,
                                 op);
}

References ComputeMultiplierAndShiftTosaScale16(), ComputeMultiplierAndShiftTosaScale32(), and CreateRawRescaleTosaOperator().

Referenced by ConvertElementwiseBinaryToTosaOperator(), ConvertLeakyReluToTosaOperator(), ConvertPReluToTosaOperator(), ConvertQuantizeToTosaOperator(), ConvertReduceToTosaOperator(), ConvertReluToTosaOperator(), ConvertResizeToTosaOperator(), ConvertSoftmaxToTosaOperator(), and ConvertSquaredDifferenceToTosaOperator().
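
To make the role of the multiplier and shift concrete, the following sketch shows how a rescale with a single (multiplier, shift) pair might act on one integer value. It is a simplified illustration, not the ArmNN or TOSA reference implementation: ApplyScale and RescaleValue are hypothetical names, only single rounding is modelled (double_round is ignored), there is no saturation to the output type, and shift is assumed to lie in [1, 62].

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiplies by the fixed-point scale and rounds to nearest.
// Computes (value * multiplier + 2^(shift - 1)) >> shift in 64-bit arithmetic.
// Note: right-shifting a negative value is arithmetic on mainstream compilers,
// and guaranteed to be so from C++20 onwards.
inline int32_t ApplyScale(int32_t value, int32_t multiplier, int32_t shift)
{
    assert(shift >= 1 && shift <= 62);
    const int64_t rounding = int64_t(1) << (shift - 1);
    const int64_t product = static_cast<int64_t>(value) * multiplier + rounding;
    return static_cast<int32_t>(product >> shift);
}

// Hypothetical helper: removes the input zero point, scales, then adds the
// output zero point. Saturation to the output range is deliberately omitted.
inline int32_t RescaleValue(int32_t value,
                            int32_t input_zp,
                            int32_t output_zp,
                            int32_t multiplier,
                            int32_t shift)
{
    return ApplyScale(value - input_zp, multiplier, shift) + output_zp;
}
```

With multiplier 1073741824 and shift 31, which together represent a scale of 0.5, ApplyScale maps 10 to 5 and 11 to 6 (5.5 rounded up). RescaleValue(12, 2, 3, 1073741824, 31) first subtracts the input zero point to get 10, scales it to 5, and adds the output zero point to give 8.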

◆ CreateRescaleTosaOperatorForWeights()

void CreateRescaleTosaOperatorForWeights ( const std::string & inputName,
const std::string & outputName,
int32_t input_zp,
int32_t output_zp,
bool input_unsigned,
bool output_unsigned,
bool double_round,
bool scale32,
double input_scale,
double output_scale,
const std::vector< float > & weight_scales,
TosaSerializationOperator ** op )
inline

Creates a TOSA rescale operator for weight tensors.

This function computes a multiplier and a shift value for each weight scale from the combined scale (input_scale * weight_scale) / output_scale. It uses either the 32-bit or the 16-bit calculation, depending on the scale32 flag. The per_channel flag is set to true unless exactly one weight scale is provided. An exception is thrown if any computation fails.

Parameters
inputName: The name of the input tensor.
outputName: The name of the output tensor.
input_zp: The zero point for the input tensor.
output_zp: The zero point for the output tensor.
input_unsigned: Indicates if the input tensor is unsigned.
output_unsigned: Indicates if the output tensor is unsigned.
double_round: If true, uses double rounding for quantization.
scale32: If true, uses 32-bit scaling; otherwise, uses 16-bit scaling.
input_scale: The scaling factor for the input tensor.
output_scale: The scaling factor for the output tensor.
weight_scales: Vector of weight scales for per-channel quantization.
op: Pointer to store the created TosaSerializationOperator.

Definition at line 258 of file TosaRescaleOperatorUtils.hpp.

{
    std::vector<int32_t> op_tensor_multipliers;
    std::vector<int32_t> op_tensor_shifts;
    op_tensor_multipliers.reserve(weight_scales.size());
    op_tensor_shifts.reserve(weight_scales.size());

    for (const float& weight_scale : weight_scales)
    {
        double op_tensor_scale = (input_scale * weight_scale) / output_scale;
        int32_t multiplier;
        int32_t shift;

        if (scale32)
        {
            ComputeMultiplierAndShiftTosaScale32(op_tensor_scale, multiplier, shift);
        }
        else
        {
            ComputeMultiplierAndShiftTosaScale16(op_tensor_scale, multiplier, shift);
        }

        op_tensor_multipliers.push_back(multiplier);
        op_tensor_shifts.push_back(shift);
    }

    bool per_channel = weight_scales.size() == 1 ? false : true;
    CreateRawRescaleTosaOperator(inputName,
                                 outputName,
                                 op_tensor_multipliers,
                                 op_tensor_shifts,
                                 input_zp,
                                 output_zp,
                                 input_unsigned,
                                 output_unsigned,
                                 double_round,
                                 scale32,
                                 per_channel,
                                 op);
}

References ComputeMultiplierAndShiftTosaScale16(), ComputeMultiplierAndShiftTosaScale32(), and CreateRawRescaleTosaOperator().

Referenced by ConvertBatchMatMulToTosaOperator(), ConvertConv2dToTosaOperator(), ConvertConv3dToTosaOperator(), ConvertDepthwiseConv2dToTosaOperator(), and ConvertFullyConnectedToTosaOperator().
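
The per-channel scale combination performed before the multiplier and shift computation can be sketched on its own. CombineWeightScales and CombinedScales are hypothetical names used only for this illustration; the sketch stops at the real-valued scales and the per_channel decision and does not create an operator.

```cpp
#include <vector>

// Hypothetical result type for this sketch (not part of ArmNN).
struct CombinedScales
{
    std::vector<double> scales; // one effective scale per weight scale
    bool per_channel;           // false only when exactly one weight scale is given
};

// Mirrors the loop in the listing above: each effective scale is
// (input_scale * weight_scale) / output_scale.
inline CombinedScales CombineWeightScales(double input_scale,
                                          double output_scale,
                                          const std::vector<float>& weight_scales)
{
    CombinedScales result;
    result.scales.reserve(weight_scales.size());
    for (const float weight_scale : weight_scales)
    {
        result.scales.push_back((input_scale * weight_scale) / output_scale);
    }
    // Same decision as the listing: a single scale means per-tensor rescaling.
    result.per_channel = weight_scales.size() != 1;
    return result;
}
```

For input_scale 0.5, output_scale 0.25 and weight scales {0.5, 0.25}, the effective scales are 1.0 and 0.5 and per_channel is true. With a single weight scale the same call reports per_channel as false, which matches the per-tensor path of CreateRawRescaleTosaOperator.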