Across all software interfaces to Arm NN there is a set of common configuration parameters.

These parameters control how a model is loaded or how the inference is executed. The widest set of options is available at the lowest level, the Arm NN C++ interface. The set narrows as you move outward to the TfLite delegate. The tables below describe each argument and the interfaces in which it is available.

Compute device selection

A compute device must be specified in every interface. The device selection dictates the availability of some parameters and whether some subgraphs are supported.

Interface	Device selection
Arm NN	The parameter "const std::vector<BackendId>& backendPreferences" to armnn::Optimize provides a vector of backend ids. If multiple devices are specified, the order of the vector dictates the order in which execution will be attempted. If all or part of the model is not supported by a backend, the next in order will be tried.
TfLite delegate	armnnDelegate::DelegateOptions Compute device or backend ids: This tells Arm NN which devices will be used to process the inference. A single device can be specified using the armnn::Compute enum. Multiple devices can be specified using a vector of armnn::BackendId. If multiple devices are specified, the order of the vector dictates the order in which execution will be attempted. If all or part of the model is not supported by a backend, the next in order will be tried. Valid backend ids are: [EthosNAcc/GpuAcc/CpuAcc/CpuRef]
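
The fallback order described above can be illustrated with a small self-contained model. This is a sketch only, not Arm NN's optimizer: the SupportTable type and the AssignBackend function are hypothetical names introduced here, and backend support is assumed to be a simple lookup of layer type per backend id.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a backend capability query (not an Arm NN type):
// maps a backend id to the set of layer types that backend supports.
using SupportTable = std::map<std::string, std::set<std::string>>;

// Walks the preference list in order and returns the first backend that
// supports the given layer type. Returns an empty string if none does.
std::string AssignBackend(const std::string& layerType,
                          const std::vector<std::string>& backendPreferences,
                          const SupportTable& support)
{
    for (const std::string& backend : backendPreferences)
    {
        const auto it = support.find(backend);
        if (it != support.end() && it->second.count(layerType) > 0)
        {
            return backend;
        }
    }
    return std::string();
}
```

With the preference order GpuAcc, CpuAcc, CpuRef, a layer that only CpuAcc and CpuRef support lands on CpuAcc, because it is the first backend in the vector that can run it.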

Runtime options

There are several levels at which Arm NN accepts runtime parameters. Some of these are specific to an Arm NN instance, some to a loaded network and some to the backend on which a network inference is to execute. Each of the external interfaces handles these options in different ways.

Arm NN Instance level options

In the Arm NN C++ interface these options are set by passing an armnn::IRuntime::CreationOptions struct to IRuntime::Create. Not all available options are described here.

Arm NN Parameter	Delegate	Values	Description
m_DynamicBackendsPath	dynamic-backends-path	String file path	A path in which Arm NN will search for dynamic backends to load.
m_ProtectedMode	(Not Available)	["true"/"false"]	Setting this flag allows the user to create the Runtime in protected mode. All inferences then run on protected memory, and INetworkProperties::m_ImportEnabled is set to true with the MemorySource::DmaBufProtected option. This requires that the backend supports Protected Memory and has an allocator capable of allocating Protected Memory associated with it.
m_CustomAllocatorMap	(Not Available)	std::map<BackendId, std::shared_ptr<ICustomAllocator>>	A map of custom allocators used for allocating working memory in the backends. Required for Protected Mode in order to correctly allocate Protected Memory.
m_MemoryOptimizerStrategyMap	(Not Available)	std::map<BackendId, std::shared_ptr<IMemoryOptimizerStrategy>>	A map to define a custom memory optimizer strategy for specific backend Ids.
m_GpuAccTunedParameters	gpu-tuning-level	["0"/"1"/"2"/"3"]	0=UseOnly (default), 1=RapidTuning, 2=NormalTuning, 3=ExhaustiveTuning. Requires the gpu-tuning-file option. Levels 1, 2 and 3 create a tuning file; level 0 applies the tunings from an existing file.
(Not Available)	disable-tflite-runtime-fallback	["true"/"false"]	Disable TfLite Runtime fallback in the Arm NN TfLite delegate. An exception will be thrown if unsupported operators are encountered. This option is only for testing purposes.
armnn::ConfigureLogging	logging-severity	[Trace/Debug/Info/Warning/Error/Fatal]	Set the level of logging information output by Arm NN.
armnn::IOptimizedNetworkPtr->SerializeToDot	serialize-to-dot	String file path	Serialize the optimized network to the file specified in "dot" format.
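
As an illustration of the gpu-tuning-level values in the table above, the following sketch maps the delegate's string value to a tuning level and reports whether that level writes a tuning file. The enum and both function names are hypothetical and defined here only for the example; rejecting any other value with an exception is an assumption, not documented Arm NN behaviour.

```cpp
#include <stdexcept>
#include <string>

// Illustrative enum mirroring the four levels listed in the table.
// Defined locally for this sketch; it is not taken from the Arm NN headers.
enum class TuningLevel
{
    UseOnly = 0,
    RapidTuning = 1,
    NormalTuning = 2,
    ExhaustiveTuning = 3
};

// Converts the delegate option string ("0".."3") to a TuningLevel.
// Assumption: any other value is rejected with std::invalid_argument.
TuningLevel ParseTuningLevel(const std::string& value)
{
    if (value == "0") { return TuningLevel::UseOnly; }
    if (value == "1") { return TuningLevel::RapidTuning; }
    if (value == "2") { return TuningLevel::NormalTuning; }
    if (value == "3") { return TuningLevel::ExhaustiveTuning; }
    throw std::invalid_argument("gpu-tuning-level must be 0, 1, 2 or 3");
}

// Levels 1 to 3 create a tuning file; level 0 only reads an existing one.
bool CreatesTuningFile(TuningLevel level)
{
    return level != TuningLevel::UseOnly;
}
```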

A specific sub-struct of parameters exists to configure external profiling. This is held as a member, m_ProfilingOptions, of CreationOptions.

Arm NN Parameter	Delegate	Values	Description
m_ProfilingOptions.m_EnableProfiling	enable-external-profiling	["true"/"false"]	Enable external profiling.
m_ProfilingOptions.m_TimelineEnabled	timeline-profiling	["true"/"false"]	Enable Arm Development Studio Timeline events.
m_ProfilingOptions.m_OutgoingCaptureFile	outgoing-capture-file	String file path	Path to a file in which outgoing timeline profiling messages will be stored.
m_ProfilingOptions.m_IncomingCaptureFile	incoming-capture-file	String file path	Path to a file in which incoming timeline profiling messages will be stored.
m_ProfilingOptions.m_FileOnly	file-only-external-profiling	["true"/"false"]	Enable profiling output to file only.
m_ProfilingOptions.m_CapturePeriod	counter-capture-period	Integer (default : 10000)	Value in microseconds of the profiling capture period.
m_ProfilingOptions.m_FileFormat	profiling-file-format	String of ["binary"]	The format of the file used for outputting profiling data. Currently only "binary" is supported.

NetworkOptions

During Network creation you can specify several optional parameters via armnn::NetworkOptions.

Arm NN Parameter	Delegate	Values	Description
ShapeInferenceMethod	infer-output-shape	["true"/"false"]	Infers the output tensor shape from the input tensor shape and validates where applicable.
AllowExpandedDims	allow-expanded-dims	["true"/"false"]	If true, dimensions with a size of 1 are disregarded when validating tensor shapes. Tensor sizes must still match. This is an experimental parameter that is incompatible with infer-output-shape.
profilingEnabled	enable-internal-profiling	["true"/"false"]	Enable JSON profiling in CpuAcc and GpuAcc backends.
detailsMethod	internal-profiling-detail	ProfilingDetailsMethod	Set the detail of internal profiling. Options are DetailsWithEvents and DetailsOnly.
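
The AllowExpandedDims behaviour can be pictured as comparing shapes after removing every dimension of size 1. The sketch below is one plausible reading of that description, not Arm NN's validation code; Shape, SqueezeOnes and ShapesMatch are hypothetical names introduced for the example.

```cpp
#include <vector>

// Hypothetical shape type for this sketch: one entry per dimension.
using Shape = std::vector<unsigned int>;

// Returns a copy of the shape with every dimension of size 1 removed.
Shape SqueezeOnes(const Shape& shape)
{
    Shape result;
    for (unsigned int dim : shape)
    {
        if (dim != 1)
        {
            result.push_back(dim);
        }
    }
    return result;
}

// Strict comparison when allowExpandedDims is false. Otherwise the shapes
// are compared with size-1 dimensions disregarded. Equal squeezed shapes
// always hold the same number of elements, so tensor sizes still match.
bool ShapesMatch(const Shape& a, const Shape& b, bool allowExpandedDims)
{
    if (!allowExpandedDims)
    {
        return a == b;
    }
    return SqueezeOnes(a) == SqueezeOnes(b);
}
```

Under this reading, {1, 224, 224, 3} and {224, 224, 3} match only when the flag is set, while {1, 2, 3} and {3, 2} never match because the remaining dimensions differ in order.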

OptimizerOptions

OptimizerOptions is a set of parameters specifically targeting the Arm NN optimizer. The optimizer runs when a model is being loaded, and these parameters tune its operation.

Arm NN Parameter	Delegate	Values	Description
reduceFp32ToFp16	reduce-fp32-to-fp16	["true"/"false"]	Note: this feature works best if all operators of the model are in Fp32. Arm NN will add conversion layers between layers that weren't in Fp32 in the first place or if the operator is not supported in Fp16. The overhead of these conversions can lead to slower overall performance if too many conversions are required.
reduceFp32ToBf16	reduce-fp32-to-bf16	["true"/"false"]	This feature has been replaced by enabling Fast Math in compute library backend options. This is currently a placeholder option.
debug	debug-data	["true"/"false"]	If the debug flag is set, a DebugLayer is inserted after each layer. The action of each debug layer is backend specific.
importEnabled	memory-import	["true"/"false"]	Instructs the optimizer that this model will be importing its input tensors. This value must match the MemorySource set for input in INetworkProperties.
exportEnabled	(Not available)	["true"/"false"]	Instructs the optimizer that this model will be exporting its output tensors. This value must match the MemorySource set for output in INetworkProperties.

OptimizerOptions::ModelOptions

ModelOptions is a vector of name-value pairs contained inside OptimizerOptions. The options specifically target backends.

GpuAcc backend model options

Arm NN Parameter	Delegate	Values	Description
FastMathEnabled	enable-fast-math	["true"/"false"]	Enables fast_math options in backends that support it. If fastmath is enabled, Arm NN will automatically enable reduceFp32ToFp16 for models which have FP16 weights and biases and FP32 layers.
SaveCachedNetwork	save-cached-network	["true"/"false"]	Enables saving the cached network to the file given with the cached-network-filepath option.
CachedNetworkFilePath	cached-network-filepath	String file path	If non-empty, the given file will be used to load or save the cached network. If the save-cached-network option is given, the cached network is saved to the given file. If it is not given, the cached network is loaded from the given file.
MLGOTuningFilePath	gpu-mlgo-tuning-file	String file path	If non-empty, the given file will be used to load/save MLGO CL tuned parameters.
KernelProfilingEnabled	gpu-kernel-profiling-enabled	["true"/"false"]	Enables GPU kernel profiling.
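
The interaction between SaveCachedNetwork and CachedNetworkFilePath described in the table can be summarised as a small decision function. This is a sketch of the documented rule only; CacheAction and DecideCacheAction are hypothetical names and are not part of the Arm NN API.

```cpp
#include <string>

// Hypothetical result type for this sketch.
enum class CacheAction
{
    None, // no file path given, so the cache is not used
    Save, // write the cached network to the file
    Load  // read the cached network from the file
};

// Applies the rule from the table: an empty path disables caching,
// otherwise the save flag selects between saving and loading.
CacheAction DecideCacheAction(bool saveCachedNetwork,
                              const std::string& cachedNetworkFilePath)
{
    if (cachedNetworkFilePath.empty())
    {
        return CacheAction::None;
    }
    return saveCachedNetwork ? CacheAction::Save : CacheAction::Load;
}
```

A typical workflow under this rule would be a first run with the save flag set to produce the file, followed by later runs with only the file path so the cached network is loaded.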

CpuAcc backend model options

Arm NN Parameter	Delegate	Values	Description
FastMathEnabled	enable-fast-math	["true"/"false"]	Enables fast_math options in backends that support it. If fastmath is enabled, Arm NN will automatically enable reduceFp32ToFp16 for models which have FP16 weights and biases and FP32 layers.
NumberOfThreads	number-of-threads	Integer [1-64]	Sets the number of threads used by the CpuAcc backend. The input value must be between 1 and 64. The default is 0 (the backend decides how many threads to use).
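
The NumberOfThreads rule can be sketched as follows. The function name and the backendChoice parameter are hypothetical, and throwing on values above 64 is an assumption made for the example; the table only states the valid range and the meaning of 0.

```cpp
#include <stdexcept>

// Resolves the thread count for a CpuAcc-style backend in this sketch.
//   requested == 0     : the backend decides (backendChoice is returned)
//   requested in 1..64 : the requested value is used
//   requested > 64     : rejected (assumed behaviour, not documented)
unsigned int ResolveThreadCount(unsigned int requested, unsigned int backendChoice)
{
    if (requested == 0)
    {
        return backendChoice;
    }
    if (requested > 64)
    {
        throw std::invalid_argument("number-of-threads must be between 1 and 64");
    }
    return requested;
}
```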

EthosNAcc backend model options

Arm NN Parameter	Delegate	Values	Description
DisableWinograd	(Not available)	["true"/"false"]	Disables Winograd fast convolution.
StrictPrecision	(Not available)	["true"/"false"]	When enabled, the network is more precise because the Re-quantize operations aren't fused, but it is slower to compile as there will be additional operations. This is currently only supported for the Concat operation.
SaveCachedNetwork	save-cached-network	["true"/"false"]	Enables saving the cached network to the file given with the cached-network-filepath option.
CachedNetworkFilePath	cached-network-filepath	String file path	If non-empty, the given file will be used to load or save the cached network. If the save-cached-network option is given, the cached network is saved to the given file. If it is not given, the cached network is loaded from the given file.