ArmNN 24.02
AsyncExecutionSample.cpp

A variant of the SimpleSample application. This sample shows how to run a network multiple times asynchronously, executing it concurrently from several threads while loading it only once.

Note
This is currently an experimental interface
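In a nutshell, the pattern demonstrated below is: load the network once, then have every thread create its own working memory handle and call the thread-safe Execute overload. Distilled from the full listing (using its run, networkIdentifier and tensor bindings):

    auto memHandle = run->CreateWorkingMemHandle(networkIdentifier);
    run->Execute(*memHandle, inputTensors, outputTensors);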
//
// Copyright © 2021 Arm Ltd and Contributors. All rights reserved.
// SPDX-License-Identifier: MIT
//
#include <armnn/Descriptors.hpp>
#include <armnn/INetwork.hpp>
#include <armnn/IRuntime.hpp>
#include <armnn/Utils.hpp>

#include <iostream>
#include <thread>
#include <vector>
/// A simple example of using the ArmNN SDK API to run a network multiple times with different inputs in an asynchronous
/// manner.
///
/// Background info: The usual runtime->EnqueueWorkload, which is used to trigger the execution of a network, is not
/// thread safe. Each workload has memory assigned to it which would be overwritten by each thread.
/// Before we added support for this, you had to load a network multiple times to execute it at the
/// same time. Every time a network is loaded, it takes up memory on your device, so making the
/// execution thread safe significantly reduces the memory footprint of concurrent executions.
/// This example shows you how to execute a model concurrently (on multiple threads) while still only
/// loading it once.
///
/// As in most of our simple samples, the network in this example will ask the user for a single input number for each
/// execution of the network.
/// The network consists of a single fully connected layer with a single neuron. The neuron's weight is set to 1.0f
/// so that the output number is the same as the input.
int main()
{
    using namespace armnn;

    // The first part of this code is very similar to SimpleSample.cpp; check that sample out for comparison.
    // The interesting part starts when the graph is loaded into the runtime.
    float number1;
    std::cout << "Please enter a number for the first iteration: " << std::endl;
    std::cin >> number1;

    float number2;
    std::cout << "Please enter a number for the second iteration: " << std::endl;
    std::cin >> number2;

    // Turn on logging to standard output.
    // This is useful in this sample so that users can learn more about what is going on.
    ConfigureLogging(true, false, LogSeverity::Warning);
    // Construct ArmNN network
    NetworkId networkIdentifier;
    INetworkPtr myNetwork = INetwork::Create();

    float weightsData[] = {1.0f}; // Identity
    TensorInfo weightsInfo(TensorShape({1, 1}), DataType::Float32, 0.0f, 0, true);
    weightsInfo.SetConstant();
    ConstTensor weights(weightsInfo, weightsData);

    // Constant layer that now holds weights data for FullyConnected
    IConnectableLayer* const constantWeightsLayer = myNetwork->AddConstantLayer(weights, "const weights");

    FullyConnectedDescriptor fullyConnectedDesc;
    IConnectableLayer* const fullyConnectedLayer = myNetwork->AddFullyConnectedLayer(fullyConnectedDesc,
                                                                                    "fully connected");
    IConnectableLayer* InputLayer  = myNetwork->AddInputLayer(0);
    IConnectableLayer* OutputLayer = myNetwork->AddOutputLayer(0);

    InputLayer->GetOutputSlot(0).Connect(fullyConnectedLayer->GetInputSlot(0));
    constantWeightsLayer->GetOutputSlot(0).Connect(fullyConnectedLayer->GetInputSlot(1));
    fullyConnectedLayer->GetOutputSlot(0).Connect(OutputLayer->GetInputSlot(0));
    // Create ArmNN runtime
    IRuntime::CreationOptions options; // default options
    IRuntimePtr run = IRuntime::Create(options);

    // Set the tensors in the network.
    TensorInfo inputTensorInfo(TensorShape({1, 1}), DataType::Float32);
    InputLayer->GetOutputSlot(0).SetTensorInfo(inputTensorInfo);
    TensorInfo outputTensorInfo(TensorShape({1, 1}), DataType::Float32);
    fullyConnectedLayer->GetOutputSlot(0).SetTensorInfo(outputTensorInfo);
    constantWeightsLayer->GetOutputSlot(0).SetTensorInfo(weightsInfo);

    // Optimise ArmNN network
    IOptimizedNetworkPtr optNet = Optimize(*myNetwork, {Compute::CpuRef}, run->GetDeviceSpec());
    if (!optNet)
    {
        // This shouldn't happen for this simple sample, with the reference backend.
        // But in general usage Optimize could fail if the hardware at runtime cannot
        // support the model that has been provided.
        std::cerr << "Error: Failed to optimise the input network." << std::endl;
        return 1;
    }
    // Load graph into runtime.
    std::string errmsg; // To hold an eventual error message if loading the network fails

    // Add network properties to enable async execution. The MemorySource::Undefined variables indicate
    // that neither inputs nor outputs will be imported. Importing will be covered in another example.
    INetworkProperties networkProperties(true, MemorySource::Undefined, MemorySource::Undefined);
    run->LoadNetwork(networkIdentifier,
                     std::move(optNet),
                     errmsg,
                     networkProperties);
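    // (LoadNetwork returns a Status; a real application should check it, and errmsg, on failure.)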
    // Create structures for inputs and outputs. A vector of floats for each execution.
    std::vector<std::vector<float>> inputData{{number1}, {number2}};
    std::vector<std::vector<float>> outputData;
    outputData.resize(2, std::vector<float>(1));

    inputTensorInfo = run->GetInputTensorInfo(networkIdentifier, 0);
    inputTensorInfo.SetConstant(true);
    std::vector<InputTensors> inputTensors
    {
        {{0, armnn::ConstTensor(inputTensorInfo, inputData[0].data())}},
        {{0, armnn::ConstTensor(inputTensorInfo, inputData[1].data())}}
    };
    std::vector<OutputTensors> outputTensors
    {
        {{0, armnn::Tensor(run->GetOutputTensorInfo(networkIdentifier, 0), outputData[0].data())}},
        {{0, armnn::Tensor(run->GetOutputTensorInfo(networkIdentifier, 0), outputData[1].data())}}
    };
    // Lambda function to execute the network. We use it as the thread function.
    auto execute = [&](unsigned int executionIndex)
    {
        auto memHandle = run->CreateWorkingMemHandle(networkIdentifier);
        run->Execute(*memHandle, inputTensors[executionIndex], outputTensors[executionIndex]);
    };
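    // Note: CreateWorkingMemHandle is called inside the lambda so that every thread owns its
    // own intermediate tensor memory. Sharing one handle between concurrently running threads
    // would reintroduce the overwriting problem described at the top of this file.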
    // Prepare some threads and let each execute the network with a different input
    std::vector<std::thread> threads;
    for (unsigned int i = 0; i < inputTensors.size(); ++i)
    {
        threads.emplace_back(execute, i);
    }
    // Wait for the threads to finish
    for (std::thread& t : threads)
    {
        if (t.joinable())
        {
            t.join();
        }
    }

    std::cout << "Your numbers were " << outputData[0][0] << " and " << outputData[1][0] << std::endl;
    return 0;
}
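Since the single neuron's weight is 1.0f, the network is an identity function: entering 1 and 2 prints "Your numbers were 1 and 2".

A closing note on the working memory handle: a thread that runs the same network repeatedly does not need a fresh handle per call; it can create one handle and reuse it across sequential executions. A minimal sketch (not part of the sample), reusing run, networkIdentifier, inputTensors and outputTensors from the listing above:

    // One handle, several sequential executions on a single thread.
    auto memHandle = run->CreateWorkingMemHandle(networkIdentifier);
    for (int iter = 0; iter < 3; ++iter)
    {
        // Each Execute call reuses the intermediate tensor memory owned by memHandle.
        run->Execute(*memHandle, inputTensors[0], outputTensors[0]);
    }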