ArmNN
 24.08
GpuFsaBackend Class Reference

#include <GpuFsaBackend.hpp>

Inheritance diagram for GpuFsaBackend:

Collaboration diagram for GpuFsaBackend:

Classes

class  ClBackendCustomAllocatorMemoryRegion
 
class  GpuFsaBackendCustomAllocatorWrapper
 

Public Member Functions

 GpuFsaBackend ()
 
 GpuFsaBackend (std::shared_ptr< ICustomAllocator > allocator)
 
 ~GpuFsaBackend ()=default
 
const BackendId & GetId () const override
 
IBackendInternal::IMemoryManagerUniquePtr CreateMemoryManager () const override
 
IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory (const IBackendInternal::IMemoryManagerSharedPtr &memoryManager=nullptr) const override
 
IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory (TensorHandleFactoryRegistry &registry) const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (class TensorHandleFactoryRegistry &tensorHandleFactoryRegistry, const ModelOptions &modelOptions, MemorySourceFlags inputFlags, MemorySourceFlags outputFlags) const override
 
std::vector< ITensorHandleFactory::FactoryId > GetHandleFactoryPreferences () const override
 (Optional) Returns a vector of supported TensorHandleFactory ids in preference order. More...
 
void RegisterTensorHandleFactories (TensorHandleFactoryRegistry &registry) override
 (Optional) Register TensorHandleFactories. Either this method, or CreateMemoryManager() together with IWorkloadFactory::CreateTensor() and IWorkloadFactory::CreateSubtensor(), must be implemented. More...
 
void RegisterTensorHandleFactories (TensorHandleFactoryRegistry &registry, MemorySourceFlags inputFlags, MemorySourceFlags outputFlags) override
 (Optional) Register TensorHandleFactories. Either this method, or CreateMemoryManager() together with IWorkloadFactory::CreateTensor() and IWorkloadFactory::CreateSubtensor(), must be implemented. More...
 
IBackendInternal::IBackendContextPtr CreateBackendContext (const IRuntime::CreationOptions &) const override
 Create the runtime context of the backend. More...
 
IBackendInternal::IBackendProfilingContextPtr CreateBackendProfilingContext (const IRuntime::CreationOptions &, IBackendProfilingPtr &backendProfiling) override
 Create context specifically used for profiling interaction from backends. More...
 
IBackendInternal::ILayerSupportSharedPtr GetLayerSupport () const override
 
OptimizationViews OptimizeSubgraphView (const SubgraphView &subgraph, const ModelOptions &modelOptions) const override
 
std::unique_ptr< ICustomAllocator > GetDefaultAllocator () const override
 Returns the default memory allocator for the backend. More...
 
BackendCapabilities GetCapabilities () const override
 Returns a BackendCapability if the backend lists the capability. The BackendCapability must then be inspected to check whether or not that BackendCapability is supported. Otherwise returns an EmptyOptional if the BackendCapability is unlisted. More...
 
virtual bool UseCustomMemoryAllocator (std::shared_ptr< ICustomAllocator > allocator, armnn::Optional< std::string & >) override
 Signals the backend to use a custom memory allocator provided by the user. More...
 
- Public Member Functions inherited from IBackendInternal
 ~IBackendInternal () override=default
 Allow backends created by the factory function to be destroyed through IBackendInternal. More...
 
virtual IWorkloadFactoryPtr CreateWorkloadFactory (const IMemoryManagerSharedPtr &memoryManager, const ModelOptions &modelOptions) const
 
virtual IWorkloadFactoryPtr CreateWorkloadFactory (class TensorHandleFactoryRegistry &tensorHandleFactoryRegistry, const ModelOptions &modelOptions) const
 
virtual IBackendSpecificModelContextPtr CreateBackendSpecificModelContext (const ModelOptions &modelOptions) const
 
virtual ILayerSupportSharedPtr GetLayerSupport (const ModelOptions &modelOptions) const
 
virtual OptimizationViews OptimizeSubgraphView (const SubgraphView &subgraph) const
 
bool SupportsTensorAllocatorAPI () const
 
ITensorHandleFactory::FactoryId GetBackwardCompatibleFavoriteHandleFactory ()
 
virtual unsigned int GetNumberOfCacheFiles () const
 Returns the number of files cached if backend supports caching. More...
 
virtual ExecutionData CreateExecutionData (WorkingMemDescriptor &workingMemDescriptor) const
 Returns ExecutionData for the backend. More...
 
virtual void UpdateExecutionData (ExecutionData &executionData, WorkingMemDescriptor &workingMemDescriptor) const
 Update the ExecutionData for a layer. More...
 

Static Public Member Functions

static const BackendId & GetIdStatic ()
 
- Static Public Member Functions inherited from IBackendInternal
static constexpr BackendVersion GetApiVersion ()
 Returns the version of the Backend API. More...
 

Public Attributes

std::shared_ptr< GpuFsaBackendCustomAllocatorWrapper > m_CustomAllocator
 
bool m_UsingCustomAllocator = false
 

Additional Inherited Members

- Public Types inherited from IBackendInternal
using IWorkloadFactoryPtr = std::unique_ptr< IWorkloadFactory >
 
using IBackendContextPtr = std::unique_ptr< IBackendContext >
 
using IBackendProfilingContextPtr = std::shared_ptr< arm::pipe::IBackendProfilingContext >
 This is the bridge between backend and backend profiling; we'll keep it in the backend namespace. More...
 
using IBackendProfilingPtr = std::unique_ptr< arm::pipe::IBackendProfiling >
 
using ILayerSupportSharedPtr = std::shared_ptr< ILayerSupport >
 
using IBackendSpecificModelContextPtr = std::shared_ptr< IBackendModelContext >
 
using IMemoryManagerUniquePtr = std::unique_ptr< IMemoryManager >
 
using IMemoryManagerSharedPtr = std::shared_ptr< IMemoryManager >
 
- Protected Member Functions inherited from IBackendInternal
 IBackendInternal ()=default
 Creation must be done through a specific backend interface. More...
 
- Protected Member Functions inherited from IBackend
 IBackend ()
 
virtual ~IBackend ()
 

Detailed Description

Definition at line 54 of file GpuFsaBackend.hpp.

Constructor & Destructor Documentation

◆ GpuFsaBackend() [1/2]

GpuFsaBackend ( )
inline

Definition at line 58 of file GpuFsaBackend.hpp.

GpuFsaBackend() : m_CustomAllocator(nullptr) {};

◆ GpuFsaBackend() [2/2]

GpuFsaBackend ( std::shared_ptr< ICustomAllocator > allocator )
inline

Definition at line 60 of file GpuFsaBackend.hpp.

{
    UseCustomMemoryAllocator(allocator, armnn::EmptyOptional());
}

References GpuFsaBackend::UseCustomMemoryAllocator().

◆ ~GpuFsaBackend()

~GpuFsaBackend ( )
default

Member Function Documentation

◆ CreateBackendContext()

IBackendInternal::IBackendContextPtr CreateBackendContext ( const IRuntime::CreationOptions & ) const
override virtual

Create the runtime context of the backend.

Implementations may return a default-constructed IBackendContextPtr if no context is needed at runtime. Implementations must throw BackendUnavailableException if the backend cannot be used (for example, necessary accelerator hardware is not present). The default implementation always returns a default-constructed pointer.

Reimplemented from IBackendInternal.

Definition at line 198 of file GpuFsaBackend.cpp.

{
    return IBackendContextPtr{new GpuFsaBackendContext{options}};
}

◆ CreateBackendProfilingContext()

IBackendInternal::IBackendProfilingContextPtr CreateBackendProfilingContext ( const IRuntime::CreationOptions & creationOptions,
IBackendProfilingPtr & backendProfiling
)
override virtual

Create context specifically used for profiling interaction from backends.

Reimplemented from IBackendInternal.

Definition at line 203 of file GpuFsaBackend.cpp.

{
    return IBackendProfilingContextPtr{};
}

◆ CreateMemoryManager()

IBackendInternal::IMemoryManagerUniquePtr CreateMemoryManager ( ) const
override virtual

Reimplemented from IBackendInternal.

Definition at line 75 of file GpuFsaBackend.cpp.

{
    if (m_UsingCustomAllocator)
    {
        return std::make_unique<GpuFsaMemoryManager>(m_CustomAllocator);
    }
    return std::make_unique<GpuFsaMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
}

References GpuFsaBackend::m_CustomAllocator, and GpuFsaBackend::m_UsingCustomAllocator.

◆ CreateWorkloadFactory() [1/3]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( class TensorHandleFactoryRegistry & tensorHandleFactoryRegistry,
const ModelOptions & modelOptions,
MemorySourceFlags inputFlags,
MemorySourceFlags outputFlags
) const
override virtual

Reimplemented from IBackendInternal.

Definition at line 111 of file GpuFsaBackend.cpp.

{
    // To allow force import if inputFlags/outputFlags are Undefined, set it as Malloc
    if (inputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
    {
        inputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
    }
    if (outputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
    {
        outputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
    }

    std::shared_ptr<GpuFsaMemoryManager> memoryManager;
    if (m_UsingCustomAllocator)
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(m_CustomAllocator);
    }
    else
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
    }

    std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<GpuFsaTensorHandleFactory>(memoryManager);

    registry.RegisterMemoryManager(memoryManager);
    registry.RegisterFactory(std::move(factory));

    return std::make_unique<GpuFsaWorkloadFactory>(PolymorphicPointerDowncast<GpuFsaMemoryManager>(memoryManager));
}

References GpuFsaBackend::m_CustomAllocator, GpuFsaBackend::m_UsingCustomAllocator, armnn::Malloc, TensorHandleFactoryRegistry::RegisterFactory(), TensorHandleFactoryRegistry::RegisterMemoryManager(), and armnn::Undefined.

◆ CreateWorkloadFactory() [2/3]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( const IBackendInternal::IMemoryManagerSharedPtr & memoryManager = nullptr ) const
override virtual

Implements IBackendInternal.

Definition at line 84 of file GpuFsaBackend.cpp.

{
    return std::make_unique<GpuFsaWorkloadFactory>(PolymorphicPointerDowncast<GpuFsaMemoryManager>(memoryManager));
}

◆ CreateWorkloadFactory() [3/3]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( TensorHandleFactoryRegistry & registry ) const
override virtual

Reimplemented from IBackendInternal.

Definition at line 90 of file GpuFsaBackend.cpp.

{
    std::shared_ptr<GpuFsaMemoryManager> memoryManager;
    if (m_UsingCustomAllocator)
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(m_CustomAllocator);
    }
    else
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
    }

    std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<GpuFsaTensorHandleFactory>(memoryManager);

    registry.RegisterMemoryManager(memoryManager);
    registry.RegisterFactory(std::move(factory));

    return std::make_unique<GpuFsaWorkloadFactory>(PolymorphicPointerDowncast<GpuFsaMemoryManager>(memoryManager));
}

References GpuFsaBackend::m_CustomAllocator, GpuFsaBackend::m_UsingCustomAllocator, TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().
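
The three overloads follow the same pattern: pick a memory manager (the custom allocator if one was set, otherwise a CLBufferAllocator), register it together with a GpuFsaTensorHandleFactory where a registry is supplied, and hand the memory manager to a GpuFsaWorkloadFactory. The following is a minimal usage sketch, assuming the public armnn::BackendRegistryInstance() lookup and a default-constructed TensorHandleFactoryRegistry; these and the include paths come from the wider ArmNN API rather than from this page.

#include <armnn/BackendRegistry.hpp>
#include <armnn/backends/IBackendInternal.hpp>
#include <armnn/backends/TensorHandleFactoryRegistry.hpp>

armnn::IBackendInternal::IWorkloadFactoryPtr MakeGpuFsaWorkloadFactory(
    armnn::TensorHandleFactoryRegistry& registry)
{
    // Instantiate the backend through the backend registry.
    auto backend = armnn::BackendRegistryInstance().GetFactory("GpuFsa")();

    // This overload also registers the GpuFsaTensorHandleFactory and its
    // memory manager with the supplied registry, as shown above.
    return backend->CreateWorkloadFactory(registry);
}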

◆ GetCapabilities()

BackendCapabilities GetCapabilities ( ) const
inline override virtual

Returns a BackendCapability if the backend lists the capability. The BackendCapability must then be inspected to check whether or not that BackendCapability is supported. Otherwise returns an EmptyOptional if the BackendCapability is unlisted.

Reimplemented from IBackendInternal.

Definition at line 100 of file GpuFsaBackend.hpp.

{
    return gpuFsaCapabilities;
};

References armnn::gpuFsaCapabilities.
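
BackendCapabilities is an alias for BackendOptions, so the returned object can be iterated directly; gpuFsaCapabilities lists boolean entries such as "ConstantTensorsAsInputs" and "NonConstWeights". A short sketch, assuming the BackendOptions accessors (GetOptionCount(), GetOption(), GetName(), GetValue()) from the general ArmNN API:

#include <armnn/BackendOptions.hpp>
#include <cstddef>
#include <iostream>

// Print each capability name and its boolean value, e.g. "ConstantTensorsAsInputs = true".
void PrintCapabilities(const armnn::BackendCapabilities& caps)
{
    for (std::size_t i = 0; i < caps.GetOptionCount(); ++i)
    {
        const auto& option = caps.GetOption(i);
        if (option.GetValue().IsBool())
        {
            std::cout << option.GetName() << " = "
                      << (option.GetValue().AsBool() ? "true" : "false") << "\n";
        }
    }
}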

◆ GetDefaultAllocator()

std::unique_ptr< ICustomAllocator > GetDefaultAllocator ( ) const
override virtual

Returns the default memory allocator for the backend.

Returns
- Returns unique pointer to the Default Allocator of the Backend

Reimplemented from IBackendInternal.

Definition at line 215 of file GpuFsaBackend.cpp.

{
    return std::make_unique<GpuFsaBackendDefaultAllocator>();
}

◆ GetHandleFactoryPreferences()

std::vector< ITensorHandleFactory::FactoryId > GetHandleFactoryPreferences ( ) const
override virtual

(Optional) Returns a vector of supported TensorHandleFactory ids in preference order.

Reimplemented from IBackendInternal.

Definition at line 146 of file GpuFsaBackend.cpp.

{
    return std::vector<ITensorHandleFactory::FactoryId> { GpuFsaTensorHandleFactory::GetIdStatic() };
}

References GpuFsaTensorHandleFactory::GetIdStatic().

◆ GetId()

const BackendId& GetId ( ) const
inline override virtual

Implements IBackend.

Definition at line 67 of file GpuFsaBackend.hpp.

{ return GetIdStatic(); }

References GpuFsaBackend::GetIdStatic().

Referenced by GpuFsaBackend::OptimizeSubgraphView().

◆ GetIdStatic()

const BackendId & GetIdStatic ( )
static

Definition at line 69 of file GpuFsaBackend.cpp.

{
    static const BackendId s_Id{GpuFsaBackendId()};
    return s_Id;
}

References armnn::GpuFsaBackendId().

Referenced by GpuFsaBackend::GetId().
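
The id behind GetIdStatic() is the "GpuFsa" string from GpuFsaBackendId(). A hedged sketch of how it is typically consumed, assuming the public armnn::Optimize() and IRuntime APIs, which are not part of this page:

#include <armnn/ArmNN.hpp>

// Select the GpuFsa backend by id when optimizing a network.
armnn::IOptimizedNetworkPtr OptimizeForGpuFsa(const armnn::INetwork& network,
                                              armnn::IRuntime& runtime)
{
    std::vector<armnn::BackendId> preferences = { armnn::BackendId("GpuFsa") };
    return armnn::Optimize(network, preferences, runtime.GetDeviceSpec());
}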

◆ GetLayerSupport()

IBackendInternal::ILayerSupportSharedPtr GetLayerSupport ( ) const
override virtual

Implements IBackendInternal.

Definition at line 209 of file GpuFsaBackend.cpp.

{
    static ILayerSupportSharedPtr layerSupport{new GpuFsaLayerSupport};
    return layerSupport;
}

◆ OptimizeSubgraphView()

OptimizationViews OptimizeSubgraphView ( const SubgraphView & subgraph,
const ModelOptions & modelOptions
) const
override virtual

Reimplemented from IBackendInternal.

Definition at line 220 of file GpuFsaBackend.cpp.

{
    OptimizationViews optimizationViews(modelOptions);

    using namespace arm_compute::experimental::dynamic_fusion;

    auto it = subgraph.end();
    std::map<LayerGuid, Layer*> untouched;
    while (it != subgraph.begin())
    {
        --it;
        Layer& base = *(PolymorphicDowncast<Layer*>(*it));
        untouched.insert({base.GetGuid(), &base});
    }

    GpuFsaLayerSupport supportChecker;
    it = subgraph.end();
    arm_compute::CLCompileContext* compileCtx = &(arm_compute::CLKernelLibrary::get().get_compile_context());

    // Setup the GpuWorkloadContext which will exist for the lifetime of the Graph. This contains the TensorInfos
    std::shared_ptr<GpuWorkloadContext> workloadContext = std::make_shared<GpuWorkloadContext>(compileCtx);
    while (it != subgraph.begin())
    {
        --it;
        Layer& base = *(PolymorphicDowncast<Layer*>(*it));
        // Create a GpuFsaPreCompiledBlob, this contains all of the information needed to execute an operator
        GpuFsaPreCompiledBlob* preCompiledBlobPtr = new GpuFsaPreCompiledBlob();
        preCompiledBlobPtr->workloadContext = workloadContext;
        preCompiledBlobPtr->sketch = std::make_unique<GpuWorkloadSketch>(workloadContext.get());

        // Configure and setup the sketch for each supported op. Their data will be wrapped into a PreCompiled layer
        switch (base.GetType())
        {
            case (LayerType::Activation):
            {
                auto desc = PolymorphicDowncast<const ActivationDescriptor*>(&base.GetParameters());
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                GpuFsaActivationCreateOp(preCompiledBlobPtr, input, *desc);
                break;
            }
            case (LayerType::Cast):
            {
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto output = base.GetOutputSlot(0).GetTensorInfo();
                GpuFsaCastCreateOp(preCompiledBlobPtr, input, output);
                break;
            }
            case (LayerType::Convolution2d):
            {
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto weights = base.GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo();

                auto desc = PolymorphicDowncast<const Convolution2dDescriptor*>(&base.GetParameters());
                if (desc->m_BiasEnabled)
                {
                    auto bias = base.GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
                    GpuFsaConvolution2dCreateOp(preCompiledBlobPtr,
                                                input,
                                                *desc,
                                                weights,
                                                bias);
                }
                else
                {
                    GpuFsaConvolution2dCreateOp(preCompiledBlobPtr,
                                                input,
                                                *desc,
                                                weights,
                                                EmptyOptional());
                }
                break;
            }
            case (LayerType::BatchMatMul):
            {
                auto input0 = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto input1 = base.GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo();
                auto desc = PolymorphicDowncast<const BatchMatMulDescriptor*>(&base.GetParameters());
                GpuFsaBatchMatMulCreateOp(preCompiledBlobPtr, input0, input1, *desc);
                break;
            }
            case (LayerType::DepthwiseConvolution2d):
            {
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto weights = base.GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo();

                auto desc = PolymorphicDowncast<const DepthwiseConvolution2dDescriptor*>(&base.GetParameters());
                if (desc->m_BiasEnabled)
                {
                    auto bias = base.GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
                    GpuFsaDepthwiseConvolution2dCreateOp(preCompiledBlobPtr,
                                                         input,
                                                         *desc,
                                                         weights,
                                                         bias);
                }
                else
                {
                    GpuFsaDepthwiseConvolution2dCreateOp(preCompiledBlobPtr,
                                                         input,
                                                         *desc,
                                                         weights,
                                                         EmptyOptional());
                }
                break;
            }
            case (LayerType::ElementwiseBinary):
            {
                auto desc = PolymorphicDowncast<const ElementwiseBinaryDescriptor*>(&base.GetParameters());
                auto input0 = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto input1 = base.GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo();
                GpuFsaElementwiseBinaryCreateOp(preCompiledBlobPtr, input0, input1, *desc);
                break;
            }
            case (LayerType::Pooling2d):
            {
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto desc = PolymorphicDowncast<const Pooling2dDescriptor*>(&base.GetParameters());
                GpuFsaPooling2dCreateOp(preCompiledBlobPtr, input, *desc);
                break;
            }
            case LayerType::Reshape:
            {
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto desc = PolymorphicDowncast<const ReshapeDescriptor*>(&base.GetParameters());
                GpuFsaReshapeCreateOp(preCompiledBlobPtr, input, *desc);
                break;
            }
            case (LayerType::Resize):
            {
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto desc = PolymorphicDowncast<const ResizeDescriptor*>(&base.GetParameters());
                GpuFsaResizeCreateOp(preCompiledBlobPtr, input, *desc);
                break;
            }
            case (LayerType::Softmax):
            {
                auto input = base.GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo();
                auto output = base.GetOutputSlot(0).GetTensorInfo();

                auto desc = PolymorphicDowncast<const SoftmaxDescriptor*>(&base.GetParameters());
                GpuFsaSoftmaxCreateOp(preCompiledBlobPtr,
                                      input,
                                      output,
                                      *desc);
                break;
            }
            default:
                // unsupported layer for GpuFsa backend
                continue;
        }

        auto compiledBlob =
            std::make_unique<PreCompiledObjectPtr>(preCompiledBlobPtr, DeleteAsType<GpuFsaPreCompiledBlob>);

        IConnectableLayer* preCompiledLayer = optimizationViews.GetINetwork()->AddPrecompiledLayer(
            PreCompiledDescriptor(base.GetNumInputSlots(), base.GetNumOutputSlots()),
            std::move(*compiledBlob),
            armnn::Optional<BackendId>(GetId()),
            "GpuFsa_Pre_Compiled_Layer");

        // Copy the output tensor infos from sub-graph
        for (unsigned int i = 0; i < subgraph.GetNumOutputSlots(); i++)
        {
            preCompiledLayer->GetOutputSlot(i).SetTensorInfo(base.GetOutputSlot(i).GetTensorInfo());
        }

        SubgraphView::SubgraphViewPtr substituteSubgraph =
            CreateSubgraphViewFrom(CreateInputsFrom(&base),
                                   CreateOutputsFrom(&base),
                                   {&base});

        optimizationViews.AddSubstitution({ std::move(*substituteSubgraph), SubgraphView(preCompiledLayer) });

        untouched.erase(base.GetGuid());
    }

    if (optimizationViews.GetSubstitutions().empty())
    {
        optimizationViews.AddUntouchedSubgraph(SubgraphView(subgraph));
    }
    else
    {
        ReportUntouchedLayers(optimizationViews, untouched);
    }

    return optimizationViews;
}

References armnn::Activation, INetwork::AddPrecompiledLayer(), OptimizationViews::AddSubstitution(), OptimizationViews::AddUntouchedSubgraph(), armnn::BatchMatMul, SubgraphView::begin(), armnn::Cast, armnn::Convolution2d, armnn::CreateInputsFrom(), armnn::CreateOutputsFrom(), armnn::CreateSubgraphViewFrom(), armnn::DepthwiseConvolution2d, armnn::ElementwiseBinary, SubgraphView::end(), InputSlot::GetConnectedOutputSlot(), Layer::GetGuid(), GpuFsaBackend::GetId(), OptimizationViews::GetINetwork(), Layer::GetInputSlot(), Layer::GetNumInputSlots(), SubgraphView::GetNumOutputSlots(), Layer::GetNumOutputSlots(), IConnectableLayer::GetOutputSlot(), Layer::GetOutputSlot(), Layer::GetParameters(), OptimizationViews::GetSubstitutions(), OutputSlot::GetTensorInfo(), Layer::GetType(), armnn::GpuFsaActivationCreateOp(), armnn::GpuFsaBatchMatMulCreateOp(), armnn::GpuFsaCastCreateOp(), armnn::GpuFsaConvolution2dCreateOp(), armnn::GpuFsaDepthwiseConvolution2dCreateOp(), armnn::GpuFsaElementwiseBinaryCreateOp(), armnn::GpuFsaPooling2dCreateOp(), armnn::GpuFsaReshapeCreateOp(), armnn::GpuFsaResizeCreateOp(), armnn::GpuFsaSoftmaxCreateOp(), armnn::Pooling2d, armnn::ReportUntouchedLayers(), armnn::Reshape, armnn::Resize, IOutputSlot::SetTensorInfo(), GpuFsaPreCompiledBlob::sketch, armnn::Softmax, and GpuFsaPreCompiledBlob::workloadContext.
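
OptimizeSubgraphView() is not called directly by applications; the optimizer invokes it for the sub-graphs assigned to this backend during armnn::Optimize(). Below is a hedged end-to-end sketch that exercises it with one of the supported layer types (Activation), assuming the public INetwork/IRuntime/Optimize APIs rather than anything defined on this page:

#include <armnn/ArmNN.hpp>

// Build a one-layer network whose Activation node can be fused by
// GpuFsaBackend::OptimizeSubgraphView during armnn::Optimize.
armnn::IOptimizedNetworkPtr BuildAndOptimize(armnn::IRuntime& runtime)
{
    armnn::INetworkPtr net = armnn::INetwork::Create();

    armnn::TensorInfo info({1, 16, 16, 8}, armnn::DataType::Float32);

    armnn::ActivationDescriptor reluDesc;
    reluDesc.m_Function = armnn::ActivationFunction::ReLu;

    armnn::IConnectableLayer* input  = net->AddInputLayer(0);
    armnn::IConnectableLayer* relu   = net->AddActivationLayer(reluDesc, "relu");
    armnn::IConnectableLayer* output = net->AddOutputLayer(0);

    input->GetOutputSlot(0).Connect(relu->GetInputSlot(0));
    relu->GetOutputSlot(0).Connect(output->GetInputSlot(0));
    input->GetOutputSlot(0).SetTensorInfo(info);
    relu->GetOutputSlot(0).SetTensorInfo(info);

    // The optimizer hands the GpuFsa-assigned subgraph to OptimizeSubgraphView,
    // which substitutes it with a pre-compiled layer as shown above.
    return armnn::Optimize(*net, { armnn::BackendId("GpuFsa") }, runtime.GetDeviceSpec());
}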

◆ RegisterTensorHandleFactories() [1/2]

void RegisterTensorHandleFactories ( TensorHandleFactoryRegistry & registry )
override virtual

(Optional) Register TensorHandleFactories. Either this method, or CreateMemoryManager() together with IWorkloadFactory::CreateTensor() and IWorkloadFactory::CreateSubtensor(), must be implemented.

Reimplemented from IBackendInternal.

Definition at line 151 of file GpuFsaBackend.cpp.

{
    std::shared_ptr<GpuFsaMemoryManager> memoryManager;
    if (m_UsingCustomAllocator)
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(m_CustomAllocator);
    }
    else
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
    }

    std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<GpuFsaTensorHandleFactory>(memoryManager);
    registry.RegisterMemoryManager(memoryManager);
    registry.RegisterFactory(std::move(factory));
}

References GpuFsaBackend::m_CustomAllocator, GpuFsaBackend::m_UsingCustomAllocator, TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().

◆ RegisterTensorHandleFactories() [2/2]

void RegisterTensorHandleFactories ( TensorHandleFactoryRegistry & registry,
MemorySourceFlags inputFlags,
MemorySourceFlags outputFlags
)
override virtual

(Optional) Register TensorHandleFactories. Either this method, or CreateMemoryManager() together with IWorkloadFactory::CreateTensor() and IWorkloadFactory::CreateSubtensor(), must be implemented.

Reimplemented from IBackendInternal.

Definition at line 169 of file GpuFsaBackend.cpp.

{
    // To allow force import if inputFlags/outputFlags are Undefined, set it as Malloc
    if (inputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
    {
        inputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
    }
    if (outputFlags == static_cast<MemorySourceFlags>(MemorySource::Undefined))
    {
        outputFlags = static_cast<MemorySourceFlags>(MemorySource::Malloc);
    }

    std::shared_ptr<GpuFsaMemoryManager> memoryManager;
    if (m_UsingCustomAllocator)
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(m_CustomAllocator);
    }
    else
    {
        memoryManager = std::make_shared<GpuFsaMemoryManager>(std::make_unique<arm_compute::CLBufferAllocator>());
    }

    std::unique_ptr<ITensorHandleFactory> factory = std::make_unique<GpuFsaTensorHandleFactory>(memoryManager);
    registry.RegisterMemoryManager(memoryManager);
    registry.RegisterFactory(std::move(factory));
}

References GpuFsaBackend::m_CustomAllocator, GpuFsaBackend::m_UsingCustomAllocator, armnn::Malloc, TensorHandleFactoryRegistry::RegisterFactory(), TensorHandleFactoryRegistry::RegisterMemoryManager(), and armnn::Undefined.
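
A hedged sketch of calling this overload with explicit import flags; MemorySource and MemorySourceFlags come from the wider ArmNN API and the include paths are indicative. Passing Undefined for either flag is promoted to Malloc by the implementation above.

#include <armnn/MemorySources.hpp>
#include <armnn/Types.hpp>
#include <armnn/backends/IBackendInternal.hpp>
#include <armnn/backends/TensorHandleFactoryRegistry.hpp>

// Register the GpuFsa tensor handle factory and memory manager, requesting
// Malloc-importable input and output memory.
void RegisterGpuFsaHandleFactories(armnn::IBackendInternal& backend,
                                   armnn::TensorHandleFactoryRegistry& registry)
{
    auto mallocFlags = static_cast<armnn::MemorySourceFlags>(armnn::MemorySource::Malloc);
    backend.RegisterTensorHandleFactories(registry, /*inputFlags=*/mallocFlags, /*outputFlags=*/mallocFlags);
}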

◆ UseCustomMemoryAllocator()

virtual bool UseCustomMemoryAllocator ( std::shared_ptr< ICustomAllocator > allocator,
armnn::Optional< std::string & > errMsg
)
inline override virtual

Signals the backend to use a custom memory allocator provided by the user.

Parameters
allocator - a pointer to the provided ICustomAllocator to use with this backend
errMsg - Optional string variable to return error messages
Returns
- Returns true if switching to custom allocator was successful

Reimplemented from IBackendInternal.

Definition at line 105 of file GpuFsaBackend.hpp.

{
    ARMNN_LOG(info) << "Using Custom Allocator for GpuFsaBackend";

    // Set flag to signal the backend to use a custom memory allocator
    m_CustomAllocator = std::make_shared<GpuFsaBackendCustomAllocatorWrapper>(std::move(allocator));
    m_UsingCustomAllocator = true;
    return m_UsingCustomAllocator;
}

References ARMNN_LOG, armnn::info, GpuFsaBackend::m_CustomAllocator, and GpuFsaBackend::m_UsingCustomAllocator.

Referenced by GpuFsaBackend::GpuFsaBackend().
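
As noted above, both the allocator constructor and UseCustomMemoryAllocator() wrap the user allocator in a GpuFsaBackendCustomAllocatorWrapper. Below is a minimal sketch of a host-memory allocator, assuming the allocate()/free()/GetMemorySourceType() virtuals of armnn::ICustomAllocator; a production allocator for this backend would normally provide memory the OpenCL runtime can import (for example dma-buf on Linux).

#include <GpuFsaBackend.hpp>
#include <armnn/backends/ICustomAllocator.hpp>
#include <cstddef>
#include <cstdlib>
#include <memory>

class HostCustomAllocator : public armnn::ICustomAllocator
{
public:
    void* allocate(size_t size, size_t alignment) override
    {
        // std::aligned_alloc requires the size to be a multiple of the alignment.
        size_t align   = (alignment == 0) ? alignof(std::max_align_t) : alignment;
        size_t rounded = ((size + align - 1) / align) * align;
        return std::aligned_alloc(align, rounded);
    }

    void free(void* ptr) override { std::free(ptr); }

    armnn::MemorySource GetMemorySourceType() override
    {
        return armnn::MemorySource::Malloc;
    }
};

// The allocator constructor forwards to UseCustomMemoryAllocator(), which sets
// m_CustomAllocator and m_UsingCustomAllocator as documented above.
armnn::GpuFsaBackend MakeBackendWithCustomAllocator()
{
    return armnn::GpuFsaBackend(std::make_shared<HostCustomAllocator>());
}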

Member Data Documentation

◆ m_CustomAllocator

std::shared_ptr<GpuFsaBackendCustomAllocatorWrapper> m_CustomAllocator

Definition at line 303 of file GpuFsaBackend.hpp.

Referenced by GpuFsaBackend::CreateMemoryManager(), GpuFsaBackend::CreateWorkloadFactory(), GpuFsaBackend::RegisterTensorHandleFactories(), and GpuFsaBackend::UseCustomMemoryAllocator().

◆ m_UsingCustomAllocator

bool m_UsingCustomAllocator = false

Definition at line 304 of file GpuFsaBackend.hpp.

Referenced by GpuFsaBackend::CreateMemoryManager(), GpuFsaBackend::CreateWorkloadFactory(), GpuFsaBackend::RegisterTensorHandleFactories(), and GpuFsaBackend::UseCustomMemoryAllocator().

The documentation for this class was generated from the following files:

GpuFsaBackend.hpp
GpuFsaBackend.cpp