ArmNN
 24.08
NeonBackend Class Reference

#include <NeonBackend.hpp>

Inheritance diagram for NeonBackend:
Collaboration diagram for NeonBackend:

Public Member Functions

 NeonBackend ()=default
 
 ~NeonBackend ()=default
 
const BackendId & GetId () const override
 
IBackendInternal::IMemoryManagerUniquePtr CreateMemoryManager () const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (const IBackendInternal::IMemoryManagerSharedPtr &memoryManager=nullptr) const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (class TensorHandleFactoryRegistry &tensorHandleFactoryRegistry) const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (const IMemoryManagerSharedPtr &memoryManager, const ModelOptions &modelOptions) const override
 
IWorkloadFactoryPtr CreateWorkloadFactory (class TensorHandleFactoryRegistry &tensorHandleFactoryRegistry, const ModelOptions &modelOptions) const override
 
IBackendInternal::IBackendContextPtr CreateBackendContext (const IRuntime::CreationOptions &) const override
 Create the runtime context of the backend. More...
 
IBackendInternal::IBackendProfilingContextPtr CreateBackendProfilingContext (const IRuntime::CreationOptions &, IBackendProfilingPtr &backendProfiling) override
 Create context specifically used for profiling interaction from backends. More...
 
IBackendInternal::ILayerSupportSharedPtr GetLayerSupport () const override
 
IBackendInternal::ILayerSupportSharedPtr GetLayerSupport (const ModelOptions &modelOptions) const override
 
OptimizationViews OptimizeSubgraphView (const SubgraphView &subgraph, const ModelOptions &modelOptions) const override
 
std::vector< ITensorHandleFactory::FactoryId > GetHandleFactoryPreferences () const override
 (Optional) Returns a vector of supported TensorHandleFactory ids in preference order. More...
 
void RegisterTensorHandleFactories (class TensorHandleFactoryRegistry &registry) override
 (Optional) Register TensorHandleFactories. Either this method or the CreateMemoryManager() and IWorkloadFactory::CreateTensor()/IWorkloadFactory::CreateSubtensor() methods must be implemented. More...
 
IBackendInternal::IBackendSpecificModelContextPtr CreateBackendSpecificModelContext (const ModelOptions &modelOptions) const override
 
BackendCapabilities GetCapabilities () const override
 Returns a BackendCapability if the backend lists the capability. The BackendCapability must then be inspected to check whether or not that BackendCapability is supported. Otherwise returns an EmptyOptional if the BackendCapability is unlisted. More...
 
std::unique_ptr< ICustomAllocator > GetDefaultAllocator () const override
 Returns the default memory allocator for the backend. More...
 
- Public Member Functions inherited from IBackendInternal
 ~IBackendInternal () override=default
 Allow backends created by the factory function to be destroyed through IBackendInternal. More...
 
virtual IWorkloadFactoryPtr CreateWorkloadFactory (class TensorHandleFactoryRegistry &tensorHandleFactoryRegistry, const ModelOptions &modelOptions, MemorySourceFlags inputFlags, MemorySourceFlags outputFlags) const
 
virtual OptimizationViews OptimizeSubgraphView (const SubgraphView &subgraph) const
 
bool SupportsTensorAllocatorAPI () const
 
ITensorHandleFactory::FactoryId GetBackwardCompatibleFavoriteHandleFactory ()
 
virtual void RegisterTensorHandleFactories (class TensorHandleFactoryRegistry &registry, MemorySourceFlags inputFlags, MemorySourceFlags outputFlags)
 (Optional) Register TensorHandleFactories. Either this method or the CreateMemoryManager() and IWorkloadFactory::CreateTensor()/IWorkloadFactory::CreateSubtensor() methods must be implemented. More...
 
virtual bool UseCustomMemoryAllocator (std::shared_ptr< ICustomAllocator > allocator, armnn::Optional< std::string & > errMsg)
 Signals the backend to use a custom memory allocator provided by the user. More...
 
virtual unsigned int GetNumberOfCacheFiles () const
 Returns the number of files cached if backend supports caching. More...
 
virtual ExecutionData CreateExecutionData (WorkingMemDescriptor &workingMemDescriptor) const
 Returns ExecutionData for the backend. More...
 
virtual void UpdateExecutionData (ExecutionData &executionData, WorkingMemDescriptor &workingMemDescriptor) const
 Update the ExecutionData for a layer. More...
 

Static Public Member Functions

static const BackendId & GetIdStatic ()
 
- Static Public Member Functions inherited from IBackendInternal
static constexpr BackendVersion GetApiVersion ()
 Returns the version of the Backend API. More...
 

Additional Inherited Members

- Public Types inherited from IBackendInternal
using IWorkloadFactoryPtr = std::unique_ptr< IWorkloadFactory >
 
using IBackendContextPtr = std::unique_ptr< IBackendContext >
 
using IBackendProfilingContextPtr = std::shared_ptr< arm::pipe::IBackendProfilingContext >
 This is the bridge between backend and backend profiling; we'll keep it in the backend namespace. More...
 
using IBackendProfilingPtr = std::unique_ptr< arm::pipe::IBackendProfiling >
 
using ILayerSupportSharedPtr = std::shared_ptr< ILayerSupport >
 
using IBackendSpecificModelContextPtr = std::shared_ptr< IBackendModelContext >
 
using IMemoryManagerUniquePtr = std::unique_ptr< IMemoryManager >
 
using IMemoryManagerSharedPtr = std::shared_ptr< IMemoryManager >
 
- Protected Member Functions inherited from IBackendInternal
 IBackendInternal ()=default
 Creation must be done through a specific backend interface. More...
 
- Protected Member Functions inherited from IBackend
 IBackend ()
 
virtual ~IBackend ()
 

Detailed Description

Definition at line 29 of file NeonBackend.hpp.

Constructor & Destructor Documentation

◆ NeonBackend()

NeonBackend ( )
default

◆ ~NeonBackend()

~NeonBackend ( )
default

Member Function Documentation

◆ CreateBackendContext()

IBackendInternal::IBackendContextPtr CreateBackendContext ( const IRuntime::CreationOptions & ) const
overridevirtual

Create the runtime context of the backend.

Implementations may return a default-constructed IBackendContextPtr if no context is needed at runtime. Implementations must throw BackendUnavailableException if the backend cannot be used (for example, necessary accelerator hardware is not present). The default implementation always returns a default-constructed pointer.

Reimplemented from IBackendInternal.

Definition at line 109 of file NeonBackend.cpp.

110 {
111  return IBackendContextPtr{};
112 }
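The contract described above (return a default-constructed pointer when no runtime context is needed, throw when the backend is unusable) can be sketched with hypothetical stand-in types; `IBackendContext`, `BackendUnavailableException`, and both helper functions below are simplified illustrations, not the ArmNN classes themselves:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the ArmNN types, to illustrate the contract only.
struct IBackendContext { virtual ~IBackendContext() = default; };
using IBackendContextPtr = std::unique_ptr<IBackendContext>;

struct BackendUnavailableException : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// A backend that needs no runtime context returns a default-constructed
// (null) pointer, as NeonBackend does above.
IBackendContextPtr CreateContextForAvailableBackend()
{
    return IBackendContextPtr{};
}

// A backend whose accelerator hardware is missing must throw instead.
IBackendContextPtr CreateContextForMissingBackend(bool hardwarePresent)
{
    if (!hardwarePresent)
    {
        throw BackendUnavailableException("accelerator hardware not present");
    }
    return IBackendContextPtr{};
}
```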

◆ CreateBackendProfilingContext()

IBackendInternal::IBackendProfilingContextPtr CreateBackendProfilingContext ( const IRuntime::CreationOptions & creationOptions,
IBackendProfilingPtr & backendProfiling 
)
overridevirtual

Create context specifically used for profiling interaction from backends.

Reimplemented from IBackendInternal.

Definition at line 114 of file NeonBackend.cpp.

116 {
117  return IBackendProfilingContextPtr{};
118 }

◆ CreateBackendSpecificModelContext()

IBackendInternal::IBackendSpecificModelContextPtr CreateBackendSpecificModelContext ( const ModelOptions & modelOptions) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 120 of file NeonBackend.cpp.

122 {
123  return IBackendSpecificModelContextPtr{new NeonBackendModelContext{modelOptions}};
124 }

Referenced by NeonBackend::CreateWorkloadFactory(), and NeonBackend::GetLayerSupport().

◆ CreateMemoryManager()

IBackendInternal::IMemoryManagerUniquePtr CreateMemoryManager ( ) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 52 of file NeonBackend.cpp.

53 {
54  return std::make_unique<NeonMemoryManager>(std::make_unique<arm_compute::Allocator>(),
55                                             BaseMemoryManager::MemoryAffinity::Offset);
56 }

References BaseMemoryManager::Offset.

◆ CreateWorkloadFactory() [1/4]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( class TensorHandleFactoryRegistry & tensorHandleFactoryRegistry) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 72 of file NeonBackend.cpp.

74 {
75  auto memoryManager = std::make_shared<NeonMemoryManager>(std::make_unique<arm_compute::Allocator>(),
76                                                           BaseMemoryManager::MemoryAffinity::Offset);
77 
78  tensorHandleFactoryRegistry.RegisterMemoryManager(memoryManager);
79 
80  auto factory = std::make_unique<NeonTensorHandleFactory>(memoryManager);
81  // Register copy and import factory pair
82  tensorHandleFactoryRegistry.RegisterCopyAndImportFactoryPair(factory->GetId(), factory->GetId());
83  // Register the factory
84  tensorHandleFactoryRegistry.RegisterFactory(std::move(factory));
85 
86 
87  return std::make_unique<NeonWorkloadFactory>(
88  PolymorphicPointerDowncast<NeonMemoryManager>(memoryManager));
89 }

References BaseMemoryManager::Offset, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().

◆ CreateWorkloadFactory() [2/4]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( class TensorHandleFactoryRegistry & tensorHandleFactoryRegistry,
const ModelOptions & modelOptions 
) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 91 of file NeonBackend.cpp.

93 {
94  auto memoryManager = std::make_shared<NeonMemoryManager>(std::make_unique<arm_compute::Allocator>(),
95                                                           BaseMemoryManager::MemoryAffinity::Offset);
96 
97  tensorHandleFactoryRegistry.RegisterMemoryManager(memoryManager);
98 
99  auto factory = std::make_unique<NeonTensorHandleFactory>(memoryManager);
100  // Register copy and import factory pair
101  tensorHandleFactoryRegistry.RegisterCopyAndImportFactoryPair(factory->GetId(), factory->GetId());
102  // Register the factory
103  tensorHandleFactoryRegistry.RegisterFactory(std::move(factory));
104 
105  return std::make_unique<NeonWorkloadFactory>(
106  PolymorphicPointerDowncast<NeonMemoryManager>(memoryManager), CreateBackendSpecificModelContext(modelOptions));
107 }

References NeonBackend::CreateBackendSpecificModelContext(), BaseMemoryManager::Offset, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().

◆ CreateWorkloadFactory() [3/4]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( const IBackendInternal::IMemoryManagerSharedPtr & memoryManager = nullptr) const
overridevirtual

Implements IBackendInternal.

Definition at line 58 of file NeonBackend.cpp.

60 {
61  return std::make_unique<NeonWorkloadFactory>(
62  PolymorphicPointerDowncast<NeonMemoryManager>(memoryManager));
63 }

◆ CreateWorkloadFactory() [4/4]

IBackendInternal::IWorkloadFactoryPtr CreateWorkloadFactory ( const IMemoryManagerSharedPtr & memoryManager,
const ModelOptions & modelOptions 
) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 65 of file NeonBackend.cpp.

67 {
68  return std::make_unique<NeonWorkloadFactory>(
69  PolymorphicPointerDowncast<NeonMemoryManager>(memoryManager), CreateBackendSpecificModelContext(modelOptions));
70 }

References NeonBackend::CreateBackendSpecificModelContext().
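The ownership pattern in the factory overloads above can be sketched in miniature: the caller and the returned factory co-own one shared memory manager. `MemoryManager` and `WorkloadFactory` here are hypothetical stand-ins, not the ArmNN classes:

```cpp
#include <cassert>
#include <memory>

// Stand-in for NeonMemoryManager: one instance is shared between the
// caller (e.g. a tensor-handle registry) and the workload factory.
struct MemoryManager {};

struct WorkloadFactory
{
    explicit WorkloadFactory(std::shared_ptr<MemoryManager> memoryManager)
        : m_MemoryManager(std::move(memoryManager)) {}
    std::shared_ptr<MemoryManager> m_MemoryManager;
};

std::unique_ptr<WorkloadFactory>
CreateWorkloadFactory(const std::shared_ptr<MemoryManager>& memoryManager)
{
    // The factory takes shared ownership; the caller keeps its reference too.
    return std::make_unique<WorkloadFactory>(memoryManager);
}
```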

◆ GetCapabilities()

BackendCapabilities GetCapabilities ( ) const
inlineoverridevirtual

Returns a BackendCapability if the backend lists the capability. The BackendCapability must then be inspected to check whether or not that BackendCapability is supported. Otherwise returns an EmptyOptional if the BackendCapability is unlisted.

Reimplemented from IBackendInternal.

Definition at line 68 of file NeonBackend.hpp.

69  {
70  return cpuAccCapabilities;
71  };

References armnn::cpuAccCapabilities.

◆ GetDefaultAllocator()

std::unique_ptr< ICustomAllocator > GetDefaultAllocator ( ) const
overridevirtual

Returns the default memory allocator for the backend.

Returns
- Returns unique pointer to the Default Allocator of the Backend

Reimplemented from IBackendInternal.

Definition at line 637 of file NeonBackend.cpp.

638 {
639  return std::make_unique<DefaultAllocator>();
640 }
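The shape of this override (hand out a freshly constructed allocator behind the interface type) can be sketched with hypothetical stand-ins; `ICustomAllocator` and `DefaultAllocator` below are simplified illustrations rather than the ArmNN declarations:

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for ICustomAllocator.
struct ICustomAllocator
{
    virtual ~ICustomAllocator() = default;
    virtual void* allocate(std::size_t size) = 0;
    virtual void free(void* ptr) = 0;
};

// Hypothetical stand-in for DefaultAllocator: plain heap allocation.
struct DefaultAllocator : ICustomAllocator
{
    void* allocate(std::size_t size) override { return std::malloc(size); }
    void free(void* ptr) override { std::free(ptr); }
};

// Mirrors the override above: the backend owns nothing, the caller gets
// a unique_ptr to a fresh allocator.
std::unique_ptr<ICustomAllocator> GetDefaultAllocator()
{
    return std::make_unique<DefaultAllocator>();
}
```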

◆ GetHandleFactoryPreferences()

std::vector< ITensorHandleFactory::FactoryId > GetHandleFactoryPreferences ( ) const
overridevirtual

(Optional) Returns a vector of supported TensorHandleFactory ids in preference order.

Reimplemented from IBackendInternal.

Definition at line 618 of file NeonBackend.cpp.

619 {
620  return std::vector<ITensorHandleFactory::FactoryId>() = { NeonTensorHandleFactory::GetIdStatic() };
621 }

References NeonTensorHandleFactory::GetIdStatic().

◆ GetId()

const BackendId& GetId ( ) const
inlineoverridevirtual

Implements IBackend.

Definition at line 36 of file NeonBackend.hpp.

36 { return GetIdStatic(); }

References NeonBackend::GetIdStatic().

◆ GetIdStatic()

const BackendId & GetIdStatic ( )
static

Definition at line 46 of file NeonBackend.cpp.

47 {
48  static const BackendId s_Id{NeonBackendId()};
49  return s_Id;
50 }

References armnn::NeonBackendId().

Referenced by NeonBackend::GetId().
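GetIdStatic() relies on a function-local static: the id object is constructed once, on first use, and every caller receives a reference to that same object. A minimal sketch, with `std::string` standing in for `BackendId` and assuming `NeonBackendId()` yields "CpuAcc":

```cpp
#include <cassert>
#include <string>

// Function-local static: constructed exactly once, thread-safe since C++11,
// and every call returns a reference to the same object.
const std::string& GetIdStatic()
{
    static const std::string s_Id{"CpuAcc"};
    return s_Id;
}
```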

◆ GetLayerSupport() [1/2]

IBackendInternal::ILayerSupportSharedPtr GetLayerSupport ( ) const
overridevirtual

Implements IBackendInternal.

Definition at line 126 of file NeonBackend.cpp.

127 {
128  static ILayerSupportSharedPtr layerSupport
129  {
130  new NeonLayerSupport{}
131  };
132  return layerSupport;
133 }

◆ GetLayerSupport() [2/2]

IBackendInternal::ILayerSupportSharedPtr GetLayerSupport ( const ModelOptions & modelOptions) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 135 of file NeonBackend.cpp.

136 {
137  static ILayerSupportSharedPtr layerSupport
138  {
139  new NeonLayerSupport(CreateBackendSpecificModelContext(modelOptions))
140  };
141  return layerSupport;
142 }

References NeonBackend::CreateBackendSpecificModelContext().
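Note that `layerSupport` above is a function-local static, so it is built from the ModelOptions of the first call only; later calls with different options receive the same shared object back. A minimal sketch of that behaviour, with an `int` standing in for ModelOptions:

```cpp
#include <cassert>
#include <memory>

// The static is initialised exactly once, from the first call's argument;
// subsequent arguments are ignored and the cached object is returned.
std::shared_ptr<int> GetLayerSupportOnce(int modelOptions)
{
    static std::shared_ptr<int> layerSupport{ new int(modelOptions) };
    return layerSupport;
}
```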

◆ OptimizeSubgraphView()

OptimizationViews OptimizeSubgraphView ( const SubgraphView & subgraph,
const ModelOptions & modelOptions 
) const
overridevirtual

Reimplemented from IBackendInternal.

Definition at line 144 of file NeonBackend.cpp.

146 {
147  OptimizationViews optimizationViews(modelOptions);
148 
149  auto it = subgraph.end();
150  std::map<LayerGuid, Layer*> untouched;
151 
152  while (it != subgraph.begin())
153  {
154  --it;
155  Layer& base = *(PolymorphicDowncast<Layer*>(*it));
156  untouched.insert({base.GetGuid(), &base});
157  }
158 
159  it = subgraph.end();
160  while (it != subgraph.begin())
161  {
162  --it;
163  Layer& base = *(PolymorphicDowncast<Layer*>(*it));
164 
165  // Fuse activation into previous layer if supported by backend
166  if ((base.GetType() == LayerType::DepthwiseConvolution2d || base.GetType() == LayerType::Convolution2d
167  || base.GetType() == LayerType::BatchNormalization || base.GetType() == LayerType::FullyConnected
168  || base.GetType() == LayerType::Addition || base.GetType() == LayerType::Multiplication
169  || base.GetType() == LayerType::Subtraction || base.GetType() == LayerType::Division
170  || base.GetType() == LayerType::ElementwiseBinary)
171  && (base.GetAdditionalInformation<ActivationDescriptor>() == nullptr))
172  {
173  for (auto output = base.BeginOutputSlots(); output != base.EndOutputSlots(); ++output)
174  {
175  if (output->GetNumConnections() == 1)
176  {
177  for (auto&& childInput : output->GetConnections())
178  {
179  if ((childInput->GetOwningLayer().GetType() == LayerType::Activation) &&
180  (checkDataTypeInputandOutput(childInput->GetOwningLayer())))
181  {
182  Layer& child = childInput->GetOwningLayer();
183 
184  auto* activationLayer = PolymorphicDowncast<ActivationLayer*>(&child);
185 
186  const std::string name = std::string("fused-") + child.GetName() + std::string("-into-") +
187  base.GetName();
188 
189  // Get params from activation layer
190  ActivationDescriptor activationDesc = activationLayer->GetParameters();
191 
192  if (base.GetType() == LayerType::Convolution2d)
193  {
194  Convolution2dLayer* baseLayer = PolymorphicDowncast<Convolution2dLayer*>(&base);
195 
196  Optional<TensorInfo> biases;
197 
198  if (baseLayer->GetParameters().m_BiasEnabled)
199  {
200  biases = baseLayer->GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
201  }
202 
203  arm_compute::Status status = NeonConvolution2dWorkloadValidate(
204  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
205  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
206  baseLayer->GetParameters(),
207  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
208  biases,
209  false,
210  &activationDesc);
211 
212  if (status)
213  {
214  FuseConvolution2dLayer<Convolution2dLayer>(optimizationViews,
215  baseLayer,
216  activationLayer,
217  activationDesc,
218  name);
219  untouched.erase(baseLayer->GetGuid());
220  untouched.erase(activationLayer->GetGuid());
221  }
222  }
223  else if (base.GetType() == LayerType::DepthwiseConvolution2d)
224  {
225  DepthwiseConvolution2dLayer* baseLayer =
226  PolymorphicDowncast<DepthwiseConvolution2dLayer*>(&base);
227 
228  Optional<TensorInfo> biases;
229 
230  if (baseLayer->GetParameters().m_BiasEnabled)
231  {
232  biases = baseLayer->GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
233  }
234 
235  arm_compute::Status status = NeonDepthwiseConvolutionWorkloadValidate(
236  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
237  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
238  baseLayer->GetParameters(),
239  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
240  biases,
241  &activationDesc);
242 
243  if (status)
244  {
245  FuseDepthwiseConvolution2dLayer<DepthwiseConvolution2dLayer>(optimizationViews,
246  baseLayer,
247  activationLayer,
248  activationDesc,
249  name);
250  untouched.erase(baseLayer->GetGuid());
251  untouched.erase(activationLayer->GetGuid());
252  }
253  }
254  else if (base.GetType() == LayerType::FullyConnected)
255  {
256  FullyConnectedLayer* baseLayer = PolymorphicDowncast<FullyConnectedLayer*>(&base);
257  FullyConnectedDescriptor descriptor = baseLayer->GetParameters();
258 
259  // As bias is optional only try to get TensorInfo from input if bias is enabled.
260  Optional<TensorInfo> biases;
261  if (descriptor.m_BiasEnabled)
262  {
263  biases = baseLayer->GetInputSlot(2).GetConnectedOutputSlot()->GetTensorInfo();
264  }
265 
266  arm_compute::Status status = NeonFullyConnectedWorkloadValidate(
267  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
268  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
269  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
270  biases,
271  baseLayer->GetParameters(),
272  &activationDesc);
273 
274  if (status)
275  {
276  FuseFullyConnectedLayer<FullyConnectedLayer>(optimizationViews,
277  baseLayer,
278  activationLayer,
279  activationDesc,
280  name);
281  untouched.erase(baseLayer->GetGuid());
282  untouched.erase(activationLayer->GetGuid());
283  }
284  }
285  else if (base.GetType() == LayerType::BatchNormalization)
286  {
287  BatchNormalizationLayer* baseLayer =
288  PolymorphicDowncast<BatchNormalizationLayer*>(&base);
289 
290  arm_compute::Status status = NeonBatchNormalizationValidate(
291  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
292  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
293  baseLayer->m_Mean->GetTensorInfo(),
294  baseLayer->m_Variance->GetTensorInfo(),
295  baseLayer->m_Beta->GetTensorInfo(),
296  baseLayer->m_Gamma->GetTensorInfo(),
297  baseLayer->GetParameters(),
298  &activationDesc);
299 
300  if (status)
301  {
302  BatchNormalizationLayer* replacementLayer =
303  FuseBatchNormalizationLayer<BatchNormalizationLayer>(optimizationViews,
304  baseLayer,
305  activationLayer,
306  activationDesc,
307  name);
308 
309  replacementLayer->m_Beta = std::move(baseLayer->m_Beta);
310  replacementLayer->m_Gamma = std::move(baseLayer->m_Gamma);
311  replacementLayer->m_Mean = std::move(baseLayer->m_Mean);
312  replacementLayer->m_Variance = std::move(baseLayer->m_Variance);
313  untouched.erase(baseLayer->GetGuid());
314  untouched.erase(activationLayer->GetGuid());
315  }
316  }
317  else if (base.GetType() == LayerType::Addition)
318  {
319  AdditionLayer* baseLayer = PolymorphicDowncast<AdditionLayer*>(&base);
320 
321  arm_compute::Status status = NeonAdditionWorkloadValidate(
322  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
323  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
324  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
325  &activationDesc);
326 
327  if (status)
328  {
329  FuseAdditionLayer<AdditionLayer>(optimizationViews,
330  baseLayer,
331  activationLayer,
332  activationDesc,
333  name);
334  untouched.erase(baseLayer->GetGuid());
335  untouched.erase(activationLayer->GetGuid());
336  }
337  }
338  else if (base.GetType() == LayerType::Division)
339  {
340  DivisionLayer* baseLayer = PolymorphicDowncast<DivisionLayer*>(&base);
341 
342  arm_compute::Status status = NeonDivisionWorkloadValidate(
343  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
344  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
345  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
346  &activationDesc);
347 
348  if (status)
349  {
350  FuseDivisionLayer<DivisionLayer>(optimizationViews,
351  baseLayer,
352  activationLayer,
353  activationDesc,
354  name);
355  untouched.erase(baseLayer->GetGuid());
356  untouched.erase(activationLayer->GetGuid());
357  }
358  }
359  else if (base.GetType() == LayerType::Multiplication)
360  {
361  MultiplicationLayer* baseLayer = PolymorphicDowncast<MultiplicationLayer*>(&base);
362 
363  arm_compute::Status status = NeonMultiplicationWorkloadValidate(
364  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
365  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
366  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
367  &activationDesc);
368 
369  if (status)
370  {
371  FuseMultiplicationLayer<MultiplicationLayer>(optimizationViews,
372  baseLayer,
373  activationLayer,
374  activationDesc,
375  name);
376  untouched.erase(baseLayer->GetGuid());
377  untouched.erase(activationLayer->GetGuid());
378  }
379  }
380  else if (base.GetType() == LayerType::Subtraction)
381  {
382  SubtractionLayer* baseLayer = PolymorphicDowncast<SubtractionLayer*>(&base);
383 
384  arm_compute::Status status = NeonSubtractionWorkloadValidate(
385  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
386  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
387  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
388  &activationDesc);
389 
390  if (status)
391  {
392  FuseSubtractionLayer<SubtractionLayer>(optimizationViews,
393  baseLayer,
394  activationLayer,
395  activationDesc,
396  name);
397  untouched.erase(baseLayer->GetGuid());
398  untouched.erase(activationLayer->GetGuid());
399  }
400  }
401  else if (base.GetType() == LayerType::ElementwiseBinary)
402  {
403  ElementwiseBinaryLayer* baseLayer = PolymorphicDowncast<ElementwiseBinaryLayer*>(&base);
404 
405  if (baseLayer->GetParameters().m_Operation == BinaryOperation::Add)
406  {
407  arm_compute::Status status = NeonAdditionWorkloadValidate(
408  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
409  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
410  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
411  &activationDesc);
412 
413  if (status)
414  {
415  FuseElementwiseBinaryLayer<ElementwiseBinaryLayer>(optimizationViews,
416  baseLayer,
417  activationLayer,
418  activationDesc,
419  BinaryOperation::Add,
420  name);
421  untouched.erase(baseLayer->GetGuid());
422  untouched.erase(activationLayer->GetGuid());
423  }
424  }
425  else if (baseLayer->GetParameters().m_Operation == BinaryOperation::Div)
426  {
427  arm_compute::Status status = NeonDivisionWorkloadValidate(
426  {
428  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
429  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
430  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
431  &activationDesc);
432 
433  if (status)
434  {
435  FuseElementwiseBinaryLayer<ElementwiseBinaryLayer>(optimizationViews,
436  baseLayer,
437  activationLayer,
438  activationDesc,
439  BinaryOperation::Div,
440  name);
441  untouched.erase(baseLayer->GetGuid());
442  untouched.erase(activationLayer->GetGuid());
443  }
444  }
445  else if (baseLayer->GetParameters().m_Operation == BinaryOperation::Mul)
446  {
447  arm_compute::Status status = NeonMultiplicationWorkloadValidate(
448  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
449  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
450  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
451  &activationDesc);
452 
453  if (status)
454  {
455  FuseElementwiseBinaryLayer<ElementwiseBinaryLayer>(optimizationViews,
456  baseLayer,
457  activationLayer,
458  activationDesc,
459  BinaryOperation::Mul,
460  name);
461  untouched.erase(baseLayer->GetGuid());
462  untouched.erase(activationLayer->GetGuid());
463  }
464  }
465  else if (baseLayer->GetParameters().m_Operation == BinaryOperation::Sub)
466  {
467  arm_compute::Status status = NeonSubtractionWorkloadValidate(
468  baseLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
469  baseLayer->GetInputSlot(1).GetConnectedOutputSlot()->GetTensorInfo(),
470  activationLayer->GetInputSlot(0).GetConnectedOutputSlot()->GetTensorInfo(),
471  &activationDesc);
472 
473  if (status)
474  {
475  FuseElementwiseBinaryLayer<ElementwiseBinaryLayer>(optimizationViews,
476  baseLayer,
477  activationLayer,
478  activationDesc,
479  BinaryOperation::Sub,
480  name);
481  untouched.erase(baseLayer->GetGuid());
482  untouched.erase(activationLayer->GetGuid());
483  }
484  }
485  // No fusion available for other BinaryOperations
486  }
487  }
488  }
489  }
490  }
491  }
492 
493  // Separate reduce layer with multiple axes into multiple reduce layers with 1 axis.
494  if (base.GetType() == LayerType::Reduce)
495  {
496  ReduceLayer* baseLayer = PolymorphicDowncast<ReduceLayer*>(&base);
497  ReduceDescriptor reduceDescriptor = baseLayer->GetParameters();
498 
499  if (!reduceDescriptor.m_vAxis.empty() && reduceDescriptor.m_vAxis.size() > 1)
500  {
501  // Add new layers to the graph and connect them.
502  std::vector<IConnectableLayer*> layers = ChainReduceLayers<ReduceLayer>(optimizationViews,
503  baseLayer,
504  reduceDescriptor);
505 
506  // Replace existing baselayer with new subgraph.
507  ReplaceLayers<ReduceLayer>(optimizationViews, baseLayer, layers);
508  untouched.erase(baseLayer->GetGuid());
509  }
510  }
511 
512  // Remove Reshape where possible
513  if (base.GetType() == LayerType::Reshape)
514  {
515  ReshapeLayer* baseLayer = PolymorphicDowncast<ReshapeLayer*>(&base);
516 
517  // Cannot remove a Reshape if it's connected to any layer that has an NCHW layout
518  if (ConnectedToLayerWithNCHW(baseLayer))
519  {
520  continue;
521  }
522  RemoveReshapeLayer(baseLayer, untouched, optimizationViews);
523  }
524 
525  // Replace Add/Mul/Add where possible
526  Layer* layerList[4] = {nullptr, nullptr, nullptr, nullptr};
527  const std::vector<ActivationFunction> validActivates = { ActivationFunction::ReLu,
528  ActivationFunction::BoundedReLu };
529  if (IsLayerSequence<BinaryOperation>(base,
530  BinaryOperation::Add, BinaryOperation::Mul, BinaryOperation::Add,
531  layerList,
532  true, // handleValidActivates
533  validActivates))
534  {
535  bool fuseReLu = false;
536  unsigned int numInputs = 0;
537  unsigned int numOutputs = 0;
538  std::vector<TensorInfo> inputInfos;
539  std::vector<TensorInfo> outputInfos;
540  const ActivationDescriptor* activationDescriptor = nullptr;
541 
542  if (BuildAddMulAddTensorInfoLists<Layer>(layerList,
543  numInputs,
544  numOutputs,
545  inputInfos,
546  outputInfos,
547  activationDescriptor,
548  fuseReLu))
549  {
550  // Create the new Add/Mul/Add layer and set the Relu activation function
551  FusedDescriptor fusedDescriptor(numInputs, numOutputs, FusedKernelType::AddMulAdd);
552  arm_compute::Status status = NeonFusedWorkloadValidate({inputInfos.begin(), inputInfos.end()},
553  {outputInfos.begin(), outputInfos.end()},
554  fusedDescriptor,
555  activationDescriptor);
556  if (status)
557  {
558  std::string fusedName;
559  GetFusedName(layerList, fusedName);
560 
561  IConnectableLayer* addMulAddLayer =
562  optimizationViews.GetINetwork()->AddFusedLayer(fusedDescriptor, fusedName.c_str());
563 
564  if (fuseReLu)
565  {
566  FusedLayer* addMulAddFusedLayer = PolymorphicDowncast<FusedLayer*>(addMulAddLayer);
567  addMulAddFusedLayer->SetAdditionalInfoForObject(
568  std::make_shared<ActivationDescriptor>(*activationDescriptor));
569  }
570 
571  // Update the graph
572  std::vector<IConnectableLayer*> originalLayers;
573  for (unsigned int layerIdx = 0; layerIdx < 4; ++layerIdx)
574  {
575  if (layerList[layerIdx])
576  {
577  originalLayers.push_back(layerList[layerIdx]);
578  }
579  }
580 
581  std::vector<SlotList> inputLayersSlotLists, outputLayersSlotLists;
582  BuildAddMulAddSlotLists<SlotList>(fuseReLu,
583  outputInfos.size() > 1,
584  inputLayersSlotLists,
585  outputLayersSlotLists);
586 
587  ReplaceMultipleLayers<FusedLayer>(optimizationViews,
588  originalLayers,
589  PolymorphicDowncast<FusedLayer*>(addMulAddLayer),
590  inputLayersSlotLists,
591  outputLayersSlotLists);
592 
593  // Remove unused layers
594  for (unsigned int layerIdx = 0; layerIdx < 4; ++layerIdx)
595  {
596  if (layerList[layerIdx])
597  {
598  untouched.erase(layerList[layerIdx]->GetGuid());
599  }
600  }
601  }
602  }
603  }
604  }
605 
606  if (optimizationViews.GetSubstitutions().empty() && optimizationViews.GetDeletedSubgraphs().empty())
607  {
608  optimizationViews.AddUntouchedSubgraph(SubgraphView(subgraph));
609  }
610  else
611  {
612  ReportUntouchedLayers(optimizationViews, untouched);
613  }
614 
615  return optimizationViews;
616 }

References armnn::Activation, armnn::Add, INetwork::AddFusedLayer(), armnn::Addition, armnn::AddMulAdd, OptimizationViews::AddUntouchedSubgraph(), armnn::BatchNormalization, SubgraphView::begin(), Layer::BeginOutputSlots(), armnn::BoundedReLu, armnn::ConnectedToLayerWithNCHW(), armnn::Convolution2d, armnn::DepthwiseConvolution2d, armnn::Div, armnn::Division, armnn::ElementwiseBinary, SubgraphView::end(), Layer::EndOutputSlots(), armnn::FullyConnected, Layer::GetAdditionalInformation(), InputSlot::GetConnectedOutputSlot(), OptimizationViews::GetDeletedSubgraphs(), armnn::GetFusedName(), Layer::GetGuid(), OptimizationViews::GetINetwork(), Layer::GetInputSlot(), Layer::GetName(), LayerWithParameters< Parameters >::GetParameters(), OptimizationViews::GetSubstitutions(), OutputSlot::GetTensorInfo(), Layer::GetType(), BatchNormalizationLayer::m_Beta, FullyConnectedDescriptor::m_BiasEnabled, Convolution2dDescriptor::m_BiasEnabled, DepthwiseConvolution2dDescriptor::m_BiasEnabled, BatchNormalizationLayer::m_Gamma, BatchNormalizationLayer::m_Mean, ElementwiseBinaryDescriptor::m_Operation, BatchNormalizationLayer::m_Variance, ReduceDescriptor::m_vAxis, armnn::Mul, armnn::Multiplication, armnn::NeonAdditionWorkloadValidate(), armnn::NeonBatchNormalizationValidate(), armnn::NeonConvolution2dWorkloadValidate(), armnn::NeonDepthwiseConvolutionWorkloadValidate(), armnn::NeonDivisionWorkloadValidate(), armnn::NeonFullyConnectedWorkloadValidate(), armnn::NeonFusedWorkloadValidate(), armnn::NeonMultiplicationWorkloadValidate(), armnn::NeonSubtractionWorkloadValidate(), armnn::Reduce, armnn::ReLu, armnn::RemoveReshapeLayer(), armnn::ReportUntouchedLayers(), armnn::Reshape, Layer::SetAdditionalInfoForObject(), armnn::Sub, and armnn::Subtraction.

◆ RegisterTensorHandleFactories()

void RegisterTensorHandleFactories ( class TensorHandleFactoryRegistry & registry )
overridevirtual

(Optional) Register TensorHandleFactories. Either this method or the CreateMemoryManager() and IWorkloadFactory::CreateTensor()/IWorkloadFactory::CreateSubtensor() methods must be implemented.

Reimplemented from IBackendInternal.

Definition at line 623 of file NeonBackend.cpp.

624 {
625  auto memoryManager = std::make_shared<NeonMemoryManager>(std::make_unique<arm_compute::Allocator>(),
626                                                            BaseMemoryManager::MemoryAffinity::Offset);
627 
628  registry.RegisterMemoryManager(memoryManager);
629 
630  auto factory = std::make_unique<NeonTensorHandleFactory>(memoryManager);
631  // Register copy and import factory pair
632  registry.RegisterCopyAndImportFactoryPair(factory->GetId(), factory->GetId());
633  // Register the factory
634  registry.RegisterFactory(std::move(factory));
635 }

References BaseMemoryManager::Offset, TensorHandleFactoryRegistry::RegisterCopyAndImportFactoryPair(), TensorHandleFactoryRegistry::RegisterFactory(), and TensorHandleFactoryRegistry::RegisterMemoryManager().
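The three registration steps above (share the memory manager, pair the factory id with itself for copy/import, then hand over factory ownership) can be sketched with a hypothetical miniature registry; `Registry`, `Factory`, and the factory id string below are illustrative stand-ins, not the ArmNN declarations:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in factory; the id string is an assumption for illustration.
struct Factory
{
    std::string id{"Arm/Neon/TensorHandleFactory"};
    const std::string& GetId() const { return id; }
};

// Miniature TensorHandleFactoryRegistry recording the three registrations.
struct Registry
{
    std::shared_ptr<int> memoryManager;                      // stand-in manager
    std::vector<std::pair<std::string, std::string>> pairs;  // copy/import pairs
    std::vector<std::unique_ptr<Factory>> factories;

    void RegisterMemoryManager(std::shared_ptr<int> mm) { memoryManager = std::move(mm); }
    void RegisterCopyAndImportFactoryPair(std::string copyId, std::string importId)
    {
        pairs.emplace_back(std::move(copyId), std::move(importId));
    }
    void RegisterFactory(std::unique_ptr<Factory> f) { factories.push_back(std::move(f)); }
};

void RegisterTensorHandleFactories(Registry& registry)
{
    auto memoryManager = std::make_shared<int>(0);
    registry.RegisterMemoryManager(memoryManager);

    auto factory = std::make_unique<Factory>();
    // Pair the factory with itself for copy and import, then transfer ownership.
    registry.RegisterCopyAndImportFactoryPair(factory->GetId(), factory->GetId());
    registry.RegisterFactory(std::move(factory));
}
```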


The documentation for this class was generated from the following files:
NeonBackend.hpp
NeonBackend.cpp
const BackendCapabilities cpuAccCapabilities("CpuAcc", { {"NonConstWeights", true}, {"AsyncExecution", false}, {"ProtectedContentAllocation", false}, {"ConstantTensorsAsInputs", true}, {"PreImportIOTensors", false}, {"ExternallyManagedMemory", true}, {"MultiAxisPacking", false}, {"SingleAxisPacking", true}, {"HasFp16", arm_compute::CPUInfo::get().has_fp16()} })
armnn::NeonDivisionWorkloadValidate
arm_compute::Status NeonDivisionWorkloadValidate(const TensorInfo &input0, const TensorInfo &input1, const TensorInfo &output, const ActivationDescriptor *activationDescriptor)
Definition: NeonDivisionWorkload.cpp:18
armnn::LayerType::Subtraction
@ Subtraction
armnn::RemoveReshapeLayer
void RemoveReshapeLayer(ReshapeLayer *baseLayer, std::map< LayerGuid, Layer * > &untouched, OptimizationViews &optimizationViews)
Definition: SubgraphUtils.hpp:293
armnn::LayerType::Multiplication
@ Multiplication
armnn::LayerType::Addition
@ Addition
armnn::NeonDepthwiseConvolutionWorkloadValidate
arm_compute::Status NeonDepthwiseConvolutionWorkloadValidate(const TensorInfo &input, const TensorInfo &output, const DepthwiseConvolution2dDescriptor &descriptor, const TensorInfo &weights, const Optional< TensorInfo > &biases, const ActivationDescriptor *activationDescriptor)
Definition: NeonDepthwiseConvolutionWorkload.cpp:29
armnn::NeonBackend::GetIdStatic
static const BackendId & GetIdStatic()
Definition: NeonBackend.cpp:46
armnn::LayerType::Division
@ Division
armnn::IBackendInternal::IBackendProfilingContextPtr
std::shared_ptr< arm::pipe::IBackendProfilingContext > IBackendProfilingContextPtr
This is the bridge between backend and backend profiling we'll keep it in the backend namespace.
Definition: IBackendInternal.hpp:92
armnn::LayerType::FullyConnected
@ FullyConnected
armnn::NeonBackend::CreateBackendSpecificModelContext
IBackendInternal::IBackendSpecificModelContextPtr CreateBackendSpecificModelContext(const ModelOptions &modelOptions) const override
Definition: NeonBackend.cpp:120
armnn::LayerType::DepthwiseConvolution2d
@ DepthwiseConvolution2d
armnn::Status
Status
Definition: Types.hpp:42
armnn::LayerType::Reshape
@ Reshape
armnn::NeonFusedWorkloadValidate
arm_compute::Status NeonFusedWorkloadValidate(const std::vector< std::reference_wrapper< TensorInfo >> &inputInfos, const std::vector< std::reference_wrapper< TensorInfo >> &outputInfos, const FusedDescriptor &fusedDescriptor, const ActivationDescriptor *activationDescriptor)
Definition: NeonFusedWorkload.cpp:22
armnn::NeonConvolution2dWorkloadValidate
arm_compute::Status NeonConvolution2dWorkloadValidate(const TensorInfo &input, const TensorInfo &output, const Convolution2dDescriptor &descriptor, const TensorInfo &weights, const Optional< TensorInfo > &biases, bool isFastMathEnabled, const ActivationDescriptor *activationDescriptor)
Definition: NeonConvolution2dWorkload.cpp:24
armnn::NeonBatchNormalizationValidate
arm_compute::Status NeonBatchNormalizationValidate(const TensorInfo &input, const TensorInfo &output, const TensorInfo &mean, const TensorInfo &var, const TensorInfo &beta, const TensorInfo &gamma, const BatchNormalizationDescriptor &descriptor, const ActivationDescriptor *activationDescriptor)
Definition: NeonBatchNormalizationWorkload.cpp:24
armnn::ActivationFunction::ReLu
@ ReLu
armnn::IBackendInternal::ILayerSupportSharedPtr
std::shared_ptr< ILayerSupport > ILayerSupportSharedPtr
Definition: IBackendInternal.hpp:94
armnn::ReportUntouchedLayers
void ReportUntouchedLayers(OptimizationViews &optimizationViews, std::map< LayerGuid, Layer * > untouched)
Definition: SubgraphUtils.hpp:220
armnn::ConnectedToLayerWithNCHW
bool ConnectedToLayerWithNCHW(Layer *baseLayer)
Checks if the Layer is connected to any Layer that has an NCHW layout.
Definition: SubgraphUtils.hpp:250
armnn::BinaryOperation::Div
@ Div
armnn::LayerType::Convolution2d
@ Convolution2d
armnn::NeonBackendId
constexpr const char * NeonBackendId()
Definition: NeonBackendId.hpp:10
armnn::LayerType::Activation
@ Activation
armnn::IBackendInternal::IBackendSpecificModelContextPtr
std::shared_ptr< IBackendModelContext > IBackendSpecificModelContextPtr
Definition: IBackendInternal.hpp:96