Compute Library
 21.05
CLTensorAllocator Class Reference

Basic implementation of a CL memory tensor allocator.

#include <CLTensorAllocator.h>


Public Member Functions

 CLTensorAllocator (IMemoryManageable *owner=nullptr, CLRuntimeContext *ctx=nullptr)
 Default constructor.

 CLTensorAllocator (const CLTensorAllocator &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

CLTensorAllocator & operator= (const CLTensorAllocator &)=delete
 Prevent instances of this class from being copy assigned (as this class contains pointers).

 CLTensorAllocator (CLTensorAllocator &&)=default
 Allow instances of this class to be moved.

CLTensorAllocator & operator= (CLTensorAllocator &&)=default
 Allow instances of this class to be moved.

uint8_t * data ()
 Interface to be implemented by the child class to return the pointer to the mapped data.

const cl::Buffer & cl_data () const
 Interface to be implemented by the child class to return the pointer to the CL data.

CLQuantization quantization () const
 Wrapped quantization info data accessor.

uint8_t * map (cl::CommandQueue &q, bool blocking)
 Enqueue a map operation of the allocated buffer on the given queue.

void unmap (cl::CommandQueue &q, uint8_t *mapping)
 Enqueue an unmap operation of the allocated buffer on the given queue.

void allocate () override
 Allocate size specified by TensorInfo of OpenCL memory.

void free () override
 Free allocated OpenCL memory.

Status import_memory (cl::Buffer buffer)
 Import an existing memory as a tensor's backing memory.

void set_associated_memory_group (IMemoryGroup *associated_memory_group)
 Associates the tensor with a memory group.
 
- Public Member Functions inherited from ITensorAllocator
 ITensorAllocator ()
 Default constructor.

 ITensorAllocator (const ITensorAllocator &)=default
 Allow instances of this class to be copy constructed.

ITensorAllocator & operator= (const ITensorAllocator &)=default
 Allow instances of this class to be copied.

 ITensorAllocator (ITensorAllocator &&)=default
 Allow instances of this class to be move constructed.

ITensorAllocator & operator= (ITensorAllocator &&)=default
 Allow instances of this class to be moved.

virtual ~ITensorAllocator ()=default
 Default virtual destructor.

void init (const TensorInfo &input, size_t alignment=0)
 Initialize a tensor based on the passed TensorInfo.

TensorInfo & info ()
 Return a reference to the tensor's metadata.

const TensorInfo & info () const
 Return a constant reference to the tensor's metadata.

size_t alignment () const
 Return the underlying tensor buffer's alignment.
 

Static Public Member Functions

static void set_global_allocator (IAllocator *allocator)
 Sets global allocator that will be used by all CLTensor objects.
 

Detailed Description

Basic implementation of a CL memory tensor allocator.

Definition at line 43 of file CLTensorAllocator.h.

Constructor & Destructor Documentation

◆ CLTensorAllocator() [1/3]

CLTensorAllocator ( IMemoryManageable * owner = nullptr,
CLRuntimeContext * ctx = nullptr
)

Default constructor.

Parameters
[in]  owner  (Optional) Owner of the allocator.
[in]  ctx    (Optional) Runtime context.

Definition at line 109 of file CLTensorAllocator.cpp.

    : _ctx(ctx), _owner(owner), _associated_memory_group(nullptr), _memory(), _mapping(nullptr), _scale(), _offset()
{
}

◆ CLTensorAllocator() [2/3]

CLTensorAllocator ( const CLTensorAllocator & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLTensorAllocator() [3/3]

CLTensorAllocator ( CLTensorAllocator && )
default

Allow instances of this class to be moved.
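
The copy and move rules above can be checked at compile time. The following is a minimal sketch using a hypothetical stand-in class (`MoveOnlyAllocator` is not part of Compute Library); it only mirrors the deleted-copy, defaulted-move pattern declared for CLTensorAllocator.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in, not the library class: it holds a raw pointer,
// so copying is disabled while moving is allowed.
class MoveOnlyAllocator
{
public:
    MoveOnlyAllocator()                                     = default;
    MoveOnlyAllocator(const MoveOnlyAllocator &)            = delete;
    MoveOnlyAllocator &operator=(const MoveOnlyAllocator &) = delete;
    MoveOnlyAllocator(MoveOnlyAllocator &&)                 = default;
    MoveOnlyAllocator &operator=(MoveOnlyAllocator &&)      = default;

    void set_mapping(std::uint8_t *ptr) { _mapping = ptr; }
    std::uint8_t *data() { return _mapping; }

private:
    std::uint8_t *_mapping{ nullptr };
};

// Compile-time checks of the copy/move rules
static_assert(!std::is_copy_constructible<MoveOnlyAllocator>::value, "copy construction is deleted");
static_assert(!std::is_copy_assignable<MoveOnlyAllocator>::value, "copy assignment is deleted");
static_assert(std::is_move_constructible<MoveOnlyAllocator>::value, "move construction is allowed");
static_assert(std::is_move_assignable<MoveOnlyAllocator>::value, "move assignment is allowed");
```

With this pattern, an allocator can be handed over with `std::move` but never duplicated, so two objects cannot be created as copies of one another.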

Member Function Documentation

◆ allocate()

void allocate ( )
override virtual

Allocate size specified by TensorInfo of OpenCL memory.

Note
The tensor must not already be allocated when calling this function.

Implements ITensorAllocator.

Definition at line 129 of file CLTensorAllocator.cpp.

{
    // Allocate tensor backing memory
    if(_associated_memory_group == nullptr)
    {
        // Perform memory allocation
        if(static_global_cl_allocator != nullptr)
        {
            _memory.set_owned_region(static_global_cl_allocator->make_region(info().total_size(), 0));
        }
        else if(_ctx == nullptr)
        {
            auto legacy_ctx = CLCoreRuntimeContext(nullptr, CLScheduler::get().context(), CLScheduler::get().queue());
            _memory.set_owned_region(allocate_region(&legacy_ctx, info().total_size(), 0));
        }
        else
        {
            _memory.set_owned_region(allocate_region(_ctx->core_runtime_context(), info().total_size(), 0));
        }
    }
    else
    {
        // Finalize memory management instead
        _associated_memory_group->finalize_memory(_owner, _memory, info().total_size(), alignment());
    }

    // Allocate and fill the quantization parameter arrays
    if(is_data_type_quantized_per_channel(info().data_type()))
    {
        const size_t pad_size = 0;
        populate_quantization_info(_scale, _offset, info().quantization_info(), pad_size);
    }

    // Lock allocator
    info().set_is_resizable(false);
}

References ITensorAllocator::alignment(), CLRuntimeContext::core_runtime_context(), arm_compute::test::validation::data_type, IMemoryGroup::finalize_memory(), CLScheduler::get(), ITensorAllocator::info(), arm_compute::is_data_type_quantized_per_channel(), TensorInfo::set_is_resizable(), CLMemory::set_owned_region(), and TensorInfo::total_size().

Referenced by CLTensorHandle::allocate(), arm_compute::test::validation::compute_float_target_in_place(), CLReduceMean::configure(), CLFFT2D::configure(), CLRNNLayer::configure(), CLFFT1D::configure(), CLL2NormalizeLayer::configure(), CLReductionOperation::configure(), CLArgMinMaxLayer::configure(), CLInstanceNormalizationLayer::configure(), CLWinogradConvolutionLayer::configure(), CLFFTConvolutionLayer::configure(), CLGEMMLowpMatrixMultiplyCore::configure(), CLGenerateProposalsLayer::configure(), CLGEMMDeconvolutionLayer::configure(), CLDirectDeconvolutionLayer::configure(), CLLSTMLayerQuantized::configure(), CLLSTMLayer::configure(), CLQLSTMLayer::configure(), CLGEMMConvolutionLayer::configure(), CLGEMMLowpMatrixMultiplyCore::prepare(), CLWinogradConvolutionLayer::prepare(), CLGEMMDeconvolutionLayer::prepare(), CLFFTConvolutionLayer::prepare(), CLDirectDeconvolutionLayer::prepare(), CLLSTMLayerQuantized::prepare(), CLFullyConnectedLayer::prepare(), CLGEMM::prepare(), CLQLSTMLayer::prepare(), CLGEMMConvolutionLayer::prepare(), CLFullyConnectedLayerReshapeWeightsManaged::run(), CLGEMMReshapeRHSMatrixKernelManaged::run(), CLConvertFullyConnectedWeightsManaged::run(), CLConvolutionLayerReshapeWeightsTransform::run(), and arm_compute::test::validation::TEST_CASE().
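
The order in which allocate() picks a source for the backing memory can be summarised with a small decision function. This is a simplified model under stated assumptions: `BackingSource`, `AllocatorState` and `select_backing_source` are hypothetical names introduced for illustration, and the booleans stand in for the pointer checks in the listing above.

```cpp
// Hypothetical, simplified model of the branch order in allocate();
// none of these names belong to Compute Library.
enum class BackingSource
{
    MemoryGroup,     // memory is finalized by the associated memory group
    GlobalAllocator, // region is made by the global allocator
    LegacyContext,   // context is built from the CLScheduler singleton
    RuntimeContext   // context supplied at construction is used
};

struct AllocatorState
{
    bool has_memory_group;     // _associated_memory_group != nullptr
    bool has_global_allocator; // static_global_cl_allocator != nullptr
    bool has_runtime_context;  // _ctx != nullptr
};

BackingSource select_backing_source(const AllocatorState &s)
{
    if(s.has_memory_group)
    {
        return BackingSource::MemoryGroup;
    }
    if(s.has_global_allocator)
    {
        return BackingSource::GlobalAllocator;
    }
    if(!s.has_runtime_context)
    {
        return BackingSource::LegacyContext;
    }
    return BackingSource::RuntimeContext;
}
```

Note that an associated memory group takes precedence over everything else, and a global allocator takes precedence over any runtime context.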

◆ cl_data()

const cl::Buffer & cl_data ( ) const

Interface to be implemented by the child class to return the pointer to the CL data.

Returns
pointer to the CL data.

Definition at line 124 of file CLTensorAllocator.cpp.

{
    return _memory.region() == nullptr ? _empty_buffer : _memory.cl_region()->cl_data();
}

References ICLMemoryRegion::cl_data(), CLMemory::cl_region(), and CLMemory::region().

Referenced by CLTensor::cl_buffer().

◆ data()

uint8_t * data ( )

Interface to be implemented by the child class to return the pointer to the mapped data.

Returns
pointer to the mapped data.

Definition at line 119 of file CLTensorAllocator.cpp.

{
    return _mapping;
}

◆ free()

void free ( )
override virtual

Free allocated OpenCL memory.

Note
The tensor must have been allocated when calling this function.

Implements ITensorAllocator.

Definition at line 166 of file CLTensorAllocator.cpp.

{
    _mapping = nullptr;
    _memory.set_region(nullptr);
    clear_quantization_arrays(_scale, _offset);
    info().set_is_resizable(true);
}

References ITensorAllocator::info(), TensorInfo::set_is_resizable(), and CLMemory::set_region().

Referenced by CLTensorHandle::free(), CLWinogradConvolutionLayer::prepare(), CLGEMMDeconvolutionLayer::prepare(), CLFFTConvolutionLayer::prepare(), CLDirectDeconvolutionLayer::prepare(), CLLSTMLayerQuantized::prepare(), CLGEMMConvolutionLayer::prepare(), CLFullyConnectedLayerReshapeWeightsManaged::release(), CLGEMMReshapeRHSMatrixKernelManaged::release(), CLConvertFullyConnectedWeightsManaged::release(), CLConvolutionLayerReshapeWeightsTransform::release(), CLTensorHandle::release_if_unused(), and arm_compute::test::validation::TEST_CASE().
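
Taken together, allocate() and free() act as a lock and unlock on the tensor metadata: allocation clears the resizable flag and freeing sets it again. A minimal sketch of that lifecycle, assuming a hypothetical host-memory stand-in (`ToyAllocator` is not a library class, and a `std::vector` plays the role of the OpenCL buffer):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-memory stand-in for the allocator lifecycle.
// Not Compute Library code.
class ToyAllocator
{
public:
    explicit ToyAllocator(std::size_t total_size) : _total_size(total_size) {}

    void allocate()
    {
        _memory.assign(_total_size, 0); // acquire backing memory
        _is_resizable = false;          // lock: the size can no longer change
    }

    void free()
    {
        _memory.clear();      // drop backing memory
        _is_resizable = true; // unlock: the metadata may be changed again
    }

    bool is_resizable() const { return _is_resizable; }
    bool is_allocated() const { return !_memory.empty(); }

private:
    std::size_t               _total_size;
    std::vector<std::uint8_t> _memory{};
    bool                      _is_resizable{ true };
};
```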

◆ import_memory()

Status import_memory ( cl::Buffer  buffer)

Import an existing memory as a tensor's backing memory.

Warning
Memory should have been created under the same context that Compute Library uses.
Memory is expected to be aligned with the device requirements.
The tensor shouldn't be memory managed.
Ownership of the memory is not transferred.
Memory must be writable in case of in-place operations.
Padding should be accounted for by the client code.
Note
The buffer size will be checked to be compliant with the total_size reported by ITensorInfo.
Parameters
[in]  buffer  Buffer to be used as backing memory
Returns
An error status

Definition at line 174 of file CLTensorAllocator.cpp.

{
    ARM_COMPUTE_RETURN_ERROR_ON(buffer.get() == nullptr);
    ARM_COMPUTE_RETURN_ERROR_ON(buffer.getInfo<CL_MEM_SIZE>() < info().total_size());
    ARM_COMPUTE_RETURN_ERROR_ON(buffer.getInfo<CL_MEM_CONTEXT>().get() != CLScheduler::get().context().get());
    ARM_COMPUTE_RETURN_ERROR_ON(_associated_memory_group != nullptr);

    if(_ctx == nullptr)
    {
        auto legacy_ctx = CLCoreRuntimeContext(nullptr, CLScheduler::get().context(), CLScheduler::get().queue());
        _memory.set_owned_region(std::make_unique<CLBufferMemoryRegion>(buffer, &legacy_ctx));
    }
    else
    {
        _memory.set_owned_region(std::make_unique<CLBufferMemoryRegion>(buffer, _ctx->core_runtime_context()));
    }

    info().set_is_resizable(false);
    return Status{};
}

References ARM_COMPUTE_RETURN_ERROR_ON, CLScheduler::context(), CLRuntimeContext::core_runtime_context(), CLScheduler::get(), ITensorAllocator::info(), TensorInfo::set_is_resizable(), and CLMemory::set_owned_region().

Referenced by CLFFTConvolutionLayer::run(), and arm_compute::test::validation::TEST_CASE().
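
The four checks that import_memory() performs before accepting a buffer can be modelled without OpenCL. The sketch below is an illustration only: `ToyBuffer`, `ImportError` and `check_import` are hypothetical names, and plain fields stand in for the values the listing queries from the `cl::Buffer`.

```cpp
#include <cstddef>

// Hypothetical error codes, one per check in the listing above.
enum class ImportError
{
    None,
    NullBuffer,
    TooSmall,
    ContextMismatch,
    MemoryManaged
};

// Hypothetical stand-in for the properties queried from the buffer.
struct ToyBuffer
{
    const void *handle;     // stands in for the underlying cl_mem handle
    std::size_t size;       // stands in for CL_MEM_SIZE
    int         context_id; // stands in for CL_MEM_CONTEXT
};

// Applies the checks in the same order as the listing.
ImportError check_import(const ToyBuffer &buffer, std::size_t total_size, int library_context_id, bool memory_managed)
{
    if(buffer.handle == nullptr)
    {
        return ImportError::NullBuffer;
    }
    if(buffer.size < total_size)
    {
        return ImportError::TooSmall;
    }
    if(buffer.context_id != library_context_id)
    {
        return ImportError::ContextMismatch;
    }
    if(memory_managed)
    {
        return ImportError::MemoryManaged;
    }
    return ImportError::None;
}
```

Because the size check uses a strict less-than comparison, a buffer larger than the tensor's total size is accepted; only a buffer that is too small is rejected.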

◆ map()

uint8_t * map ( cl::CommandQueue &  q,
bool  blocking 
)

Enqueue a map operation of the allocated buffer on the given queue.

Parameters
[in,out]  q         The CL command queue to use for the mapping operation.
[in]      blocking  If true, then the mapping will be ready to use by the time this method returns, else it is the caller's responsibility to flush the queue and wait for the mapping operation to have completed before using the returned mapping pointer.
Returns
The mapping address.

Definition at line 235 of file CLTensorAllocator.cpp.

{
    ARM_COMPUTE_ERROR_ON(_mapping != nullptr);
    ARM_COMPUTE_ERROR_ON(_memory.region() == nullptr);
    ARM_COMPUTE_ERROR_ON(_memory.region()->buffer() != nullptr);

    _mapping = reinterpret_cast<uint8_t *>(_memory.cl_region()->map(q, blocking));
    return _mapping;
}

References ARM_COMPUTE_ERROR_ON, IMemoryRegion::buffer(), CLMemory::cl_region(), ICLMemoryRegion::map(), and CLMemory::region().

◆ operator=() [1/2]

CLTensorAllocator& operator= ( const CLTensorAllocator & )
delete

Prevent instances of this class from being copy assigned (As this class contains pointers)

◆ operator=() [2/2]

CLTensorAllocator& operator= ( CLTensorAllocator &&  )
default

Allow instances of this class to be moved.

◆ quantization()

CLQuantization quantization ( ) const

Wrapped quantization info data accessor.

Returns
A wrapped quantization info object.

Definition at line 114 of file CLTensorAllocator.cpp.

{
    return { &_scale, &_offset };
}

Referenced by CLTensor::quantization().
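
The accessor returns a lightweight wrapper holding the addresses of the allocator's own scale and offset arrays rather than copies of them. A minimal sketch of that idea, assuming hypothetical stand-in types (`ToyQuantization` and `ToyQuantizedAllocator` are not library classes, and `std::vector` replaces the CL arrays):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical wrapper: two non-owning pointers, mirroring the shape of the
// value returned by quantization() in the listing above.
struct ToyQuantization
{
    const std::vector<float>        *scale;
    const std::vector<std::int32_t> *offset;
};

// Hypothetical allocator stand-in that owns the quantization arrays.
class ToyQuantizedAllocator
{
public:
    ToyQuantizedAllocator(std::vector<float> scale, std::vector<std::int32_t> offset)
        : _scale(std::move(scale)), _offset(std::move(offset))
    {
    }

    // Returns pointers to the members, not copies of the data
    ToyQuantization quantization() const
    {
        return { &_scale, &_offset };
    }

private:
    std::vector<float>        _scale;
    std::vector<std::int32_t> _offset;
};
```

In this sketch the wrapper is only valid while the allocator it came from is alive, since it does not own the arrays it points to.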

◆ set_associated_memory_group()

void set_associated_memory_group ( IMemoryGroup * associated_memory_group )

Associates the tensor with a memory group.

Parameters
[in]  associated_memory_group  Memory group to associate the tensor with

Definition at line 195 of file CLTensorAllocator.cpp.

{
    ARM_COMPUTE_ERROR_ON(associated_memory_group == nullptr);
    ARM_COMPUTE_ERROR_ON(_associated_memory_group != nullptr && _associated_memory_group != associated_memory_group);
    ARM_COMPUTE_ERROR_ON(_memory.region() != nullptr && _memory.cl_region()->cl_data().get() != nullptr);

    _associated_memory_group = associated_memory_group;
}

References ARM_COMPUTE_ERROR_ON, ICLMemoryRegion::cl_data(), CLMemory::cl_region(), and CLMemory::region().

Referenced by CLTensor::associate_memory_group().
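
The three preconditions in the listing amount to: the group must be non-null, the tensor must not already belong to a different group, and the tensor must not already have backing memory. A minimal sketch under stated assumptions (`ToyMemoryGroup` and `ToyManagedAllocator` are hypothetical, and the sketch returns `false` where the listing uses ARM_COMPUTE_ERROR_ON):

```cpp
// Hypothetical stand-in for a memory group.
struct ToyMemoryGroup
{
};

// Hypothetical allocator stand-in that tracks its group association.
class ToyManagedAllocator
{
public:
    // Returns false if any precondition is violated, true otherwise.
    bool set_associated_memory_group(ToyMemoryGroup *group)
    {
        if(group == nullptr)
        {
            return false; // a group is required
        }
        if(_group != nullptr && _group != group)
        {
            return false; // already associated with a different group
        }
        if(_allocated)
        {
            return false; // backing memory already exists
        }
        _group = group;
        return true;
    }

    void mark_allocated() { _allocated = true; }
    bool is_managed() const { return _group != nullptr; }

private:
    ToyMemoryGroup *_group{ nullptr };
    bool            _allocated{ false };
};
```

Associating the same group twice is accepted, which matches the second check in the listing: it only rejects a group that differs from the current one.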

◆ set_global_allocator()

void set_global_allocator ( IAllocator * allocator )
static

Sets global allocator that will be used by all CLTensor objects.

Parameters
[in]  allocator  Allocator to be used as a global allocator

Definition at line 204 of file CLTensorAllocator.cpp.

{
    static_global_cl_allocator = allocator;
}

References arm_compute::test::validation::allocator().

Referenced by arm_compute::test::validation::TEST_CASE().
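
Since the global allocator is stored as a single raw pointer, it is shared by every allocator instance and is not owned by any of them. The following sketch illustrates that behaviour with hypothetical stand-ins (`ToyIAllocator`, `ToyPoolAllocator` and `ToyTensorAllocator` are not library classes; a function-local static replaces the file-level static used in the listing).

```cpp
// Hypothetical allocator interface, identified by an integer for the demo.
struct ToyIAllocator
{
    virtual ~ToyIAllocator() = default;
    virtual int id() const   = 0;
};

// Hypothetical concrete allocator.
struct ToyPoolAllocator : ToyIAllocator
{
    int id() const override { return 42; }
};

// Hypothetical tensor allocator sharing one process-wide allocator pointer.
class ToyTensorAllocator
{
public:
    static void set_global_allocator(ToyIAllocator *allocator)
    {
        global_allocator() = allocator;
    }

    // Returns the id of the global allocator if one is set, otherwise -1
    int allocation_source() const
    {
        return global_allocator() != nullptr ? global_allocator()->id() : -1;
    }

private:
    // Non-owning pointer shared by all instances
    static ToyIAllocator *&global_allocator()
    {
        static ToyIAllocator *instance = nullptr;
        return instance;
    }
};
```

In this model the caller keeps ownership of the allocator object, so it has to outlive every allocation made through it, and passing `nullptr` restores the default path.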

◆ unmap()

void unmap ( cl::CommandQueue &  q,
uint8_t *  mapping 
)

Enqueue an unmap operation of the allocated buffer on the given queue.

Note
This method simply enqueues the unmap operation; it is the caller's responsibility to flush the queue and make sure the unmap is finished before the memory is accessed by the device.
Parameters
[in,out]  q        The CL command queue to use for the mapping operation.
[in]      mapping  The CPU mapping to unmap.

Definition at line 245 of file CLTensorAllocator.cpp.

{
    ARM_COMPUTE_ERROR_ON(_mapping == nullptr);
    ARM_COMPUTE_ERROR_ON(_mapping != mapping);
    ARM_COMPUTE_ERROR_ON(_memory.region() == nullptr);
    ARM_COMPUTE_ERROR_ON(_memory.region()->buffer() == nullptr);
    ARM_COMPUTE_UNUSED(mapping);

    _memory.cl_region()->unmap(q);
    _mapping = nullptr;
}

References ARM_COMPUTE_ERROR_ON, ARM_COMPUTE_UNUSED, IMemoryRegion::buffer(), CLMemory::cl_region(), CLMemory::region(), and ICLMemoryRegion::unmap().
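
The map() and unmap() listings enforce a simple pairing rule: at most one mapping is active at a time, and unmap() must receive the pointer that map() returned. A host-only sketch of that state machine, assuming a hypothetical class (`ToyMappable` is not library code; it throws `std::logic_error` where the listings use ARM_COMPUTE_ERROR_ON, and it omits the command queue entirely):

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical host-memory model of the map/unmap pairing rule.
class ToyMappable
{
public:
    explicit ToyMappable(std::size_t size) : _storage(size, 0) {}

    std::uint8_t *map()
    {
        if(_mapping != nullptr)
        {
            throw std::logic_error("buffer is already mapped");
        }
        _mapping = _storage.data();
        return _mapping;
    }

    void unmap(std::uint8_t *mapping)
    {
        if(_mapping == nullptr)
        {
            throw std::logic_error("buffer is not mapped");
        }
        if(_mapping != mapping)
        {
            throw std::logic_error("pointer does not match the active mapping");
        }
        _mapping = nullptr;
    }

    // Null unless a mapping is currently active
    std::uint8_t *data() { return _mapping; }

private:
    std::vector<std::uint8_t> _storage;
    std::uint8_t             *_mapping{ nullptr };
};
```

As in the listings, data() is only meaningful between a map() and the matching unmap(); outside that window it returns a null pointer.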


The documentation for this class was generated from the following files: