Compute Library
 19.08
CLScheduler Class Reference

Provides global access to a CL context and command queue.

#include <CLScheduler.h>

Public Member Functions

void default_init (ICLTuner *cl_tuner=nullptr)
 Initialises the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

void default_init_with_context (cl::Device &device, cl::Context &ctx, ICLTuner *cl_tuner=nullptr)
 Initialises the scheduler with context and device provided by the user.

void enqueue (ICLKernel &kernel, bool flush=true)
 Schedule the execution of the passed kernel if possible.

void init (cl::Context context, cl::CommandQueue queue, const cl::Device &device, ICLTuner *cl_tuner=nullptr)
 Initialises the context and command queue to be used by the scheduler.

cl::Context & context ()
 Accessor for the associated CL context.

cl::CommandQueue & queue ()
 Accessor for the associated CL command queue.

GPUTarget target () const
 Get the target GPU.

void set_context (cl::Context context)
 Accessor to set the CL context to be used by the scheduler.

void set_queue (cl::CommandQueue queue)
 Accessor to set the CL command queue to be used by the scheduler.

void set_target (GPUTarget target)
 Accessor to set target GPU to be used by the scheduler.

void set_tuner (ICLTuner *tuner)
 Accessor to set the CL tuner to be used by the scheduler.

void sync ()
 Blocks until all commands in the associated command queue have finished.

cl::Event enqueue_sync_event ()
 Enqueues a marker into the associated command queue and returns the event.

void tune_kernel_static (ICLKernel &kernel)
 Tunes OpenCL kernel.

bool is_initialised () const


Static Public Member Functions

static CLScheduler & get ()
 Access the scheduler singleton.
 

Detailed Description

Provides global access to a CL context and command queue.

Definition at line 40 of file CLScheduler.h.

Member Function Documentation

◆ context()

cl::Context& context ( )
inline

Accessor for the associated CL context.

Returns
A CL context.

Definition at line 91 of file CLScheduler.h.

    {
        ARM_COMPUTE_ERROR_ON(!_is_initialised);
        _context = CLKernelLibrary::get().context();
        return _context;
    }

References ARM_COMPUTE_ERROR_ON, CLKernelLibrary::context(), and CLKernelLibrary::get().

Referenced by CLTensorAllocator::import_memory(), CLScheduler::init(), arm_compute::utils::restore_program_cache_from_file(), Framework::run(), and CLScheduler::set_context().

◆ default_init()

void default_init ( ICLTuner * cl_tuner = nullptr )

Initialises the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

Parameters
[in]  cl_tuner  (Optional) Pointer to ICLTuner (default=nullptr)

Definition at line 60 of file CLScheduler.cpp.

{
    if(!_is_initialised)
    {
        cl::Context ctx;
        cl::Device  dev;
        cl_int      err;
        std::tie(ctx, dev, err) = create_opencl_context_and_device();
        ARM_COMPUTE_ERROR_ON_MSG(err != CL_SUCCESS, "Failed to create OpenCL context");
        cl::CommandQueue queue = cl::CommandQueue(ctx, dev);
        CLKernelLibrary::get().init("./cl_kernels/", ctx, dev);
        init(ctx, queue, dev, cl_tuner);
        // Create a default static tuner and set if none was provided
        _cl_default_static_tuner = tuners::TunerFactory::create_tuner(_target);
    }

    // Set CL tuner
    _cl_tuner = (cl_tuner == nullptr) ? _cl_default_static_tuner.get() : cl_tuner;
}

References ARM_COMPUTE_ERROR_ON_MSG, arm_compute::create_opencl_context_and_device(), TunerFactory::create_tuner(), CLKernelLibrary::get(), CLScheduler::init(), CLKernelLibrary::init(), and CLScheduler::queue().

Referenced by CLDeviceBackend::initialize_backend(), arm_compute::utils::restore_program_cache_from_file(), and arm_compute::test::validation::TEST_CASE().
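
The listing has two phases: a one-time setup guarded by _is_initialised, and a tuner selection that runs on every call. The following is a minimal sketch of that control flow only. InitSketch, Tuner and DefaultStaticTuner are hypothetical stand-ins, not Compute Library types, and the OpenCL context, device and queue creation is reduced to a counter.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins: these are NOT Compute Library types.
struct Tuner
{
    virtual ~Tuner() = default;
};
struct DefaultStaticTuner : Tuner
{
};

class InitSketch
{
public:
    // Same shape as default_init(): setup happens once, tuner selection on every call.
    void default_init(Tuner *tuner = nullptr)
    {
        if(!_is_initialised)
        {
            ++_setup_count; // stands in for creating the context, device and queue
            _is_initialised = true;
            _default_tuner.reset(new DefaultStaticTuner()); // owned by the scheduler
        }

        // Fall back to the owned default tuner when the caller passes none.
        _tuner = (tuner == nullptr) ? _default_tuner.get() : tuner;
        assert(_tuner != nullptr);
    }

    Tuner *tuner() const { return _tuner; }
    Tuner *default_tuner() const { return _default_tuner.get(); }
    int setup_count() const { return _setup_count; }

private:
    bool                   _is_initialised{ false };
    int                    _setup_count{ 0 };
    std::unique_ptr<Tuner> _default_tuner{};
    Tuner                 *_tuner{ nullptr }; // non-owning
};
```

Under these assumptions, a second call with a caller-provided tuner switches the active tuner without repeating the setup, and a later call with no argument falls back to the owned default.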

◆ default_init_with_context()

void default_init_with_context ( cl::Device & device,
cl::Context & ctx,
ICLTuner * cl_tuner = nullptr
)

Initialises the scheduler with context and device provided by the user.

Parameters
[in]  device    OpenCL device to be used
[in]  ctx       OpenCL context to be used
[in]  cl_tuner  (Optional) Pointer to ICLTuner (default=nullptr)

Definition at line 48 of file CLScheduler.cpp.

{
    if(!_is_initialised)
    {
        cl::CommandQueue queue = cl::CommandQueue(ctx, device);
        CLKernelLibrary::get().init("./cl_kernels/", ctx, device);
        init(ctx, queue, device, cl_tuner);
        _cl_default_static_tuner = tuners::TunerFactory::create_tuner(_target);
        _cl_tuner                = (cl_tuner == nullptr) ? _cl_default_static_tuner.get() : cl_tuner;
    }
}

References TunerFactory::create_tuner(), CLKernelLibrary::get(), CLScheduler::init(), CLKernelLibrary::init(), and CLScheduler::queue().

Referenced by main().

◆ enqueue()

void enqueue ( ICLKernel & kernel,
bool flush = true
)

Schedule the execution of the passed kernel if possible.

Parameters
[in]  kernel  Kernel to execute.
[in]  flush   (Optional) Specifies if the command queue will be flushed after running the kernel.

Definition at line 95 of file CLScheduler.cpp.

{
    ARM_COMPUTE_ERROR_ON_MSG(!_is_initialised,
                             "The CLScheduler is not initialised yet! Please call the CLScheduler::get().default_init(), \
                             or CLScheduler::get()::init() and CLKernelLibrary::get()::init() function before running functions!");

    // Tune the kernel if the CLTuner has been provided
    if(_cl_tuner != nullptr)
    {
        // Tune the OpenCL kernel
        _cl_tuner->tune_kernel_dynamic(kernel);
    }

    // Run kernel
    kernel.run(kernel.window(), _queue);

    if(flush)
    {
        _queue.flush();
    }
}

References ARM_COMPUTE_ERROR_ON_MSG, ICLKernel::run(), ICLTuner::tune_kernel_dynamic(), and IKernel::window().

Referenced by CLLocallyConnectedLayer::prepare(), CLGEMMLowpMatrixMultiplyCore::prepare(), CLWinogradConvolutionLayer::prepare(), CLDepthwiseConvolutionLayer3x3::prepare(), CLGEMM::prepare(), CLLSTMLayer::prepare(), CLDepthwiseConvolutionLayer::prepare(), ICLSimpleFunction::run(), CLIntegralImage::run(), CLEqualizeHistogram::run(), CLDepthToSpaceLayer::run(), CLSpaceToDepthLayer::run(), CLHistogram::run(), CLHOGDescriptor::run(), CLHOGGradient::run(), CLGaussian5x5::run(), CLHOGDetector::run(), CLNormalizePlanarYUVLayer::run(), CLFFT1D::run(), CLSobel7x7::run(), CLSobel5x5::run(), CLMinMaxLocation::run(), CLRNNLayer::run(), CLStackLayer::run(), CLNormalizationLayer::run(), CLCannyEdge::run(), CLL2NormalizeLayer::run(), CLReductionOperation::run(), CLFastCorners::run(), CLUpsampleLayer::run(), CLBatchToSpaceLayer::run(), CLDirectConvolutionLayer::run(), CLSoftmaxLayer::run(), CLConvolutionLayerReshapeWeights::run(), CLPadLayer::run(), CLConcatenateLayer::run(), CLDeconvolutionLayerUpsample::run(), CLBatchNormalizationLayer::run(), CLHarrisCorners::run(), CLHOGMultiDetection::run(), CLConvolutionSquare< matrix_size >::run(), CLGaussianPyramidHalf::run(), CLLocallyConnectedLayer::run(), CLFuseBatchNormalization::run(), CLOpticalFlow::run(), CLCropResize::run(), CLGEMMLowpMatrixMultiplyCore::run(), CLSpaceToBatchLayer::run(), CLWinogradConvolutionLayer::run(), CLDepthwiseConvolutionLayer3x3::run(), CLGEMM::run(), CLGenerateProposalsLayer::run(), CLGEMMDeconvolutionLayer::run(), CLGaussianPyramidOrb::run(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::run(), CLFullyConnectedLayer::run(), CLGEMMConvolutionLayer::run(), CLLSTMLayer::run(), and CLDepthwiseConvolutionLayer::run().
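
The order of operations in enqueue() is: check initialisation, tune dynamically if a tuner is set, run the kernel, then flush if requested. The sketch below reproduces only that ordering so it can be observed without an OpenCL device. FakeQueue, FakeKernel, FakeTuner and enqueue_sketch are hypothetical names introduced for illustration, and each call just appends a label to a log.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins, not Compute Library or OpenCL types.
using CallLog = std::vector<std::string>;

struct FakeQueue
{
    CallLog *log;
    void flush() { log->push_back("flush"); }
};

struct FakeKernel
{
    CallLog *log;
    void run(FakeQueue &) { log->push_back("run"); }
};

struct FakeTuner
{
    CallLog *log;
    void tune_kernel_dynamic(FakeKernel &) { log->push_back("tune"); }
};

// Same control flow as enqueue(): precondition, optional tuning, run, optional flush.
inline void enqueue_sketch(bool is_initialised, FakeTuner *tuner, FakeQueue &queue, FakeKernel &kernel, bool flush = true)
{
    assert(is_initialised && "initialise the scheduler before enqueuing kernels");

    if(tuner != nullptr)
    {
        tuner->tune_kernel_dynamic(kernel);
    }

    kernel.run(queue);

    if(flush)
    {
        queue.flush();
    }
}
```

With a tuner and the default flush argument the log reads tune, run, flush. With no tuner and flush set to false only run is recorded, which is the case where a caller batches several kernels before flushing.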

◆ enqueue_sync_event()

cl::Event enqueue_sync_event ( )
inline

Enqueues a marker into the associated command queue and returns the event.

Returns
An event that can be waited on to block the executing thread.

Definition at line 160 of file CLScheduler.h.

    {
        cl::Event event;
        _queue.enqueueMarker(&event);

        return event;
    }
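
A marker event lets a caller wait for the commands enqueued before the marker without draining the whole queue. The sketch below is a single-threaded toy model of an in-order queue, written only to make that ordering visible. ToyQueue and its members are hypothetical, and the model defers execution until something waits, whereas a real OpenCL queue executes asynchronously on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-threaded model of an in-order command queue; this is not OpenCL.
class ToyQueue
{
public:
    // Handle returned by enqueue_marker(); waiting completes everything enqueued before it.
    class Event
    {
    public:
        Event(ToyQueue *queue, std::size_t position) : _queue(queue), _position(position) {}
        void wait() { _queue->run_until(_position); }

    private:
        ToyQueue   *_queue;
        std::size_t _position;
    };

    void enqueue(std::function<void()> command) { _commands.push_back(std::move(command)); }

    // The marker remembers how many commands precede it.
    Event enqueue_marker() { return Event(this, _commands.size()); }

    // Drains every pending command, like a blocking finish.
    void finish() { run_until(_commands.size()); }

    std::size_t completed() const { return _next; }

private:
    void run_until(std::size_t position)
    {
        assert(position <= _commands.size());
        for(; _next < position; ++_next)
        {
            _commands[_next]();
        }
    }

    std::vector<std::function<void()>> _commands{};
    std::size_t                        _next{ 0 };
};
```

In this model, waiting on the event completes only the commands that were enqueued before the marker, while finish() completes everything, which is the distinction between enqueue_sync_event() and sync().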

◆ get()

CLScheduler & get ( )
static

Access the scheduler singleton.

Returns
The scheduler

Definition at line 41 of file CLScheduler.cpp.

{
    std::call_once(_initialize_symbols, opencl_is_available);
    static CLScheduler scheduler;
    return scheduler;
}

References arm_compute::opencl_is_available().

Referenced by CLTensorAllocator::allocate(), CLLut::clear(), CLPriorBoxLayer::configure(), CLFlattenLayer::configure(), CLRange::configure(), CLPoolingLayer::configure(), CLScale::configure(), CLFFT1D::configure(), CLDirectConvolutionLayer::configure(), CLMeanStdDev::configure(), CLSoftmaxLayer::configure(), CLHOGDetector::configure(), CLMinMaxLocation::configure(), CLFastCorners::configure(), CLLocallyConnectedLayer::configure(), CLDepthwiseConvolutionLayer3x3::configure(), CLHOGMultiDetection::configure(), CLGEMMLowpMatrixMultiplyCore::configure(), CLGEMM::configure(), CLConvolutionLayer::configure(), CLFullyConnectedLayer::configure(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::configure(), CLGEMMConvolutionLayer::configure(), CLDepthwiseConvolutionLayer::configure(), arm_compute::test::validation::DATA_TEST_CASE(), CLTensorAllocator::import_memory(), CLHOG::init(), CLDeviceBackend::initialize_backend(), main(), CLHOG::map(), CLTensor::map(), CLDistribution1D::map(), CLLut::map(), CLArray< cl_int >::map(), CLSubTensor::map(), OpenCLClock< output_timestamps >::OpenCLClock(), CLLocallyConnectedLayer::prepare(), CLGEMMLowpMatrixMultiplyCore::prepare(), CLWinogradConvolutionLayer::prepare(), CLDepthwiseConvolutionLayer3x3::prepare(), CLGEMM::prepare(), CLFFTConvolutionLayer::prepare(), CLFullyConnectedLayer::prepare(), CLGEMMConvolutionLayer::prepare(), CLLSTMLayer::prepare(), CLDepthwiseConvolutionLayer::prepare(), arm_compute::utils::restore_program_cache_from_file(), ICLSimpleFunction::run(), CLIntegralImage::run(), CLEqualizeHistogram::run(), CLHistogram::run(), CLSpaceToDepthLayer::run(), CLDepthToSpaceLayer::run(), CLHOGDescriptor::run(), CLHOGGradient::run(), CLGaussian5x5::run(), CLSplit::run(), CLHOGDetector::run(), CLNormalizePlanarYUVLayer::run(), CLFFT1D::run(), CLSobel5x5::run(), CLSobel7x7::run(), CLMinMaxLocation::run(), CLRNNLayer::run(), CLStackLayer::run(), CLNormalizationLayer::run(), CLCannyEdge::run(), 
CLL2NormalizeLayer::run(), CLFastCorners::run(), CLReductionOperation::run(), CLUpsampleLayer::run(), CLBatchToSpaceLayer::run(), CLDirectConvolutionLayer::run(), CLSoftmaxLayer::run(), CLConvolutionLayerReshapeWeights::run(), CLPadLayer::run(), CLDeconvolutionLayerUpsample::run(), CLConcatenateLayer::run(), CLBatchNormalizationLayer::run(), CLHarrisCorners::run(), CLHOGMultiDetection::run(), CLConvolutionSquare< matrix_size >::run(), CLGaussianPyramidHalf::run(), CLLocallyConnectedLayer::run(), CLFuseBatchNormalization::run(), CLOpticalFlow::run(), CLCropResize::run(), CLGEMMLowpMatrixMultiplyCore::run(), CLSpaceToBatchLayer::run(), CLWinogradConvolutionLayer::run(), CLDepthwiseConvolutionLayer3x3::run(), CLGEMM::run(), CLGenerateProposalsLayer::run(), CLGEMMDeconvolutionLayer::run(), CLGaussianPyramidOrb::run(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::run(), CLFullyConnectedLayer::run(), CLGEMMConvolutionLayer::run(), CLLSTMLayer::run(), CLDepthwiseConvolutionLayer::run(), Framework::run(), arm_compute::utils::save_program_cache_to_file(), arm_compute::test::validation::TEST_CASE(), OpenCLClock< output_timestamps >::test_measurements(), CLHOG::unmap(), CLTensor::unmap(), CLDistribution1D::unmap(), CLLut::unmap(), CLArray< cl_int >::unmap(), CLSubTensor::unmap(), CLDirectConvolutionLayer::validate(), CLGEMMLowpMatrixMultiplyCore::validate(), CLGEMM::validate(), CLConvolutionLayer::validate(), and CLFullyConnectedLayer::validate().
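
get() combines a one-time setup step with a function-local static instance. A minimal sketch of that pattern follows. MiniSingleton, load_symbols_once and symbol_load_count are hypothetical names, and the one-time step is modelled with a second function-local static rather than std::call_once, which keeps the sketch free of any threading-library link requirement. Both forms run the step exactly once.

```cpp
#include <cassert>

// Hypothetical counter standing in for the one-time OpenCL symbol loading.
inline int &symbol_load_count()
{
    static int count = 0;
    return count;
}

inline bool load_symbols_once()
{
    ++symbol_load_count();
    return true;
}

class MiniSingleton
{
public:
    static MiniSingleton &get()
    {
        // Initialised on the first call only; later calls skip the initialiser.
        static const bool symbols_loaded = load_symbols_once();
        assert(symbols_loaded);
        (void)symbols_loaded;

        // Constructed on first use; initialisation is thread-safe since C++11.
        static MiniSingleton instance;
        return instance;
    }

    MiniSingleton(const MiniSingleton &) = delete;
    MiniSingleton &operator=(const MiniSingleton &) = delete;

private:
    MiniSingleton() = default;
};
```

Every call returns a reference to the same object, so callers should bind the result to a reference instead of trying to copy it.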

◆ init()

void init ( cl::Context context,
cl::CommandQueue queue,
const cl::Device & device,
ICLTuner * cl_tuner = nullptr
)

Initialises the context and command queue to be used by the scheduler.

Parameters
[in]  context   A CL context.
[in]  queue     A CL command queue.
[in]  device    A CL device.
[in]  cl_tuner  (Optional) Pointer to OpenCL tuner (default=nullptr)

Note: It is the caller's responsibility to release the memory allocated for the CLTuner.

Definition at line 86 of file CLScheduler.cpp.

{
    set_context(std::move(context));
    _queue          = std::move(queue);
    _target         = get_target_from_device(device);
    _is_initialised = true;
    _cl_tuner       = cl_tuner;
}

References CLScheduler::context(), arm_compute::get_target_from_device(), CLScheduler::queue(), and CLScheduler::set_context().

Referenced by CLScheduler::default_init(), and CLScheduler::default_init_with_context().

◆ is_initialised()

bool is_initialised ( ) const
inline

Returns true if the scheduler has been initialised, false otherwise.

Definition at line 180 of file CLScheduler.h.

    {
        return _is_initialised;
    }

Referenced by arm_compute::utils::restore_program_cache_from_file().

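◆ queue()

cl::CommandQueue& queue ( )
inline

Accessor for the associated CL command queue.

Definition at line 102 of file CLScheduler.h.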

◆ set_context()

void set_context ( cl::Context  context)

Accessor to set the CL context to be used by the scheduler.

Parameters
[in]  context  A CL context.

Definition at line 80 of file CLScheduler.cpp.

{
    _context = std::move(context);
    CLKernelLibrary::get().set_context(_context);
}

References CLScheduler::context(), CLKernelLibrary::get(), and CLKernelLibrary::set_context().

Referenced by CLScheduler::init(), and Framework::run().
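
According to the References line, set_context() forwards the context to CLKernelLibrary, and the context() accessor re-reads it from CLKernelLibrary before returning it. The sketch below illustrates how the two singletons stay in step under that reading. ContextSketch, MiniKernelLibrary and FakeContext are hypothetical stand-ins, with a string playing the role of cl::Context.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in: a string plays the role of cl::Context.
using FakeContext = std::string;

class MiniKernelLibrary
{
public:
    static MiniKernelLibrary &get()
    {
        static MiniKernelLibrary library;
        return library;
    }

    void set_context(FakeContext context) { _context = std::move(context); }
    FakeContext &context() { return _context; }

private:
    FakeContext _context{};
};

class ContextSketch
{
public:
    // Stores the context, then forwards it to the kernel library.
    void set_context(FakeContext context)
    {
        _context = std::move(context);
        MiniKernelLibrary::get().set_context(_context);
        _is_initialised = true; // simplification: the real flag is set by init()
    }

    // Refreshes the cached context from the kernel library before returning it.
    FakeContext &context()
    {
        assert(_is_initialised);
        _context = MiniKernelLibrary::get().context();
        return _context;
    }

private:
    bool        _is_initialised{ false };
    FakeContext _context{};
};
```

In this model the kernel library is the effective source of truth: a context set directly on it becomes visible through the scheduler's accessor on the next call.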

◆ set_queue()

void set_queue ( cl::CommandQueue  queue)
inline

Accessor to set the CL command queue to be used by the scheduler.

Parameters
[in]  queue  A CL command queue.

Definition at line 127 of file CLScheduler.h.

    {
        _queue = std::move(queue);
    }

References CLScheduler::queue().

Referenced by OpenCLClock< output_timestamps >::OpenCLClock(), and Framework::run().

◆ set_target()

void set_target ( GPUTarget  target)
inline

Accessor to set target GPU to be used by the scheduler.

Parameters
[in]  target  The target GPU.

Definition at line 136 of file CLScheduler.h.

    {
        _target = target;
    }

References CLScheduler::target().

◆ set_tuner()

void set_tuner ( ICLTuner * tuner )
inline

Accessor to set the CL tuner to be used by the scheduler.

Parameters
[in]  tuner  A CL tuner

Definition at line 145 of file CLScheduler.h.

    {
        _cl_tuner = tuner;
    }

◆ sync()

void sync ( )
inline

Blocks until all commands in the associated command queue have finished.

Definition at line 151 of file CLScheduler.h.

    {
        _queue.finish();
    }

Referenced by main(), CLPadLayer::run(), CLCropResize::run(), and arm_compute::test::validation::TEST_CASE().

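◆ target()

GPUTarget target ( ) const
inline

Get the target GPU.

Definition at line 112 of file CLScheduler.h.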

◆ tune_kernel_static()

void tune_kernel_static ( ICLKernel & kernel )
inline

Tunes OpenCL kernel.

Parameters
[in]  kernel  Kernel to tune

Definition at line 172 of file CLScheduler.h.

    {
        if(_cl_tuner != nullptr)
        {
            _cl_tuner->tune_kernel_static(kernel);
        }
    }

References ICLTuner::tune_kernel_static().

Referenced by CLFlattenLayer::configure(), CLRange::configure(), CLScale::configure(), CLPoolingLayer::configure(), CLDirectConvolutionLayer::configure(), CLLocallyConnectedLayer::configure(), CLGEMMConvolutionLayer::configure(), and CLDepthwiseConvolutionLayer::configure().
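
The Referenced by lists suggest a split: tune_kernel_static() is called from configure() methods, while dynamic tuning happens inside enqueue(), which is reached from run() and prepare(). The sketch below illustrates that split with call counters. ToyKernel, CountingTuner and ToyFunction are hypothetical stand-ins, and the real tuner interface is not reproduced here.

```cpp
#include <cassert>

// Hypothetical stand-ins, not the Compute Library interfaces.
struct ToyKernel
{
};

struct CountingTuner
{
    int static_calls{ 0 };
    int dynamic_calls{ 0 };
    void tune_kernel_static(ToyKernel &) { ++static_calls; }
    void tune_kernel_dynamic(ToyKernel &) { ++dynamic_calls; }
};

class ToyFunction
{
public:
    explicit ToyFunction(CountingTuner *tuner) : _tuner(tuner) {}

    // Configure time: the static hook runs once, behind the same null check as the listing.
    void configure()
    {
        if(_tuner != nullptr)
        {
            _tuner->tune_kernel_static(_kernel);
        }
        _configured = true;
    }

    // Run time: the dynamic hook runs on every execution.
    void run()
    {
        assert(_configured);
        if(_tuner != nullptr)
        {
            _tuner->tune_kernel_dynamic(_kernel);
        }
    }

private:
    CountingTuner *_tuner; // non-owning, may be null
    ToyKernel      _kernel{};
    bool           _configured{ false };
};
```

Configuring once and running three times yields one static call and three dynamic calls in this model. With a null tuner both hooks are skipped and the function still runs.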


The documentation for this class was generated from the following files: