Compute Library
 22.02
CLScheduler Class Reference (final)

Provides global access to a CL context and command queue.

#include <CLScheduler.h>

Public Member Functions

 CLScheduler ()
 Constructor.

 CLScheduler (const CLScheduler &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

CLScheduler & operator= (const CLScheduler &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 ~CLScheduler ()=default
 Default destructor.

void default_init (ICLTuner *cl_tuner=nullptr, CLGEMMHeuristicsHandle *gemm_h=nullptr, CLBackendType cl_backend_type=CLBackendType::Native)
 Initialises the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

void default_init_with_context (cl::Device &device, cl::Context &ctx, ICLTuner *cl_tuner=nullptr, CLGEMMHeuristicsHandle *gemm_h=nullptr)
 Initialises the scheduler with the context and device provided by the user.

void enqueue (ICLKernel &kernel, bool flush=true)
 Schedule the execution of the passed kernel if possible.

void enqueue_op (ICLKernel &kernel, ITensorPack &tensors, bool flush=true)
 Schedule the execution of the passed kernel if possible.

void init (cl::Context context, cl::CommandQueue queue, const cl::Device &device, ICLTuner *cl_tuner=nullptr, CLGEMMHeuristicsHandle *gemm_h=nullptr, CLBackendType cl_backend_type=CLBackendType::Native)
 Initialises the context and command queue to be used by the scheduler.

cl::Context & context ()
 Accessor for the associated CL context.

cl::CommandQueue & queue ()
 Accessor for the associated CL command queue.

GPUTarget target () const
 Get the target GPU.

CLGEMMHeuristicsHandle * gemm_heuristics () const
 Accessor for the associated CLGEMMHeuristicsHandle.

void set_context (cl::Context context)
 Accessor to set the CL context to be used by the scheduler.

void set_queue (cl::CommandQueue queue)
 Accessor to set the CL command queue to be used by the scheduler.

void set_target (GPUTarget target)
 Accessor to set the target GPU to be used by the scheduler.

void set_tuner (ICLTuner *tuner)
 Accessor to set the CL tuner to be used by the scheduler.

void sync ()
 Blocks until all commands in the associated command queue have finished.

cl::Event enqueue_sync_event ()
 Enqueues a marker into the associated command queue and returns the event.

void tune_kernel_static (ICLKernel &kernel)
 Tunes OpenCL kernel.

void enable_job_chaining (int job_chaining_size)
 Enable job chaining.

bool is_initialised () const
 

Static Public Member Functions

static CLScheduler & get ()
 Access the scheduler singleton.
 

Detailed Description

Provides global access to a CL context and command queue.

Definition at line 43 of file CLScheduler.h.

Constructor & Destructor Documentation

◆ CLScheduler() [1/2]

Constructor.

Definition at line 97 of file CLScheduler.cpp.

98  : _context(), _queue(), _target(GPUTarget::MIDGARD), _is_initialised(false), _cl_tuner(nullptr), _gemm_heuristics(nullptr), _backend_type(CLBackendType::Native), _job_chaining_enabled(false),
99  _job_chaining_size(), _job_chaining_count(0)
100 {
101 }

◆ CLScheduler() [2/2]

CLScheduler ( const CLScheduler & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ ~CLScheduler()

~CLScheduler ( )
default

Default destructor.

Member Function Documentation

◆ context()

cl::Context & context ( )

Accessor for the associated CL context.

Returns
A CL context.

Definition at line 32 of file CLScheduler.cpp.

References ARM_COMPUTE_ERROR_ON, CLKernelLibrary::context(), and CLKernelLibrary::get().

Referenced by CLTensorAllocator::import_memory(), arm_compute::restore_program_cache_from_file(), and Framework::run().

33 {
34  ARM_COMPUTE_ERROR_ON(!_is_initialised);
35  _context = CLKernelLibrary::get().context();
36  return _context;
37 }
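
The guard in the listing above can be illustrated without OpenCL. The following is a minimal sketch, not the library's code: `GuardedScheduler`, `context_throws()` and the `std::string` handle are hypothetical stand-ins, and a thrown `std::logic_error` is assumed here in place of `ARM_COMPUTE_ERROR_ON`.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for CLScheduler; a std::string replaces cl::Context.
class GuardedScheduler
{
public:
    // Stores the context and marks the scheduler as usable.
    void init(std::string context)
    {
        _context        = std::move(context);
        _is_initialised = true;
    }

    // Accessor that refuses to hand out a context before initialisation.
    std::string &context()
    {
        if(!_is_initialised)
        {
            throw std::logic_error("scheduler used before initialisation");
        }
        return _context;
    }

    bool is_initialised() const
    {
        return _is_initialised;
    }

private:
    std::string _context{};
    bool        _is_initialised{ false };
};

// Helper for experiments: reports whether context() threw.
inline bool context_throws(GuardedScheduler &scheduler)
{
    try
    {
        scheduler.context();
        return false;
    }
    catch(const std::logic_error &)
    {
        return true;
    }
}
```

In this model, calling `context()` on a fresh `GuardedScheduler` throws, and the same call succeeds once `init()` has run. Checking `is_initialised()` first avoids the exception path.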

◆ default_init()

void default_init ( ICLTuner * cl_tuner = nullptr,
CLGEMMHeuristicsHandle * gemm_h = nullptr,
CLBackendType cl_backend_type = CLBackendType::Native
)

Initialises the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

Parameters
[in]  cl_tuner         (Optional) Pointer to ICLTuner (default = nullptr)
[in]  gemm_h           (Optional) Pointer to CLGEMMHeuristicsHandle (default = nullptr)
[in]  cl_backend_type  (Optional) Type of backend to use (default = CLBackendType::Native)

Definition at line 122 of file CLScheduler.cpp.

References ARM_COMPUTE_ERROR_ON_MSG, arm_compute::create_opencl_context_and_device(), CLKernelLibrary::get(), CLKernelLibrary::init(), CLScheduler::init(), and CLScheduler::queue().

Referenced by CLDeviceBackend::initialize_backend(), and arm_compute::restore_program_cache_from_file().

123 {
124  if(!_is_initialised)
125  {
126  cl::Context ctx;
127  cl::Device dev;
128  cl_int err;
129  std::tie(ctx, dev, err) = create_opencl_context_and_device(cl_backend_type);
130  ARM_COMPUTE_ERROR_ON_MSG(err != CL_SUCCESS, "Failed to create OpenCL context");
131  cl::CommandQueue queue = cl::CommandQueue(ctx, dev);
132  CLKernelLibrary::get().init("./cl_kernels/", ctx, dev);
133  init(ctx, queue, dev, cl_tuner, gemm_h);
134  }
135 
136  // Set CL tuner
137  _cl_tuner = cl_tuner;
138 }
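
One detail of the listing above is easy to miss: the context is created only on the first call, but the tuner assignment sits outside the guard and so runs on every call. A minimal sketch of that control flow, using hypothetical stand-ins (`OnceInitScheduler`, `MiniTuner`, and a counter in place of real context creation):

```cpp
#include <string>

// Hypothetical stand-in for ICLTuner; only its identity matters here.
struct MiniTuner
{
    std::string name;
};

// Hypothetical model of the initialise-once control flow.
class OnceInitScheduler
{
public:
    void default_init(MiniTuner *tuner = nullptr)
    {
        if(!_is_initialised)
        {
            // Stands in for creating the context, queue and kernel library.
            ++_contexts_created;
            _is_initialised = true;
        }
        // Runs on every call, so a later call can replace or clear the tuner.
        _tuner = tuner;
    }

    int contexts_created() const
    {
        return _contexts_created;
    }

    MiniTuner *tuner() const
    {
        return _tuner;
    }

private:
    bool       _is_initialised{ false };
    int        _contexts_created{ 0 };
    MiniTuner *_tuner{ nullptr };
};
```

In this model a second `default_init()` call with no argument leaves the context untouched but resets the tuner to `nullptr`, which is worth keeping in mind when several components call the initialiser.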

◆ default_init_with_context()

void default_init_with_context ( cl::Device & device,
cl::Context & ctx,
ICLTuner * cl_tuner = nullptr,
CLGEMMHeuristicsHandle * gemm_h = nullptr
)

Initialises the scheduler with context and device provided by the user.

Parameters
[in]  device    OpenCL device to be used
[in]  ctx       OpenCL context to be used
[in]  cl_tuner  (Optional) Pointer to ICLTuner (default = nullptr)
[in]  gemm_h    (Optional) Pointer to CLGEMMHeuristicsHandle (default = nullptr)

Definition at line 110 of file CLScheduler.cpp.

References CLKernelLibrary::get(), CLKernelLibrary::init(), CLScheduler::init(), and CLScheduler::queue().

Referenced by main(), and arm_compute::utils::run_example().

111 {
112  if(!_is_initialised)
113  {
114  const std::string cl_kernels_folder("./cl_kernels/");
115  cl::CommandQueue queue = cl::CommandQueue(ctx, device);
116  CLKernelLibrary::get().init(cl_kernels_folder, ctx, device);
117  init(ctx, queue, device, cl_tuner, gemm_h);
118  _cl_tuner = cl_tuner;
119  }
120 }

◆ enable_job_chaining()

void enable_job_chaining ( int  job_chaining_size)

Enable job chaining.

The command queue will only be flushed when job_chaining_size kernels have been enqueued.

Parameters
[in]  job_chaining_size  Number of kernels to enqueue before flushing

Definition at line 199 of file CLScheduler.cpp.

200 {
201  _job_chaining_enabled = true;
202  _job_chaining_size = job_chaining_size;
203 }
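
The flush policy described above can be modelled with a counter. This is a sketch based only on the documented behaviour (flush once `job_chaining_size` kernels have been enqueued, and ignore the `flush` argument while chaining is enabled). `CountingQueue`, `ChainingScheduler` and `count_flushes()` are hypothetical names, and the real enqueue path is not shown on this page.

```cpp
// Hypothetical stand-in for a command queue that only counts flushes.
struct CountingQueue
{
    int flushes{ 0 };

    void flush()
    {
        ++flushes;
    }
};

// Hypothetical model of the documented job-chaining policy.
class ChainingScheduler
{
public:
    void enable_job_chaining(int job_chaining_size)
    {
        _job_chaining_enabled = true;
        _job_chaining_size    = job_chaining_size;
    }

    // 'flush' is only honoured while job chaining is disabled.
    void enqueue(CountingQueue &queue, bool flush = true)
    {
        if(_job_chaining_enabled)
        {
            if(++_job_chaining_count >= _job_chaining_size)
            {
                _job_chaining_count = 0;
                queue.flush();
            }
        }
        else if(flush)
        {
            queue.flush();
        }
    }

private:
    bool _job_chaining_enabled{ false };
    int  _job_chaining_size{ 0 };
    int  _job_chaining_count{ 0 };
};

// Enqueues 'kernels' jobs and returns how many flushes were issued.
// A non-positive 'job_chaining_size' leaves chaining disabled.
inline int count_flushes(int kernels, int job_chaining_size)
{
    CountingQueue     queue;
    ChainingScheduler scheduler;
    if(job_chaining_size > 0)
    {
        scheduler.enable_job_chaining(job_chaining_size);
    }
    for(int i = 0; i < kernels; ++i)
    {
        scheduler.enqueue(queue);
    }
    return queue.flushes;
}
```

With 8 kernels and a chain size of 4, this model flushes twice instead of eight times. With 7 kernels it flushes once, and the last three kernels stay unflushed until something else flushes or finishes the queue.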

◆ enqueue()

void enqueue ( ICLKernel & kernel,
bool flush = true
)

Schedule the execution of the passed kernel if possible.

◆ enqueue_op()

void enqueue_op ( ICLKernel & kernel,
ITensorPack & tensors,
bool flush = true
)

Schedule the execution of the passed kernel if possible.

Parameters
[in]  kernel   Kernel to execute.
[in]  tensors  Vector containing the tensors to operate on.
[in]  flush    (Optional) Specifies if the command queue will be flushed after running the kernel. This will be ignored if job chaining is enabled.

Definition at line 194 of file CLScheduler.cpp.

Referenced by ClGemm::prepare(), ClGemmLowpMatrixMultiplyCore::prepare(), ClWinogradConv2d::prepare(), ClGemmConv2d::prepare(), CLQLSTMLayer::prepare(), ClDequantize::run(), ClQuantize::run(), ICLOperator::run(), ClScale::run(), ClSoftmax::run(), ClConcatenate::run(), ClDirectConv2d::run(), ClDirectConv3d::run(), ClGemmLowpOutputStage::run(), ClGemm::run(), ClGemmLowpMatrixMultiplyCore::run(), ClWinogradConv2d::run(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::run(), ClGemmConv2d::run(), CLLSTMLayer::run(), and ClSynthetizeOperatorWithBorder< K >::run().

195 {
196  enqueue_common(kernel, tensors, flush);
197 }

◆ enqueue_sync_event()

cl::Event enqueue_sync_event ( )

Enqueues a marker into the associated command queue and returns the event.

Returns
An event that can be waited on to block the executing thread.

Definition at line 75 of file CLScheduler.cpp.

76 {
77  cl::Event event;
78  _queue.enqueueMarker(&event);
79  return event;
80 }
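
To show what waiting on a marker means, here is a small single-threaded model of an in-order queue. It is a sketch under simplifying assumptions, not OpenCL: `MiniQueue`, `MiniEvent`, `run_one()` and `is_complete()` are hypothetical, and commands execute only when the caller steps the queue.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

class MiniQueue;

// Hypothetical event: complete once the queue has executed every command
// that was enqueued before the marker.
class MiniEvent
{
public:
    MiniEvent(const MiniQueue *queue, std::size_t position)
        : _queue(queue), _position(position)
    {
    }

    bool is_complete() const;

private:
    const MiniQueue *_queue;
    std::size_t      _position;
};

// Hypothetical in-order queue that runs commands when stepped by the caller.
class MiniQueue
{
public:
    void enqueue(std::function<void()> command)
    {
        _commands.push_back(std::move(command));
    }

    // Records how many commands precede the marker.
    MiniEvent enqueue_marker() const
    {
        return MiniEvent(this, _commands.size());
    }

    // Executes the next pending command, if any.
    void run_one()
    {
        if(_executed < _commands.size())
        {
            _commands[_executed]();
            ++_executed;
        }
    }

    // Drains the whole queue, in the spirit of a blocking finish.
    void finish()
    {
        while(_executed < _commands.size())
        {
            run_one();
        }
    }

    std::size_t executed() const
    {
        return _executed;
    }

private:
    std::vector<std::function<void()>> _commands{};
    std::size_t                        _executed{ 0 };
};

inline bool MiniEvent::is_complete() const
{
    return _queue->executed() >= _position;
}
```

The marker's event completes as soon as the commands ahead of it have run, even if later commands are still pending. That is the difference between waiting on the returned event and draining the entire queue.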

◆ gemm_heuristics()

CLGEMMHeuristicsHandle * gemm_heuristics ( ) const

Accessor for the associated CLGEMMHeuristicsHandle.

◆ get()

CLScheduler & get ( )
static

Access the scheduler singleton.

This method has been deprecated and will be removed in future releases.

Returns
The scheduler

Definition at line 103 of file CLScheduler.cpp.

References arm_compute::opencl_is_available().

Referenced by CLTuner::add_tuning_params(), CLBufferAllocator::allocate(), ClQueue::cl_queue(), CLBufferMemoryRegion::CLBufferMemoryRegion(), ClScale::configure(), ClPool2d::configure(), ClSoftmax::configure(), ClDirectConv2d::configure(), CLPriorBoxLayer::configure(), CLRange::configure(), CLFFT1D::configure(), ClGemm::configure(), CLDepthwiseConvolutionLayer::configure(), ClGemmLowpMatrixMultiplyCore::configure(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::configure(), CLCropResize::configure(), ClGemmConv2d::configure(), ClConv2d::configure(), CLConvolutionLayer::configure(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::configure(), arm_compute::test::validation::DATA_TEST_CASE(), ClQueue::finish(), CLTensorAllocator::import_memory(), CLDeviceBackend::initialize_backend(), main(), CLArray< cl_int >::map(), CLSubTensor::map(), CLTensor::map(), OpenCLClock< output_timestamps >::OpenCLClock(), ClGemm::prepare(), ClGemmLowpMatrixMultiplyCore::prepare(), ClWinogradConv2d::prepare(), ClGemmConv2d::prepare(), CLFFTConvolutionLayer::prepare(), CLQLSTMLayer::prepare(), arm_compute::restore_program_cache_from_file(), ClDequantize::run(), ClQuantize::run(), CLSplit::run(), ICLOperator::run(), ClScale::run(), ClSoftmax::run(), ClConcatenate::run(), ClDirectConv2d::run(), ClDirectConv3d::run(), ClGemmLowpOutputStage::run(), CLSpaceToDepthLayer::run(), CLFFT1D::run(), ClGemm::run(), CLDeconvolutionLayerUpsample::run(), CLStackLayer::run(), ClGemmLowpMatrixMultiplyCore::run(), CLL2NormalizeLayer::run(), CLNormalizationLayer::run(), CLArgMinMaxLayer::run(), CLPadLayer::run(), ClWinogradConv2d::run(), CLReductionOperation::run(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::run(), CLDepthwiseConvolutionLayer::run(), CLMaxUnpoolingLayer::run(), ClGemmConv2d::run(), CLBatchToSpaceLayer::run(), CLFuseBatchNormalization::run(), CLCropResize::run(), 
CLBatchNormalizationLayer::run(), CLSpaceToBatchLayer::run(), CLGEMMDeconvolutionLayer::run(), CLGenerateProposalsLayer::run(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::run(), CLLSTMLayer::run(), CLQLSTMLayer::run(), ClSynthetizeOperatorWithBorder< K >::run(), Framework::run(), arm_compute::utils::run_example(), arm_compute::save_program_cache_to_file(), arm_compute::schedule_kernel_on_ctx(), ClQueue::scheduler(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_config_native(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_config_reshaped(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_config_reshaped_only_rhs(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_kernel(), ClContext::set_cl_ctx(), ClQueue::set_cl_queue(), CLTensorAllocator::set_global_allocator(), CLDeviceBackend::setup_backend_context(), CLDeviceBackend::sync(), arm_compute::test::sync_if_necessary(), arm_compute::test::validation::TEST_CASE(), OpenCLClock< output_timestamps >::test_measurements(), CLArray< cl_int >::unmap(), CLSubTensor::unmap(), CLTensor::unmap(), CLBufferMemoryRegion::unmap(), ClGemm::validate(), ClGemmLowpMatrixMultiplyCore::validate(), CLDepthwiseConvolutionLayer::validate(), ClConv2d::validate(), CLGenerateProposalsLayer::validate(), and CLConvolutionLayer::validate().

104 {
105  std::call_once(_initialize_symbols, opencl_is_available);
106  static CLScheduler scheduler;
107  return scheduler;
108 }
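
The accessor above combines a one-time runtime probe with a function-local static instance. The sketch below shows the same shape with hypothetical names (`MiniSingleton`, `probe_runtime()`, `probe_calls()`). Note one deliberate swap: a function-local static initialiser takes the place of `std::call_once`, so the example has no threading-library dependency. Unlike the class documented here, the sketch also makes its constructor private.

```cpp
// Hypothetical counter so the one-time probe can be observed.
inline int &probe_calls()
{
    static int calls = 0;
    return calls;
}

// Hypothetical stand-in for opencl_is_available().
inline bool probe_runtime()
{
    ++probe_calls();
    return true;
}

class MiniSingleton
{
public:
    MiniSingleton(const MiniSingleton &) = delete;
    MiniSingleton &operator=(const MiniSingleton &) = delete;

    static MiniSingleton &get()
    {
        // Initialised exactly once, on the first call to get().
        static const bool runtime_probed = probe_runtime();
        (void)runtime_probed;

        // Constructed on first use; every caller receives the same object.
        static MiniSingleton instance;
        return instance;
    }

private:
    MiniSingleton() = default;
};
```

Repeated calls to `MiniSingleton::get()` return references to the same object, and the probe runs only on the first call.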

◆ init()

void init ( cl::Context context,
cl::CommandQueue queue,
const cl::Device & device,
ICLTuner * cl_tuner = nullptr,
CLGEMMHeuristicsHandle * gemm_h = nullptr,
CLBackendType cl_backend_type = CLBackendType::Native
)

Initialises the context and command queue to be used by the scheduler.

Parameters
[in]  context          A CL context.
[in]  queue            A CL command queue.
[in]  device           A CL device.
[in]  cl_tuner         (Optional) Pointer to OpenCL tuner (default = nullptr). Note: it is the caller's responsibility to release the memory allocated for the CLTuner.
[in]  gemm_h           (Optional) Pointer to CLGEMMHeuristicsHandle (default = nullptr)
[in]  cl_backend_type  (Optional) Type of backend to use (default = CLBackendType::Native)

Definition at line 146 of file CLScheduler.cpp.

References ARM_COMPUTE_ERROR_ON_MSG, ITensorPack::empty(), arm_compute::get_target_from_device(), ICLKernel::run(), ICLKernel::run_op(), CLScheduler::set_context(), ICLTuner::tune_kernel_dynamic(), and IKernel::window().

Referenced by CLScheduler::default_init(), and CLScheduler::default_init_with_context().

147 {
148  set_context(std::move(context));
149  _queue = std::move(queue);
150  _target = get_target_from_device(device);
151  _is_initialised = true;
152  _cl_tuner = cl_tuner;
153  _gemm_heuristics = gemm_h;
154  _backend_type = cl_backend_type;
155 }

◆ is_initialised()

bool is_initialised ( ) const

Definition at line 90 of file CLScheduler.cpp.

Referenced by arm_compute::restore_program_cache_from_file().

91 {
92  return _is_initialised;
93 }

◆ operator=()

CLScheduler & operator= ( const CLScheduler & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ queue()

cl::CommandQueue & queue ( )

Accessor for the associated CL command queue.

Definition at line 39 of file CLScheduler.cpp.

◆ set_context()

void set_context ( cl::Context  context)

Accessor to set the CL context to be used by the scheduler.

Parameters
[in]  context  A CL context.

Definition at line 140 of file CLScheduler.cpp.

References CLKernelLibrary::get(), and CLKernelLibrary::set_context().

Referenced by CLScheduler::init(), Framework::run(), and ClContext::set_cl_ctx().

141 {
142  _context = std::move(context);
143  CLKernelLibrary::get().set_context(_context);
144 }

◆ set_queue()

void set_queue ( cl::CommandQueue  queue)

Accessor to set the CL command queue to be used by the scheduler.

Parameters
[in]  queue  A CL command queue.

Definition at line 55 of file CLScheduler.cpp.

Referenced by OpenCLClock< output_timestamps >::OpenCLClock(), Framework::run(), and ClQueue::set_cl_queue().

56 {
57  _queue = std::move(queue);
58 }

◆ set_target()

void set_target ( GPUTarget  target)

Accessor to set target GPU to be used by the scheduler.

Parameters
[in]  target  The target GPU.

Definition at line 60 of file CLScheduler.cpp.

References CLScheduler::target().

61 {
62  _target = target;
63 }

◆ set_tuner()

void set_tuner ( ICLTuner * tuner)

Accessor to set the CL tuner to be used by the scheduler.

Parameters
[in]  tuner  A CL tuner

Definition at line 65 of file CLScheduler.cpp.

66 {
67  _cl_tuner = tuner;
68 }

◆ sync()

void sync ( )

Blocks until all commands in the associated command queue have finished.

Definition at line 70 of file CLScheduler.cpp.

Referenced by CLCropResize::configure(), main(), CLCropResize::run(), CLDeviceBackend::sync(), arm_compute::test::sync_if_necessary(), and arm_compute::test::validation::TEST_CASE().

71 {
72  _queue.finish();
73 }

◆ target()

GPUTarget target ( ) const

Get the target GPU.

Definition at line 45 of file CLScheduler.cpp.

◆ tune_kernel_static()

void tune_kernel_static ( ICLKernel & kernel)

Tunes OpenCL kernel.

Parameters
[in]  kernel  Kernel to tune

Definition at line 82 of file CLScheduler.cpp.

References ICLTuner::tune_kernel_static().

Referenced by ClScale::configure(), ClPool2d::configure(), ClDirectConv2d::configure(), CLRange::configure(), and ClGemmConv2d::configure().

83 {
84  if(_cl_tuner != nullptr)
85  {
86  _cl_tuner->tune_kernel_static(kernel);
87  }
88 }
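
The null check above makes the tuner an optional, non-owning collaborator: with no tuner set, the call is a no-op. A minimal sketch of that pattern, with hypothetical stand-ins (`MiniKernel`, `IMiniTuner`, `MarkingTuner`, `TunedScheduler`) in place of the real kernel and tuner interfaces:

```cpp
// Hypothetical stand-in for ICLKernel; records whether it was tuned.
struct MiniKernel
{
    bool tuned{ false };
};

// Hypothetical stand-in for the ICLTuner interface.
class IMiniTuner
{
public:
    virtual ~IMiniTuner() = default;
    virtual void tune_kernel_static(MiniKernel &kernel) = 0;
};

// Trivial tuner that only marks the kernel as tuned.
class MarkingTuner : public IMiniTuner
{
public:
    void tune_kernel_static(MiniKernel &kernel) override
    {
        kernel.tuned = true;
    }
};

class TunedScheduler
{
public:
    void set_tuner(IMiniTuner *tuner)
    {
        _tuner = tuner;
    }

    // Forwards to the tuner if one is set; otherwise does nothing.
    void tune_kernel_static(MiniKernel &kernel)
    {
        if(_tuner != nullptr)
        {
            _tuner->tune_kernel_static(kernel);
        }
    }

private:
    IMiniTuner *_tuner{ nullptr }; // Non-owning: the caller keeps the tuner alive.
};
```

Because the scheduler in this model stores only a raw pointer, the tuner object must outlive every call that may reach it.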

The documentation for this class was generated from the following files: