Compute Library 22.08
CLScheduler Class Reference (final)

Provides global access to a CL context and command queue.

#include <CLScheduler.h>

Public Member Functions

 CLScheduler ()
 Constructor.

 CLScheduler (const CLScheduler &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

CLScheduler & operator= (const CLScheduler &)=delete
 Prevent instances of this class from being copied (as this class contains pointers).

 ~CLScheduler ()=default
 Default destructor.

void default_init (ICLTuner *cl_tuner=nullptr, CLGEMMHeuristicsHandle *gemm_h=nullptr, CLBackendType cl_backend_type=CLBackendType::Native)
 Initialises the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

void default_init_with_context (cl::Device &device, cl::Context &ctx, ICLTuner *cl_tuner=nullptr, CLGEMMHeuristicsHandle *gemm_h=nullptr)
 Initialises the scheduler with context and device provided by the user.

void default_reinit (ICLTuner *cl_tuner=nullptr, CLGEMMHeuristicsHandle *gemm_h=nullptr, CLBackendType cl_backend_type=CLBackendType::Native)
 Re-initializes the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

void enqueue (ICLKernel &kernel, bool flush=true)
 Schedule the execution of the passed kernel if possible.

void enqueue_op (ICLKernel &kernel, ITensorPack &tensors, bool flush=true)
 Schedule the execution of the passed kernel if possible.

void enqueue_op (ICLKernel &kernel, ITensorPack &tensors, const experimental::dynamic_fusion::ClExecutionDescriptor &exec_desc, bool flush=true)
 Schedule the execution of the passed kernel if possible.

void init (cl::Context context, cl::CommandQueue queue, const cl::Device &device, ICLTuner *cl_tuner=nullptr, CLGEMMHeuristicsHandle *gemm_h=nullptr, CLBackendType cl_backend_type=CLBackendType::Native)
 Initialises the context and command queue to be used by the scheduler.

cl::Context & context ()
 Accessor for the associated CL context.

cl::CommandQueue & queue ()
 Accessor for the associated CL command queue.

GPUTarget target () const
 Get the target GPU.

CLGEMMHeuristicsHandle * gemm_heuristics () const
 Accessor for the associated CLGEMMHeuristicsHandle.

void set_context (cl::Context context)
 Accessor to set the CL context to be used by the scheduler.

void set_queue (cl::CommandQueue queue)
 Accessor to set the CL command queue to be used by the scheduler.

void set_target (GPUTarget target)
 Accessor to set target GPU to be used by the scheduler.

void set_tuner (ICLTuner *tuner)
 Accessor to set the CL tuner to be used by the scheduler.

void sync ()
 Blocks until all commands in the associated command queue have finished.

cl::Event enqueue_sync_event ()
 Enqueues a marker into the associated command queue and returns the event.

void tune_kernel_static (ICLKernel &kernel)
 Tunes OpenCL kernel.

void enable_job_chaining (int job_chaining_size)
 Enable job chaining.
 
bool is_initialised () const
 

Static Public Member Functions

static CLScheduler & get ()
 Access the scheduler singleton.
 

Detailed Description

Provides global access to a CL context and command queue.

Definition at line 56 of file CLScheduler.h.

Constructor & Destructor Documentation

◆ CLScheduler() [1/2]

Constructor.

Definition at line 101 of file CLScheduler.cpp.

CLScheduler::CLScheduler()
    : _context(), _queue(), _target(GPUTarget::MIDGARD), _is_initialised(false), _cl_tuner(nullptr), _gemm_heuristics(nullptr), _backend_type(CLBackendType::Native), _job_chaining_enabled(false),
      _job_chaining_size(), _job_chaining_count(0)
{
}

◆ CLScheduler() [2/2]

CLScheduler ( const CLScheduler & ) = delete

Prevent instances of this class from being copied (as this class contains pointers).

◆ ~CLScheduler()

~CLScheduler ( ) = default

Default destructor.

Member Function Documentation

◆ context()

cl::Context & context ( )

Accessor for the associated CL context.

Returns
A CL context.

Definition at line 36 of file CLScheduler.cpp.

References ARM_COMPUTE_ERROR_ON, CLKernelLibrary::context(), and CLKernelLibrary::get().

Referenced by CLTensorAllocator::import_memory(), arm_compute::restore_program_cache_from_file(), and Framework::run().

cl::Context &CLScheduler::context()
{
    ARM_COMPUTE_ERROR_ON(!_is_initialised);
    _context = CLKernelLibrary::get().context();
    return _context;
}
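
The guard at the top of context() can be illustrated without the library. The following is a minimal sketch under stated assumptions: GuardedScheduler and FakeContext are hypothetical stand-ins, and std::logic_error is used here only to mimic an error being raised when the scheduler has not been initialised. It is not the library's implementation.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for cl::Context; not part of the library.
struct FakeContext
{
    int id;
};

// Hypothetical miniature of the "initialise before use" contract.
class GuardedScheduler
{
public:
    // Takes the context by value and moves it into place, then marks the object usable.
    void init(FakeContext ctx)
    {
        _context        = std::move(ctx);
        _is_initialised = true;
    }

    // Accessing the context before init() is treated as an error (assumed: an exception).
    FakeContext &context()
    {
        if(!_is_initialised)
        {
            throw std::logic_error("scheduler used before initialisation");
        }
        return _context;
    }

    bool is_initialised() const
    {
        return _is_initialised;
    }

private:
    FakeContext _context{};
    bool        _is_initialised{ false };
};

// Returns true if context() rejects an uninitialised scheduler.
inline bool context_throws_before_init()
{
    GuardedScheduler s;
    try
    {
        s.context();
    }
    catch(const std::logic_error &)
    {
        return true;
    }
    return false;
}
```

Checking is_initialised() before calling the accessor avoids relying on the error path in calling code.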

◆ default_init()

void default_init ( ICLTuner * cl_tuner = nullptr,
    CLGEMMHeuristicsHandle * gemm_h = nullptr,
    CLBackendType cl_backend_type = CLBackendType::Native
)

Initialises the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

Parameters
[in] cl_tuner         (Optional) Pointer to ICLTuner (default = nullptr)
[in] gemm_h           (Optional) Pointer to CLGEMMHeuristicsHandle (default = nullptr)
[in] cl_backend_type  (Optional) Type of backend to use (default = CLBackendType::Native)
Examples:
dynamic_fusion/cl_fused_conv2d_elementwise_add.cpp.

Definition at line 126 of file CLScheduler.cpp.

References ARM_COMPUTE_ERROR_ON_MSG, arm_compute::create_opencl_context_and_device(), CLKernelLibrary::get(), CLKernelLibrary::init(), CLScheduler::init(), and CLScheduler::queue().

Referenced by CLScheduler::default_reinit(), CLDeviceBackend::initialize_backend(), and arm_compute::restore_program_cache_from_file().

void CLScheduler::default_init(ICLTuner *cl_tuner, CLGEMMHeuristicsHandle *gemm_h, CLBackendType cl_backend_type)
{
    if(!_is_initialised)
    {
        cl::Context ctx;
        cl::Device dev;
        cl_int err;
        std::tie(ctx, dev, err) = create_opencl_context_and_device(cl_backend_type);
        ARM_COMPUTE_ERROR_ON_MSG(err != CL_SUCCESS, "Failed to create OpenCL context");
        cl::CommandQueue queue = cl::CommandQueue(ctx, dev);
        CLKernelLibrary::get().init("./cl_kernels/", ctx, dev);
        init(ctx, queue, dev, cl_tuner, gemm_h);
    }

    // Set CL tuner and GEMM heuristics
    _cl_tuner = cl_tuner;
    _gemm_heuristics = gemm_h;
}

◆ default_init_with_context()

void default_init_with_context ( cl::Device & device,
    cl::Context & ctx,
    ICLTuner * cl_tuner = nullptr,
    CLGEMMHeuristicsHandle * gemm_h = nullptr
)

Initialises the scheduler with context and device provided by the user.

Parameters
[in] device    OpenCL device to be used
[in] ctx       OpenCL context to be used
[in] cl_tuner  (Optional) Pointer to ICLTuner (default = nullptr)
[in] gemm_h    (Optional) Pointer to CLGEMMHeuristicsHandle (default = nullptr)

Definition at line 114 of file CLScheduler.cpp.

References CLKernelLibrary::get(), CLKernelLibrary::init(), CLScheduler::init(), and CLScheduler::queue().

Referenced by main(), and arm_compute::utils::run_example().

void CLScheduler::default_init_with_context(cl::Device &device, cl::Context &ctx, ICLTuner *cl_tuner, CLGEMMHeuristicsHandle *gemm_h)
{
    if(!_is_initialised)
    {
        const std::string cl_kernels_folder("./cl_kernels/");
        cl::CommandQueue queue = cl::CommandQueue(ctx, device);
        CLKernelLibrary::get().init(cl_kernels_folder, ctx, device);
        init(ctx, queue, device, cl_tuner, gemm_h);
        _cl_tuner = cl_tuner;
    }
}

◆ default_reinit()

void default_reinit ( ICLTuner * cl_tuner = nullptr,
    CLGEMMHeuristicsHandle * gemm_h = nullptr,
    CLBackendType cl_backend_type = CLBackendType::Native
)

Re-initializes the context and command queue used by the scheduler to default values and sets a default device and kernel path for the CLKernelLibrary.

Parameters
[in] cl_tuner         (Optional) Pointer to ICLTuner (default = nullptr)
[in] gemm_h           (Optional) Pointer to CLGEMMHeuristicsHandle (default = nullptr)
[in] cl_backend_type  (Optional) Type of backend to use (default = CLBackendType::Native)

Definition at line 145 of file CLScheduler.cpp.

References CLScheduler::default_init().

Referenced by arm_compute::test::validation::TEST_CASE().

void CLScheduler::default_reinit(ICLTuner *cl_tuner, CLGEMMHeuristicsHandle *gemm_h, CLBackendType cl_backend_type)
{
    _is_initialised = false;

    default_init(cl_tuner, gemm_h, cl_backend_type);
}
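
The relationship between default_init() and default_reinit() shown in the listings can be sketched with stand-in types. This is a simplified model, not library code: LazyInitScheduler and FakeTuner are hypothetical, and a counter stands in for creating the context and command queue. It illustrates that the context is built only on the first call, that the tuner is recorded on every call, and that default_reinit() forces a rebuild by clearing the flag first.

```cpp
#include <cassert>

// Hypothetical stand-in for ICLTuner; only its address is used here.
struct FakeTuner
{
};

// Hypothetical model of the initialise-once / re-initialise pattern.
class LazyInitScheduler
{
public:
    // Builds the (fake) context only on the first call, but always records the tuner.
    void default_init(FakeTuner *tuner = nullptr)
    {
        if(!_is_initialised)
        {
            ++_contexts_created; // stands in for creating the context and command queue
            _is_initialised = true;
        }
        _tuner = tuner;
    }

    // Clears the flag so that default_init() rebuilds the context.
    void default_reinit(FakeTuner *tuner = nullptr)
    {
        _is_initialised = false;
        default_init(tuner);
    }

    int contexts_created() const
    {
        return _contexts_created;
    }

    FakeTuner *tuner() const
    {
        return _tuner;
    }

private:
    bool       _is_initialised{ false };
    int        _contexts_created{ 0 };
    FakeTuner *_tuner{ nullptr };
};
```

In this model a second default_init() call is cheap, but it still replaces the stored tuner, including with nullptr when the argument is omitted.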

◆ enable_job_chaining()

void enable_job_chaining ( int  job_chaining_size)

Enable job chaining.

The command queue will only be flushed when job_chaining_size kernels have been enqueued.

Parameters
[in] job_chaining_size  Kernels to enqueue before flushing

Definition at line 257 of file CLScheduler.cpp.

void CLScheduler::enable_job_chaining(int job_chaining_size)
{
    _job_chaining_enabled = true;
    _job_chaining_size = job_chaining_size;
}
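
The flushing rule described above can be sketched as a small counter model. The flush logic itself is not shown on this page, so the policy below is an assumption based on the descriptions: with chaining enabled, one flush is issued per job_chaining_size enqueued kernels and the flush argument is ignored; otherwise the flush argument decides. ChainingQueueModel and count_flushes are hypothetical names.

```cpp
#include <cassert>

// Hypothetical model of a command queue that only counts flushes.
class ChainingQueueModel
{
public:
    void enable_job_chaining(int job_chaining_size)
    {
        _job_chaining_enabled = true;
        _job_chaining_size    = job_chaining_size;
    }

    // Assumed policy: with chaining on, flush once every job_chaining_size kernels
    // and ignore the flush argument; with chaining off, honour the flush argument.
    void enqueue(bool flush = true)
    {
        if(_job_chaining_enabled)
        {
            if(++_job_chaining_count >= _job_chaining_size)
            {
                _job_chaining_count = 0;
                ++_flushes;
            }
        }
        else if(flush)
        {
            ++_flushes;
        }
    }

    int flushes() const
    {
        return _flushes;
    }

private:
    bool _job_chaining_enabled{ false };
    int  _job_chaining_size{ 0 };
    int  _job_chaining_count{ 0 };
    int  _flushes{ 0 };
};

// Enqueues `kernels` kernels and reports the number of flushes.
// A chain_size of zero or less leaves job chaining disabled.
inline int count_flushes(int kernels, int chain_size, bool flush = true)
{
    ChainingQueueModel q;
    if(chain_size > 0)
    {
        q.enable_job_chaining(chain_size);
    }
    for(int i = 0; i < kernels; ++i)
    {
        q.enqueue(flush);
    }
    return q.flushes();
}
```

In this model, kernels enqueued after the last full chain stay unflushed until a later enqueue completes the chain or the queue is drained by other means.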

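◆ enqueue()

void enqueue ( ICLKernel & kernel,
    bool flush = true
)

Schedule the execution of the passed kernel if possible.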

◆ enqueue_op() [1/2]

void enqueue_op ( ICLKernel & kernel,
    ITensorPack & tensors,
    bool flush = true
)

Schedule the execution of the passed kernel if possible.

Parameters
[in] kernel   Kernel to execute.
[in] tensors  Tensor pack containing the tensors to operate on.
[in] flush    (Optional) Specifies if the command queue will be flushed after running the kernel. This will be ignored if job chaining is enabled.

Definition at line 243 of file CLScheduler.cpp.

Referenced by ClGemm::prepare(), ClGemmLowpMatrixMultiplyCore::prepare(), ClWinogradConv2d::prepare(), ClGemmConv2d::prepare(), ClCompositeOperator::prepare(), CLQLSTMLayer::prepare(), ClDequantize::run(), ClQuantize::run(), ICLOperator::run(), ClScale::run(), ClSoftmax::run(), ClConcatenate::run(), ClDirectConv2d::run(), ClDirectConv3d::run(), ClGemmLowpOutputStage::run(), ClGemm::run(), ClGemmLowpMatrixMultiplyCore::run(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::run(), ClWinogradConv2d::run(), ClGemmConv2d::run(), ClCompositeOperator::run(), CLLSTMLayer::run(), and ClSynthetizeOperatorWithBorder< K >::run().

void CLScheduler::enqueue_op(ICLKernel &kernel, ITensorPack &tensors, bool flush)
{
    enqueue_common(kernel, tensors, flush);
}

◆ enqueue_op() [2/2]

void enqueue_op ( ICLKernel & kernel,
    ITensorPack & tensors,
    const experimental::dynamic_fusion::ClExecutionDescriptor & exec_desc,
    bool flush = true
)

Schedule the execution of the passed kernel if possible.

Parameters
[in] kernel     Kernel to execute.
[in] tensors    Tensor pack containing the tensors to operate on.
[in] exec_desc  Execution descriptor
[in] flush      (Optional) Specifies if the command queue will be flushed after running the kernel. This will be ignored if job chaining is enabled.

Definition at line 250 of file CLScheduler.cpp.

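void CLScheduler::enqueue_op(ICLKernel &kernel, ITensorPack &tensors, const experimental::dynamic_fusion::ClExecutionDescriptor &exec_desc, bool flush)
{
    enqueue_common(kernel, tensors, exec_desc, flush);
}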

◆ enqueue_sync_event()

cl::Event enqueue_sync_event ( )

Enqueues a marker into the associated command queue and returns the event.

Returns
An event that can be waited on to block the executing thread.

Definition at line 79 of file CLScheduler.cpp.

cl::Event CLScheduler::enqueue_sync_event()
{
    cl::Event event;
    _queue.enqueueMarker(&event);
    return event;
}
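
The difference between waiting on a marker event and calling sync() can be sketched with a single-threaded model of an in-order queue. Everything here is hypothetical (DeferredQueue, Marker, wait, finish) and deliberately simplified: commands are deferred and run only when something waits on them. On a real device, commands enqueued after the marker may or may not have run by the time the wait returns. The model only shows the guarantee that everything enqueued before the marker has completed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical in-order queue that defers work until someone waits on it.
class DeferredQueue
{
public:
    // A marker records how many commands were enqueued before it.
    struct Marker
    {
        std::size_t position;
    };

    void enqueue(std::function<void()> command)
    {
        _commands.push_back(std::move(command));
    }

    // Analogue of enqueue_sync_event(): a handle covering all earlier commands.
    Marker enqueue_marker() const
    {
        return Marker{ _commands.size() };
    }

    // Analogue of waiting on the event: runs every command enqueued before the marker.
    void wait(const Marker &marker)
    {
        run_until(marker.position);
    }

    // Analogue of sync(): runs everything enqueued so far.
    void finish()
    {
        run_until(_commands.size());
    }

private:
    // Executes pending commands in order, up to but not including index `end`.
    void run_until(std::size_t end)
    {
        for(; _next < end; ++_next)
        {
            _commands[_next]();
        }
    }

    std::vector<std::function<void()>> _commands{};
    std::size_t                        _next{ 0 };
};
```

A marker lets a caller wait for a specific point in the queue, whereas finish() in this model always drains the whole queue.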

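◆ gemm_heuristics()

CLGEMMHeuristicsHandle * gemm_heuristics ( ) const

Accessor for the associated CLGEMMHeuristicsHandle.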

◆ get()

static CLScheduler & get ( )

Access the scheduler singleton.

This method has been deprecated and will be removed in future releases.

Returns
The scheduler
Examples:
dynamic_fusion/cl_fused_conv2d_elementwise_add.cpp.

Definition at line 107 of file CLScheduler.cpp.

References arm_compute::opencl_is_available().

Referenced by CLTuner::add_tuning_params(), CLBufferAllocator::allocate(), ClDirectConvolutionKernelComponent::allocate_shared_vars(), ClQueue::cl_queue(), CLBufferMemoryRegion::CLBufferMemoryRegion(), ClScale::configure(), ClPool2d::configure(), ClPool3d::configure(), ClSoftmax::configure(), ClDirectConv2d::configure(), CLPriorBoxLayer::configure(), CLRange::configure(), CLFFT1D::configure(), ClGemm::configure(), CLDepthwiseConvolutionLayer::configure(), ClGemmLowpMatrixMultiplyCore::configure(), ClWinogradConv2d::configure(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::configure(), CLCropResize::configure(), ClGemmConv2d::configure(), ClConv2d::configure(), CLConvolutionLayer::configure(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::configure(), arm_compute::test::validation::DATA_TEST_CASE(), ClQueue::finish(), ClDirectConvolutionKernelComponent::generate_build_options(), CLTensorAllocator::import_memory(), CLDeviceBackend::initialize_backend(), main(), CLArray< cl_int >::map(), CLSubTensor::map(), CLTensor::map(), OpenCLClock< output_timestamps >::OpenCLClock(), ClGemm::prepare(), ClGemmLowpMatrixMultiplyCore::prepare(), ClWinogradConv2d::prepare(), ClGemmConv2d::prepare(), CLFFTConvolutionLayer::prepare(), ClCompositeOperator::prepare(), CLQLSTMLayer::prepare(), arm_compute::restore_program_cache_from_file(), ClDequantize::run(), ClQuantize::run(), CLSplit::run(), ICLOperator::run(), ClScale::run(), ClSoftmax::run(), ClConcatenate::run(), ClDirectConv2d::run(), ClDirectConv3d::run(), ClGemmLowpOutputStage::run(), CLSpaceToDepthLayer::run(), CLDeconvolutionLayerUpsample::run(), CLFFT1D::run(), ClGemm::run(), CLStackLayer::run(), ClGemmLowpMatrixMultiplyCore::run(), CLL2NormalizeLayer::run(), CLNormalizationLayer::run(), CLPadLayer::run(), CLArgMinMaxLayer::run(), CLReductionOperation::run(), ClWinogradConv2d::run(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, 
bordersize >::run(), CLDepthwiseConvolutionLayer::run(), CLMaxUnpoolingLayer::run(), ClGemmConv2d::run(), CLBatchToSpaceLayer::run(), CLFuseBatchNormalization::run(), CLCropResize::run(), CLBatchNormalizationLayer::run(), CLSpaceToBatchLayer::run(), CLGEMMDeconvolutionLayer::run(), CLGenerateProposalsLayer::run(), ClCompositeOperator::run(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >::run(), CLLSTMLayer::run(), CLQLSTMLayer::run(), ClSynthetizeOperatorWithBorder< K >::run(), Framework::run(), arm_compute::utils::run_example(), arm_compute::save_program_cache_to_file(), arm_compute::schedule_kernel_on_ctx(), ClQueue::scheduler(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_config_native(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_config_reshaped(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_config_reshaped_only_rhs(), arm_compute::cl_gemm::auto_heuristics::select_mlgo_gemm_kernel(), ClContext::set_cl_ctx(), ClQueue::set_cl_queue(), CLTensorAllocator::set_global_allocator(), CLDeviceBackend::setup_backend_context(), CLDeviceBackend::sync(), arm_compute::test::sync_if_necessary(), arm_compute::test::validation::TEST_CASE(), OpenCLClock< output_timestamps >::test_measurements(), Conv2dContent::translate(), CLArray< cl_int >::unmap(), CLSubTensor::unmap(), CLTensor::unmap(), CLBufferMemoryRegion::unmap(), ClGemm::validate(), ClGemmLowpMatrixMultiplyCore::validate(), CLDepthwiseConvolutionLayer::validate(), ClConv2d::validate(), CLGenerateProposalsLayer::validate(), and CLConvolutionLayer::validate().

CLScheduler &CLScheduler::get()
{
    std::call_once(_initialize_symbols, opencl_is_available);
    static CLScheduler scheduler;
    return scheduler;
}
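
The function-local static used by get() is a common way to build a lazily constructed singleton. The sketch below is a generic illustration with a hypothetical class, SingletonScheduler, and it omits the std::call_once step from the listing above, keeping only the function-local static. The private constructor and the construction counter are assumptions made for the example and do not reflect the library class.

```cpp
#include <cassert>

// Hypothetical singleton; mirrors only the "function-local static" part of get().
class SingletonScheduler
{
public:
    // Copying is disabled so that the single instance cannot be duplicated.
    SingletonScheduler(const SingletonScheduler &) = delete;
    SingletonScheduler &operator=(const SingletonScheduler &) = delete;

    static SingletonScheduler &get()
    {
        // Constructed on the first call only; initialisation is thread-safe since C++11.
        static SingletonScheduler instance;
        return instance;
    }

    // Number of times the constructor has run in this process.
    static int construction_count()
    {
        return _constructions;
    }

private:
    SingletonScheduler()
    {
        ++_constructions;
    }

    static int _constructions;
};

int SingletonScheduler::_constructions = 0;
```

Every call returns a reference to the same object, so state set through one call site is visible from all others.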

◆ init()

void init ( cl::Context context,
    cl::CommandQueue queue,
    const cl::Device & device,
    ICLTuner * cl_tuner = nullptr,
    CLGEMMHeuristicsHandle * gemm_h = nullptr,
    CLBackendType cl_backend_type = CLBackendType::Native
)

Initialises the context and command queue to be used by the scheduler.

Parameters
[in] context          A CL context.
[in] queue            A CL command queue.
[in] device           A CL device.
[in] cl_tuner         (Optional) Pointer to OpenCL tuner (default = nullptr). Note: It is the caller's responsibility to release the memory allocated for the CLTuner.
[in] gemm_h           (Optional) Pointer to CLGEMMHeuristicsHandle (default = nullptr)
[in] cl_backend_type  (Optional) Type of backend to use (default = CLBackendType::Native)

Definition at line 158 of file CLScheduler.cpp.

References ARM_COMPUTE_ERROR_ON_MSG, ITensorPack::empty(), arm_compute::get_target_from_device(), ICLKernel::run(), ICLKernel::run_composite_op(), ICLKernel::run_op(), CLScheduler::set_context(), ICLTuner::tune_kernel_dynamic(), and IKernel::window().

Referenced by CLScheduler::default_init(), and CLScheduler::default_init_with_context().

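void CLScheduler::init(cl::Context context, cl::CommandQueue queue, const cl::Device &device, ICLTuner *cl_tuner, CLGEMMHeuristicsHandle *gemm_h, CLBackendType cl_backend_type)
{
    set_context(std::move(context));
    _queue = std::move(queue);
    _target = get_target_from_device(device);
    _is_initialised = true;
    _cl_tuner = cl_tuner;
    _gemm_heuristics = gemm_h;
    _backend_type = cl_backend_type;
}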

◆ is_initialised()

bool is_initialised ( ) const

Definition at line 94 of file CLScheduler.cpp.

Referenced by arm_compute::restore_program_cache_from_file().

bool CLScheduler::is_initialised() const
{
    return _is_initialised;
}

◆ operator=()

CLScheduler & operator= ( const CLScheduler & ) = delete

Prevent instances of this class from being copied (as this class contains pointers).

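◆ queue()

cl::CommandQueue & queue ( )

Accessor for the associated CL command queue.

Definition at line 43 of file CLScheduler.cpp.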

◆ set_context()

void set_context ( cl::Context  context)

Accessor to set the CL context to be used by the scheduler.

Parameters
[in] context  A CL context.

Definition at line 152 of file CLScheduler.cpp.

References CLKernelLibrary::get(), and CLKernelLibrary::set_context().

Referenced by CLScheduler::init(), Framework::run(), and ClContext::set_cl_ctx().

void CLScheduler::set_context(cl::Context context)
{
    _context = std::move(context);
    CLKernelLibrary::get().set_context(_context);
}

◆ set_queue()

void set_queue ( cl::CommandQueue  queue)

Accessor to set the CL command queue to be used by the scheduler.

Parameters
[in] queue  A CL command queue.

Definition at line 59 of file CLScheduler.cpp.

Referenced by OpenCLClock< output_timestamps >::OpenCLClock(), Framework::run(), and ClQueue::set_cl_queue().

void CLScheduler::set_queue(cl::CommandQueue queue)
{
    _queue = std::move(queue);
}

◆ set_target()

void set_target ( GPUTarget  target)

Accessor to set target GPU to be used by the scheduler.

Parameters
[in] target  The target GPU.

Definition at line 64 of file CLScheduler.cpp.

References CLScheduler::target().

void CLScheduler::set_target(GPUTarget target)
{
    _target = target;
}

◆ set_tuner()

void set_tuner ( ICLTuner * tuner )

Accessor to set the CL tuner to be used by the scheduler.

Parameters
[in] tuner  A CL tuner

Definition at line 69 of file CLScheduler.cpp.

void CLScheduler::set_tuner(ICLTuner *tuner)
{
    _cl_tuner = tuner;
}

◆ sync()

void sync ( )

Blocks until all commands in the associated command queue have finished.

Examples:
dynamic_fusion/cl_fused_conv2d_elementwise_add.cpp.

Definition at line 74 of file CLScheduler.cpp.

Referenced by CLCropResize::configure(), main(), CLCropResize::run(), CLDeviceBackend::sync(), arm_compute::test::sync_if_necessary(), and arm_compute::test::validation::TEST_CASE().

void CLScheduler::sync()
{
    _queue.finish();
}

◆ target()

GPUTarget target ( ) const

Get the target GPU.

Definition at line 49 of file CLScheduler.cpp.

◆ tune_kernel_static()

void tune_kernel_static ( ICLKernel & kernel )

Tunes OpenCL kernel.

Parameters
[in] kernel  Kernel to tune

Definition at line 86 of file CLScheduler.cpp.

References ICLTuner::tune_kernel_static().

Referenced by ClScale::configure(), ClPool2d::configure(), ClPool3d::configure(), ClDirectConv2d::configure(), CLRange::configure(), and ClGemmConv2d::configure().

void CLScheduler::tune_kernel_static(ICLKernel &kernel)
{
    if(_cl_tuner != nullptr)
    {
        _cl_tuner->tune_kernel_static(kernel);
    }
}
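
The optional-tuner pattern in the listing above (a raw pointer that may be null and is owned by the caller) can be sketched independently of the library. All names below (FakeKernel, TunerSketch, CountingTuner, TunedScheduler) are hypothetical stand-ins. The sketch only illustrates that tuning is skipped when no tuner is set and that the scheduler never takes ownership of the tuner.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel; records how often it was tuned.
struct FakeKernel
{
    int tuned_times;
};

// Hypothetical tuner interface with a single pure virtual hook.
class TunerSketch
{
public:
    virtual ~TunerSketch() = default;
    virtual void tune_kernel_static(FakeKernel &kernel) = 0;
};

// Example tuner that just counts invocations on the kernel.
class CountingTuner : public TunerSketch
{
public:
    void tune_kernel_static(FakeKernel &kernel) override
    {
        ++kernel.tuned_times;
    }
};

// Hypothetical scheduler holding a non-owning, possibly null tuner pointer.
class TunedScheduler
{
public:
    void set_tuner(TunerSketch *tuner)
    {
        _tuner = tuner;
    }

    // Forwards to the tuner only when one has been set.
    void tune_kernel_static(FakeKernel &kernel)
    {
        if(_tuner != nullptr)
        {
            _tuner->tune_kernel_static(kernel);
        }
    }

private:
    TunerSketch *_tuner{ nullptr }; // non-owning: the caller keeps the tuner alive
};
```

Because the pointer is non-owning, the tuner object must outlive every call that may use it.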

The documentation for this class was generated from the following files: