Compute Library
 22.08
CLTuner Class Reference

Basic implementation of the OpenCL tuner interface. More...

#include <CLTuner.h>

Public Member Functions

 CLTuner (bool tune_new_kernels=true, CLTuningInfo tuning_info=CLTuningInfo())
 Constructor. More...
 
 ~CLTuner ()=default
 Destructor. More...
 
void set_tune_new_kernels (bool tune_new_kernels)
 Setter for tune_new_kernels option. More...
 
bool tune_new_kernels () const
 Tune kernels that are not in the tuning parameters table. More...
 
void set_tuning_parameters (CLTuningInfo tuning_info)
 Setter for tuning parameters option. More...
 
void set_tuner_mode (CLTunerMode mode)
 Set OpenCL tuner mode. More...
 
void add_tuning_params (const std::string &kernel_id, CLTuningParams optimal_tuning_params)
 Manually add tuning parameters for a kernel. More...
 
void import_tuning_params (const std::unordered_map< std::string, CLTuningParams > &tuning_params_table)
 Import tuning parameters table. More...
 
const std::unordered_map< std::string, CLTuningParams > & tuning_params_table () const
 Give read access to the tuning params table. More...
 
void set_cl_kernel_event (cl_event kernel_event)
 Set the OpenCL kernel event. More...
 
void load_from_file (const std::string &filename)
 Load the tuning parameters table from file. More...
 
bool save_to_file (const std::string &filename) const
 Save the content of the tuning parameters table to file. More...
 
void tune_kernel_static (ICLKernel &kernel) override
 Tune OpenCL kernel statically. More...
 
void tune_kernel_dynamic (ICLKernel &kernel) override
 Tune OpenCL kernel dynamically. More...
 
void tune_kernel_dynamic (ICLKernel &kernel, ITensorPack &tensors) override
 Tune OpenCL kernel dynamically. More...
 
void tune_kernel_dynamic (ICLKernel &kernel, ITensorPack &tensors, const experimental::dynamic_fusion::ClExecutionDescriptor &exec_desc) override
 Tune OpenCL kernel dynamically for dynamic fusion interface. More...
 
bool kernel_event_is_set () const
 Is the kernel_event set? More...
 
- Public Member Functions inherited from ICLTuner
virtual ~ICLTuner ()=default
 Virtual destructor. More...
 

Data Fields

std::function< decltype(clEnqueueNDRangeKernel)> real_clEnqueueNDRangeKernel
 clEnqueueNDRangeKernel symbol More...
 

Detailed Description

Basic implementation of the OpenCL tuner interface.

Examples:
dynamic_fusion/cl_fused_conv2d_elementwise_add.cpp.

Definition at line 40 of file CLTuner.h.

Constructor & Destructor Documentation

◆ CLTuner()

CLTuner ( bool  tune_new_kernels = true,
CLTuningInfo  tuning_info = CLTuningInfo() 
)

Constructor.

Parameters
[in] tune_new_kernels  Whether to find the optimal local workgroup size for kernels that are not present in the table.
[in] tuning_info  (Optional) OpenCL parameters to tune.

Definition at line 41 of file CLTuner.cpp.

References ITensorPack::empty(), ICLKernel::run(), ICLKernel::run_composite_op(), ICLKernel::run_op(), and IKernel::window().

42  : real_clEnqueueNDRangeKernel(nullptr), _tuning_params_table(), _lws_table(), _kernel_event(), _tune_new_kernels(tune_new_kernels), _tuning_info(tuning_info)
43 {
44 }

◆ ~CLTuner()

~CLTuner ( )
default

Destructor.

Member Function Documentation

◆ add_tuning_params()

void add_tuning_params ( const std::string &  kernel_id,
CLTuningParams  optimal_tuning_params 
)

Manually add tuning parameters for a kernel.

Parameters
[in] kernel_id  Unique identifier of the kernel.
[in] optimal_tuning_params  Optimal tuning parameters to use for the given kernel.

Definition at line 177 of file CLTuner.cpp.

References CLSymbols::clEnqueueNDRangeKernel_ptr, clRetainEvent(), arm_compute::mlgo::parser::end(), CLKernelLibrary::get(), CLScheduler::get(), CLSymbols::get(), CLTuningParams::get_lws(), arm_compute::cl_tuner::get_tuning_parameters_list(), CLTuningParams::get_wbsm(), ICLKernel::gws_from_window(), CLKernelLibrary::is_wbsm_supported(), CLTuner::kernel_event_is_set(), CLScheduler::queue(), CLTuner::real_clEnqueueNDRangeKernel, CLTuner::set_cl_kernel_event(), CLTuningParams::set_lws(), and CLTuningParams::set_wbsm().

Referenced by CLTuner::load_from_file(), and CLTuner::tune_kernel_dynamic().

178 {
179  _tuning_params_table.emplace(kernel_id, optimal_tuning_params);
180 }
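
Because the listing above inserts with emplace, adding parameters for a kernel_id that is already in the table should leave the stored entry unchanged, which is the standard behaviour of std::unordered_map::emplace. The block below is a minimal standalone sketch of that behaviour, not library code: Params, Table, add_params and overwrite_params are hypothetical names, and Params only stands in for CLTuningParams.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CLTuningParams: just a local workgroup size triple.
struct Params
{
    unsigned int x;
    unsigned int y;
    unsigned int z;
};

using Table = std::unordered_map<std::string, Params>;

// Mirrors the emplace-based insertion: returns true if the entry was inserted,
// false if kernel_id was already present (the stored value is left untouched).
bool add_params(Table &table, const std::string &kernel_id, Params params)
{
    return table.emplace(kernel_id, params).second;
}

// Alternative that always stores the new value, replacing any existing entry.
void overwrite_params(Table &table, const std::string &kernel_id, Params params)
{
    table[kernel_id] = params;
}
```

With these helpers, a second add_params call for the same key reports false and keeps the first value, whereas overwrite_params replaces it. A caller who wants to replace the whole table at once can use import_tuning_params instead.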

◆ import_tuning_params()

void import_tuning_params ( const std::unordered_map< std::string, CLTuningParams > &  tuning_params_table)

Import tuning parameters table.

Parameters
[in] tuning_params_table  The unordered_map container to import.

Definition at line 304 of file CLTuner.cpp.

References CLTuner::tuning_params_table().

305 {
306  _tuning_params_table.clear();
307  _tuning_params_table = tuning_params_table;
308 }

◆ kernel_event_is_set()

bool kernel_event_is_set ( ) const

Is the kernel_event set?

Returns
true if the kernel_event is set.

Definition at line 88 of file CLTuner.cpp.

Referenced by CLTuner::add_tuning_params().

89 {
90  return _kernel_event() != nullptr;
91 }

◆ load_from_file()

void load_from_file ( const std::string &  filename)

Load the tuning parameters table from file.

It also sets the tuning options read from the file's header line (in the listing below, whether wbsm is tuned).

Parameters
[in] filename  Load the tuning parameters table from this file. (Must exist)

Definition at line 310 of file CLTuner.cpp.

References CLTuner::add_tuning_params(), ARM_COMPUTE_ERROR_VAR, and CLTuningParams::from_string().

Referenced by main(), and CLDeviceBackend::setup_backend_context().

311 {
312  std::ifstream fs;
313  fs.exceptions(std::ifstream::badbit);
314  fs.open(filename, std::ios::in);
315  if(!fs.is_open())
316  {
317  ARM_COMPUTE_ERROR_VAR("Failed to open '%s' (%s [%d])", filename.c_str(), strerror(errno), errno);
318  }
319  std::string line;
320  bool header_line = true;
321  while(!std::getline(fs, line).fail())
322  {
323  if(header_line)
324  {
325  header_line = false;
326  size_t pos_lws = line.find("lws");
327  size_t pos_wbsm = line.find("wbsm");
328  _tuning_info.tune_wbsm = false;
329  if(pos_lws != std::string::npos || pos_wbsm != std::string::npos)
330  {
331  // The file has in the first line the parameters it has been tuned on
332  if(pos_wbsm != std::string::npos)
333  {
334  _tuning_info.tune_wbsm = true;
335  }
336  // Once the line with the tuning parameter is read we can
337  // read the next one to start collecting the values
338  if(std::getline(fs, line).fail())
339  {
340  break;
341  }
342  }
343  }
344 
345  CLTuningParams tuning_params;
346  size_t pos = line.find(";");
347  if(pos == std::string::npos)
348  {
349  ARM_COMPUTE_ERROR_VAR("Malformed row '%s' in %s", line.c_str(), filename.c_str());
350  }
351  std::string kernel_id = line.substr(0, pos);
352  line.erase(0, pos + 1);
353  if(!tuning_params.from_string(_tuning_info, line))
354  {
355  ARM_COMPUTE_ERROR_VAR("Malformed row '%s' in %s", line.c_str(), filename.c_str());
356  }
357  add_tuning_params(kernel_id, tuning_params);
358  }
359  fs.close();
360 }
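
The listing above implies a simple text layout: an optional first line naming the tuned parameters ("lws", optionally followed by "wbsm"), then one row per kernel with the kernel ID before the first ';' and the parameter string after it. The block below is a standalone sketch of that header detection and row splitting only. ParsedFile and parse_tuning_text are hypothetical names, the sketch reads from a string rather than a file, and it keeps the parameter string raw because the format accepted by CLTuningParams::from_string() is not shown here.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: the wbsm flag from the header plus (kernel_id, raw parameters) rows.
struct ParsedFile
{
    bool tune_wbsm = false;
    std::vector<std::pair<std::string, std::string>> rows;
};

// Splits tuning-file text the same way the listing does, without interpreting the parameters.
ParsedFile parse_tuning_text(const std::string &text)
{
    ParsedFile         result;
    std::istringstream in(text);
    std::string        line;
    bool               header_line = true;
    while(std::getline(in, line))
    {
        if(header_line)
        {
            header_line         = false;
            const bool has_lws  = line.find("lws") != std::string::npos;
            const bool has_wbsm = line.find("wbsm") != std::string::npos;
            if(has_lws || has_wbsm)
            {
                // The first line names the tuned parameters; data starts on the next line.
                result.tune_wbsm = has_wbsm;
                continue;
            }
        }
        const std::size_t pos = line.find(';');
        if(pos == std::string::npos)
        {
            throw std::runtime_error("Malformed row '" + line + "'");
        }
        result.rows.emplace_back(line.substr(0, pos), line.substr(pos + 1));
    }
    return result;
}
```

As in the listing, a first line that contains neither "lws" nor "wbsm" is treated as a data row, so files without a header are still accepted and wbsm tuning stays disabled for them.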

◆ save_to_file()

bool save_to_file ( const std::string &  filename) const

Save the content of the tuning parameters table to file.

Parameters
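[in] filename  Save the tuning parameters table to this file. (Content will be overwritten)
Returns
true if the file was written; false if tuning of new kernels is disabled, the table is empty, or filename is empty, in which case nothing is written.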

Definition at line 362 of file CLTuner.cpp.

References CLTuningParams::to_string().

Referenced by main(), and CLDeviceBackend::~CLDeviceBackend().

363 {
364  if(!_tune_new_kernels || _tuning_params_table.empty() || filename.empty())
365  {
366  return false;
367  }
368  std::ofstream fs;
369  fs.exceptions(std::ifstream::failbit | std::ifstream::badbit);
370  fs.open(filename, std::ios::out);
371  std::string header_string = "";
372  header_string += "lws";
373  if(_tuning_info.tune_wbsm)
374  {
375  if(!header_string.empty())
376  {
377  header_string += " ";
378  }
379  header_string += "wbsm";
380  }
381  fs << header_string << std::endl;
382  for(auto const &kernel_data : _tuning_params_table)
383  {
384  CLTuningParams tun_pams(kernel_data.second);
385  fs << kernel_data.first << tun_pams.to_string(_tuning_info) << std::endl;
386  }
387  fs.close();
388  return true;
389 }
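
Two details of the listing above are easy to miss: nothing is written unless tuning of new kernels is enabled, the table is non-empty and a filename is given, and the header line is "lws" or "lws wbsm" depending on tune_wbsm. The block below restates both rules as standalone helpers. It is a sketch with hypothetical names (should_write, make_header), not part of the library.

```cpp
#include <cstddef>
#include <string>

// Mirrors the early-return guard: all three conditions must hold before a file is written.
bool should_write(bool tune_new_kernels, std::size_t table_size, const std::string &filename)
{
    return tune_new_kernels && table_size != 0 && !filename.empty();
}

// Builds the header line: always "lws", with " wbsm" appended when wbsm tuning is enabled.
std::string make_header(bool tune_wbsm)
{
    std::string header = "lws";
    if(tune_wbsm)
    {
        header += " wbsm";
    }
    return header;
}
```

One consequence of the guard is that a tuner created with tune_new_kernels set to false returns false from save_to_file even when its table was populated through load_from_file or import_tuning_params.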

◆ set_cl_kernel_event()

void set_cl_kernel_event ( cl_event  kernel_event)

Set the OpenCL kernel event.

Note
The interceptor can use this function to store the event associated with the OpenCL kernel.
Parameters
[in] kernel_event  The OpenCL kernel event.

Definition at line 92 of file CLTuner.cpp.

Referenced by CLTuner::add_tuning_params().

93 {
94  _kernel_event = kernel_event;
95 }

◆ set_tune_new_kernels()

void set_tune_new_kernels ( bool  tune_new_kernels)

Setter for tune_new_kernels option.

Parameters
[in] tune_new_kernels  Whether to find the optimal local workgroup size for kernels that are not present in the table.

Definition at line 97 of file CLTuner.cpp.

References CLTuner::tune_new_kernels().

Referenced by main(), and CLDeviceBackend::set_kernel_tuning().

98 {
99  _tune_new_kernels = tune_new_kernels;
100 }

◆ set_tuner_mode()

void set_tuner_mode ( CLTunerMode  mode)

Set OpenCL tuner mode.

Parameters
[in] mode  Indicates how exhaustive the search for the optimal tuning parameters should be while tuning. Default is Exhaustive mode.
Examples:
dynamic_fusion/cl_fused_conv2d_elementwise_add.cpp.

Definition at line 106 of file CLTuner.cpp.

References clang_tidy_rules::mode.

Referenced by main(), and CLDeviceBackend::set_kernel_tuning_mode().

107 {
108  _tuning_info.tuner_mode = mode;
109 }

◆ set_tuning_parameters()

void set_tuning_parameters ( CLTuningInfo  tuning_info)

Setter for tuning parameters option.

Parameters
[in] tuning_info  OpenCL parameters to tune.

◆ tune_kernel_dynamic() [1/3]

void tune_kernel_dynamic ( ICLKernel &  kernel)
override virtual

Tune OpenCL kernel dynamically.

Note
Tuning requires memory to be available on all kernel tensors and objects in order to be performed.
Parameters
[in] kernel  Kernel to tune.

Implements ICLTuner.

Definition at line 116 of file CLTuner.cpp.

References CLTuner::add_tuning_params(), ICLKernel::config_id(), CLKernelLibrary::get(), CLTuningParams::get_lws(), CLKernelLibrary::get_num_compute_units(), ICLKernel::get_target(), CLTuningParams::get_wbsm(), arm_compute::test::validation::pack, ICLKernel::set_lws_hint(), ICLKernel::set_wbsm_hint(), arm_compute::string_from_target(), and arm_compute::support::cpp11::to_string().

117 {
118  ITensorPack pack;
119  tune_kernel_dynamic(kernel, pack);
120 }

◆ tune_kernel_dynamic() [2/3]

void tune_kernel_dynamic ( ICLKernel &  kernel,
ITensorPack &  tensors 
)
override virtual

Tune OpenCL kernel dynamically.

Parameters
[in] kernel  Kernel to tune.
[in,out] tensors  Tensors for the kernel to use.

Implements ICLTuner.

Definition at line 161 of file CLTuner.cpp.

162 {
163  DefaultKernelData data{ tensors };
164 
165  do_tune_kernel_dynamic(kernel, &data);
166 }

◆ tune_kernel_dynamic() [3/3]

void tune_kernel_dynamic ( ICLKernel &  kernel,
ITensorPack &  tensors,
const experimental::dynamic_fusion::ClExecutionDescriptor &  exec_desc 
)
override virtual

Tune OpenCL kernel dynamically for dynamic fusion interface.

Parameters
[in] kernel  Kernel to tune.
[in,out] tensors  Tensors for the kernel to use.
[in] exec_desc  Execution descriptor.

Implements ICLTuner.

Definition at line 169 of file CLTuner.cpp.

170 {
171  CompositeKernelData data{ tensors, exec_desc };
172 
173  do_tune_kernel_dynamic(kernel, &data);
174 }

◆ tune_kernel_static()

void tune_kernel_static ( ICLKernel &  kernel)
override virtual

Tune OpenCL kernel statically.

Note
Tuning is performed using only kernel and tensor metadata, so it can be performed when memory is not available.
Parameters
[in] kernel  Kernel to tune.

Implements ICLTuner.

Definition at line 111 of file CLTuner.cpp.

References ARM_COMPUTE_UNUSED.

112 {
113  ARM_COMPUTE_UNUSED(kernel);
114 }

◆ tune_new_kernels()

bool tune_new_kernels ( ) const

Tune kernels that are not in the tuning parameters table.

Returns
True if tuning of new kernels is enabled.

Definition at line 101 of file CLTuner.cpp.

Referenced by CLTuner::set_tune_new_kernels().

102 {
103  return _tune_new_kernels;
104 }

◆ tuning_params_table()

const std::unordered_map< std::string, CLTuningParams > & tuning_params_table ( ) const

Give read access to the tuning params table.

Returns
The tuning params table as unordered_map container

Definition at line 299 of file CLTuner.cpp.

Referenced by CLTuner::import_tuning_params().

300 {
301  return _tuning_params_table;
302 }

Field Documentation

◆ real_clEnqueueNDRangeKernel

std::function<decltype(clEnqueueNDRangeKernel)> real_clEnqueueNDRangeKernel

clEnqueueNDRangeKernel symbol

Definition at line 106 of file CLTuner.h.

Referenced by CLTuner::add_tuning_params().


The documentation for this class was generated from the following files: