Compute Library
 21.05
CLTuner Class Reference

Basic implementation of the OpenCL tuner interface.

#include <CLTuner.h>


Public Member Functions

 CLTuner (bool tune_new_kernels=true, CLTuningInfo tuning_info=CLTuningInfo())
 Constructor.
 
 ~CLTuner ()=default
 Destructor.
 
void set_tune_new_kernels (bool tune_new_kernels)
 Setter for tune_new_kernels option.
 
bool tune_new_kernels () const
 Tune kernels that are not in the tuning parameters table.
 
void set_tuning_parameters (CLTuningInfo tuning_info)
 Setter for tune parameters option.
 
void set_tuner_mode (CLTunerMode mode)
 Set OpenCL tuner mode.
 
CLTunerMode get_tuner_mode () const
 Get the current OpenCL tuner mode.
 
void add_lws_to_table (const std::string &kernel_id, cl::NDRange optimal_lws)
 Manually add a LWS for a kernel.
 
void add_tuning_params (const std::string &kernel_id, CLTuningParams optimal_tuning_params)
 Manually add tuning parameters for a kernel.
 
void import_lws_table (const std::unordered_map< std::string, cl::NDRange > &lws_table)
 Import LWS table.
 
void import_tuning_params (const std::unordered_map< std::string, CLTuningParams > &tuning_params_table)
 Import tuning parameters table.
 
const std::unordered_map< std::string, cl::NDRange > & lws_table ()
 Give read access to the LWS table.
 
const std::unordered_map< std::string, CLTuningParams > & tuning_params_table () const
 Give read access to the tuning params table.
 
void set_cl_kernel_event (cl_event kernel_event)
 Set the OpenCL kernel event.
 
void load_from_file (const std::string &filename)
 Load the tuning parameters table from file.
 
bool save_to_file (const std::string &filename) const
 Save the content of the tuning parameters table to file.
 
void tune_kernel_static (ICLKernel &kernel) override
 Tune OpenCL kernel statically.
 
void tune_kernel_dynamic (ICLKernel &kernel) override
 Tune OpenCL kernel dynamically.
 
void tune_kernel_dynamic (ICLKernel &kernel, ITensorPack &tensors) override
 Tune OpenCL kernel dynamically.
 
bool kernel_event_is_set () const
 Is the kernel_event set?
 
Public Member Functions inherited from ICLTuner
virtual ~ICLTuner ()=default
 Virtual destructor.
 

Data Fields

std::function< decltype(clEnqueueNDRangeKernel)> real_clEnqueueNDRangeKernel
 clEnqueueNDRangeKernel symbol
 

Detailed Description

Basic implementation of the OpenCL tuner interface.

Definition at line 40 of file CLTuner.h.

Constructor & Destructor Documentation

◆ CLTuner()

CLTuner ( bool  tune_new_kernels = true,
CLTuningInfo  tuning_info = CLTuningInfo() 
)

Constructor.

Parameters
[in] tune_new_kernels  Whether to find the optimal local workgroup size for kernels that are not present in the table
[in] tuning_info  (Optional) OpenCL parameters to tune

Definition at line 38 of file CLTuner.cpp.

39  : real_clEnqueueNDRangeKernel(nullptr), _tuning_params_table(), _lws_table(), _kernel_event(), _tune_new_kernels(tune_new_kernels), _tuning_info(tuning_info)
40 {
41 }

◆ ~CLTuner()

~CLTuner ( )
default

Destructor.

Member Function Documentation

◆ add_lws_to_table()

void add_lws_to_table ( const std::string &  kernel_id,
cl::NDRange  optimal_lws 
)

Manually add a LWS for a kernel.

Parameters
[in] kernel_id  Unique identifier of the kernel
[in] optimal_lws  Optimal local workgroup size to use for the given kernel
Deprecated:
This function is deprecated and is intended to be removed in the 21.08 release

Definition at line 122 of file CLTuner.cpp.

123 {
124  add_tuning_params(kernel_id, CLTuningParams(optimal_lws));
125 }

References CLTuner::add_tuning_params().

◆ add_tuning_params()

void add_tuning_params ( const std::string &  kernel_id,
CLTuningParams  optimal_tuning_params 
)

Manually add tuning parameters for a kernel.

Parameters
[in] kernel_id  Unique identifier of the kernel
[in] optimal_tuning_params  Optimal tuning parameters to use for the given kernel

Definition at line 127 of file CLTuner.cpp.

128 {
129  _tuning_params_table.emplace(kernel_id, optimal_tuning_params);
130 }

Referenced by CLTuner::add_lws_to_table(), CLTuner::import_lws_table(), CLTuner::load_from_file(), and CLTuner::tune_kernel_dynamic().
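
Because the listing above inserts with emplace(), an entry that already exists for a kernel_id is left untouched rather than overwritten (this is standard std::unordered_map behaviour). A minimal standalone sketch of the difference, using a hypothetical Params struct and free functions add_params/set_params in place of CLTuningParams and the member function:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CLTuningParams; it only carries one LWS value.
struct Params
{
    std::size_t lws_x;
};

using ParamsTable = std::unordered_map<std::string, Params>;

// Mirrors the listing: emplace() inserts only if the key is absent.
// Returns true when a new entry was actually inserted.
bool add_params(ParamsTable &table, const std::string &kernel_id, Params params)
{
    assert(!kernel_id.empty());
    return table.emplace(kernel_id, params).second;
}

// Overwriting variant (not part of the documented class), for callers that
// want "last write wins" semantics instead.
void set_params(ParamsTable &table, const std::string &kernel_id, Params params)
{
    table[kernel_id] = params;
}
```

With this sketch, calling add_params twice for the same id keeps the first value, while set_params replaces it.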

◆ get_tuner_mode()

CLTunerMode get_tuner_mode ( ) const

Get the current OpenCL tuner mode.

Returns
tuner_mode Indicates how exhaustive the search for the optimal tuning parameters should be while tuning
Deprecated:
This function is deprecated and is intended to be removed in the 21.08 release

Definition at line 66 of file CLTuner.cpp.

67 {
68  return _tuning_info.tuner_mode;
69 }
CLTunerMode tuner_mode
Parameter to select the level (granularity) of the tuning.
Definition: CLTunerTypes.h:45

References CLTuningInfo::tuner_mode.

◆ import_lws_table()

void import_lws_table ( const std::unordered_map< std::string, cl::NDRange > &  lws_table)

Import LWS table.

Parameters
[in] lws_table  The unordered_map container to import
Deprecated:
This function is deprecated and is intended to be removed in the 21.08 release

Definition at line 250 of file CLTuner.cpp.

251 {
252  _tuning_params_table.clear();
253  for(auto && params : lws_table)
254  {
255  add_tuning_params(params.first, CLTuningParams(params.second));
256  }
257 }

References CLTuner::add_tuning_params(), and CLTuner::lws_table().

◆ import_tuning_params()

void import_tuning_params ( const std::unordered_map< std::string, CLTuningParams > &  tuning_params_table)

Import tuning parameters table.

Parameters
[in] tuning_params_table  The unordered_map container to import

Definition at line 274 of file CLTuner.cpp.

275 {
276  _tuning_params_table.clear();
277  _tuning_params_table = tuning_params_table;
278 }

References CLTuner::tuning_params_table().

◆ kernel_event_is_set()

bool kernel_event_is_set ( ) const

Is the kernel_event set?

Returns
true if the kernel_event is set.

Definition at line 43 of file CLTuner.cpp.

44 {
45  return _kernel_event() != nullptr;
46 }

◆ load_from_file()

void load_from_file ( const std::string &  filename)

Load the tuning parameters table from file.

It also configures the tuner from the header line of the file: WBSM tuning is enabled only if the header names wbsm.

Parameters
[in] filename  File to load the tuning parameters table from. The file must exist.

Definition at line 280 of file CLTuner.cpp.

281 {
282  std::ifstream fs;
283  fs.exceptions(std::ifstream::badbit);
284  fs.open(filename, std::ios::in);
285  if(!fs.is_open())
286  {
287  ARM_COMPUTE_ERROR_VAR("Failed to open '%s' (%s [%d])", filename.c_str(), strerror(errno), errno);
288  }
289  std::string line;
290  bool header_line = true;
291  while(!std::getline(fs, line).fail())
292  {
293  if(header_line)
294  {
295  header_line = false;
296  size_t pos_lws = line.find("lws");
297  size_t pos_wbsm = line.find("wbsm");
298  _tuning_info.tune_wbsm = false;
299  if(pos_lws != std::string::npos || pos_wbsm != std::string::npos)
300  {
301  // The file has in the first line the parameters it has been tuned on
302  if(pos_wbsm != std::string::npos)
303  {
304  _tuning_info.tune_wbsm = true;
305  }
306  // Once the line with the tuning parameter is read we can
307  // read the next one to start collecting the values
308  if(std::getline(fs, line).fail())
309  {
310  break;
311  }
312  }
313  }
314 
315  CLTuningParams tuning_params;
316  size_t pos = line.find(";");
317  if(pos == std::string::npos)
318  {
319  ARM_COMPUTE_ERROR_VAR("Malformed row '%s' in %s", line.c_str(), filename.c_str());
320  }
321  std::string kernel_id = line.substr(0, pos);
322  line.erase(0, pos + 1);
323  if(!tuning_params.from_string(_tuning_info, line))
324  {
325  ARM_COMPUTE_ERROR_VAR("Malformed row '%s' in %s", line.c_str(), filename.c_str());
326  }
327  add_tuning_params(kernel_id, tuning_params);
328  }
329  fs.close();
330 }
bool tune_wbsm
Flag to tune the batches of work groups distributed to compute units.
Definition: CLTunerTypes.h:46
#define ARM_COMPUTE_ERROR_VAR(msg,...)
Print the given message then throw an std::runtime_error.
Definition: Error.h:346

References CLTuner::add_tuning_params(), ARM_COMPUTE_ERROR_VAR, CLTuningParams::from_string(), and CLTuningInfo::tune_wbsm.

Referenced by main(), and CLDeviceBackend::setup_backend_context().
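
The file layout implied by the listing above is an optional header line naming the tuned parameters, followed by one "kernel_id;values" row per line. The following standalone sketch parses that layout from any std::istream. ParsedTuningFile and parse_tuning_stream are hypothetical names, and the value text after the first ';' is kept as an opaque string because its exact format is decided by CLTuningParams::from_string(), which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: what a loader learns from a tuning file.
struct ParsedTuningFile
{
    bool tune_wbsm = false;                                // header mentioned "wbsm"
    std::vector<std::pair<std::string, std::string>> rows; // (kernel_id, raw value text)
};

// Reads an optional header line, then "kernel_id;values" rows.
ParsedTuningFile parse_tuning_stream(std::istream &in)
{
    ParsedTuningFile out;
    std::string      line;
    bool             header_line = true;
    while(std::getline(in, line))
    {
        if(header_line)
        {
            header_line = false;
            const bool has_lws  = line.find("lws") != std::string::npos;
            const bool has_wbsm = line.find("wbsm") != std::string::npos;
            out.tune_wbsm       = has_wbsm;
            if(has_lws || has_wbsm)
            {
                continue; // header consumed; data starts on the next line
            }
            // No header keywords: fall through and treat this line as a data row
        }
        const std::size_t pos = line.find(';');
        if(pos == std::string::npos)
        {
            throw std::runtime_error("Malformed row '" + line + "'");
        }
        assert(pos < line.size());
        out.rows.emplace_back(line.substr(0, pos), line.substr(pos + 1));
    }
    return out;
}
```

As in the listing, only the first line is checked for the header keywords, so a headerless file is read with WBSM tuning disabled.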

◆ lws_table()

const std::unordered_map< std::string, cl::NDRange > & lws_table ( )

Give read access to the LWS table.

Returns
The lws table as unordered_map container
Deprecated:
This function is deprecated and is intended to be removed in the 21.08 release

Definition at line 259 of file CLTuner.cpp.

260 {
261  _lws_table.clear();
262  for(auto && params : _tuning_params_table)
263  {
264  _lws_table.emplace(params.first, params.second.get_lws());
265  }
266  return _lws_table;
267 }

Referenced by CLTuner::import_lws_table().
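
The listing above rebuilds the LWS-only table from the tuning parameters table on every call, which is why lws_table() is not a const member. A standalone sketch of the same pattern, with hypothetical names (Params, TableSketch, lws_view) standing in for the library types:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CLTuningParams: an LWS value plus one extra field.
struct Params
{
    unsigned lws_x;
    unsigned wbsm;
};

class TableSketch
{
public:
    void add(const std::string &id, Params p)
    {
        _params.emplace(id, p);
    }

    // Rebuilds the LWS-only view on every call. The method cannot be const
    // because it writes to the _lws cache member.
    const std::unordered_map<std::string, unsigned> &lws_view()
    {
        _lws.clear();
        for(const auto &entry : _params)
        {
            _lws.emplace(entry.first, entry.second.lws_x);
        }
        assert(_lws.size() == _params.size());
        return _lws;
    }

private:
    std::unordered_map<std::string, Params>   _params;
    std::unordered_map<std::string, unsigned> _lws;
};
```

One consequence of this design is that a reference obtained earlier does not pick up later additions until the accessor is called again.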

◆ save_to_file()

bool save_to_file ( const std::string &  filename) const

Save the content of the tuning parameters table to file.

Parameters
[in] filename  File to save the tuning parameters table to. Existing content is overwritten.
Returns
true if the file was written; false if tuning of new kernels is disabled, the table is empty, or filename is empty

Definition at line 332 of file CLTuner.cpp.

333 {
334  if(!_tune_new_kernels || _tuning_params_table.empty() || filename.empty())
335  {
336  return false;
337  }
338  std::ofstream fs;
339  fs.exceptions(std::ifstream::failbit | std::ifstream::badbit);
340  fs.open(filename, std::ios::out);
341  std::string header_string = "";
342  header_string += "lws";
343  if(_tuning_info.tune_wbsm)
344  {
345  if(!header_string.empty())
346  {
347  header_string += " ";
348  }
349  header_string += "wbsm";
350  }
351  fs << header_string << std::endl;
352  for(auto const &kernel_data : _tuning_params_table)
353  {
354  CLTuningParams tun_pams(kernel_data.second);
355  fs << kernel_data.first << tun_pams.to_string(_tuning_info) << std::endl;
356  }
357  fs.close();
358  return true;
359 }

References CLTuningParams::to_string(), and CLTuningInfo::tune_wbsm.

Referenced by main(), and CLDeviceBackend::~CLDeviceBackend().
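
The writer in the listing above always emits "lws" in the header and appends " wbsm" only when WBSM tuning is enabled. The sketch below reproduces that header logic against a generic std::ostream. The names make_header and write_tuning_stream are hypothetical, and the sketch assumes each serialized value string already begins with ';', since the listing writes no separator of its own between the kernel id and the values. A std::map is used only to make the row order deterministic; the class itself stores rows in an unordered_map, so its row order is unspecified.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Builds the first line of the file: always "lws", plus " wbsm" when WBSM
// tuning is enabled.
std::string make_header(bool tune_wbsm)
{
    std::string header = "lws";
    if(tune_wbsm)
    {
        header += " wbsm";
    }
    return header;
}

// Writes the header followed by one row per kernel. `serialized` maps a
// kernel id to an already-serialized value string (ASSUMED to start with ';').
void write_tuning_stream(std::ostream &out, bool tune_wbsm, const std::map<std::string, std::string> &serialized)
{
    out << make_header(tune_wbsm) << '\n';
    for(const auto &row : serialized)
    {
        assert(row.second.empty() || row.second.front() == ';');
        out << row.first << row.second << '\n';
    }
}
```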

◆ set_cl_kernel_event()

void set_cl_kernel_event ( cl_event  kernel_event)

Set the OpenCL kernel event.

Note
The interceptor can use this function to store the event associated to the OpenCL kernel
Parameters
[in] kernel_event  The OpenCL kernel event

Definition at line 47 of file CLTuner.cpp.

48 {
49  _kernel_event = kernel_event;
50 }
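
The note above says an interceptor can use this setter to store the event of an enqueued kernel. How the library wires this up is not shown on this page, so the following is only a sketch of the general idea under stated assumptions: Event, EnqueueFn, EventSink and make_interceptor are all hypothetical stand-ins (for cl_event, the real_clEnqueueNDRangeKernel field, the tuner, and the hook respectively), and no OpenCL call is made.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for cl_event and the enqueue entry point.
using Event     = int;
using EnqueueFn = std::function<int(Event *out_event)>;

// Plays the role of the object that remembers the last kernel event.
struct EventSink
{
    Event stored = 0;
    bool  is_set = false;

    void set_event(Event e)
    {
        stored = e;
        is_set = true;
    }
};

// Wraps `real` so that every call also records the produced event in `sink`.
// The caller's out_event pointer may be null, so a local event is used.
// `sink` must outlive the returned function object.
EnqueueFn make_interceptor(EnqueueFn real, EventSink &sink)
{
    assert(real != nullptr);
    return [real, &sink](Event *out_event)
    {
        Event     local  = 0;
        const int status = real(&local);
        sink.set_event(local);
        if(out_event != nullptr)
        {
            *out_event = local;
        }
        return status;
    };
}
```

Keeping the original entry point in a std::function lets the wrapper forward to it unchanged while still capturing the event needed for timing.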

◆ set_tune_new_kernels()

void set_tune_new_kernels ( bool  tune_new_kernels)

Setter for tune_new_kernels option.

Parameters
[in] tune_new_kernels  Whether to find the optimal local workgroup size for kernels that are not present in the table

Definition at line 52 of file CLTuner.cpp.

53 {
54  _tune_new_kernels = tune_new_kernels;
55 }

References CLTuner::tune_new_kernels().

Referenced by main(), and CLDeviceBackend::set_kernel_tuning().

◆ set_tuner_mode()

void set_tuner_mode ( CLTunerMode  mode)

Set OpenCL tuner mode.

Parameters
[in] mode  Indicates how exhaustive the search for the optimal tuning parameters should be while tuning. Default is Exhaustive mode

Definition at line 61 of file CLTuner.cpp.

62 {
63  _tuning_info.tuner_mode = mode;
64 }

References CLTuningInfo::tuner_mode.

Referenced by main(), and CLDeviceBackend::set_kernel_tuning_mode().

◆ set_tuning_parameters()

void set_tuning_parameters ( CLTuningInfo  tuning_info)

Setter for tune parameters option.

Parameters
[in] tuning_info  OpenCL parameters to tune

◆ tune_kernel_dynamic() [1/2]

void tune_kernel_dynamic ( ICLKernel &  kernel)
override virtual

Tune OpenCL kernel dynamically.

Note
Tuning requires memory to be available on all kernel tensors and objects in order to be performed
Parameters
[in] kernel  Kernel to tune

Implements ICLTuner.

Definition at line 76 of file CLTuner.cpp.

77 {
78  ITensorPack pack;
79  tune_kernel_dynamic(kernel, pack);
80 }

◆ tune_kernel_dynamic() [2/2]

void tune_kernel_dynamic ( ICLKernel &  kernel,
ITensorPack &  tensors 
)
override virtual

Tune OpenCL kernel dynamically.

Parameters
[in] kernel  Kernel to tune
[in,out] tensors  Tensors for the kernel to use

Implements ICLTuner.

Definition at line 82 of file CLTuner.cpp.

83 {
84  // Get the configuration ID from the kernel and append GPU target name and number of available compute units
85  const std::string config_id = kernel.config_id() + "_" + string_from_target(kernel.get_target()) + "_MP" + support::cpp11::to_string(CLKernelLibrary::get().get_num_compute_units());
86 
87  // Check if we need to find the Optimal LWS. If the kernel's config_id is equal to default_config_id, the kernel does not require to be tuned
88  if(kernel.config_id() != arm_compute::default_config_id)
89  {
90  auto p = _tuning_params_table.find(config_id);
91 
92  if(p == _tuning_params_table.end())
93  {
94  if(_tune_new_kernels)
95  {
96  // Find the optimal LWS for the kernel
97  CLTuningParams opt_tuning_params = find_optimal_tuning_params(kernel, tensors);
98 
99  // Insert the optimal LWS in the table
100  add_tuning_params(config_id, opt_tuning_params);
101 
102  // Set Local-Workgroup-Size
103  kernel.set_lws_hint(opt_tuning_params.get_lws());
104  if(_tuning_info.tune_wbsm)
105  {
106  kernel.set_wbsm_hint(opt_tuning_params.get_wbsm());
107  }
108  }
109  }
110  else
111  {
112  // Set Local-Workgroup-Size
113  kernel.set_lws_hint(p->second.get_lws());
114  if(_tuning_info.tune_wbsm)
115  {
116  kernel.set_wbsm_hint(p->second.get_wbsm());
117  }
118  }
119  }
120 }
std::string to_string(T &&value)
Convert integer and float values to string.
static CLKernelLibrary & get()
Access the KernelLibrary singleton.
const std::string & string_from_target(GPUTarget target)
Translates a given gpu device target to string.
Definition: GPUTarget.cpp:115
cl_uint get_num_compute_units()
Return the maximum number of compute units in the device.

References CLTuner::add_tuning_params(), ICLKernel::config_id(), CLKernelLibrary::get(), CLTuningParams::get_lws(), CLKernelLibrary::get_num_compute_units(), ICLKernel::get_target(), CLTuningParams::get_wbsm(), ICLKernel::set_lws_hint(), ICLKernel::set_wbsm_hint(), arm_compute::string_from_target(), arm_compute::support::cpp11::to_string(), and CLTuningInfo::tune_wbsm.
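
The control flow in the listing above is a lookup-or-tune cache: build a key from the kernel configuration, the GPU target and the compute unit count, reuse stored parameters on a hit, and run the search only on a miss when tuning of new kernels is enabled. The sketch below models just that flow. TunerSketch, Params, make_key and lookup_or_tune are hypothetical names, the search is passed in as a callable instead of running anything on a device, and the default_config_id check and the WBSM hint are omitted for brevity.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CLTuningParams.
struct Params
{
    unsigned lws_x = 0;
};

struct TunerSketch
{
    std::unordered_map<std::string, Params> table;
    bool                                    tune_new_kernels = true;
    int                                     searches_run     = 0;

    // Builds a key in the same shape as the listing: "<config>_<target>_MP<units>".
    static std::string make_key(const std::string &config_id, const std::string &target, unsigned compute_units)
    {
        return config_id + "_" + target + "_MP" + std::to_string(compute_units);
    }

    // Returns true and fills `out` when parameters are available, either cached
    // or freshly produced by `search` (standing in for the expensive search).
    bool lookup_or_tune(const std::string &key, const std::function<Params()> &search, Params &out)
    {
        assert(!key.empty());
        const auto it = table.find(key);
        if(it != table.end())
        {
            out = it->second; // cache hit: reuse stored parameters
            return true;
        }
        if(!tune_new_kernels)
        {
            return false; // unknown kernel and tuning disabled: keep the kernel's defaults
        }
        ++searches_run;
        out = search();
        table.emplace(key, out);
        return true;
    }
};
```

Including the target name and compute unit count in the key means parameters tuned on one device configuration are not reused on a different one.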

◆ tune_kernel_static()

void tune_kernel_static ( ICLKernel &  kernel)
override virtual

Tune OpenCL kernel statically.

Note
Tuning is performed using only kernel and tensor metadata, thus can be performed when memory is not available
Parameters
[in] kernel  Kernel to tune

Implements ICLTuner.

Definition at line 71 of file CLTuner.cpp.

72 {
73  ARM_COMPUTE_UNUSED(kernel);
74 }
#define ARM_COMPUTE_UNUSED(...)
To avoid unused variables warnings.
Definition: Error.h:152

References ARM_COMPUTE_UNUSED.

◆ tune_new_kernels()

bool tune_new_kernels ( ) const

Tune kernels that are not in the tuning parameters table.

Returns
True if tuning of new kernels is enabled.

Definition at line 56 of file CLTuner.cpp.

57 {
58  return _tune_new_kernels;
59 }

Referenced by CLTuner::set_tune_new_kernels().

◆ tuning_params_table()

const std::unordered_map< std::string, CLTuningParams > & tuning_params_table ( ) const

Give read access to the tuning params table.

Returns
The tuning params table as unordered_map container

Definition at line 269 of file CLTuner.cpp.

270 {
271  return _tuning_params_table;
272 }

Referenced by CLTuner::import_tuning_params().

Field Documentation

◆ real_clEnqueueNDRangeKernel

std::function<decltype(clEnqueueNDRangeKernel)> real_clEnqueueNDRangeKernel

clEnqueueNDRangeKernel symbol

Definition at line 142 of file CLTuner.h.


The documentation for this class was generated from the following files: