Compute Library
 21.05
OpenCLClock< output_timestamps > Class Template Reference

Instrument creating measurements based on the information returned by clGetEventProfilingInfo for each OpenCL kernel executed.

#include <OpenCLTimer.h>


Public Member Functions

 OpenCLClock (ScaleFactor scale_factor)
 Construct an OpenCL timer.

std::string id () const override
 Identifier for the instrument.

void test_start () override
 Start of the test.

void start () override
 Start measuring.

void stop () override
 Stop measuring.

void test_stop () override
 End of the test.

MeasurementsMap measurements () const override
 Return the latest measurements.

MeasurementsMap test_measurements () const override
 Return the latest test measurements.
 
- Public Member Functions inherited from Instrument
 Instrument ()=default
 Default constructor.

 Instrument (const Instrument &)=default
 Allow instances of this class to be copy constructed.

 Instrument (Instrument &&)=default
 Allow instances of this class to be move constructed.

Instrument & operator= (const Instrument &)=default
 Allow instances of this class to be copied.

Instrument & operator= (Instrument &&)=default
 Allow instances of this class to be moved.

virtual ~Instrument ()=default
 Default destructor.
 

Additional Inherited Members

- Public Types inherited from Instrument
using MeasurementsMap = std::map< std::string, Measurement >
 Map of measurements.

- Static Public Member Functions inherited from Instrument
template<typename T , ScaleFactor scale>
static std::unique_ptr< Instrument > make_instrument ()
 Helper function to create an instrument of the given type.
 

Detailed Description

template<bool output_timestamps>
class arm_compute::test::framework::OpenCLClock< output_timestamps >

Instrument creating measurements based on the information returned by clGetEventProfilingInfo for each OpenCL kernel executed.

Definition at line 45 of file OpenCLTimer.h.

Constructor & Destructor Documentation

◆ OpenCLClock()

OpenCLClock ( ScaleFactor  scale_factor)

Construct an OpenCL timer.

Parameters
[in] scale_factor  Measurement scale factor.

Definition at line 56 of file OpenCLTimer.cpp.

template <bool output_timestamps>
OpenCLClock<output_timestamps>::OpenCLClock(ScaleFactor scale_factor)
    : _kernels(), _real_function(nullptr), _real_graph_function(nullptr), _prefix(), _timer_enabled(false)
{
    // Event profiling info is only available on a queue created with profiling enabled.
    auto                        q     = CLScheduler::get().queue();
    cl_command_queue_properties props = q.getInfo<CL_QUEUE_PROPERTIES>();
    if((props & CL_QUEUE_PROFILING_ENABLE) == 0)
    {
        CLScheduler::get().set_queue(cl::CommandQueue(CLScheduler::get().context(), props | CL_QUEUE_PROFILING_ENABLE));
    }

    // OpenCL reports times in nanoseconds; _scale_factor is the divisor for the requested unit.
    switch(scale_factor)
    {
        case ScaleFactor::NONE:
            _scale_factor = 1.f;
            _unit         = "ns";
            break;
        case ScaleFactor::TIME_US:
            _scale_factor = 1000.f;
            _unit         = "us";
            break;
        case ScaleFactor::TIME_MS:
            _scale_factor = 1000000.f;
            _unit         = "ms";
            break;
        case ScaleFactor::TIME_S:
            _scale_factor = 1000000000.f;
            _unit         = "s";
            break;
        default:
            ARM_COMPUTE_ERROR("Invalid scale");
    }
}

References ARM_COMPUTE_ERROR, CLScheduler::get(), arm_compute::test::framework::NONE, CLScheduler::queue(), CLScheduler::set_queue(), arm_compute::test::framework::TIME_MS, arm_compute::test::framework::TIME_S, and arm_compute::test::framework::TIME_US.
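
The unit selection performed by the constructor can be illustrated in isolation. The following is a minimal sketch, not the library's code: the ScaleFactor enum is redeclared locally as a stand-in, and scale_and_unit() and scale_duration() are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Local stand-in for the framework's ScaleFactor enum (assumption for this sketch).
enum class ScaleFactor { NONE, TIME_US, TIME_MS, TIME_S };

// Hypothetical helper: returns {divisor applied to nanosecond values, unit label},
// using the same pairs as the constructor listing above.
inline std::pair<float, std::string> scale_and_unit(ScaleFactor sf)
{
    switch(sf)
    {
        case ScaleFactor::NONE:
            return { 1.f, "ns" };
        case ScaleFactor::TIME_US:
            return { 1000.f, "us" };
        case ScaleFactor::TIME_MS:
            return { 1000000.f, "ms" };
        case ScaleFactor::TIME_S:
            return { 1000000000.f, "s" };
    }
    throw std::runtime_error("Invalid scale");
}

// Hypothetical helper: converts a raw duration in nanoseconds to the selected unit.
inline float scale_duration(unsigned long long ns, ScaleFactor sf)
{
    const float divisor = scale_and_unit(sf).first;
    assert(divisor > 0.f);
    return static_cast<float>(ns) / divisor;
}
```

For example, scale_duration(2500000ULL, ScaleFactor::TIME_MS) yields 2.5, reported with the unit "ms".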

Member Function Documentation

◆ id()

std::string id ( ) const
override virtual

Identifier for the instrument.

Implements Instrument.

Definition at line 43 of file OpenCLTimer.cpp.

template <bool output_timestamps>
std::string OpenCLClock<output_timestamps>::id() const
{
    if(output_timestamps)
    {
        return "OpenCLTimestamps";
    }
    else
    {
        return "OpenCLTimer";
    }
}

◆ measurements()

Instrument::MeasurementsMap measurements ( ) const
override virtual

Return the latest measurements.

Returns
the latest measurements.

Reimplemented from Instrument.

Definition at line 184 of file OpenCLTimer.cpp.

template <bool output_timestamps>
Instrument::MeasurementsMap OpenCLClock<output_timestamps>::measurements() const
{
    MeasurementsMap measurements;
    unsigned int    kernel_number = 0;
    for(auto const &kernel : _kernels)
    {
        // Query the four profiling timestamps (in nanoseconds) recorded for this kernel's event.
        cl_ulong queued;
        cl_ulong flushed;
        cl_ulong start;
        cl_ulong end;
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_QUEUED, &queued);
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_SUBMIT, &flushed);
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_START, &start);
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_END, &end);
        std::string name = kernel.name + " #" + support::cpp11::to_string(kernel_number++);

        if(output_timestamps)
        {
            // Timestamp mode: report each raw timestamp, scaled with integer division.
            measurements.emplace("[start]" + name, Measurement(start / static_cast<cl_ulong>(_scale_factor), _unit));
            measurements.emplace("[queued]" + name, Measurement(queued / static_cast<cl_ulong>(_scale_factor), _unit));
            measurements.emplace("[flushed]" + name, Measurement(flushed / static_cast<cl_ulong>(_scale_factor), _unit));
            measurements.emplace("[end]" + name, Measurement(end / static_cast<cl_ulong>(_scale_factor), _unit));
        }
        else
        {
            // Timer mode: report the kernel execution time (end - start).
            measurements.emplace(name, Measurement((end - start) / _scale_factor, _unit));
        }
    }

    return measurements;
}

References arm_compute::mlgo::parser::end(), name, and arm_compute::support::cpp11::to_string().
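
The two branches of the listing scale their values differently: in timestamp mode the scale factor is cast to an integer before dividing, so the result is truncated, whereas in timer mode the duration is divided by the floating-point scale factor and keeps its fractional part. The sketch below shows the arithmetic only; scaled_timestamp() and scaled_duration() are hypothetical names, and std::uint64_t stands in for cl_ulong.

```cpp
#include <cassert>
#include <cstdint>

// Timestamp mode: integer division by the scale factor, so the result is truncated.
inline std::uint64_t scaled_timestamp(std::uint64_t ns, float scale_factor)
{
    return ns / static_cast<std::uint64_t>(scale_factor);
}

// Timer mode: the difference is converted to float and divided by the float
// scale factor, so fractional units are preserved.
inline float scaled_duration(std::uint64_t start_ns, std::uint64_t end_ns, float scale_factor)
{
    assert(end_ns >= start_ns);
    return (end_ns - start_ns) / scale_factor;
}
```

With a scale factor of 1000 (microseconds), a timestamp of 1999 ns is reported as 1, while a kernel running from 1000 ns to 2500 ns is reported as 1.5.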

◆ start()

void start ( )
override virtual

Start measuring.

Called just before the run of the test starts.

Reimplemented from Instrument.

Definition at line 162 of file OpenCLTimer.cpp.

template <bool output_timestamps>
void OpenCLClock<output_timestamps>::start()
{
    _kernels.clear();
    _timer_enabled = true;
}

◆ stop()

void stop ( )
override virtual

Stop measuring.

Called just after the run of the test ends.

Reimplemented from Instrument.

Definition at line 168 of file OpenCLTimer.cpp.

template <bool output_timestamps>
void OpenCLClock<output_timestamps>::stop()
{
    _timer_enabled = false;
}

◆ test_measurements()

Instrument::MeasurementsMap test_measurements ( ) const
override virtual

Return the latest test measurements.

Returns
the latest test measurements.

Reimplemented from Instrument.

Definition at line 217 of file OpenCLTimer.cpp.

template <bool output_timestamps>
Instrument::MeasurementsMap OpenCLClock<output_timestamps>::test_measurements() const
{
    MeasurementsMap measurements;

    if(output_timestamps)
    {
        // The OpenCL clock and the wall clock are not in sync, so we use
        // this trick to calculate the offset between the two clocks:
        ::cl::Event event;
        cl_ulong    now_gpu;

        // Retrieve the current CPU clock and enqueue a dummy marker
        std::chrono::high_resolution_clock::time_point now_cpu = std::chrono::high_resolution_clock::now();
        CLScheduler::get().queue().enqueueMarker(&event);

        CLScheduler::get().queue().finish();
        // Access the time at which the marker was enqueued:
        event.getProfilingInfo(CL_PROFILING_COMMAND_QUEUED, &now_gpu);

        measurements.emplace("Now Wall clock", Measurement(now_cpu.time_since_epoch().count() / 1000, "us"));
        measurements.emplace("Now OpenCL", Measurement(now_gpu / static_cast<cl_ulong>(_scale_factor), _unit));
    }

    return measurements;
}

References CLScheduler::get(), and CLScheduler::queue().
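
The "Now Wall clock" and "Now OpenCL" pair is intended to let a consumer place OpenCL timestamps on the wall-clock timeline. The sketch below shows one way such a pair might be used; it assumes both values are expressed in microseconds (that is, the instrument was created with ScaleFactor::TIME_US) and ignores the delay between sampling the CPU clock and the marker being queued. ClockSync, to_wall_clock_us() and wall_clock_now_us() are hypothetical names. Note that the listing divides the raw tick count by 1000, which yields microseconds only when the clock ticks in nanoseconds; the helper here uses std::chrono::duration_cast instead, which does not depend on the tick period.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical holder for the two "Now ..." measurements, both in microseconds.
struct ClockSync
{
    std::int64_t wall_us;   // value reported as "Now Wall clock"
    std::int64_t opencl_us; // value reported as "Now OpenCL"
};

// Shifts an OpenCL timestamp (microseconds) onto the wall-clock timeline
// by adding the offset between the two clocks.
inline std::int64_t to_wall_clock_us(std::int64_t opencl_us, const ClockSync &sync)
{
    return opencl_us + (sync.wall_us - sync.opencl_us);
}

// Current wall-clock time in microseconds, independent of the clock's tick period.
inline std::int64_t wall_clock_now_us()
{
    using namespace std::chrono;
    return duration_cast<microseconds>(high_resolution_clock::now().time_since_epoch()).count();
}
```

For instance, with a wall-clock reading of 1000500 us and an OpenCL reading of 500 us, an OpenCL timestamp of 700 us maps to 1000700 us on the wall clock.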

◆ test_start()

void test_start ( )
override virtual

Start of the test.

Called before the test setup starts.

Reimplemented from Instrument.

Definition at line 90 of file OpenCLTimer.cpp.

template <bool output_timestamps>
void OpenCLClock<output_timestamps>::test_start()
{
    // Start intercepting enqueues:
    ARM_COMPUTE_ERROR_ON(_real_function != nullptr);
    ARM_COMPUTE_ERROR_ON(_real_graph_function != nullptr);
    _real_function       = CLSymbols::get().clEnqueueNDRangeKernel_ptr;
    _real_graph_function = graph::TaskExecutor::get().execute_function;
    auto interceptor     = [this](
                               cl_command_queue command_queue,
                               cl_kernel        kernel,
                               cl_uint          work_dim,
                               const size_t    *gwo,
                               const size_t    *gws,
                               const size_t    *lws,
                               cl_uint          num_events_in_wait_list,
                               const cl_event *event_wait_list,
                               cl_event        *event)
    {
        if(this->_timer_enabled)
        {
            // Build a descriptive name: [graph node prefix]kernel name, plus work sizes.
            kernel_info       info;
            cl::Kernel        cpp_kernel(kernel, true);
            std::stringstream ss;
            ss << this->_prefix << cpp_kernel.getInfo<CL_KERNEL_FUNCTION_NAME>();
            if(gws != nullptr)
            {
                ss << " GWS[" << gws[0] << "," << gws[1] << "," << gws[2] << "]";
            }
            if(lws != nullptr)
            {
                ss << " LWS[" << lws[0] << "," << lws[1] << "," << lws[2] << "]";
            }
            info.name = ss.str();
            // Always request an event so that profiling info can be queried later.
            cl_event tmp;
            cl_int   retval = this->_real_function(command_queue, kernel, work_dim, gwo, gws, lws, num_events_in_wait_list, event_wait_list, &tmp);
            info.event      = tmp;
            this->_kernels.push_back(std::move(info));

            if(event != nullptr)
            {
                // Return the cl_event from the intercepted call
                clRetainEvent(tmp);
                *event = tmp;
            }
            return retval;
        }
        else
        {
            return this->_real_function(command_queue, kernel, work_dim, gwo, gws, lws, num_events_in_wait_list, event_wait_list, event);
        }
    };

    // Start intercepting tasks:
    auto task_interceptor = [this](graph::ExecutionTask & task)
    {
        if(task.node != nullptr && !task.node->name().empty())
        {
            this->_prefix = task.node->name() + "/";
        }
        else
        {
            this->_prefix = "";
        }
        this->_real_graph_function(task);
        this->_prefix = "";
    };

    // Install both interceptors; test_stop() restores the saved originals.
    CLSymbols::get().clEnqueueNDRangeKernel_ptr = interceptor;
    graph::TaskExecutor::get().execute_function = task_interceptor;
}

References ARM_COMPUTE_ERROR_ON, CLSymbols::clEnqueueNDRangeKernel_ptr, clRetainEvent(), TaskExecutor::execute_function, TaskExecutor::get(), CLSymbols::get(), arm_compute::test::validation::info, and arm_compute::test::validation::ss().
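
The interception technique used by test_start() and test_stop() (save the real function pointer, install a wrapper that records and then forwards, restore the original afterwards) can be sketched without OpenCL. Everything in the block below is a simplified stand-in introduced for illustration: Symbols plays the role of CLSymbols, the enqueue member that of clEnqueueNDRangeKernel_ptr, and Recorder that of the instrument. It is not the library's implementation.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical dispatch table standing in for CLSymbols.
struct Symbols
{
    std::function<int(const std::string &)> enqueue;
};

// Hypothetical instrument that records the names of intercepted calls.
class Recorder
{
public:
    explicit Recorder(Symbols &symbols) : _symbols(symbols) {}

    // Save the real function and install a forwarding wrapper.
    void test_start()
    {
        assert(!_real); // mirrors the ARM_COMPUTE_ERROR_ON check: no nested interception
        _real            = _symbols.enqueue;
        _symbols.enqueue = [this](const std::string &kernel)
        {
            if(_enabled)
            {
                _seen.push_back(kernel);
            }
            return _real(kernel); // always forward to the real function
        };
    }

    // Only calls made between start() and stop() are recorded.
    void start()
    {
        _seen.clear();
        _enabled = true;
    }
    void stop()
    {
        _enabled = false;
    }

    // Restore the real function.
    void test_stop()
    {
        _symbols.enqueue = _real;
        _real            = nullptr;
    }

    const std::vector<std::string> &seen() const
    {
        return _seen;
    }

private:
    Symbols                                &_symbols;
    std::function<int(const std::string &)> _real{};
    std::vector<std::string>                _seen{};
    bool                                    _enabled{ false };
};
```

In this sketch, calls issued during setup or teardown (outside the start()/stop() window) still reach the real function but are not recorded, which matches the role of _timer_enabled in the listing above.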

◆ test_stop()

void test_stop ( )
override virtual

End of the test.

Called after the test teardown has ended.

Reimplemented from Instrument.

Definition at line 174 of file OpenCLTimer.cpp.

template <bool output_timestamps>
void OpenCLClock<output_timestamps>::test_stop()
{
    // Restore real function
    CLSymbols::get().clEnqueueNDRangeKernel_ptr = _real_function;
    graph::TaskExecutor::get().execute_function = _real_graph_function;
    _real_graph_function                        = nullptr;
    _real_function                              = nullptr;
}

References CLSymbols::clEnqueueNDRangeKernel_ptr, TaskExecutor::execute_function, TaskExecutor::get(), and CLSymbols::get().


The documentation for this class was generated from the following files: