Compute Library
 22.11
OpenCLClock< output_timestamps > Class Template Reference

Instrument creating measurements based on the information returned by clGetEventProfilingInfo for each OpenCL kernel executed.

#include <OpenCLTimer.h>

Public Member Functions

 OpenCLClock (ScaleFactor scale_factor)
 Construct an OpenCL timer.

std::string id () const override
 Identifier for the instrument.

void test_start () override
 Start of the test.

void start () override
 Start measuring.

void stop () override
 Stop measuring.

void test_stop () override
 End of the test.

MeasurementsMap measurements () const override
 Return the latest measurements.

MeasurementsMap test_measurements () const override
 Return the latest test measurements.
 
- Public Member Functions inherited from Instrument
 Instrument ()=default
 Default constructor.

 Instrument (const Instrument &)=default
 Allow instances of this class to be copy constructed.

 Instrument (Instrument &&)=default
 Allow instances of this class to be move constructed.

Instrument & operator= (const Instrument &)=default
 Allow instances of this class to be copied.

Instrument & operator= (Instrument &&)=default
 Allow instances of this class to be moved.

virtual ~Instrument ()=default
 Default destructor.

virtual std::string instrument_header () const
 Return JSON formatted instrument header string.
 

Additional Inherited Members

- Public Types inherited from Instrument
using MeasurementsMap = std::map< std::string, Measurement >
 Map of measurements.

- Static Public Member Functions inherited from Instrument
template<typename T , ScaleFactor scale>
static std::unique_ptr< Instrument > make_instrument ()
 Helper function to create an instrument of the given type.
 

Detailed Description

template<bool output_timestamps>
class arm_compute::test::framework::OpenCLClock< output_timestamps >

Instrument creating measurements based on the information returned by clGetEventProfilingInfo for each OpenCL kernel executed.

Definition at line 45 of file OpenCLTimer.h.

Constructor & Destructor Documentation

◆ OpenCLClock()

OpenCLClock ( ScaleFactor  scale_factor)

Construct an OpenCL timer.

Parameters
[in] scale_factor  Measurement scale factor.

Definition at line 56 of file OpenCLTimer.cpp.

References ARM_COMPUTE_ERROR, CLScheduler::get(), arm_compute::test::framework::NONE, CLScheduler::queue(), CLScheduler::set_queue(), arm_compute::test::framework::TIME_MS, arm_compute::test::framework::TIME_S, and arm_compute::test::framework::TIME_US.

template <bool output_timestamps>
OpenCLClock<output_timestamps>::OpenCLClock(ScaleFactor scale_factor)
    : _kernels(),
      _real_function(nullptr),
#ifdef ARM_COMPUTE_GRAPH_ENABLED
      _real_graph_function(nullptr),
#endif /* ARM_COMPUTE_GRAPH_ENABLED */
      _prefix(),
      _timer_enabled(false)
{
    // Event profiling requires a queue created with CL_QUEUE_PROFILING_ENABLE,
    // so replace the scheduler's queue if the flag is missing.
    auto                        q     = CLScheduler::get().queue();
    cl_command_queue_properties props = q.getInfo<CL_QUEUE_PROPERTIES>();
    if((props & CL_QUEUE_PROFILING_ENABLE) == 0)
    {
        CLScheduler::get().set_queue(cl::CommandQueue(CLScheduler::get().context(), props | CL_QUEUE_PROFILING_ENABLE));
    }

    // OpenCL reports times in nanoseconds; pick the divisor and unit label.
    switch(scale_factor)
    {
        case ScaleFactor::NONE:
            _scale_factor = 1.f;
            _unit         = "ns";
            break;
        case ScaleFactor::TIME_US:
            _scale_factor = 1000.f;
            _unit         = "us";
            break;
        case ScaleFactor::TIME_MS:
            _scale_factor = 1000000.f;
            _unit         = "ms";
            break;
        case ScaleFactor::TIME_S:
            _scale_factor = 1000000000.f;
            _unit         = "s";
            break;
        default:
            ARM_COMPUTE_ERROR("Invalid scale");
    }
}
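
The scale factor selection in the constructor can be read as a small lookup from enum value to divisor and unit label. The following is a minimal standalone sketch of that mapping, not the library's code: the `ScaleFactor` enum here is a local stand-in containing only the four values referenced above, and `Scale` and `scale_for` are hypothetical names introduced for illustration.

```cpp
#include <stdexcept>
#include <string>

// Local stand-in for the framework's ScaleFactor enum (only the values used above).
enum class ScaleFactor { NONE, TIME_US, TIME_MS, TIME_S };

// Hypothetical helper type: divisor applied to nanosecond values, plus a unit label.
struct Scale
{
    float       divisor;
    std::string unit;
};

// Mirrors the constructor's switch: profiling values start out in nanoseconds.
inline Scale scale_for(ScaleFactor sf)
{
    switch(sf)
    {
        case ScaleFactor::NONE:
            return { 1.f, "ns" };
        case ScaleFactor::TIME_US:
            return { 1000.f, "us" };
        case ScaleFactor::TIME_MS:
            return { 1000000.f, "ms" };
        case ScaleFactor::TIME_S:
            return { 1000000000.f, "s" };
    }
    // Stands in for ARM_COMPUTE_ERROR("Invalid scale") in the original.
    throw std::invalid_argument("Invalid scale");
}
```

With this sketch, a 2,500,000 ns duration divided by `scale_for(ScaleFactor::TIME_MS).divisor` gives 2.5, labelled "ms".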

Member Function Documentation

◆ id()

std::string id ( ) const
overridevirtual

Identifier for the instrument.

Implements Instrument.

Definition at line 43 of file OpenCLTimer.cpp.

template <bool output_timestamps>
std::string OpenCLClock<output_timestamps>::id() const
{
    if(output_timestamps)
    {
        return "OpenCLTimestamps";
    }
    else
    {
        return "OpenCLTimer";
    }
}

◆ measurements()

Instrument::MeasurementsMap measurements ( ) const
overridevirtual

Return the latest measurements.

Returns
the latest measurements.

Reimplemented from Instrument.

Definition at line 193 of file OpenCLTimer.cpp.

References arm_compute::mlgo::parser::end(), name, OpenCLClock< output_timestamps >::start(), and arm_compute::support::cpp11::to_string().

Referenced by OpenCLClock< output_timestamps >::test_measurements().

template <bool output_timestamps>
Instrument::MeasurementsMap OpenCLClock<output_timestamps>::measurements() const
{
    MeasurementsMap measurements;
    unsigned int    kernel_number = 0;
    for(auto const &kernel : _kernels)
    {
        // Profiling timestamps reported by OpenCL, in nanoseconds
        cl_ulong queued;
        cl_ulong flushed;
        cl_ulong start;
        cl_ulong end;
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_QUEUED, &queued);
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_SUBMIT, &flushed);
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_START, &start);
        kernel.event.getProfilingInfo(CL_PROFILING_COMMAND_END, &end);
        // The running index keeps repeated kernel names unique
        std::string name = kernel.name + " #" + support::cpp11::to_string(kernel_number++);

        if(output_timestamps)
        {
            // Report the four raw timestamps (integer division by the scale factor)
            measurements.emplace("[start]" + name, Measurement(start / static_cast<cl_ulong>(_scale_factor), _unit));
            measurements.emplace("[queued]" + name, Measurement(queued / static_cast<cl_ulong>(_scale_factor), _unit));
            measurements.emplace("[flushed]" + name, Measurement(flushed / static_cast<cl_ulong>(_scale_factor), _unit));
            measurements.emplace("[end]" + name, Measurement(end / static_cast<cl_ulong>(_scale_factor), _unit));
        }
        else
        {
            // Report only the kernel execution time
            measurements.emplace(name, Measurement((end - start) / _scale_factor, _unit));
        }
    }

    return measurements;
}
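
To make the two output modes of measurements() concrete, here is a minimal sketch that works on plain integers instead of OpenCL events. It is an illustration under stated assumptions, not the framework's implementation: `KernelTimes` and `to_measurements` are hypothetical names, and values are stored as `double` rather than in `Measurement` objects.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the four profiling values of one kernel (nanoseconds).
struct KernelTimes
{
    std::uint64_t queued;
    std::uint64_t submitted;
    std::uint64_t start;
    std::uint64_t end;
};

// Builds the same keys as the listing above; the unit label is omitted for brevity.
inline std::map<std::string, double> to_measurements(const std::string &name, const KernelTimes &t,
                                                     float scale, bool output_timestamps)
{
    std::map<std::string, double> m;
    if(output_timestamps)
    {
        // Timestamps use integer division, so anything below one unit is truncated.
        const auto div = static_cast<std::uint64_t>(scale);
        m.emplace("[queued]" + name, static_cast<double>(t.queued / div));
        m.emplace("[flushed]" + name, static_cast<double>(t.submitted / div));
        m.emplace("[start]" + name, static_cast<double>(t.start / div));
        m.emplace("[end]" + name, static_cast<double>(t.end / div));
    }
    else
    {
        // Durations use floating-point division and keep the fractional part.
        m.emplace(name, static_cast<double>((t.end - t.start) / scale));
    }
    return m;
}
```

For a kernel that starts at 3500 ns and ends at 5000 ns with a scale of 1000, the timer mode yields a single entry of 1.5, while the timestamp mode yields four entries, with `[start]` truncated to 3.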

◆ start()

void start ( )
overridevirtual

Start measuring.

Called just before the test run starts.

Reimplemented from Instrument.

Definition at line 169 of file OpenCLTimer.cpp.

Referenced by OpenCLClock< output_timestamps >::measurements().

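template <bool output_timestamps>
void OpenCLClock<output_timestamps>::start()
{
    // Discard kernels recorded by a previous run and begin recording
    _kernels.clear();
    _timer_enabled = true;
}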

◆ stop()

void stop ( )
overridevirtual

Stop measuring.

Called just after the test run ends.

Reimplemented from Instrument.

Definition at line 175 of file OpenCLTimer.cpp.

template <bool output_timestamps>
void OpenCLClock<output_timestamps>::stop()
{
    _timer_enabled = false;
}
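
The hook descriptions in this reference imply an ordering: test_start() before set up, start() and stop() around the run, and test_stop() after teardown. The sketch below shows that ordering with a hypothetical `RecordingInstrument` and a hypothetical `run_one_test` harness; neither is part of the framework, and the real scheduler may call the hooks differently (for example, once per iteration).

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical instrument that only records the order in which its hooks are called.
struct RecordingInstrument
{
    std::vector<std::string> calls;

    void test_start() { calls.push_back("test_start"); }
    void start() { calls.push_back("start"); }
    void stop() { calls.push_back("stop"); }
    void test_stop() { calls.push_back("test_stop"); }
};

// Hypothetical harness following the hook descriptions in this reference.
inline void run_one_test(RecordingInstrument &inst,
                         const std::function<void()> &setup,
                         const std::function<void()> &run,
                         const std::function<void()> &teardown)
{
    inst.test_start(); // before the test set up starts
    setup();
    inst.start(); // just before the run
    run();
    inst.stop(); // just after the run
    teardown();
    inst.test_stop(); // after the teardown has ended
}
```

In OpenCLClock, only kernels enqueued between start() and stop() are recorded, because the interceptor checks `_timer_enabled`.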

◆ test_measurements()

Instrument::MeasurementsMap test_measurements ( ) const
overridevirtual

Return the latest test measurements.

Returns
the latest test measurements.

Reimplemented from Instrument.

Definition at line 226 of file OpenCLTimer.cpp.

References CLScheduler::get(), OpenCLClock< output_timestamps >::measurements(), and CLScheduler::queue().

template <bool output_timestamps>
Instrument::MeasurementsMap OpenCLClock<output_timestamps>::test_measurements() const
{
    MeasurementsMap measurements;

    if(output_timestamps)
    {
        // The OpenCL clock and the wall clock are not in sync, so we use
        // this trick to calculate the offset between the two clocks:
        ::cl::Event event;
        cl_ulong    now_gpu;

        // Retrieve the current CPU clock and enqueue a dummy marker
        std::chrono::high_resolution_clock::time_point now_cpu = std::chrono::high_resolution_clock::now();
        CLScheduler::get().queue().enqueueMarker(&event);

        CLScheduler::get().queue().finish();
        // Access the time at which the marker was enqueued:
        event.getProfilingInfo(CL_PROFILING_COMMAND_QUEUED, &now_gpu);

        measurements.emplace("Now Wall clock", Measurement(now_cpu.time_since_epoch().count() / 1000, "us"));
        measurements.emplace("Now OpenCL", Measurement(now_gpu / static_cast<cl_ulong>(_scale_factor), _unit));
    }

    return measurements;
}
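
The "Now Wall clock" and "Now OpenCL" entries form one paired sample of the two clocks, which a consumer of the results can use to place kernel timestamps on the wall-clock timeline. The sketch below shows that arithmetic. The function names are hypothetical, and it assumes both values are in microseconds, which holds only when the instrument is constructed with the TIME_US scale, since "Now Wall clock" is always reported in "us".

```cpp
#include <cstdint>

// Offset between the two clocks, from one paired sample taken at nearly the same instant.
// Both arguments must be in the same unit (microseconds in this sketch).
inline std::int64_t clock_offset_us(std::int64_t now_wall_us, std::int64_t now_opencl_us)
{
    return now_wall_us - now_opencl_us;
}

// Maps an OpenCL timestamp onto the wall-clock timeline using that offset.
inline std::int64_t opencl_to_wall_us(std::int64_t opencl_us, std::int64_t offset_us)
{
    return opencl_us + offset_us;
}
```

The CPU time is sampled just before the marker is enqueued, so the offset carries a small error on the order of the enqueue latency.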

◆ test_start()

void test_start ( )
overridevirtual

Start of the test.

Called before the test setup starts.

Reimplemented from Instrument.

Definition at line 96 of file OpenCLTimer.cpp.

References ARM_COMPUTE_ERROR_ON, CLSymbols::clEnqueueNDRangeKernel_ptr, clRetainEvent(), TaskExecutor::execute_function, TaskExecutor::get(), CLSymbols::get(), arm_compute::test::validation::info, and arm_compute::test::validation::ss().

template <bool output_timestamps>
void OpenCLClock<output_timestamps>::test_start()
{
    // Start intercepting enqueues:
    ARM_COMPUTE_ERROR_ON(_real_function != nullptr);
    _real_function   = CLSymbols::get().clEnqueueNDRangeKernel_ptr;
    auto interceptor = [this](
                           cl_command_queue command_queue,
                           cl_kernel        kernel,
                           cl_uint          work_dim,
                           const size_t    *gwo,
                           const size_t    *gws,
                           const size_t    *lws,
                           cl_uint          num_events_in_wait_list,
                           const cl_event *event_wait_list,
                           cl_event        *event)
    {
        if(this->_timer_enabled)
        {
            // Build a descriptive name: [prefix]kernel_name GWS[...] LWS[...]
            kernel_info       info;
            cl::Kernel        cpp_kernel(kernel, true);
            std::stringstream ss;
            ss << this->_prefix << cpp_kernel.getInfo<CL_KERNEL_FUNCTION_NAME>();
            if(gws != nullptr)
            {
                ss << " GWS[" << gws[0] << "," << gws[1] << "," << gws[2] << "]";
            }
            if(lws != nullptr)
            {
                ss << " LWS[" << lws[0] << "," << lws[1] << "," << lws[2] << "]";
            }
            info.name = ss.str();
            // Always request an event so the kernel can be profiled later
            cl_event tmp;
            cl_int   retval = this->_real_function(command_queue, kernel, work_dim, gwo, gws, lws, num_events_in_wait_list, event_wait_list, &tmp);
            info.event      = tmp;
            this->_kernels.push_back(std::move(info));

            if(event != nullptr)
            {
                // Return the cl_event from the intercepted call
                clRetainEvent(tmp);
                *event = tmp;
            }
            return retval;
        }
        else
        {
            return this->_real_function(command_queue, kernel, work_dim, gwo, gws, lws, num_events_in_wait_list, event_wait_list, event);
        }
    };
    CLSymbols::get().clEnqueueNDRangeKernel_ptr = interceptor;

#ifdef ARM_COMPUTE_GRAPH_ENABLED
    ARM_COMPUTE_ERROR_ON(_real_graph_function != nullptr);
    _real_graph_function = graph::TaskExecutor::get().execute_function;
    // Start intercepting tasks:
    auto task_interceptor = [this](graph::ExecutionTask &task)
    {
        // Prefix kernel names with the graph node name while the task runs
        if(task.node != nullptr && !task.node->name().empty())
        {
            this->_prefix = task.node->name() + "/";
        }
        else
        {
            this->_prefix = "";
        }
        this->_real_graph_function(task);
        this->_prefix = "";
    };
    graph::TaskExecutor::get().execute_function = task_interceptor;
#endif /* ARM_COMPUTE_GRAPH_ENABLED */
}
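
test_start() relies on a general interception pattern: calls are dispatched through a replaceable `std::function`, the original target is saved, and a wrapper that records and then forwards is installed in its place. The sketch below shows the pattern without any OpenCL dependency. `Symbols` and `Recorder` are hypothetical stand-ins for `CLSymbols` and `OpenCLClock`, and the `int(const std::string &)` signature is an assumption chosen for brevity.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical dispatch table standing in for CLSymbols: calls go through a std::function.
struct Symbols
{
    std::function<int(const std::string &)> enqueue = [](const std::string &) { return 0; };
};

// Hypothetical recorder that installs and removes an interceptor,
// in the way test_start() and test_stop() do.
class Recorder
{
public:
    explicit Recorder(Symbols &symbols) : _symbols(symbols) {}

    void install()
    {
        _real = _symbols.enqueue; // keep the real function so it can be called and restored
        _symbols.enqueue = [this](const std::string &kernel)
        {
            if(enabled)
            {
                seen.push_back(kernel); // record only while measuring
            }
            return _real(kernel); // always forward to the real function
        };
    }

    void restore()
    {
        _symbols.enqueue = _real;
        _real            = nullptr;
    }

    bool                     enabled{ false };
    std::vector<std::string> seen{};

private:
    Symbols                                &_symbols;
    std::function<int(const std::string &)> _real{};
};
```

Because the wrapper captures `this`, the recorder has to outlive the period during which the interceptor is installed, which is why restoring the original function in the matching teardown hook matters.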

◆ test_stop()

void test_stop ( )
overridevirtual

End of the test.

Called after the test teardown has ended.

Reimplemented from Instrument.

Definition at line 181 of file OpenCLTimer.cpp.

References CLSymbols::clEnqueueNDRangeKernel_ptr, TaskExecutor::execute_function, TaskExecutor::get(), and CLSymbols::get().

template <bool output_timestamps>
void OpenCLClock<output_timestamps>::test_stop()
{
    // Restore real function
    CLSymbols::get().clEnqueueNDRangeKernel_ptr = _real_function;
    _real_function                              = nullptr;
#ifdef ARM_COMPUTE_GRAPH_ENABLED
    graph::TaskExecutor::get().execute_function = _real_graph_function;
    _real_graph_function                        = nullptr;
#endif /* ARM_COMPUTE_GRAPH_ENABLED */
}

The documentation for this class was generated from the following files: