24.02.1
Data Structures | |
class | BenchmarkResult |
class | GEMMBenchmarkResultRecorder |
class | GEMMConfigDistribution |
class | GEMMParam |
class | Measurement |
class | NativeGEMMConfig |
class | ReshapedGEMMConfig |
class | ReshapedOnlyRHSGEMMConfig |
Functions | |
Dict[str, str] | parse_benchmark_commandline (str commandline) |
Functions. | |
Generator[BenchmarkResult, None, None] | extract_benchmark_results (Dict json_results, measurement_method="avg") |
def | parse_json (dir_name) |
def | check_out_path (out_path) |
def | dump_json (out_path, dict) |
def | main (args) |
Main. | |
Variables | |
Strategy = Enum("Strategy", ["Native", "ReshapedOnlyRHS", "Reshaped"]) | |
Types. | |
GEMMConfigT | |
dictionary | GEMM_CONFIG_FACTORY |
dictionary | EXAMPLE_FILE_2_STRATEGY |
dictionary | GEMM_EXAMPLE_ARGS_FACTORY |
string | BENCHMARK_RESULT_JSON_EXTENSION = "gemmtuner_benchmark" |
parser = argparse.ArgumentParser(description="CL GEMM Tuner") | |
dest | |
metavar | |
action | |
type | |
help | |
required | |
default | |
args = parser.parse_args() | |
logging_level = logging.DEBUG if args.debug else logging.INFO | |
level | |
def GemmTuner.check_out_path | ( | out_path | ) |
Definition at line 566 of file GemmTuner.py.
References update_supported_ops.format, and arm_compute::test::validation.input.
Referenced by GEMMBenchmarkResultRecorder.save_to_jsons().
def GemmTuner.dump_json | ( | out_path, dict | ) |
Definition at line 582 of file GemmTuner.py.
Referenced by GEMMBenchmarkResultRecorder.save_to_jsons().
Generator[BenchmarkResult, None, None] GemmTuner.extract_benchmark_results | ( | Dict | json_results, measurement_method = "avg" | ) |
Parse the benchmark results and extract the relevant information: GEMM param, Strategy, GEMM config, and Measurements.
Definition at line 488 of file GemmTuner.py.
References update_supported_ops.format, and parse_benchmark_commandline().
Referenced by main().
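The generator shape of this function can be illustrated with a minimal sketch. It is not the code from GemmTuner.py: the JSON layout, the ResultSketch type, and the "min" alternative to "avg" are all assumptions made for illustration only.

```python
from statistics import mean
from typing import Dict, Generator, NamedTuple


class ResultSketch(NamedTuple):
    # Hypothetical stand-in for BenchmarkResult; the real fields are not listed on this page.
    example: str
    time_ms: float


def extract_results_sketch(
    json_results: Dict, measurement_method: str = "avg"
) -> Generator[ResultSketch, None, None]:
    # Assumed layout: {"tests": {<example name>: {"raw_ms": [<float>, ...]}}}
    # "min" is an assumed alternative; only the "avg" default appears in the signature.
    reducers = {"avg": mean, "min": min}
    reduce_fn = reducers[measurement_method]
    for example, data in json_results.get("tests", {}).items():
        raw = data.get("raw_ms", [])
        if not raw:
            continue  # skip entries that carry no measurements
        yield ResultSketch(example, reduce_fn(raw))
```

Because the function yields results lazily, a caller such as main() can consume them one at a time without holding every parsed result in memory.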
def GemmTuner.main | ( | args | ) |
Main.
Definition at line 593 of file GemmTuner.py.
References extract_benchmark_results(), update_supported_ops.format, and parse_json().
Dict[str, str] GemmTuner.parse_benchmark_commandline | ( | str | commandline | ) |
Functions.
Parse the benchmark example command-line string into a dictionary of command-line arguments.
Definition at line 468 of file GemmTuner.py.
References arm_compute::utils.map().
Referenced by extract_benchmark_results().
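A minimal sketch of this kind of parsing follows. It assumes the command line has the form "program --name=value ..."; the option names in the usage comment are hypothetical and the real function may handle further cases.

```python
from typing import Dict


def parse_commandline_sketch(commandline: str) -> Dict[str, str]:
    # Assumed format: "<program> --name=value --other=value ..."
    tokens = commandline.split()[1:]  # discard the program name
    parsed: Dict[str, str] = {}
    for token in tokens:
        # Split on the first "=" only; a token without "=" maps to an empty string.
        name, _, value = token.partition("=")
        parsed[name.lstrip("-")] = value
    return parsed


# Usage with hypothetical option names:
# parse_commandline_sketch("./example --m=64 --n=32 --type=F32")
```

All values stay strings, matching the Dict[str, str] return type; any numeric conversion would be left to the caller.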
def GemmTuner.parse_json | ( | dir_name | ) |
Glob all benchmark result JSON files and parse them into JSON objects (dicts).
Definition at line 558 of file GemmTuner.py.
References update_supported_ops.format.
Referenced by main().
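The glob-then-parse pattern described here can be sketched as below. The sketch assumes that BENCHMARK_RESULT_JSON_EXTENSION is used as a file suffix after a dot; the actual file-name pattern is not shown on this page.

```python
import glob
import json
import os
from typing import Dict, Iterator

BENCHMARK_RESULT_JSON_EXTENSION = "gemmtuner_benchmark"


def parse_json_sketch(dir_name: str) -> Iterator[Dict]:
    # Assumption: result files are named "<anything>.gemmtuner_benchmark".
    pattern = os.path.join(dir_name, "*." + BENCHMARK_RESULT_JSON_EXTENSION)
    for path in sorted(glob.glob(pattern)):  # sorted for a deterministic order
        with open(path) as res_file:
            yield json.load(res_file)
```

Files in the directory that do not match the pattern are ignored rather than reported as errors.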
action |
Definition at line 646 of file GemmTuner.py.
args = parser.parse_args() |
Definition at line 679 of file GemmTuner.py.
Referenced by GpuKernelComponentGraph.add_new_component(), Graph.add_node(), CommandLineParser.add_option(), CommandLineParser.add_positional_option(), arm_compute::test.apply(), arm_compute::test::framework.apply_impl(), DepthwiseDepthfirstMultiplier< TInput, TWeight, TOutput, TAccum, is_generic, OutputStage >.compute_tile_padded(), CLSynthetizeOperator< ClGemmMatrixMultiplyReshapedOnlyRhsKernel >.configure(), NESynthetizeFunction< K >.configure(), NESynthetizeFunctionWithZeroConstantBorder< K, bordersize >.configure(), CLSynthetizeOperatorInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >.configure(), NESynthetizeFunctionWithZeroConstantKernelBorder< K >.configure(), CLSynthetizeFunction< K >.configure(), CLSynthetizeFunctionWithZeroConstantBorder< K, bordersize >.configure(), CLSynthetizeFunctionInitOutputWithZeroAndWithZeroConstantBorder< K, bordersize >.configure(), ClSynthetizeOperatorWithBorder< K >.configure(), GpuKernelComponentFactory.create(), arm_compute::graph::backends.create_named_function(), arm_compute::graph::backends.create_named_memory_managed_function(), GpuWorkloadContext.create_tensor_info(), arm_conv::depthwise.depthwise(), GemmImplementation< Top, Tret, OutputStage >.do_cycle_estimate(), GemmImplementation< Top, Tret, Nothing >.do_cycle_estimate(), GemmImplementation< Top, Tret, OutputStage >.do_instantiate(), GemmImplementation< Top, Tret, Nothing >.do_instantiate(), GemmImplementation< Top, Tret, OutputStage >.do_is_supported(), GemmImplementation< Top, Tret, Nothing >.do_is_supported(), GemmHybrid< strategy, To, Tr >.estimate_cycles(), GemmInterleaved< strategy, To, Tr, OutputStage, MergeStep, FixedFormat, ForceThreadColumns, ForceFloatAccumulate >.estimate_cycles(), StrategyType< IsGeneric, TInput, TWeight, TOutput, TAccum, OutputStage >.execute(), StrategyType< true, TInput, TWeight, TOutput, TAccum, OutputStage >.execute(), StrategyType< false, TInput, TWeight, TOutput, int32_t, arm_gemm::Requantize32 >.execute(), 
StrategyType< true, TInput, TWeight, TOutput, int32_t, arm_gemm::Requantize32 >.execute(), PrepareInputSample< true >.execute(), arm_conv::pooling.find_implementation(), arm_conv::depthwise.find_implementation(), arm_gemm.find_implementation(), arm_compute::utility.for_each(), arm_compute::detail.for_each_error(), arm_gemm.gemm(), GemmImplementation< Top, Tret, OutputStage >.GemmImplementation(), GemmImplementation< Top, Tret, Nothing >.GemmImplementation(), GemvBatched< To, Tr >.GemvBatched(), arm_conv::depthwise.get_compatible_kernels(), arm_gemm.get_compatible_kernels(), PoolingImplementation< TInput, TOutput, OutputStage >.get_cycle_estimate(), DepthwiseImplementation< TInput, TWeight, TOutput, OutputStage >.get_cycle_estimate(), GenericInputArrayElement< T >.get_element_size(), InputArrayElement< T >.get_element_size(), InputPatchElement< T, IsGeneric, OutputStage >.get_element_size(), arm_gemm.get_gemm_method(), PoolingImplementation< TInput, TOutput, OutputStage >.get_instance(), DepthwiseImplementation< TInput, TWeight, TOutput, OutputStage >.get_instance(), PoolingImplementation< TInput, TOutput, OutputStage >.get_is_supported(), DepthwiseImplementation< TInput, TWeight, TOutput, OutputStage >.get_is_supported(), GpuCkwDriver.get_kernel_arguments(), arm_conv::depthwise::interleaves::quantized.get_storage_size(), DepthfirstStrategy< __fp16, __fp16, __fp16, __fp16, typename DefaultOutputStage< __fp16 >::Type >.get_storage_size(), DepthfirstMultiplierStrategy< TInput, TWeight, TOutput, TAccum >.get_storage_size(), GenericDepthfirstStrategy< TInput, TWeight, TOutput, TAccum, OutputStage >.get_storage_size(), DepthfirstMultiplierStrategy< TInput, TWeight, TOutput, int32_t >.get_storage_size(), DepthwiseDepthfirstStrategy< TInput, TWeight, TOutput, int32_t >.get_storage_size(), PlanarStrategy< uint8_t, int8_t >.get_storage_size(), GenericDepthfirstMultiplierStrategy< TInput, TWeight, TOutput, TAccum, OutputStage >.get_storage_size(), 
arm_conv::depthwise::interleaves.get_storage_size_generic(), DepthwiseDepthfirstGeneric< TInput, TWeight, TOutput, TAccum, OutputStage >.get_working_size_per_thread(), DepthwiseDepthfirst< TInput, TWeight, TOutput, TAccum, OutputStage >.get_working_size_per_thread(), DepthwiseDepthfirstMultiplier< TInput, TWeight, TOutput, TAccum, is_generic, OutputStage >.get_working_size_per_thread(), arm_gemm.has_opt_gemm(), CpuGemmAssemblyDispatch.has_opt_impl(), GenericInputArrayElement< T >.initialise(), InputArrayElement< T >.initialise(), InputPatchElement< T, IsGeneric, OutputStage >.initialise(), DepthwiseDepthfirstGeneric< TInput, TWeight, TOutput, TAccum, OutputStage >.initialise_working_space(), DepthwiseDepthfirst< TInput, TWeight, TOutput, TAccum, OutputStage >.initialise_working_space(), DepthwiseDepthfirstMultiplier< TInput, TWeight, TOutput, TAccum, is_generic, OutputStage >.initialise_working_space(), arm_conv::pooling.is_supported(), Logger.log(), arm_compute::utils::memory.make_deep_unique(), arm_compute.operator<<(), arm_conv::depthwise::interleaves::quantized.pack_parameters(), DepthfirstMultiplierStrategy< TInput, TWeight, TOutput, TAccum >.pack_parameters(), DepthfirstStrategy< __fp16, __fp16, __fp16, __fp16, typename DefaultOutputStage< __fp16 >::Type >.pack_parameters(), DepthfirstMultiplierStrategy< TInput, TWeight, TOutput, int32_t >.pack_parameters(), GenericDepthfirstStrategy< TInput, TWeight, TOutput, TAccum, OutputStage >.pack_parameters(), DepthwiseDepthfirstStrategy< TInput, TWeight, TOutput, int32_t >.pack_parameters(), PlanarStrategy< uint8_t, int8_t >.pack_parameters(), GenericDepthfirstMultiplierStrategy< TInput, TWeight, TOutput, TAccum, OutputStage >.pack_parameters(), arm_conv::depthwise::interleaves.pack_parameters_generic(), arm_conv::pooling.pooling(), QuantizeWrapper< To, Tr, Tgemm >.QuantizeWrapper(), arm_compute::support::cpp11.snprintf(), arm_compute::support::cpp11.stof(), arm_compute::logging.string_with_format(), 
arm_compute.to_string(), CLSynthetizeOperator< ClGemmMatrixMultiplyReshapedOnlyRhsKernel >.validate(), NESynthetizeFunction< K >.validate(), and CLSynthetizeFunction< K >.validate().
string BENCHMARK_RESULT_JSON_EXTENSION = "gemmtuner_benchmark" |
Definition at line 461 of file GemmTuner.py.
default |
Definition at line 668 of file GemmTuner.py.
dest |
Definition at line 644 of file GemmTuner.py.
Referenced by arm_conv::addressing.fill_patch_array_generic_kernel(), arm_conv::addressing.fill_pointer_array(), and arm_conv::addressing.fill_pointer_array_generic_kernel().
dictionary EXAMPLE_FILE_2_STRATEGY |
Definition at line 434 of file GemmTuner.py.
dictionary GEMM_CONFIG_FACTORY |
Definition at line 426 of file GemmTuner.py.
dictionary GEMM_EXAMPLE_ARGS_FACTORY |
Definition at line 450 of file GemmTuner.py.
GEMMConfigT |
Definition at line 189 of file GemmTuner.py.
help |
Definition at line 648 of file GemmTuner.py.
Referenced by Option.set_help().
level |
Definition at line 681 of file GemmTuner.py.
Referenced by arm_compute::test::framework.operator<<(), arm_compute::test::framework.operator>>(), and arm_compute::test::framework.to_string().
logging_level = logging.DEBUG if args.debug else logging.INFO |
Definition at line 680 of file GemmTuner.py.
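The conditional expression above selects the verbosity from a parsed debug flag. A small sketch of the same idiom, using a hypothetical helper name and a SimpleNamespace in place of the parsed arguments:

```python
import logging
from types import SimpleNamespace


def pick_logging_level(args) -> int:
    # Same conditional as above: DEBUG when the debug flag is set, INFO otherwise.
    return logging.DEBUG if args.debug else logging.INFO


# SimpleNamespace stands in for the result of parser.parse_args().
level = pick_logging_level(SimpleNamespace(debug=True))
logging.basicConfig(level=level)
```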
metavar |
Definition at line 645 of file GemmTuner.py.
parser = argparse.ArgumentParser(description="CL GEMM Tuner") |
Definition at line 640 of file GemmTuner.py.
Referenced by CommonGemmExampleOptions.CommonGemmExampleOptions(), CommonGraphOptions.CommonGraphOptions(), CommonGraphValidateOptions.CommonGraphValidateOptions(), CommonOptions.CommonOptions(), GraphValidateExample< DepthwiseConvolutionLayer, DepthConvolutionOptions, DepthConvolutionVerifyAccessor >.do_setup(), main(), and arm_compute::utils.run_example().
required |
Definition at line 652 of file GemmTuner.py.
Referenced by arm_compute.adjust_down(), and arm_compute.adjust_up().
Strategy = Enum("Strategy", ["Native", "ReshapedOnlyRHS", "Reshaped"]) |
Types.
Definition at line 41 of file GemmTuner.py.
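Strategy is built with the functional Enum API from the Python standard library. The short example below shows how members created this way behave; the lookups are illustrative and are not taken from GemmTuner.py.

```python
from enum import Enum

Strategy = Enum("Strategy", ["Native", "ReshapedOnlyRHS", "Reshaped"])

# Members are numbered from 1 in declaration order.
names = [s.name for s in Strategy]
values = [s.value for s in Strategy]

# Members can be looked up by name (for example, from a parsed string) or by value.
by_name = Strategy["ReshapedOnlyRHS"]
by_value = Strategy(3)
```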
type |
Definition at line 647 of file GemmTuner.py.