23.11
|
Data Structures | |
class | ArgumentPack |
This is a generic class that packs the arguments of an operator. More... | |
struct | AuxMemoryInfo |
Memory information for tensors with MemoryType::Auxiliary. More... | |
class | CastAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
class | ClampAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
class | ClComponentActivation |
class | ClComponentCast |
class | ClComponentCastSettings |
Component specific settings. More... | |
class | ClComponentDepthwiseConv2d |
class | ClComponentDepthwiseConv2dSettings |
Component specific settings. More... | |
class | ClComponentDirectConv2d |
class | ClComponentDirectConv2dSettings |
Component specific settings. More... | |
class | ClComponentElementwiseBinary |
class | ClComponentLogits1DMaxShiftExpSum |
Component to calculate max-shifted exponentials and their sum. More... | |
class | ClComponentLogits1DNorm |
Component to calculate the final step of the Softmax Layer where each logit value is multiplied by the inverse of the sum of the logits. More... | |
class | ClComponentMatMul |
class | ClComponentPool2d |
class | ClComponentReshape |
class | ClComponentResize |
class | ClComponentStore |
class | ClKernelRuntime |
OpenCL runtime to run a single kernel. More... | |
class | ClTemplateActivation |
class | ClTemplateCast |
class | ClTemplateDepthwiseConv2d |
class | ClTemplateDirectConv2d |
class | ClTemplateElementwiseBinary |
class | ClTemplateLogits1DMaxShiftExpSum |
class | ClTemplateLogits1DNorm |
class | ClTemplatePool2d |
class | ClTemplateReshape |
class | ClTemplateResize |
class | ClTemplateStore |
class | ClTemplateWriter |
Uses a templated-string-based method to write kernel code. It stitches the component code templates together based on the valid fusion configuration. More... | |
class | ClWorkloadRuntime |
OpenCL runtime to run a workload. More... | |
class | Conv2dAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
class | DependencyGraph |
A multi-input (tensors), multi-output (tensors) acyclic directed graph, represented as a doubly-linked adjacency list with the differentiation between source and destination. More... | |
class | DepthwiseConv2dAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
class | ElementwiseBinaryCommonAttributes |
class | GpuAdd |
Operator interface. More... | |
class | GpuCast |
Operator interface. More... | |
class | GpuCkwActivation |
class | GpuCkwCast |
class | GpuCkwComponentArgument |
The argument of a dynamic fusion component which can be either user tensor or virtual tensor. More... | |
class | GpuCkwDepthwiseConv2d |
class | GpuCkwDirectConv2d |
class | GpuCkwDriver |
Uses Kernel Writer to write kernel code. Used by the dynamic_fusion module. More... | |
class | GpuCkwElementwiseBinary |
class | GpuCkwKernelWriter |
Extended implementation of kernel writer for dynamic fusion. More... | |
class | GpuCkwMatMul |
class | GpuCkwPool2d |
class | GpuCkwResize |
class | GpuCkwScopedKernelWriter |
Helper to automatically manage kernel writer ID space. More... | |
class | GpuCkwStore |
An interface used by ClTemplateWriter to write source code for a kernel component. More... | |
class | GpuCkwVariableTable |
A table of all the variables used in the kernel. More... | |
class | GpuClamp |
Operator interface. More... | |
class | GpuComponentServices |
Services that are used throughout the creation phase of workload code. More... | |
class | GpuConv2d |
Operator interface. More... | |
class | GpuDepthwiseConv2d |
Operator interface. More... | |
class | GpuElementwiseBinaryCommon |
Operator interface. More... | |
class | GpuKernelArgument |
Kernel argument information linked with its corresponding ITensorInfo. More... | |
struct | GpuKernelArgumentInfo |
Contain information required to set up a kernel argument at run time. More... | |
class | GpuKernelComponentFactory |
Factory class that creates new instances of IGpuKernelComponent by assigning new component ids. More... | |
class | GpuKernelComponentGraph |
A multi-input (tensors), multi-output (tensors) acyclic directed graph of gpu kernel components. Its main purposes are: More... | |
class | GpuKernelComponentGroup |
A group of gpu kernel components to be fused together. PRECONDITIONS: More... | |
class | GpuKernelComponentStream |
A linear sequence of component groups serialized from the GpuKernelComponentGraph. Each component group in the stream denotes a complete kernel that may consist of multiple components. More... | |
class | GpuKernelSourceCode |
Container of kernel code to be compiled and run in a GpuUnitWorkload. More... | |
class | GpuKernelVariableTable |
A table of all the variables used in the kernel. More... | |
class | GpuLogicalKernel |
A wrapper-processor of a GpuKernelComponentGroup. It adds the load (if any) and store components to the component group. The GpuLogicalKernel represents a complete kernel, and can proceed to invoke any kernel writer to generate the full kernel code. More... | |
class | GpuMatMul |
Operator interface. More... | |
class | GpuMatMulSettings |
Operator backend specific settings. More... | |
class | GpuMul |
Operator interface. More... | |
class | GpuOperatorGroup |
A linear sequence of operators to be fused in a workload. For the time being, this class is only used for validating operator fusion. INVARIANTS: More... | |
class | GpuOutput |
Operator interface. More... | |
class | GpuPool2d |
Operator interface. More... | |
class | GpuPool2dSettings |
Operator backend specific settings. More... | |
class | GpuReshape |
Operator interface. More... | |
class | GpuResize |
Operator interface. More... | |
class | GpuSigmoid |
Operator interface. More... | |
class | GpuSoftmax |
Operator interface. More... | |
class | GpuSub |
Operator interface. More... | |
class | GpuTanh |
Operator interface. More... | |
class | GpuUnitWorkload |
The atomic unit in a Gpu workload. More... | |
class | GpuWorkloadArgument |
Describes all the info related to a workload argument (tensor) in order to: More... | |
class | GpuWorkloadContext |
Provide context necessary for the creation and configuration of a workload e.g. More... | |
class | GpuWorkloadSketch |
A descriptor of a workload of operators. More... | |
class | GpuWorkloadSourceCode |
Hold the generated kernel source code and other information required to compile and run the workload. More... | |
class | IGpuCkwComponentDriver |
An interface used by GpuCkwDriver to write source code for a kernel component. More... | |
class | IGpuKernelComponent |
An abstract interface of a component. More... | |
class | IGpuKernelWriter |
An interface that can write a gpu kernel. More... | |
class | IGpuTemplateComponentWriter |
An interface used by ClTemplateWriter to write source code for a kernel component. More... | |
class | KernelProperties |
Properties common to all kernel component types. More... | |
class | MatMulAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
struct | MemoryDescriptor |
Descriptor of a workload tensor memory. More... | |
class | Operator |
An operator for the sole purpose of validating fusion. More... | |
class | Pool2dAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
class | ReshapeAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
class | ResizeAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
class | SoftmaxAttributes |
Attributes are backend-agnostic parameters (in addition to the input/output tensors) of an operator. More... | |
struct | TagVal |
A tag value will substitute a tag in a string template during its instantiation. More... | |
struct | UnitWorkloadStage |
Describes when a unit workload is run. More... | |
Typedefs | |
using | GpuTarget = ::arm_compute::GPUTarget |
Gpu Information such as the Gpu target (for example, G76) More... | |
using | MemoryDescriptorMap = std::map< ITensorInfo::Id, MemoryDescriptor > |
A map from ITensorInfo to their corresponding MemoryDescriptor. More... | |
using | TileContainer = std::vector< std::vector< std::string > > |
using | SamplerCreator = std::function< TensorTileSampler(GpuCkwScopedKernelWriter &, int32_t, int32_t)> |
using | Settings = ClComponentDepthwiseConv2dSettings |
using | ComponentId = int32_t |
Uniquely identifies a kernel component within a workload. More... | |
using | GpuKernelArgumentList = std::map< ITensorInfo::Id, GpuKernelArgument > |
The argument list of a GpuKernelSourceCode. More... | |
using | OperatorId = DependencyGraph::OperatorId |
using | UnitWorkloadId = int32_t |
Uniquely identifies a GpuUnitWorkload within a GpuWorkloadSourceCode. More... | |
using | Tag = std::string |
A tag used in a string template is a placeholder string to be substituted by real values during template instantiation. More... | |
using | TagLUT = std::unordered_map< Tag, TagVal > |
Tag lookup table. More... | |
Enumerations | |
enum | GpuLanguage { OpenCL, Unknown } |
Gpu Language. More... | |
enum | MemoryType { User = 0, Auxiliary, Virtual } |
Type of memory used by a workload tensor. More... | |
enum | GpuComponentType { Complex, Simple, Unfusable, Output } |
Component type in the context of fusion. Its main purpose is to inform the optimizer how to perform fusion. More... | |
enum | GpuOperatorType { Simple, Complex, Unfusable } |
Contain properties common to all operator types. More... | |
Functions | |
void | cl_add_tensor_component_argument (cl::Kernel &kernel, unsigned int &idx, const ICLTensor *tensor, TensorComponentType component) |
Select a Compute Kernel Writer tensor component from a tensor and add to the kernel's arguments at the specified index idx. More... | |
void | cl_add_buffer_argument (cl::Kernel &kernel, unsigned int &idx, const cl::Buffer &buffer) |
Add an OpenCL buffer object to the kernel's arguments at the specified index idx. More... | |
void | cl_add_texture_argument (cl::Kernel &kernel, unsigned int &idx, const cl::Image &image) |
Add an OpenCL image object to the kernel's arguments at the specified index idx. More... | |
ckw::DataType | to_ckw (DataType dt) |
ckw::TensorShape | to_ckw (const TensorShape &shape) |
ckw::TensorDataLayout | to_ckw (DataLayout dl) |
ckw::TensorInfo | to_ckw (const ITensorInfo &tensor_info) |
TensorComponentType | from_ckw (const ckw::TensorComponentType &component) |
ckw::TensorStorageType | to_ckw (const TensorStorageType &storage) |
TensorStorageType | from_ckw (const ckw::TensorStorageType &storage) |
ckw::BinaryOp | to_ckw (const ElementwiseBinaryCommonAttributes &attributes) |
void | load_src_dst_tiles_and_prepare_sampler (GpuCkwScopedKernelWriter &writer, GpuCkwComponentArgument *src, GpuCkwComponentArgument *dst, int32_t m0, int32_t n0, SamplerCreator create_sampler) |
Load src and dst tiles of dimension [m0, n0] only when not loaded and prepare the sampler. More... | |
void | get_coord (GpuCkwScopedKernelWriter writer, TileOperand &coord, const TileOperand &gid, int32_t step_v, int32_t leftover_step_v, const std::string &prefix, const TileOperand &const_0) |
Get boundary aware coordinate along one axis. More... | |
TensorTileSampler | create_boundary_aware_2d_sampler (GpuCkwScopedKernelWriter writer, TileOperand &gid_0, TileOperand &gid_1, int32_t dim0_v, int32_t dim1_v, int32_t n0_v, int32_t m0_v, const std::string prefix, TileOperand &const_0) |
Declare coordinate tiles "{prefix}_dim0_coord" and "{prefix}_dim1_coord", and create a boundary-aware sampler from a tile of size [n0, m0], against the overall dimensions [dim0, dim1]. The load and store of tile [n0, m0] will never be out of bounds of [dim0, dim1]. More... | |
bool | operator== (const KernelProperties &config0, const KernelProperties &config1) |
bool | operator== (const GpuKernelArgumentInfo &info0, const GpuKernelArgumentInfo &info1) |
bool | operator== (const UnitWorkloadStage &stage0, const UnitWorkloadStage &stage1) |
bool | is_alloc_tensor (const ITensorInfo *tensor_info) |
Tensor should have backing memory. More... | |
bool | is_noalloc_tensor (const ITensorInfo *tensor_info) |
Tensor should not have backing memory. More... | |
bool | is_valid_tensor (const ITensorInfo *tensor_info) |
ITensorInfo has valid id More... | |
bool | is_invalid_tensor (const ITensorInfo *tensor_info) |
ITensorInfo has invalid id More... | |
PoolingLayerInfo | convert_pool_attr_to_pool_info (const Pool2dAttributes &pool_attr, bool mixed_precision=false, DataLayout data_layout=DataLayout::NHWC) |
Inline function to convert Pool2dAttributes to PoolingLayerInfo. More... | |
Variables | |
constexpr unsigned int | vector_size_byte_opencl = 16 |
using ComponentId = int32_t |
using GpuKernelArgumentList = std::map<ITensorInfo::Id, GpuKernelArgument> |
The argument list of a GpuKernelSourceCode.
Definition at line 47 of file GpuKernelSourceCode.h.
using GpuTarget = ::arm_compute::GPUTarget |
Gpu Information such as the Gpu target (for example, G76)
Definition at line 41 of file GpuWorkloadContext.h.
using MemoryDescriptorMap = std::map<ITensorInfo::Id, MemoryDescriptor> |
A map from ITensorInfo to their corresponding MemoryDescriptor.
Definition at line 91 of file MemoryDescriptor.h.
using OperatorId = DependencyGraph::OperatorId |
Definition at line 41 of file GpuOperatorGroup.h.
using SamplerCreator = std::function<TensorTileSampler(GpuCkwScopedKernelWriter &, int32_t , int32_t )> |
Definition at line 44 of file WriterHelper.h.
using Settings = ClComponentDepthwiseConv2dSettings |
Definition at line 43 of file ClComponentDepthwiseConv2d.cpp.
using Tag = std::string |
A tag used in a string template is a placeholder string to be substituted by real values during template instantiation.
Definition at line 127 of file GpuKernelVariableTable.h.
using TagLUT = std::unordered_map<Tag, TagVal> |
Tag lookup table.
It is used to instantiate a string template.
Definition at line 130 of file GpuKernelVariableTable.h.
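To illustrate how a tag lookup table can drive template instantiation, here is a minimal, self-contained sketch. It is not the library's implementation: the `TagVal` layout (a single `value` string), the `{{tag}}` delimiter, and the helper names `instantiate_template` and `example_instantiation` are all assumptions made for this example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the typedefs documented above.
using Tag = std::string;
struct TagVal
{
    std::string value; // assumed: the replacement text for a tag
};
using TagLUT = std::unordered_map<Tag, TagVal>;

// Replace every "{{tag}}" in tmpl with its value from lut.
// The "{{...}}" delimiter is an assumption of this sketch.
// Tags that have no entry in lut are left untouched.
std::string instantiate_template(const std::string &tmpl, const TagLUT &lut)
{
    std::string out;
    std::size_t pos = 0;
    while(pos < tmpl.size())
    {
        const std::size_t open = tmpl.find("{{", pos);
        const std::size_t close =
            (open == std::string::npos) ? std::string::npos : tmpl.find("}}", open + 2);
        if(close == std::string::npos)
        {
            // No further complete tag: copy the remainder verbatim.
            out.append(tmpl, pos, std::string::npos);
            break;
        }
        out.append(tmpl, pos, open - pos);
        const Tag  tag = tmpl.substr(open + 2, close - open - 2);
        const auto it  = lut.find(tag);
        if(it != lut.end())
        {
            out += it->second.value;
        }
        else
        {
            out.append(tmpl, open, close + 2 - open);
        }
        pos = close + 2;
    }
    return out;
}

// Usage sketch: substitute two tags in a one-line kernel snippet.
void example_instantiation()
{
    TagLUT lut;
    lut["dst"] = TagVal{ "out" };
    lut["src"] = TagVal{ "in" };
    assert(instantiate_template("{{dst}} = {{src}} + 1;", lut) == "out = in + 1;");
}
```

Leaving unknown tags in place, rather than failing, is a choice made only for this sketch; a real writer may prefer to report an unresolved tag as an error.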
using TileContainer = std::vector<std::vector<std::string> > |
Definition at line 50 of file GpuCkwDirectConv2d.cpp.
using UnitWorkloadId = int32_t |
Uniquely identifies a GpuUnitWorkload within a GpuWorkloadSourceCode.
Definition at line 75 of file GpuWorkloadSourceCode.h.
enum GpuComponentType | strong |
enum GpuLanguage | strong |
enum GpuOperatorType | strong |
Contain properties common to all operator types.
Operator type in the context of fusion
Definition at line 37 of file GpuOperatorProperties.h.
enum MemoryType | strong |
Type of memory used by a workload tensor.
We can classify tensors in 2 dimensions: Topology (where they are in a workload) and Memory allocation.
Topology:
- Argument tensors: "Outer" tensors exposed to the users as inputs and outputs (arguments)
- Intermediate tensors: "Inner" tensors hidden from the users as links between operators
Memory allocation:
- Alloc: Tensors that need to be allocated real backing memory
- No-Alloc: Tensors that don't need to be allocated real backing memory
We end up with 3 MemoryType values based on the product of these two classifications:
 | Argument | Intermediate |
---|---|---|
Alloc | User | Auxiliary |
No-Alloc | N/A | Virtual |
Enumerator | |
---|---|
User | Memory coming directly from users, e.g. for argument tensors. Both User and Auxiliary are Alloc types, since they require memory allocation. |
Auxiliary | Additional memory required by the workload tensor, e.g. for tensors holding temporary results between kernels. |
Virtual | Temporary tile which is not allocated as a whole tensor in memory. Virtual is a No-Alloc type, since it doesn't require memory allocation. It is mainly used at sketch time to link operators; there should be no Virtual tensors at runtime. |
Definition at line 53 of file MemoryDescriptor.h.
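The classification above can be sketched in a few lines. The enumerators mirror the ones listed for MemoryType; the `Topology` enum and the helpers `needs_backing_memory` and `classify` are hypothetical names introduced only for this illustration and are not part of the documented API.

```cpp
#include <cassert>

// Mirrors the enumerators listed above.
enum class MemoryType
{
    User = 0,
    Auxiliary,
    Virtual
};

// Hypothetical: the topology dimension of the classification.
enum class Topology
{
    Argument,
    Intermediate
};

// Alloc types (User, Auxiliary) need real backing memory; Virtual does not.
constexpr bool needs_backing_memory(MemoryType type)
{
    return type == MemoryType::User || type == MemoryType::Auxiliary;
}

// Map (topology, alloc) to a MemoryType following the table above.
// The Argument + No-Alloc combination is N/A and is rejected by the assert.
MemoryType classify(Topology topology, bool alloc)
{
    if(topology == Topology::Argument)
    {
        assert(alloc && "Argument tensors are always Alloc");
        return MemoryType::User;
    }
    return alloc ? MemoryType::Auxiliary : MemoryType::Virtual;
}
```

For example, `classify(Topology::Intermediate, false)` yields `MemoryType::Virtual`, for which `needs_backing_memory` returns false.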
void cl_add_buffer_argument | ( | cl::Kernel & | kernel, |
unsigned int & | idx, | ||
const cl::Buffer & | buffer | ||
) |
Add an OpenCL buffer object to the kernel's arguments at the specified index idx.
[in,out] | kernel | OpenCL kernel to configure with the provided argument. |
[in,out] | idx | Index at which to add the argument. |
[in] | buffer | OpenCL buffer containing the tensor's data. |
Definition at line 93 of file GpuCkwKernelArgumentsHelpers.cpp.
void cl_add_tensor_component_argument | ( | cl::Kernel & | kernel, |
unsigned int & | idx, | ||
const ICLTensor * | tensor, | ||
TensorComponentType | component | ||
) |
Select a Compute Kernel Writer tensor component from a tensor and add to the kernel's arguments at the specified index idx.
[in,out] | kernel | OpenCL kernel to configure with the provided argument. |
[in,out] | idx | Index at which to add the argument. |
[in] | tensor | Tensor from which to access the tensor component. |
[in] | component | Tensor component to select such as tensor dimensions, strides, etc. |
Definition at line 33 of file GpuCkwKernelArgumentsHelpers.cpp.
References ARM_COMPUTE_ERROR, ARM_COMPUTE_ERROR_ON, arm_compute::test::validation::info, and tensor.
void cl_add_texture_argument | ( | cl::Kernel & | kernel, |
unsigned int & | idx, | ||
const cl::Image & | image | ||
) |
Add an OpenCL image object to the kernel's arguments at the specified index idx.
[in,out] | kernel | OpenCL kernel to configure with the provided argument. |
[in,out] | idx | Index at which to add the argument. |
[in] | image | OpenCL image containing the image's data. |
Definition at line 98 of file GpuCkwKernelArgumentsHelpers.cpp.
References caffe_mnist_image_extractor::image.
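The three helpers above take idx as an [in,out] parameter. The sketch below shows one plausible reading of that convention: the argument is set at idx and idx is then advanced so that consecutive calls fill consecutive slots. That the index advances by exactly one per call is an assumption, and `FakeKernel`, `add_argument`, and `example_add_arguments` are hypothetical stand-ins so that the example runs without an OpenCL installation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for cl::Kernel that records arguments by index.
struct FakeKernel
{
    std::vector<std::string> args;

    void set_arg(unsigned int idx, const std::string &value)
    {
        if(idx >= args.size())
        {
            args.resize(idx + 1);
        }
        args[idx] = value;
    }
};

// Sketch of the [in,out] idx convention: set the argument at idx, then
// advance idx so the next call targets the next slot (assumed behaviour).
void add_argument(FakeKernel &kernel, unsigned int &idx, const std::string &value)
{
    kernel.set_arg(idx, value);
    ++idx;
}

// Usage sketch: two consecutive calls fill slots 0 and 1.
void example_add_arguments()
{
    FakeKernel   kernel;
    unsigned int idx = 0;
    add_argument(kernel, idx, "buffer");
    add_argument(kernel, idx, "image");
    assert(idx == 2);
    assert(kernel.args[0] == "buffer" && kernel.args[1] == "image");
}
```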
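PoolingLayerInfo convert_pool_attr_to_pool_info (const Pool2dAttributes &pool_attr, bool mixed_precision=false, DataLayout data_layout=DataLayout::NHWC) | inline |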
Inline function to convert Pool2dAttributes to PoolingLayerInfo.
Definition at line 66 of file Utils.h.
References arm_compute::cpu::data_layout, Pool2dAttributes::exclude_padding(), arm_compute::FLOOR, Padding2D::left, Pool2dAttributes::pad(), Pool2dAttributes::pool_size(), Pool2dAttributes::pool_type(), Pool2dAttributes::stride(), Padding2D::top, Size2D::x(), and Size2D::y().
Referenced by ClComponentPool2d::validate().
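TensorTileSampler create_boundary_aware_2d_sampler (GpuCkwScopedKernelWriter writer, TileOperand &gid_0, TileOperand &gid_1, int32_t dim0_v, int32_t dim1_v, int32_t n0_v, int32_t m0_v, const std::string prefix, TileOperand &const_0) | inline |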
Declare coordinate tiles "{prefix}_dim0_coord" and "{prefix}_dim1_coord", and create a boundary-aware sampler from a tile of size [n0, m0], against the overall dimensions [dim0, dim1]. The load and store of tile [n0, m0] will never be out of bounds of [dim0, dim1].
[in,out] | writer | Writer |
[in] | gid_0 | Global work item id 0 |
[in] | gid_1 | Global work item id 1 |
[in] | dim0_v | Dimension 0 |
[in] | dim1_v | Dimension 1 |
[in] | n0_v | Tile size dimension 0 |
[in] | m0_v | Tile size dimension 1 |
[in] | prefix | Prefix to all the tiles declared within this function |
[in] | const_0 | Constant tile of value 0 |
Definition at line 137 of file WriterHelper.h.
References arm_compute::utility::clamp(), get_coord(), tf_frozen_model_extractor::None, and check_header_guards::prefix.
Referenced by GpuCkwElementwiseBinary::write_component_code().
|
inline |
Definition at line 94 of file Common.h.
References ARM_COMPUTE_ERROR.
Referenced by GpuCkwDriver::get_kernel_arguments().
|
inline |
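void get_coord (GpuCkwScopedKernelWriter writer, TileOperand &coord, const TileOperand &gid, int32_t step_v, int32_t leftover_step_v, const std::string &prefix, const TileOperand &const_0) | inline |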
Get boundary aware coordinate along one axis.
A load or store of size step_v at the coordinate will not go out of bounds.
[in,out] | writer | Writer |
[out] | coord | Resultant coordinate |
[in] | gid | Global work item id |
[in] | step_v | Step size / vector size |
[in] | leftover_step_v | Leftover step size at the boundary |
[in] | prefix | Prefix to all the tiles declared within this function |
[in] | const_0 | Constant tile of value 0 |
Definition at line 87 of file WriterHelper.h.
References check_header_guards::prefix, and arm_compute::cpu::step.
Referenced by create_boundary_aware_2d_sampler(), GpuCkwPool2d::write_component_code(), GpuCkwDepthwiseConv2d::write_component_code(), GpuCkwMatMul::write_component_code(), and GpuCkwDirectConv2d::write_component_code().
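As a rough host-side illustration of what a boundary-aware coordinate means, the sketch below uses a common shifting scheme: every work item after the first is moved back by `(step_v - leftover_step_v) % step_v`, and the result is clamped at zero. This is an assumption about the arithmetic, not a transcription of the code that get_coord emits. It further assumes that `leftover_step_v` equals the dimension size modulo `step_v` and that the dimension is at least `step_v` elements long; `boundary_aware_coord` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical host-side model of a boundary-aware coordinate along one axis.
// Assumes leftover_step_v == dim % step_v and dim >= step_v.
int32_t boundary_aware_coord(int32_t gid, int32_t step_v, int32_t leftover_step_v)
{
    assert(step_v > 0 && leftover_step_v >= 0 && leftover_step_v < step_v);
    // Shift work items back so the last one ends exactly at the boundary.
    const int32_t shift = (step_v - leftover_step_v) % step_v;
    // The first work item would go negative, so clamp it to 0.
    return std::max(gid * step_v - shift, int32_t{ 0 });
}
```

With a dimension of 10 and a step of 4 (leftover 2), work items 0, 1 and 2 start at 0, 2 and 6, so each access of 4 elements stays inside [0, 10). The first two accesses overlap, which trades some redundant work for the absence of bounds checks.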
bool is_alloc_tensor (const ITensorInfo *tensor_info) | inline |
Tensor should have backing memory.
Definition at line 38 of file Utils.h.
References ITensorInfo::invalid_tensor_id, and tensor_info.
Referenced by GpuOutput::validate_op().
bool is_invalid_tensor (const ITensorInfo *tensor_info) | inline |
ITensorInfo has invalid id
Definition at line 59 of file Utils.h.
References is_valid_tensor(), and tensor_info.
bool is_noalloc_tensor (const ITensorInfo *tensor_info) | inline |
Tensor should not have backing memory.
Definition at line 45 of file Utils.h.
References ITensorInfo::invalid_tensor_id, and tensor_info.
bool is_valid_tensor (const ITensorInfo *tensor_info) | inline |
ITensorInfo has valid id
Definition at line 52 of file Utils.h.
References tensor_info.
Referenced by is_invalid_tensor().
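The four predicates above all depend on the tensor id and, for the alloc and no-alloc checks, on ITensorInfo::invalid_tensor_id. The sketch below shows one way such predicates could be related. It assumes, without confirmation from this page, that ids are signed integers, that zero is the invalid sentinel, that positive ids denote tensors with backing memory, and that negative ids denote tensors without it. `TensorInfoStub` is a hypothetical stand-in for ITensorInfo.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ITensorInfo; only the id matters here.
struct TensorInfoStub
{
    using Id = int32_t;
    static constexpr Id invalid_tensor_id = 0; // assumed sentinel value
    Id id;
};

// Assumed convention: positive ids have backing memory.
inline bool is_alloc_tensor(const TensorInfoStub *tensor_info)
{
    return tensor_info->id > TensorInfoStub::invalid_tensor_id;
}

// Assumed convention: negative ids have no backing memory.
inline bool is_noalloc_tensor(const TensorInfoStub *tensor_info)
{
    return tensor_info->id < TensorInfoStub::invalid_tensor_id;
}

// A tensor is valid when its id differs from the sentinel.
inline bool is_valid_tensor(const TensorInfoStub *tensor_info)
{
    return tensor_info->id != TensorInfoStub::invalid_tensor_id;
}

// Defined in terms of is_valid_tensor, matching the reference listed above.
inline bool is_invalid_tensor(const TensorInfoStub *tensor_info)
{
    return !is_valid_tensor(tensor_info);
}
```

Under these assumptions every valid tensor is exactly one of alloc or no-alloc, and an invalid tensor is neither.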
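void load_src_dst_tiles_and_prepare_sampler (GpuCkwScopedKernelWriter &writer, GpuCkwComponentArgument *src, GpuCkwComponentArgument *dst, int32_t m0, int32_t n0, SamplerCreator create_sampler) | inline |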
Load src and dst tiles of dimension [m0, n0] only when not loaded and prepare the sampler.
Definition at line 48 of file WriterHelper.h.
References arm_compute::test::validation::dst, GpuCkwKernelWriter::op_load_once(), arm_compute::test::validation::src, and arm_compute::test::validation::reference::tile().
Referenced by GpuCkwActivation::write_component_code().
bool operator== | ( | const GpuKernelArgumentInfo & | info0, |
const GpuKernelArgumentInfo & | info1 | ||
) |
Definition at line 31 of file GpuKernelArgument.cpp.
References GpuKernelArgumentInfo::type.
bool operator== (const KernelProperties &config0, const KernelProperties &config1) | inline |
Definition at line 56 of file IGpuKernelComponent.h.
References KernelProperties::stage().
bool operator== (const UnitWorkloadStage &stage0, const UnitWorkloadStage &stage1) | inline |
Definition at line 193 of file GpuWorkloadSourceCode.h.
References UnitWorkloadStage::stage.
ckw::BinaryOp to_ckw (const ElementwiseBinaryCommonAttributes &attributes) | inline |
Definition at line 37 of file ElementwiseBinary.h.
References ElementwiseBinaryCommonAttributes::Add, ARM_COMPUTE_ERROR, ElementwiseBinaryCommonAttributes::Div, ElementwiseBinaryCommonAttributes::Max, ElementwiseBinaryCommonAttributes::Min, ElementwiseBinaryCommonAttributes::Mul, ElementwiseBinaryCommonAttributes::operation(), ElementwiseBinaryCommonAttributes::Power, ElementwiseBinaryCommonAttributes::Prelu, ElementwiseBinaryCommonAttributes::SquaredDiff, and ElementwiseBinaryCommonAttributes::Sub.
|
inline |
ckw::TensorShape to_ckw (const TensorShape &shape) | inline |
NOTE: Overflow danger. Use size_t?
Definition at line 67 of file Common.h.
References ARM_COMPUTE_ERROR_ON, and arm_compute::test::validation::shape.
|
inline |
|
inline |
ckw::DataType to_ckw (DataType dt) | inline |
Definition at line 40 of file Common.h.
References dt, arm_compute::F16, arm_compute::F32, arm_compute::QASYMM8, arm_compute::QASYMM8_SIGNED, arm_compute::S16, arm_compute::S32, arm_compute::S8, arm_compute::U16, arm_compute::U32, and arm_compute::U8.
Referenced by GpuCkwVariableTable::declare_variable(), to_ckw(), GpuCkwElementwiseBinary::write_component_code(), GpuCkwCast::write_component_code(), GpuCkwPool2d::write_component_code(), GpuCkwDepthwiseConv2d::write_component_code(), GpuCkwDirectConv2d::write_component_code(), and GpuCkwMatMul::write_component_code().
constexpr unsigned int vector_size_byte_opencl = 16 | constexpr |
Definition at line 41 of file ClTemplateElementwiseBinary.cpp.
Referenced by ClElementWiseUnaryKernel::configure(), CLRangeKernel::configure(), GpuCkwActivation::get_window(), GpuCkwElementwiseBinary::get_window(), GpuCkwCast::get_window(), and ClTemplateReshape::get_window().