Compute Library 24.02.1
Excerpt from the ClIndirectConv2d source listing (non-contiguous fragment; elisions preserved):

```cpp
DirectConvComputeKernelInfo
config_indirect_convolution_nhwc(const ITensorInfo *src, const ITensorInfo *weights, const PadStrideInfo &conv_info)
// ...
auto k0 = std::make_unique<kernels::ClIndirectConv2dAddressPrecalculationKernel>();
auto k1 = std::make_unique<kernels::ClIndirectConv2dKernel>();
// ...
k0->configure(compile_context, src, weights, &_indirect_buffer, conv_info, desc);
// ...
_addr_precalculation_kernel = std::move(k0);
_indirect_conv_kernel       = std::move(k1);
// ...
_aux_mem[IndirectBuffer] =
```
Referenced symbols:

- std::vector<MemoryInfo> MemoryRequirements
- #define ARM_COMPUTE_LOG_INFO_WITH_FUNCNAME_ACL(msg)
- SimpleTensor<float> src
- virtual const TensorShape &tensor_shape() const = 0: Size for each dimension of the tensor.
- void run(ITensorPack &tensors) override: Run the kernels contained in the function.
- Interface for OpenCL tensor.
- virtual const cl::Buffer &cl_buffer() const = 0: Interface to be implemented by the child class to return a reference to the OpenCL buffer containing ...
- TensorShape compute_indirect_buffer_shape(const TensorShape &input_shape, DataLayout input_data_layout, const TensorShape &weights_shape, const PadStrideInfo &conv_info, const DirectConvComputeKernelInfo &desc): Calculate the indirect buffer output shape used by the indirect convolution function.
- ITensor *get_tensor(int id): Get tensor of a given id from the pack.
- static std::unique_ptr<IClIndirectConvKernelConfig> create(GPUTarget gpu): Static method to create the ClIndirectConvolution kernel configuration class according to the GPU target.
- static Status validate(const ITensorInfo *src, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *dst, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo()): Static function to check if given info will lead to a valid configuration.
- void add_const_tensor(int id, const ITensor *tensor): Add const tensor to the pack.
- #define ARM_COMPUTE_RETURN_ON_ERROR(status): Checks if a status contains an error and returns it.
- ActivationLayerInfo: Activation Layer Information class.
- #define ARM_COMPUTE_ERROR_ON_NULLPTR(...)
- #define ARM_COMPUTE_ERROR_ON(cond): If the condition is true then an error message is printed and an exception thrown.
- void prepare(ITensorPack &constants) override: Prepare the function for executing.
- experimental::MemoryRequirements workspace() const override: Return the memory requirements required by the workspace.
- Interface to enqueue OpenCL kernels and get/set the OpenCL CommandQueue and ICLTuner.
- void tune_kernel_static(ICLKernel &kernel): Tunes OpenCL kernel.
- Tensor handler to wrap and handle tensor allocations on workspace buffers.
- static CLScheduler &get(): Access the scheduler singleton.
- GPUTarget target() const: Get the target GPU.
- GPUTarget: Available GPU Targets.
- void configure(const CLCompileContext &compile_context, ITensorInfo *src, ITensorInfo *weights, ITensorInfo *biases, ITensorInfo *dst, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo()): Initialise the kernel's inputs and output.
- Store the tensor's metadata.
- int offset_int_vec(int offset)
- Copyright (c) 2017-2024 Arm Limited.
- S32: signed 32-bit number.
- void enqueue_op(ICLKernel &kernel, ITensorPack &tensors, bool flush=true): Schedule the execution of the passed kernel if possible.
- Status validate(const ITensorInfo *scores_in, const ITensorInfo *boxes_in, const ITensorInfo *batch_splits_in, const ITensorInfo *scores_out, const ITensorInfo *boxes_out, const ITensorInfo *classes, const ITensorInfo *batch_splits_out, const ITensorInfo *keeps, const ITensorInfo *keeps_size, const BoxNMSLimitInfo info)
- #define ARM_COMPUTE_LOG_PARAMS(...)
- DirectConvComputeKernelInfo: Compute descriptor used by the direct convolution kernel.