Compute Library
 21.02
CLWinogradConvolutionLayer Class Reference

Basic function to execute Winograd-based convolution on OpenCL. More...

#include <CLWinogradConvolutionLayer.h>

Collaboration diagram for CLWinogradConvolutionLayer (diagram not shown).

Public Member Functions

 CLWinogradConvolutionLayer (std::shared_ptr< IMemoryManager > memory_manager=nullptr)
 Default constructor. More...
 
 CLWinogradConvolutionLayer (const CLWinogradConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
 CLWinogradConvolutionLayer (CLWinogradConvolutionLayer &&)=default
 Default move constructor. More...
 
CLWinogradConvolutionLayer & operator= (const CLWinogradConvolutionLayer &)=delete
 Prevent instances of this class from being copied (As this class contains pointers) More...
 
CLWinogradConvolutionLayer & operator= (CLWinogradConvolutionLayer &&)=default
 Default move assignment operator. More...
 
 ~CLWinogradConvolutionLayer ()
 Default destructor. More...
 
void configure (ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Set the input and output tensors. More...
 
void configure (const CLCompileContext &compile_context, ICLTensor *input, const ICLTensor *weights, const ICLTensor *biases, ICLTensor *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Set the input and output tensors. More...
 
void run () override
 Run the kernels contained in the function. More...
 
void prepare () override
 Prepare the function for executing. More...
 
- Public Member Functions inherited from IFunction
virtual ~IFunction ()=default
 Destructor. More...
 

Static Public Member Functions

static Status validate (const ITensorInfo *input, const ITensorInfo *weights, const ITensorInfo *biases, const ITensorInfo *output, const PadStrideInfo &conv_info, const ActivationLayerInfo &act_info=ActivationLayerInfo(), bool enable_fast_math=false)
 Static function to check if given info will lead to a valid configuration of CLWinogradConvolutionLayer. More...
 

Detailed Description

Basic function to execute Winograd-based convolution on OpenCL.

This function calls the following OpenCL functions/kernels:

  1. CLWinogradInputTransform
  2. CLWinogradFilterTransformKernel (only once)
  3. CLGEMM
  4. CLWinogradOutputTransformKernel

Definition at line 48 of file CLWinogradConvolutionLayer.h.

Constructor & Destructor Documentation

◆ CLWinogradConvolutionLayer() [1/3]

CLWinogradConvolutionLayer ( std::shared_ptr< IMemoryManager > memory_manager = nullptr )

Default constructor.

Definition at line 100 of file CLWinogradConvolutionLayer.cpp.

    : _memory_group(memory_manager), _batched_mm(memory_manager), _input_transform(), _filter_transform(std::make_unique<CLWinogradFilterTransformKernel>()),
      _output_transform(std::make_unique<CLWinogradOutputTransformKernel>()), _input0(), _input1(), _batched_mm_output(), _original_weights(nullptr), _is_prepared(false)
{
}

◆ CLWinogradConvolutionLayer() [2/3]

Prevent instances of this class from being copied (As this class contains pointers)

◆ CLWinogradConvolutionLayer() [3/3]

Default move constructor.

◆ ~CLWinogradConvolutionLayer()

Default destructor.

Member Function Documentation

◆ configure() [1/2]

void configure ( ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool enable_fast_math = false
)

Set the input and output tensors.

Note
This function works only with 3x3, 3x1, 1x3, 5x5, 5x1, 1x5, 7x1 and 1x7 kernels, with unit strides, for both NCHW and NHWC data layouts.
Some Winograd configurations (e.g. F(4x4, 5x5)) are supported only with enable_fast_math = true.
Parameters
[in]  input  Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]  weights  Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in]  biases  Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[out]  output  Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info  Contains padding and stride information, as described in PadStrideInfo.
[in]  act_info  (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which can introduce a drop in accuracy. Default is false.

Definition at line 108 of file CLWinogradConvolutionLayer.cpp.

References CLKernelLibrary::get().

{
    configure(CLKernelLibrary::get().get_compile_context(), input, weights, biases, output, conv_info, act_info, enable_fast_math);
}

◆ configure() [2/2]

void configure ( const CLCompileContext & compile_context,
ICLTensor * input,
const ICLTensor * weights,
const ICLTensor * biases,
ICLTensor * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool enable_fast_math = false
)

Set the input and output tensors.

Note
This function works only with 3x3, 3x1, 1x3, 5x5, 5x1, 1x5, 7x1 and 1x7 kernels, with unit strides, for both NCHW and NHWC data layouts.
Some Winograd configurations (e.g. F(4x4, 5x5)) are supported only with enable_fast_math = true.
Parameters
[in]  compile_context  The compile context to be used.
[in]  input  Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]  weights  Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in]  biases  Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[out]  output  Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info  Contains padding and stride information, as described in PadStrideInfo.
[in]  act_info  (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which can introduce a drop in accuracy. Default is false.

Definition at line 114 of file CLWinogradConvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_ERROR_ON_MSG, CLWinogradInputTransform::configure(), CLGEMM::configure(), ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, ITensor::info(), MemoryGroup::manage(), ITensorInfo::tensor_shape(), arm_compute::WIDTH, and arm_compute::test::validation::winograd_info.

{
    // Get indices for the width and height
    const size_t idx_width  = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::WIDTH);
    const size_t idx_height = get_data_layout_dimension_index(input->info()->data_layout(), DataLayoutDimension::HEIGHT);

    // Input shape, kernel size and output tile
    const Size2D input_dims  = Size2D(input->info()->tensor_shape()[idx_width], input->info()->tensor_shape()[idx_height]);
    const Size2D kernel_size = Size2D(weights->info()->tensor_shape()[idx_width], weights->info()->tensor_shape()[idx_height]);
    const Size2D output_tile = winograd_output_tile(input_dims, kernel_size, input->info()->data_layout());

    // Check if the Winograd configuration requires fast math
    if(!enable_fast_math)
    {
        ARM_COMPUTE_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::F32); // disable Winograd for fp16 if fast math is false.
        ARM_COMPUTE_ERROR_ON_MSG(check_support_fast_math(output_tile, kernel_size), "This Winograd configuration requires enable_fast_math=true");
    }

    const WinogradInfo winograd_info = WinogradInfo(output_tile,
                                                    kernel_size,
                                                    input_dims,
                                                    conv_info,
                                                    input->info()->data_layout());

    _is_prepared      = false;
    _original_weights = weights;

    // Manage intermediate tensors
    _memory_group.manage(&_input0);
    _memory_group.manage(&_batched_mm_output);

    // Do not manage _input1 as it contains the weights

    // Configure input transform
    _input_transform.configure(compile_context, input, &_input0, winograd_info);

    // Configure filter transform
    _filter_transform->configure(compile_context, weights, &_input1, winograd_info);

    // Configure batched matrix multiply
    _batched_mm.configure(compile_context, &_input0, &_input1, nullptr, &_batched_mm_output, 1.0f, 0.0f,
                          GEMMInfo(false, false, true /* Reshape weights only for the first run */, 0, false, false,
                                   GEMMLowpOutputStageInfo(), (input->info()->data_type() == DataType::F16)));

    // Configure output transform
    _output_transform->configure(compile_context, &_batched_mm_output, biases, output, winograd_info, act_info);

    // Allocate temporary tensors
    _input0.allocator()->allocate();
    _batched_mm_output.allocator()->allocate();
}

◆ operator=() [1/2]

CLWinogradConvolutionLayer & operator= ( const CLWinogradConvolutionLayer & )
delete

Prevent instances of this class from being copied (As this class contains pointers)

◆ operator=() [2/2]

Default move assignment operator.

◆ prepare()

void prepare ( )
overridevirtual

Prepare the function for executing.

Any one-off pre-processing step required by the function is handled here.

Note
Prepare stage might not need all the function's buffers' backing memory to be available in order to execute

Reimplemented from IFunction.

Definition at line 234 of file CLWinogradConvolutionLayer.cpp.

References CLTensorAllocator::allocate(), CLTensor::allocator(), CLScheduler::enqueue(), CLTensorAllocator::free(), CLScheduler::get(), ITensor::is_used(), ITensor::mark_as_unused(), CLGEMM::prepare(), and CLScheduler::queue().

Referenced by CLWinogradConvolutionLayer::run().

{
    if(!_is_prepared)
    {
        // Run filter transform and mark original weights as unused
        _input1.allocator()->allocate();
        CLScheduler::get().enqueue(*_filter_transform, false);
        _original_weights->mark_as_unused();

        // Prepare GEMM and release reshaped weights if marked unused by CLGEMM
        _batched_mm.prepare();
        if(!_input1.is_used())
        {
            _input1.allocator()->free();
        }

        CLScheduler::get().queue().finish();
        _is_prepared = true;
    }
}

◆ run()

void run ( )
overridevirtual

Run the kernels contained in the function.

For Neon kernels:

  • Multi-threading is used for the kernels which are parallelisable.
  • By default std::thread::hardware_concurrency() threads are used.
Note
CPPScheduler::set_num_threads() can be used to manually set the number of threads

For OpenCL kernels:

  • All the kernels are enqueued on the queue associated with CLScheduler.
  • The queue is then flushed.
Note
The function will not block until the kernels are executed. It is the user's responsibility to wait.
prepare() is called on the first run if it has not already been done.

Implements IFunction.

Definition at line 218 of file CLWinogradConvolutionLayer.cpp.

References CLScheduler::enqueue(), CLScheduler::get(), CLWinogradConvolutionLayer::prepare(), ICLSimpleFunction::run(), and CLGEMM::run().

{
    prepare();

    MemoryGroupResourceScope scope_mg(_memory_group);

    // Run input transform
    _input_transform.run();

    // Run batched matrix multiplication
    _batched_mm.run();

    // Run output transform
    CLScheduler::get().enqueue(*_output_transform);
}

◆ validate()

Status validate ( const ITensorInfo * input,
const ITensorInfo * weights,
const ITensorInfo * biases,
const ITensorInfo * output,
const PadStrideInfo & conv_info,
const ActivationLayerInfo & act_info = ActivationLayerInfo(),
bool enable_fast_math = false
)
static

Static function to check if given info will lead to a valid configuration of CLWinogradConvolutionLayer.

Note
This function works only with 3x3, 3x1, 1x3, 5x5, 5x1 and 1x5 kernels, with unit strides, for both NCHW and NHWC data layouts.
Some Winograd configurations (e.g. F(4x4, 5x5)) are supported only with enable_fast_math = true.
Parameters
[in]  input  Source tensor. The 3 lower dimensions represent a single input [width, height, IFM], while every optional dimension from 4 and above represents a batch of inputs. Data types supported: F16/F32.
[in]  weights  Weights tensor. Weights are a 4D tensor with dimensions [kernel_x, kernel_y, IFM, OFM]. Data type supported: Same as input.
[in]  biases  Biases tensor. Shared biases supported. Biases are a 1D tensor with dimensions [OFM]. Data type supported: Same as input.
[out]  output  Destination tensor. The 3 lower dimensions represent a single output [width, height, OFM], while the rest represent a batch of outputs. Data types supported: Same as input.
[in]  conv_info  Contains padding and stride information, as described in PadStrideInfo.
[in]  act_info  (Optional) Activation layer information in case of a fused activation.
[in]  enable_fast_math  (Optional) Enable fast math computation. If this flag is set, the function may dispatch the fastest implementation available, which can introduce a drop in accuracy. Default is false.
Returns
a status

Definition at line 167 of file CLWinogradConvolutionLayer.cpp.

References ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN, ARM_COMPUTE_RETURN_ERROR_ON_MSG, ARM_COMPUTE_RETURN_ON_ERROR, ICloneable< T >::clone(), TensorInfo::clone(), arm_compute::misc::shape_calculator::compute_winograd_filter_transform_shape(), arm_compute::misc::shape_calculator::compute_winograd_input_transform_shape(), ITensorInfo::data_layout(), ITensorInfo::data_type(), arm_compute::F16, arm_compute::F32, arm_compute::get_data_layout_dimension_index(), arm_compute::HEIGHT, arm_compute::test::validation::idx_height, arm_compute::test::validation::idx_width, PadStrideInfo::pad_bottom(), PadStrideInfo::pad_left(), PadStrideInfo::pad_right(), PadStrideInfo::pad_top(), ITensorInfo::tensor_shape(), TensorInfo::tensor_shape(), CLWinogradInputTransform::validate(), CLWinogradFilterTransformKernel::validate(), CLWinogradOutputTransformKernel::validate(), CLGEMM::validate(), arm_compute::WIDTH, and arm_compute::test::validation::winograd_info.

Referenced by CLConvolutionLayer::get_convolution_method(), and CLConvolutionLayer::validate().

{
    // Get indices for the width and height
    const size_t idx_width  = get_data_layout_dimension_index(input->data_layout(), DataLayoutDimension::WIDTH);
    const size_t idx_height = get_data_layout_dimension_index(input->data_layout(), DataLayoutDimension::HEIGHT);

    // Input shape, kernel size and output tile
    const Size2D input_dims  = Size2D(input->tensor_shape()[idx_width], input->tensor_shape()[idx_height]);
    const Size2D kernel_size = Size2D(weights->tensor_shape()[idx_width], weights->tensor_shape()[idx_height]);
    const Size2D output_tile = winograd_output_tile(input_dims, kernel_size, input->data_layout());

    ARM_COMPUTE_RETURN_ERROR_ON_MSG(((conv_info.pad_left() > (kernel_size.x() / 2u)) || (conv_info.pad_right() > (kernel_size.x() / 2u))), "Winograd only supports padding up to half kernel size");
    ARM_COMPUTE_RETURN_ERROR_ON_MSG(((conv_info.pad_top() > (kernel_size.y() / 2u)) || (conv_info.pad_bottom() > (kernel_size.y() / 2u))), "Winograd only supports padding up to half kernel size");

    // Check if the Winograd configuration requires fast math
    if(!enable_fast_math)
    {
        ARM_COMPUTE_RETURN_ERROR_ON_DATA_TYPE_CHANNEL_NOT_IN(input, 1, DataType::F32); // disable Winograd for fp16 if fast math is false.
        ARM_COMPUTE_RETURN_ERROR_ON_MSG(check_support_fast_math(output_tile, kernel_size), "This Winograd configuration requires enable_fast_math=true");
    }

    const WinogradInfo winograd_info = WinogradInfo(output_tile,
                                                    kernel_size,
                                                    input_dims,
                                                    conv_info,
                                                    input->data_layout());

    // Validate input transform
    const TensorShape input0_shape = misc::shape_calculator::compute_winograd_input_transform_shape(*input, winograd_info);
    const TensorInfo  input0       = input->clone()->set_tensor_shape(input0_shape);
    ARM_COMPUTE_RETURN_ON_ERROR(CLWinogradInputTransform::validate(input, &input0, winograd_info));

    // Validate filter transform
    const TensorShape input1_shape = misc::shape_calculator::compute_winograd_filter_transform_shape(*weights, winograd_info);
    const TensorInfo  input1       = weights->clone()->set_tensor_shape(input1_shape);
    ARM_COMPUTE_RETURN_ON_ERROR(CLWinogradFilterTransformKernel::validate(weights, &input1, winograd_info));

    // Validate batched matrix multiply
    TensorShape batched_mm_output_shape = input0.tensor_shape();
    batched_mm_output_shape[0]          = input1.tensor_shape()[0];
    const TensorInfo batched_mm_output  = input0.clone()->set_tensor_shape(batched_mm_output_shape);
    ARM_COMPUTE_RETURN_ON_ERROR(CLGEMM::validate(&input0, &input1, nullptr, &batched_mm_output, 1.0f, 0.0f,
                                                 GEMMInfo(false, false, true /* Reshape weights only for the first run */, 0, false, false,
                                                          GEMMLowpOutputStageInfo(), (input->data_type() == DataType::F16))));

    // Validate output transform
    ARM_COMPUTE_RETURN_ON_ERROR(CLWinogradOutputTransformKernel::validate(&batched_mm_output, biases, output, winograd_info, act_info));

    return Status{};
}

The documentation for this class was generated from the following files:

CLWinogradConvolutionLayer.h
CLWinogradConvolutionLayer.cpp