Layout transitions

Vulkan requires the application to manage image layouts, so that all render pass attachments are in the correct layout when the render pass begins. This is usually done using pipeline barriers or the initialLayout and finalLayout parameters of the render pass.

If the rendering pipeline is complex, transitioning each image to its correct layout is not trivial, as it requires some form of state tracking. If the previous image contents are not needed, there is an easy way out: setting oldLayout/initialLayout to VK_IMAGE_LAYOUT_UNDEFINED. While this is functionally correct, it can have performance implications, as it may prevent the GPU from performing some optimizations.

This tutorial covers an example of one such optimization and shows how to avoid the performance overhead of using sub-optimal layouts.
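The state tracking mentioned above can be quite small. The following is a minimal sketch, not the sample's actual code: it remembers the last layout each image was transitioned to, so that a barrier can use it as oldLayout instead of VK_IMAGE_LAYOUT_UNDEFINED. The names Layout, ImageId, Transition and LayoutTracker are hypothetical stand-ins so that the sketch builds without the Vulkan SDK; a real application would use VkImageLayout and VkImage and fill in a VkImageMemoryBarrier.

```cpp
#include <cstdint>
#include <unordered_map>

// Stand-in for VkImageLayout (assumption: used only so the sketch builds
// without the Vulkan headers). The numeric values match the Vulkan enum.
enum class Layout : std::uint32_t {
    Undefined              = 0, // VK_IMAGE_LAYOUT_UNDEFINED
    ColorAttachmentOptimal = 2, // VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
    ShaderReadOnlyOptimal  = 5, // VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
};

// Hypothetical image handle; a real application would key on VkImage.
using ImageId = std::uint64_t;

// The two layout fields that a VkImageMemoryBarrier would carry.
struct Transition {
    Layout old_layout;
    Layout new_layout;
};

// Hypothetical helper that records the last known layout of each image.
class LayoutTracker {
public:
    // Returns the transition to record and remembers new_layout for the
    // next call. An image that has never been seen starts as Undefined.
    Transition transition(ImageId image, Layout new_layout) {
        auto it = last_known_.find(image);
        const Layout old_layout =
            (it == last_known_.end()) ? Layout::Undefined : it->second;
        last_known_[image] = new_layout;
        return Transition{old_layout, new_layout};
    }

private:
    std::unordered_map<ImageId, Layout> last_known_;
};
```

With a tracker like this, only the first transition of an image starts from UNDEFINED; every later barrier names the layout the image was actually left in.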

Transaction elimination

Mali GPUs feature transaction elimination, a technology that avoids framebuffer write bandwidth for static regions of the framebuffer. This is especially beneficial for games that contain many static opaque overlays.

Transaction elimination is used for an image under the following conditions:

The driver keeps a signature buffer for the image to check for redundant frame buffer writes. The signature buffer must always be in sync with the actual contents of the image, which is the case when an image is only used within the tile write path. In practice, this corresponds to only using layouts that are either read-only or can only be written to by fragment shading. These “safe” layouts are:

All other layouts, including UNDEFINED layout, are considered “unsafe” as they allow writes to an image outside the tile write path. When an image is transitioned via an “unsafe” layout, the signature buffer must be invalidated to prevent the signature and the data from becoming desynchronized. Note that the swapchain image is a slightly special case, as it is considered “safe” even when transitioned from UNDEFINED.

In addition, signature invalidation can happen as part of a VkImageMemoryBarrier, vkCmdPipelineBarrier(), or vkCmdWaitEvents(), or as part of a VkRenderPass if the color attachment reference layout differs from the final layout. A framebuffer blit with vkCmdBlitImage(), which is a transfer stage operation, also always invalidates the signature buffer, so shader-based blits are likely to be more efficient.
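The render pass condition above is easy to check when a render pass is created. The following is a minimal sketch under stated assumptions: AttachmentDescription and AttachmentReference are hypothetical stand-ins that mirror only the relevant fields of VkAttachmentDescription and VkAttachmentReference, and mismatched_color_attachments is an illustrative helper, not part of Vulkan or of the sample. It reports which color attachments have a reference layout that differs from their final layout and may therefore have their signature invalidated.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for VkImageLayout (assumption: used only so the sketch builds
// without the Vulkan headers). The numeric values match the Vulkan enum.
enum class Layout : std::uint32_t {
    Undefined              = 0, // VK_IMAGE_LAYOUT_UNDEFINED
    ColorAttachmentOptimal = 2, // VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
    ShaderReadOnlyOptimal  = 5, // VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
};

// Mirrors the one field of VkAttachmentDescription that is used here.
struct AttachmentDescription {
    Layout final_layout;
};

// Mirrors VkAttachmentReference: an index into the attachment array plus
// the layout the attachment uses during the subpass.
struct AttachmentReference {
    std::uint32_t attachment;
    Layout layout;
};

// Returns the indices of color attachments whose subpass (reference) layout
// differs from their final layout.
std::vector<std::uint32_t> mismatched_color_attachments(
    const std::vector<AttachmentDescription>& attachments,
    const std::vector<AttachmentReference>& color_refs)
{
    std::vector<std::uint32_t> mismatched;
    for (const AttachmentReference& ref : color_refs) {
        // Skip out-of-range indices; this also covers VK_ATTACHMENT_UNUSED,
        // which is the largest 32-bit value.
        if (ref.attachment >= attachments.size()) {
            continue;
        }
        if (attachments[ref.attachment].final_layout != ref.layout) {
            mismatched.push_back(ref.attachment);
        }
    }
    return mismatched;
}
```

A check like this is only a hint: whether a given mismatch actually costs bandwidth depends on the driver and on the conditions described in this section.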

The sample

The sample sets up deferred rendering using two render passes, to show the effect of transitioning G-buffer images from UNDEFINED rather than their last known layout.

Note that a deferred rendering implementation using subpasses might be more efficient overall; see the subpasses tutorial for more detail.

The base case is with all color images being transitioned from UNDEFINED, as shown in the image below.

Figure: Undefined layout transitions

When we switch to using the last known layout as oldLayout in the pipeline barriers, transaction elimination can take place. This is highlighted in the counters, which show about double the number of tiles killed by CRC match, along with a ~10% reduction in write bandwidth.

Figure: Last layout transitions
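The difference between the two cases comes down to a single oldLayout value. The following is a minimal sketch, assuming a geometry pass that writes the G-buffer as a color attachment and a lighting pass that samples it; the sample's real barriers may differ in detail. Layout, Transition and gbuffer_transitions are hypothetical names, and Layout stands in for VkImageLayout so that the sketch builds without the Vulkan SDK.

```cpp
#include <array>
#include <cstdint>

// Stand-in for VkImageLayout (assumption: used only so the sketch builds
// without the Vulkan headers). The numeric values match the Vulkan enum.
enum class Layout : std::uint32_t {
    Undefined              = 0, // VK_IMAGE_LAYOUT_UNDEFINED
    ColorAttachmentOptimal = 2, // VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
    ShaderReadOnlyOptimal  = 5, // VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
};

// The two layout fields that a VkImageMemoryBarrier would carry.
struct Transition {
    Layout old_layout;
    Layout new_layout;
};

// Layout transitions for one G-buffer image in one frame (after the first):
// [0] before the geometry pass, [1] before the lighting pass.
std::array<Transition, 2> gbuffer_transitions(bool use_last_known_layout) {
    // The previous frame left the image in the layout the lighting pass
    // sampled it in. The base case ignores that and starts from Undefined.
    const Layout before_geometry = use_last_known_layout
        ? Layout::ShaderReadOnlyOptimal
        : Layout::Undefined;
    return {{
        {before_geometry, Layout::ColorAttachmentOptimal},
        // The lighting pass needs the G-buffer contents, so this transition
        // names the real previous layout in both cases.
        {Layout::ColorAttachmentOptimal, Layout::ShaderReadOnlyOptimal},
    }};
}
```

On the very first frame the image really is in the UNDEFINED layout, so that one transition has to start from UNDEFINED in either case.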

A reduction in memory bandwidth reduces the power consumption of the device, resulting in less overheating and longer battery life. Additionally, it may improve performance in games that are bandwidth-limited.

Best practice summary