Nicolas Silva edited this page Feb 10, 2017 · 2 revisions

Instancing

All the rectangles are drawn with instancing in order to minimize the number of draw calls we make.

Per-instance data interpolation

Some of the data in PrimitiveInstance is needed only in the fragment shader. We pass it from the vertex shader with the flat interpolation qualifier, so the value is not interpolated across the primitive, which reduces the load on the HW interpolator units. Example:

flat varying vec2 vRefPoint;

Batching

We provide PrimitiveInstance data to the primitive shaders via instanced attributes. The instance itself contains little data; mostly it holds indices that point to elements in other data sets, which we provide in textures. These data textures are read in the vertex shaders:

    ivec2 uv = get_fetch_uv_2(index);
    gradient.start_end_point = texelFetchOffset(sData32, uv, 0, ivec2(0, 0));
    gradient.extend_mode = texelFetchOffset(sData32, uv, 0, ivec2(1, 0));

Notice that we are not using the samplers (via textureLod) here: neither interpolation between texels nor clamping is needed, so we load the data directly with texelFetch instructions.
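The mapping from a linear item index to a texel coordinate can be sketched on the CPU side. This is a minimal Rust sketch, not the actual shader helper: the name `fetch_uv` and the texture width `DATA_TEXTURE_WIDTH` of 1024 texels are assumptions. It also assumes that each item occupies `vecs_per_item` consecutive texels within a single row, which is consistent with the `ivec2(0, 0)` and `ivec2(1, 0)` offsets in the snippet above.

```rust
/// Hypothetical width of the data texture in texels (an assumption for this sketch).
const DATA_TEXTURE_WIDTH: u32 = 1024;

/// Maps a linear item index to the (x, y) texel of the item's first vec4.
/// `vecs_per_item` is how many consecutive texels one item occupies in a row;
/// it must be between 1 and DATA_TEXTURE_WIDTH.
fn fetch_uv(index: u32, vecs_per_item: u32) -> (u32, u32) {
    // Items never straddle a row, so only whole items are counted per row.
    let items_per_row = DATA_TEXTURE_WIDTH / vecs_per_item;
    let y = index / items_per_row;
    let x = vecs_per_item * (index % items_per_row);
    (x, y)
}
```

Under these assumptions, with two texels per item, item 513 starts at texel (2, 1), and its second vec4 sits one texel to the right, which is what a fetch offset of (1, 0) expresses.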

The textures are updated every frame and are multi-buffered in order to avoid GPU stalls.
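One way to picture multi-buffering is a ring of buffers that the CPU cycles through, so that the buffer being written this frame is not the one the GPU may still be reading from the previous frame. The Rust sketch below is only an illustration under that assumption: `MultiBuffered`, `begin_frame`, and `current_index` are hypothetical names, each buffer stands in for one GPU data texture, and the source does not say how many copies are kept.

```rust
/// A ring of per-frame buffers; each one stands in for a GPU data texture.
/// (Hypothetical type, for illustration only.)
struct MultiBuffered<T> {
    buffers: Vec<T>,
    current: usize,
}

impl<T: Default> MultiBuffered<T> {
    /// Creates `count` default-initialized buffers. `count` must be at least 1.
    fn new(count: usize) -> Self {
        assert!(count > 0, "need at least one buffer");
        MultiBuffered {
            buffers: (0..count).map(|_| T::default()).collect(),
            current: 0,
        }
    }

    /// Advances to the next buffer in the ring and returns it for this
    /// frame's upload. The buffers used by earlier frames are left untouched.
    fn begin_frame(&mut self) -> &mut T {
        self.current = (self.current + 1) % self.buffers.len();
        &mut self.buffers[self.current]
    }

    /// Index of the buffer used by the most recent `begin_frame` call.
    fn current_index(&self) -> usize {
        self.current
    }
}
```

With a ring of three buffers, a given buffer is reused only every third frame, which gives the GPU two frames to finish reading it before the CPU overwrites it.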

Depth testing

Depth testing allows us to avoid executing the fragment shader on pixels that are behind known opaque primitives. The general idea is to first render opaque primitives (mostly) front to back with depth test and depth write enabled, and then render transparent primitives back to front with depth test enabled (there is no need to write to the z-buffer in the transparent pass, though).

This greatly reduces overdraw, which affects performance significantly.
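The two-pass ordering can be sketched on the CPU side as follows. This is a simplified Rust model, not the renderer's code: `Prim`, `build_passes`, and `shaded_fragments` are hypothetical names, a larger `z` is assumed to mean closer to the viewer, and the depth test is modeled as a strict "greater than" comparison for a single pixel that every primitive covers.

```rust
/// A primitive reduced to what matters for ordering (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
struct Prim {
    id: u32,
    /// Larger z is closer to the viewer in this sketch (an assumption).
    z: i32,
    opaque: bool,
}

/// Splits primitives into the opaque and transparent passes and sorts each.
fn build_passes(prims: &[Prim]) -> (Vec<Prim>, Vec<Prim>) {
    let (mut opaque, mut transparent): (Vec<Prim>, Vec<Prim>) =
        prims.iter().cloned().partition(|p| p.opaque);
    // Opaque pass: front to back (closest first), depth test and write on.
    opaque.sort_by(|a, b| b.z.cmp(&a.z));
    // Transparent pass: back to front, depth test on, depth write off.
    transparent.sort_by(|a, b| a.z.cmp(&b.z));
    (opaque, transparent)
}

/// Counts fragment shader runs for one pixel covered by every primitive.
fn shaded_fragments(opaque: &[Prim], transparent: &[Prim]) -> usize {
    let mut depth = i32::MIN; // cleared depth buffer: the first fragment passes
    let mut shaded = 0;
    for p in opaque {
        if p.z > depth {
            shaded += 1;
            depth = p.z; // depth write enabled
        }
    }
    for p in transparent {
        if p.z > depth {
            shaded += 1; // tested against the opaque depth, but not written
        }
    }
    shaded
}
```

In this model, three opaque primitives at depths 1, 3, and 2 cost a single fragment shader run when drawn front to back, but two runs if drawn in that submission order, because the primitive at depth 1 is shaded before the one at depth 3 covers it.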

Texture sampling

Reasons textureLod is used in the shaders, where applicable:

  1. It clearly states the pre-condition. For example, when sampling from images we clamp the texture coordinates to a half-texel offset from the edge. That offset is only correct for Lod==0.0, so textureLod makes sense. If mipmap levels are added later, the code doesn't care and will still work properly.
  2. It may be slightly faster. The shader compiler doesn't know whether a texture has mipmap levels (unless it re-compiles the shader on the fly), so it generates the gradient instructions and a regular sample, and that work is discarded when the texture sampler finds no mipmap levels. Using textureLod directly produces the efficient HW shader assembly right away.

Example:

float y = textureLod(sColor0, st_y, 0.0).r;
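The half-texel clamp mentioned in the first reason can be illustrated with a small CPU-side sketch in Rust. The function name `clamp_to_half_texel` and the convention of expressing coordinates in level-0 texel units are assumptions for this example; the shaders may work in normalized coordinates instead.

```rust
/// Clamps a sample position (in level-0 texel units) so that a bilinear
/// fetch does not blend in texels outside the sub-rectangle [min, max).
/// The rectangle must be at least one texel wide and tall, otherwise
/// `f32::clamp` panics because the lower bound exceeds the upper bound.
fn clamp_to_half_texel(st: (f32, f32), min: (f32, f32), max: (f32, f32)) -> (f32, f32) {
    // Half a texel at LOD 0. At coarser mip levels a texel covers more of
    // the image, so this fixed offset would no longer be enough.
    let half = 0.5;
    (
        st.0.clamp(min.0 + half, max.0 - half),
        st.1.clamp(min.1 + half, max.1 - half),
    )
}
```

For an image occupying texels 16 to 32 on both axes, a sample at (10.0, 20.0) is moved to (16.5, 20.0), the center of the first texel inside the image, so the bilinear filter gives no weight to the neighbouring image.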