Gaussian Blur

Separable Gaussian blur


Separable Gaussian blur using two render passes. Takes a source GPUTexture and writes the blurred result to a caller-provided target GPUTexture. Uses fragment shaders with hardware texture sampling for bilinear filtering and edge clamping.

Usage

import { createGaussianBlur } from './gaussian-blur';

const blur = createGaussianBlur(device, { format: 'rgba8unorm' });

const output = device.createTexture({
  size: [1024, 768],
  format: 'rgba8unorm',
  usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING
});

// Blur a source texture
blur.apply(sourceTexture, output, { radius: 8, sigma: 4.0 });
// output now contains the blurred result

// Clean up
blur.destroy();

API

createGaussianBlur(device, options?)

Returns a GaussianBlur instance.

Option   Type               Default        Description
format   GPUTextureFormat   'rgba8unorm'   Texture format (must match source)

blur.apply(source, target, options?)

Blurs the source texture and writes the result to the target texture.

  • source: GPUTexture to read from (must have TEXTURE_BINDING usage)
  • target: GPUTexture to write to (must have RENDER_ATTACHMENT usage)
Option   Type     Default      Description
radius   number   8            Blur radius in pixels (0–32)
sigma    number   radius / 2   Gaussian standard deviation

When radius is 0, the source is copied to the target without blurring.

blur.destroy()

Releases internal textures and buffers. Does not destroy source or target textures (you own them).

Algorithm

Separable Gaussian blur splits a 2D convolution into two 1D passes:

  1. Horizontal pass — blur along X, write to an intermediate texture
  2. Vertical pass — blur along Y from the intermediate, write to the target

This reduces the cost from O(radius²) to O(radius) per pixel. For a radius of 32, that's 65 samples per pixel per pass instead of 4225 for a single 2D pass.
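Those figures are easy to verify. A quick sketch of the arithmetic (sampleCounts is a hypothetical helper, not part of the module):

```typescript
// Tap counts for a blur of the given radius:
// a separable blur takes (2r + 1) taps per pass, run twice;
// a single 2D pass takes (2r + 1)² taps.
function sampleCounts(radius: number): {
  perPass: number;
  separableTotal: number;
  single2D: number;
} {
  const width = 2 * radius + 1; // full kernel width
  return { perPass: width, separableTotal: 2 * width, single2D: width * width };
}

const counts = sampleCounts(32);
// counts.perPass = 65, counts.single2D = 4225, counts.separableTotal = 130
```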

Kernel weights are precomputed on the CPU using the Gaussian function exp(−x² / (2σ²)) and normalized so they sum to 1.0. Only radius + 1 weights are stored (center + one side) since the kernel is symmetric.
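A sketch of that precomputation (the actual computeKernel in gaussian-blur.ts may differ in detail):

```typescript
// One-sided Gaussian kernel: radius + 1 weights (center + one side),
// normalized so the full symmetric kernel sums to 1.0.
function computeKernel(radius: number, sigma: number): Float32Array {
  const weights = new Float32Array(radius + 1);
  for (let x = 0; x <= radius; x++) {
    weights[x] = Math.exp(-(x * x) / (2 * sigma * sigma));
  }
  // The full kernel counts every off-center weight twice (left and right),
  // so normalize against center + 2 × sides.
  let sum = weights[0];
  for (let x = 1; x <= radius; x++) sum += 2 * weights[x];
  for (let x = 0; x <= radius; x++) weights[x] /= sum;
  return weights;
}
```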

Why render passes instead of compute? Fragment shaders get hardware texture sampling (bilinear filtering, clamp-to-edge) for free. This simplifies edge handling — no manual bounds checking. It also demonstrates that modules aren't compute-only.

WGSL loading

The default import uses Vite's ?raw suffix:

import shaderSource from './gaussian-blur.wgsl?raw';

If you're not using a bundler, load via fetch:

const shaderSource = await fetch(new URL('./gaussian-blur.wgsl', import.meta.url)).then((r) =>
  r.text()
);

Modifying

Replace with a box blur

Replace the computeKernel function in gaussian-blur.ts with uniform weights:

function computeBoxKernel(radius: number): Float32Array {
  const size = radius + 1;
  const kernel = new Float32Array(size);
  const weight = 1.0 / (radius * 2 + 1);
  kernel.fill(weight);
  return kernel;
}

The shader doesn't change — it just reads different weights.

Add a bloom effect

Modify the fragment shader to threshold bright pixels before blurring:

let sample = textureSample(source_tex, source_sampler, in.uv);
let brightness = dot(sample.rgb, vec3f(0.2126, 0.7152, 0.0722));
if (brightness < threshold) { return vec4f(0.0); }

Then blend the blurred result additively with the original image.
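The dot() above computes Rec. 709 relative luminance. For reference, the same math on the CPU (hypothetical helpers, assuming linear RGB in [0, 1]):

```typescript
// Rec. 709 relative luminance of a linear RGB triple,
// matching the dot() in the WGSL snippet above.
function luminance(r: number, g: number, b: number): number {
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// A pixel contributes to bloom when its luminance reaches the threshold.
function passesThreshold(rgb: [number, number, number], threshold: number): boolean {
  const [r, g, b] = rgb;
  return luminance(r, g, b) >= threshold;
}
```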

Apply blur to a specific region

Pass a mask texture as an additional binding. In the fragment shader, multiply the blur weight by the mask value at each sample position.
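A CPU sketch of that per-sample weighting, on a 1D row for clarity (maskedBlur1D is a hypothetical helper; in practice this math runs in the fragment shader):

```typescript
// 1D masked blur: each tap's kernel weight is scaled by the mask value
// at the sample position, then the weights are renormalized so masked-out
// samples don't darken the result.
function maskedBlur1D(
  src: number[],
  mask: number[],
  kernel: number[], // one-sided weights: kernel[0] is the center
  radius: number
): number[] {
  const out = new Array<number>(src.length).fill(0);
  for (let i = 0; i < src.length; i++) {
    let acc = 0;
    let wsum = 0;
    for (let k = -radius; k <= radius; k++) {
      // Clamp-to-edge, mirroring the sampler's address mode.
      const j = Math.min(src.length - 1, Math.max(0, i + k));
      const w = kernel[Math.abs(k)] * mask[j];
      acc += w * src[j];
      wsum += w;
    }
    out[i] = wsum > 0 ? acc / wsum : src[i];
  }
  return out;
}
```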

Chain with other post-processing

Feed the blur output as the input to another module:

blur.apply(sourceTexture, blurredTexture, { radius: 8 });
otherEffect.apply(blurredTexture);

Switch to compute shaders

For very large radii or tiled/chunked processing, compute shaders can be more efficient because they allow shared memory tiling. Replace the render passes with compute dispatches that read from a storage texture and write to another. You'll need to handle edge clamping manually.

Further Reading

Resources on Gaussian blur, separable convolutions, and GPU image filtering.

Core Theory

  • Heckbert, "Filtering by Repeated Integration" (SIGGRAPH 1986). Foundational paper on efficient image filtering techniques, including the separability of Gaussian kernels that makes two-pass blur possible. https://dl.acm.org/doi/10.1145/15886.15921

  • Deriche, "Fast Algorithms for Low-Level Vision" (1990). Introduces recursive (IIR) Gaussian filtering that achieves O(1) cost per pixel regardless of kernel size. A useful alternative for very large radii. https://hal.inria.fr/inria-00074778
