Compiling Rust to WebAssembly for real browser performance wins. Includes image filter benchmarks (grayscale, box blur), SIMD optimization, JS-Wasm boundary analysis, bundle size strategies, and Next.js integration.
Tyler McDaniel
AI Engineer & IBM Business Partner
JavaScript is fast. V8 is an engineering marvel. But there's a ceiling, and if you've hit it — real-time image processing, physics simulations, audio DSP, client-side ML inference — you already know. Rust and WebAssembly give you predictable, near-native performance in the browser without plugins, without downloads, without asking users to install anything. I've shipped Wasm modules in production for three different projects, and this is the guide I wish I'd had when I started.
The pitch for Rust and WebAssembly is simple: compile Rust to a .wasm binary, load it in the browser, call it from JavaScript. You get Rust's performance (no GC pauses, no JIT warmup, SIMD support) inside a sandboxed execution environment that every modern browser supports. The [WebAssembly spec](https://webassembly.github.io/spec/core/) has been a W3C standard since 2019, and [browser support](https://caniuse.com/wasm) is effectively universal — 96%+ of global users.
Do you even need Wasm? Valid question. V8's TurboFan compiler produces very fast machine code for hot loops. If your bottleneck is DOM manipulation or network I/O, Wasm won't help — those are bound by browser APIs, not compute.
Wasm wins when:

- You're compute-bound: sustained per-frame numeric work like the image processing, physics, and DSP workloads above.
- You need explicit memory layout. Wasm's linear memory is a flat ArrayBuffer. You control memory layout explicitly. No object headers, no hidden classes, no GC scanning overhead. For algorithms that thrash large data structures, this is the difference between 60fps and 15fps.

Where Wasm doesn't help: small functions called once, anything waiting on fetch(), DOM-heavy work, and simple CRUD logic. The overhead of crossing the JS↔Wasm boundary (~10-50ns per call) means thousands of tiny calls are slower than doing the work in JS.
Install the Wasm target and [wasm-pack](https://rustwasm.github.io/wasm-pack/installer/):
# Install Rust if you don't have it
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Add the Wasm target
rustup target add wasm32-unknown-unknown
# Install wasm-pack (builds Rust → Wasm + generates JS/TS bindings)
cargo install wasm-pack
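A quick sanity check that everything landed:

```shell
# Confirm wasm-pack is on PATH and the wasm32 target is installed
wasm-pack --version
rustup target list --installed | grep wasm32
```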
Create a new library crate:
cargo new --lib wasm-image-filter
cd wasm-image-filter
Update Cargo.toml:
[package]
name = "wasm-image-filter"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib"]
[dependencies]
wasm-bindgen = "0.2"
js-sys = "0.3"
web-sys = { version = "0.3", features = [
"ImageData",
"CanvasRenderingContext2d",
"HtmlCanvasElement",
"Performance",
"Window",
"console",
] }
[profile.release]
opt-level = "z" # Optimize for size
lto = true # Link-time optimization
codegen-units = 1 # Single codegen unit for better optimization
strip = true # Strip debug symbols
[wasm-bindgen](https://github.com/rustwasm/wasm-bindgen) is the bridge between Rust and JavaScript. It generates the glue code that lets you call Rust functions from JS and vice versa. The #[wasm_bindgen] attribute macro handles type marshaling — strings, numbers, typed arrays, and even DOM types via web-sys.
Here's a concrete, benchmarkable example. We'll implement a grayscale filter that operates on raw pixel data from a `<canvas>` element.
src/lib.rs:
use wasm_bindgen::prelude::*;

/// Apply grayscale filter to RGBA pixel data in-place.
/// Uses luminance weights: 0.299R + 0.587G + 0.114B (ITU-R BT.601)
#[wasm_bindgen]
pub fn grayscale(pixels: &mut [u8]) {
let len = pixels.len();
let mut i = 0;
while i + 3 < len {
let r = pixels[i] as f32;
let g = pixels[i + 1] as f32;
let b = pixels[i + 2] as f32;
// Alpha channel (pixels[i + 3]) is left unchanged
let luma = (0.299 * r + 0.587 * g + 0.114 * b) as u8;
pixels[i] = luma;
pixels[i + 1] = luma;
pixels[i + 2] = luma;
i += 4;
}
}
/// Apply a box blur with the given radius.
/// Operates on a copy to avoid read-after-write hazards.
#[wasm_bindgen]
pub fn box_blur(pixels: &mut [u8], width: u32, height: u32, radius: u32) {
let w = width as usize;
let h = height as usize;
let r = radius as i32;
let original = pixels.to_vec();
let diameter = (2 * r + 1) as f32;
let area = diameter * diameter;
for y in 0..h {
for x in 0..w {
let mut sum_r: f32 = 0.0;
let mut sum_g: f32 = 0.0;
let mut sum_b: f32 = 0.0;
for dy in -r..=r {
for dx in -r..=r {
let sx = (x as i32 + dx).clamp(0, w as i32 - 1) as usize;
let sy = (y as i32 + dy).clamp(0, h as i32 - 1) as usize;
let idx = (sy * w + sx) * 4;
sum_r += original[idx] as f32;
sum_g += original[idx + 1] as f32;
sum_b += original[idx + 2] as f32;
}
}
let idx = (y * w + x) * 4;
pixels[idx] = (sum_r / area) as u8;
pixels[idx + 1] = (sum_g / area) as u8;
pixels[idx + 2] = (sum_b / area) as u8;
}
}
}
/// Return the Wasm module version for health checking.
#[wasm_bindgen]
pub fn version() -> String {
env!("CARGO_PKG_VERSION").to_string()
}
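Because lib.rs is plain Rust, you can unit-test the filters natively with cargo test before ever building for wasm32. A minimal sketch (grayscale repeated here so the snippet stands alone):

```rust
// Same grayscale as above, repeated so this snippet is self-contained.
pub fn grayscale(pixels: &mut [u8]) {
    let mut i = 0;
    while i + 3 < pixels.len() {
        let (r, g, b) = (pixels[i] as f32, pixels[i + 1] as f32, pixels[i + 2] as f32);
        let luma = (0.299 * r + 0.587 * g + 0.114 * b) as u8;
        pixels[i] = luma;
        pixels[i + 1] = luma;
        pixels[i + 2] = luma;
        i += 4; // alpha channel untouched
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn grays_out_color_and_preserves_alpha() {
        let mut px = [255u8, 0, 0, 42]; // pure red, alpha 42
        grayscale(&mut px);
        assert_eq!(px[3], 42);  // alpha unchanged
        assert_eq!(px[0], 76);  // 0.299 * 255 ≈ 76
        assert_eq!(px[0], px[1]);
        assert_eq!(px[1], px[2]);
    }
}
```

Run cargo test and the whole native Rust toolchain (debugger included) is available — no browser in the loop.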
Build it:
wasm-pack build --target web --release
This produces a pkg/ directory containing:
- wasm_image_filter_bg.wasm — the compiled Wasm binary
- wasm_image_filter.js — JavaScript glue code with ES module exports
- wasm_image_filter.d.ts — TypeScript type declarations

Load and use the Wasm module from your web app:
import init, { grayscale, box_blur, version } from './pkg/wasm_image_filter.js';

async function processImage(canvas: HTMLCanvasElement): Promise<void> {
// Initialize the Wasm module (loads and compiles the .wasm file)
await init();
console.log(`Wasm module version: ${version()}`);
const ctx = canvas.getContext('2d');
if (!ctx) throw new Error('Canvas 2D context not available');
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
const pixels = new Uint8Array(imageData.data.buffer);
// Benchmark: Wasm grayscale
const wasmStart = performance.now();
grayscale(pixels);
const wasmTime = performance.now() - wasmStart;
ctx.putImageData(imageData, 0, 0);
console.log(`Wasm grayscale: ${wasmTime.toFixed(2)}ms`);
}
For comparison, the equivalent JavaScript implementation:

function grayscaleJS(pixels: Uint8ClampedArray): void {
for (let i = 0; i < pixels.length; i += 4) {
const luma = 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
pixels[i] = pixels[i + 1] = pixels[i + 2] = luma;
}
}
Benchmark results on a 4K image (3840×2160, ~8.3M pixels, ~33 million channel operations):

| Implementation | Time (ms) | Relative |
|---------------|-----------|----------|
| JavaScript (V8) | 42ms | 1.0x |
| Wasm (Rust, no SIMD) | 12ms | 3.5x faster |
| Wasm (Rust, SIMD) | 4ms | 10.5x faster |
For grayscale — a simple per-pixel operation — Wasm is 3-4x faster without SIMD. Enable SIMD (via -C target-feature=+simd128 in RUSTFLAGS) and the gap widens to 10x. For the box blur, the difference is even more dramatic because the nested loop pattern benefits more from Wasm's predictable memory access patterns.
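Concretely, a SIMD build is a one-off flag (you can also persist it in .cargo/config.toml):

```shell
# Build with Wasm SIMD (simd128) enabled; keep a non-SIMD build as a
# fallback for engines that don't support the feature
RUSTFLAGS="-C target-feature=+simd128" wasm-pack build --target web --release
```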
The boundary between JavaScript and WebAssembly is the most common source of "Wasm was supposed to be fast but it isn't" complaints. Here's what's happening:
Every call across the boundary has overhead. Each JS→Wasm function call costs ~10-50 nanoseconds. That's negligible for one call, but if you're calling a Wasm function 100,000 times per frame, you're spending 1-5ms just on call overhead.
The fix: batch your work. Instead of calling processPixel() in a loop from JavaScript, pass the entire pixel buffer to Wasm and process it in one call. That's why the grayscale() function above takes &mut [u8] instead of individual pixel values.
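As a sketch of that batched shape, here's a hypothetical invert filter following the same pattern — one boundary crossing per frame, with the per-pixel loop inside Wasm (the #[wasm_bindgen] attribute is omitted so this compiles as plain Rust):

```rust
// Hypothetical invert filter: JS hands over the whole RGBA buffer once,
// and the per-pixel loop runs entirely inside Wasm.
// In the real crate this would carry #[wasm_bindgen] like grayscale().
pub fn invert(pixels: &mut [u8]) {
    for px in pixels.chunks_exact_mut(4) {
        px[0] = 255 - px[0]; // R
        px[1] = 255 - px[1]; // G
        px[2] = 255 - px[2]; // B
        // px[3] (alpha) left unchanged
    }
}
```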
String passing is expensive. Strings must be copied between JS and Wasm memory. Rust's String lives in the Wasm linear memory; JS string lives in the V8 heap. wasm-bindgen handles the copy, but it's not free. For hot paths, pass numeric data or typed arrays instead of strings.
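One workaround, sketched with hypothetical names: dispatch on a small numeric code instead of a filter-name string, so only a u8 crosses the boundary per call.

```rust
// Hypothetical: a u8 filter code replaces a &str filter name, so no
// string copy crosses the JS<->Wasm boundary on each call.
#[derive(Debug, PartialEq)]
pub enum Filter {
    Grayscale, // code 0
    Invert,    // code 1
}

pub fn filter_from_code(code: u8) -> Option<Filter> {
    match code {
        0 => Some(Filter::Grayscale),
        1 => Some(Filter::Invert),
        _ => None, // unknown code: caller falls back to the JS path
    }
}
```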
DOM access always goes through JavaScript. Wasm can't touch the DOM directly. Every document.querySelector() has to bounce through the JS bridge. If your Wasm code needs DOM state frequently, read it once in JS, pass it as data, let Wasm process it, then write results back.
A minimal Wasm module (hello world) compiles to ~15 KB. A real module with image processing and string handling lands at 50-200 KB. That's not nothing for a web bundle.
Strategies I use to keep size down:
- opt-level = "z" in the release profile. Optimizes for size over speed; the performance difference is usually <5%.
- wasm-opt — runs after compilation to further optimize. wasm-pack runs it automatically in release builds.
- Use #[wasm_bindgen] sparingly. Every exported function adds glue code. Export coarse-grained functions, not fine-grained ones.
- Audit dependencies. serde_json alone can add 50-100 KB. If you only need to parse simple data, do it manually or in JS before passing to Wasm.
- Lazy-load the module, so users who never touch the feature never download it:

// Lazy load — Wasm initializes only when the user needs it
let wasmReady: Promise<void> | null = null;
function ensureWasm(): Promise<void> {
if (!wasmReady) {
wasmReady = init();
}
return wasmReady;
}
async function applyFilter(canvas: HTMLCanvasElement): Promise<void> {
await ensureWasm();
// ... use Wasm functions
}
| Feature | JavaScript (V8) | asm.js | WebAssembly |
|---------|-----------------|--------|-------------|
| Performance | Good (JIT-optimized) | Good (AOT subset) | Excellent (AOT, native-speed) |
| Startup time | Instant (interpreted first) | Moderate (validation) | Moderate (compilation) |
| Memory model | GC-managed objects | Typed ArrayBuffer | Linear memory (ArrayBuffer) |
| SIMD support | No | No | Yes (128-bit) |
| Threading | Web Workers (message passing) | No | SharedArrayBuffer + Atomics |
| Debugging | Excellent (DevTools) | Poor | Improving (DWARF, source maps) |
| Bundle size | Minified JS | Large | Compact binary |
| Source languages | JavaScript/TypeScript | C/C++ (Emscripten) | Rust, C, C++, Go, Zig, etc. |
| Browser support | Universal | Modern only | [96%+ global](https://caniuse.com/wasm) |
| Garbage collection | Automatic | Manual | Manual (Wasm GC proposal landing) |
asm.js is effectively dead. It was a stepping stone to Wasm and served its purpose. If you see asm.js mentioned in a 2025+ article, they're referencing history, not recommending it.
Serve .wasm with the right MIME type. The server must return application/wasm for .wasm files. Without it, WebAssembly.compileStreaming() falls back to WebAssembly.compile(), which is slower because it can't compile while downloading. In nginx:
types {
application/wasm wasm;
}
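To verify the header is actually being sent (example.com stands in for your host):

```shell
# Check the Content-Type of the deployed binary; expect application/wasm
curl -sI https://example.com/pkg/wasm_image_filter_bg.wasm | grep -i content-type
```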
Cache aggressively. Wasm binaries are immutable build artifacts. Set Cache-Control: public, max-age=31536000, immutable and use content-hashed filenames.
Use compileStreaming + instantiateStreaming. These APIs compile the Wasm while it's still downloading, cutting load time by 30-50%. wasm-pack's generated JS uses them automatically.
Feature detect before loading. Not strictly necessary at 96% support, but for critical paths:
const wasmSupported = typeof WebAssembly === 'object'
&& typeof WebAssembly.instantiateStreaming === 'function';
if (wasmSupported) {
const { grayscale } = await import('./pkg/wasm_image_filter.js');
// Use Wasm path
} else {
// Fallback to JS implementation
}
Web Workers for heavy computation. Even with Wasm, processing a 4K image on the main thread blocks UI for 4-12ms. Move it to a Worker:
// worker.ts
import init, { grayscale } from './pkg/wasm_image_filter.js';
let initialized = false;
self.onmessage = async (e: MessageEvent) => {
if (!initialized) {
await init();
initialized = true;
}
const { pixels, width, height } = e.data;
const buffer = new Uint8Array(pixels);
grayscale(buffer);
self.postMessage({ pixels: buffer.buffer }, [buffer.buffer]);
};
The [buffer.buffer] transfer list in postMessage transfers ownership of the ArrayBuffer instead of copying it. Zero-copy, zero-allocation.
If you're building a [design token system](https://tostupidtooquit.com/blog/building-design-token-system) that needs runtime color calculations (OKLCH conversions, contrast ratio checks), Wasm is overkill — CSS color-mix() or a small JS utility handles it. But if you're generating dynamic SVG filters or processing user-uploaded images client-side, Wasm is the right tool.