> Writing

Long-form technical write-ups on LLM inference, TPU performance engineering, and high-performance computing.


A practical guide to roofline analysis for LLM kernels on TPU hardware. Covers arithmetic intensity, compute vs. memory bottlenecks, common misconceptions, and what rooflines can't tell you — with worked examples across v5p, v6e, and v7x.

JAX, TPU, Performance, HPC