Posted on Feb 3, 15:21

GPU Performance Software Development Engineer

United States of America, San Jose
On-site

Job Description

We’re building Wave, a high-performance GPU programming language and compiler for machine learning workloads. Wave combines a Python-embedded DSL with an MLIR-based compiler stack to give engineers explicit control over GPU kernel performance.

We’re looking for a low-level GPU performance engineer to own the end-to-end performance of Wave’s kernels.

You’ll work on:

  • Hand-tuned GPU kernels (GEMM, Attention, MoE, decoding)
  • Instruction-level optimization: memory, registers, scheduling, wave/warp behavior
  • HIP/CUDA, GPU intrinsics (MFMA/MMA), and inline assembly
  • MLIR dialects, lowering pipelines, and performance-critical compiler passes
  • Profiling via disassembly, counters, rocprof/Nsight

Wave codebase: https://github.com/iree-org/wave