Senior Solutions Architect

Remote, US

About zymtrace

Organizations spend billions on GPU infrastructure to power AI, yet roughly 60-65% of that investment is wasted on underutilized hardware, idle cycles, and inefficient workloads. The problem isn’t the teams running those workloads; it’s that existing tools treat GPUs as black boxes, showing surface-level metrics without revealing where the waste actually lives.

zymtrace is a distributed AI infrastructure optimization platform that gives our customers deep, always-on visibility into general-purpose and GPU-accelerated workloads across their entire clusters. We profile from PyTorch and JAX code through CUDA kernels all the way down to individual GPU instructions and stall reasons, then correlate everything back to the CPU code that triggered it. Zero code changes. Zero guesswork.

We work with leading AI labs, Fortune 500 companies, and research firms to debug and optimize their most demanding workloads. Read the anam.ai case study.

Our founders were part of the team that pioneered, open-sourced, and contributed the eBPF continuous CPU profiler to OpenTelemetry, the same technology now adopted by Grafana, Datadog, IBM, Cisco, and others. We’re now applying that same low-level engineering depth to GPU-bound workloads.

We’re a team of kernel hackers and systems programmers who operate at the deepest layers of the stack: GPUs, CUDA runtimes, eBPF, compilers, and instruction-level introspection.

By joining zymtrace, you’ll work at the bleeding edge of modern computing, helping organizations optimize AI training and inference workloads at massive scale. The problems we solve touch every layer of the stack, and the impact is measured in millions of GPU-hours reclaimed.

About the Role

As a Senior Solutions Architect, you’ll be the technical bridge between zymtrace and our most important customers. You’ll work directly with AI/ML engineering teams, SRE leads, and infrastructure decision-makers at companies running large-scale GPU fleets to help them understand, deploy, and extract maximum value from zymtrace.

This is a high-impact, early-stage role where you’ll shape how some of the world’s most demanding organizations think about GPU optimization.

GPU optimization is one of the hardest and most consequential problems in modern computing. You’ll help customers squeeze every drop of performance from the hardware that powers the AI revolution.

Key Responsibilities

Own the technical relationship with enterprise customers from proof-of-concept through production deployment and ongoing optimization
Design and deliver technical demos, workshops, and architecture reviews tailored to each customer’s inference and training workloads
Partner closely with founders and sales on technical discovery, scoping, and deal strategy
Build and maintain technical content: reference architectures, deployment guides, case studies, and best practices documentation
Represent zymtrace at industry events, conferences, and in the broader AI infrastructure community
Collaborate with engineering to improve the product based on real-world customer feedback and deployment patterns

What We’re Looking For

2+ years in a solutions architecture, sales engineering, or technical customer-facing role within infrastructure, observability, or cloud-native tooling
Strong understanding of GPU computing: CUDA, GPU memory hierarchies, inference and training pipelines, and common performance bottlenecks
Hands-on experience with AI/ML frameworks such as PyTorch, JAX, or similar
Familiarity with Linux systems internals, profiling tools, and observability stacks (OpenTelemetry, Prometheus, Grafana, etc.)
Ability to communicate complex, low-level technical concepts clearly to both engineers and executive stakeholders
Self-starter mentality suited to a fast-moving, early-stage environment where you’ll define your own playbook

Nice to Have

Experience with eBPF, kernel tracing, or low-level performance engineering
Experience with Helm charts and Kubernetes
Background in HPC, scientific computing, or quantitative research infrastructure
Familiarity with NVIDIA tools like Nsight Compute, DCGM, or NVML
Previous experience at an infrastructure or developer tools startup
Existing relationships with AI/ML infrastructure teams at major enterprises or cloud providers

Why Join zymtrace?

Work at the frontier. GPU optimization is one of the hardest and most consequential problems in modern computing. You’ll help customers squeeze every drop of performance from hardware that powers the AI revolution.
Shape the company. This is an early-stage team. Your fingerprints will be on the product, the go-to-market motion, and the culture.
World-class teammates. You’ll work alongside engineers who helped build the eBPF profiler for OpenTelemetry, created disassemblers used in Firefox and WebKit, and joined from Google to hack on compilers and kernels.
Real customer impact. Our customers include leading AI labs and Fortune 500 companies. The work you do will directly translate into faster models, lower costs, and reclaimed infrastructure budgets.

Benefits

Competitive salary and meaningful equity
401(k) plan
Comprehensive health, dental, and vision insurance
Remote-first (may require travel)
Annual learning and development budget