CPU Profiling

Whole-system Continuous Profiling Platform

Understand exactly why slowdowns happen, not just where. Get fleet-wide continuous profiling that pinpoints performance bottlenecks across your entire infrastructure, down to the exact function and line of code.

Efficient services are faster, cheaper, and greener. Ditch the guesswork.

Investigating impact of Python GIL

Reduce mean-time-to-dopamine

It Just Works

Production visibility without the performance cost. Deploy once, profile forever.

< 1%

Performance Overhead

Minimal impact on production systems

3

Minutes to Deploy

From installation to insights

0

Code Changes Required

Works with existing applications

zymtrace does not require application source code changes, instrumentation, on-host debug symbols, or any other intrusive operation. Just deploy the agent and receive profiling data a few minutes later.

Purpose-built to run efficiently in production

Benefits of Whole System Profiling

zymtrace builds stack traces that run from the kernel, through userspace native code, all the way into code running in higher-level runtimes, giving you insight into your system's behavior at every level.

Find Production Bottlenecks

Identify methods and functions that perform well in development but become bottlenecks under real production load and traffic patterns.

Reduce Cloud Spend

Reduce compute costs by identifying and optimizing CPU-intensive code paths that consume excessive cloud resources.

Catch Regressions

Quickly identify when code changes introduce performance regressions by comparing before and after profiling data.
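As a rough illustration of what a before/after comparison involves, the sketch below diffs two profiles by each function's share of samples. The function names, sample counts, and the 5% threshold are made up for the example; this is not zymtrace's comparison logic, only a minimal sketch of the idea.

```python
def profile_shares(counts):
    """Convert raw sample counts per function into fractions of all samples."""
    total = sum(counts.values())
    return {fn: n / total for fn, n in counts.items()}


def find_regressions(before, after, threshold=0.05):
    """Return (function, delta) pairs whose share of samples grew by more than `threshold`."""
    b, a = profile_shares(before), profile_shares(after)
    deltas = {fn: a.get(fn, 0.0) - b.get(fn, 0.0) for fn in set(a) | set(b)}
    return sorted(
        ((fn, d) for fn, d in deltas.items() if d > threshold),
        key=lambda item: item[1],
        reverse=True,
    )


# Hypothetical sample counts collected before and after a deploy.
before = {"parse_json": 300, "render": 500, "db_query": 200}
after = {"parse_json": 600, "render": 250, "db_query": 150}

regressions = find_regressions(before, after)
for fn, delta in regressions:
    print(f"{fn}: share of samples grew by {delta:.0%}")
```

Comparing shares rather than raw counts keeps the result meaningful when the two profiles cover different durations or traffic volumes.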

Pinpoint Root Causes

Drill down to specific lines of code causing latency spikes, high CPU utilization, or excessive memory allocations.

eBPF Powered

Polyglot Visibility

zymtrace is SDK-less. It never interferes with your runtime.
Profile any application, any language, any runtime—all through a single, unified platform.

Python

C/C++

Java (Zing, GraalVM)

Go

Rust

Node.js

Ruby

.NET

Zig

Perl

Scala

V8

100%

Code Coverage

All functions profiled

0

Rebuilds Required

Works with existing binaries

0

Debug Symbols Required

No symbols needed on host

OpenTelemetry Compliant

zymtrace is OpenTelemetry compliant, including support for OTEL resource attributes.

Fun Fact

Members of the zymtrace team were part of the team that pioneered, open-sourced, and donated the eBPF profiler to OpenTelemetry.
With zymtrace, we’re extending that same low-level engineering excellence to GPU-bound workloads and building a highly scalable profiling platform purpose-built for today’s distributed, heterogeneous environments — spanning both general-purpose and AI-accelerated workloads.

FAQ

Frequently Asked Questions

CPU profiling is a performance analysis technique that shows how an application consumes CPU time, helping you uncover bottlenecks and inefficient code. zymtrace samples the CPU at 20 Hz (configurable) to capture where threads spend their time. The profiler generates reports, often as flame graphs, that make it easy to see which functions use the most CPU resources.
Continuous profiling builds on this by adding the time dimension. Because it runs all the time, it captures data as issues happen in production, instead of forcing you to reproduce them later in a different environment. It also gives better statistical accuracy, helping you pinpoint where your code consistently spends the most time so you can debug and optimize performance more effectively.
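To make the sampling idea concrete, here is a minimal sketch of how stack samples can be folded into counts (the usual input to a flame graph) and turned into a per-function CPU time estimate. The sample data and function names are invented for illustration, and the code makes no claim about how zymtrace stores or aggregates samples internally.

```python
from collections import Counter

SAMPLE_HZ = 20  # sampling frequency quoted above; configurable in zymtrace


def fold_stacks(samples):
    """Count identical stacks; each sample is a tuple of frames, root first."""
    return Counter(";".join(stack) for stack in samples)


def cpu_seconds_by_function(samples, hz=SAMPLE_HZ):
    """Estimate on-CPU seconds per function, counting a function once per sample."""
    hits = Counter()
    for stack in samples:
        for frame in set(stack):  # set(): recursive frames must not double-count
            hits[frame] += 1
    return {frame: count / hz for frame, count in hits.items()}


# Hypothetical samples from one core over one busy second (made-up names).
samples = (
    [("main", "handle_request", "parse_json")] * 12
    + [("main", "handle_request", "render")] * 6
    + [("main", "gc")] * 2
)

folded = fold_stacks(samples)
seconds = cpu_seconds_by_function(samples)

for stack, count in folded.most_common():
    print(f"{stack} {count}")
```

Each estimate is only as good as the number of samples behind it, which is why collecting continuously over long periods improves statistical accuracy.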
zymtrace supports on-demand profiling too. A project in zymtrace is a logical grouping of profiling events, and we recommend pushing on-demand profiling data into a dedicated project so that it does not skew the statistical accuracy of other projects that use continuous profiling. Refer to our configuration guide for more details.
Metrics, logs, and traces are like measuring and monitoring the vital signs of the human body: they provide general information about health and performance, such as body temperature, weight, and heart rate, along with records of the events leading up to symptoms. The zymtrace CPU profiler, by contrast, is like taking an X-ray or, better, an MRI scan: it lets you see the inner workings of the body and understand how different systems interact, giving more detailed information and potentially identifying issues that are not visible from macro-level indicators alone. CPU profiling also provides a breadth and depth of visibility that surfaces the unknown-unknowns of your application workloads. This system-wide visibility lets you ditch the guesswork and get quickly to the heart of the "why" questions: Why are we spending x% of our CPU cycles on function y? Why is this service consuming more resources than expected? Which functions consume the most CPU time across our entire fleet?
One of the big hurdles to profiling is that upstream dependencies are often compiled with frame pointer omission, a compiler optimization that complicates stack unwinding during the collection of profiling data. As a result, most other profilers require either PMC (performance monitoring counter) access, which is not available in most virtualized environments, or debug symbols for all dependencies, which are time-consuming to obtain, disk-space-intensive, and generally considered bad practice on production systems. zymtrace is different: it can unwind stack traces through C/C++/Rust binaries even if the frame pointer has been omitted, without debug symbols present, and without PMC access. We achieve this by doing some fairly heavy lifting in eBPF, and we symbolize after the fact: our global symbolization service crawls and indexes popular Linux repos and Docker containers for debug information.
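To show why frame pointer omission is a hurdle, the toy model below walks a chain of saved frame pointers, which is the simple unwinding technique that the optimization breaks. It is not how zymtrace unwinds stacks; the addresses and the memory layout are invented for the example and follow the usual x86-64 convention only loosely.

```python
def unwind_with_frame_pointers(memory, rbp, max_depth=64):
    """Walk a chain of saved frame pointers and collect return addresses.

    `memory` maps stack addresses to 8-byte values. At each frame,
    memory[rbp] holds the caller's saved rbp and memory[rbp + 8] holds
    the return address (the x86-64 layout when frame pointers are kept).
    """
    return_addresses = []
    while rbp != 0 and len(return_addresses) < max_depth:
        if rbp not in memory or rbp + 8 not in memory:
            break  # rbp does not point at a valid frame: the chain is broken
        return_addresses.append(memory[rbp + 8])
        rbp = memory[rbp]
    return return_addresses


# Toy stack for main -> handler -> leaf, all built with frame pointers.
stack = {
    0x7000: 0x7020, 0x7008: 0x401200,  # leaf's frame: saved rbp, return into handler
    0x7020: 0x7040, 0x7028: 0x401100,  # handler's frame: return into main
    0x7040: 0x0,    0x7048: 0x401000,  # outermost frame: saved rbp of 0 ends the chain
}
full_trace = unwind_with_frame_pointers(stack, rbp=0x7000)

# With frame pointer omission, rbp is just another general-purpose
# register and may hold any value, for example a loop counter.
broken_trace = unwind_with_frame_pointers(stack, rbp=0x2A)

print([hex(addr) for addr in full_trace])
print(broken_trace)
```

When the register no longer points at a saved frame, the walk stops immediately, so an unwinder needs another source of truth about frame layout.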
zymtrace aims to stay within a budget of less than 1% CPU usage and less than 250 MB of RAM, meaning that for most workloads it can run 24/7 with no noticeable impact on the profiled systems. For workloads where even this CPU budget is too high, zymtrace can run at randomly selected time intervals to gain insights with an even lower impact on performance. The sampling-based approach ensures your applications run smoothly while collecting valuable performance data.
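A back-of-the-envelope calculation gives a feel for what such a budget means. The sketch below assumes the 1% budget applies per core and that all overhead comes from taking samples at 20 Hz; both are simplifying assumptions for illustration, not statements about how zymtrace accounts for its overhead.

```python
def per_sample_budget_us(cpu_budget_fraction=0.01, sample_hz=20):
    """Microseconds of CPU time available per sample on one core."""
    budget_us_per_second = cpu_budget_fraction * 1_000_000
    return budget_us_per_second / sample_hz


def effective_overhead(overhead_fraction, active_fraction):
    """Average overhead when profiling runs only part of the time."""
    return overhead_fraction * active_fraction


budget = per_sample_budget_us()
# Profiling during a hypothetical 25% of randomly selected intervals.
reduced = effective_overhead(overhead_fraction=0.01, active_fraction=0.25)

print(f"{budget:.0f} microseconds per sample")
print(f"{reduced:.2%} average overhead")
```

Under these assumptions, each sample may spend on the order of a few hundred microseconds, and running in only a fraction of intervals scales the average overhead down proportionally.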
We support all major programming languages including Python, Java, Go, Node.js, C++, Rust, and .NET. Our agent can profile polyglot applications and provide unified analysis across different language runtimes.
Our agent works seamlessly with containerized applications, Kubernetes clusters, and all major cloud platforms. We provide native integration with container orchestration systems for easy deployment and management.

Ready to Optimize Your Application Performance?

Get code-level insights into CPU performance bottlenecks with continuous production profiling across all major programming languages.

Start CPU Profiling