Skip to main content

· 9 min read

thumbnail_image

Profiling today looks very different than it did just a few years ago. As people move to more cloud-native workloads continuous profiling has evolved into a key piece of many companies' observability suites. At Pyroscope, we've been a huge part of this evolution thanks to an ever-expanding community who has provided great insight into the use cases where profiling is most valuable and how we can continue to improve that experience.

As a result, over the past few years we've released several products to help developers improve their applications' performance.

  1. Continuous Profiling: Our most popular a tool for continuously profiling your applications accross your entire system and then storing and querying that data efficiently

  2. Adhoc Profiling: In cases where you may not need to constantly profile or perhaps you'd like to save a snapshot of a profile, our adhoc tool allows you to both capture and save specific profiles to later refer to or use for your convenience

  3. Profiling Exemplars: Profiles linked to particular meaningful units such as HTTP requests or trace spans

Introducing CI Profiling#

Now, we're excited to announce the latest addition to the Pyroscope family - CI Profiling. Continuous Integration and Delivery (CI/CD) pipelines are critical for modern software development, but they can also be a source of frustration and inefficiency. Waiting for long test runs, dealing with frequent failures and timeouts, and wasting resources are all common problems associated with CI/CD pipelines. These issues can be compounded when multiple developers are working on the same codebase or when teams are working across multiple repositories. That's why we've developed this new feature that can help:

Continuous Profiling with Pyroscope in your CI/CD pipelines.

· 9 min read
Dmitry Filimonov
ChatGPT

Go 1.20 Experiment with Memory Arenas

caution

Go arenas are an experimental feature. The API and implementation is completely unsupported and go team makes no guarantees about compatibility or whether it will even continue to exist in any future release.

See this Github discussion for more details.

Introduction#

Go 1.20 introduces an experimental concept of "arenas" for memory management, which can be used to improve the performance of your Go programs. In this blog post, we'll take a look at:

  • What are arenas
  • How do they work
  • How can you determine if your programs could benefit from using arenas
  • How we used arenas to optimize one of our services

What Are Memory Arenas?#

Go is a programming language that utilizes garbage collection, meaning that the runtime automatically manages memory allocation and deallocation for the programmer. This eliminates the need for manual memory management, but it comes with a cost:

The Go runtime must keep track of every object that is allocated, leading to increased performance overhead.

In certain scenarios, such as when an HTTP server processes requests with large protobuf blobs (which contain many small objects), this can result in the Go runtime spending a significant amount of time tracking each of those individual allocations, and then deallocating them. As a result this also causes signicant performance overhead.

Arenas offer a solution to this problem, by reducing the overhead associated with many smaller allocations. In this protobuf blob example, a large chunk of memory (an arena) can be allocated before parsing enabling all parsed objects to then be placed within the arena and tracked as a collective unit.

Once parsing is completed, the entire arena can be freed at once, further reducing the overhead of freeing many small objects.

arenas drawio2

· 2 min read
Example banner

What is sandwich view?#

Sandwich view is a mode of viewing flamegraphs popularized by Jamie Wong in the Speedscope project It's function is relatively simple -- the typical flamegraph will break down resource utilization by function, but it can be difficult to see how much time is spent in the function itself vs how much time is spent in the functions it calls. Sandwich view solves this problem by splitting a flamegraph into two sections:

  • callers: the functions that called the function in question (it's "parents")
  • callees: the functions that the function in question called (it's "children")

Finding performance issues with standard Flamegraph mode#

A typical use case for leveraging flamegraphs is to identify opportunities for optimization. With a typical flamegraph the most common workflow is to identify a function node which has the largest width and then to look at the functions it calls to see if there are any low hanging fruit for optimization. For example, in the flamegraph below, we can see that the rideshare/car.OrderCar is the largest function in terms of width and thus a good place to start looking for opportunities for optimization.

Frame width represents CPU time per function
Pyroscope

Finding performance issues with sandwich view Flamegraph mode#

However, you'll also notice that Time.Since() shows up frequently towards the end of almost every path.

Example banner

Sandwich view helps you focus in on functions like this to analyze your application and determine if it's easier to optimize:

  • Time.Since(): a node with a shorter width that gets called frequently across many code paths discovered with sandwich view
  • rideshare/car.OrderCar: a node with a longer width, that gets called infrequently in a single code path discovered with standard flamegraph view

How to use sandwich view for your flamegraphs#

If you want to try it out simply go to your Pyroscope UI or upload a flamegraph to flamegraph.com and select the "sandwich" view icon in the new Flamegraph toolbar:

Example banner

then select a function to see it's callers and callees. We have many more view modes planned for the future, so stay tuned or let us know what you'd like to see!

· 6 min read

What is eBPF?#

At its root, eBPF takes advantage of the kernel’s privileged ability to oversee and control the entire system. With eBPF you can run sandboxed programs in a privileged context such as the operating system kernel. To better understand the implications and learn more check out this blog post which goes into much more detail. For profiling this typically means running a program that pulls stacktraces for the whole system at a constant rate (e.g 100Hz).

image

As you can see in the diagram, some of the most popular use cases for eBPF are related to Networking, Security, and most relevant to this blog post — observability (logs, metrics, traces, and profiles).

Landscape of eBPF profiling#

Over the past few years there has been significant growth in the profiling space as well as the eBPF space and there are a few notable companies and open source projects innovating at the intersection of profiling and eBPF. Some examples include:

The collective growth is representative of the rapidly growing interest in this space as projects like Pyroscope, Pixie, and Parca all gained a significant amount of traction over this time period.

It's also worth noting that the growth of profiling is not limited to eBPF, the prevalence of profiling tools has grown to the point where it is now possible to find a tool for almost any language or runtime. As a result, profiling is more frequently being considered a first-class citizen in observability suites.

For example, OpenTelemetry has kicked off efforts to standardize profiling in order to enable more effective observability. For more information on those efforts check out the #otel-profiling channel on the CNCF slack!

Pros and cons of eBPF and non-eBPF profiling#

When it comes to modern continuous profiling, there are two ways of getting profiling data:

  • User-space level: Popular profilers like pprof, async-profiler, rbspy, py-spy, pprof-rs, dotnet-trace, etc. operate at this level
  • Kernel level: eBPF profilers and linux perf are able to get stacktraces for the whole system from the kernel

Pyroscope is designed to be language agnostic and supports ingesting profiles originating from either or both of these methods.

However, each approach comes with its own set of pros and cons:

Pros and Cons of
native-language profiling

Pros

  • Ability tag application code in flexible way (i.e, tagging spans, controllers, functions)
  • Ability to profile various specific parts of code (i.e. Lambda functions, Test suites, scripts)
  • Ability/simplicity to profile other types of data (i.e. Memory profiling, goroutines)
  • Consistency of access to symbols across all languages
  • Simplicity of using in local development

Cons

  • Complexity, for large multi-language systems, to get fleet-wide view
  • Constraints on ability to auto-tag infrastructure metadata (i.e. kubernetes)

Pros and Cons of
eBPF profiling

Pros

  • Ability to get fleet-wide, whole-system metrics easily
  • Ability to auto-tag metadata that's available when profiling whole system (i.e. kubernetes pods, namespaces)
  • Simplicity of adding profiling at infrastructure level (i.e. multi-language systems)
  • Consistency of access to symbols across all languages
  • Simplicity of using in local development

Cons

  • Requirements call for particular linux kernel versions
  • Constraints on being able to tag user-level code
  • Constraints on performant ways to retrieve certain profile types (i.e. memory, goroutines)
  • Difficulty of developing locally for developers

Pyroscope's solution: Merge eBPF profiling and native-language profiling#

We believe that there's benefits to both eBPF and native-language profiling and our focus long term is to integrate them together seamlessly in Pyroscope. The cons of eBPF profiling are the pros of native-language profiling and vice versa. As a result, the best way to get the most value out of profiling itself is to actually combine the two.

Profiling compiled languages (Golang, Java, C++, etc.)#

When profiling compiled languages, like Golang, the eBPF profiler is able to get very similar information to the non-eBPF profiler.

Frame width represents CPU time per function
Pyroscope

Profiling interpreted languages (Ruby, Python, etc.)#

With interpreted languages like Ruby or Python, stacktraces in their runtimes are not easily accessible from the kernel. As a result, the eBPF profiler is not able to parse user-space stack traces for interpreted languages. You can see how the kernel interprets stack traces of compiled languages (go) vs how the kernel interprets stack traces from interpreted languages (ruby/python) in the examples below.

Frame width represents CPU time per function
Pyroscope

How to use eBPF for cluster level profiling#

Using Pyroscope's auto-tagging feature in the eBPF integration you can get a breakdown of cpu usage by kubernetes metadata. In this case, we can see which namespace is consuming the most cpu resources for our demo instance after adding Pyroscope with two lines of code:

# Add Pyroscope eBPF integration to your kubernetes clusterhelm repo add pyroscope-io https://pyroscope-io.github.io/helm-charthelm install pyroscope-ebpf pyroscope-io/pyroscope-ebpf

image

and you can also see the flamegraph representing CPU utilization for the entire cluster: image

Internally, we use a variety of integrations to get both a high level overview of what's going on in our cluster, but also a very detailed view for each runtime that we use:

  • We use our eBPF integration for our kubernetes cluster
  • We use ruby gem, pip package, go client, and java client with tags for our k8s services and github action test suites
  • We us our otel-profiling integrations (go, java) to get span-specific profiles inside our traces
  • We use our lambda extension to profile the code inside lambda functions

The next evolution: merging kernel and user-space profiling#

With the help of our community we've charted out several promising paths to improving our integrations by merging the eBPF and user-space profiles within the integration. One of the most promising approaches is using:

  • Non-eBPF language-specific integrations for more granular control and analytic capabilities (i.e. dynamic tags and labels)
  • eBPF integration for a comprehensive view of the whole cluster

Screen Shot 2022-09-27 at 8 49 41 PM

Stay tuned for more progress on these efforts. In them meantime check out the docs to get started with eBPF or the other integrations!

· 7 min read

floating_cloud_01

We started Pyroscope a few years ago, because we had seen, first-hand, how profiling was a powerful tool for improving performance, but at the time profiling was also not very user-friendly. Since then, we've been working hard not only to make Pyroscope easier to use, but also to make it easier to get value out of it.

As it stands today, Pyroscope has evolved to support an increasingly wide array of our community's day-to-day workflows by adding a valuable extra dimension to their observability stack:

Application Developers

  • Resolve spikes / increases in CPU usage
  • Locate and fix memory leaks and memory errors
  • Understand call trees of your applications
  • Clean up unused / dead code

SREs

  • Create performance-driven culture in dev cycle
  • Spot performance regressions in codebase proactively
  • Configure monitoring / alerts for infrastructure
  • Optimize test suites and team productivity

Engineering Managers

  • Get real-time analysis of resource utilization
  • Make efficient cost allocations for resources
  • Use insights for better decision making

Why we built a cloud service#

As our community has grown to include this diverse set of companies, users, and use-cases, we've had more people express interest in getting all the value from using Pyroscope, but without some of the costs that come with maintaining and scaling open source software. Some of the other reasons we decided to build a cloud service include:

  • Companies who have less time/resources to dedicate to setting up Pyroscope
  • Companies operating at scale who need an optimized solution that can handle the volume of data that is produced by profiling applications at scale
  • Users who are less technical and want a solution that's easy to use and requires little to no configuration
  • Users who want access to the latest features and bug fixes as soon as they are released (with zero downtime)
  • Users who want additional access to the Pyroscope team's profiling expertise and support (past our community Slack and GitHub)

And from our side, we believe that a cloud product will:

  • Making it easier for more companies to adopt Pyroscope
  • Providing more feedback to help prioritize features on our road map
  • Providing more resources to invest in Pyroscope's open source projects
  • Making it easier to offer integrations with other tools in the observability stack (e.g. Grafana, Honeycomb, Gitlab, Github, etc.)

Plus, we got to solve a lot of really cool challenges along the way!

Introducing Pyroscope Cloud#

Today we are excited to announce the general availability of Pyroscope Cloud, our hosted version of Pyroscope!

Pyroscope Cloud enables you to achieve your observability goals by removing concerns around setup, configuration, and scaling. It's designed to be easy to use and gives you a significant amount of insight into your application's performance with very minimal configuration.

Some notable features for the cloud include:

  • Horizontal scalability
  • Support for high-cardinality profiling data
  • Zero-downtime upgrades
  • Data encryption at rest and in transit
  • Compliance with SOC 2
  • Extra support options beyond public Slack / Github
  • Tracing integrations (Honeycomb and Jaeger)

Pyroscope Cloud's Major Scaling Improvements#

Similar to Pyroscope Open Source Software(OSS), the cloud service is designed to store, query, and analyze profiling data as efficiently as possible. However, certain limitations that fundamentally limit the scalability of Pyroscope OSS (for now) have been removed in Pyroscope Cloud.

When running Pyroscope OSS at scale, eventually people run into the limitations of the Open Source storage engine. It is built around badgerDB, which is an embeddable key-value database written in Go. The reliance on this component makes the OSS version of Pyroscope scale vertically but not horizontally.

In the cloud, we replace BadgerDB with a distributed key-value store which allows more freedom to scale Pyroscope horizontally. We leverage many of the techniques used by Honeycomb and many Grafana projects (i.e. Loki, Tempo) but with particular adjustments made for the unique requirements of profiling data (stay tuned for future blog post on this).

This means that with Pyroscope Cloud you don't need to worry about limiting the number of applications, profiles, and tag cardinality that you need in order to get the most out of Pyroscope!

Pyroscope Cloud's Major Features#

We've built Pyroscope Cloud with several different use cases in mind:

Continuous profiling for system-wide visibility#

This feature used for profiling your applications across various environments. Most agents have support for tags which will let you differentiate between different environments (e.g. staging vs production) and other metadata (e.g. pod, namespace, region, version, git commit, pr, etc.). Using single, comparison, or diff view in combination with these tags will let you easily understand and debug performance issues.

Tag Explorer View

  • Which tags are consuming the most cpu and memory resources?
  • How did the performance of my application change between versions?
  • What is our most/least frequently used code paths?
  • Which libraries are consuming the most resources?
  • Where are memory leaks originating?
  • etc.

Adhoc profiling for deep-dive debugging#

This feature is used for when you may want to profile something in a more adhoc manner. Most commonly people use this feature to either upload previously recorded profiles or save a profile for a particular script. Where many used to save profiles to a random folder on their computer, they can use our adhoc page to store them more efficiently, share them with others, and view them with the same powerful UI that they use for continuous profiling.

ebpf adhoc diff

Tracing exemplars for transaction-level visibility#

This feature is used for when you want to correlate profiling data with tracing data. While traces often will tell you where your application is running slow, profiling gives more granular detail into why and what particular lines of code are responsible for the performance issues. This view gives you a heatmap of span durations. We also have integrations with a few popular tracing tools:

Tracing exemplars

Profile upload API for automated workflows and migrations#

Over time we've found that some of the major companies in various sectors have built their own internal profiling systems which often will ultimately dump a large collection of profiles into some storage system like S3.

Pyroscope Cloud's API is built to accept many popular formats for profiling data and then store them in a way that is optimized for querying and analysis. This means that you can redirect your existing profiling data to Pyroscope Cloud and then use the same UI that you use for continuous profiling to analyze it.

# First get API_TOKEN from https://pyroscope.cloud/settings/api-keys
# Ingest profiles in pprof formatcurl \  -H "Authorization: Bearer $API_TOKEN" \  --data-binary @cpuprofile.pb.gz \  "https://pyroscope.cloud/ingest?format=pprof&from=1683809959&until=1683809969&name=my-app-name-pprof"

How to get started with Pyroscope Cloud#

Migrating from Pyroscope OSS#

In order to migrate from Pyroscope OSS to Pyroscope Cloud, you can use our remote write feature to send your data to Pyroscope Cloud. This will allow you to continue using Pyroscope OSS while you migrate your data to Pyroscope Cloud. remote_write_diagram_01_optimized

You can also get started directly with Pyroscope Cloud, by signing up for a free account at pyroscope.cloud.

What's Next for Pyroscope Cloud#

  • CI/CD Integrations (GitHub, GitLab, CircleCI, etc.): We've heard many using Pyroscope to profiling their testing suite and we have plans (link) for a UI specifically geared towards analyzing this data
  • More integrations (Tempo, PagerDuty, etc.)
  • More features (Alerting, etc.)
  • More documentation (Tutorials, etc.)

· 4 min read

Profile AWS Lambda Functions

What is AWS Lambda?#

AWS lambda is a popular serverless computing service that allows you to write code in any language and run it on AWS. In this case "serverless" means that rather than having to manage a server or set of servers, you can instead run your code on-demand on highly-available machines in the cloud.

Lambda manages your "serverless" infrastructure for you including:

  • Server maintenance
  • Automatic scaling
  • Capacity provisioning
  • and more

AWS Lambda functions are a "black box"#

However, the tradeoff that happens as a result of using AWS Lambda is that because AWS handles so much of the infrastructure and management for you, it ends up being somewhat of a "black box" with regards to:

  • Cost: You have little insight into why your Lambda function is costs so much or what functions are responsible
  • Performance: you often run into hard-to-debug latency or memory issues when running your Lambda function
  • Reliability: You have little insight into why your Lambda function is failing as much as its failing

Depending on availability of resources, these issues can be balloon over time until they become an expensive foundation which is hard to analyze and fix post-facto once much of your infrastructure relies on these functions.

Continuous Profiling for Lambda: A window into the serverless "black box" problem#

Continuous Profiling is a method of analyzing the performance of an application giving you a breakdown of which lines of code are consuming the most CPU or memory resources and how much of each resource is being consumed. Since, by definition, a Lambda function is a collection of many lines of code which consume resources (and incur costs) on demand, it makes sense that profiling is the perfect tool to use to understand how you can optimize your Lambda functions and allocate resources to them.

While you can already use our various language-specific integrations to profile your Lambda functions, with the naive approach, adding Pyroscope will add extra overhead to the critical path due to how the Lambda Execution Lifecycle works: image

However, we've introduced a more optimal solution which gives you insight into the Lambda "black box" without adding extra overhead to the critical path of your Lambda Function: our Pyroscope AWS Lambda Extension.

Pyroscope Lambda extension adds profiling support without impacting critical path performance#

lambda_image_04-01

This solution makes use of the extension to delegate profiling-related tasks to an asynchronous path which allows for the critical path to continue to run while the profiling related activities are being performed in the background. You can then use the Pyroscope UI to dive deeper into the various profiles and make the necessary changes to optimize your Lambda function!

How to add Pyroscope's Lambda Extension to your Lambda Function#

Pyroscope's lambda extension works with our various agents and documentation on how to integrate with those can be found in the integrations section of our documentation. Once you've added the agent to your code, there are just two steps needed to get up and running with profiling lambda extension:

Step 1: Add a new Layer in "function settings"#

Add a new layer using the latest "layer name" from our releases page.

lambda_add_a_layer_01-01

Step 2: Add environment variables to configure where to send the profiling data#

You can send data to either either pyroscope cloud or any running pyroscope server. This is configured via environment variables like this: lambda_env_variables_01-01

Lambda Function profile#

Here's an interactive flamegraph output of what you will end up with after you add the extension to your Lambda Function: image

While this flamegraph is exported for the purposes of this blog post in the Pyroscope UI you have additional tools for analyzing profiles such as:

  • Labels: to view function cpu performance or memory over time using FlameQL
  • Time controls: to select and filter for particular time periods of interest
  • Diff view: Compare two profiles and see the differences between them
  • And many more!

· 3 min read

Stop screenshotting Flamegraphs and start embedding them#

Typically a flamegraph is most useful when you're able to click into particular nodes or stack traces to understand the program more deeply. After several blog posts where we featured flamegraphs as a key piece of the posts we found that screenshotting pictures of flamegraphs was missing this key functionality compared to being able to interact with flamegraphs.

As a result, we created flamegraph.com to have a place where users can upload, view, and share flamegraphs.

We recently released an update to flamegraph.com that makes it easy to embed flamegraphs in your blog or website. The steps to embed a flamegraph are:

  1. Upload a flamegraph or flamegraph diff to flamegraph.com
  2. Click the "Embed" button
  3. Copy the "Copy" button to copy the embed code snippet
  4. Paste the embed code snippet into your blog or website

clicking_embed_button_high_res

· 6 min read
warning

Pyroscope is now part of Grafana Labs! As a result we have consolidated efforts around our Grafana plugins to make one recomended way of using Pyroscope. As a result this blogpost is now outdated. However, you can find updated docs on how to add profiling to Grafana with Pyroscope using the Grafana profiling documentation.

Grafana is an open-source observability and monitoring platform used by individuals and organizations to monitor their applications and infrastructures. Grafana leverages the three pillars of observability, metrics, logs, and traces, to deliver insights into how well your systems are doing. Nowadays, Observability involves a whole lot more than metrics, logs, and tracing; it also involves profiling.

In this article, I will:

  1. Describe how to leverage continuous profiling in Grafana by installing the Pyroscope flamegraph panel and datasource plugin
  2. Show how to configure the plugins properly
  3. Explain how to setup your first dashboard that includes profiling
  4. Give a sneak peak of an upcoming feature that will let you link profiles to logs, metrics, and traces

If you're new to flamegraphs and would like to learn more about what they are and how to use them, see this blog post.

Introduction#

pillars-of-observability-complete

Grafana provides you with tools to visualize your metrics, view logs, and analyze traces, but it is incomplete without the added benefits of profiling. Continuous Profiling is super critical when you’re looking to debug an existing performance issue in your application. It enables you to monitor your application’s performance over time and provides insights into parts of your application that are consuming resources the most. Continuous profiling is used to locate and fix memory leaks, clean up unused code, and understand the call tree of your application. This results in a more efficient application.

Benefits of using Pyroscope in Grafana#

Unified view for complete observability#

Using Pyroscope in Grafana provides you with complete observability without leaving your Grafana dashboard. Grafana leverages the powerful features of Pyroscope to take complete control of your application’s end-to-end observability and makes things like debugging easy. You can now see your profiles alongside corresponding logs, metrics, and traces to tell the complete story of your application.

Zero migration cost#

It costs nothing to migrate your application profile from Pyroscope’s UI dashboard into Grafana. Simply open Grafana and install both the Pyroscope panel and datasource plugin, and you’re all set!

left-right: flamegraph in Pyroscope, flamegraph in Grafana

· 6 min read

Introduction#

In this article, I will introduce you to flamegraphs, how to use them, how to read them, what makes them unique, and their use cases.

What is a flamegraph#

A flamegraph is a complete visualization of hierarchical data (e.g stack traces, file system contents, etc) with a metric, typically resource usage, attached to the data. Flamegraphs were originally invented by Brendan Gregg. He was inspired by the inability to view, read, and understand stack traces using the regular profilers to debug performance issues. The flamegraph was created to fix this exact problem.

Flamegraphs allow you to view a call stack in a much more visual way. They give you insight into your code performance and allow you to debug efficiently by drilling down to the origin of a bug, thereby increasing the performance of your application.

· 5 min read

Coming from a background working as a frontend developer at Grafana I'm no stranger to open source performance monitoring. I was part of a team that was responsible for the overall user experience of Grafana and performance was one of the key considerations. Along the line, I learned about a debugging technique known as profiling for monitoring application performance and fell in love ever since.

chrome browser profiler

What is continuous profiling#

“Profiling” is a dynamic method of analyzing the complexity of a program, such as CPU utilization or the frequency and duration of function calls. With profiling, you can locate exactly which parts of your application are consuming the most resources. “Continuous profiling” is a more powerful version of profiling that adds the dimension of time. By understanding your system's resources over time, you can then locate, debug, and fix issues related to performance.

As a frontend developer, my experience with profiling was limited to the browser. However, in the course of my study, I discovered a new pattern of profiling that seems exciting– continuous profiling. Similar to how you use the profiler in the dev console to understand frontend performance issues, continuous profiling allows you to profile servers from various languages 24/7 and be able to understand resource usage at any particular time.

While continuous profiling is new to many, the concept is actually relatively old. In 2010, Google released a paper titled “Google Wide profiling: A continuous profiling infrastructure for data centers” where they make the case for the value of adding continuous profiling to your applications.

Industry traction for continuous profiling#

Since then, many major performance monitoring solutions have joined them in releasing continuous profiling products. As time has gone on, the continuous profiling space has been getting increasingly popular as various companies/VCs are more frequently making major investments in Continuous Profiling to keep up with demand for this type of monitoring.

continuous profiling trends