Continuous Profile-Guided Optimization (PGO) platform based on Cloud Profiler

Hi!

I have been evaluating multiple approaches to applying Profile-Guided Optimization (PGO) across the IT ecosystem (if you are interested, you can check the results at https://github.com/zamazan4ik/awesome-pgo/). According to my tests, PGO improves performance for many existing applications.

There are two major kinds of PGO: Instrumentation and Sampling (you can read about both in the Clang compiler documentation: https://clang.llvm.org/docs/UsersManual.html#differences-between-sampling-and-instrumentation). Although Instrumentation-based PGO is much better known in the community (in my experience), it has major drawbacks. One of the biggest is the performance overhead during the profiling phase.

To resolve this issue, Google invented Sampling PGO (sometimes also called AutoFDO: https://github.com/google/autofdo). This approach uses `perf`-based profiling to collect PGO profiles directly from the production environment and then uses them during the optimization phase. More details about the whole AutoFDO ecosystem at Google can be found in their paper: https://research.google/pubs/pub45290/. I think you know much more about that than I do 🙂
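For readers less familiar with the two workflows, here is a minimal sketch that only builds the command lines for each one and does not run anything. The Clang, `llvm-profdata`, `perf`, and `create_llvm_prof` flags follow the Clang users manual and the AutoFDO README as far as I know them. The file names (`code.cc`, `code`, `perf.data`) and the helper function names are purely illustrative assumptions, and branch sampling with `perf record -b` needs hardware support.

```python
import shlex
from typing import List

Command = List[str]


def instrumentation_pgo_commands(source: str, binary: str) -> List[Command]:
    """Command lines for instrumentation-based PGO with Clang (illustrative)."""
    profdata = f"{binary}.profdata"
    return [
        # 1. Build an instrumented binary; this is the build that carries
        #    the runtime overhead during profiling.
        ["clang++", "-O2", "-fprofile-instr-generate", source, "-o", binary],
        # 2. Run a representative workload; by default the run writes
        #    default.profraw into the working directory.
        [f"./{binary}"],
        # 3. Merge the raw profile into an indexed profile.
        ["llvm-profdata", "merge", f"-output={profdata}", "default.profraw"],
        # 4. Rebuild using the collected profile.
        ["clang++", "-O2", f"-fprofile-instr-use={profdata}", source, "-o", binary],
    ]


def sampling_pgo_commands(source: str, binary: str) -> List[Command]:
    """Command lines for sampling PGO (AutoFDO-style) with Clang (illustrative)."""
    profile = f"{binary}.prof"
    return [
        # 1. Build a normal optimized binary with line-table debug info.
        ["clang++", "-O2", "-gline-tables-only", source, "-o", binary],
        # 2. Sample the running binary with perf, recording branch stacks.
        ["perf", "record", "-b", "-o", "perf.data", f"./{binary}"],
        # 3. Convert the perf data into a profile that Clang can read.
        ["create_llvm_prof", f"--binary=./{binary}", "--profile=perf.data",
         f"--out={profile}"],
        # 4. Rebuild using the sampled profile.
        ["clang++", "-O2", "-gline-tables-only",
         f"-fprofile-sample-use={profile}", source, "-o", binary],
    ]


if __name__ == "__main__":
    for title, commands in [
        ("Instrumentation PGO", instrumentation_pgo_commands("code.cc", "code")),
        ("Sampling PGO", sampling_pgo_commands("code.cc", "code")),
    ]:
        print(f"# {title}")
        for command in commands:
            print(shlex.join(command))
```

The practical difference is in step 2: the instrumentation workflow has to run a specially built binary, while the sampling workflow profiles the regular optimized build, which is why it can collect profiles from production.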

My idea is to build the same kind of system that Google has internally, but based on Cloud Profiler and available to GCP customers. It could be an interesting opportunity for Google to offer a unique solution in the continuous optimization area.

I created issues for the same feature in other projects: Grafana Pyroscope (https://github.com/grafana/pyroscope/discussions/2783) and Elastic Universal Profiling (https://github.com/elastic/elasticsearch/issues/105802).

Would be great to hear thoughts about the idea from the Cloud Profiler devs.

Thank you.

3 REPLIES

Hi @zamazan4ik 

Welcome to Google Cloud Community!

Cloud Profiler, offered by Google Cloud, is a tool designed to continuously monitor the performance of your applications in production environments. It is a valuable tool for gaining continuous insights into a production application's performance without introducing significant overhead. While traditional profilers might offer more in-depth profiling capabilities, Cloud Profiler's focus on continuous monitoring and ease of use in production environments makes it a compelling choice for many applications. Here is how Cloud Profiler differs from other profilers:

Continuous monitoring -- Unlike traditional profilers that capture data at specific points in time, Cloud Profiler gathers information constantly without significantly impacting application performance (low-overhead). This helps identify performance issues that might be fleeting or only occur under certain conditions.

Low Overhead -- Unlike some profilers that can slow down an application, Cloud Profiler is designed to be statistically light. It gathers data by sampling the application's activity at regular intervals, keeping the performance impact minimal.

Cloud-based analysis -- The data collected by Cloud Profiler is stored and analyzed in the Google Cloud console. This offers a centralized view of the application's performance and facilitates easy access to profiling information.

Statistical profiling -- Cloud Profiler gathers statistical data on CPU usage and memory allocation. This provides insights into overall application behavior and helps pinpoint areas consuming the most resources.

I hope this information is helpful. You can find additional information about Cloud Profiler here.

If you need further assistance, you can always file a case with our support team.

I have already read all the documentation about Cloud Profiler, and I understand what the project is capable of today. I suggest extending Cloud Profiler's functionality so that its infrastructure can be used for continuous Profile-Guided Optimization (PGO), as is already done inside Google according to the paper linked in the opening post.

Anyway, I created a feature request in the corresponding bug tracker: https://issuetracker.google.com/issues/337900460

Thanks for your feedback! I'm a product manager in Cloud Observability. We are not actively investing in more functionality for Cloud Profiler, so if your needs involve more language coverage, I'd recommend a partner solution such as Grafana.
 
Sorry that this isn't the answer you're hoping for but hopefully this will allow you to make the best possible decision for the future. I'd be happy to connect with you if you'd like to reach out directly.
 
Sincerely,
 
Mary Koes
Product Manager, Google Cloud