Combining Performance Engineering with Observability 

If you’ve been a performance engineer for many years, you will remember the days of being in the test-analysis trenches armed with only random flashlights (what I call traditional load testing tools’ built-in monitoring). In fact, for each project we typically chose the commercial load testing tool based on how well its monitoring support matched that application’s environment, at least as much as possible. There were always black holes and blind spots, though…

I often reflect on how much easier my performance engineering career would have been with Observability platforms, so I decided to write some blog posts to help performance engineers better leverage them.

In this post, I’ll discuss one of the ways to effectively integrate Observability platforms into performance engineering practices – making your job easier!

Understanding the Goal of Performance Engineering

At its core, the objective of performance engineering is to proactively ensure that applications can scale to handle target workloads by executing realistic load tests and analyzing the results, in order to provide an excellent user experience. Performance engineers need a very wide skill set and deep knowledge in order to design these performance test harnesses, execute the tests, and analyze the results.

For this blog, let’s assume that all the application environments are already instrumented using OpenTelemetry and the data is being sent to the Observability platform. Hooray, no more setting up ANY monitoring using a load tool! Phew.
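To ground that assumption, here’s a minimal sketch of what such instrumentation can look like using the OpenTelemetry Python SDK; the service name and collector endpoint are placeholders for your own environment:

```python
# Minimal OpenTelemetry tracing setup sending spans to a collector via OTLP.
# The service name and endpoint are placeholders for your environment.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

resource = Resource.create({"service.name": "checkout-service"})  # placeholder
provider = TracerProvider(resource=resource)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
```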

Start with a “Floor” Load Test

Begin with a persona load script; I like to refer to this test scenario as the “floor” baseline.

This script doesn’t attempt to mimic real user behavior or follow the logical UI steps of complex business workflows. Its main purpose is to cover the breadth of the application’s surface area by executing the business transactions. As the application evolves, you’ll need to update the script to reflect new functionality. In between each business transaction, insert a fixed think time (e.g., 15 seconds).

Execute a single-user load test using your persona scripts in a stable & quiet testing environment that mimics the production deployment architecture. Run this single-user test in a loop of 10 iterations.
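Here’s a rough sketch of what that floor run could look like in Python; the base URL and transaction endpoints are hypothetical stand-ins for your application’s business transactions:

```python
# Sketch of a "floor" baseline run: one virtual user, 10 iterations,
# each business transaction separated by a fixed 15-second think time.
# The base URL and endpoints below are hypothetical.
import time
import requests

BASE_URL = "https://app.example.com"
THINK_TIME_SECONDS = 15
ITERATIONS = 10

# Breadth-first list of business transactions, not a realistic user journey.
TRANSACTIONS = [
    ("login",        "POST", "/api/login"),
    ("search",       "GET",  "/api/search?q=widgets"),
    ("view_product", "GET",  "/api/products/123"),
    ("checkout",     "POST", "/api/checkout"),
]

for i in range(ITERATIONS):
    for name, method, path in TRANSACTIONS:
        start = time.monotonic()
        response = requests.request(method, BASE_URL + path)
        elapsed_ms = (time.monotonic() - start) * 1000
        print(f"iter={i} transaction={name} status={response.status_code} ms={elapsed_ms:.0f}")
        time.sleep(THINK_TIME_SECONDS)  # fixed think time between transactions
```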

Using OpenTelemetry, add custom tags (attributes) so that you can filter transactions based on attribute values, making it easier to isolate the transactions originating from a load test.
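For example, in Python it might look like this; the load.test.* attribute names are my own convention (not an OpenTelemetry semantic convention), so pick names that suit your platform’s filtering:

```python
# Tagging spans so load-test traffic can be filtered in the Observability
# platform. The load.test.* attribute names are my own convention.
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("checkout") as span:
    span.set_attribute("load.test.run_id", "floor-baseline-2024-06-01")
    span.set_attribute("load.test.scenario", "floor")
    # ... execute the business transaction here ...
```

If you need the tag to follow requests into downstream services, OpenTelemetry Baggage can carry the same values across service boundaries, though you’ll typically need a processor to stamp them onto downstream spans.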

Use Your Observability Platform for Validation

Compare the response times of the transactions reported by the load testing tool with the data provided by the Observability platform. You are validating that you can now depend on your Observability platform for complete analysis. If you run a protocol-level load test, you will be verifying the application response times; if you run a browser test, you will be verifying at the RUM (Real User Monitoring) level.
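Conceptually, the validation is a simple per-transaction comparison; in this sketch the timings and the 10% tolerance are illustrative assumptions:

```python
# Compare per-transaction response times from the load tool against trace
# durations from the Observability platform. Numbers and the 10% tolerance
# are illustrative assumptions.
load_tool_ms = {"login": 210, "search": 480, "view_product": 350, "checkout": 900}
observability_ms = {"login": 205, "search": 470, "view_product": 360, "checkout": 880}

TOLERANCE = 0.10  # flag anything diverging by more than 10%

for txn, tool_ms in load_tool_ms.items():
    obs_ms = observability_ms[txn]
    drift = abs(tool_ms - obs_ms) / tool_ms
    status = "OK" if drift <= TOLERANCE else "INVESTIGATE"
    print(f"{txn}: load_tool={tool_ms}ms observability={obs_ms}ms drift={drift:.1%} {status}")
```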

TIP: If any transaction fails to meet its target response time, halt load testing. There’s no need to run more complex tests involving concurrent users. Using your Observability platform’s data, such as traces and profiling, you should have enough evidence to pinpoint the code-level bottleneck and address it.

Capture and Save Baseline Traces

If the response times match between the load tool and the Observability platform, and those response times are acceptable, then save (bookmark) the traces for future reference, or have some other way to refer back to these full trace timings. These traces will serve as your “floor” baseline for comparison during future performance testing.
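If your platform doesn’t offer bookmarking, one low-tech option is to record the run metadata and trace IDs yourself; the fields and IDs below are illustrative:

```python
# Keep a durable pointer back to the "floor" baseline traces by recording
# run metadata and trace IDs in a small JSON file (e.g., in version control).
# All field values shown are illustrative placeholders.
import json
from datetime import datetime, timezone

baseline = {
    "run_id": "floor-baseline-2024-06-01",
    "captured_at": datetime.now(timezone.utc).isoformat(),
    "environment": "perf-test",
    "traces": {
        "login": "4bf92f3577b34da6a3ce929d0e0e4736",      # placeholder trace IDs
        "checkout": "00f067aa0ba902b7aa0ba902b700f067",
    },
}

with open("floor_baseline.json", "w") as f:
    json.dump(baseline, f, indent=2)
```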

Keep this baseline test script for use in various testing scenarios.

Run Realistic Load Tests

Run more realistic load tests using your performance test harness. These tests should simulate actual user behavior, such as logging in and interacting with the application in a natural way.

Perform a variety of load tests: ramping, soak/longevity, spike/burst, and chaos or seasonal tests, among others. Continue to include the single-user “floor” scenario with these more complex load scenarios.
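If it helps to visualize those scenarios, here’s an illustrative way to express their load shapes as (minute, virtual users) breakpoints; real load tools (k6, JMeter, Gatling, and others) have native ways to define these, and the numbers are arbitrary examples:

```python
# Illustrative load shapes for the scenarios above, expressed as
# (minute, virtual_users) breakpoints. The numbers are arbitrary examples.
SCENARIOS = {
    "ramp":  [(0, 0), (30, 500)],                                   # climb steadily to target load
    "soak":  [(0, 300), (480, 300)],                                # hold steady load for 8 hours
    "spike": [(0, 50), (5, 50), (6, 1000), (16, 1000), (17, 50)],   # sudden burst, then recovery
}

def users_at(scenario: str, minute: float) -> float:
    """Linearly interpolate the virtual-user count at a given minute."""
    points = SCENARIOS[scenario]
    for (t0, u0), (t1, u1) in zip(points, points[1:]):
        if t0 <= minute <= t1:
            return u0 + (u1 - u0) * (minute - t0) / (t1 - t0)
    return points[-1][1]

print(users_at("ramp", 15))  # 250.0, halfway up the ramp
```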

Hot tip: It’s often much more methodical to design a purpose-based load test that answers a specific question, then walk away once you hit start. Don’t watch the test execute in real time like you are watching a sports match, cheering for a certain outcome. Stay unattached to the end result and let it play out – come back to methodically analyze the results only when the test has completed. Also, allowing the test to run to its end shows you how the application environment recovers after load ceases.

Analyze Results with Observability

Once your tests are complete, analyze the results in your Observability platform. 

Look for key insights, such as:

Throughput: Changes in traffic volume (metrics)

Response Times/Latency: Identify when and where performance degrades in the distributed environment (metrics, traces)

Root causes: Investigate and identify the sources of performance degradation (infrastructure metrics, traces, logs)

AI within Observability: Use modern AI features to tell you where the bottleneck is and to recommend a solution

When a business transaction degrades past its acceptable response time, you can inspect a trace from that timeframe and compare it to your “floor” trace; it should now be obvious where the added time is being spent. Performance engineers can celebrate a more efficient path to investigating performance issues.
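In spirit, that comparison is just a per-span diff against the floor trace; the span names and durations below are illustrative:

```python
# Diff per-span durations between the "floor" baseline trace and a degraded
# trace to see where the added time went. Names and durations are illustrative.
floor_spans = {"gateway": 20, "auth": 35, "inventory-db": 40, "checkout": 110}
degraded_spans = {"gateway": 25, "auth": 38, "inventory-db": 640, "checkout": 130}

print(f"{'span':<15}{'floor(ms)':>10}{'now(ms)':>10}{'delta(ms)':>10}")
for span, floor_ms in floor_spans.items():
    now_ms = degraded_spans.get(span, 0)
    print(f"{span:<15}{floor_ms:>10}{now_ms:>10}{now_ms - floor_ms:>10}")
# The culprit jumps out: inventory-db added roughly 600 ms.
```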

Observability platforms provide critical visibility into how distributed applications behave under load. They offer a level of insight that load testing tools alone simply cannot match. By using Observability tools, you can quickly identify the underlying causes of performance issues and address them with confidence.

The Power of Observability in Performance Engineering

The behavior of applications under load can be unpredictable, and that’s where Observability platforms shine. The floodlight of an Observability platform gives us the ultimate advantage of seeing all the distributed relationships, internal and external dependencies, contextual perspectives, root causes, and cascading consequences within the system. No load tool offers this level of visibility, making Observability an essential part of performance engineering.

Conclusion

There are countless ways to creatively combine performance engineering with Observability, and I hope this post has provided some helpful insights. 

For instance, you can drop the same “floor” test script into any environment and run it, even a noisy production environment. Again, by filtering on attributes, you can easily isolate the specific traces for comparison. This can help clarify where time is being spent in different environments. This particular approach doesn’t remove the requirement for Synthetics in production, which comes with historical data, alerts, and so on.

To performance engineers: How are you currently incorporating Observability platforms into your daily workflow? Are you using these platforms to work more efficiently and effectively?