Performance testing measures how well our Mule application performs under specific conditions. It’s a way to ensure our app can handle real-world demands, from a handful of users to a bustling swarm of requests. This testing helps find bottlenecks, predict capacity, and confirm stability.
We, as MuleSoft Architects and Developers, should do performance testing to ensure our Mule applications meet functional and non-functional requirements under varying load conditions. However, performance testing is often thought of as a single activity, when it's actually a collection of different techniques, each designed to evaluate a specific aspect of an application's behavior. In other words, there isn't one unique performance test but several different types of performance tests that we should run for our Mule apps. The type of performance testing we choose depends on what we want to learn about our app and the conditions it will face in the real world.
Is our app prepared for a sudden traffic surge? Can it run for hours without slowing down? Will it scale as demand grows? Each of these questions points to a different kind of performance test. Understanding these distinctions allows us to tailor our testing approach and uncover valuable insights that a generic performance test might overlook.
In this post, we will dive into the 4 main types of performance testing we should perform on our Mule applications, and look at when to apply each one so that we can make our Mule apps robust, efficient, and ready for any challenge. There are more types of performance testing, but we'll focus on these 4 because, in my opinion, they are the minimum set of tests we should always run for our Mule apps.
1. Load Testing
The goal of Load Testing is to evaluate how our application performs under expected, steady-state load conditions. It simulates expected traffic to see how the app handles typical loads. With a Load Test we can:
- Validate application SLAs, such as throughput, response time, and error rates, under typical user loads.
- Identify bottlenecks in normal operational scenarios.
Example:
- Simulate 100 concurrent users or 1000 requests/minute for a specific duration
Before we run a Load Test, it's important to have some requirements: we need numbers for the (normal) expected throughput and response time of our app so that we can simulate that traffic. Without requirements we can't do proper load testing. Well, we could guess our own metrics as Mule developers/architects, but then how do we really know that the throughput and response time we use for our tests are acceptable?
How do we perform a Load Test?
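In its simplest form, a load test is: send a fixed volume of requests, record the response times, and compare the results against the SLA. Here's a minimal Python sketch of that pass/fail check. The `send_request` function is a hypothetical stub standing in for a real call to the Mule app; it just returns a simulated latency.

```python
import random
import statistics

# Hypothetical stub standing in for a real HTTP call to the Mule app.
# Here it just simulates a response time in milliseconds.
def send_request():
    return random.uniform(20, 80)

def run_load_test(total_requests, sla_p95_ms):
    """Fire a fixed number of requests and check the p95 latency against the SLA."""
    latencies = sorted(send_request() for _ in range(total_requests))
    p95 = latencies[int(0.95 * len(latencies)) - 1]
    return {
        "p95_ms": round(p95, 1),
        "avg_ms": round(statistics.mean(latencies), 1),
        "passed": p95 <= sla_p95_ms,
    }

result = run_load_test(total_requests=1000, sla_p95_ms=100)
print(result["passed"])
```

A real load test would also keep the request rate steady over a defined duration and track the error rate; tools like JMeter and k6 handle that scheduling for us.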
The best way to do Load Testing is to use tools like JMeter, k6, or BlazeMeter to generate traffic and monitor the response of our app. With these tools we can simulate the expected throughput and monitor, among other values, the response time and error rate of our app. If those metrics are within our requirements, our app has passed the Load Test.

2. Stress Testing
The purpose of Stress Testing is to push our application beyond its limits to see how it reacts under extreme pressure. This type of test is useful to identify breaking points, and also to verify the app's behavior under resource exhaustion or failures and how gracefully it handles them. So, with a proper Stress Test for our app, we are after two things:
- Determine the application's maximum capacity (throughput and concurrent requests).
- Verify the application's behaviour under extreme pressure. This means checking that the reliability mechanisms we put in place for our app are working: reconnection strategies, circuit breakers, dead letter queues. It also covers our error handling strategy: we force the app to fail and then check that it does what it's supposed to do when it fails.
How do we do a Stress Test?
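The core idea is a step-up loop: increase the load, measure, and stop when the app exceeds acceptable limits. The Python sketch below illustrates that logic. The `error_rate_at` function is a purely hypothetical capacity model (a real test would measure error rates by driving actual load with one of the tools above); the point is the increase-until-failure iteration.

```python
# Hypothetical capacity model: the error rate stays low until the app is
# pushed past ~400 req/s, then climbs sharply. A real stress test would
# measure this by generating actual load with JMeter or k6.
def error_rate_at(load_rps):
    return 0.01 if load_rps <= 400 else 0.01 + (load_rps - 400) / 1000

def find_breaking_point(start_rps, step_rps, max_error_rate):
    """Increase the load step by step until errors exceed the acceptable limit."""
    load = start_rps
    while error_rate_at(load) <= max_error_rate:
        load += step_rps
    return load  # first load level at which the app "breaks"

print(find_breaking_point(start_rps=100, step_rps=50, max_error_rate=0.05))
```

Each iteration of that loop corresponds to one test run at a given load, which is why stress testing usually means running the same scenario several times.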
Using a performance test tool like the ones mentioned above, we can gradually increase the load until the application fails or its response time and error rates exceed acceptable limits. This type of testing normally requires multiple iterations, running the same tests under different loads.

3. Spike Testing
In a Spike Test, we are trying to understand the application's behaviour when sudden, dramatic increases in load happen. This is slightly different from a Stress Test: here, we're not really pushing the app to its limit, we're testing how it responds when the traffic increases very quickly over a very short period of time (a spike). This test helps us simulate traffic patterns that happen in some common scenarios, for example when we know that most of our consumers will try to access our API at the same time as a result of a previous notification, or during a peak hour.
The goal here is to test how the application handles sudden, large increases in traffic and to verify recovery behavior after the spike subsides.
As a best practice, try to understand the traffic patterns of your app as well as you can. Is your app likely to have spikes or moments during the day when the load can increase dramatically? And if so, how large would that increase be? Answering these questions makes our tests more realistic.
For best results, when simulating the peak we should aim for a realistic one. Simulating a peak under unrealistic conditions won't help us understand whether our app can handle the real pattern of traffic.
Example:
- Simulate a traffic surge from 100 to 1000 requests/second within a short time.
How do we perform a Spike test?
First, define the conditions of the spike we need to generate. For example:
- Baseline Load: 10 requests per second
- Spike Load: 500 requests per second
- Spike Duration: 30 seconds
- Recovery Period: 60 seconds
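Conditions like these translate naturally into a load schedule for a test script to follow: a steady baseline, a sudden jump, and a recovery window. A small Python sketch (the parameter names are mine, not from any specific tool):

```python
def spike_schedule(baseline_rps, spike_rps, spike_start_s, spike_duration_s, total_s):
    """Target request rate for each second of the test: baseline, spike, recovery."""
    schedule = []
    for second in range(total_s):
        in_spike = spike_start_s <= second < spike_start_s + spike_duration_s
        schedule.append(spike_rps if in_spike else baseline_rps)
    return schedule

# Baseline of 10 req/s, a 30 s spike to 500 req/s, then a recovery window.
schedule = spike_schedule(baseline_rps=10, spike_rps=500,
                          spike_start_s=30, spike_duration_s=30, total_s=120)
print(schedule[0], max(schedule), schedule[-1])  # 10 500 10
```

During the recovery period, we watch whether response times and error rates return to their baseline values, which is the real pass/fail criterion of a Spike Test.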
4. Endurance (Soak) Testing
In Endurance Testing, the purpose is to understand the application's behaviour over an extended period under sustained load. It's similar to a Load Test, in the sense that the test is performed under typical load. The difference from a standard Load Test is that here we try to detect issues that may emerge only during prolonged operation, like memory leaks, resource depletion, garbage collector issues, or any other sort of performance degradation.
This test is fundamental to validate application stability and reliability over time. For Mule applications in particular, this type of test is very important, especially for those deployment models in which the runtime plane provides any kind of bursting.
For example, in CH1.0, bursting was controlled by the AWS credits system, which allows an application to use extra resources (CPU) for as long as it has credits. However, when the credits are consumed, the CPU assigned to the app can drop dramatically and affect the app's performance. This is why it's very important, with this type of test, to verify that the normal operation of the app over an extended period of time does not consume all the AWS credits, and that the app is not impacted when that happens.
Example:
- Run 500 requests/minute for 24 hours to check for resource exhaustion or performance degradation.
How do we perform an Endurance Test?
- Monitor system performance while running sustained tests for hours or days.
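One simple check we can automate on top of that monitoring data is a trend test on resource usage. The sketch below uses hypothetical heap samples to flag a possible memory leak; real samples would come from our monitoring tooling (e.g. Anypoint Monitoring or JMX), and a production check would use more samples and a more robust trend estimate.

```python
# Hypothetical heap samples (MB) taken periodically during a long soak run.
# A series that climbs steadily over the whole run suggests a memory leak.
def shows_leak(heap_samples_mb, max_growth_mb):
    """Flag a leak if heap usage grows more than max_growth_mb over the run."""
    return heap_samples_mb[-1] - heap_samples_mb[0] > max_growth_mb

healthy = [512, 530, 518, 525, 520, 515]   # oscillates around a stable baseline
leaking = [512, 560, 610, 655, 700, 748]   # climbs steadily: likely a leak

print(shows_leak(healthy, max_growth_mb=50))  # False
print(shows_leak(leaking, max_growth_mb=50))  # True
```

The same idea applies to other slowly degrading metrics: file handles, thread counts, connection pool usage, or average response time over the duration of the soak.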
Other types of Performance Testing
These 4 types of performance testing are not the only ones. Actually, there are many more that we can do, depending on the different scenarios and conditions we want to test for our apps. Here's a quick list:
- Scalability Testing - Assess how the application scales when additional resources (workers/replicas, cores, memory) are added.
- Failover and Resilience Testing - Test the application's behavior during failures and recovery scenarios.
- Backend Dependency Testing - Assess the performance impact of downstream systems (databases, external APIs) on the Mule application.
- Data Volume Testing - Test the application's ability to handle large payloads or high volumes of data.
- Latency Testing - Measure the impact of network latency on application performance.
- Security Testing - Test the impact of security measures on application performance.