The Really Large Guide to Performance Testing
We have all come across performance issues, whether it be a painfully slow eCommerce site during the holiday shopping or simply a page taking so long to load that it times out. What would you do in that case? That’s easy to answer, as a visitor you would probably quickly move on to another platform offering the same service. However, when it’s your business that loses potential customers because of performance issues, the situation is more complicated.
The numbers around eCommerce revenue and performance issues make a good case for performance testing. Taking an eCommerce platform earning $100,000 a day as an example, a one-second delay could potentially amount to $2.5 million every year in lost sales. Statistics are one thing, but we all know how frustrating it is to use a slow website or application. It’s bad for your user experience and for your business’s revenue. Performance testing can help avoid these kinds of issues by uncovering the root causes behind a lagging performance and suggesting how to fix these issues.
In this guide, we will walk you through everything you need to know about performance testing, from the performance testing types to the tools you need to help you improve your performance.
If you landed on this page because you need help conducting performance tests on your applications, check how spriteCloud can support you.
What is Performance Testing?
It’s always good to first understand the terminology, so we’ll explain what performance testing is. It’s the process of determining how a system behaves and performs under a certain workload.
In this context, the speed, scalability, responsiveness and stability of the system are meticulously examined. The performance of the system’s components is checked by passing different parameters in different load scenarios. Doing so allows us to determine the benchmarks and standards for the application under normal conditions.
Further tests can help to find how an application behaves when it’s put under extreme stress and when it reaches its breaking point. This can be helpful to plan for busy periods. Tests can also be focused on various areas of the back-end to determine where performance issues reside.
For example, in a recent project our performance engineer discovered that the application was running slow because the CPU was spending its efforts on garbage collection rather than running the application. Read the case study or watch the video to see how we supported this client and how we could help you.
What happens when a website is subjected to a flood of visitors, on Black Friday for example? Will it remain stable and fast? When will it reach its breaking point?
Performance tests aim to proactively find out if an application meets the speed, stability and scalability requirements needed to keep your business goals attainable.
Why is it important to conduct performance tests?
We have all had the experience of trying to use a website on a slow internet connection, and the frustration that nearly made us smash our computer or phone. Performance testing helps ensure that your application never makes your visitors feel this frustration and leave in search of a faster competitor.
You might think your website is fast enough. And, worst-case scenario, what are a few seconds of delay? Your visitors certainly disagree.
As a matter of fact, 40% of people abandon a website that takes more than three seconds to load, and the abandonment rate reaches a whopping 90% after five seconds. When it comes to user experience (UX), speed is a crucial factor. Irrespective of the number of concurrent users accessing your application, the device or browser they use, or their connection speed, your application should aim to deliver consistent performance. Conducting performance tests allows you to consistently offer an excellent experience to your customers.
From a marketing perspective, speed also impacts organic search rankings (SEO), since Google now gives weight to page speed through the Core Web Vitals update to its search algorithm. You may not depend on organic traffic for revenue, but it is typically the best converting marketing channel. After all, the best place to keep a secret hidden is on page two of Google search results.
Many companies are also moving away from hosting physical servers on their premises to using cloud services. These usually have a pay-as-you-go pricing structure, in which an application with performance issues requires more space and compute time on the servers, needlessly costing you money. Optimising your performance can therefore have a direct impact on your operating costs.
All in all, whether you are losing customers, have less traffic coming to your website or face high operating costs, it’s clear that performance issues have a negative impact on your bottom line.
To help you make sure you avoid these consequences, continue reading this guide to performance testing for more tips and best practices, and don’t hesitate to get in touch with spriteCloud if you require the assistance of our expert Performance Engineers.
What performance metrics should you monitor?
This won’t come as a surprise: before you start testing, it’s important to decide which measurements will be used to assess your system’s performance and to set performance goals. There are many possible metrics, but the four most important ones are response time, the number of concurrent users, throughput and resource utilisation.
The response time of an application may well be the most important one, as it determines how responsive an application is to a user’s request, i.e. how fast the webpage feels. And as we know, slow response times contribute to a poor user experience and a potential loss of revenue.
How is this measured? It’s pretty straightforward: the response time is the time it takes the application to process a user’s request and send the response back to them.
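As a rough illustration of that definition, the sketch below times a single call from start to finish. The `fake_request` function is a hypothetical stand-in for a real request to your application, and `timed_call` is a helper name introduced here for illustration only.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(handler: Callable[[], Any]) -> Tuple[Any, float]:
    """Run handler once and return (result, elapsed time in seconds)."""
    start = time.perf_counter()
    result = handler()
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_request() -> str:
    # Hypothetical stand-in: a real test would send a request to the application here.
    time.sleep(0.05)
    return "200 OK"


response, seconds = timed_call(fake_request)
print(f"{response} in {seconds * 1000:.1f} ms")
```

In a real test the timer would wrap the full round trip as seen by the client, so network latency is included in the measurement.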
Concurrent users vs. simultaneous users
The number of concurrent users is another crucial metric in performance testing. It is used to determine how much load an application can take before it becomes unresponsive and its performance plummets. A common misconception is to see concurrent users as simultaneous users. Concurrent users are not all using an application at the exact same time and requesting the same thing. They access an application and request different things within a certain time frame.
Let’s look at an example. On an eCommerce website, one customer might be browsing through the books of an online shop. Three other customers forgot their password and are requesting a new one. Another one could be creating an account. All these user requests use different resources of your application during a certain time interval. However, the only ones who could be considered simultaneous users are the ones who are requesting a new password, and only if they do so at the same time. Concurrent users are therefore a more realistic model of real-world conditions.
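The distinction in the example above can be sketched in a few lines. The request log, user names and timestamps are invented for illustration, and treating "simultaneous" as "same action at exactly the same timestamp" is a simplifying assumption.

```python
from collections import Counter
from typing import List, Tuple

# (seconds since start, user id, action) -- hypothetical request log
Request = Tuple[float, str, str]


def concurrent_users(requests: List[Request], window_start: float, window_end: float) -> int:
    """Count distinct users active within the time window, whatever they are doing."""
    return len({user for t, user, _ in requests if window_start <= t < window_end})


def max_simultaneous_users(requests: List[Request]) -> int:
    """Largest group of requests for the same action at the same instant."""
    counts = Counter((t, action) for t, _, action in requests)
    return max(counts.values(), default=0)


requests = [
    (0.0, "anna", "browse_books"),
    (2.0, "ben", "reset_password"),
    (2.0, "cara", "reset_password"),
    (2.0, "dev", "reset_password"),
    (4.5, "eli", "create_account"),
]

print(concurrent_users(requests, 0, 10))   # 5 concurrent users in the 10-second window
print(max_simultaneous_users(requests))    # 3 simultaneous password resets
```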
When conducting a test, one will also look into the throughput: the number of requests an application can handle in a particular time period. In performance testing jargon, this is often expressed as transactions per second (TPS).
This metric is useful to consider for occasions such as sale periods like Cyber Monday, or the launch of new marketing campaigns, when there may suddenly be many more users on a particular webpage.
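As a worked example, throughput can be computed by dividing the number of completed requests by the length of the measurement period. The function name and the numbers below are illustrative assumptions.

```python
def throughput_tps(completed_requests: int, duration_seconds: float) -> float:
    """Return throughput in transactions per second."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return completed_requests / duration_seconds


# Hypothetical test run: 18,000 requests completed in one minute.
print(throughput_tps(18_000, 60))  # 300.0
```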
Finally, let’s look at resource utilisation. This indicator is rather broad and encompasses the different ways in which a system’s resources are used during a performance test. Simply put, it tells us which resources are busy, and how busy they are, when a performance test is conducted. There are many different metrics that can be used to monitor utilisation. The most common ones are CPU (central processing unit), memory, disk and network utilisation.
The CPU can be visualised as a system’s brain: if it has to deal with too many requests, or if a single task requires all its attention, it can get overwhelmed. CPU utilisation, the percentage of time the CPU spends handling processes, allows you to know if your test is effective and to flag performance issues. Sustained high percentages, usually around 80% or more, signal possible performance problems. It’s also possible to use CPU utilisation as a general indication of how a system is performing after a change has been made.
Memory usage specifies the amount of memory needed to process a request. If this is too high the system may be slowed down. Unusually high numbers suggest that there is a memory leak, which is a red flag for any performance tester. Disk usage corresponds to the time a disk needs to execute a request.
Finally, monitoring network utilisation, the ratio between the current traffic and the maximum traffic, will allow you to find out how busy a network is. If network utilisation exceeds what it should be under normal conditions, the consequences can include low transmission speed, intermittent connections or request delays. Tracking whether the network utilisation is idle, normal or substantial allows you to establish benchmarks and troubleshoot network failures should they happen.
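Two of the utilisation checks described above can be sketched as simple helpers. The function names are hypothetical, the 80% CPU threshold is taken from the rule of thumb mentioned earlier, and the traffic figures are made up for illustration.

```python
def network_utilisation(current_mbps: float, max_mbps: float) -> float:
    """Ratio between current traffic and the maximum traffic the link can carry."""
    if max_mbps <= 0:
        raise ValueError("max_mbps must be positive")
    return current_mbps / max_mbps


def cpu_needs_attention(cpu_percent: float, threshold: float = 80.0) -> bool:
    """Flag CPU utilisation at or above the threshold (80% is a rule of thumb)."""
    return cpu_percent >= threshold


print(network_utilisation(250, 1000))  # 0.25, i.e. the link is 25% utilised
print(cpu_needs_attention(85.0))       # True
print(cpu_needs_attention(40.0))       # False
```

What counts as idle, normal or substantial utilisation depends on the system, so those bands are best derived from your own baseline measurements.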
Depending on the project and business needs other metrics will be included and certain metrics will have a higher priority compared to others.
What are the different types of performance tests?
Now that we have explained what performance testing is, why it’s important, and what key metrics to consider, let’s dive into the different types of performance tests and what their goals are.
Here the idea is to understand how the system behaves under increasing user load, to see how the performance is impacted by this increase, and finally to determine the maximum capacity of the system.
Can the system cope with user requests when there are 100, 1,000 or 100,000 concurrent users? Is the performance consistently reliable, even with a high load? When does the system reach its limits? All questions that load testing can answer.
In this process, we want to simulate real-life conditions. Let’s quickly come back to what we discussed in the section above about concurrent users. They are not necessarily simultaneous users.
In a load test, we want to simulate a realistic situation where many concurrent users are doing different things on a website during a certain time interval. This test should emulate real-world conditions. This is different from having all users doing the same action on a website at the same time (aka simultaneous users).
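A minimal sketch of this idea, using only the Python standard library, is shown below. The action names and weights are invented, and `simulate_action` merely sleeps instead of calling a real application, so this illustrates the shape of a load test rather than replacing a dedicated tool.

```python
import random
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical mix of user actions with relative weights.
ACTIONS = {"browse_books": 6, "reset_password": 3, "create_account": 1}


def simulate_action(action: str) -> Tuple[str, float]:
    """Pretend to perform one user action and return (action, response time)."""
    start = time.perf_counter()
    time.sleep(0.01)  # stand-in for a real request to the application
    return action, time.perf_counter() - start


def run_load_test(users: int, seed: int = 1) -> List[Tuple[str, float]]:
    """Run one action per virtual user, with users doing different things concurrently."""
    rng = random.Random(seed)
    chosen = rng.choices(list(ACTIONS), weights=list(ACTIONS.values()), k=users)
    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(simulate_action, chosen))


results = run_load_test(users=20)
print(Counter(action for action, _ in results))
```

Giving each virtual user a different, weighted action is what separates this from a test where every user hits the same endpoint at the same moment.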
A stress test looks into the system’s performance when it’s subjected to extreme load. It checks if the system remains stable even after it reaches its tipping point. To do so the load is gradually increased until the application becomes unresponsive. Beyond that, a stress test also examines the mechanisms in place when a system fails, such as error management and recovery procedure.
Does this sound too abstract? Have a look at our case study: we recently performed a stress test for a renowned eCommerce platform.
As the name indicates, endurance tests are conducted over a longer period. A continuous load is put on the system and different responses are monitored to detect if there is a loss of performance. For example, memory utilisation is usually tracked to see if there are any leaks. An increase in response time or decrease in throughput can also signal a performance deterioration, ultimately signalling that the system isn’t as robust as expected.
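One simple way to spot such a deterioration is to compare response times at the start and at the end of the run. The sketch below does that with a hypothetical `degradation_ratio` helper and made-up samples. Real endurance analysis would typically look at trends over many more data points.

```python
from statistics import mean
from typing import Sequence


def degradation_ratio(response_times: Sequence[float], window: int) -> float:
    """Mean of the last `window` samples divided by the mean of the first `window`."""
    if window <= 0 or len(response_times) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    return mean(response_times[-window:]) / mean(response_times[:window])


# Hypothetical response times (seconds) sampled over a long run.
samples = [0.20, 0.21, 0.19, 0.20, 0.24, 0.26, 0.27, 0.28, 0.30, 0.31, 0.29, 0.30]
ratio = degradation_ratio(samples, window=4)
print(f"Response times grew by a factor of {ratio:.2f}")
```

A ratio well above 1 suggests the system is slowing down under continuous load and is worth investigating, for example for memory leaks.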
Volume Testing (or Flood Testing)
Here the goal is to test the efficiency of a system regarding data processing. Can it cope with voluminous quantities of data? During a volume test, the amount of data in a database is dramatically increased and the general system behaviour is observed to detect possible bottlenecks. Relevant metrics to track could be the response time of the database, its memory utilisation, as well as proper data storage, potential data loss and data integrity.
This test is undertaken to find out if the application’s number of concurrent users and user requests can be scaled up or down without seeing a deterioration in effectiveness. In a nutshell, this type of test looks into how adaptable a system is to varying needs considering the maximum number of users it can sustain without compromising user experience.
That is particularly useful before an expected increase in traffic, like in the case of a new marketing campaign being launched. It allows the development and operations team to spot when the application reaches its breaking point and degradation occurs on the client-side. Knowing this will enable performance engineers and developers to work on increasing the application’s capacity if it is deemed too low.
While to an untrained eye a spike test might look similar to a stress test, they differ significantly. When performing a spike test, the load increase is sudden and followed by a return to the normal situation. During a stress test, the load increases gradually until the application becomes unresponsive. Accordingly, the goal of a spike test is to find out if the system can cope with such a sudden and significant change of load. Between the spikes, the system’s recovery time is also monitored.
Companies’ systems can face unexpected and sudden higher numbers of users. Maybe this sudden higher traffic is due to a Distributed Denial of Service (DDoS) attack. Knowing how your system behaves under this spike can inform your cybersecurity policy. This is an important reminder that testing for average usage is not sufficient and that looking into a system’s robustness to peak-load is a crucial part of performance testing.
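The difference between the load shapes of a stress test and a spike test can be made concrete by generating the number of virtual users per time step. Both function names and all numbers below are illustrative assumptions.

```python
from typing import List


def stress_profile(start_users: int, step: int, steps: int) -> List[int]:
    """Gradually increasing load, as used in a stress test."""
    return [start_users + step * i for i in range(steps)]


def spike_profile(baseline: int, peak: int, total_steps: int,
                  spike_at: int, spike_length: int) -> List[int]:
    """Normal load with a sudden, short-lived jump, as used in a spike test."""
    return [
        peak if spike_at <= i < spike_at + spike_length else baseline
        for i in range(total_steps)
    ]


print(stress_profile(100, 100, 5))          # [100, 200, 300, 400, 500]
print(spike_profile(100, 1000, 6, 2, 1))    # [100, 100, 1000, 100, 100, 100]
```

In a stress test the ramp continues until the application becomes unresponsive, whereas the spike profile returns to the baseline so that recovery time can be observed.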
Performance testing methodology
Now that we have laid out the different performance tests, let’s see how they are planned and carried out.
Identify the test environment
The first step is to get to know the testing environment: the tools available, the hardware, the software, the database setup and the network configuration. Doing so before starting to test allows testers to know what they are dealing with and ultimately design better tests. It’s also the opportunity to uncover possible future challenges and take these into account in the testing process.
Determine the performance criteria and set performance goals
Depending on the application different performance criteria and goals should be identified. Metrics used for these criteria could include the response time, throughput, resource utilisation etc.
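Once goals are defined, checking measured results against them can be automated. The sketch below assumes goals are expressed as upper limits (for example response time) and lower limits (for example throughput). The metric names and values are hypothetical.

```python
from typing import Dict, List


def unmet_goals(measured: Dict[str, float],
                max_limits: Dict[str, float],
                min_limits: Dict[str, float]) -> List[str]:
    """Return the names of metrics that miss their performance goal."""
    unmet = [name for name, limit in max_limits.items() if measured[name] > limit]
    unmet += [name for name, limit in min_limits.items() if measured[name] < limit]
    return unmet


# Hypothetical goals and results from one test run.
max_limits = {"response_time_s": 1.0, "cpu_percent": 80.0}
min_limits = {"throughput_tps": 300.0}
measured = {"response_time_s": 1.4, "cpu_percent": 72.0, "throughput_tps": 310.0}

print(unmet_goals(measured, max_limits, min_limits))  # ['response_time_s']
```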
Plan and design the test cases
In this step testing scenarios should be developed considering the test data and metrics. User variability should also be kept in mind when designing the test cases and they should reflect the diversity of usage. Planning for a wide range of scenarios will ensure truly comprehensive testing.
Configure the test environment
The test environment and all monitoring instruments should now be set up and adjusted with the idea of replicating the production environment.
Implement the test design
Execute the tests
The moment you have been waiting for is here: it’s time to start testing. In this step, the scripts are executed according to the different workloads that were previously planned.
Analyse, finetune – and start again if needed!
This is the most important step, as the findings will help to fix performance bottlenecks and identify the causes of poor performance. With these reports, the development team can improve the application. Once that is done, it can make sense to retest in order to see if the changes bore fruit. This cycle can be repeated as many times as necessary to ensure the quality requirement is met.
Performance testing tools
There are many tools out there and it’s not always easy to find the right one. We asked our performance testing engineers what their go-to tools are. They gave us a selection of licensed and open-source tools that can help you improve your application’s performance.
Loadrunner enables you to test web applications, as well as ERP software, legacy system applications or web 2.0 technologies. It’s a sophisticated testing tool that gives testers a full overview of a system end-to-end performance and each of its individual components.
Loadrunner is a very versatile tool with many advantages that supports more than 50 protocols. Thanks to the thorough view of an application’s performance it gives, the root causes can be determined and performance issues can be solved quickly. You can perform large scale performance tests involving thousands of simultaneous users and emulate real-life user load with Loadrunner. Cloud-based testing can also be conducted with this tool. However, Loadrunner comes at a (high) price. Its license is based on the number of virtual users and is expensive, making it unsuitable for a more limited budget.
Neoload can be used to perform load and stress tests on web and mobile applications. It’s often praised for being user friendly, having a flexible license model and being cheaper than Loadrunner. Its path maintenance feature, which allows you to redesign scripts, is also particularly appreciated. It supports SAP, Citrix and WebSocket protocols. Cloud testing is possible directly through Neoload without having to purchase an additional license. Besides that, Neoload has an agentless monitoring module, which means that you don’t need to install an agent on a server to measure the server’s resource utilisation. Finally, this tool supports a continuous integration pipeline.
This being said, Neoload also has notable drawbacks. There is no autosave feature and the code editor is not very powerful for scripting. Reporting on Neoload web could also be improved. To host Neoload web on Azure, MongoDB is required.
On the downside, it only supports HTTP/HTML protocols. With a free subscription, only 20 minutes of execution are available. As this tool is web-based, cumulative layout shifts (CLS) often cause clicks to land on unwanted elements.
This very popular open-source alternative to Loadrunner has been built with many sophisticated features. It’s designed especially for load testing, allowing you to measure the performance and load times on an application.
JMeter is a Java application where you can right-click your way through different menus to conduct all the tests a professional performance engineer would need. It can test many different technologies with a large range of protocols, such as Java Objects, SOAP and REST services, FTP, databases with JDBC, as well as web HTTP/HTTPS.
JMeter uses Groovy as a default testing language. As mentioned earlier, this tool is very popular, which means that there are a lot of resources out there if you need help, as well as multiple integrations. Having said that, if you intend to conduct large scale tests, JMeter might not be your best choice as it’s hard to scale up.
Gatling is a stress test tool that uses Scala as a programming language. Being code based, it gives you a lot of flexibility and offers all the benefits of Scala and Java. As Gatling allows you to develop your tests directly using code, it enables you to take a shift-left code approach to performance testing.
When should Gatling be used? If you want to perform load and stress tests without looking into other performance requirements it’s an ideal tool.
Locust is an easy to use load testing tool which enables you to measure response time for websites or other applications.
As Locust is built on an event-based architecture rather than a thread-based one, like JMeter, it uses fewer resources for its tests. This is particularly handy when you have to conduct large scale performance tests. Being able to create your test scenarios using Python is another of the advantages of Locust. However, as it’s relatively new, the number of plugins and additional resources is more limited than for JMeter for example.
How can Calliope Pro help?
Having one go-to tool is the exception rather than the rule. Testers usually use multiple tools and their personal tool repository is likely pretty wide. If that is your case too, you might want to use one tool for a certain application and combine it with another for a more thorough testing exercise. Doing so can fragment your test results and make these harder to follow across the different platforms for the rest of the QA team.
If you find yourself in this situation, give Calliope Pro, our test result dashboard, a try. With Calliope, all your test results are easily accessible by all the development team in one and the same place. Learn more about Calliope and try it out for free.
Performance testing best practices
Ideally, performance testing should be proactive. Rather than being prompted by user complaints or a decline in sales, performance tests should be an integral part of the development cycle, following an agile approach to software development.
Timing and repetition are key
Performance tests should be conducted early in the development process. Leaving these for the final sprint will likely lead to neglecting important performance issues. In software development, the later a bug or issue is found in the development process, the more expensive and time consuming it typically is to fix. Testing often and repeating tests is another crucial aspect. In order to get a comprehensive picture of an application’s functioning, frequent testing is necessary.
What should you test?
As the previous point on starting early might hint at, testing the ‘finished’ product is good but testing individual units and modules early on in the development as well is better. The different systems forming an application, such as databases and servers, for example, should be tested in conjunction as well as separately.
Report averages and pay attention to outliers
Relying on averages only can be misleading. Reporting the average response time without including outliers can distort your test results. Including the 90th percentile or the standard deviation as well will give a more complete picture of the application’s performance.
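The following sketch shows why the average alone can mislead. It uses made-up response times and a nearest-rank percentile, which is one of several common percentile definitions. The `percentile` helper is introduced here for illustration.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples or not 0 < pct <= 100:
        raise ValueError("need samples and 0 < pct <= 100")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in seconds: 17 fast requests and 3 slow ones.
samples = [0.2] * 17 + [2.0] * 3

print(f"mean:  {mean(samples):.2f} s")
print(f"p90:   {percentile(samples, 90):.2f} s")
print(f"stdev: {stdev(samples):.2f} s")
```

Here the mean is about 0.47 seconds, which looks acceptable, while the 90th percentile is 2.0 seconds, revealing that a noticeable share of requests is much slower than the average suggests.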
Performance Testing vs. Performance Engineering, what’s the difference?
Performance issues don’t come from bad luck: they are designed into the software. After their root causes have been uncovered with performance tests, they can be corrected. Performance testing and performance engineering are closely related, and certain practices of performance engineering overlap with performance testing.
Performance testing aims to detect speed, scalability, responsiveness and stability issues that undermine performance under a particular workload. These issues are reported to the rest of the development team so that they can be corrected. Traditionally this used to happen at a later stage of the development process, after functional testing had been performed. With the move towards shift-left and shift-right testing, performance testing gets integrated into the development process, and in a way gets closer to performance engineering.
Performance engineering goes one step further: besides identifying performance issues, it also fixes them. What are typical tasks performance engineers undertake to improve performance? They can rewrite application logic, perform heap and thread analyses and CPU profiling, and look into load balancing and caching strategies, for example.
Performance engineering is seen as an integral part of the software development life cycle (SDLC). Treating it this way allows performance problems to be spotted early on, when they are easier and cheaper to solve. This ties into the changes introduced by Agile development and the focus on continuous integration as well as continuous delivery (CI/CD). Performance engineers are therefore involved during the entire development process, from the design to the end-user experience.
In a perfect world, all development teams would be Agile and they would have all the resources and funding required to build the perfect product. But clearly, we don’t live in a perfect world.
We live in one where development teams say they work Agile but are actually working Waterfall, where large portions of development are outsourced, making performance engineering impossible, or where hiring full-time software testers and performance engineers is too costly because they aren’t needed consistently.
Fortunately for the latter case, spriteCloud is here to provide the performance engineering and software testing expertise when you need it, and only when you need it.