Core Activities of Performance Testing

Performance testing is typically done to help identify bottlenecks in a system, establish a baseline for future testing, support a performance tuning effort, determine compliance with performance goals and requirements, and/or collect other performance-related data to help stakeholders make informed decisions related to the overall quality of the application being tested. In addition, the results from performance testing and analysis can help you to estimate the hardware configuration required to support the application(s) when you “go live” to production operation.

Figure 1.1 Core Performance Testing Activities

The performance testing approach used in this guide consists of the following activities:
  • Activity 1. Identify the Test Environment.  Identify the physical test environment and the production environment as well as the tools and resources available to the test team. The physical environment includes hardware, software, and network configurations. Having a thorough understanding of the entire test environment at the outset enables more efficient test design and planning and helps you identify testing challenges early in the project. In some situations, this process must be revisited periodically throughout the project’s life cycle.
  • Activity 2. Identify Performance Acceptance Criteria.  Identify the response time, throughput, and resource utilization goals and constraints. In general, response time is a user concern, throughput is a business concern, and resource utilization is a system concern. Additionally, identify project success criteria that may not be captured by those goals and constraints; for example, using performance tests to evaluate what combination of configuration settings will result in the most desirable performance characteristics.
  • Activity 3. Plan and Design Tests.  Identify key scenarios, determine variability among representative users and how to simulate that variability, define test data, and establish metrics to be collected. Consolidate this information into one or more models of system usage to be implemented, executed, and analyzed.
  • Activity 4. Configure the Test Environment.  Prepare the test environment, tools, and resources necessary to execute each strategy as features and components become available for test. Ensure that the test environment is instrumented for resource monitoring as necessary.
  • Activity 5. Implement the Test Design.  Develop the performance tests in accordance with the test design.
  • Activity 6. Execute the Test.  Run and monitor your tests. Validate the tests, test data, and results collection. Execute validated tests for analysis while monitoring the test and the test environment.
  • Activity 7. Analyze Results, Report, and Retest.  Consolidate and share results data. Analyze the data both individually and as a cross-functional team. Reprioritize the remaining tests and re-execute them as needed. When all of the metric values are within accepted limits, none of the set thresholds have been violated, and all of the desired information has been collected, you have finished testing that particular scenario on that particular configuration (a minimal sketch of this check follows the list).
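
As a minimal illustration of Activities 2 and 7, the sketch below encodes acceptance criteria as data and checks one run's collected results against them. All metric names and numbers are illustrative assumptions, not values from this guide:

    # Minimal sketch: encode acceptance criteria (Activity 2) and check
    # collected results against them (Activity 7). Names and limits are
    # illustrative assumptions.
    criteria = {
        "avg_response_time_s": 3.0,    # maximum acceptable values
        "p95_response_time_s": 5.0,
        "cpu_utilization_pct": 75.0,
    }

    results = {                        # metrics collected from one test run
        "avg_response_time_s": 2.4,
        "p95_response_time_s": 5.6,
        "cpu_utilization_pct": 68.0,
    }

    violations = {m: v for m, v in results.items() if v > criteria[m]}
    if violations:
        print("Thresholds violated, retest after tuning:", violations)
    else:
        print("All metrics within accepted limits for this scenario.")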

Why Do Performance Testing?

At the highest level, performance testing is almost always conducted to address one or more risks related to expense, opportunity costs, continuity, and/or corporate reputation. Some more specific reasons for conducting performance testing include:
  • Assessing release readiness by:
    • Enabling you to predict or estimate the performance characteristics of an application in production and evaluate whether or not to address performance concerns based on those predictions. These predictions are also valuable to the stakeholders who make decisions about whether an application is ready for release or capable of handling future growth, or whether it requires a performance improvement/hardware upgrade prior to release.
    • Providing data indicating the likelihood of user dissatisfaction with the performance characteristics of the system.
    • Providing data to aid in the prediction of revenue losses or damaged brand credibility due to scalability or stability issues, or due to users being dissatisfied with application response time.
  • Assessing infrastructure adequacy by:
    • Evaluating the adequacy of current capacity.
    • Determining the acceptability of stability.
    • Determining the capacity of the application’s infrastructure, as well as determining the future resources required to deliver acceptable application performance.
    • Comparing different system configurations to determine which works best for both the application and the business.
    • Verifying that the application exhibits the desired performance characteristics, within budgeted resource utilization constraints.
  • Assessing adequacy of developed software performance by:
    • Determining the application’s desired performance characteristics before and after changes to the software.
    • Providing comparisons between the application’s current and desired performance characteristics.
  • Improving the efficiency of performance tuning by:
    • Analyzing the behavior of the application at various load levels.
    • Identifying bottlenecks in the application.
    • Providing information related to the speed, scalability, and stability of a product prior to production release, thus enabling you to make informed decisions about whether and when to tune the system.

Project Context

For a performance testing project to be successful, both the approach to testing performance and the testing itself must be relevant to the context of the project. Without an understanding of the project context, performance testing is bound to focus on only those items that the performance tester or test team assumes to be important, as opposed to those that truly are important, frequently leading to wasted time, frustration, and conflicts.

The project context is nothing more than those things that are, or may become, relevant to achieving project success. This may include, but is not limited to:
  • The overall vision or intent of the project
  • Performance testing objectives
  • Performance success criteria
  • The development life cycle
  • The project schedule
  • The project budget
  • Available tools and environments
  • The skill set of the performance tester and the team
  • The priority of detected performance concerns
  • The business impact of deploying an application that performs poorly

Some examples of items that may be relevant to the performance-testing effort in your project context include:
  • Project vision.  Before beginning performance testing, ensure that you understand the current project vision. The project vision is the foundation for determining what performance testing is necessary and valuable. Revisit the vision regularly, as it has the potential to change as well.
  • Purpose of the system.  Understand the purpose of the application or system you are testing. This will help you identify the highest-priority performance characteristics on which you should focus your testing. You will need to know the system’s intent, the actual hardware and software architecture deployed, and the characteristics of the typical end user.
  • Customer or user expectations.  Keep customer or user expectations in mind when planning performance testing. Remember that customer or user satisfaction is based on expectations, not simply compliance with explicitly stated requirements.
  • Business drivers.  Understand the business drivers – such as business needs or opportunities – that are constrained to some degree by budget, schedule, and/or resources. It is important to meet your business requirements on time and within the available budget.
  • Reasons for testing performance.  Understand the reasons for conducting performance testing very early in the project. Failing to do so might lead to ineffective performance testing. These reasons often go beyond a list of performance acceptance criteria and are bound to change or shift priority as the project progresses, so revisit them regularly as you and your team learn more about the application, its performance, and the customer or user.
  • Value that performance testing brings to the project.  Understand the value that performance testing is expected to bring to the project by translating the project- and business-level objectives into specific, identifiable, and manageable performance testing activities. Coordinate and prioritize these activities to determine which performance testing activities are likely to add value.
  • Project management and staffing.  Understand the team’s organization, operation, and communication techniques in order to conduct performance testing effectively.
  • Process.  Understand your team’s process and interpret how that process applies to performance testing. If the team’s process documentation does not address performance testing directly, extrapolate the document to include performance testing to the best of your ability, and then get the revised document approved by the project manager and/or process engineer.
  • Compliance criteria.  Understand the regulatory requirements related to your project. Obtain compliance documents to ensure that you have the specific language and context of any statement related to testing, as this information is critical to determining compliance tests and ensuring a compliant product. Also understand that the nature of performance testing makes it virtually impossible to follow the same processes that have been developed for functional testing.
  • Project schedule.  Be aware of the project start and end dates, the hardware and environment availability dates, the flow of builds and releases, and any checkpoints and milestones in the project schedule.

The Relationship Between Performance Testing and Tuning

When end-to-end performance testing reveals system or application characteristics that are deemed unacceptable, many teams shift their focus from performance testing to performance tuning, to discover what is necessary to make the application perform acceptably. A team may also shift its focus to tuning when performance criteria have been met but the team wants to reduce the amount of resources being used in order to increase platform headroom, decrease the volume of hardware needed, and/or further improve system performance.

Cooperative Effort

Although tuning is not the direct responsibility of most performance testers, the tuning process is most effective when it is a cooperative effort between all of those concerned with the application or system under test, including:
  • Product vendors
  • Architects
  • Developers
  • Testers
  • Database administrators
  • System administrators
  • Network administrators

Without the cooperation of a cross-functional team, it is almost impossible to gain the system-wide perspective necessary to resolve performance issues effectively or efficiently.

The performance tester, or performance testing team, is a critical component of this cooperative team as tuning typically requires additional monitoring of components, resources, and response times under a variety of load conditions and configurations. Generally speaking, it is the performance tester who has the tools and expertise to provide this information in an efficient manner, making the performance tester the enabler for tuning.

Tuning Process Overview

Tuning follows an iterative process that is usually separate from, but not independent of, the performance testing approach a project is following. The following is a brief overview of a typical tuning process:
  • Tests are conducted with the system or application deployed in a well-defined, controlled test environment in order to ensure that the configuration and test results at the start of the testing process are known and reproducible.
  • When the tests reveal performance characteristics deemed to be unacceptable, the performance testing and tuning team enters a diagnosis and remediation stage (tuning) that will require changes to be applied to the test environment and/or the application. It is not uncommon to make temporary changes that are deliberately designed to magnify an issue for diagnostic purposes, or to change the test environment to see if such changes lead to better performance.
  • The cooperative testing and tuning team is generally given full and exclusive control over the test environment in order to maximize the effectiveness of the tuning phase.
  • Performance tests are executed, or re-executed after each change to the test environment, in order to measure the impact of a remedial change (see the sketch following this list).
  • The tuning process typically involves a rapid sequence of changes and tests. This process can take exponentially more time if a cooperative testing and tuning team is not fully available and dedicated to this effort while in a tuning phase.
  • When a tuning phase is complete, the test environment is generally reset to its initial state, the successful remedial changes are applied again, and any unsuccessful remedial changes (together with temporary instrumentation and diagnostic changes) are discarded. The performance test should then be repeated to prove that the correct changes have been identified. It might also be the case that the test environment itself is changed to reflect new expectations as to the minimal required production environment. This is unusual, but a potential outcome of the tuning effort.
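
A minimal sketch of that measure-change-remeasure loop follows. The helpers run_performance_test, apply, and revert are hypothetical stand-ins for your load-test harness and configuration management, and "lower average response time" is assumed here to be the improvement criterion:

    # Sketch of an iterative tuning pass: try each remedial change in
    # isolation, keep it only if the re-executed test shows improvement.
    # All helper functions are hypothetical stand-ins.
    def tuning_pass(candidate_changes, run_performance_test, apply, revert):
        baseline = run_performance_test()      # known, reproducible starting point
        kept = []
        for change in candidate_changes:
            apply(change)
            result = run_performance_test()    # re-execute after each change
            if result < baseline:              # e.g., avg response time in seconds
                kept.append(change)            # successful remedial change
                baseline = result
            else:
                revert(change)                 # discard unsuccessful change
        return kept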

Performance, Load, and Stress Testing

Performance tests are usually described as belonging to one of the following three categories:
  • Performance testing.  This type of testing determines or validates the speed, scalability, and/or stability characteristics of the system or application under test. Performance is concerned with achieving response times, throughput, and resource-utilization levels that meet the performance objectives for the project or product. In this guide, performance testing represents the superset of all of the other subcategories of performance-related testing.
  • Load testing.  This subcategory of performance testing is focused on determining or validating performance characteristics of the system or application under test when subjected to workloads and load volumes anticipated during production operations. (A minimal load-test script follows this list.)
  • Stress testing.  This subcategory of performance testing is focused on determining or validating performance characteristics of the system or application under test when subjected to conditions beyond those anticipated during production operations. Stress tests may also include tests focused on determining or validating performance characteristics of the system or application under test when subjected to other stressful conditions, such as limited memory, insufficient disk space, or server failure. These tests are designed to determine under what conditions an application will fail, how it will fail, and what indicators can be monitored to warn of an impending failure.
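
One concrete way to subject an application to an anticipated workload is a scriptable load-generation tool. The sketch below uses the open-source Locust tool; the host, paths, weights, and user counts are placeholder assumptions:

    # Minimal Locust load-test sketch; paths and weights are illustrative.
    from locust import HttpUser, task, between

    class TypicalUser(HttpUser):
        wait_time = between(1, 5)       # think time between requests (seconds)

        @task(3)                        # browsing weighted 3x over ordering
        def browse_catalog(self):
            self.client.get("/catalog")

        @task(1)
        def place_order(self):
            self.client.post("/orders", json={"item_id": 42, "qty": 1})

    # Run with a recent Locust version, for example:
    #   locust -f loadtest.py --host https://test.example.com --users 100 --spawn-rate 10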

Baselines

Creating a baseline is the process of running a set of tests to capture performance metric data for the purpose of evaluating the effectiveness of subsequent performance-improving changes to the system or application. A critical aspect of a baseline is that all characteristics and configuration options except those specifically being varied for comparison must remain invariant. Once a part of the system that is not intentionally being varied for comparison to the baseline is changed, the baseline measurement is no longer a valid basis for comparison.

With respect to Web applications, you can use a baseline to determine whether performance is improving or declining and to find deviations across different builds and versions. For example, you could measure load time, the number of transactions processed per unit of time, the number of Web pages served per unit of time, and resource utilization such as memory usage and processor usage. Some considerations about using baselines include:
  • A baseline can be created for a system, component, or application.  A baseline can also be created for different layers of the application, including a database, Web services, and so on.
  • A baseline can set the standard for comparison, to track future optimizations or regressions.  It is important to validate that the baseline results are repeatable, because considerable fluctuations may occur across test results due to environment and workload characteristics. (A comparison sketch follows this list.)
  • Baselines can help identify changes in performance.  Baselines can help product teams identify changes in performance that reflect degradation or optimization over the course of the development life cycle. Identifying these changes in comparison to a well-known state or configuration often makes resolving performance issues simpler.
  • Baseline assets should be reusable.  Baselines are most valuable if they are created by using a set of reusable test assets. It is important that such tests accurately simulate repeatable and actionable workload characteristics.
  • Baselines are metrics.  Baseline results can be articulated by using a broad set of key performance indicators, including response time, processor capacity, memory usage, disk capacity, and network bandwidth.
  • Baselines act as a shared frame of reference.  Sharing baseline results allows your team to build a common store of acquired knowledge about the performance characteristics of an application or component.
  • Avoid over-generalizing your baselines.  If your project entails a major reengineering of the application, you need to reestablish the baseline for testing that application. A baseline is application-specific and is most useful for comparing performance across different versions. Sometimes, subsequent versions of an application are so different that previous baselines are no longer valid for comparisons.
  • Know your application’s behavior.  It is a good idea to ensure that you completely understand the behavior of the application at the time a baseline is created.  Failure to do so before making changes to the system with a focus on optimization objectives is frequently counterproductive.
  • Baselines evolve.  At times you will have to redefine your baseline because of changes that have been made to the system since the time the baseline was initially captured.
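
As a sketch of using a baseline to track optimizations or regressions, the comparison below flags metrics that move more than a tolerance in the "worse" direction. The metric names, values, and 10% tolerance are illustrative assumptions:

    # Sketch: compare a new run against a stored baseline to flag regressions.
    TOLERANCE = 0.10   # assumed: allow 10% variation before flagging

    baseline = {"avg_response_time_s": 2.0, "throughput_rps": 450.0}
    current  = {"avg_response_time_s": 2.6, "throughput_rps": 455.0}

    # Higher is worse for response time; lower is worse for throughput.
    higher_is_worse = {"avg_response_time_s": True, "throughput_rps": False}

    for metric, base in baseline.items():
        change = (current[metric] - base) / base
        regressed = change > TOLERANCE if higher_is_worse[metric] else change < -TOLERANCE
        status = "REGRESSION" if regressed else "ok"
        print(f"{metric}: {base} -> {current[metric]} ({change:+.1%}) {status}")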

Benchmarking

Benchmarking is the process of comparing your system’s performance against a baseline that you have created internally or against an industry standard endorsed by some other organization.

In the case of a Web application, you would run a set of tests that comply with the specifications of an industry benchmark in order to capture the performance metrics necessary to determine your application’s benchmark score. You can then compare your application against other systems or applications that also calculated their score for the same benchmark. You may choose to tune your application performance to achieve or surpass a certain benchmark score. Some considerations about benchmarking include:
  • You need to play by the rules.  A benchmark is achieved by working with industry specifications or by porting an existing implementation to meet such standards. Benchmarking entails identifying all of the necessary components that will run together, the market where the product exists, and the specific metrics to be measured.
  • Because you play by the rules, you can be transparent.  Benchmarking results can be published to the outside world. Since comparisons may be produced by your competitors, you will want to employ a strict set of standard approaches for testing and data to ensure reliable results.
  • You divulge results across various metrics.  Performance metrics may involve load time, number of transactions processed per unit of time, Web pages accessed per unit of time, processor usage, memory usage, search times, and so on.

Terminology

The following definitions are used throughout this guide. Every effort has been made to ensure that these terms and definitions are consistent with formal use and industry standards; however, some of these terms are known to have certain valid alternate definitions and implications in specific industries and organizations. Keep in mind that these definitions are intended to aid communication and are not an attempt to create a universal standard.

Capacity.  The capacity of a system is the total workload it can handle without violating predetermined key performance acceptance criteria.
Capacity test.  A capacity test complements load testing by determining your server’s ultimate failure point, whereas load testing monitors results at various levels of load and traffic patterns. You perform capacity testing in conjunction with capacity planning, which you use to plan for future growth, such as an increased user base or increased volume of data. For example, to accommodate future loads, you need to know how many additional resources (such as processor capacity, memory usage, disk capacity, or network bandwidth) are necessary to support future usage levels. Capacity testing helps you to identify a scaling strategy in order to determine whether you should scale up or scale out.
Component test.  A component test is any performance test that targets an architectural component of the application. Commonly tested components include servers, databases, networks, firewalls, and storage devices.
Endurance test.  An endurance test is a type of performance test focused on determining or validating performance characteristics of the product under test when subjected to workload models and load volumes anticipated during production operations over an extended period of time. Endurance testing is a subset of load testing.
Investigation.  Investigation is an activity based on collecting information related to the speed, scalability, and/or stability characteristics of the product under test that may have value in determining or improving product quality. Investigation is frequently employed to prove or disprove hypotheses regarding the root cause of one or more observed performance issues.
Latency.  Latency is a measure of responsiveness that represents the time it takes to complete the execution of a request. Latency may also represent the sum of several latencies or subtasks.
Metrics.  Metrics are measurements obtained by running performance tests, expressed on a commonly understood scale. Some metrics commonly obtained through performance tests include processor utilization over time and memory usage by load.
Performance.  Performance refers to information regarding your application’s response times, throughput, and resource utilization levels.
Performance test.  A performance test is a technical investigation done to determine or validate the speed, scalability, and/or stability characteristics of the product under test. Performance testing is the superset containing all other subcategories of performance testing described in this chapter.
Performance budgets or allocations.  Performance budgets (or allocations) are constraints placed on developers regarding allowable resource consumption for their component.
Performance goals.  Performance goals are the criteria that your team wants to meet before product release, although these criteria may be negotiable under certain circumstances. For example, if a response time goal of three seconds is set for a particular transaction but the actual response time is 3.3 seconds, it is likely that the stakeholders will choose to release the application and defer performance tuning of that transaction for a future release.
Performance objectives.  Performance objectives are usually specified in terms of response times, throughput (transactions per second), and resource-utilization levels, and typically focus on metrics that can be directly related to user satisfaction.
Performance requirements.  Performance requirements are those criteria that are absolutely non-negotiable due to contractual obligations, service level agreements (SLAs), or fixed business needs. Any performance criterion that will not unquestionably lead to a decision to delay a release until the criterion passes is not absolutely required, and therefore not a requirement.
Performance targets.  Performance targets are the desired values for the metrics identified for your project under a particular set of conditions, usually specified in terms of response time, throughput, and resource-utilization levels. Resource-utilization levels include the amount of processor capacity, memory, disk I/O, and network I/O that your application consumes. Performance targets typically equate to project goals.
Performance testing objectives.  Performance testing objectives refer to data collected through the performance-testing process that is anticipated to have value in determining or improving product quality. However, these objectives are not necessarily quantitative or directly related to a performance requirement, goal, or stated quality of service (QoS) specification.
Performance thresholds.  Performance thresholds are the maximum acceptable values for the metrics identified for your project, usually specified in terms of response time, throughput (transactions per second), and resource-utilization levels. Resource-utilization levels include the amount of processor capacity, memory, disk I/O, and network I/O that your application consumes. Performance thresholds typically equate to requirements.
Resource utilization.  Resource utilization is the cost of the project in terms of system resources. The primary resources are processor, memory, disk I/O, and network I/O.
Response time.  Response time is a measure of how responsive an application or subsystem is to a client request.
Saturation.  Saturation refers to the point at which a resource has reached full utilization.
Scalability.  Scalability refers to an application’s ability to handle additional workload, without adversely affecting performance, by adding resources such as processor, memory, and storage capacity.
Scenarios.  In the context of performance testing, a scenario is a sequence of steps in your application. A scenario can represent a use case or a business function such as searching a product catalog, adding an item to a shopping cart, or placing an order.
Smoke test.  A smoke test is the initial run of a performance test to see if your application can perform its operations under a normal load.
Spike test.  A spike test is a type of performance test focused on determining or validating performance characteristics of the product under test when subjected to workload models and load volumes that repeatedly increase beyond anticipated production operations for short periods of time. Spike testing is a subset of stress testing.
Stability.  In the context of performance testing, stability refers to the overall reliability, robustness, functional and data integrity, availability, and/or consistency of responsiveness for your system under a variety of conditions.
Stress test.  A stress test is a type of performance test designed to evaluate an application’s behavior when it is pushed beyond normal or peak load conditions. The goal of stress testing is to reveal application bugs that surface only under high load conditions. These bugs can include such things as synchronization issues, race conditions, and memory leaks. Stress testing enables you to identify your application’s weak points, and shows how the application behaves under extreme load conditions.
Throughput.  Throughput is the number of units of work that can be handled per unit of time; for instance, requests per second, calls per day, hits per second, reports per year, etc.
Unit test.  In the context of performance testing, a unit test is any test that targets a module of code where that module is any logical subset of the entire existing code base of the application, with a focus on performance characteristics. Commonly tested modules include functions, procedures, routines, objects, methods, and classes. Performance unit tests are frequently created and conducted by the developer who wrote the module of code being tested.
Utilization.  In the context of performance testing, utilization is the percentage of time that a resource is busy servicing user requests. The remaining percentage of time is considered idle time.
Validation test.  A validation test compares the speed, scalability, and/or stability characteristics of the product under test against the expectations that have been set or presumed for that product.
Workload.  Workload is the stimulus applied to a system, application, or component to simulate a usage pattern, in regard to concurrency and/or data inputs. The workload includes the total number of users, concurrent active users, data volumes, and transaction volumes, along with the transaction mix. For performance modeling, you associate a workload with an individual scenario.

Summary

Performance testing helps to identify bottlenecks in a system, establish a baseline for future testing, support a performance tuning effort, and determine compliance with performance goals and requirements. Including performance testing very early in your development life cycle tends to add significant value to the project.

For a performance testing project to be successful, the testing must be relevant to the context of the project, which helps you to focus on the items that are truly important.

If the performance characteristics are unacceptable, you will typically want to shift the focus from performance testing to performance tuning in order to make the application perform acceptably. You will likely also focus on tuning if you want to reduce the amount of resources being used and/or further improve system performance.

Performance, load, and stress tests are subcategories of performance testing, each intended for a different purpose.

Creating a baseline against which to evaluate the effectiveness of subsequent performance-improving changes to the system or application will generally increase project efficiency.

Common Performance Problems

Most performance problems revolve around speed, response time, load time, and poor scalability. Speed is often one of the most important attributes of an application. A slow-running application will lose potential users. Performance testing is done to make sure an app runs fast enough to keep a user's attention and interest. Take a look at the following list of common performance problems, and notice how speed is a common factor in many of them:
  • Long load time - Load time is normally the initial time it takes an application to start. This should generally be kept to a minimum. While some applications cannot realistically load in under a minute, load time should be kept under a few seconds if possible.
  • Poor response time - Response time is the time it takes from when a user inputs data into the application until the application outputs a response to that input. Generally, this should be very quick. Again, if a user has to wait too long, they lose interest.
  • Poor scalability - A software product suffers from poor scalability when it cannot handle the expected number of users or when it does not accommodate a wide enough range of users. Load testing should be done to be certain the application can handle the anticipated number of users.
  • Bottlenecking - Bottlenecks are obstructions in a system that degrade overall system performance. Bottlenecking occurs when either coding errors or hardware issues cause a decrease in throughput under certain loads. It is often caused by one faulty section of code; the key to fixing it is to find the section of code that is causing the slowdown and address it there. Bottlenecks are generally fixed either by repairing poorly performing processes or by adding additional hardware. Some common performance bottlenecks are (see the monitoring sketch after this list):
    • CPU utilization
    • Memory utilization
    • Network utilization
    • Operating System limitations
    • Disk usage
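
To help locate such bottlenecks, the resources listed above can be sampled while a load test runs. Below is a minimal sketch using the third-party psutil package; the sampling interval and count are arbitrary assumptions:

    # Sketch: periodically sample CPU, memory, disk, and network while a
    # load test runs. Requires the third-party psutil package.
    import psutil

    def sample(interval_s=5, samples=12):
        for _ in range(samples):
            cpu = psutil.cpu_percent(interval=interval_s)  # % CPU over interval
            mem = psutil.virtual_memory().percent          # % physical memory used
            disk = psutil.disk_usage("/").percent          # % disk capacity used
            net = psutil.net_io_counters()                 # cumulative byte counts
            print(f"cpu={cpu}% mem={mem}% disk={disk}% "
                  f"net_sent={net.bytes_sent} net_recv={net.bytes_recv}")

    if __name__ == "__main__":
        sample()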

Types of performance testing

  • Load testing - checks the application's ability to perform under anticipated user loads. The objective is to identify performance bottlenecks before the software application goes live.
  • Stress testing - involves testing an application under extreme workloads to see how it handles high traffic or data processing. The objective is to identify the breaking point of an application.
  • Endurance testing - is done to make sure the software can handle the expected load over a long period of time.
  • Spike testing - tests the software's reaction to sudden large spikes in the load generated by users.
  • Volume testing - populates a database with a large volume of data and monitors the overall software system's behavior. The objective is to check the software application's performance under varying database volumes.
  • Scalability testing - determines the software application's effectiveness in "scaling up" to support an increase in user load. It helps plan capacity additions to your software system.

Overview of Performance Testing Concepts

Performance Testing:  There are many definitions available, but the one mentioned in the IEEE Glossary is as follows:

“Testing conducted to evaluate the compliance of a system or component with specified performance requirements. Often this is performed using an automated test tool to simulate a large number of users. Also known as ‘load testing’.”

Or

“The testing performed to determine the degree to which a system or component accomplishes its designated functions within given constraints regarding processing time and throughput rate.”

The purpose of the test is to measure characteristics, such as response times, throughput, or the mean time between failures (for reliability testing).

Performance testing tool:
A tool to support performance testing, usually with two main facilities: load generation and test transaction measurement. Load generation can simulate either multiple users or high volumes of input data. During execution, response time measurements are taken from selected transactions and logged. Performance-testing tools normally provide reports based on test logs, and graphs of load against response times.

Features or characteristics of performance-testing tools include support for:
• generating a load on the system to be tested;
• measuring the timing of specific transactions as the load on the system varies;
• measuring average response times;
• producing graphs or charts of responses over time (a minimal sketch of load generation and timing follows this list).
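
As a rough illustration of the first two facilities, the sketch below drives concurrent requests at a placeholder URL using the third-party requests package and reports timing statistics. The worker and request counts are arbitrary assumptions, and errors are not handled:

    # Sketch: load generation plus transaction timing (Python 3.8+).
    import time
    import statistics
    from concurrent.futures import ThreadPoolExecutor

    import requests

    URL = "https://test.example.com/catalog"    # placeholder transaction

    def timed_request(_):
        start = time.perf_counter()
        requests.get(URL, timeout=30)           # no error handling in this sketch
        return time.perf_counter() - start      # response time in seconds

    with ThreadPoolExecutor(max_workers=25) as pool:      # 25 concurrent virtual users
        times = list(pool.map(timed_request, range(500))) # 500 requests in total

    print(f"avg={statistics.mean(times):.3f}s "
          f"p95={statistics.quantiles(times, n=20)[18]:.3f}s")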

Load test:
A test type concerned with measuring the behavior of a component or system with increasing load, e.g. number of parallel users and/or numbers of transactions to determine what load can be handled by the component or system.

While doing performance testing, we measure some of the following:

Characteristic (SLA)                      Measurement (unit)
Response Time                             Seconds
Hits per Second                           Hits/sec
Throughput                                Bytes/sec
Transactions per Second (TPS)             Transactions of a specific business process/sec
Total TPS (TTPS)                          Total transactions/sec
Connections per Second (CPS)              Connections/sec
Pages Downloaded per Second (PDPS)        Pages/sec

Some definitions and the importance of the above:

Response Time:

What is Transaction Response Time?

Transaction Response Time represents the time taken for the application to complete a defined transaction or business process.

Why is important to measure Transaction Response Time?

The objective of a performance test is to ensure that the application is working perfectly under load. However, the definition of “perfectly” under load may vary from system to system. By defining an initial acceptable response time, we can benchmark whether the application is performing as anticipated.

The importance of Transaction Response Time is that it gives the project team/application team an idea of how the application is performing, expressed in terms of time. With this information, they can set expectations with users/customers about how long a request will take to process, and understand how their application performed.


What does Transaction Response Time encompass?

The Transaction Response Time encompasses the time taken for the request made to the Web server, thereafter being processed by the Web server and sent to the application server, which in most instances will make a request to the database server. All of this is then repeated in reverse, from the database server through the application server and Web server, and back to the user. Note that the time taken for the request or data in network transmission is also factored in.

To simplify, Transaction Response Time comprises the following:
1. Processing time on the Web server
2. Processing time on the application server
3. Processing time on the database server
4. Network latency between the servers and the client

The following diagram illustrates Transaction Response Time:

Transaction Response Time = (t1 + t2 + t3 + t4 + t5 + t6 + t7 + t8 + t9) × 2

Note: the multiplication by 2 factors in the time taken for the data to return to the client.
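
As an illustrative calculation with assumed values: if each of the nine measured segments t1 through t9 averaged 0.2 seconds, the Transaction Response Time would be (9 × 0.2) × 2 = 3.6 seconds.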


How do we measure?

Measurement of the Transaction Response Time begins when the defined transaction makes a request to the application, and stops when the transaction completes, just before the next request (in terms of transactions) is issued.
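
A minimal sketch of such a measurement follows; login and submit_order are hypothetical stand-ins for the scripted steps of a defined transaction:

    # Sketch: timing a single defined transaction.
    import time

    start = time.perf_counter()             # clock starts when the request is made
    login()                                 # hypothetical scripted steps
    submit_order()
    elapsed = time.perf_counter() - start   # clock stops when the transaction completes
    print(f"transaction response time: {elapsed:.2f}s")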

Differences with Hits Per Seconds

Hits per Second measures the number of “hits” made on a Web server. These “hits” could be requests made to the Web server for data or graphics. However, this counter does not tell users much about how well their application is performing, because it only measures the number of times the Web server is accessed.

How can we use Transaction Response Time to analyze performance issue?

Transaction Response Time allows us to identify abnormalities when performance issues surface. An abnormality shows up as a slow transaction response that differs significantly (or even slightly) from the average Transaction Response Time.
With this, we can drill down further by correlating with other measurements, such as the number of virtual users accessing the application at that point in time and system-related metrics (e.g., CPU utilization), to identify the root cause.
Bringing together all the data collected during the load test, we can correlate the measurements to find trends and bottlenecks between the response time, the amount of load generated, and the payload of all the components of the application.
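
One simple way to flag such abnormalities is sketched below, under the assumption that samples more than two standard deviations from the mean are suspect:

    # Sketch: flag response-time samples that deviate markedly from the mean.
    import statistics

    def outliers(response_times, k=2.0):    # k = 2 standard deviations (assumed)
        mean = statistics.mean(response_times)
        sd = statistics.stdev(response_times)
        return [t for t in response_times if abs(t - mean) > k * sd]

    samples = [1.1, 1.3, 1.2, 4.8, 1.2, 1.4]   # illustrative values (seconds)
    print(outliers(samples))                    # -> [4.8]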

How is it beneficial to the Project Team?

Using Transaction Response Time, the project team can better relate to their users by using transactions as a common language that users can comprehend. Users will be able to know whether transactions (or business processes) are performing at an acceptable level in terms of time.
Users may be unable to understand the meaning of CPU utilization or memory usage, so using the common language of time is ideal for conveying performance-related issues.


Relation between load, response time, and performance:

1. Load is directly related to response time.
2. Performance is inversely related to response time.

So, as the load increases, the response time increases; and as the response time increases, the performance decreases.

Hits Per Second

A Hit is a request of any kind made from the virtual client to the application being tested (client to server), and it is measured as a count of hits. The higher the Hits Per Second, the more requests the application is handling per second.

A virtual client can request an HTML page, image, file, etc. Testing the application for Hits Per Second will tell you if there is a possible scalability issue with the application. For example, if the stress on an application increases but the Hits Per Second does not, there may be a scalability problem in the application.

One issue with this metric is that Hits Per Second treats all requests equally. Thus, a request for a small image and a complex HTML page generated on the fly will both be counted as hits. It is possible that out of a hundred hits on the application, the application server actually answered only one, and all the rest were served from a cache on the Web server or by another caching mechanism.

So, when looking at this metric, it is very important to consider what the application does and how it is intended to work. Will your users be looking for the same piece of information over and over again (a static benefits form), or will the same number of users be engaging the application in a variety of tasks, such as pulling up images, purchasing items, or bringing in data from another site? To create the proper test, it is important to understand this metric in the context of the application. If you are testing an application function that requires the site to 'work', as opposed to presenting static data, use the pages-per-second measurement.

Pages Per Second

Pages Per Second measures the number of pages requested from the application per second. The higher the Pages Per Second, the more work the application is doing per second. Measuring an explicit request in the script, or a frame in a frameset, provides a metric on how the application responds to actual work requests. Thus, if a script contains a Navigate command to a URL, this request is considered a page. If the HTML that returns includes frames, they will also be considered pages, but any other elements retrieved, such as images or JS files, will be considered hits, not pages. This measurement is key to the end user's experience of application performance.

Correlation: If the stress increases but the Pages Per Second count doesn't, there may be a scalability issue. For example, if you begin with 75 virtual users requesting 25 different pages concurrently and then scale the users to 150, the Pages Per Second count should increase. If it doesn't, some of the virtual users aren't getting their pages. This could be caused by a number of issues, and one likely suspect is throughput.

Throughput

“The amount of data transferred across the network is called throughput. It considers the amount of data transferred from the server to the client only, and is measured in Bytes/sec.”

This is an important baseline metric and is often used to check that the application and its server connection are working. Throughput measures the average number of bytes per second transmitted from the application being tested to the virtual clients running the test agenda during a specific reporting interval. This metric is the response data size (sum) divided by the number of seconds in the reporting interval.
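
A worked example of that computation, with illustrative response sizes:

    # Sketch: throughput = summed response bytes / reporting interval seconds.
    response_sizes_bytes = [12_288, 8_704, 15_360, 9_216]  # responses in the interval
    interval_s = 5                                         # reporting interval

    throughput = sum(response_sizes_bytes) / interval_s
    print(f"throughput = {throughput:.0f} bytes/sec")      # -> 9114 bytes/sec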

Generally, the more stress on an application, the higher the throughput. If the stress increases but the throughput does not, there may be a scalability issue or an application issue.

Another note about Throughput as a measurement – it generally doesn't provide any information about the content of the data being retrieved. Thus it can be misleading, especially in regression testing. When building regression tests, leave time in the testing plan for comparing the quality of the returned data.


Round Trips

Another useful scalability and performance metric is Round Trips, which tells you the total number of times the test agenda was executed versus the total number of times the virtual clients attempted to execute the agenda. The more times the agenda is executed, the more work is done by the test and the application. The test scenario the agenda represents influences the Round Trips measurement.
This metric can provide all kinds of useful information, from benchmarking an application to gauging the end-user availability of a more complex application. It is not recommended for regression testing, because each test agenda may have a different scenario and/or scenario length.

Hit Time
Hit Time is the average time, in seconds, it takes to successfully retrieve an element of any kind (image, HTML, etc.). The time of a hit is the sum of the Connect Time, Send Time, Response Time, and Process Time. It represents the responsiveness or performance of the application to the end user. The more stressed the application, the longer it should take to retrieve an average element. But, like Hits Per Second, caching technologies can influence this metric. Getting the most from this metric requires knowledge of how the application will respond to the end user.
This is also an excellent metric for application monitoring after deployment. 

Time to First Byte

This measurement is important because end users often consider a site malfunctioning if it does not respond fast enough. Time to First Byte measures the number of seconds it takes a request to return its first byte of data to the test software's load generator.
For example, Time to First Byte represents the time from when the user pushes the “enter” button in the browser until the user starts receiving results. Generally, more concurrent user connections will slow the response time of a request. But there are also other possible causes for a slowed response.
For example, there could be issues with the hardware or system software, memory issues, or problems with database structures or slow-responding components within the application.

Page Time

Page Time calculates the average time, in seconds, it takes to successfully retrieve a page with all of its content. This statistic is similar to Hit Time but relates only to pages. In most cases this is a better statistic to work with because it deals with the true dynamics of the application. Since not all hits can be cached, this data is more helpful in terms of tracking a user's experience (positive or frustrated). It's important to note that in many test software tools you can turn caching on or off, depending on your application's needs.

Generally, the more stress on the site, the slower its response. But since stress is a combination of the number of concurrent users and their activity, greater stress may or may not impact the user experience. It all depends upon the application's functions and users. A site with 150 concurrent users looking up benefit information will differ from a news site during a national emergency. As always, metrics must be examined within context.

Failed Rounds/Failed Rounds Per Second

During a load test it's important to know that the application requests perform as expected. The Failed Rounds and Failed Rounds Per Second metrics track the number of rounds that fail.

This metric is an “indicator metric” that provides QA and test teams with clues to the application's performance and failure status. If you start to see Failed Rounds or Failed Rounds Per Second, you would typically look into the logs to see what types of failures correspond to this metric report. Also, with some software test packages, you can define what constitutes a failed round in an application.

Sometimes, basic image or page missing errors (HTTP 404 error codes) can be set to fail a round, which stops the execution of the test agenda at that point and starts again at the top of the agenda, thus not completing that particular round.

Failed Hits/Failed Hits Per Second

This test offers insight into the application's integrity during the load test. An example of a request that might fail during execution is a broken link or a missing image on the server. In general, the absolute number of errors grows with the load size; if there are no errors at a low load, the number of errors at a high load should also remain zero. If the percentage of errors increases only during high loads, the application may have a scalability issue.

Failed Connections

This test simply counts the number of connections that were refused by the application during the test, and it leads to other tests. A failed connection could mean the server was too busy to handle all the requests, so it started refusing them. It could be a memory issue. It could also mean that the user sent bogus or malformed data to which the server couldn't respond, so it refused the connection.