I have always found performance-related terminology challenging, and the underlying concepts of system performance can seem convoluted. The most basic performance measure is the time between a request and a response, but diagnosing problems requires a deeper understanding of what happens in between. The idea is that if we understand the internal nitty-gritty of a system's performance, we can make better design decisions. It also helps the Quality Assurance team create a performance test strategy.
The diagram above tries to explain the internal nitty-gritty of the performance component of a system. QA will generally ask what kind of testing needs to be performed on a web application. The answer is not straightforward; it depends on criteria such as the following:
- Is this web application built from scratch, or based on an open platform?
- Is this a SaaS application?
- How much control do we have over the network?
- How much customization has been done to the application?
The following sections cover a few of the main performance-related terms and their details.
Performance
At an abstract level, performance is a measure of the time between a request and a response under different conditions: number of requests, amount of data, consistency of responses, system load, and so on. In other words, it describes how response time changes as the conditions around the application change. Performance can be further divided into the following sub-categories.
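To make this concrete, here is a minimal Python sketch (not a real load-testing tool) that measures how round-trip response time changes as concurrency grows. The URL and concurrency levels are placeholders, not part of any particular system:

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://example.com/"  # placeholder for the endpoint under test

def timed_request(url: str) -> float:
    """Return the full request/response round-trip time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()  # consume the whole body before stopping the clock
    return time.perf_counter() - start

# Measure average and worst-case response time at increasing load levels.
for concurrency in (1, 5, 10, 20):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(timed_request, [URL] * concurrency))
    avg = sum(timings) / len(timings)
    print(f"concurrency={concurrency:>2}  avg={avg:.3f}s  max={max(timings):.3f}s")
```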
Latency
Latency can be defined as the duration between the start of a request and the first sign of a response on the client. It can be split into two parts: (a) Network Latency and (b) Server Latency.
Network Latency
Network Latency can be defined as the time between a request leaving the client and the first sign of it reaching the server, plus, on the way back, the time between the start of the response message and the first sign of its arrival at the client. In other words, it is the time a message spends in transit over the network. Measuring these numbers can be tricky and may require sophisticated software; to diagnose and trace each hop of the request and response messages, one has to understand the underlying network topology. Practical examples: (a) in one performance-resolution effort we found that changing the configuration of an F5 Load Balancer improved latency, and thus overall response time, by 2-4 seconds; (b) reducing the number of hops a message takes reduces latency.
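One rough way to isolate the network component is to time a bare TCP handshake, since no application-level work happens on the server during the connect. A small sketch, with the host as a placeholder:

```python
import socket
import time

def tcp_connect_latency(host: str, port: int = 443, samples: int = 5) -> float:
    """Average TCP connect time in milliseconds over several samples."""
    total = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        # Only the handshake happens here; no request data is sent,
        # so server-side processing time is excluded.
        with socket.create_connection((host, port), timeout=5):
            pass
        total += time.perf_counter() - start
    return total / samples * 1000

print(f"avg connect latency: {tcp_connect_latency('example.com'):.1f} ms")
```

The connect time approximates one network round trip, which makes it a handy baseline to subtract when judging how much of the total response time the server itself is responsible for.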
Server Latency
Server Latency can be defined as the total time the server takes from receiving the request to producing the first sign of a response. In reality this definition is abstract, and the ground reality is a little more complicated: many things could be going on inside the server. For example, it might be running a query or retrieving information from other servers. A server hosting a static website might be plain and simple, but a web-application server may be performing fairly complex operations internally, and diagnosing its performance issues may require analysis that is just as complex.
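For illustration, here is a toy Python server (standard library only, everything below is hypothetical) whose handler logs how long the server itself spent producing a response, i.e. server latency with network time excluded:

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class TimedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        start = time.perf_counter()
        body = self.build_body()          # stands in for queries, downstream calls, etc.
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
        elapsed = (time.perf_counter() - start) * 1000
        # Log the server-side cost of this request.
        self.log_message("served %s in %.1f ms", self.path, elapsed)

    def build_body(self) -> bytes:
        time.sleep(0.05)                  # simulated internal work
        return b"hello"

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), TimedHandler).serve_forever()
```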
Response Time
Response time is similar to latency: the amount of time it takes to get the first sign of a response from the server after a request is initiated.
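A simple way to approximate this from the client is to stop the clock as soon as the status line and headers arrive, before reading the body. A sketch using Python's standard library, with the host as a placeholder:

```python
import http.client
import time

def time_to_first_byte(host: str, path: str = "/") -> float:
    """Seconds from sending the request until the status line arrives."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    start = time.perf_counter()
    conn.request("GET", path)
    resp = conn.getresponse()   # returns once the status line/headers arrive
    ttfb = time.perf_counter() - start
    resp.read()                 # drain the body so the connection closes cleanly
    conn.close()
    return ttfb

print(f"TTFB for example.com: {time_to_first_byte('example.com'):.3f}s")
```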
User Perceived Response Time
As the name suggests, user-perceived response time can differ from scenario to scenario. This measure captures the moment at which a response is actually usable by the user. For example, a web page may have already loaded above the fold while part of the response data is still in transit; for a CSS or JavaScript file, on the other hand, the user agent waits for the download to complete.
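As a rough illustration, with entirely made-up resource timings, the user-perceived response time can be modeled as the moment the last render-blocking resource finishes, even if other assets are still downloading:

```python
# Hypothetical resource timings (seconds from navigation start).
resources = [
    {"name": "index.html", "finished_at": 0.40, "render_blocking": True},
    {"name": "styles.css", "finished_at": 0.75, "render_blocking": True},
    {"name": "app.js",     "finished_at": 0.90, "render_blocking": True},
    {"name": "hero.jpg",   "finished_at": 2.10, "render_blocking": False},
]

# The page is "usable" once every render-blocking resource has arrived.
perceived = max(r["finished_at"] for r in resources if r["render_blocking"])
full_load = max(r["finished_at"] for r in resources)
print(f"user-perceived response time: {perceived:.2f}s (full load: {full_load:.2f}s)")
```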
Understanding these performance aspects of the solution is valuable, but the overall performance of a system from the business perspective is also entangled with criteria such as the following.
Scalability
If you expand your production line, will your output increase in proportion? Adding N times more servers to the environment should let the system handle N times more requests. In reality, many complex enterprise applications will not scale linearly. The World Wide Web is the best example of a scalable system.
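A quick back-of-the-envelope check, using hypothetical throughput numbers, shows how measured scaling compares against the ideal linear case:

```python
# servers -> requests/second (assumed measurements, for illustration only)
measured = {1: 1000, 2: 1900, 4: 3500, 8: 6000}

baseline = measured[1]
for servers, throughput in measured.items():
    ideal = baseline * servers            # perfect linear scaling
    efficiency = throughput / ideal
    print(f"{servers} server(s): {throughput:>5} req/s, "
          f"{efficiency:.0%} of ideal linear scaling")
```

In this made-up data set, efficiency drops from 100% to 75% as servers are added, which is the kind of sub-linear curve complex enterprise applications often exhibit.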
Robustness
How well the system handles different types of stress.
Simplicity
How simple the system is in delivering a capability. Simplicity can determine how agile the system is.
Modifiability
How easy it is to modify the system. An agile system can live for ages, continuously modified along the way.
Portability
How easy it is to port the system to a different platform. Is it hardwired to a particular platform?
Reliability
How reliable the system is in providing services to clients. Reliability and availability tend to go hand in hand.