Sign In

Communications of the ACM


High-Performance Web Sites

View as: Print Mobile App ACM Digital Library Full Text (PDF) In the Digital Edition Share: Send by email Share on reddit Share on StumbleUpon Share on Hacker News Share on Tweeter Share on Facebook

Illustration by Nik Schulz

Google Maps, Yahoo! Mail, Facebook, MySpace, YouTube, and Amazon are examples of Web sites built to scale. They access petabytes of data sending terabits per second to millions of users worldwide. The magnitude is awe-inspiring.

Users view these large-scale Web sites from a narrower perspective. The typical user has megabytes of data that they download at a few hundred kilobits per second. Users are less interested in the massive number of requests per second being served, caring more about their individual requests. As they use these Web applications they inevitably ask the same question: "Why is this site so slow?"

The answer hinges on where development teams focus their performance improvements. Performance for the sake of scalability is rightly focused on the backend. Database tuning, replicating architectures, customized data caching, and so on, allow Web servers to handle a greater number of requests. This gain in efficiency translates into reductions in hardware costs, data center rack space, and power consumption. But how much does the backend affect the user experience in terms of latency?

The Web applications listed here are some of the most highly tuned in the world, and yet they still take longer to load than we'd like. It almost seems as if the high-speed storage and optimized application code on the backend have little impact on the end user's response time. Therefore, to account for these slowly loading pages we must focus on something other than the backend: we must focus on the frontend.

Back to Top

The Importance of Frontend Performance

Figure 1 illustrates the HTTP traffic sent when your browser visits iGoogle with an empty cache. Each HTTP request is represented by a horizontal bar whose position and size are based on when the request began and how long it took. The first HTTP request is for the HTML document ( As noted in Figure 1, the request for the HTML document took only 9% of the overall page load time. This includes the time for the request to be sent from the browser to the server, for the server to gather all the necessary information on the backend and stitch that together as HTML, and for that HTML to be sent back to the browser.

The other 91% percent is spent on the frontend, which includes everything that the HTML document commands the browser to do. A large part of this is fetching resources. For this iGoogle page there are 22 additional HTTP requests: two scripts, one stylesheet, one iframe, and 18 images. Gaps in the HTTP profile (places with no network traffic) are where the browser is parsing CSS, and parsing and executing JavaScript.

The primed cache situation for iGoogle is shown in Figure 2. Here there are only two HTTP requests: one for the HTML document and one for a dynamic script. The gap is even larger because it includes the time to read the cached resources from disk. Even in the primed cache situation, the HTML document accounts for only 17% of the overall page load time.

This situation, in which a large percentage of page load time is spent on the frontend, applies to most Web sites. Table 1 shows that eight of the top 10 Web sites in the U.S. (as listed on spend less than 20% of the end user's response time fetching the HTML document from the backend. The two exceptions are Google Search and Live Search, which are highly tuned. These two sites download four or fewer resources in the empty cache situation, and only one request with a primed cache.

The time spent generating the HTML document affects overall latency, but for most Web sites this backend time is dwarfed by the amount of time spent on the frontend. If the goal is to make the user experience faster, the place to focus is on the frontend. Given this new focus, the next step is to identify best practices for improving frontend performance.

Back to Top

Frontend Performance Best Practices

Through research and consulting with development teams, I've developed a set of performance improvements that have been proven to speed up Web pages. A big fan of Harvey Penick's Little Red Book1 with advice like "Take Dead Aim," I set out to capture these best practices in a simple list that is easy to remember. The list has evolved to contain the following 14 prioritized rules:

  1. Make fewer HTTP requests
  2. Use a content delivery network
  3. Add an Expires header
  4. Gzip components
  5. Put stylesheets at the top
  6. Put scripts at the bottom
  7. Avoid CSS expressions
  8. Make JavaScript and CSS external
  9. Reduce DNS lookups
  10. Minify JavaScript
  11. Avoid redirects
  12. Remove duplicate scripts
  13. Configure ETags
  14. Make Ajax cacheable

A detailed explanation of each rule is the basis of my book, High Performance Web Sites2 What follows is a brief summary of each rule.

Back to Top

Rule 1: Make Fewer HTTP Requests

As the number of resources in the page grows, so does the over all page load time. This is exacerbated by the fact that most browsers only download two resources at a time from a given hostname, as suggested in the HTTP/1.1 specification ( Several techniques exist for reducing the number of HTTP requests without reducing page content:

  • Combine multiple scripts into a single script.
  • Combine multiple stylesheets into a single stylesheet.
  • Combine multiple CSS background images into a single image called a CSS sprite (see

Back to Top

Rule 2: Use a Content Delivery Network

A content delivery network (CDN) is a collection of distributed Web servers used to deliver content to users more efficiently. Examples include Akamai Technologies, Limelight Networks, SAWIS, and Panther Express. The main performance advantage provided by a CDN is delivering static resources from a server that is geographically closer to the end user. Other benefits include backups, caching, and the ability to better absorb traffic spikes.

Back to Top

Rule 3: Add an Expires Header

When a user visits a Web page, the browser downloads and caches the page's resources. The next time the user visits the page, the browser checks to see if any of the resources can be served from its cache, avoiding time-consuming HTTP requests. The browser bases its decision on the resource's expiration date. If there is an expiration date, and that date is in the future, then the resource is read from disk. If there is no expiration date, or that date is in the past, the browser issues a costly HTTP request. Web developers can attain this performance gain by specifying an explicit expiration date in the future. This is done with the Expires HTTP response header, such as the following:

Expires: Thu, 1 Jan 2015 20:00:00 GMT

Back to Top

Rule 4: Gzip Components

The amount of data transferred over the network affects response times, especially for users with slow network connections. For decades developers have used compression to reduce the size of files. This same technique can be used for reducing the size of data sent over the Internet. Many Web servers and Web hosting services enable compression of HTML documents by default, but compression shouldn't stop there. Developers should also compress other types of text responses, such as scripts, stylesheets, XML, JSON, among others. Gzip is the most popular compression technique. It typically reduces data sizes by 70%.

Back to Top

Rule 5: Put Stylesheets at the Top

Stylesheets inform the browser how to format elements in the page. If stylesheets are included lower in the page, the question arises: What should the browser do with elements that it can render before the stylesheet has been downloaded?

One answer, used by Internet Explorer, is to delay rendering elements in the page until all stylesheets are downloaded. But this causes the page to appear blank for a longer period of time, giving users the impression that the page is slow. Another answer, used by Firefox, is to render page elements and redraw them later if the stylesheet changes the initial formatting. This causes elements in the page to "flash" when they're redrawn, which is disruptive to the user. The best answer is to avoid including stylesheets lower in the page, and instead load them in the HEAD of the document.

Back to Top

Rule 6: Put Scripts at the Bottom

External scripts (typically, ".js" files) have a bigger impact on performance than other resources for two reasons. First, once a browser starts downloading a script it won't start any other parallel downloads. Second, the browser won't render any elements below a script until the script has finished downloading. Both of these impacts are felt when scripts are placed near the top of the page, such as in the HEAD section. Other resources in the page (such as images) are delayed from being downloaded and elements in the page that already exist (such as the HTML text in the document itself) aren't displayed until the earlier scripts are done. Moving scripts lower in the page avoids these problems.

Back to Top

Rule 7: Avoid CSS Expressions

CSS expressions are a way to set CSS properties dynamically in Internet Explorer. They enable setting a style's property based on the result of executing JavaScript code embedded within the style declaration. The issue with CSS expressions is that they are evaluated more frequently than one might expect—potentially thousands of times during a single page load. If the JavaScript code is inefficient it can cause the page to load more slowly.

Back to Top

Rule 8: Make JavaScript and CSS External

JavaScript can be added to a page as an inline script: