Making the Mobile Web Faster

Mobile performance issues? Fix the back end, not just the client.

Mobile clients have been on the rise and will only continue to grow. This means that if you are serving clients over the Internet, then you cannot ignore the customer experience on a mobile device.

There are many informative articles on mobile performance, and just as many on general API design, but you will find few discussing the design considerations needed to optimize the back-end systems for mobile clients. Whether you have an app, mobile website, or both, it is likely these clients are consuming APIs from your back-end systems. It is this part of that infrastructure that this article is about.

Certainly, optimizing the on-mobile performance of the application is critical, but software engineers can do a lot to ensure mobile clients are remotely served both data and application resources in a reliably performant manner.

What is so special about mobile? If you were to go back in time and use the Internet, you would notice that most websites felt slower. The technology has now evolved to the point that clients can efficiently use and negotiate low-bandwidth channels. Mobile clients, however, do not have the computing power, storage, and high-bandwidth connections of desktops, so mobile needs to be thought about a little differently.

Here are some of the special considerations to take into account when building mobile-based applications:

  • Limited screen size. There is less space for data and images.
  • Smaller number of simultaneous connections. This one is important because, unlike desktop Web browsers that can run many concurrent asynchronous requests, mobile browsers have a limited number of connections per domain at any given moment.
  • Slower network. Network performance is heavily affected by poor signal reception and multiple cellular handovers. Even though some clients are on Wi-Fi, some networks are congested, and additional lookups can be required when a user changes cell towers.
  • Slower processing power. Extensive client-side computations, 3D graphics rendering, and heavy JavaScript usage can greatly affect performance.
  • Smaller caches. Mobile clients are generally memory-restricted so it is best not to rely heavily on cached content for performance.
  • “Special” browsers. In many ways the mobile browser ecosystem is reminiscent of the fragmented desktop browser scene of several years ago, with mobile vendors producing versions with fatal deficiencies and incompatibilities.

Although there are many ways to tackle these unique obstacles, this article focuses on what can be done from an API or back-end service to improve the performance (or the perception thereof) of mobile clients. The article is divided into two parts:

  • Minimizing network connections and the need to transmit data: efficient media handling, effective caching, and employing longer data-oriented operations with fewer connections.
  • Sending the “right” data across the network: designing APIs to return only the data that is needed/requested, and optimizing for the various types and form factors of mobile devices.

Although this article is focused solely on mobile, many of the lessons and ideas can be applied to other API client forms as well.

Minimizing connections and data across the network. Minimizing the number of HTTP requests required to render a Web page is undoubtedly one of the biggest ways of improving mobile performance. There are many ways to do this, but the exact approach may depend on your data and the architecture of your application.

In most cases you want to minimize how much information is sent across the network. Rendering on the server (such as when the server sends back whole HTML pages) has its advantages, since it requires less compute and processing resources on the client than rendering there. Of course, the downside of this approach is that the more code rendered server side, the more likely that code may have display issues in client browsers (and dealing with browser compatibility is seldom fun). Still, the more that can be done on the client, the fewer trips across the network. After all, that is why “apps” have become so popular: if you could do everything in the Web browser over the network, this would be a mobile website world.

Minimize image requests. In a standard browser, making a single request for each image on the page results in speed improvements and allows you to take advantage of caching for each image. The browser is able to execute each request quickly and in parallel, so there is not a big performance hit for making many requests (and with the caching benefits there can even be performance gains). This same approach, however, can be a killer on mobile.

Since every request for data on a mobile device can require substantially more overhead, each one can add significant latency. Therefore, minimizing image requests reduces the number of round trips and, in some cases, the amount of data that needs to be sent (which can also help mobile performance).

Here are some strategies to consider:

Use image sprites. The use of image sprites can reduce the number of individual images that need to be downloaded from the server, but sprites can be cumbersome to maintain and difficult to generate in some circumstances (such as on product search results where you are showing thumbnail images for many products).

Use CSS instead of images. Avoiding images where possible and using CSS (Cascading Style Sheets) rendering for shadows, gradients, and other effects can reduce the number of bytes that need to be transmitted and downloaded.

Support responsive images. A popular way of delivering the right image to the right device is using responsive images. Apple does this by loading regular images and then replacing them with high-resolution ones using JavaScript.7 There are several other ways3 of approaching this problem, but the issue is far from solved.12

In these cases you should make sure that the server-side support and APIs are able to support different versions of the same image, and the exact way to do that will depend on the approach of the clients. For example, one easy way of doing this with an API is to support a handful of image sizes as a parameter for the request, as shown in Figure 1.

To keep APIs simple, make this parameter optional and send back a default size. To pick your default size, select either the smallest size (to handle situations such as responsive images) or the most commonly used size on your website.
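As a minimal sketch of such an optional size parameter, the following Python function maps a requested size to a pre-rendered image variant. The size names, pixel widths, and URL layout are assumptions made for illustration; they are not taken from Figure 1.

```python
from typing import Optional

# Hypothetical size names and pixel widths (assumptions, not from the article).
IMAGE_SIZES = {"small": 160, "medium": 320, "large": 640}
DEFAULT_SIZE = "small"  # smallest size as the default, one of the two options above


def image_url(product_id: str, size: Optional[str] = None) -> str:
    """Return the URL of the image variant matching the requested size."""
    if size is None:
        size = DEFAULT_SIZE  # the parameter is optional
    if size not in IMAGE_SIZES:
        raise ValueError(f"unsupported image size: {size!r}")
    width = IMAGE_SIZES[size]
    # Hypothetical URL layout: one pre-rendered file per supported width.
    return f"/images/{product_id}_{width}w.jpg"
```

Supporting only a handful of named sizes keeps the number of stored variants small and makes each variant easy to cache.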

Use data URIs for images inline to minimize extra requests. An alternative to sprites is to use data URIs (uniform resource identifiers) to embed images inline within the HTML itself. This makes the images part of the overall page, and while the URI-encoded images can be larger in terms of bytes, they compress better with gzip compression, which helps minimize the effect of transmitting additional data.

If using data URIs, then make sure to:

  • Resize images to the appropriate size before encoding into the URI payload.
  • Gzip-compress responses (to take advantage of compression).
  • Note that URI-encoded images are part of the HTML or CSS of the page. As a result, caching of individual images is more difficult, so do not use this approach if there are good reasons to cache the image locally (that is, it is reused frequently on several pages).
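The encoding step itself is small. Here is a sketch in Python, assuming the image has already been resized and its bytes are in memory; the function names are illustrative only.

```python
import base64
import html


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_img_tag(image_bytes: bytes, alt: str, mime_type: str = "image/png") -> str:
    """Build an <img> tag whose source is embedded in the page itself."""
    uri = to_data_uri(image_bytes, mime_type)
    return f'<img alt="{html.escape(alt, quote=True)}" src="{uri}">'
```

Base64 turns every 3 bytes into 4 characters, so the encoded image is about a third larger before compression. That is why resizing first and gzip-compressing the response both matter.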

Leverage localStorage and caching. Since mobile networks can be slow, HTML, CSS, and images can be stored in localStorage to make the mobile experience faster. (There is a great case study on Bing’s improvements using localStorage for mobile to reduce the size of an HTML document from about 200KB to about 30KB.11)

Pulling data out of local storage can negatively impact performance,13 but the cost is typically much less than the latency incurred going across the network. In addition to localStorage, some apps are using other features in HTML5,6 such as appCache,1 to improve performance and startup time.

One optimization that can be leveraged on the server involves being aware of what is on the device. By embedding CSS and JavaScript directly within a single Web request, then storing a reference to those files on the client, it is possible to track what has been downloaded and resides in the cache. Then, the next time the client makes a request to the server, it can pass the references to its cached files to the server via a cookie. The server then only has to send new files over the network, which prevents the client from downloading those assets again.

This trick to leverage local caching can save a lot of time. (For more details on how directly to embed and then reference these files, as well as other resources for more reading on the topic, see Mark Pilgrim’s Dive into HTML5.8)
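On the server, the bookkeeping for this can be quite small. The sketch below assumes a hypothetical cookie format (`name:version` pairs joined by `|`) and a hypothetical table of current asset versions; neither comes from the article or from any particular framework.

```python
from typing import Dict, List

# Hypothetical current asset versions on the server (assumption).
ASSET_VERSIONS = {"site.css": "v3", "app.js": "v7"}


def parse_cached_assets(cookie_value: str) -> Dict[str, str]:
    """Parse a cookie such as 'site.css:v3|app.js:v6' into {name: version}."""
    cached = {}
    for item in cookie_value.split("|"):
        name, sep, version = item.partition(":")
        if sep and name and version:
            cached[name] = version
    return cached


def assets_to_send(cookie_value: str) -> List[str]:
    """Return the assets the client is missing or holds in an old version."""
    cached = parse_cached_assets(cookie_value)
    return sorted(
        name for name, version in ASSET_VERSIONS.items()
        if cached.get(name) != version
    )


def updated_cookie() -> str:
    """Cookie value to set once the response carrying the new assets is sent."""
    return "|".join(f"{n}:{v}" for n, v in sorted(ASSET_VERSIONS.items()))
```

A first-time visitor sends no cookie and receives every asset inline. A returning visitor receives only the assets whose version has changed.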

Prefetch and cache data. One great way to improve perceived performance is by prefetching data that will be used throughout the mobile experience so it can be loaded directly on the device without additional requests—for example, paginated results, popular queries, and user data. Thinking about these use cases and factoring them into your API design will allow you to create APIs designed for prefetching and caching data before the user interacts with it, increasing the perception of responsiveness.

If your client is an app, then for data that is not likely to change between updates (such as categories or main navigation) consider shipping the data inside the app so it never requires a trip across the network.

If you want to get sophisticated, ship the data inside the app but also create a versioning and expiration scheme; that way, the app can ping the server in the background and update the data only if the version on the device is out of date.
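One way such a versioning and expiration scheme could look on the server is sketched below. The version number, category list, and one-week lifetime are all assumptions chosen for illustration.

```python
import time
from typing import Any, Dict, Optional

# Hypothetical data set shipped inside the app, with its current server version.
CATEGORIES_VERSION = 12
CATEGORIES = ["Books", "Electronics", "Toys"]
MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # assumed lifetime: one week


def check_categories(client_version: int, now: Optional[float] = None) -> Dict[str, Any]:
    """Answer a background version check; include the data only if it changed."""
    now = time.time() if now is None else now
    response: Dict[str, Any] = {
        "version": CATEGORIES_VERSION,
        "expires_at": now + MAX_AGE_SECONDS,
        "changed": client_version < CATEGORIES_VERSION,
    }
    if response["changed"]:
        response["data"] = list(CATEGORIES)
    return response


def needs_check(expires_at: float, now: Optional[float] = None) -> bool:
    """Client-side rule: only ask the server again once the data has expired."""
    now = time.time() if now is None else now
    return now >= expires_at
```

When the version on the device is current, the response is a few dozen bytes, so the background check costs very little.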

Ideally, you want to transfer data when needed by the client and preload data when advantageous to do so (such as when the network or other required resources are not in use). Therefore, if an end user will not view the image or content, then do not send it (this is particularly important for responsive sites since some just “hide” elements). Design your APIs to be flexible and support sending smaller payloads to the client.

A great use case for prefetching images is a gallery of image results, such as a list of products on an e-commerce site. In these situations it is worth downloading the previous and next image(s) to speed up interactions and browsing. Be careful, however, not to go overboard and fetch too far ahead; otherwise, you could end up requesting data that may not be seen by the user.
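An API can make this easy for the client by returning the neighboring image URLs alongside the current one. The following is a sketch under assumed names; the response shape and the default window of one image in each direction are illustrative choices.

```python
from typing import Any, Dict, List


def gallery_item(image_urls: List[str], index: int,
                 ahead: int = 1, behind: int = 1) -> Dict[str, Any]:
    """Return the current image plus a small window of neighbors to prefetch."""
    if not 0 <= index < len(image_urls):
        raise IndexError("index out of range")
    start = max(0, index - behind)
    end = min(len(image_urls), index + ahead + 1)
    neighbors = [image_urls[i] for i in range(start, end) if i != index]
    return {"current": image_urls[index], "prefetch": neighbors}
```

Keeping `ahead` and `behind` small is the guard against fetching images the user never sees.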

Use nonblocking I/O. On the client, it is well known that you should watch out for blocking JavaScript execution,14 which can have a big impact on the perception of performance. This is even more important for APIs. If there is a longer API call, such as one that could rely on a third party and might time out, it is important to implement it as nonblocking rather than as a long-waiting call, and to choose a polling or triggering model instead:

  • Polling API (pull-based model). The client makes a request and then periodically checks for the results of that request, backing off if required.
  • Triggering API (push-based model). The caller makes the request and then listens for a response from the server. The server is provided a callback so it can trigger an event letting the caller know the results are available.

Triggering APIs are typically more difficult to implement, as connections on mobile clients are unreliable. Therefore, polling is a much better option in most cases.
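A polling client with backoff can be sketched in a few lines. Here `check` stands in for a hypothetical "is my result ready?" request that returns `None` until the result is available; the delays and attempt limit are assumed values.

```python
import time
from typing import Any, Callable, Optional


def poll_with_backoff(check: Callable[[], Optional[Any]],
                      initial_delay: float = 0.5,
                      max_delay: float = 8.0,
                      max_attempts: int = 6,
                      sleep: Callable[[float], None] = time.sleep) -> Optional[Any]:
    """Call check() until it returns a result, doubling the wait between tries."""
    delay = initial_delay
    for attempt in range(max_attempts):
        result = check()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            sleep(delay)
            delay = min(delay * 2, max_delay)  # back off, up to a ceiling
    return None  # caller decides how to handle "still not ready"
```

Each check is a short request that returns immediately, so no connection is held open while the slow work completes on the server.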

For example, in one mobile app,4 each product page shows availability and pricing at stores close to a user’s location. Since a third party delivers those results, the developers did not want the local pricing to take as long as the partner’s API did to deliver results to the client. To work around this, the developers created their own wrapper API that allows clients to pass a flag with any product query (a set of APIs supported retrieving product data in various ways), signaling the server to retrieve local prices for that product. Those prices would be stored in the server’s cache. Then, if the user wanted the local pricing for the product, those prices would have a higher probability of being in the cache and would not incur the longer wait times from the third-party partner.

This method is a lot like prefetching on the client but is instead done on the server side with APIs and data. Figure 2 depicts sample requests to show how this works.

As shown in Figure 3, this call looks in the cache first, and if the prices for that product are not present, it calls the third-party API and waits.
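The pattern can be sketched as follows. This is not the app's actual code: the class, method names, and product fields are hypothetical, `fetch_from_partner` stands in for the slow third-party call, and cache expiry and error handling are left out.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Optional


class LocalPriceService:
    """Wrapper API that can start fetching local prices before they are requested."""

    def __init__(self, fetch_from_partner: Callable[[str], Any], workers: int = 4):
        self._fetch = fetch_from_partner  # slow third-party call (assumed)
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._prices: Dict[str, Future] = {}  # cache of finished or in-flight fetches
        self._lock = threading.Lock()

    def _ensure_fetch(self, product_id: str) -> Future:
        # Start a background fetch unless one is already cached or in flight.
        with self._lock:
            future = self._prices.get(product_id)
            if future is None:
                future = self._pool.submit(self._fetch, product_id)
                self._prices[product_id] = future
            return future

    def get_product(self, product_id: str, prefetch_local_prices: bool = False) -> dict:
        """Product query; the flag only triggers the prefetch and never waits on it."""
        if prefetch_local_prices:
            self._ensure_fetch(product_id)
        return {"id": product_id}  # placeholder product data

    def get_local_prices(self, product_id: str, timeout: Optional[float] = None) -> Any:
        """Look in the cache first; otherwise call the partner and wait."""
        return self._ensure_fetch(product_id).result(timeout=timeout)
```

If the user asks for local prices while the prefetch is still running, the second call waits on the in-flight request instead of issuing a duplicate one to the partner.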


In general, you want to make sure that APIs return quickly and do not block while waiting for results, since mobile clients have a limited number of connections. In cases where some components are significantly slower than others on the server side, it can be worth breaking the API into separate calls using typical response time as a factor. That way the client can start rendering pages from the initial fast response calls while waiting for the slower ones. The goal is to minimize the time-to-text rendering on the screen.

You should avoid chatty APIs; in slow network situations it is especially important to avoid making several API calls. A good rule of thumb is to have all the data needed to render a page returned in a single API call.

Avoid redirects and minimize DNS lookups. When it comes to requests, redirects can negatively impact performance, especially if they cross domains and require a DNS lookup.

For example, many sites handle their mobile sites using client-side redirects: when a mobile client goes to the main site URL, it is redirected to the mobile site. This is especially common when the sites are built on different technology stacks. Here is an example of how this works:

  1. A user Googles “yahoo” and clicks on the first link in the results.
  2. Google captures the click using its own tracking URL and then redirects the phone to the main site's URL.
  3. Google’s redirect response goes through the cell tower and then back to the phone.
  4. Then there is a DNS lookup for the main site's domain.
  5. The IP resulting from the DNS lookup is sent through the cell tower and back to the phone.
  6. When the phone hits the main site, it is recognized as a mobile client and is redirected to the mobile site.
  7. The phone then has to do another DNS lookup for the mobile site's subdomain.
  8. The IP resulting from the DNS lookup is sent through the cell tower and back to the phone.
  9. The resulting HTML and assets are finally sent back through the cell tower and then to the phone.
  10. Some of the images on pages of the mobile site are served via a CDN (content delivery network), referencing yet another domain.
  11. The phone then has to do another DNS lookup for that domain.
  12. The IP resulting from the DNS lookup is sent through the cell tower and back to the phone.
  13. The images are rendered, completing the page.

As is obvious from this example, a lot of overhead is involved in these requests. They can be avoided by using redirects on the server side (routing via the server and keeping DNS lookups and redirects to a minimum on the client) or by using responsive techniques.2 If DNS lookups are unavoidable, try using DNS prefetching for known domains to save time.
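To illustrate server-side routing, the sketch below serves the mobile or desktop rendering from the same URL, so the phone never sees a redirect or a second domain. The User-Agent check is deliberately crude and the handler signature is hypothetical, not tied to any framework.

```python
from typing import Dict, Tuple

# Deliberately crude device detection, for illustration only (assumption).
MOBILE_MARKERS = ("iphone", "android", "mobile")


def is_mobile(user_agent: str) -> bool:
    """Guess whether the request comes from a mobile client."""
    ua = user_agent.lower()
    return any(marker in ua for marker in MOBILE_MARKERS)


def handle_home(headers: Dict[str, str]) -> Tuple[int, Dict[str, str], str]:
    """Serve '/' directly with the right layout instead of redirecting."""
    layout = "mobile" if is_mobile(headers.get("User-Agent", "")) else "desktop"
    body = f"<html><!-- {layout} layout --></html>"
    # Vary tells caches that the body depends on the User-Agent header.
    response_headers = {"Content-Type": "text/html", "Vary": "User-Agent"}
    return 200, response_headers, body
```

Compared with the walkthrough above, this removes one redirect and one DNS lookup from the client's critical path.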


Use HTTP pipelining and SPDY. Another useful technique is HTTP pipelining, which allows multiple requests to be sent over a single connection without waiting for each response. If I were to implement an optimization translation layer, however, I would opt for SPDY, which essentially optimizes HTTP requests to make them much more efficient. SPDY is getting traction in places such as Amazon’s Kindle browser, Twitter, and Google.


Sending the “Right” Data

Depending on the client, the experience may require different files, CSS, JavaScript, or even the number of results. Creating APIs in a way that supports different permutations and versions of results and files provides the most flexibility for creating amazing client experiences.

Use limit and offset to get results. As with regular APIs, fetching results using limit and offset allows clients to request ranges of the data that make sense for the client’s use case (thus, fewer results for mobile). The limit and offset notation is more common (than, say, start and next), well understood in most databases, and therefore easy to build on.
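A minimal sketch of limit and offset handling might look like this. The default of 10, the cap of 50, and the response field names are assumptions, not values from the article.

```python
from typing import Any, Dict, List, Optional

DEFAULT_LIMIT = 10  # assumed default, sized for mobile clients
MAX_LIMIT = 50      # assumed cap so one request cannot pull everything


def paginate(items: List[Any], limit: Optional[int] = None,
             offset: int = 0) -> Dict[str, Any]:
    """Return one range of results plus what the client needs to ask for more."""
    limit = DEFAULT_LIMIT if limit is None else max(1, min(limit, MAX_LIMIT))
    offset = max(0, offset)
    return {
        "total": len(items),
        "limit": limit,
        "offset": offset,
        "results": items[offset:offset + limit],
    }
```

Against a SQL database the same two values map directly onto a `LIMIT ... OFFSET ...` clause, which is part of why the notation is easy to build on.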


You should choose a default that caters either to the lowest or highest common denominator, depending on which clients are more important to your business: smaller if mobile clients are your biggest users; bigger if users are likely to be on their desktops, such as a B2B website or service.

Support partial response and partial update. Design your APIs to allow clients to request just the information they need. This means that APIs should support a set of fields, instead of returning the full resource representation each time. Avoiding the need for clients to collect and parse unnecessary data can simplify requests and improve performance.

Partial update allows clients to do the same thing with data they are writing to the API (thereby avoiding the need to specify all elements within the resource taxonomy).

Google supports partial response by accepting optional fields in a comma-delimited list.


For each call, specifying entry indicates that the caller is requesting only a partial set of fields.
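Both ideas can be sketched with flat, top-level field names. This is an illustration under that assumption; it does not attempt to reproduce Google's field syntax, and the function names are hypothetical.

```python
from typing import Any, Dict, Optional


def select_fields(resource: Dict[str, Any], fields: Optional[str]) -> Dict[str, Any]:
    """Partial response: keep only the comma-delimited fields the client asked for."""
    if not fields:
        return dict(resource)  # no filter given: full representation
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    return {name: resource[name] for name in wanted if name in resource}


def apply_partial_update(resource: Dict[str, Any],
                         changes: Dict[str, Any]) -> Dict[str, Any]:
    """Partial update: the client sends only the fields it wants to change."""
    unknown = set(changes) - set(resource)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    updated = dict(resource)
    updated.update(changes)
    return updated
```

With a resource that has a dozen fields, a list view that needs only an ID and a title receives two fields instead of twelve.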

Avoid or minimize cookies. Every time a client sends a request to the domain, it will include all of the cookies that it has from that domain—even duplicated entries or extraneous values. This means keeping cookies small is another way to keep payloads down and performance up. Do not use or require cookies unless necessary. Serve static content that does not require permissions from a cookieless domain, such as images from a static domain or CDN. (The Google Developers site provides some best practices for cookies and performance.5)

Establish device profiles for APIs. With the many different screen sizes and resolutions on desktops, tablets, and mobile phones, it is helpful to establish a set of profiles you plan to support. For each profile you can deliver different images, data, and files so they suit each device; you can do this using media queries on the client.10

If each profile is tailored to a device, then it has the opportunity to offer a better user experience. The more functions and scenarios each profile supports, however, the more difficult it will be to maintain (since devices are constantly changing and evolving). As a result, the smartest approach is to support only as many profiles as absolutely necessary for your particular business. (The mobiForge website offers more information on some tradeoffs and options for creating great experiences on different devices.9)

For most applications three profiles will be sufficient:

  • Mobile phone—smaller images, touch enabled, and low bandwidth.
  • Tablet—larger images designed for lower bandwidth, touch enabled, more data per request.
  • Desktop—larger, high-resolution images designed for tablets with high resolution and Wi-Fi or desktop browsers.

Selecting the right profile can be handled by the client, which means on the server side APIs just need to support this configuration. You should design APIs to take these profiles as input, or parameters, and send different information based on the device making the request. Depending on the application, this may mean sending smaller images, fewer results, or inline CSS and JavaScript.

For example, if one of your APIs returns search results to the client, each profile might behave differently:

  • A request that does not specify a profile would use the default profile (desktop) and serve up the standard page, making a request for each image so subsequent product views could be loaded from cache.
  • A request using the mobile phone profile would return 10 product results and use the low-resolution images encoded as URIs with the same HTTP request.
  • A request using the tablet profile would return 20 product results using the larger-size low-resolution images encoded as URIs with the same HTTP request.
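Server side, profile handling can be as simple as a lookup table. In the sketch below the profile names, the image size labels, and the desktop result count are assumptions; only the 10 and 20 result counts come from the description above.

```python
from typing import Any, Dict, List, Optional

# Hypothetical profile settings; the desktop limit of 40 is an assumption.
PROFILES = {
    "desktop": {"limit": 40, "image_size": "large", "inline_images": False},
    "tablet": {"limit": 20, "image_size": "medium", "inline_images": True},
    "mobile": {"limit": 10, "image_size": "small", "inline_images": True},
}
DEFAULT_PROFILE = "desktop"


def search_results(product_ids: List[str],
                   profile: Optional[str] = None) -> Dict[str, Any]:
    """Shape a search response according to the requested device profile."""
    name = profile if profile in PROFILES else DEFAULT_PROFILE
    settings = PROFILES[name]
    results = [
        {
            "id": product_id,
            "image_size": settings["image_size"],
            # True means the image is embedded as a data URI in this response.
            "image_inline": settings["inline_images"],
        }
        for product_id in product_ids[: settings["limit"]]
    ]
    return {"profile": name, "results": results}
```

Unknown or missing profile values fall back to the default, so older clients keep working when new profiles are added.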

You can even create special profiles for devices such as feature phones. Unlike smartphones, feature phones can cache files on only a per-page basis, so it is better to send CSS and JavaScript with each request for these clients. Using profiles is an easy way to support that functionality server side.

You should use profiles instead of partial responses when the response from the server is drastically different per profile (for example, if the response has inline URI images and a compact layout in one case but not the other). Of course, profiles could be specified using a “partial response,” although that mechanism is typically used to specify a part (or portion) of a standard schema (such as a subset of a larger taxonomy), not a wholly different set of data or format.



Conclusion

There are many ways to make the Web, including the mobile Web, faster. This article is meant to be a useful reference for API developers who are designing the back-end systems that support mobile clients, with the ultimate aim of enabling and preserving a positive mobile-application user experience.

Related articles from ACM Queue

Mobile Application Development: Web vs. Native
Andre Charland and Brian LeRoux

Streams and Standards: Delivering Mobile Video
Tom Gerstel

Usablity Testing for the Web
Vikram V. Ingleshwar


Figure 1. Example request and response using a parameter to indicate image size.

Figure 2. Example request and response for a specific product, with a flag indicator to show prefetching data.

Figure 3. An API call with third-party prefetching (along with Figure 2).
