Technical Perspective: Video Quality Assessment in the Age of Internet Video

In the days of one-size-fits-all video delivery, broadcasters and rebroadcasters (like satellite and cable companies) concerned themselves with maintaining very high delivery quality under conditions they mostly controlled, from the original released content to the users’ eyeballs. Variations in delivered quality were, of course, present. Televisions differed in size and quality of electronics. Both analog and digital broadcast technologies had distance-based degradations and occasional partial or total failures. Nonetheless, users’ expectations of these systems were set to "broadcast quality" and providers were held to that standard by both regulators and the content owners. Quality measurement for broadcast video, both via objective metrics and subjective assessment, is a mature field with generally well-understood methodology and agreed standards.

Once again, it appears that "the Internet changes everything." The variety of endpoints for consuming video has dramatically increased. The variety of formats and delivery protocols has exploded, with standards lagging invention by years. Consumer access has opened up beyond the broadcasters and licensees through the global availability of "over the top" video services like Netflix, Hulu, and YouTube. All of this is eerily reminiscent of the disruptive trajectory of Voice-over-IP (VoIP), which went from a hobbyist curiosity in the mid-1990s to a competitive mainstream commercial service in the early 2000s, and is now poised to completely replace the PSTN by the year 2020. The tipping point came around 2004, when the majority of global voice traffic was carried at least partially over the Internet.

As with VoIP, we are seeing consumers substantially changing their consumption behavior and their assessment of service quality. Convenience trumps quality. Quality expectations are lower, partially due to this, but also due to dramatically lower pricing for large libraries of content. In this changed environment, what measures of quality are most relevant, and how are they obtained from consumers no longer tethered to a single delivery service? In the following paper, the authors study the relationship between various objective measures of video delivery quality and user engagement, which they propose as the overall measure of the effectiveness of a video service.

Prior to digital video services, which have the ability to easily instrument a large number of programmable endpoints, user engagement could be assessed only by expensive means and with small sample sets, as was done for decades by companies like Nielsen. Digital cable systems could get a rough measure of user engagement by logging channel-changing behaviors. More recently IPTV systems, which emulate classic over-the-air and cable broadcast systems using IP network technology, have collected both channel-change and video-quality information from endpoints. None of these systems, however, has amassed anywhere near the quantity and diversity of data available to the authors through the Conviva service, nor have they published extensive studies relating video quality measures to user engagement. For large-scale Internet video services, this study is groundbreaking and will provide a baseline set of measures for others to use in future work.

Some of the results are unsurprising. Stalls in video output (which the authors call buffering) cause the most dramatic effects on user engagement, particularly for "live" content such as sports and news. At some level, video stalls are analogous to major failures or glitches of earlier systems, and cause the most profound dissatisfaction among users. Stalls are a widespread phenomenon new to Internet video, and stem from the time- and location-varying quality of Internet bandwidth, packet loss patterns, and congestive overload. The protocols used today for adaptive streaming of video are still primitive and the subject of both active research and standardization work. As they improve, the frequency and severity of stalls should decrease, no longer masking the magnitude of other effects such as encoding quality, variations in quality due to adaptation algorithms, and join time. Other results in this work must be seen in the light of the dominance of stalls compared to the other phenomena the authors evaluate.

Despite the ubiquity and popularity of Netflix and YouTube, we are still in the early days of the Internet video phenomenon. As with VoIP, users initially were delighted to the point of astonishment that the system worked at all, since it provided access to a significantly larger library of content, at lower cost, with greater convenience, and on more devices than were available in the "walled gardens" offered by broadcasters, satellite, and cable systems. Today we are in the hyper-growth phase, where users expect the systems to have high availability but not necessarily to reach the quality of the legacy systems. It will not be long before quality, too, is competitive with, or in some cases even exceeds, that of non-Internet systems.

We will need mature measurement methodologies to support these systems. This work is a good start.

March 2013 Issue

Communications of the ACM (CACM) is now a fully Open Access publication.