Computing Applications Viewpoint

Google’s Hybrid Approach to Research

By closely connecting research and development, Google is able to conduct experiments on an unprecedented scale, often resulting in new capabilities for the company.

By Alfred Spector, Peter Norvig, and Slav Petrov

Posted Jul 1 2012


Google Fellow Jeffrey Dean discusses MapReduce, Google File System, and BigTable during a keynote session.

In this viewpoint, we describe how we organize computer science research at Google. We focus on how we integrate research and development and discuss the benefits and risks of our approach. The challenge in organizing R&D is great because CS is an increasingly broad and diverse field. It combines aspects of mathematical reasoning, engineering methodology, and the empirical approaches of the scientific method. The empirical components are clearly on the upswing, in part because the computer systems we construct have become so large that analytic techniques cannot properly describe their properties, because the systems now dynamically adjust to the difficult-to-predict needs of a diverse user community, and because the systems can learn from vast datasets and large numbers of interactive sessions that provide continuous feedback.

We have also noted that CS is an expanding sphere, where the core of the field (theory, operating systems, and so forth) continues to grow in depth, while the field keeps expanding into neighboring application areas. Research results come not only from universities, but also from companies, both large and small. The way research results are disseminated is also evolving and the peer-reviewed paper is under threat as the dominant dissemination method. Open source releases, standards specifications, data releases, and novel commercial systems that set new standards upon which others then build are increasingly important.

To compare our approach to research with that of other companies is beyond the scope of this Viewpoint. But, for reference, we note that in the terminology of Pasteur’s Quadrant,¹¹ we do “use-inspired basic” and “pure applied” (CS) research. Buderi² and Dodgson et al.⁵ discuss information technology research generally, pointing out the movement in industrial labs toward research that strongly considers product needs. Recent articles, such as those by Leifer et al.⁸ and Enkel et al.,⁶ illustrate related issues on how firms do research and catalyze innovation.

The goal of research at Google is to bring significant, practical benefits to our users, and to do so rapidly, within a few years at most. Research happens throughout Google, exploring technical innovations whose implementation is risky, and may well fail. Sometimes, research at Google operates in entirely new spaces, but most frequently, the goals are major advances in areas where the bar is already high, but there is still potential for new methods. In these cases, simply establishing the feasibility of a research idea may be a substantial task, but even greater effort is required to create a true success or useful negative result.

Because of the time frame and effort involved, Google’s approach to research is iterative and usually involves writing production, or near-production, code from day one. Elaborate research prototypes are rarely created, since their development delays the launch of improved end-user services. Typically, a single team iteratively explores fundamental research ideas, develops and maintains the software, and helps operate the resulting Google services—all driven by real-world experience and concrete data. This long-term engagement serves to eliminate most risk to technology transfer from research to engineering. This approach also helps ensure the research efforts produce results that benefit Google’s users, by allowing research ideas and implementations to be honed on empirical data and real-world constraints, and by utilizing even failed efforts to gather valuable data and statistics for further attempts.

Implications of Google’s Mission and Capabilities

Google’s mission, “To organize the world’s information and make it universally accessible and useful,” both supports and requires innovation in almost all CS disciplines. For example, we aim to “understand” user intent and the meaning of documents, to translate between languages with ever-higher fidelity, and to be able to transform content in one modality (say, image) into relevant content in all others (say, text). Google’s entire organization is focused on rapid innovation, and three aspects of Google’s technology and business model support this:

Organizing all of the world’s information requires large amounts of resources. By providing a rich set of computing abstractions and powerful processors, storage, and networking capabilities in our data centers, Google has been able to gain economies of scale and to sidestep some of the complexity of heterogeneous computing environments.
The services-based delivery model brings significant benefits to research and development. Even a small team has at its disposal the power of many internal services, allowing the team to quickly create complex and powerful products and services. Design, testing, production, and maintenance processes are simplified. Additionally, the services model, particularly one where there is significant consumer engagement, facilitates empirical research.
Google has been able to hire a talented team across the entire engineering operation. This gives us the opportunity to innovate everywhere, and for people to move between projects, whether they be primarily research or primarily engineering.

Hybrid Research at Google

Google’s focus on innovation, its services model, its large user community, its talented team, and the evolutionary nature of CS research have led Google to a “Hybrid Research Model.” In this model, we blur the line between research and engineering activities and encourage teams to pursue the right balance of each, knowing that this balance varies greatly. We also maintain considerable fluidity in terms of moving both people and projects as needs change. As such, even in areas where there is a much higher proportion of research to engineering, the “Research Team” we have established is not as formally separate from engineering activities as those in other organizations; for example, it also runs large production systems. Overall, we undertake research work when we feel its substantially higher risk is warranted by a chance of more significant potential impact. Research also has the potential to impact the world both through Google’s products and services, and through the academic research community. We recognize that the wide dissemination of fundamental results often benefits us by garnering valuable feedback, educating future hires, providing collaborations, and seeding additional work.

In no way do we feel our model precludes long-term research: we just try to “factorize” it into shorter-term, measurable components. This provides benefits to us in terms of team motivation (based upon evidence of concrete progress in reasonable time periods) and the potential for commercial benefit (in advance of the complete fulfillment of all objectives). Even when we cannot fully factorize work, we have sometimes undertaken longer-term efforts. For example, we have started multiyear, large systems efforts (including Google Translate, Chrome, and Google Health) that have important research components. These projects were characterized by the need for complex systems and research (such as Web-scale identification of parallel corpora for Translate¹² and various complex security features in Chrome⁹ and Health). At the same time, we have recently shown that even in longer-term, publicly launched efforts, we are unafraid to refocus our work (for example, Google Health) if it seems we are not achieving success.

Clearly, this approach benefits from the mainly evolutionary nature of CS research, where great results are usually the composition of many discrete steps. If the discrete steps required large leaps in vastly different directions, we admit that our primarily hill-climbing-based approach might fail. Thus, we have structured the Google environment as one where new ideas can be rapidly verified by small teams through large-scale experiments on real data, rather than just debated. The small-team approach benefits from the services model, which enables a few engineers to create new systems and put them in front of users. This in turn enables us to conduct experiments at a scale that is generally unprecedented for research and development projects. One consequence is that many projects can directly affect billions of users. This naturally influences how researchers choose to spend their time, balancing the opportunity to have impact through Google’s services with the opportunity to have impact in the academic community. Google encourages both kinds of impact, and some of the most successful projects achieve both.
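The hill-climbing caveat above can be illustrated with a minimal sketch. The payoff function `landscape` and the helper `hill_climb` are hypothetical names invented purely for illustration; they do not describe any Google system or process. The sketch shows only that a search taking small, individually measurable steps stops at the nearest peak, even when a much taller one lies beyond a valley.

```python
def hill_climb(score, start, step=1, max_iters=1000):
    """Greedy local search: move to the better neighbor until none improves."""
    current = start
    for _ in range(max_iters):
        neighbors = [current - step, current + step]
        best = max(neighbors, key=score)
        if score(best) <= score(current):
            return current  # no neighbor improves: a (possibly local) peak
        current = best
    return current


def landscape(x):
    """Hypothetical payoff: a small peak at x = 2 and a taller one at x = 10."""
    return max(5 - (x - 2) ** 2, 20 - (x - 10) ** 2)


if __name__ == "__main__":
    for start in (0, 6):
        peak = hill_climb(landscape, start)
        print(f"start={start} -> peak at x={peak}, payoff={landscape(peak)}")
```

Starting from 0, the search settles on the small peak at x = 2 (payoff 5); starting from 6, it reaches the taller peak at x = 10 (payoff 20). Reaching the taller peak from the small one would require crossing a region where every intermediate step looks worse, which is the kind of large leap an incremental approach may not take.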

We thus define our hybrid research model as one that aims to generate scientific and engineering advances in fields of import to Google; that does so in a way that tends to factorize longer projects (perhaps with very challenging goals) into discrete, achievable steps (each of which may be of commercial value); where we maximally leverage our cloud computing models and large user base to support in vivo research; where we allow for the maximal amount of organizational flexibility so we can support both projects that require some room to grow unfettered by current constraints and projects that require close integration with existing products; and where we emphasize knowledge dissemination using a flexible collection of different approaches.

Example Research Patterns

1. An advanced project in a product-focused team that, by virtue of its creativity and newness, changes the state of the art and thereby produces new research results. The first and most prevalent pattern exemplifies how blurry the line between research and development work can be. Operating at large scale, engineering teams are often faced with novel challenges which, when overcome, constitute research results. Organizationally, research is done in situ by the product team to achieve its goals. The most successful high-profile examples of this pattern are systems infrastructure projects such as MapReduce,⁴ Google File System,⁷ and BigTable.³
2. A project in the research group that results in new products or services. The second pattern is research followed by the operation of the production service based on that research. Both Google Translate and Voice Search¹⁰ are examples of this pattern, where the cloud computing infrastructure enabled small research teams to build systems that could be deployed. This pattern applies best when continuing research can further improve and extend the resulting products.
3. A project in the research group that creates new concepts and technologies, which are then applied to existing products or services. The third pattern is a traditional research and development model. Google’s success with this model of research benefits from the services model and from the emphasis on data-driven evaluation. For instance, some new audio and video fingerprinting techniques,¹ which researchers were able to demonstrate not only on small test cases, but on real data at production scale, were then productized by YouTube engineers.
4. A joint research project between an engineering team and the research group that is then used by that engineering team. The fourth pattern is a collaborative integration of research and development teams. Many of our products require novel algorithmic solutions to support high performance, thus posing a blend of research and engineering challenges. An example of this pattern is the work done by our Market Algorithms group in collaboration with teams working on our advertisement systems. Together, they design, modify, and analyze the core algorithms and economic mechanisms used for ad selection and optimization.
5. A research project in an engineering team that is transitioned to the research group (and eventually becomes pattern 2, 3, or 4 here). The fifth pattern, transitioning a project from an engineering team to the research team, is an important mechanism for giving a project more time or resources when the work is important more broadly than for a specific engineering team. An example of this pattern is work on YouTube recommendations, which started in various engineering groups, but then moved to a research team, where the work continued using a different, and perhaps deeper, algorithmic basis.

Successes

In the same way that it is difficult to define what exactly constitutes “research,” it can be difficult to measure its “success.” In our opinion, a research project is successful if it has academic or commercial impact, or ideally, both. Commercial impact at Google is perhaps easier to measure, and the company has benefitted from numerous advances in systems, speech recognition, language translation, machine learning, market algorithms, computer vision, and more.

By academic impact we refer to impact on the academic community, other companies or industries, and the field of computer science in general. Of course, this type of impact has most traditionally come from publications, and Google continues to publish research results at increasing rates (from 13 papers published in 2003, to 130 in 2006, to 279 in 2011). Some of our papers are highly regarded and have been extensively cited.³,⁴,⁷ But we feel that publications are by no means the only mechanism for knowledge dissemination: Googlers have led the creation of over 1,000 open source projects, contributed to various standards (for example, as editor of HTML5), and produced hundreds of public APIs for accessing our services. In some cases, we have used these different channels in symbiotic ways, following up an initial publication describing the high-level ideas (MapReduce, GFS, BigTable) with open source implementations of particular aspects (Protocol Buffers). In other cases, projects have started as open source initiatives from day one: Android and Chromium are probably the two most well-known examples of open source projects and demonstrate the effectiveness of this approach.

Discussion

Technology companies invest in research for a number of reasons, including: importance to the company’s products and services, prestige and contributions to the public good, and reducing the risk of getting blindsided by new technology developments.

Research at Google is built on the premise that connecting research with development provides teams with powerful, production-quality infrastructure and a large user base, resulting not only in innovative research, but also in valuable new commercial capabilities. By coupling research and development, our goal is to minimize or even eliminate the traditional technology transfer process, which has proven challenging at other companies. Most of our projects involve people working with a given technology from the research stage through to the product stage. This close collaboration and integration furthermore ensures the reality of the problems being investigated: research is conducted on real systems and with real users. Our flexible organization also provides diverse opportunities for our employees and has positive implications for our innovation culture and hiring ability.

Of course, this close integration also brings some risks with it. Being so close to the users and to the day-to-day activities of product teams, it is easy to get drawn in and miss new developments. To mitigate this risk, we engage with the academic community through various initiatives such as our visiting faculty program, our intern program or our faculty research awards program. We also encourage publication of research results, though we sometimes get criticized for not publishing enough. One reason for this is that researchers at Google have multiple avenues for having impact, publishing papers not being the only method. As a result, Googlers publish fewer papers, but the ones they publish can be more impactful, because they describe experience with well-tested and implemented systems, not just proposed ideas. Another potential pitfall of the hybrid research model is that it is probably more conducive to incremental research. We therefore do support paradigmatic changes as well, as exemplified by our autonomous vehicles project, Google Chauffeur, among others.

Conclusion

Many of the world’s computer science research questions are of great relevance to Google’s business, our technical leaders, and our user community. We have chosen to organize computer science research differently at Google by maximally connecting research and development. This yields not only innovative research results and new technologies, but also valuable new capabilities for the company. Our hybrid approach to research enables us to conduct experiments at a scale that is generally unprecedented for research projects, generating stronger research results that can have a wider academic and commercial impact. We also provide flexible opportunities across the R&D spectrum for our team members. While our hybrid research model exploits a number of things particular to Google, we hypothesize that it may also serve as an interesting model for other technology companies.

DOI: 10.1145/2209249.2209262

July 2012 Issue

Published: July 1, 2012

Vol. 55 No. 7

Pages: 34-37
