Engineering the Web’s Third Decade

Smart phones and wine glasses: A Rensselaer Polytechnic Institute application for location-aware phones accesses Facebook and other online sources to make wine recommendations for a particular group of friends.

Researchers working on the next generation of Web technology tend to avoid hyperbole, using language more cautious than the bravado frequently exhibited by Internet evangelists before the dot-com bust. Today, the Web is quickly advancing toward its third decade and toward what many are calling its third major upgrade. It is moving beyond the two-way interactive technologies of Web 2.0 to a more dynamic, pervasive, and perhaps even more human experience. Indeed, as Web 3.0 emerges, those at the forefront of Internet technology research suggest the next major advancements in Web technologies might be more evolutionary than revolutionary, at least for now.

The term “Web 3.0” has become loaded as a description of the Web’s next major developments, connoting very different implications for technology and society. One popular idea associated with the term is that Web 3.0 technologies will help filter the “wisdom of the crowd” so that it doesn’t become the “madness of the mob.” Critics of this position suggest this way of thinking will erode the democratization that made the Web so popular as a medium for information sharing, social interaction, and other forms of expression.

Richard Stanton, chief executive of Bintro.com, a New York-based company that bases its business model on emerging Web 3.0 technologies, sidesteps the terminology controversy and points out that Web 3.0’s social implications can be defined simply by focusing on the personal. “Data becomes much more valuable and has a much bigger return when we tailor users’ experiences to their individual needs,” Stanton says. “The more fulfillment one gains from personal experiences on the Web, the better off the masses will be, whether it is democratic, meritocratic, or anything in between.”

From a technology standpoint, researchers suggest a key aspect of Web 3.0 technology is moving beyond Web 2.0’s popular Asynchronous JavaScript and XML (AJAX) model to one more infused with semantic technologies that facilitate interlinked data and customizable, portable applications that are device- or system-neutral. Jim Hendler, for example, suggests viewing Web 3.0 simply as Semantic Web technologies powering large-scale Web apps. “The problem is that, like Web 2.0 before it, the term can be taken many ways,” says Hendler, a professor in the computer and cognitive science departments at Rensselaer Polytechnic Institute (RPI).

“Many people use Web 3.0 to mean Web applications that use semantic technologies, while others tend to use it to mean anything that fixes the many known problems with Web 2.0,” he says. “I tend to like [Radar Networks CEO] Nova Spivack’s idea that the version numbers correspond more to Web decades than to specific technologies, and that 3.0 will be the term used for all the new technologies emerging over the coming third decade of the Web.”

Debates about the merits of Web 3.0 as a label for emerging Internet technologies aside, Hendler’s own work focuses largely on Semantic Web technology and in particular on scalable reasoning and data-on-demand systems. “We are looking at technologies that could, on the fly, find and merge appropriate pieces of very large data sets into custom data caches and make those available in Web applications,” Hendler says. The key, he notes, is finding a trade-off that is more efficient than the traditional knowledge relationships that researchers working in AI might use, but more powerful than the relational models that have been the hallmark of database research.

Hendler is working with the data that the U.S. government is releasing in the data.gov project with the purpose of making it available in Semantic Web formats. In practical terms, Hendler and his team are focused on linking the data to other data sets and connecting it into information sources in what researchers are now calling the “linked open data cloud,” a set of data sets that have partial mappings to other data sets and domains, so that developers can mash up the data and write Web apps on top of it.
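The idea of data sets with partial mappings to one another can be illustrated with a small sketch. It is not the data.gov or RPI implementation; the identifiers, the values, and the `link` helper are all invented for illustration, and the mapping is a plain dictionary rather than a real Semantic Web format.

```python
from typing import Dict, Tuple

# Two hypothetical data sets that name the same regions differently.
# All identifiers and numbers are invented for this example.
SPENDING: Dict[str, int] = {"setA:north": 120, "setA:south": 300, "setA:west": 210}
POPULATION: Dict[str, float] = {"setB:region-1": 19.5, "setB:region-2": 37.0}

# A partial mapping: only some entries in SPENDING have a known
# counterpart in POPULATION ("setA:west" has none).
SAME_AS: Dict[str, str] = {
    "setA:north": "setB:region-1",
    "setA:south": "setB:region-2",
}


def link(
    left: Dict[str, int], right: Dict[str, float], same_as: Dict[str, str]
) -> Dict[str, Tuple[int, float]]:
    """Combine values from two data sets wherever a mapping links them."""
    linked = {}
    for left_id, left_value in left.items():
        right_id = same_as.get(left_id)
        if right_id in right:  # skips unmapped or missing entries
            linked[left_id] = (left_value, right[right_id])
    return linked


MASHUP = link(SPENDING, POPULATION, SAME_AS)
```

Entries without a mapping are simply left out of the result, which mirrors the "partial" nature of the links: a mashup can only combine what the mappings connect.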

In another project, Hendler is using supercomputers to scale Semantic Web algorithms to extremely large data sets. “We’ve been playing with graphs that have over a billion triples [the assertions underlying the Semantic Web],” he says. “There’s really only a small number of groups working on this approach, and we think we’re the only U.S. group in the space, so it is great fun.” As it turns out, Hendler and his team at RPI have been able to engineer new kinds of parallelization for Semantic Web processes. He says these developments might soon enable his team to migrate the algorithms to commodity hardware to power large-scale Web apps used by millions of people.
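For readers unfamiliar with triples, the following is a minimal sketch of the idea: each assertion is a (subject, predicate, object) statement, and a query is a pattern with wildcards. The `ex:` identifiers and the `match` helper are hypothetical, and this linear scan over an in-memory set says nothing about how Hendler's group processes a billion triples.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical identifiers; the facts themselves are taken from the article.
GRAPH: Set[Triple] = {
    ("ex:Hendler", "ex:worksAt", "ex:RPI"),
    ("ex:Hendler", "ex:researches", "ex:SemanticWeb"),
    ("ex:Lassila", "ex:worksAt", "ex:Nokia"),
    ("ex:Lassila", "ex:researches", "ex:SemanticWeb"),
}


def match(
    graph: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in graph:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def researchers_of(graph: Set[Triple], topic: str) -> List[str]:
    """Return, in sorted order, the subjects asserted to research a topic."""
    return sorted(subj for subj, _, _ in match(graph, p="ex:researches", o=topic))
```

Calling `researchers_of(GRAPH, "ex:SemanticWeb")` scans every triple once, which is exactly the cost that becomes a problem at very large scale and that motivates parallel approaches.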

Billions of Triples

Despite the promising developments, challenges in this area remain. While Hendler and his team are experimenting with billions of triples, the Open Calais project, one of many new endeavors in this area, is creating 800 million triples each week. “In the way the Web has of making scale critical, the numbers are growing really big, really fast,” says Hendler. Such scale will become even more of an issue as more applications begin linking to other apps through the Semantic Web layer.

Another researcher working on how Semantic Web technologies can facilitate information handling and more precise representations is Ora Lassila, a senior data technologist at Nokia Services and a member of the Nokia CEO Technology Council. Lassila is focused on a specific aspect of Semantic Web technologies, which he calls “provenance”—that is, where a piece of information came from, who generated it, and when. One of his goals is to facilitate the transformation of Web 2.0 mashups so that information bits from multiple sources can still retain their provenance.

“Thus, you would be able to dissect information and better understand its reliability and trustworthiness,” Lassila says. In this line of thinking, Lassila rejects the notion that Web 3.0 might lead to a reduction in democratization. “It seems to me,” he says, “that making it easier to disseminate trustworthy information would have the opposite, positive effect.”
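One way to picture provenance surviving a mashup is to attach the "where, who, and when" to every statement, so that merged data can still be filtered by origin. The sketch below is an assumption-laden illustration, not Lassila's or Nokia's design; the `Statement` class, the feed names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    source: str   # where the information came from
    author: str   # who generated it
    created: str  # when it was generated (ISO date string)


# Hypothetical feeds; every value here is invented for illustration.
SOCIAL_FEED = [
    Statement("ex:wine-1", "ex:likedBy", "ex:alice",
              source="social-network", author="alice", created="2009-06-01"),
]
REVIEW_FEED = [
    Statement("ex:wine-1", "ex:rating", "4 stars",
              source="review-site", author="critic", created="2009-05-20"),
    Statement("ex:wine-1", "ex:rating", "1 star",
              source="anonymous-forum", author="unknown", created="2009-05-22"),
]


def mash_up(*feeds: Iterable[Statement]) -> List[Statement]:
    """Merge feeds without discarding each statement's provenance."""
    merged: List[Statement] = []
    for feed in feeds:
        merged.extend(feed)
    return merged


def from_trusted(statements: Iterable[Statement], trusted: Set[str]) -> List[Statement]:
    """Keep only statements whose source is in the trusted set."""
    return [st for st in statements if st.source in trusted]


MERGED = mash_up(SOCIAL_FEED, REVIEW_FEED)
TRUSTED = from_trusted(MERGED, {"social-network", "review-site"})
```

Because the provenance fields travel with each statement, the decision about what to trust can be made after the merge, by whoever consumes the mashup.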

Lassila says he is surprised at how quickly Semantic Web ideas have been embraced by Internet developers, particularly in the past few years during which many Semantic Web ideas and formats have been adopted by even large Internet companies. For example, Google now supports a technology called Rich Snippets and Yahoo has created Search Monkey, both of which rely on Semantic Web strategies. Powerset’s semantic technology—acquired by Microsoft in 2008—is reportedly a significant component of Bing, Microsoft’s search engine.

As another example, Bintro.com is using semantics to enhance matching technologies and simplify the way users’ needs are fulfilled online. Bintro’s technology combines public semantic knowledge bases with the company’s own knowledge base, which includes subject-specific terminology and jargon. Bintro uses semantic data that in most cases was not compiled for the purpose of matchmaking, making the effective organization of it a challenge that Stanton says is unique to the company. One aspect of this challenge, in particular, is replacing existing multi-select fields by using semantic data relationships from narrative fields.

According to Stanton, Bintro’s goal is not only to demonstrate the use of Web 3.0 technologies today, but also to build an engine for powering other Web 3.0 apps in the future. “Web 3.0 is all about personalization,” he says. “Instead of simply looking at the user as an eyeball, Web 3.0 aims to look at the user as an engaged personality with multiple facets from which the context of a user’s statement can draw a better result.”

Lassila points to this type of highly customized user experience as an ongoing challenge at Nokia. “This is good for the users, but I am not entirely convinced how sustainable this is, as the implementation part becomes more and more difficult,” Lassila says. Still, he says Semantic Web technologies hold great promise, particularly in situations in which users might require useful information from multiple data sources. “There are plenty of existing opportunities for clever data management,” he says.

As for future research, Lassila says he is committed to working toward a “substantial convergence” of technologies, with computing machinery such as phones, PCs, and appliances connecting with communication systems to facilitate seamless interaction with family, friends, and colleagues, regardless of the different technologies involved. “It should not matter where the data comes from, where it resides, or what applications or systems create it,” Lassila says. “What matters is how I want to use it.”

Echoing this sentiment, Stanton suggests the future Web will facilitate a more pervasive and intuitive user experience, providing content or services specific to the user’s implied needs. “I was a big fan of The Jetsons as a kid and I always loved how effortless their interaction with technology was,” he says. “The Semantic Web puts us one step closer to such a reality.”

For his part, RPI’s Hendler predicts that in five years, when Web 3.0 strategies have begun to mature, Web applications might still look a lot like they do today but will have much more data available to them, will have search-like capabilities far more sophisticated than current search engines, and will be able to exploit query context much more effectively. Hendler also predicts that much more of our access to the Web will be from mobile devices, with location and social context more readily available to applications that are given elevated privileges.

Still, like many researchers working in this area, Hendler is already looking beyond emerging Semantic Web strategies and related technologies that are now collectively called Web 3.0. “This stuff is new and exciting,” he says. “But I look at it this way: I started playing with the Semantic Web back in the 1990s. As a researcher, I’m not content to sit around and exploit Web 3.0; my job is to help create Web 4.0.”

Further Reading

Berners-Lee, T., Hendler, J., and Lassila, O.
The Semantic Web. Scientific American 284, 5, May 2001.

Harris, D.
Web 2.0 Evolution into The Intelligent Web 3.0, Emereo Publishing, Brisbane, Australia, 2008.

Hendler, J.
Web 3.0 emerging. Computer 42, 1, January 2009.

Shadbolt, N., Berners-Lee, T., and Hall, W.
The Semantic Web revisited. Intelligent Systems 21, 3, May 2006.

Warren, P., Davies, J., and Brown, D.
The Semantic Web: from vision to reality. ICT Futures: Delivering Pervasive, Real-time and Secure Services, John Wiley & Sons, Hoboken, NJ, 2008.
