October 2009 - Vol. 52 No. 10

October 2009 issue cover image

Features

BLOG@CACM

The Netflix Prize, Computer Science Outreach, and Japanese Mobile Phones

The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we'll publish excerpts from selected posts. Greg Linden writes about machine learning and the Netflix Prize, Judy Robertson offers suggestions about getting teenagers interested in computer science, and Michael Conover discusses mobile phone usage and quick response codes in Japan.
Opinion CACM online

Following the Leaders

The articles, sections, and services available on Communications' Web site all vie for visitor attention. According to our latest Web statistics, the following features are the most popular in pageviews since the site's formal launch in April 2009.
Opinion Viewpoints

Computing in the Depression Era

Since its beginning, the computer industry has been through several major recessions, each occurring approximately five years after the establishment of a new computing paradigm. These new computing modes created massive opportunities that the entrepreneurial economy rapidly supplied and then oversupplied.
Opinion Viewpoints

Reflections on Conficker

Conficker's alarming growth rate in early 2009 along with the apparent mystery surrounding its ultimate purpose had raised concern among whitehat security researchers. Here is an insider's view of the analysis and implications of the Conficker conundrum.
Opinion Viewpoints

Dealing with the Venture Capital Crisis

The venture capital industry, like financial services in general, has fallen on hard times. Part of the problem is that large payoffs have become increasingly scarce. But perhaps the biggest future challenge for VC firms will be geography. What really might jump-start the industry is more creative globalization, with an eye toward using some overseas markets as "natural incubators."
Research and Advances Contributed articles

A View of the Parallel Computing Landscape

Writing programs that scale with increasing numbers of cores should be as easy as writing programs for sequential computers. Here is a concrete example of a coordinated attack on the problem of parallelism.
Research and Advances Research highlights

Distinct-Value Synopses For Multiset Operations

The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. We provide DV estimation techniques for the case in which the dataset of interest is split into partitions. We create for each partition a synopsis that can be used to estimate the number of DVs in the partition. By combining and extending a number of results in the literature, we obtain both suitable synopses and DV estimators. The synopses can be created in parallel, and can be easily combined to yield synopses and DV estimates for "compound" partitions that are created from the base partitions via arbitrary multiset union, intersection, or difference operations. Our synopses can also handle deletions of individual partition elements. We prove that our DV estimators are unbiased, provide error bounds, and show how to select synopsis sizes in order to achieve a desired estimation accuracy. Experiments and theory indicate that our synopses and estimators lead to lower computational costs and more accurate DV estimates than previous approaches.
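The abstract above describes per-partition synopses that can be merged under multiset operations and then used to estimate distinct values. As a hedged illustration only, the sketch below implements a k-minimum-values (KMV) style synopsis, one well-known family of such summaries; it shows mergeability for union and a basic DV estimator, not the paper's exact construction, and all names are illustrative.

```python
import hashlib

def hash01(value):
    """Hash a value to a pseudo-uniform float in (0, 1]."""
    digest = hashlib.sha1(str(value).encode()).hexdigest()
    return (int(digest, 16) + 1) / float(16**40 + 1)

class KMVSynopsis:
    """k-minimum-values synopsis: keep the k smallest distinct hash values seen."""
    def __init__(self, k=256):
        self.k = k
        self.values = set()                      # distinct hash values retained

    def add(self, item):
        self.values.add(hash01(item))
        if len(self.values) > self.k:            # evict the largest hash value
            self.values.remove(max(self.values))

    def merge(self, other):
        """Synopsis of the multiset union of two partitions."""
        merged = KMVSynopsis(min(self.k, other.k))
        merged.values = set(sorted(self.values | other.values)[:merged.k])
        return merged

    def estimate(self):
        """Basic estimator (k - 1) / U_(k), where U_(k) is the k-th smallest
        hash value; falls back to an exact count while the synopsis is not full."""
        if len(self.values) < self.k:
            return float(len(self.values))
        return (self.k - 1) / max(self.values)

# Per-partition synopses can be built independently (even in parallel) and merged.
part_a, part_b = KMVSynopsis(), KMVSynopsis()
for x in range(100_000):
    part_a.add(("user", x))
for x in range(50_000, 150_000):
    part_b.add(("user", x))
print(part_a.merge(part_b).estimate())   # roughly 150,000 distinct values
```

The relative error of such an estimator shrinks with the synopsis size k, which is the sense in which one can "select synopsis sizes to achieve a desired estimation accuracy."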
Research and Advances Virtual extension

Do SAP Successes Outperform Themselves and Their Competitors?

It's been over 10 years since corporate America embraced ERP systems, but hard evidence on the financial benefits that ERP systems have provided has been elusive. This debate has spilled into the mainstream media, as America's two largest ERP vendors regularly advertise that their customers have benefited financially by using their products; some of these ads even cite studies that discredit the other's claims of financial superiority.

Investments in IT are typically justified by the productivity and profitability improvements that follow their implementation. It seems intuitive that IT will help streamline existing business processes, which should lead to a more efficient and ultimately more profitable company. Organizations of all types and sizes have invested heavily in IT based on this simple rationale, but the associated financial benefits have been difficult to nail down. While managers struggled to value their firms' IT investments, researchers tried to better understand the factors that made it so difficult to value corporate investments in IT. Among the factors that have been suggested, three may be especially useful:

• Firm-specific resources and capabilities that meaningfully impact the success of IT implementations;
• External forces that exert themselves on the firm; and
• The nature of the financial indices used to value IT investments.

Much of the early research on the value of IT investments was based on industry-level data that masked the effects of important firm-specific resources and capabilities. When researchers finally examined firm-level data, they realized that differences in, for example, IT expertise and management, the quality of a firm's leadership and other human resources, and the uniqueness of its operations affected the success of IT implementations. Companies with firm-specific advantages generally out-produced their competitors.

The success of IT implementations also depends on the unique set of unwieldy external forces that exert themselves on the firm. For example, Melville et al. suggest a host of external forces that may affect the impact of IT implementations on organizational performance, including the degree of competition within an industry, the influence of the firm's trading partners, and country characteristics. Unfortunately, studies of the relationship between IT and organizational performance are plagued by disagreements about the external constructs that should be examined, how these constructs are operationalized, and the nature of their interrelationships.

Concerns have also been expressed about the financial indices used to measure the effects of IT implementations on corporate performance. Foremost among these indices are measures of corporate productivity and profitability. Productivity is associated with how efficiently a firm manages its business processes to produce a dollar of sales. For example, employee productivity is often calculated as the dollar level of sales generated per dollar paid to employees (net sales/employee cost). Firms usually have substantial control over their business processes, which makes them potentially easier to measure and value financially. For example, firms often use proprietary processes to manage their inventories more efficiently in order to generate a higher level of sales. Therefore, how efficiently a firm manages its inventory can be measured using inventory turnover (net sales/inventory).
On the other hand, profitability is an organizational performance measure that can be affected by factors unrelated to the IT investment. These factors include, for example, the number and quality of the firm's competitors and trading partners, intra-firm shifts in spending, and macro-economic changes in interest rates, exchange rates and inflation; many pundits would argue that even the formal recognition as a su
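As a small worked illustration of the productivity indices mentioned in the excerpt above, the sketch below computes employee productivity (net sales / employee cost) and inventory turnover (net sales / inventory) for two hypothetical firms; the figures and names are made up for illustration only.

```python
def employee_productivity(net_sales, employee_cost):
    """Dollars of sales generated per dollar paid to employees."""
    return net_sales / employee_cost

def inventory_turnover(net_sales, inventory):
    """How efficiently inventory is converted into sales."""
    return net_sales / inventory

# Hypothetical annual figures (in $ millions) for two firms.
firms = {
    "adopter":    {"net_sales": 1200.0, "employee_cost": 300.0, "inventory": 150.0},
    "competitor": {"net_sales": 1100.0, "employee_cost": 320.0, "inventory": 220.0},
}

for name, f in firms.items():
    print(name,
          round(employee_productivity(f["net_sales"], f["employee_cost"]), 2),
          round(inventory_turnover(f["net_sales"], f["inventory"]), 2))
# adopter    4.0  8.0
# competitor 3.44 5.0
```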
Research and Advances Virtual extension

Balancing Four Factors in System Development Projects

The success of system development is most often gauged by three primary indicators: the number of days of deviation from the scheduled delivery date, the percentage of deviation from the proposed budget, and how well the system meets the needs of the client users. Tools and techniques to help perform well along these dimensions abound in practice and research. However, the project view of systems development should be broader than any particular development tool or methodology: any given development philosophy or approach can be inserted into a systems development project to best fit the conditions, product, talent, and goals of the markets and organization. To best satisfy the three criteria, system development project managers must focus on the process of task completion and apply controls that ensure success, promote learning within the team and organization, and yield a software product that not only meets the requirements of the client but operates efficiently and is flexible enough to be modified to meet the changing needs of the organization. In this fashion, the project view must examine both process and product.

Often, tasks required for project completion seem contradictory to organizational goals. Within the process, managerial controls are applied in order to keep the product aligned with the initial, and changing, requirements of the organization; however, freedom from tight controls promotes learning. The product also has contradictions among desired outcomes: designers must consider tradeoffs between product efficiency and flexibility, with the trend in processing power pushing ever more toward flexibility. Still, the debate rages between conflicting criteria, with advocates of a waterfall system development lifecycle (SDLC) usually pushing for control and efficient operation, while agile proponents seek more of a learning process and a flexible product. Regardless of the development methodology followed, project managers must strive to deliver the system on time, within budget, and in line with user requirements. Thus, both product and process are crucial in the determination of success.

To compound the difficulties, those in control of choosing an appropriate methodology view success criteria from a different perspective than other stakeholders. Understanding how different stakeholders perceive the factors impacting eventual project success can be valuable in adjusting methodologies appropriately. Our study examines these relationships using well-established instruments in a survey of IS development professionals to better clarify the importance of these variables in system project success and any perceived differences among players in IS development (see the sidebar "How the Study Was Conducted").
Research and Advances Virtual extension

Attaining Superior Complaint Resolution

In 2003, Dell shifted support calls for two of its corporate computer lines from its call center in Bangalore, India, back to the U.S. The reason was that its customers were not satisfied with the level of technical support they were getting. Apart from the language difficulties, customers also had difficulty reaching senior technicians who might resolve their problems more quickly. Such problems are not limited to computer vendors like Dell. Recent research from Accenture finds that 75% of a sample of consumer technology company executives believed their companies provided average customer service. To their surprise, however, 58% of their customers rated customer service as either average or below average. A further grim detail was that 81% of the respondents who rated customer service as below average expressed intent to purchase from a different vendor next time. This research highlights the importance of customer service for consumer technology companies in retaining their customers.

In general, consumer technology companies spend inordinate amounts of time, cost, and effort to get their innovations to market. However, initial acceptance is only the first step toward technology utilization. It is only after a certain amount of use that customers become aware of a technology's benefits and limitations. Having technology is one thing; using it effectively and persisting with it is quite another. Hence, the study of factors leading to consumer technology repurchase is of critical importance. Consumer technologies, in particular, demand attention due to their commoditization, increased complexity, advances in technology, and focus on high serviceability. We can note the following when we think of consumer technologies such as PCs, laptops, or mobile phones:

• The marketplace for these technologies is characterized by fierce competition among numerous players, leading to a continuous price decline. For instance, almost all computer vendors now offer laptops for a few hundred dollars, compared to thousands of dollars a few years back. As prices continue to decline, it is imperative that companies focus on providing high-level customer service to differentiate themselves from competitors, retain their existing customers, and prevent them from abandoning the product.

• Consumer technologies have also become more complex, with more functionality constantly being added to the core product. Take the case of mobile phones: what was once a simple device for making phone calls has morphed to include a digital camera, MP3 player, organizer, and Web browser, to name a few. With such additional functionality and increased complexity, a customer is likely to encounter problems whose cause is difficult to identify correctly, yet which must be resolved quickly before the customer switches to a competing product.

• Technological advances and a new generation of products have meant that both the technology providers and the customers must be knowledgeable in utilizing the consumer technologies. Without proper knowledge of the technology, support staff often struggle to resolve problems in a timely manner. For example, in resolving problems with a new release of an operating system such as Windows Vista, both the customers and Microsoft's technical staff are required to have a certain amount of knowledge about the system.

A crucial aspect of customer service is being able to resolve consumer concerns during their use of technology.
These factors contribute to difficulties in retaining customers for the consumer technology companies. One of the ways to have satisfied customers is continuing to address customer complaints effectively. Customers expect to have any service or product failures diagnosed and resolved quickly. In this context, we chose to examine how the complaint management process c
Research and Advances Virtual extension

Making Ubiquitous Computing Available

The field of ubiquitous computing was inspired by Mark Weiser's vision of computing artifacts that disappear: "They weave themselves into the fabric of everyday life until they are indistinguishable from it." Although Weiser cautioned that achieving the vision of ubiquitous computing would require a new way of thinking about computers, one that takes into account the natural human environment, to date no one has articulated this new way of thinking. Here, we address this gap, making the argument that ubiquitous computing artifacts need to be physically and cognitively available. We show what this means in practice, translating our conceptual findings into principles for design. Examples and a specific application scenario show how ubiquitous computing that depends on these principles is both physically and cognitively available, seamlessly supporting living.

The term 'ubiquitous computing' has been used broadly to include pervasive or context-aware computing, anytime-anywhere computing (access to the same information everywhere), and even mobile computing. Work on this 'ubiquitous computing' has been largely application-driven, reporting on technical developments and new applications for RFID (radio-frequency identification) technologies, smart phones, active sensors, and wearable computing. The risk is that, in focusing on the technical capabilities, the end result is a host of advanced applications that bear little resemblance to Weiser's original vision. This is a classic case of not seeing the forest for the trees. In this article, we want to take a walk in the forest, that is, to suggest a new way of thinking about how computing artifacts can assist us in living. In doing this, we draw on German philosopher Martin Heidegger's analysis of the need for equipment to be 'available.' While several influential studies in human-computer interaction (HCI) have also drawn on Heidegger and the concept of availability, these studies have focused on physical availability. While going some way toward identifying and addressing the problems that Weiser identified with traditional computing, they have not gone far enough. Delving deeper into Heidegger's analysis, we can explain why artifacts designed using the traditional model of computing tend to get in the way of what we want to do. This leads us to refine the concept of physical availability and to identify the need for computing artifacts to also be cognitively available.

We first draw on Heidegger to explain why computing artifacts designed according to the traditional model are often a hindrance rather than a help. The traditional conception of how we use computing is based on a particular understanding of human action, which we have referred to elsewhere as the deliberative theory of action. According to this deliberative theory of action, humans reflect on the world before acting. Traditionally, computing artifacts are designed to assist us by providing a representation of the world that we can reflect on before acting. In other words, the traditional computing artifact requires us to move away from acting in the world in order to 'use' the computer. In the case of the desktop computer, there is an obvious physical move away from acting in the world to 'using' the computer. Mobile technology can bring the computer to the person in the form of laptops, handhelds, and so on. However, as Figure 1 illustrates, mobility, in and of itself, does nothing to remove the dichotomy between reflecting on the world and acting in the world.
We consider that Heidegger's account of how we act in the world is a truer account of everyday activity than the deliberative theory of action implicit in Figure 1. According to Heidegger's situated theory of action, we are already thrown into the world, continually responding to the situations we encounter. This means that in everyday activity we
Research and Advances Virtual extension

De-Escalating IT Projects: The DMM Model

Taming runaway Information Technology (IT) projects is a challenge that most organizations have faced and that managers continue to wrestle with. These are projects that grossly exceed their planned budgets and schedules, often by a factor of two to three or more. Many end in failure; failure not only in the sense of budget or schedule, but in terms of delivered functionality as well. Runaway projects are frequently the result of escalating commitment to a failing course of action, a phenomenon that occurs when investments fail to work out as envisioned and decision-makers compound the problem by persisting irrationally. Keil, Mann, and Rai reported that 30--40% of IT projects exhibit some degree of escalation. To break the escalation cycle, de-escalation of commitment to the failing course of action must occur so that valuable resources can be channeled into more productive use. But making de-escalation happen is neither easy nor intuitive. This article briefly examines three approaches that have been suggested for managing de-escalation. By combining elements from the three approaches, we introduce a de-escalation management maturity (DMM) model that provides a useful framework for improving practice.
Research and Advances Virtual extension

Human Interaction For High-Quality Machine Translation

Translation from a source language into a target language has become a very important activity in recent years, both in official institutions (such as the United Nations and the EU, or the parliaments of multilingual countries like Canada and Spain) and in the private sector (for example, to translate user manuals or newspaper articles). Prestigious clients such as these cannot make do with approximate translations; for all kinds of reasons, ranging from legal obligations to good marketing practice, they require target-language texts of the highest quality. The task of producing such high-quality translations is a demanding and time-consuming one that is generally entrusted to expert human translators. The problem is that, with growing globalization, the demand for high-quality translation has been steadily increasing, to the point where there are just not enough qualified translators available today to satisfy it. This has dramatically raised the need for improved machine translation (MT) technologies.

The field of MT has undergone something of a revolution over the last 15 years, with the adoption of empirical, data-driven techniques originally inspired by the success of automatic speech recognition. Given the requisite corpora, it is now possible to develop new MT systems in a fraction of the time and with much less effort than was previously required under the formerly dominant rule-based paradigm. As for the quality of the translations produced by this new generation of MT systems, there has also been considerable progress; generally speaking, however, it remains well below that of human translation. No one would seriously consider directly using the output of even the best of these systems to translate a CV or a corporate Web site, for example, without submitting the machine translation to a careful human revision. As a result, those who require publication-quality translation are forced to make a difficult choice between systems that are fully automatic but whose output must be attentively post-edited, and computer-assisted translation systems (CAT tools for short) that allow for high quality but to the detriment of full automation.

Currently, the best-known CAT tools are translation memory (TM) systems. These systems recycle sentences that have previously been translated, either within the current document or earlier in other documents. This is very useful for highly repetitive texts, but not of much help for the vast majority of texts composed of original material. Since TM systems were first introduced, very few other types of CAT tools have been forthcoming. Notable exceptions are the TransType system and its successor TransType2 (TT2). These systems represent a novel reworking of the old idea of interactive machine translation (IMT). Initial efforts on TransType are described in detail in Foster; suffice it to say here that the system's principal novelty lies in the fact that the human-machine interaction focuses on the drafting of the target text, rather than on the disambiguation of the source text, as in all former IMT systems. In the TT2 project, this idea was developed further. A full-fledged MT engine was embedded in an interactive editing environment and used to generate suggested completions of each target sentence being translated. These completions may be accepted or amended by the translator; once validated, they are exploited by the MT engine to produce further, hopefully improved, suggestions.
This is in marked contrast with traditional MT, where typically the system is first used to produce a complete draft translation of a source text, which is then post-edited (corrected) offline by a human translator. TT2's interactive approach offers a significant advantage over traditional post-editing. In the latter paradigm, there is no way for the system, which is off-line, to benefit from the user's corrections; in TransType, just the opposite is true. As soon as
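To make the interaction pattern described in this excerpt concrete, here is a minimal, hypothetical sketch of a TransType-style loop: the engine proposes a completion of the validated target prefix, the translator accepts it or types a correction, and the new prefix conditions the next suggestion. The word-for-word lookup table stands in for a real MT engine and is purely illustrative; it is not how TT2 works internally.

```python
# Toy stand-in for an MT engine: completes the target with a fixed gloss of the
# remaining source words.  A real engine would run a prefix-constrained search
# over its translation model; this lookup table is only for illustration.
TOY_LEXICON = {"la": "the", "maison": "house", "est": "is", "verte": "green"}

def suggest_completion(source_sentence, target_prefix):
    already = len(target_prefix.split())
    remaining = source_sentence.split()[already:]
    return " ".join(TOY_LEXICON.get(word, word) for word in remaining)

def interactive_translate(source_sentence):
    prefix = ""                                     # validated target-language text so far
    while len(prefix.split()) < len(source_sentence.split()):
        suggestion = suggest_completion(source_sentence, prefix)
        print(f"source : {source_sentence}")
        print(f"target : {prefix}| {suggestion}")   # '|' marks the end of the validated prefix
        action = input("Enter = accept suggestion, or type the next word(s): ").strip()
        if action == "":
            prefix = (prefix + " " + suggestion).strip()   # whole suggestion validated
        else:
            prefix = (prefix + " " + action).strip()       # correction joins the prefix
        # Because the loop re-queries the engine with the updated prefix, every
        # validated correction immediately shapes the next suggestion --
        # unlike offline post-editing, where the system never sees the edits.
    return prefix

# interactive_translate("la maison est verte")  ->  "the house is green"
```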
Research and Advances Virtual extension

How Effective Is Google’s Translation Service in Search?

In multilingual countries (Canada, Hong Kong, and India, among others), in large international organizations or companies (such as the WTO or the European Parliament), and among Web users in general, accessing information written in other languages has become a real need (news, hotel or airline reservations, government information, statistics). While some users are bilingual, others can read documents written in another language but cannot formulate a query to search for them, or at least cannot provide reliable search terms in a form comparable to those found in the documents being searched. There are also many monolingual users who may want to retrieve documents in another language and then have them translated into their own language, either manually or automatically. Translation services may, however, be too expensive, not readily accessible, or not available within a short timeframe. On the other hand, many documents contain non-textual information such as images, videos, and statistics that do not need translation and can be understood regardless of the language involved.

In response to these needs, and in order to make the Web universally available regardless of language barriers, in May 2007 Google launched a translation service that now provides two-way online translation mainly between English and 41 other languages, for example, Arabic, simplified and traditional Chinese, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish (http://translate.google.com/). Over the last few years other free Internet translation services have become available, for example from BabelFish (http://babel.altavista.com/) and Yahoo! (http://babelfish.yahoo.com/). These two systems are similar to that used by Google, given they are based on technology developed by Systran, one of the earliest companies to develop machine translation. Also worth mentioning here is the Promt system (also known as Reverso, http://translation2.paralink.com/), which was developed in Russia to provide translation mainly between Russian and other languages.

The question we would like to address here is to what extent a translation service such as Google's can produce adequate results in a language other than the one used to write the query. Although we will not evaluate translations per se, we will test and analyze various systems in terms of their ability to retrieve items automatically based on a translated query. To be adequate, these tests must be done on a collection of documents written in one given language, plus a series of topics (expressing user information needs) written in other languages, plus a series of relevance assessments (the relevant documents for each topic).
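The evaluation setup the authors describe (a monolingual document collection, topics in another language, and relevance judgments) can be sketched roughly as follows. The translation and retrieval calls are placeholders for an online MT service and an IR engine, respectively; the scoring uses uninterpolated average precision, one common choice, not necessarily the measure used in the study.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Uninterpolated average precision for one topic."""
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / max(len(relevant_ids), 1)

def evaluate_translated_queries(topics, relevance, translate, search):
    """topics: {topic_id: query written in the user's language}
    relevance: {topic_id: set of relevant doc ids in the target-language collection}
    translate/search: placeholder callables (e.g., an MT service and an IR engine)."""
    scores = {}
    for topic_id, query in topics.items():
        translated = translate(query)               # query rendered in the collection language
        ranked = search(translated, top_k=1000)     # ranked doc ids from the monolingual index
        scores[topic_id] = average_precision(ranked, relevance[topic_id])
    mean_ap = sum(scores.values()) / len(scores)    # mean average precision over all topics
    return mean_ap, scores
```

Comparing the mean average precision obtained with translated queries against that obtained with queries written directly in the collection language gives the kind of "adequacy" figure the article is after.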
Research and Advances Virtual extension

Overcoming the J-Shaped Distribution of Product Reviews

While product review systems that collect and disseminate opinions about products from recent buyers (Table 1) are valuable forms of word-of-mouth communication, evidence suggests that they are overwhelmingly positive. Kadet notes that most products receive almost five stars. Chevalier and Mayzlin also show that book reviews on Amazon and Barnes & Noble are overwhelmingly positive. Is this because all products are simply outstanding? A graphical representation of product reviews instead reveals a J-shaped distribution (Figure 1), with mostly 5-star ratings, some 1-star ratings, and hardly any ratings in between. What explains this J-shaped distribution? If products are indeed outstanding, why do we also see many 1-star ratings? Why aren't there any product ratings in between? Is it because there are no "average" products? Or is it because there are biases in product review systems? If so, how can we overcome them?

The J-shaped distribution also creates some fundamental statistical problems. Conventional wisdom assumes that the average of the product ratings is a sufficient proxy of product quality and product sales, and many studies have used the average of product ratings to predict sales. However, these studies showed inconsistent results: some found product reviews to influence product sales, while others did not. The average is statistically meaningful only when it is based on a unimodal distribution, or when it is based on a symmetric bimodal distribution. Since product review systems have an asymmetric bimodal (J-shaped) distribution, the average is a poor proxy of product quality.

This report aims, first, to demonstrate the existence of a J-shaped distribution; second, to identify the sources of bias that cause the J-shaped distribution; third, to propose ways to overcome these biases; and finally, to show that overcoming these biases helps product review systems better predict future product sales. We tested the distribution of product ratings for three product categories (books, DVDs, videos) with data from Amazon collected between February and July 2005: 78%, 73%, and 72% of the product ratings for books, DVDs, and videos, respectively, are greater than or equal to four stars (Figure 1), confirming our proposition that product reviews are overwhelmingly positive. Figure 1 (left graph) shows a J-shaped distribution over all products; this contradicts the "law of large numbers," which would imply a normal distribution. Figure 1 (middle graph) shows the distribution for three randomly selected products in each category with over 2,000 reviews. These reviews still have a J-shaped distribution, implying that the J-shaped distribution is not due to a "small numbers" problem. Figure 1 (right graph) shows that even products with a median average review (around 3 stars) follow the same pattern.
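A quick numeric illustration of why the simple average misleads for a J-shaped distribution, using made-up rating counts loosely resembling Figure 1: most ratings sit at 5 stars, a smaller spike sits at 1 star, and very little lies in between, yet the mean lands near 4 stars, a value that almost no reviewer actually reported.

```python
# Hypothetical counts of 1-5 star ratings for a product with a J-shaped profile.
counts = {1: 120, 2: 10, 3: 15, 4: 30, 5: 625}

n = sum(counts.values())
mean = sum(star * c for star, c in counts.items()) / n
mode = max(counts, key=counts.get)
share_at_mean = counts[round(mean)] / n

print(f"mean rating  : {mean:.2f}")    # ~4.29
print(f"modal rating : {mode}")        # 5
print(f"share of reviews at the rounded mean ({round(mean)} stars): {share_at_mean:.1%}")  # ~3.8%
# The average falls in the sparsely populated middle of an asymmetric bimodal
# distribution, so it is a poor proxy for what reviewers actually reported.
```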
