
Research and Advances

Human Interaction For High-Quality Machine Translation

Translation from a source language into a target language has become a very important activity in recent years, both in official institutions (such as the United Nations and the EU, or the parliaments of multilingual countries like Canada and Spain) and in the private sector (for example, to translate user manuals or newspaper articles). Prestigious clients such as these cannot make do with approximate translations; for all kinds of reasons, ranging from legal obligations to good marketing practice, they require target-language texts of the highest quality. The task of producing such high-quality translations is a demanding and time-consuming one that is generally entrusted to expert human translators. The problem is that, with growing globalization, the demand for high-quality translation has been steadily increasing, to the point where there are simply not enough qualified translators available today to satisfy it. This has dramatically raised the need for improved machine translation (MT) technologies.

The field of MT has undergone something of a revolution over the last 15 years, with the adoption of empirical, data-driven techniques originally inspired by the success of automatic speech recognition. Given the requisite corpora, it is now possible to develop new MT systems in a fraction of the time and with much less effort than was previously required under the formerly dominant rule-based paradigm. As for the quality of the translations produced by this new generation of MT systems, there has also been considerable progress; generally speaking, however, it remains well below that of human translation. No one would seriously consider directly using the output of even the best of these systems to translate a CV or a corporate Web site, for example, without submitting the machine translation to careful human revision. As a result, those who require publication-quality translation are forced to make a difficult choice between systems that are fully automatic but whose output must be attentively post-edited, and computer-assisted translation systems (CAT tools for short) that allow for high quality but at the expense of full automation.

Currently, the best-known CAT tools are translation memory (TM) systems. These systems recycle sentences that have previously been translated, either within the current document or earlier in other documents. This is very useful for highly repetitive texts, but not of much help for the vast majority of texts composed of original material. Since TM systems were first introduced, very few other types of CAT tools have been forthcoming. Notable exceptions are the TransType system and its successor TransType2 (TT2). These systems represent a novel reworking of the old idea of interactive machine translation (IMT). Initial efforts on TransType are described in detail in Foster; suffice it to say here that the system's principal novelty lies in the fact that the human-machine interaction focuses on the drafting of the target text, rather than on the disambiguation of the source text, as in all former IMT systems. In the TT2 project, this idea was developed further. A full-fledged MT engine was embedded in an interactive editing environment and used to generate suggested completions of each target sentence being translated. These completions may be accepted or amended by the translator; once validated, they are exploited by the MT engine to produce further, hopefully improved, suggestions.
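To make the interaction concrete, here is a minimal sketch of such a prefix-completion loop in Python. It is our illustration, not TT2's actual engine: `suggest_completion` and its canned translation are invented stand-ins for a real MT decoder, which would search for the best target text extending the validated prefix.

```python
# Minimal sketch of a TransType-style interaction loop. The
# suggest_completion() below is a toy stand-in for a real MT engine.

def suggest_completion(source: str, prefix: str) -> str:
    """Return a completion of `prefix` for the given source sentence."""
    canned = "the committee approved the annual report"  # toy "translation"
    if canned.startswith(prefix):
        return canned[len(prefix):]
    return ""  # the prefix diverged from the only translation we know

def interactive_translation(source: str) -> str:
    prefix = ""  # the validated part of the target sentence
    while True:
        suggestion = suggest_completion(source, prefix)
        print(f"draft: {prefix}[{suggestion}]")
        action = input("accept (a) or type the corrected continuation: ")
        if action == "a":
            return prefix + suggestion  # translator validates the full draft
        prefix += action
        # The engine now re-predicts from the extended, validated prefix:
        # this is how each correction immediately feeds back into the system.
```

Each pass through the loop mirrors the cycle described above: the engine proposes, the translator validates or corrects, and the correction constrains the next proposal.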
TT2's interactive drafting stands in marked contrast with traditional MT, where the system is typically used first to produce a complete draft translation of a source text, which is then post-edited (corrected) offline by a human translator. The interactive approach offers a significant advantage over traditional post-editing: in the latter paradigm, there is no way for the offline system to benefit from the user's corrections; in TransType, just the opposite is true. As soon as
News

Medical Nanobots

Researchers working in medical nanorobotics are creating technologies that could lead to novel health-care applications, such as new ways of accessing areas of the human body that would otherwise be unreachable without invasive surgery.
Research and Advances

Optimistic Parallelism Requires Abstractions

Writing software for multicore processors would be greatly simplified if we could automatically parallelize sequential programs. Although auto-parallelization has been studied for many decades, it has succeeded only in a few application areas, such as dense matrix computations. 
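Dense matrix code is the classic success story because the iterations of its loops are provably independent, so they can be distributed across cores without speculation. A small Python sketch of our own (the names and structure are illustrative, not from the article):

```python
# Sketch: row-wise matrix-vector product, the kind of loop whose
# iterations are independent and therefore easy to auto-parallelize.
from concurrent.futures import ProcessPoolExecutor

def dot(row, vec):
    return sum(a * b for a, b in zip(row, vec))

def parallel_matvec(matrix, vec):
    # Each row's dot product reads shared, immutable inputs and writes
    # a disjoint output slot, so the iterations can run in any order.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(dot, matrix, [vec] * len(matrix)))

if __name__ == "__main__":
    m = [[1, 2], [3, 4]]
    v = [10, 20]
    print(parallel_matvec(m, v))  # [50, 110]
```

Irregular programs built around pointer-based data structures lack this statically provable independence, which motivates the optimistic (speculative) approach named in the title.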
Opinion

Future Tense: Confusions of the Hive Mind

Be cautious about the artificial intelligence approach to computer science. It is impossible to differentiate the actual achievement of AI from the degree to which people change when confronted with what is purported to be intelligent technology.
Research and Advances

Examining User Involvement in Continuous Software Development

Ms. Perez was giving a PowerPoint presentation to potential clients in the hope of landing a big contract. She was presenting a new advertising campaign for a mutual fund company and had spent three months with her team perfecting the proposal. Everything seemed to be going well when suddenly a small window popped up informing her that an error had occurred and asking whether she wished to send an error report. She clicked the send button and the application on her laptop shut down, disrupting the flow of her presentation and making her look unprofessional.

This story is an example of a user's experience of, and response to, a new method for collecting information on software application errors. To maintain a certain level of quality and ensure customer satisfaction, software firms spend approximately 50% to 75% of the total software development cost on debugging, testing, and verification activities. Despite such efforts, it is not uncommon for a software application to contain errors after the final version is released. To better manage the software development process in the long run, firms are involving users in software improvement initiatives by soliciting error information while the software is in use. The information collected through an error reporting system (ERS) plays an important role in uncovering bugs and prioritizing future development work. Considering that about 20% of bugs cause 80% of the errors, gathering information on application errors can substantially improve software firms' productivity and the quality of their products. High-quality software applications benefit software users individually and also help improve the image of the software community as a whole. Thus, understanding the emerging error reporting systems and why users adopt them are important issues that require examination. Such an analysis can help software companies learn how to design better ERS and educate users about ERS and its utilities.
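The flow the anecdote describes (trap the failure, ask for consent, transmit the details) can be sketched in a few lines. The following Python fragment is a hypothetical illustration; the endpoint URL and the `send_report` helper are invented for the example and do not correspond to any vendor's actual ERS.

```python
# Sketch of a consent-based error reporting hook. REPORT_URL is a
# made-up placeholder for a real collection endpoint.
import sys
import json
import traceback
import urllib.request

REPORT_URL = "https://example.com/error-reports"  # placeholder endpoint

def send_report(exc_type, exc_value, tb):
    report = {
        "error": exc_type.__name__,
        "message": str(exc_value),
        "stack": traceback.format_tb(tb),
    }
    req = urllib.request.Request(
        REPORT_URL,
        data=json.dumps(report).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def reporting_hook(exc_type, exc_value, tb):
    # Consent comes first: the user decides whether data leaves the machine.
    answer = input("An error occurred. Send an error report? [y/N] ")
    if answer.strip().lower() == "y":
        send_report(exc_type, exc_value, tb)
    traceback.print_exception(exc_type, exc_value, tb)

sys.excepthook = reporting_hook  # invoked on any uncaught exception
```

The stack trace and error type collected here are exactly the kind of information that lets developers aggregate reports and prioritize the small fraction of bugs responsible for most failures.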
Research and Advances

Constructive Function-Based Modeling in Multilevel Education

It is a digital age, especially for children and students, who can be called the world's first truly digital generation. Accordingly, a new generation of educational technology, with a particular emphasis on visual thinking and on specific computer-based notions and means, is emerging. This is a new challenge for computer graphics, a wide discipline concerned with creating visual images and devising their underlying models.

For some time there have been two major paradigms in computer graphics, and in shape modeling as a part of it: approximation and discretization. Their purpose is to simplify ideal complex shapes so they can be handled with the limited capabilities of hardware and software. The approximation paradigm includes 2D vector graphics, 3D polygonal meshes, and later approximations by free-form curves and surfaces. The discretization paradigm originated raster graphics, then volume graphics based on 3D grid samples, and recently point-based graphics employing clouds of scanned or otherwise generated surface points. The problems of both paradigms are obvious: loss of precise shape and visual property definitions, growing memory consumption, limited complexity, and others. Surface and volumetric meshes, which lie at the foundation of modern industrial computer graphics systems, are so cumbersome that it is difficult to create, handle, and even understand them. The need for compact, precise models of unlimited complexity has led to the newly emerging paradigm of procedural modeling and rendering. One possibility for representing an object procedurally is to evaluate, at any given point, a real function representing the shape together with other real functions representing object properties.

Our research group proposed a constructive approach to the creation of such function evaluation procedures for geometric shapes, and later extended the approach to point attribute functions representing object properties. The main idea of this approach is the creation of complex models from simple ones using operations, much as a model is assembled from elementary pieces in LEGO. In terms of educational technology, such an approach is very much in the spirit of the constructionism theory of Seymour Papert. The main principle of this theory is active learning: learners gain knowledge by actively constructing artifacts external to themselves. Applications of this theory coupled with modern computer technologies are emerging, although the relationship with educational practice is not always easy. It is known that the constructive thinking at the heart of LEGO play enables children to learn notions that were previously considered too complex for them. Research at the MIT Media Laboratory led to the LEGO MindStorms robotics kits, which allow children to build their own robots using "programmable bricks" with electronics embedded inside. We have been developing not physical but virtual modeling and graphics tools that make it possible to use an extensible suite of "bricks" (see the illustration in Figure 1), with the possibility of deforming and modifying them on the fly. Such an approach involves mastering basic mathematical concepts and initial programming in a simple language, with the subsequent creation of an underlying model, generation of its images, and finally fabrication of a real object from that model. We believe it is of interest as an educational technology not only for children and students but also for researchers, artists, and designers. 
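A small sketch may make this function-based, constructive style concrete. The Python fragment below follows a common textbook convention for function representation (the defining function is non-negative inside a shape and zero on its surface), with min/max as set operations; the authors' actual system may use different primitives and operations.

```python
# Sketch of constructive function-based modeling: a shape is a real
# function of a point, non-negative inside, zero on the surface.

def sphere(cx, cy, cz, r):
    # f(p) >= 0 inside the sphere of radius r centered at (cx, cy, cz)
    return lambda x, y, z: r**2 - ((x-cx)**2 + (y-cy)**2 + (z-cz)**2)

def union(f, g):
    return lambda x, y, z: max(f(x, y, z), g(x, y, z))

def intersection(f, g):
    return lambda x, y, z: min(f(x, y, z), g(x, y, z))

def subtract(f, g):
    return lambda x, y, z: min(f(x, y, z), -g(x, y, z))

# Assemble a complex model from simple "bricks", LEGO-style:
body = union(sphere(0.0, 0, 0, 1.0), sphere(1.0, 0, 0, 0.8))
model = subtract(body, sphere(0.5, 0, 0, 0.4))  # carve out a cavity

print(model(0.5, 0, 0) >= 0)  # False: this point lies in the carved cavity
print(model(0.0, 0, 0) >= 0)  # True: solid material remains here
```

Because the model is just a composed evaluation procedure, it stays compact and precise at any resolution; rendering or fabrication reduces to sampling the function at points.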
It is important that learners interacting with a created virtual world acquire knowledge not just about mathematics and programming but also about structures and processes of the real world. Soon after introducing our approach to modeling in the mid-90s, we found that none of the existing modeling systems or languages supported this paradigm. Another necessity was to begin preparing qualified students to be involved in the R&
News

Just For You

Recommender systems that provide consumers with customized options have redefined e-commerce, and are spreading to other fields.
