
Research and Advances

Human Interaction For High-Quality Machine Translation

Translation from a source language into a target language has become a very important activity in recent years, both in official institutions (such as the United Nations and the EU, or the parliaments of multilingual countries like Canada and Spain) and in the private sector (for example, to translate user manuals or newspaper articles). Prestigious clients such as these cannot make do with approximate translations; for all kinds of reasons, ranging from legal obligations to good marketing practice, they require target-language texts of the highest quality. The task of producing such high-quality translations is a demanding and time-consuming one that is generally entrusted to expert human translators. The problem is that, with growing globalization, the demand for high-quality translation has been steadily increasing, to the point where there are simply not enough qualified translators available today to satisfy it. This has dramatically raised the need for improved machine translation (MT) technologies.

The field of MT has undergone something of a revolution over the last 15 years, with the adoption of empirical, data-driven techniques originally inspired by the success of automatic speech recognition. Given the requisite corpora, it is now possible to develop new MT systems in a fraction of the time and with much less effort than was previously required under the formerly dominant rule-based paradigm. As for the quality of the translations produced by this new generation of MT systems, there has also been considerable progress; generally speaking, however, it remains well below that of human translation. No one would seriously consider directly using the output of even the best of these systems to translate a CV or a corporate Web site, for example, without submitting the machine translation to careful human revision. As a result, those who require publication-quality translation are forced to make a difficult choice between systems that are fully automatic but whose output must be attentively post-edited, and computer-assisted translation systems (CAT tools for short) that allow for high quality but at the expense of full automation.

Currently, the best-known CAT tools are translation memory (TM) systems. These systems recycle sentences that have previously been translated, either within the current document or earlier in other documents. This is very useful for highly repetitive texts, but not of much help for the vast majority of texts composed of original material. Since TM systems were first introduced, very few other types of CAT tools have been forthcoming. Notable exceptions are the TransType system and its successor TransType2 (TT2). These systems represent a novel reworking of the old idea of interactive machine translation (IMT). Initial efforts on TransType are described in detail in Foster; suffice it to say here that the system's principal novelty lies in the fact that the human-machine interaction focuses on the drafting of the target text, rather than on the disambiguation of the source text, as in all former IMT systems. In the TT2 project, this idea was developed further. A full-fledged MT engine was embedded in an interactive editing environment and used to generate suggested completions of each target sentence being translated. These completions may be accepted or amended by the translator; once validated, they are exploited by the MT engine to produce further, hopefully improved, suggestions.
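To make the interaction concrete, here is a minimal sketch of such a prefix-completion loop in Python. It is our illustration, not TT2's actual engine: `suggest_completion` and its canned translation are invented stand-ins for a real MT decoder, which would search for the best target text extending the validated prefix.

```python
# Minimal sketch of a TransType-style interaction loop. The
# suggest_completion() below is a toy stand-in for a real MT engine.

def suggest_completion(source: str, prefix: str) -> str:
    """Return a completion of `prefix` for the given source sentence."""
    canned = "the committee approved the annual report"  # toy "translation"
    if canned.startswith(prefix):
        return canned[len(prefix):]
    return ""  # the prefix diverged from the only translation we know

def interactive_translation(source: str) -> str:
    prefix = ""  # the validated part of the target sentence
    while True:
        suggestion = suggest_completion(source, prefix)
        print(f"draft: {prefix}[{suggestion}]")
        action = input("accept (a) or type the corrected continuation: ")
        if action == "a":
            return prefix + suggestion  # translator validates the full draft
        prefix += action
        # The engine now re-predicts from the extended, validated prefix:
        # this is how each correction immediately feeds back into the system.
```

Each pass through the loop mirrors the cycle described above: the engine proposes, the translator validates or corrects, and the correction constrains the next proposal.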
TT2's interactive drafting stands in marked contrast with traditional MT, where the system is typically used first to produce a complete draft translation of a source text, which is then post-edited (corrected) offline by a human translator. The interactive approach offers a significant advantage over traditional post-editing: in the latter paradigm, there is no way for the offline system to benefit from the user's corrections; in TransType, just the opposite is true. As soon as
News

Medical Nanobots

Researchers working in medical nanorobotics are creating technologies that could lead to novel health-care applications, such as new ways of accessing areas of the human body that would otherwise be unreachable without invasive surgery.
Research and Advances

Optimistic Parallelism Requires Abstractions

Writing software for multicore processors would be greatly simplified if we could automatically parallelize sequential programs. Although auto-parallelization has been studied for many decades, it has succeeded only in a few application areas, such as dense matrix computations. 
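Dense matrix code is the classic success story because the iterations of its loops are provably independent, so they can be distributed across cores without speculation. A small Python sketch of our own (the names and structure are illustrative, not from the article):

```python
# Sketch: row-wise matrix-vector product, the kind of loop whose
# iterations are independent and therefore easy to auto-parallelize.
from concurrent.futures import ProcessPoolExecutor

def dot(row, vec):
    return sum(a * b for a, b in zip(row, vec))

def parallel_matvec(matrix, vec):
    # Each row's dot product reads shared, immutable inputs and writes
    # a disjoint output slot, so the iterations can run in any order.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(dot, matrix, [vec] * len(matrix)))

if __name__ == "__main__":
    m = [[1, 2], [3, 4]]
    v = [10, 20]
    print(parallel_matvec(m, v))  # [50, 110]
```

Irregular programs built around pointer-based data structures lack this statically provable independence, which motivates the optimistic (speculative) approach named in the title.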
Opinion

Future Tense: Confusions of the Hive Mind

Be cautious about the artificial intelligence approach to computer science. It is impossible to differentiate the actual achievement of AI from the degree to which people change when confronted with what is purported to be intelligent technology.
Research and Advances

Examining User Involvement in Continuous Software Development

Ms. Perez was giving a PowerPoint presentation to potential clients in the hope of landing a big contract. She was presenting a new advertising campaign for a mutual fund company and had spent three months with her team perfecting the proposal. Everything seemed to be going well when suddenly a small window popped up informing her that an error had occurred and asking whether she wished to send an error report. She clicked the send button and the application on her laptop shut down, disrupting the flow of her presentation and making her look unprofessional.

This story is an example of a user's experience of, and response to, a new method for collecting information on software application errors. To maintain a certain level of quality and ensure customer satisfaction, software firms spend approximately 50% to 75% of the total software development cost on debugging, testing, and verification activities. Despite such efforts, it is not uncommon for a software application to contain errors after the final version is released. To better manage the software development process in the long run, firms are involving users in software improvement initiatives by soliciting error information while the software is in use. The information collected through an error reporting system (ERS) plays an important role in uncovering bugs and prioritizing future development work. Considering that about 20% of bugs cause 80% of the errors, gathering information on application errors can substantially improve software firms' productivity and the quality of their products. High-quality software applications benefit software users individually and also help improve the image of the software community as a whole. Thus, understanding the emerging error reporting systems and why users adopt them are important issues that require examination. Such an analysis can help software companies learn how to design better ERS and educate users about ERS and its utilities.
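The flow the anecdote describes (trap the failure, ask for consent, transmit the details) can be sketched in a few lines. The following Python fragment is a hypothetical illustration; the endpoint URL and the `send_report` helper are invented for the example and do not correspond to any vendor's actual ERS.

```python
# Sketch of a consent-based error reporting hook. REPORT_URL is a
# made-up placeholder for a real collection endpoint.
import sys
import json
import traceback
import urllib.request

REPORT_URL = "https://example.com/error-reports"  # placeholder endpoint

def send_report(exc_type, exc_value, tb):
    report = {
        "error": exc_type.__name__,
        "message": str(exc_value),
        "stack": traceback.format_tb(tb),
    }
    req = urllib.request.Request(
        REPORT_URL,
        data=json.dumps(report).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def reporting_hook(exc_type, exc_value, tb):
    # Consent comes first: the user decides whether data leaves the machine.
    answer = input("An error occurred. Send an error report? [y/N] ")
    if answer.strip().lower() == "y":
        send_report(exc_type, exc_value, tb)
    traceback.print_exception(exc_type, exc_value, tb)

sys.excepthook = reporting_hook  # invoked on any uncaught exception
```

The stack trace and error type collected here are exactly the kind of information that lets developers aggregate reports and prioritize the small fraction of bugs responsible for most failures.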
Research and Advances

Constructive Function-Based Modeling in Multilevel Education

It is a digital age, especially for children and students, who can be called the world's first truly digital generation. Accordingly, a new generation of educational technology, with a particular emphasis on visual thinking and on specific computer-based notions and means, is emerging. This is a new challenge for computer graphics, a wide discipline concerned with creating visual images and devising their underlying models.

For some time there have been two major paradigms in computer graphics, and in shape modeling as a part of it: approximation and discretization. Their purpose is to simplify ideal complex shapes so they can be handled with the limited capabilities of hardware and software. The approximation paradigm includes 2D vector graphics, 3D polygonal meshes, and later approximations by free-form curves and surfaces. The discretization paradigm originated raster graphics, then volume graphics based on 3D grid samples, and recently point-based graphics employing clouds of scanned or otherwise generated surface points. The problems of both paradigms are obvious: loss of precise shape and visual property definitions, growing memory consumption, limited complexity, and others. Surface and volumetric meshes, which lie at the foundation of modern industrial computer graphics systems, are so cumbersome that it is difficult to create, handle, and even understand them. The need for compact, precise models of unlimited complexity has led to the newly emerging paradigm of procedural modeling and rendering. One possibility for representing an object procedurally is to evaluate, at any given point, a real function representing the shape together with other real functions representing object properties.

Our research group proposed a constructive approach to the creation of such function evaluation procedures for geometric shapes, and later extended the approach to point attribute functions representing object properties. The main idea of this approach is the creation of complex models from simple ones using operations, much as a model is assembled from elementary pieces in LEGO. In terms of educational technology, such an approach is very much in the spirit of the constructionism theory of Seymour Papert. The main principle of this theory is active learning: learners gain knowledge by actively constructing artifacts external to themselves. Applications of this theory coupled with modern computer technologies are emerging, although the relationship with educational practice is not always easy. It is known that the constructive thinking at the heart of LEGO play enables children to learn notions that were previously considered too complex for them. Research at the MIT Media Laboratory led to the LEGO MindStorms robotics kits, which allow children to build their own robots using "programmable bricks" with electronics embedded inside. We have been developing not physical but virtual modeling and graphics tools that make it possible to use an extensible suite of "bricks" (see the illustration in Figure 1), with the possibility of deforming and modifying them on the fly. Such an approach involves mastering basic mathematical concepts and initial programming in a simple language, with the subsequent creation of an underlying model, generation of its images, and finally fabrication of a real object from that model. We believe it is of interest as an educational technology not only for children and students but also for researchers, artists, and designers. 
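A small sketch may make this function-based, constructive style concrete. The Python fragment below follows a common textbook convention for function representation (the defining function is non-negative inside a shape and zero on its surface), with min/max as set operations; the authors' actual system may use different primitives and operations.

```python
# Sketch of constructive function-based modeling: a shape is a real
# function of a point, non-negative inside, zero on the surface.

def sphere(cx, cy, cz, r):
    # f(p) >= 0 inside the sphere of radius r centered at (cx, cy, cz)
    return lambda x, y, z: r**2 - ((x-cx)**2 + (y-cy)**2 + (z-cz)**2)

def union(f, g):
    return lambda x, y, z: max(f(x, y, z), g(x, y, z))

def intersection(f, g):
    return lambda x, y, z: min(f(x, y, z), g(x, y, z))

def subtract(f, g):
    return lambda x, y, z: min(f(x, y, z), -g(x, y, z))

# Assemble a complex model from simple "bricks", LEGO-style:
body = union(sphere(0.0, 0, 0, 1.0), sphere(1.0, 0, 0, 0.8))
model = subtract(body, sphere(0.5, 0, 0, 0.4))  # carve out a cavity

print(model(0.5, 0, 0) >= 0)  # False: this point lies in the carved cavity
print(model(0.0, 0, 0) >= 0)  # True: solid material remains here
```

Because the model is just a composed evaluation procedure, it stays compact and precise at any resolution; rendering or fabrication reduces to sampling the function at points.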
It is important that learners interacting with a created virtual world acquire knowledge not just about mathematics and programming but also about structures and processes of the real world. Soon after introducing our approach to modeling in the mid-90s, we found that none of the existing modeling systems or languages supported this paradigm. Another necessity was to begin preparing qualified students to be involved in the R&
News

Just For You

Recommender systems that provide consumers with customized options have redefined e-commerce, and are spreading to other fields.
