AMTA 2002 PROGRAM

Last updated: Sat Sep 28 14:16:14 PDT 2002

TUESDAY

MT. TAMALPAIS

8 October

 

9:00 — 12:00

Workshop
Interlingua Reliability

12:00 — 2:00

LUNCH

2:00 — 5:00

Workshop
Interlingua Reliability (cont.)

 

WEDNESDAY

BELVEDERE

TIBURON

MT. TAMALPAIS

9 October

 

 

 

9:00 — 12:00

Tutorial
MT Evaluation - The Common Thread
J. White & F. Reeder

Tutorial
To Go Beyond the Gist: A Post-Editing Primer for Translation Professionals
W. Hartmann

Tutorial
The State of the Art in Language Modeling
J. Goodman

12:00 — 2:00

LUNCH

 

2:00 — 5:00

Tutorial
Supporting a Multilingual Online Audience
L. Gerber

Tutorial
Units of Meaning in Translation - Making Real Use of Corpus Evidence
W. Teubert & P. Danielsson

Tutorial
Example-Based Machine Translation
R. Brown

6:00 — 8:00

WELCOME RECEPTION – BELVERON ROOM
>>>EXHIBITS OPEN – GARDEN COURT ROOM<<<

 

 

THURSDAY

BELVERON

MT. TAMALPAIS

10 October

 

 

8:45 — 9:00

Opening Remarks

Elliott Macklovitch, Conference Chair & AMTA President
Steve Richardson, Program Chair
Violetta Cavalli-Sforza, Local Arrangements Chair
Bob Frederking, Workshops & Tutorials Chair
Laurie Gerber, Exhibits Coordinator

 

9:00 — 10:00

Invited Speaker

Empiricism from TMI-1992 to AMTA-2002 to AMTA-2012: Have IBM Models 1-5 failed?
Ken Church (AT&T Labs Research)
Abstract

 

10:00 — 10:30

BREAK – Exhibits

 

10:30 — 12:00

Panel 1

Taking MT from research to real users: have we made any progress?
Ed Hovy, moderator

Technical Papers

Using Word Formation Rules to Extend MT Lexicons
Claudia Gdaniec & Esmé Manandise (IBM Research)

Deriving Semantic Knowledge from Descriptive Texts using an MT System
Eric Nyberg, Teruko Mitamura, Kathryn Baker, & David Svoboda (LTI, Carnegie Mellon Univ.); Brian Peterson & Jennifer Williams (Ontology Works)

Korean-Chinese Machine Translation Based on Verb Patterns
Changhyun Kim, Munpyo Hong, Yinxia Huang, Young Kil Kim, Sung Il Yang, Young Ae Seo, & Sung-Kwon Choi (ETRI)

12:00 — 1:30

LUNCH – Exhibits

1:30 — 3:00

Technical Papers

Semi-automatic Compilation of Bilingual Lexicon Entries from Cross-Lingually Relevant News Articles on WWW News Sites
Takehito Utsuro, Takashi Horiuchi, Yasunobu Chiba, & Takeshi Hamamoto (Toyohashi Univ. of Technology)

Adaptive Bilingual Sentence Alignment
Thomas C. Chuang (Van Nung Inst. of Technology), GN You (National Taichung Inst. of Technology), & Jason S. Chang (National Tsing Hua Univ.)

Fast and Accurate Sentence Alignment of Bilingual Corpora
Robert C. Moore (Microsoft Research)

System Demos

Translation by the Numbers: Language Weaver
Bryce Benjamin, Kevin Knight, & Daniel Marcu (Language Weaver, Inc.)

The KANTOO MT System: Controlled Language Checker and Lexical Maintenance Tool
Teruko Mitamura, Eric Nyberg, Kathy Baker, Peter Cramer, Jeongwoo Ko, David Svoboda, & Michael Duggan (LTI, Carnegie Mellon Univ.)

Fluent Machines’ EliMT System
Eli Abir, Steve Klein, David Miller, & Michael Steinbaum (Fluent Machines)

3:00 — 3:30

BREAK – Exhibits

 

3:30 — 5:00

Technical Papers

Better Contextual Translation Using Machine Learning
Arul Menezes (Microsoft Research)

Handling Translation Divergences: Combining Statistical and Symbolic Techniques in Generation-Heavy Machine Translation
Nizar Habash & Bonnie Dorr (University of Maryland (UMIACS))

DUSTer: A Method for Unraveling Cross-Language Divergences for Statistical Word-Level Alignment
Bonnie J. Dorr, Lisa Pearl, Rebecca Hwa, & Nizar Habash (University of Maryland (UMIACS))

System Demos

Natural Intelligence in a Machine Translation System
Howard J. Bender (Any-Language Communications Inc.)

MSR-MT: The Microsoft Research Machine Translation System
William B. Dolan, Jessie Pinkham, & Stephen D. Richardson (Microsoft Research)

5:15 — 6:00

AMTA General Membership Meeting – BELVERON ROOM

>>>ALL AMTA MEMBERS ARE INVITED TO ATTEND<<<

 

FRIDAY

BELVERON

MT. TAMALPAIS

11 October

 

 

9:00 — 10:30

Technical Papers

Efficient Integration of Maximum Entropy Lexicon Models within the Training of Statistical Alignment Models
Ismael García-Varea (Univ. of Castilla-La Mancha); Franz J. Och & Hermann Ney (RWTH Aachen); Francisco Casacuberta (ITI (UPV) Valencia)

Using a Large Monolingual Corpus to Improve Translation Accuracy
Radu Soricut, Kevin Knight, & Daniel Marcu (ISI, Univ. of Southern California)

Text Prediction with Fuzzy Alignments
George Foster, Philippe Langlais, & Guy Lapalme (RALI, Univ. of Montreal)

User Studies

An Assessment of Machine Translation for Vehicle Assembly Process Planning at Ford Motor Company
Nestor Rychtyckyj (Ford Motor Company)

A Report on the Experiences of Implementing an MT System for Use in a Commercial Environment
Anthony Clarke (Corporate Language Services AG), Elisabeth Maier (Canoo Engineering AG), & Hans-Udo Stadler (Corporate Language Services AG)

Getting the Message In: A Global Company’s Experience with the New Generation of Low-Cost, High Performance Machine Translation Systems
Verne Morland (NCR Corporation)

10:30 — 11:00

BREAK – Exhibits

11:00 — 12:00

Invited Speaker

Romantics of the Translation Market
Jaap Van der Meer (SYSTRAN, and former CEO of ALPNET)
Abstract

 

12:00 — 1:30

LUNCH – Exhibits

1:30 — 3:00

Panel 2

MT methodologies: what works, what doesn’t, and what’s the right mix?
Steve Richardson, moderator

System Demos

The NESPOLE! Speech-to-Speech Translation System
Alon Lavie, Lori Levin, & Robert Frederking (Carnegie Mellon Univ.); Fabio Pianesi (ITC-irst Trento)

Approaches to Spoken Translation
Christine A. Montgomery & Naicong Li (Language Systems, Inc.)

LogoMedia TRANSLATE™, version 2.0
Glenn A. Akers (LogoMedia Corporation)

3:00 — 3:30

BREAK – Exhibits

 

3:30 — 5:00

Technical Papers

Bootstrapping the Lexicon Building Process for Machine Translation between 'New' Languages
Ruvan Weerasinghe (Univ. of Colombo, Sri Lanka)

Automatic Rule Learning for Resource-Limited MT
Jaime Carbonell, Katharina Probst, Erik Peterson, Christian Monson, Alon Lavie, Ralf Brown, & Lori Levin (LTI, Carnegie Mellon Univ.)

Classification Approach to Word Selection in Machine Translation
Hyo-Kyung Lee (University of Illinois)

System Demo

A New Family of the PARS Translation Systems
Michael Blekhman, Andrei Kursin, & Alla Rakova (Lingvistica '98 Inc.)

User Presentation

Cisco Systems and Systran Software: An Ongoing Partnership in MT
Riki Shore (Cisco Systems, Inc.)

7:00-9:00

BANQUET

The Waterfront Restaurant and Cafe

Pier 7, The Embarcadero, San Francisco

 





SATURDAY

BELVERON

MT. TAMALPAIS

12 October

 

 

9:00 — 10:30

Panel 3

Who’s making/saving money with MT and how are they doing it?
Mary Flanagan, moderator

Technical Papers

Example-based Machine Translation via the Web
Nano Gough, Andy Way, & Mary Hearne (Dublin City University)

Toward a Hybrid Integrated Translation Environment
Michael Carl (RALI, Univ. of Montreal), Andy Way (Dublin City University), & Reinhard Schäler (LRC, Univ. of Limerick)

Merging Example-Based and Statistical Machine Translation: An Experiment
Philippe Langlais & Michel Simard (RALI, Univ. of Montreal)

10:30 — 11:00

BREAK – last chance to see Exhibits!

11:00 — 12:00

Invited Speaker

Stone soup revisited, or the unity of MT as the prime NLP task.
Yorick Wilks (University of Sheffield, UK)
Abstract

 

12:00 — 12:15

Closing Remarks

Elliott Macklovitch, Conference Chair

 

 

INVITED SPEAKERS

Empiricism from TMI-1992 to AMTA-2002 to AMTA-2012: Have IBM Models 1-5 failed?

Ken Church (AT&T Labs Research)

 

Abstract:  The organizers of this conference asked me to comment on what's changed since TMI-92 (if anything). There was great excitement at TMI-92 about using aligned parallel corpora to assist human translation. There was also a lot of controversy over the IBM Models 1-5, which were shaking up the field.

So what's happened since then? Empiricism has come of age. What used to be considered radical is now accepted practice. The new field of Machine Learning has absorbed many good (and formerly controversial) ideas including the IBM Models 1-5. Yarowsky's work on Word Sense Disambiguation grew out of Machine Translation, but is now widely cited in Machine Learning as an early example of co-training. Mercer's fighting words, "More data is better data," don't seem as shocking when Bill makes the case a decade later.

It took a decade or two for the revival of empirical methods to become popular (perhaps too popular). I worry that the pendulum has swung so far that we are no longer training students for the possibility that the pendulum might swing the other way. We ought to be preparing students with a broad education including Statistics and Machine Learning as well as Linguistic Theory.

Empiricism has not only come of age in academic venues (e.g., conferences, textbooks), but also in commercial venues. Many of the alignment tools and suggestions proposed in "Good Applications for Crummy MT" and elsewhere are currently being sold by Trados and others. There are some even better apps than the ones we imagined: e.g., CLIR (cross-language information retrieval) and MT in web search engines (Systran & AltaVista).

So, what do I expect to happen over the next decade?

  1. Scale, stupid: There is a lot of excitement about the web, which is not only large, and growing, but also contains a rich structure of hypertext links. I am going to suggest a bait and switch strategy, where the public Internet is the bait, but the real target is something larger and more valuable, but more elusive.
  2. Good Apps for Crummy NLP: In "Good Applications for Crummy MT," Hovy and I advocated spending more time thinking about what we can do with what we have, and not spending all our resources on the core technology. Great technology helps, but there is a lot more to a killer app. Similar arguments apply beyond MT to much of natural language and speech.

Bio:  Ken Church is currently the head of a data mining department in AT&T Labs-Research. He received his BS, Masters and PhD from MIT in computer science in 1978, 1980 and 1983, and immediately joined AT&T Bell Labs, where he has been ever since (though the name of the organization has changed). In 2001, Ken received the honor of being selected as an AT&T fellow. He has worked in many areas of computational linguistics including: acoustics, speech recognition, speech synthesis, OCR, phonetics, phonology, morphology, word-sense disambiguation, spelling correction, terminology, translation, lexicography, information retrieval, compression, language modeling and text analysis. He enjoys working with very large corpora such as the Associated Press newswire (1 million words per week). His data mining department is currently applying similar methods to much larger data sets such as telephone call detail (1-10 billion records per month).

 

Romantics of the Translation Market

Jaap Van der Meer (SYSTRAN consultant; former CEO of ALPNET)

 

Abstract:  No matter how much effort has been put into treating translation as a process like any other, the profession is still regarded as a vocation or an art that cannot be measured or standardized. This cultural background has contributed to an archaic industry with a cascaded supply chain. But demand is changing and, even more importantly, information consumption is changing. A new environment is emerging where machine translation can really make a big difference. Is the world of MT ready for the challenge? In his speech Jaap van der Meer will address the following topics:

  - Historical overview of the translation market and typology of actors
  - Cost and efficiencies in the translation market
  - The role of technology in the translation market
  - Changing demands, information recycling as a paradigm
  - Enterprise applications for machine translation and their ROI
  - New hybrid solutions

 

Bio:  Until the end of 2001 Jaap van der Meer was President and CEO of ALPNET. Since his departure from ALPNET upon the merger with SDL, he has been advising various technology companies. Since his debut in the localization market in 1980 he has been a great advocate of translation automation. His first company, INK International, launched the first desktop translation memory software. He also published the Language Technology magazine for several years, covering many of the pioneering technologies. In 1990 he launched the idea for a localization industry association and he funded the establishment of LISA. In 1999 he helped to start the SAE TopTec Multilingual Automotive Information conference. At ALPNET he spearheaded the implementation of an end-to-end automated localization process, including machine translation, centralized translation memory and automated workflow. Currently, Jaap is consulting with SYSTRAN on the promotion of machine translation in the enterprise market. He is also a member of the Translation Vendor Web Services steering committee.

 

Stone soup revisited, or the unity of MT as the prime NLP task.

Yorick Wilks (University of Sheffield, UK)

 

Abstract:  MT researchers of a certain age will remember that, about fifteen years ago, the group under Jelinek and Brown at IBM mounted an attack on the idea of MT as a purely linguistic/symbolic enterprise, and argued that engineering methods based purely on text statistics, and derived from their success in speech recognition, could yield fundamental advances in MT. There were debates at conferences and in newsletters, and matters came to a head in the DARPA MT competitions of the early Nineties, where both types of system (supported by DARPA) were pitted against each other and against commercial systems, including SYSTRAN. The answer was pretty clear: statistical MT did well, better than many expected, but never beat SYSTRAN over texts and domains for which neither had been trained.

Many believe that nothing much has happened since, but I will argue that that is not so. What has happened above all is the web, which has both provided a new, easily accessible market for MT, through page translation, and has also provided a source of vast corpora, unimaginable before. However, that availability has not yet been cashed in: there is an enormous amount of work, of both sorts and above all as hybrids of both, but nothing fundamental has yet enabled purely empirical methods to overcome the data-sparseness problem, not even the web itself, viewed as a corpus. It seems pretty clear that some form of symbolic methods will be needed to do that. Again, that opposition is increasingly hard to make, as "symbolic" methods now themselves tend to be empirically based, and refer only to information types, rather than to structures that are written down directly from intuition.

Most striking has been the division of the old MT task up into sub-tasks, each being tackled and evaluated independently (the most MT-relevant case has been Word Sense Discrimination), but whose limited successes have not, so far, been built back into more advanced MT itself. Again, MT has disintegrated in another way, in that multilingual functionality over a whole range of tasks, up from simple text editing to summarization and information extraction, is now such that one cannot really say whether they are MT or not. None of this should matter if real advances for language workers are being made, and they are. But intellectually, it can be bewildering, as in the recent turning of tables in which it has been argued that information retrieval should be seen as a form of machine translation (as opposed to vice versa!).

Bio:  YORICK WILKS is Professor of Computer Science at the University of Sheffield and Director of ILASH, the Institute of Language, Speech and Hearing. For the eight years 1985-93 he was Director of the Computing Research Laboratory at New Mexico State University, a centre for research in artificial intelligence and its applications. He received his doctorate from Cambridge University in 1968 for work in computer programs that understand written English in terms of a theory later called "preference semantics": the claim that language is to be understood by means of a search for semantic "gists", combined with a coherence function over such structures that minimises effort in the analyser.

This has continued as the focus of his work, and has had applications in the areas of machine translation, the use of English as a "front end" for users of data bases, and the computation of belief structures. He was a researcher at Stanford AI Laboratory, and then Professor of Computer Science and Linguistics at the University of Essex. He has published numerous articles and nine books in that area of artificial intelligence, of which the most recent are Artificial Believers (with Afzal Ballim) from Lawrence Erlbaum Associates (1991) and Electric Words: dictionaries, computers and meanings (with Brian Slator and Louise Guthrie) from MIT Press (1995).

He is a member of the (UK) EPSRC College of Computing, and also a Fellow of the American Association for Artificial Intelligence and the European Society for Artificial Intelligence (ECCAI). He is on the boards of some fifteen AI-related journals. He currently works in the areas of information extraction from text sources, computational pragmatics and dialogue systems, and the automatic construction and maintenance of linguistic resources such as lexicons, ontologies and grammars. He has been principal researcher at Sheffield on a range of Fourth and Fifth Framework EU contracts: AVENTINUS, ECRAN, EUROWORDNET, SIMPLE, PAROLE, NAMIC, MUMIS and AMITIES. He has substantial experience managing and directing large research projects for national agencies such as the EPSRC (UK) and DARPA and NSF (US).