Tutorials

Morning Tutorials (Wednesday, October 9, 9:00 am - 12:00 pm)

T1. John S. White & Florence Reeder: MT Evaluation -- The Common Thread
T3. Joshua Goodman: The State of the Art in Language Modeling
T5. Walter Hartmann: To Go Beyond the Gist: A Post-Editing Primer for Translation Professionals

Afternoon Tutorials (Wednesday, October 9, 2:00 pm - 5:00 pm)

T2. Ralf Brown: Example-Based Machine Translation
T4. Wolfgang Teubert & Pernilla Danielsson: Units of Meaning in Translation -- Making real use of corpus evidence
T6. Laurie Gerber: Supporting a Multilingual Online Audience

T1. John S. White & Florence Reeder: MT Evaluation -- The Common Thread

This tutorial will cover the difficulty of MT evaluation and then present different views (including recent work from the ISLE project) on the stakeholders, uses, and types of MT, and on the attributes, measurands, and metrics implied by these perspectives. We will present a number of historical methods that may have renewed usefulness in today's context, as well as some approaches from the last decade and new approaches modeled in the last year.

T2. Ralf Brown: Example-Based Machine Translation

This tutorial will introduce participants to the history and practice of example-based machine translation (EBMT).  After a definition of EBMT and an overview of its origins (Sato and Nagao, among others), various approaches to example-based translation (such as deep versus shallow processing) will be presented.  This discussion will lead into an overview of a number of recent example-based systems, both "pure" systems and hybrid systems that combine rule-, statistics-, or knowledge-based approaches with EBMT.  Candidates for discussion include EDGAR, Gaijin, ReVerb, and systems by Cranias, Güvenir/Cicekli, and Streiter.  Finally, the tutorial will conclude with a more in-depth examination of the Generalized EBMT system developed at Carnegie Mellon University.

T3. Joshua Goodman: The State of the Art in Language Modeling

This tutorial will cover the state of the art in language modeling. The introduction will explain what a language model is, give a quick review of elementary probability, and survey applications of language modeling, with an emphasis on statistical machine translation. The bulk of the talk will describe current techniques in language modeling, including techniques such as clustering and smoothing that are useful in many areas besides language modeling, and more language-model-specific techniques such as high-order n-grams and sentence mixture models. Finally, we will describe available toolkits and corpora. The target audience of the tutorial is AMTA attendees with an interest in statistical machine translation research. (Some material in this tutorial was developed by Eugene Charniak.)

T4. Wolfgang Teubert & Pernilla Danielsson: Units of Meaning in Translation -- Making real use of corpus evidence

This tutorial will focus on meaning. As such, it ignores both the rationalists' and the empiricists' ways of trying to model language in the pursuit of automating the translation process. In its latest research on Chinese-English Translation Databases, the Birmingham Centre for Corpus Linguistics (CCL) has turned towards modern corpus linguistics methods in which the Unit of Meaning, rather than the single word, is the unit of analysis.  We work from the hypothesis that the meaning of a unit lies in its use.  Units of meaning are often larger and more complex than the simple word.  Most units of translation are compounds, collocations or even phrases.  As for single words, most of them are ambiguous.  This tutorial will show how, through careful examination of large corpora, meaning can emerge through patterning.

The participants will be shown methods to disambiguate words by investigating their contextual profiles. The tutorial will also focus on retrieving translation equivalents and learning how corpus data can help us produce translated texts that display the naturalness of the target language.

T5. Walter Hartmann: To Go Beyond the Gist: A Post-Editing Primer for Translation Professionals

This tutorial is addressed to professional translators who aim to integrate MT into the production flow, as well as to anyone else interested in applying MT in serious translation.  It will discuss the following topics:

Evaluation techniques for MT output from a pragmatic point of view

  • How much time can I save by using MT?
  • How useful is this translation for my job at hand?
  • How to find the weak spots in MT output and use them to your advantage.

Approaches to efficient editing

  • Editing
  • Cut-and-Paste
  • Destroy & Rebuild
  • Leaving well enough alone

T6. Laurie Gerber: Supporting a Multilingual Online Audience

The need to provide customer service and technical support to non-English-speaking customers is growing rapidly within many U.S.-based companies.  But companies that release localized products outside of the U.S. are often unprepared to fully support their increasingly diverse customer base.  Simply translating web sites is rarely adequate.  Customer communications are conducted through a variety of channels, and may contain very different types of text, speech and data.  Translation products and services abound, but knowing which solutions can effectively address a specific need requires an understanding of the problem as well as of the solutions.

This tutorial will provide you with:

  • Practical skills for analyzing your language support requirements
  • Strategies for selecting and deploying appropriate language solutions
  • An understanding of the range and capabilities of language technologies
  • Knowledge of how to integrate language technologies into your organizational workflow while avoiding common pitfalls
  • Tools for measuring the effectiveness of your multilingual customer support

Last updated: Tue Sep 24 15:32:30 PDT 2002