T1. John S. White & Florence Reeder: MT Evaluation -- The Common Thread
This tutorial will cover the difficulty of MT evaluation and then present different views (including recent work from the ISLE project) on the stakeholders, uses, and types of MT, and on the attributes, measurands, and metrics implied by these perspectives. We will present a number of historical methods that may have renewed usefulness in today's context, as well as some approaches from the last decade and new approaches modeled in the last year.

T2. Ralf Brown: Example-Based Machine Translation
This tutorial will introduce participants to the history and practice of example-based machine translation (EBMT). After a definition of EBMT and an overview of its origins (Sato and Nagao, among others), various types of approaches to example-based translation (such as deep versus shallow processing) will be presented. This discussion will lead into an overview of a number of recent example-based systems, both "pure" systems and hybrid systems that combine rule-, statistics-, or knowledge-based approaches with EBMT. Candidates for discussion include EDGAR, Gaijin, ReVerb, and systems by Cranias, Guevenier/Cicekli, and Streiter. The tutorial will conclude with a more in-depth examination of the Generalized EBMT system developed at Carnegie Mellon University.

T3. Joshua Goodman: The State of the Art in Language Modeling
This tutorial will cover the state of the art in language modeling. The introduction will explain what a language model is, give a quick review of elementary probability, and survey applications of language modeling, with an emphasis on statistical machine translation. The bulk of the talk will describe current techniques in language modeling, including techniques such as clustering and smoothing that are useful in many areas besides language modeling, and more language-model-specific techniques such as high-order n-grams and sentence mixture models.
Finally, we will describe available toolkits and corpora. The target audience of the tutorial is AMTA attendees with an interest in statistical machine translation research. (Some material in this tutorial was developed by Eugene Charniak.)

T4. Wolfgang Teubert & Pernilla Daniellson: Units of Meaning in Translation -- Making real use of corpus evidence
This tutorial will focus on meaning. As such, it ignores both the rationalists' and the empiricists' ways of trying to model language in the pursuit of automating the translation process. The Birmingham Centre for Corpus Linguistics (CCL) has, in its latest research on Chinese-English translation databases, turned towards modern corpus linguistics methods in which the unit of meaning, rather than the single word, is the unit of analysis. We work from the hypothesis that meaning lies in use. Units of meaning are often larger and more complex than the single word: most units of translation are compounds, collocations, or even phrases, and most single words are ambiguous. This tutorial will show how meaning can emerge through patterning when large corpora are carefully examined. Participants will be shown methods for disambiguating words by investigating their contextual profiles. The tutorial will also cover retrieving translation equivalents and using corpus data to produce translated texts that display the naturalness of the target language.

T5. Walter Hartmann: To Go Beyond the Gist: A Post-Editing Primer for Translation Professionals
This tutorial is addressed to the professional translator who aims to integrate MT into the production flow, as well as to anyone else interested in applying MT to serious translation. It will discuss the following topics:
Evaluation techniques for MT output from a pragmatic point of view
Approaches to efficient editing
T6. Laurie Gerber: Supporting a Multilingual Online Audience
The need to provide customer service and technical support to non-English-speaking customers is growing rapidly within many U.S.-based companies. But companies that release localized products outside of the U.S. are often unprepared to fully support their increasingly diverse customer base. Simply translating web sites is rarely adequate. Customer communications are conducted through a variety of channels, and may contain very different types of text, speech, and data. Translation products and services abound, but understanding which solutions can effectively address a specific need requires an understanding of the problem as well as of the solutions.
This tutorial will provide you with: