The Eighth Conference of the
Association for Machine Translation
in the Americas

Waikiki, Hawai'i
Hilton Prince Kuhio Hotel, Waikiki, Hawai'i
October 21-25, 2008


Tutorials
Schedule of Tutorials:
Tuesday, 21 October
1. Introduction to Globalization (morning)
6. Sixty Years of Statistical Machine Translation (morning)
2. Language Processing Standards (afternoon)
3. Arabic Natural Language Processing for Machine Translation (afternoon)

Saturday, 25 October
4. Statistical Machine Translation: Theory and Practice (morning and afternoon)
5. Authoring for Translatability and Self-Service Support (morning)

#1 - Introduction to Globalization
Arle Lommel (LISA)
Description:
***Half-day tutorial: 3 hours


Although translation and localization are often treated as discrete processes, both of them exist in a broader business and technical context known as globalization that consists of a number of interrelated aspects. This tutorial will cover the basics of the globalization process, including internationalization, localization/translation, support, and other business issues. An understanding of this broad context will help participants understand the business framework within which technologies and solutions are deployed and will introduce the basic technologies and methods used at each stage of the globalization cycle.
Arle Lommel is chair of the OSCAR steering committee, the standards development group within the Localization Industry Standards Association. He has worked for LISA for ten years. An experienced translator, he is trained as a linguist and a folklorist, holding a BA in linguistics from Brigham Young University, Provo (where he worked under Alan Melby) and an MA in folklore studies from Indiana University, Bloomington. At LISA he has spearheaded LISA's standards initiatives and market and data analysis, with an emphasis on technologies and standards. He currently resides in Bloomington, Indiana.

#2 - Language Processing Standards
Arle Lommel (LISA)
Description:
***Half-day tutorial: 3 hours


Despite the relative youth of the language services industry, standards have begun to play a critical role in its development. Because language services are heavily dependent on technical solutions, customers and business processes require standards-based approaches that free them from dependence on particular tools or proprietary formats. This presentation will cover the basics of standards for language technologies with an emphasis on open standards from the Localization Industry Standards Association (LISA) and OASIS, along with related developments such as OLIF. Each standard will be described in detail along with concrete examples of its use, business impact, and potential scope.
Arle Lommel is chair of the OSCAR steering committee, the standards development group within the Localization Industry Standards Association. He has worked for LISA for ten years. An experienced translator, he is trained as a linguist and a folklorist, holding a BA in linguistics from Brigham Young University, Provo (where he worked under Alan Melby) and an MA in folklore studies from Indiana University, Bloomington. At LISA he has spearheaded LISA's standards initiatives and market and data analysis, with an emphasis on technologies and standards. He currently resides in Bloomington, Indiana.

#3 - Arabic Natural Language Processing for Machine Translation
Nizar Habash (Columbia University)
Description:
***Half-day tutorial: 3 hours
***Attendees are not expected to be able to speak, read, or write Arabic.


The tutorial has four sections. First is a discussion of Arabic phonology and orthography, with a focus on Arabic spelling peculiarities and their effect on Arabic processing for MT; Arabic encoding issues are also addressed. Second, Arabic morphological phenomena are presented and explained, followed by a survey of different approaches to addressing them. Third, a survey of Arabic syntactic phenomena is presented and contrasted with English syntactic phenomena, and syntactic representation in the Arabic Treebank is discussed. Finally, Arabic dialects and the kinds of problems they present for Arabic NLP are presented. Links to recent publications and available toolkits and resources for all four sections will be provided.

This tutorial will provide NLP system developers and researchers with the necessary background information for working with the Arabic language, which has recently become a focus of an increasing number of projects in computational linguistics. The goal of the tutorial is to introduce the Arabic linguistic phenomena that need to be addressed and to review the state of the art in Arabic processing. Alternative approaches will be presented and contrasted for their value in different application contexts. Basic skills for handling Arabic text (even for those who cannot read Arabic script) are discussed. This tutorial is designed for computer scientists and linguists alike. Acquaintance with basic formal language theory and knowledge of some programming language will be useful but are not necessary. Previous versions of the tutorial were given at AMTA 2004, ACL 2005, and LREC 2006.
Nizar Habash received his PhD in 2003 from the Computer Science Department, University of Maryland, College Park. His PhD thesis is titled Generation-Heavy Hybrid Machine Translation. He is currently an associate research scientist at the Center for Computational Learning Systems at Columbia University. His research includes work on machine translation, natural language generation, lexical semantics, morphological analysis, generation and disambiguation, computational modeling of Arabic dialects, and Arabic dialect parsing. Dr. Habash served as co-chair for the Workshop on Computational Approaches to Semitic Languages (ACL 2005) and for the Workshop on Machine Translation for Semitic Languages (MT Summit 2003). In 2005, he co-founded the Columbia Arabic Dialect Modeling (CADIM) group. He is the vice-president of the Semitic Language Special Interest Group in the Association for Computational Linguistics. He was a program co-chair for AMTA 2006.

#4 - Statistical Machine Translation: Theory and Practice
Philipp Koehn (University of Edinburgh) & Dennis Mehay (Ohio State)
Description:
***Full-day tutorial: 3 hours lectures + 3 hours practical session


We will describe the current methods used in statistical machine translation (word alignment, phrase models, automatic evaluation), and focus the second part of the tutorial on the issue of integrating linguistic knowledge into this approach (tree-based models, clause restructuring, factored translation models). This tutorial is similar to the ones Koehn and Knight have given in previous years, but with a stronger emphasis on current research efforts (both by other researchers and by ourselves) to integrate linguistic knowledge into statistical machine translation models. We will also draw from a book on statistical machine translation that Philipp Koehn is currently writing and that will be published around the time of MT Summit XII. The practical sessions will cover machine translation evaluation, the use of the Moses decoder, and the training of phrase-based and factored models.
Philipp Koehn received his PhD from the University of Southern California, where he was a research assistant at the Information Sciences Institute (ISI) from 1997 to 2003. He was a postdoctoral research associate at the Massachusetts Institute of Technology (MIT) in 2004, and joined the University of Edinburgh as a lecturer in 2005. His research centres on statistical machine translation, but he also worked on speech in 1999 at AT&T Research Labs and on text classification in 2000 at Whizbang Labs. He is a co-founder of Getprice, a German price comparison Internet company, where he acted as CTO from 2000 to 2005. Besides his research, his major contributions to the machine translation community are the preparation and release of the DE-News and Europarl corpora, as well as the Pharaoh and Moses decoders, all of which are widely used. The statistical machine translation system developed under his leadership in recent years has been one of the top performers in recent DARPA, IWSLT and TC-STAR competitions. He organised a workshop on parallel text at ACL-2005 with a shared task concerning translation between European languages. His research is funded by DARPA (GALE project) and he is the scientific co-ordinator for the EuroMatrix project, funded by the European Commission.

Dennis Mehay has been working on machine translation evaluation (using syntax to handle word order and large-scale distributional models for synonym mining) as a PhD student at Ohio State University. He has collaborated with Chris Brew on parsing, supertagging, and lexical acquisition, and is working on extending that work to the multilingual domain.

#5 - Authoring for Translatability and Self-Service Support
Jörg Schütz (Bioloom Group) & Mike Dillinger (AMTA)
Description:
***Half-day tutorial: 3 hours


This tutorial focuses on processes and technologies that are gaining ground in companies which are at the forefront of multilingual business communication (B2C and B2B). We focus on how these businesses optimize authoring and translation by using business processes that integrate, monitor, and optimize the whole content supply chain. These processes and technologies result in dramatically better leverage of existing assets, improve consistency in both source and target information, and reduce time to market. The most direct effect at these companies is a sharp reduction in localization and technical support costs.
Prof. Dr. Jörg Schütz is a computer scientist with 30 years of business experience in different fields of computation, ranging from databases through language technologies to virtualization. He has consulted and lectured worldwide, is a member of several scientific and industry associations as well as standardization bodies and working groups on data exchange formats, and advises the European Commission as an expert, evaluator and reviewer. He studied computer science, mathematics, and medicine, holds a PhD in AI and Machine Translation, and received an honorary professorship for Machine Translation and for Information Sciences from the University of the Saarland in Saarbrücken. He is the founder of the Bioloom Group, an organization that develops solutions for the next generation of intelligent computing.

Mike Dillinger, PhD, is an independent consultant who helps his clients optimize authoring and translation processes for more effective content management. Dr. Dillinger’s expertise is grounded in more than a dozen years’ research on how adults read, write, and translate technical information in different languages. In addition, he brings a decade of hands-on experience training technical writers and auditing documentation for industrial clients, as well as in academia. He wrote the widely circulated LISA Best Practices Guide: Implementing Machine Translation, an overview of this process. Dr. Dillinger is also President of the Association for Machine Translation in the Americas and an experienced developer of commercial machine translation systems.

#6 - Sixty Years of Statistical Machine Translation
Kevin Knight (ISI & Language Weaver)
Description:
***Half-day tutorial: 3 hours


This tutorial will describe the results of statistical machine translation research since 1948. Most of the tutorial will focus on the explosion of work in the past few years that has resulted from intense interest on the part of scientists, research funders, and industry. Topics will include evaluation of translation, extracting translation rules from bilingual text, target language modeling, runtime translation algorithms, specialized components, syntactic models, and statistical training algorithms. In addition to bringing listeners up to speed on the latest research developments, we will also examine the roots of statistical MT in World War Two decipherment activities. Some of the concepts from that era (such as the noisy channel model and statistical language modeling) have become core to the field, while others still remain to be picked up.
Kevin Knight is a Research Associate Professor in Computer Science at the University of Southern California, a Senior Research Scientist and Fellow at the USC/Information Sciences Institute, and Chief Scientist at Language Weaver, Inc. Dr. Knight received a Ph.D. from Carnegie Mellon University in 1992, and a bachelor’s degree from Harvard University. With collaborators at USC/ISI and elsewhere, he has authored over 50 papers on MT and natural language processing. Some of Dr. Knight's contributions to MT include automatic post-editing of translated documents (1994), statistical meaning-to-text algorithms (1995), transliteration of names across language pairs with different writing systems (1997), weighted finite-state models for MT (1998), a statistical MT tutorial (1999), Chinese room experiments (2000), statistical decoding algorithms (2001, 2002, 2006), syntax-based statistical MT (2001-), tree automata for MT (2004), and commercialization of statistical MT (2001-).


president@amtaweb.org
Last updated: 1 August, 2008