Abstract:
The goal of our project is a system that can translate between
arbitrary pairs of languages. Unfortunately, most machine translation
methodologies assume aligned corpora or grammar rules, which are available
for only a small number of major language pairs. This makes scaling the
popular approaches to any-language translation virtually impossible. We
propose to scale machine translation to a panlingual level by first
attempting to solve the lexical translation problem and then moving on to
translating pairs of words, phrases, and finally simple sentences. In this talk,
I will primarily describe a novel approach to lexical translation that
employs probabilistic inference over the Translation Graph, a novel lexical
resource that combines translations from hundreds of machine-readable
dictionaries and Wiktionaries. Our inference algorithm produces
PanDictionary, a sense-distinguished dictionary translating
between thousands of languages. I will also demo PanImages, an image search
application that is powered by PanDictionary.
Bio:
Mausam is a Research Assistant
Professor of Computer Science at the University of Washington, Seattle. His
primary research interest is in Artificial Intelligence. In his most recent
work, his students have developed a series of algorithms to solve
structured Markov Decision Processes, created a large-scale repository of
selectional preferences for common relation phrases in text, built a large
multi-lingual dictionary by probabilistic inference, and designed an
intelligent controller for complex workflows on Mechanical Turk. He
received his PhD from the University of Washington in 2007 and a B.Tech. from
IIT Delhi in 2001.