Abstract:
The goal of our project is a system that can translate between arbitrary pairs of languages. Unfortunately, most machine translation methodologies assume aligned corpora or grammar rules, which are available for only a small number of major language pairs. This makes scaling the popular approaches to any-language translation virtually impossible. We propose to scale machine translation to a panlingual level by first attempting to solve the lexical translation problem and then proceeding to translate pairs of words, phrases, and finally simple sentences. In this talk, I will primarily describe a novel approach to lexical translation that employs probabilistic inference over the Translation Graph, a lexical resource that combines translations from hundreds of machine-readable dictionaries and Wiktionaries. Our inference algorithm produces PanDictionary, a sense-distinguished dictionary translating between thousands of languages. I will also demo PanImages, an image search application powered by PanDictionary.

Bio:
Mausam is a Research Assistant Professor of Computer Science at the University of Washington, Seattle. His primary research interest is in Artificial Intelligence. In his most recent work, his students have developed a series of algorithms to solve structured Markov Decision Processes, created a large-scale repository of selectional preferences for common relation phrases in text, built a large multilingual dictionary by probabilistic inference, and designed an intelligent controller for complex workflows on Mechanical Turk. He received his PhD from the University of Washington in 2007 and a B.Tech. from IIT Delhi in 2001.