
Vladimir A. Eidelman

Institute for Advanced Computer Studies
A.V. Williams 3126F
University of Maryland
College Park, MD 20742

Publications



2014

Polylingual Tree-Based Topic Models for Translation Domain Adaptation Abstract: Topic models, an unsupervised technique for inferring translation domains, improve machine translation quality. However, previous work uses only the source language and completely ignores the target language, which can help disambiguate domains. We propose new polylingual tree-based topic models to extract domain knowledge that considers both source and target languages and derive three different inference schemes. We evaluate our model on a Chinese to English translation task and obtain up to 1.2 BLEU improvement over strong baselines.
Y. Hu, K. Zhai, V. Eidelman, and J. Boyd-Graber
In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL)
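A schematic of the polylingual assumption this model builds on (an illustrative sketch in my own notation, not the paper's exact generative story; the tree-structured prior over words is omitted): an aligned source/target document pair shares a single topic mixture, while each language keeps its own topic-word distributions.

% Polylingual topic model, schematic generative story (illustrative):
% an aligned document pair shares one topic mixture \theta_d.
\begin{align*}
\theta_d   &\sim \mathrm{Dirichlet}(\alpha) \\
z_{d,n}    &\sim \mathrm{Multinomial}(\theta_d) \\
w^s_{d,n}  &\sim \mathrm{Multinomial}(\phi^s_{z_{d,n}}) && \text{(source-language word)} \\
w^t_{d,m}  &\sim \mathrm{Multinomial}(\phi^t_{z_{d,m}}) && \text{(target-language word)}
\end{align*}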

2013

Improved Online Learning and Modeling for Feature-Rich Discriminative Machine Translation Abstract: Machine translation represents one of the core tasks in natural language processing: performing an automatic analysis of input text to produce a structured output. Most modern statistical machine translation (SMT) systems learn how to translate by constructing a discriminative model based on statistics from the data. While this allows the introduction of feature functions representing important attributes of the translation process, the associated model parameters must be learned. In this thesis, we present novel models and learning algorithms that address this issue by tackling three core problems for discriminative training: what to optimize, how to optimize, and how to represent the input. The first issue amounts to selecting an appropriate objective function, the second to properly searching for parameters that optimize that objective, and the third to extracting informative features of the data. In addressing these issues, we develop fast learning algorithms that are both suitable for large-scale SMT training and capable of generalization in high-dimensional feature spaces.

Our algorithms are developed in an online margin-based framework. While online margin-based methods are firmly established in the machine learning community, their adaptation to machine translation is not straightforward. Thus, the first problem we address is what to optimize when learning for SMT, which involves loss-augmented inference over latent variables. To this end, we define a family of objective functions for large-margin learning in SMT and investigate their optimization performance in standard and high-dimensional feature spaces. We also show that this approach shows promise not just with respect to the size of the feature space, but with respect to the size of the tuning data.

After establishing what to optimize, the second problem we focus on is how to improve learning in the feature-rich space. While the goal of all learning methods is to produce a model from a limited number of training instances that generalizes well to unseen data, this becomes increasingly difficult as the feature dimension grows. Following recent developments in machine learning that show that generalization ability can be improved by incorporating higher order information into the optimization, we develop an online gradient-based algorithm based on Relative Margin Machines that improves upon large-margin learning by considering and bounding the spread of the data while maximizing the margin.

By utilizing the learning regimes developed thus far, we are able to focus on the third problem and introduce new features specifically targeting generalization to new domains. While domain adaptation has typically been done with manually defined domains and corpora, we employ topic models to perform unsupervised domain induction, and introduce translation model adaptation features based on probabilistic domain membership.

As a final question, we look at how to take advantage of all the available information for optimization. In current models of SMT, there is an exponential number of paths by which the same translation output can be produced. However, only the final output is observed, thereby necessitating the treatment of these paths as a latent variable. The standard practice is to sidestep this ambiguity by treating each derivation as an individual translation, and performing training and inference toward a single way of producing the output. While addressing the learning problem above, we were still firmly situated in that standard regime. In the final part of the thesis, we revisit and present a more thorough approach for what to optimize, defining a framework for latent variable models which explicitly takes advantage of all derivations in both learning and inference. We present a novel loss function for large-margin learning in that setting, and develop a suitable optimization algorithm for training an SMT system with this objective.
[slides] [drum]
V. Eidelman
Ph.D. Dissertation, University of Maryland
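One compact way to write the latent-variable large-margin objective the final part of the thesis concerns (a schematic rendering in my own notation, not the thesis's exact formulation), with d ranging over derivations, y* the reference, and cost a task loss such as 1 minus sentence-level BLEU:

% Structured hinge loss with latent derivations d (schematic):
% cost-augmented "fear" over all outputs, minus the best "hope"
% derivation of the reference y*.
\ell(\mathbf{w}) \;=\; \max_{y,\,d}\bigl[\mathbf{w}^\top\mathbf{f}(x,y,d) + \mathrm{cost}(y,y^*)\bigr] \;-\; \max_{d'} \mathbf{w}^\top\mathbf{f}(x,y^*,d')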
Topic Models for Translation Model Adaptation Abstract: Topic models have been successfully applied in domain adaptation for translation models. However, previous work applied topic models only on the source side and ignored the relations between the source and target languages in machine translation. This paper corrects this omission by learning models that can also use target-side information to discover more distinct topics: tree-based topic models and polylingual topic models. We evaluate the models using translation quality.
Y. Hu, K. Zhai, V. Eidelman, and J. Boyd-Graber
In Proceedings of the NIPS 2013 workshop on Topic Models: Computation, Application, and Evaluation
Online Relative Margin Maximization for Statistical Machine Translation Abstract: Recent advances in large-margin learning have shown that better generalization can be achieved by incorporating higher order information into the optimization, such as the spread of the data. However, these solutions are impractical in complex structured prediction problems such as statistical machine translation. We present an online gradient-based algorithm for relative margin maximization, which bounds the spread of the projected data while maximizing the margin. We evaluate our optimizer on Chinese-English and Arabic-English translation tasks, each with small and large feature sets, and show that our learner is able to achieve significant improvements of 1.2-2 BLEU and 1.7-4.3 TER on average over state-of-the-art optimizers with the large feature set. [slides] [code]
V. Eidelman, Y. Marton, and P. Resnik
In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL)
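For context, the batch Relative Margin Machine of Shivaswamy and Jebara, which the online algorithm above adapts, can be stated schematically as maximizing the margin while bounding the spread of the projected data by a constant B:

% Relative Margin Machine, binary case (schematic statement):
\begin{align*}
\min_{\mathbf{w},\,b} \quad & \tfrac{1}{2}\lVert\mathbf{w}\rVert^2 \\
\text{s.t.} \quad & y_i(\mathbf{w}^\top\mathbf{x}_i + b) \ge 1, \\
                  & \lvert\mathbf{w}^\top\mathbf{x}_i + b\rvert \le B, \qquad i = 1,\dots,n
\end{align*}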
Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce Abstract: We present an open-source framework for large-scale online structured learning. Developed with the flexibility to handle cost-augmented inference problems such as statistical machine translation (SMT), our large-margin learner can be used with any decoder. Integration with MapReduce using Hadoop streaming allows efficient scaling with increasing size of training data. Although designed with a focus on SMT, the decoder-agnostic design of our learner allows easy future extension to other structured learning problems such as sequence labeling and parsing. [poster] [code]
V. Eidelman, K. Wu, F. Ture, P. Resnik and J. Lin
In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL)
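A minimal sketch of the iterative parameter mixing pattern that Hadoop-streaming learners of this kind typically follow (function names here are illustrative, not the toolkit's API): mappers run online updates over data shards, and a reducer averages the resulting weight vectors once per epoch.

# Illustrative sketch (not Mr. MIRA's actual code) of iterative
# parameter mixing with a MapReduce-style mapper and reducer.

def mapper(shard, weights, update):
    """Run online large-margin updates over one shard of training data."""
    w = dict(weights)
    for example in shard:
        w = update(w, example)   # e.g., a MIRA/passive-aggressive step
    return w

def reducer(shard_weights):
    """Average the per-shard weight vectors (parameter mixing)."""
    avg = {}
    for w in shard_weights:
        for feat, val in w.items():
            avg[feat] = avg.get(feat, 0.0) + val / len(shard_weights)
    return avg

def train(shards, update, epochs=10):
    weights = {}
    for _ in range(epochs):      # one MapReduce job per epoch
        weights = reducer([mapper(s, weights, update) for s in shards])
    return weights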
Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation Abstract: We present the system we developed to provide efficient large-scale feature-rich discriminative training for machine translation. We describe how we integrate with MapReduce using Hadoop streaming to allow arbitrarily scaling the tuning set and utilizing a sparse feature set. We report our findings on German-English and Russian-English translation, and discuss benefits, as well as obstacles, to tuning on larger development sets drawn from the parallel training data. [poster]
V. Eidelman, K. Wu, F. Ture, P. Resnik and J. Lin
In Proceedings of the Eighth Workshop on Statistical Machine Translation (WMT)

2012

Unsupervised Feature-Rich Clustering Abstract: Unsupervised clustering of documents is challenging because documents can conceivably be divided across multiple dimensions. Motivated by prior work incorporating expressive features into unsupervised generative models, this paper presents an unsupervised model for categorizing textual data which is capable of utilizing arbitrary features over a large context. Utilizing locally normalized log-linear models in the generative process, we offer straightforward extensions to the standard multinomial mixture model that allow us to effectively utilize automatically derived complex linguistic, statistical, and metadata features to influence the learned cluster structure for the desired task. We extensively evaluate and analyze the model’s capabilities over four distinct clustering tasks: topic, perspective, sentiment analysis, and Congressional bill survival, and show that this model outperforms strong baselines and state-of-the-art models. [slides]
V. Eidelman
In Proceedings of the 24th International Conference on Computational Linguistics (COLING)
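The key move in the model above is replacing each multinomial in the mixture with a locally normalized log-linear distribution, so that arbitrary features f(x, c) can shape the learned clusters. Schematically (my rendering, not the paper's exact notation):

% Locally normalized log-linear emission (schematic): a log-linear
% distribution replaces the multinomial p(x | c) of a mixture model.
p(x \mid c;\, \boldsymbol{\theta}) \;=\;
  \frac{\exp\bigl(\boldsymbol{\theta}^\top\mathbf{f}(x, c)\bigr)}
       {\sum_{x'} \exp\bigl(\boldsymbol{\theta}^\top\mathbf{f}(x', c)\bigr)}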
Topic Models for Dynamic Translation Model Adaptation Abstract: We propose an approach that biases machine translation systems toward relevant translations based on topic-specific contexts, where topics are induced in an unsupervised way using topic models; this can be thought of as inducing subcorpora for adaptation without any human annotation. We use these topic distributions to compute topic-dependent lexical weighting probabilities and directly incorporate them into our translation model as features. Conditioning lexical probabilities on the topic biases translations toward topic-relevant output, resulting in significant improvements of up to 1 BLEU and 3 TER on Chinese to English translation over a strong baseline. [slides]
V. Eidelman, J. Boyd-Graber, and P. Resnik
In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL)
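One plausible schematic form of such a topic-adapted lexical feature (an illustration in my own notation, not necessarily the paper's exact feature definition): per-topic lexical weights are combined with the inferred topic distribution of the test document d, yielding one adapted feature per topic k.

% Topic-adapted lexical feature, schematic (illustrative):
f_k(e \mid f,\, d) \;=\; -\log p(e \mid f,\, z = k)\cdot p(z = k \mid d)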
Optimization Strategies for Online Large-Margin Learning in Machine Translation Abstract: The introduction of large-margin based discriminative methods for optimizing statistical machine translation systems in recent years has allowed exploration into many new types of features for the translation process. By removing the limitation on the number of parameters which can be optimized, these methods have allowed integrating millions of sparse features. However, these methods have not yet met with widespread adoption. This may be partly due to the perceived complexity of implementation, and partly due to the lack of standard methodology for applying these methods to MT. This paper aims to shed light on large-margin learning for MT, explicitly presenting the simple passive-aggressive algorithm which underlies many previous approaches, with direct application to MT, and empirically comparing several widespread optimization strategies. [slides] [code]
V. Eidelman
In Proceedings of the Seventh Workshop on Statistical Machine Translation (WMT)
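A minimal sketch of the hope/fear passive-aggressive update that underlies MIRA-style tuning of the kind discussed above (illustrative code, not the released implementation; feature vectors are plain dicts):

# One passive-aggressive step toward the "hope" hypothesis and away
# from the "fear" hypothesis (illustrative sketch).

def pa_update(w, f_hope, f_fear, cost_hope, cost_fear, C=0.01):
    """f_hope/f_fear: feature dicts of the hope and fear derivations;
    cost_*: their costs (e.g., 1 - sentence BLEU); C: aggressiveness cap."""
    delta = {k: f_hope.get(k, 0.0) - f_fear.get(k, 0.0)
             for k in set(f_hope) | set(f_fear)}
    margin = sum(w.get(k, 0.0) * v for k, v in delta.items())
    loss = (cost_fear - cost_hope) - margin      # hinge: violated if > 0
    if loss > 0:
        norm_sq = sum(v * v for v in delta.values())
        tau = min(C, loss / norm_sq) if norm_sq > 0 else 0.0
        for k, v in delta.items():               # move toward hope features
            w[k] = w.get(k, 0.0) + tau * v
    return w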

2011

Noisy SMS Machine Translation in Low-Density Languages Abstract: This paper presents the system we developed for the 2011 WMT Haitian Creole–English SMS featured translation task. Applying standard statistical machine translation methods to noisy real-world SMS data in a low-density language setting such as Haitian Creole poses a unique set of challenges, which we attempt to address in this work. Along with techniques to better exploit the limited available training data, we explore the benefits of several methods for alleviating the additional noise inherent in the SMS and transforming it to better suit the assumptions of our hierarchical phrase-based model system. We show that these methods lead to significant improvements in BLEU score over the baseline.
V. Eidelman, K. Hollingshead, and P. Resnik
In Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT)
The Value of Monolingual Crowdsourcing in a Real-World Translation Scenario: Simulation using Haitian Creole Emergency SMS Messages Abstract: MonoTrans2 is a translation system that combines machine translation (MT) with human computation using two crowds of monolingual source (Haitian Creole) and target (English) speakers. We report on its use in the WMT 2011 Haitian Creole to English translation task, showing that MonoTrans2 translated 38% of the sentences well compared to Google Translate’s 25%.
C. Hu, P. Resnik, Y. Kronrod, V. Eidelman, O. Buzek, and B. Bederson
In Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT)

2010

Lessons Learned in Part-of-Speech Tagging of Conversational Speech Abstract: This paper examines tagging models for spontaneous English speech transcripts. We analyze the performance of state-of-the-art tagging models, either generative or discriminative, left-to-right or bidirectional, with or without latent annotations, together with the use of ToBI break indexes and several methods for segmenting the speech transcripts (i.e., conversation side, speaker turn, or human-annotated sentence). Based on these studies, we observe that: (1) bidirectional models tend to achieve better accuracy levels than left-to-right models, (2) generative models seem to perform somewhat better than discriminative models on this task, and (3) prosody improves tagging performance of models on conversation sides, but has much less impact on smaller segments. We conclude that, although the use of break indexes can indeed significantly improve performance over baseline models without them on conversation sides, tagging accuracy improves more by using smaller segments, for which the impact of the break indexes is marginal. [poster]
V. Eidelman, Z. Huang, and M. Harper
In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP)
The University of Maryland Statistical Machine Translation System for the Fifth Workshop on Machine Translation Abstract: This paper describes the system we developed to improve German-English translation of News text for the shared task of the Fifth Workshop on Statistical Machine Translation. Working within cdec, an open source modular framework for machine translation, we explore the benefits of several modifications to our hierarchical phrase-based model, including segmentation lattices, minimum Bayes Risk decoding, grammar extraction methods, and varying language models. Furthermore, we analyze decoder speed and memory performance across our set of models and show there is an important trade-off that needs to be made. [poster]
V. Eidelman, C. Dyer, and P. Resnik
In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and Metrics MATR (WMT)
cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models Abstract: We present cdec, an open source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models based on synchronous context-free grammars. Using a single unified internal representation for translation forests, the decoder strictly separates model-specific translation logic from general rescoring, pruning, and inference algorithms. From this unified representation, the decoder can extract not only the 1- or k-best translations, but also alignments to a reference, or the quantities necessary to drive discriminative training using gradient-based or gradient-free optimization techniques. Its efficient C++ implementation means that memory use and runtime performance are significantly better than comparable decoders.
C. Dyer, A. Lopez, J. Ganitkevitch, J. Weese, F. Ture, P. Blunsom, H. Setiawan, V. Eidelman, and P. Resnik
In Proceedings of the Association for Computational Linguistics (ACL)
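An illustrative sketch of the shared translation-forest (hypergraph) representation that cdec's design centers on (this is a schematic data structure in Python, not cdec's actual C++ API): decoding fills such a forest once, and k-best extraction, alignment, and discriminative training all operate on the same structure.

# Schematic translation forest: nodes are chart items, hyperedges are
# rule applications over child nodes (illustrative, not cdec's API).
from dataclasses import dataclass, field

@dataclass
class Edge:
    head: int        # node this edge derives
    tails: list      # child nodes consumed by the rule
    rule: str        # synchronous grammar rule
    features: dict   # sparse feature vector used for rescoring

@dataclass
class Node:
    id: int
    in_edges: list = field(default_factory=list)  # ways to build this node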

2009

Improving A Simple Bigram HMM Part-of-Speech Tagger by Latent Annotation and Self-Training Abstract: In this paper, we describe and evaluate a bigram part-of-speech (POS) tagger that uses latent annotations and then investigate using additional genre-matched unlabeled data for self-training the tagger. The use of latent annotations substantially improves the performance of a baseline HMM bigram tagger, outperforming a trigram HMM tagger with sophisticated smoothing. The performance of the latent tagger is further enhanced by self-training with a large set of unlabeled data, even in situations where standard bigram or trigram taggers do not benefit from self-training when trained on greater amounts of labeled training data. Our best model obtains a state-of-the-art Chinese tagging accuracy of 94.78% when evaluated on a representative test set of the Penn Chinese Treebank 6.0. [poster]
Z. Huang, V. Eidelman, M. Harper
In Proceedings of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT)
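Schematically, a bigram HMM with latent annotations splits each coarse tag t into subcategories t^(a), with the annotations a learned automatically during training; the tagger then marginalizes over them. A sketch in my own notation (not the paper's exact formulation):

% Bigram HMM with latent annotations a_i refining each tag t_i (schematic):
p(\mathbf{w}, \mathbf{t}, \mathbf{a}) \;=\;
  \prod_i p\bigl(t_i^{(a_i)} \mid t_{i-1}^{(a_{i-1})}\bigr)\,
          p\bigl(w_i \mid t_i^{(a_i)}\bigr)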

2008

Inferring Activity Time in News through Event Modeling Abstract: Many applications in NLP, such as question answering and summarization, either require or would greatly benefit from the knowledge of when an event occurred. Creating an effective algorithm for identifying the activity time of an event in news is difficult in part because of the sparsity of explicit temporal expressions. This paper describes a domain-independent machine-learning based approach to assign activity times to events in news. We demonstrate that by applying topic models to text, we are able to cluster sentences that describe the same event, and utilize the temporal information within these event clusters to infer activity times for all sentences. Experimental evidence suggests that this is a promising approach, given evaluations performed on three distinct news article sets against the baseline of assigning the publication date. Our approach achieves 90%, 88.7%, and 68.7% accuracy, respectively, outperforming the baseline on two of the three sets. [slides]
V. Eidelman
In Proceedings of the Association for Computational Linguistics (ACL) Student Research Workshop
BART: A Modular Toolkit for Coreference Resolution Abstract: Developing a full coreference system able to run all the way from raw text to semantic interpretation is a considerable engineering effort, yet there is very limited availability of off-the-shelf tools for researchers whose interests are not in coreference, or for researchers who want to concentrate on a specific aspect of the problem. We present BART, a highly modular toolkit for developing coreference applications. In the Johns Hopkins workshop on using lexical and encyclopedic knowledge for entity disambiguation, the toolkit was used to extend a reimplementation of the Soon et al. (2001) proposal with a variety of additional syntactic and knowledge-based features, and experiment with alternative resolution processes, preprocessing tools, and classifiers.
Y. Versley, S. Ponzetto, M. Poesio, V. Eidelman, A. Jern, J. Smith, X. Yang, and A. Moschitti
In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL)
BART: A Modular Toolkit for Coreference Resolution
Y. Versley, S. Ponzetto, M. Poesio, V. Eidelman, A. Jern, J. Smith, X. Yang, and A. Moschitti
In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC)

Pre-2008

Exploiting Lexical & Encyclopedic Resources For Entity Disambiguation Final Report
M. Poesio, D. Day, R. Artstein, J. Duncan, V. Eidelman, C. Giuliano, R. Hall, J. Hitzeman, A. Jern, M. Kabadjov, G. Mann, P. McNamee, A. Moschitti, S. Ponzetto, J. Smith, J. Steinberger, M. Strube, J. Su, Y. Versley, X. Yang, and M. Wick
Technical Report for CLSP Workshop, Johns Hopkins University, 2007
Cognitive Robotics and Multiagency in a Fuzzy Modeling Framework
G. Trajkovski, G. Stojanov, S. Collins, V. Eidelman, C. Harman, and G. Vincenti
International Journal of Agent Technologies and Systems, 1(1):50-73
Extension of an Algebraic Model of Cognition To a Congruent Continuous Model
V. Eidelman and G. Trajkovski
Technical Report for NSF REU, Towson University, 2006

