Hal Daumé III

I am Hal Daumé III, an Assistant Professor in Computer Science (also UMIACS and Linguistics) at the University of Maryland; I was previously in the School of Computing at the University of Utah (CV). Although I'd like to be known for my research in language (computational linguistics and natural language processing) and machine learning (structured prediction, domain adaptation and Bayesian methods), I am probably best known for my NLPers blog. I associate myself most with conferences like ACL, ICML, EMNLP and NIPS. At UMD, I'm affiliated with the Computational Linguistics lab, the machine learning reading group, the language science program and the AI group, and interact closely with LINQS and computer vision.

BRAQUE: Braque is a news-for-researchers site that Percy Liang and I developed to help people stay on top of their research fields. Sign up and try it out!


The best way to reach me is by email at me AT hal3 DOT name. I cannot reply to all emails from prospective students; please read this to ensure that I read your email. For pressing matters, please come visit me in person at AVW 3227, or call my office at 301-405-1073.


Recent publications:

  • Regularized Interlingual Projections: Evaluation on Multilingual Transliteration [Jagarlamudi+HD] (2012)
    @InProceedings{daume12transliterate,
       author = {Jagadeesh Jagarlamudi and Hal {Daum\'e III}},
       title = {Regularized Interlingual Projections: Evaluation on Multilingual Transliteration},
       booktitle = {Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning},
       month = {July},
       year = {2012},
       address = {Jeju Island, Korea},
       publisher = {Association for Computational Linguistics},
       pages = {12--23},
       url = {http://www.aclweb.org/anthology/D12-1002}
    }
  • Imitation Learning by Coaching [He+al.] (NIPS 2012)
    Abstract: Imitation Learning has been shown to be successful in solving many challenging real-world problems. Some recent approaches give strong performance guarantees by training the policy iteratively. However, it is important to note that these guarantees depend on how well the policy we found can imitate the oracle on the training data. When there is a substantial difference between the oracle’s ability and the learner’s policy space, we may fail to find a policy that has low error on the training set. In such cases, we propose to use a coach that demonstrates easy-to-learn actions for the learner and gradually approaches the oracle. By a reduction of learning by demonstration to online learning, we prove that coaching can yield a lower regret bound than using the oracle. We apply our algorithm to cost-sensitive dynamic feature selection, a hard decision problem that considers a user-specified accuracy-cost trade-off. Experimental results on UCI datasets show that our method outperforms state-of-the-art imitation learning methods in dynamic feature selection and two static feature selection methods.
    @InProceedings{daume12coaching,
       author = {He He and Hal {Daum\'e III} and Jason Eisner},
       title = {Imitation Learning by Coaching},
       booktitle = {Neural Information Processing Systems (NIPS)},
       year = {2012},
       url = {http://hal3.name/docs/#daume12coaching}
    }
       
  • Detecting Visual Text [Dodge+al.] (NAACL 2012)
    Abstract: When people describe a scene, they often include information that is not visually apparent; sometimes based on background knowledge, sometimes to tell a story. We aim to separate visual text (descriptions of what is being seen) from non-visual text in natural images and their descriptions. To do so, we first concretely define what it means to be visual, annotate visual text and then develop algorithms to automatically classify noun phrases as visual or non-visual. We find that using text alone, we are able to achieve high accuracies at this task, and that incorporating features derived from computer vision algorithms improves performance. Finally, we show that we can reliably mine visual nouns and adjectives from large corpora and that we can use these effectively in the classification task.
    @InProceedings{daume12desctext,
       author = {Jesse Dodge and Amit Goyal and Xufeng Han and Alyssa Mensch and Margaret Mitchell and Karl Stratos and Kota Yamaguchi and Yejin Choi and Hal {Daum\'e III} and Alexander C. Berg and Tamara L. Berg},
       title = {Detecting Visual Text},
       booktitle = {North American Chapter of the Association for Computational Linguistics (NAACL)},
       year = {2012},
       url = {http://hal3.name/docs/#daume12desctext}
    }
     [data/code]
  • Understanding and Predicting Importance in Images [Stratos+al.] (CVPR 2012)
    Abstract: What do people care about in an image? To drive computational visual recognition toward more human-centric outputs, we need a better understanding of how people perceive and judge the importance of content in images. In this paper, we explore how a number of factors relate to human perception of importance. Proposed factors fall into 3 broad types: 1) factors related to composition, e.g. size, location, 2) factors related to semantics, e.g. category of object or scene, and 3) contextual factors related to the likelihood of attribute-object, or object-scene pairs. We explore these factors using what people describe as a proxy for importance. Finally, we build models to predict what will be described about an image given either known image content, or image content estimated automatically by recognition systems.
    @InProceedings{daume12importance,
       author = {Karl Stratos and Aneesh Sood and Alyssa Mensch and Xufeng Han and Margaret Mitchell and Kota Yamaguchi and Jesse Dodge and Amit Goyal and Hal {Daum\'e III} and Alexander C. Berg and Tamara L. Berg},
       title = {Understanding and Predicting Importance in Images},
       booktitle = {Computer Vision and Pattern Recognition (CVPR)},
       year = {2012},
       url = {http://hal3.name/docs/#daume12importance}
    }
  • Midge: Generating Image Descriptions From Computer Vision Detections [Mitchell+al.] (EACL 2012)
    Abstract: This paper introduces a novel generation system that composes humanlike descriptions of images from computer vision detections. By leveraging syntactically informed word co-occurrence statistics, the generator filters and constrains the noisy detections output from a vision system to generate syntactic trees that detail what the computer vision system sees. Results show that the generation system outperforms state-of-the-art systems, automatically generating some of the most natural image descriptions to date.
    @InProceedings{daume12midge,
       author = {Margaret Mitchell and Jesse Dodge and Amit Goyal and Kota Yamaguchi and Karl Stratos and Xufeng Han and Alyssa Mensch and Alexander C. Berg and Tamara L. Berg and Hal {Daum\'e III}},
       title = {Midge: Generating Image Descriptions From Computer Vision Detections},
       booktitle = {European Chapter of the Association for Computational Linguistics (EACL)},
       year = {2012},
       url = {http://hal3.name/docs/#daume12midge}
    }
  • Learning Task Grouping and Overlap in Multi-task Learning [Kumar+HD] (ICML 2012)
    Abstract: In the paradigm of multi-task learning, multiple related prediction tasks are learned jointly, sharing information across the tasks. We propose a framework for multi-task learning that enables one to selectively share the information across the tasks. We assume that each task parameter vector is a linear combination of a finite number of underlying basis tasks. The coefficients of the linear combination are sparse in nature and the overlap in the sparsity patterns of two tasks controls the amount of sharing across these. Our model is based on the assumption that task parameters within a group lie in a low dimensional subspace but allows the tasks in different groups to overlap with each other in one or more bases. Experimental results on four datasets show that our approach outperforms competing methods.
    @InProceedings{daume12gomtl,
       author = {Abhishek Kumar and Hal {Daum\'e III}},
       title = {Learning Task Grouping and Overlap in Multi-task Learning},
       booktitle = {International Conference on Machine Learning (ICML)},
       year = {2012},
       url = {http://hal3.name/docs/#daume12gomtl}
    }
  • A Binary Classification Framework for Two-Stage Multiple Kernel Learning [Kumar+al.] (ICML 2012)
    Abstract: With the advent of kernel methods, automating the task of specifying a suitable kernel has become increasingly important. In this context, the Multiple Kernel Learning (MKL) problem of finding a combination of prespecified base kernels that is suitable for the task at hand has received significant attention from researchers. In this paper we show that Multiple Kernel Learning can be framed as a standard binary classification problem with additional constraints that ensure the positive definiteness of the learned kernel. Framing MKL in this way has the distinct advantage that it makes it easy to leverage the extensive research in binary classification to develop better performing and more scalable MKL algorithms that are conceptually simpler, and, arguably, more accessible to practitioners. Experiments on nine data sets from different domains show that, despite its simplicity, the proposed technique compares favorably with current leading MKL approaches.
    @InProceedings{daume12binarymkl,
       author = {Abhishek Kumar and Alexandru Niculescu-Mizil and Koray Kavukcuoglu and Hal {Daum\'e III}},
       title = {A Binary Classification Framework for Two-Stage Multiple Kernel Learning},
       booktitle = {International Conference on Machine Learning (ICML)},
       year = {2012},
       url = {http://hal3.name/docs/#daume12binarymkl}
    }
  • Generalized Multiview Analysis: A Discriminative latent space [Sharma+al.] (CVPR 2012)
    Abstract: This paper presents a general multi-view feature extraction approach that we call Generalized Multiview Analysis or GMA. GMA has all the desirable properties required for cross-view classification and retrieval: it is supervised, it allows generalization to unseen classes, it is multi-view and kernelizable, it affords an efficient eigenvalue based solution and is applicable to any domain. GMA exploits the fact that most popular supervised and unsupervised feature extraction techniques are the solution of a special form of a quadratic constrained quadratic program (QCQP), which can be solved efficiently as a generalized eigenvalue problem. GMA solves a joint, relaxed QCQP over different feature spaces to obtain a single (non)linear subspace. Intuitively, GMA is a supervised extension of Canonical Correlational Analysis (CCA), which is useful for cross-view classification and retrieval. The proposed approach is general and has the potential to replace CCA whenever classification or retrieval is the purpose and label information is available. We outperform previous approaches for text-image retrieval on Pascal and Wiki text-image data. We report state-of-the-art results for pose and lighting invariant face recognition on the MultiPIE face dataset, significantly outperforming other approaches.
    @InProceedings{daume12gma,
       author = {Abhishek Sharma and Abhishek Kumar and Hal {Daum\'e III} and David Jacobs},
       title = {Generalized Multiview Analysis: A Discriminative latent space},
       booktitle = {Computer Vision and Pattern Recognition (CVPR)},
       year = {2012},
       url = {http://hal3.name/docs/#daume12gma}
    }
  • Incorporating Lexical Priors into Topic Models [Jagarlamudi+al.] (EACL 2012)
    Abstract: Topic models have great potential for helping users understand document corpora. This potential is stymied by their purely unsupervised nature, which often leads to topics that are neither entirely meaningful nor effective in extrinsic tasks (Chang et al., 2009). We propose a simple and effective way to guide topic models to learn topics of specific interest to a user. We achieve this by providing sets of seed words that a user believes are representative of the underlying topics in a corpus. Our model uses these seeds to improve both topic-word distributions (by biasing topics to produce appropriate seed words) and to improve document-topic distributions (by biasing documents to select topics related to the seed words they contain). Extrinsic evaluation on a document clustering task reveals a significant improvement when using seed information, even over other models that use seed information naively.
    @inproceedings{daume12seeded,
       title = {Incorporating Lexical Priors into Topic Models},
       author = {Jagadeesh Jagarlamudi and Hal {Daum\'e III} and Raghavendra Udupa},
       booktitle = {Proceedings of the Conference on European Chapter of the Association for Computational Linguistics (EACL)},
       year = {2012},
       address = {Avignon, France},
       url = {http://hal3.name/docs/#daume12seeded}
    }
  • Flexible Modeling of Latent Task Structures in Multitask Learning [Passos+al.] (ICML 2012)
    Abstract: Multitask learning algorithms are typically designed assuming some fixed, a priori known latent structure shared by all the tasks. However, it is usually unclear what type of latent task structure is the most appropriate for a given multitask learning problem. Ideally, the "right" latent task structure should be learned in a data-driven manner. We present a flexible, nonparametric Bayesian model that posits a mixture of factor analyzers structure on the tasks. The nonparametric aspect makes the model expressive enough to subsume many existing models of latent task structures (e.g., mean-regularized tasks, clustered tasks, low-rank or linear/non-linear subspace assumption on tasks, etc.). Moreover, it can also learn more general task structures, addressing the shortcomings of such models. We present a variational inference algorithm for our model. Experimental results on synthetic and real-world datasets, on both regression and classification problems, demonstrate the effectiveness of the proposed method.
    @InProceedings{daume12flexiblemtl,
       author = {Alexandre Passos and Piyush Rai and Jacques Wainer and Hal {Daum\'e III}},
       title = {Flexible Modeling of Latent Task Structures in Multitask Learning},
       booktitle = {International Conference on Machine Learning (ICML)},
       year = {2012},
       address = {Edinburgh, Scotland},
       url = {http://hal3.name/docs/#daume12flexiblemtl}
    }

Recent teaching:


Advisees:

Prospective students:
  • Read this, and after taking machine learning and/or NLP, email me about potential research.

Current advisees:

Past advisees:

  • Adam Teichert (MS 2009 at Utah, now PhD student at JHU)
  • Scott Alfeld (BS 2008 at Utah, now PhD student at USC)

Upcoming Conferences

(bold = plan to attend):

Conference  Location             Due Date  Notification  Conference Dates
AISTATS 12  Canary Islands       Past      Past          21-23 Apr
ACL 12      Jeju, Korea          Past      11 Mar        09-11 Jul
EMNLP 12    Jeju, Korea          28 Mar    18 May        12-14 Jul
CVPR 12     Providence, RI       Past      02 Mar        18-20 Jun
ICML 12     Edinburgh, Scotland  24 Feb    30 Apr        26 Jun-01 Jul
AAAI 12     Toronto, Canada      Past      28 Mar        22-26 Jul
KDD 12      Beijing, China       Past      04 May        12-16 Aug
UAI 12      Catalina Island, CA  30 Mar    01 Jun        15-17 Aug
NIPS 12     Reno, NV             ??        ??            ??

last updated on twenty two april, two thousand thirteen; contact me AT hal3 DOT name