Tom Yeh

Assistant Research Scientist
4465 A.V. Williams Building
(301) 405-9159
Education: 
Ph.D., MIT (Computer Science)
Biography: 

Tom Yeh is an assistant research scientist in the University of Maryland Institute for Advanced Computer Studies (UMIACS). He received his Ph.D. in Computer Science from MIT in 2009 and joined the University of Maryland in 2010. His research interests span human-computer interaction, computer vision, and software engineering. He has written over 30 research publications on algorithms for interactive computer vision, vision-based interactive systems, multimedia information retrieval, and visual software test automation. He has served on the program committees of conferences in his area, including the Symposium on User Interface Software and Technology and the Workshop on Computer Vision Application. He has won a number of best paper awards. He is one of the creators of the popular Sikuli software, which enables non-programmers to write simple image-based automation scripts.

Publications

2012


Xie B, Tom Yeh, Walsh G, Watkins I, Huang M.  2012.  Co-designing an e-health tutorial for older adults. Proceedings of the 2012 iConference.
:240-247.

2011


Tom Yeh, White B, San Pedro J, Katz B, Davis LS.  2011.  A case for query by image and text content: searching computer help using screenshots and keywords. Proceedings of the 20th international conference on World wide web.
:775-784.

Chen D, Bilgic M, Getoor L, Jacobs DW, Mihalkova L, Tom Yeh.  2011.  Active inference for retrieval in camera networks. Person-Oriented Vision (POV), 2011 IEEE Workshop on.
:13-20.

Chang T-H, Tom Yeh, Miller R.  2011.  Associating the visual representation of user interfaces with their internal structures and metadata. Proceedings of the 24th annual ACM symposium on User interface software and technology.
:245-256.

Xie B, Tom Yeh, Walsh G, Watkins I, Huang M.  2011.  Co‐designing contextual tutorials for older adults on searching health information on the internet. Proceedings of the American Society for Information Science and Technology. 48(1):1-4.

Tom Yeh, Chang T-H, Xie B, Walsh G, Watkins I, Wongsuphasawat K, Huang M, Davis LS, Bederson BB.  2011.  Creating contextual help for GUIs using screenshots. Proceedings of the 24th annual ACM symposium on User interface software and technology.
:145-154.

2010


Bigham JP, Jayant C, Ji H, Little G, Miller A, Miller RC, Miller R, Tatarowicz A, White B, White S et al.  2010.  VizWiz: nearly real-time answers to visual questions. Proceedings of the 23rd annual ACM symposium on User interface software and technology.
:333-342.

Bigham JP, Jayant C, Miller A, White B, Tom Yeh.  2010.  VizWiz::LocateIt - enabling blind people to locate objects in their environment. Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on.
:65-72.

Chang T-H, Tom Yeh, Miller RC.  2010.  GUI testing using computer vision. Proceedings of the 28th international conference on Human factors in computing systems.
:1535-1544.

White B, Tom Yeh, Jimmy Lin, Davis LS.  2010.  Web-scale computer vision using MapReduce for multimedia data mining. Proceedings of the Tenth International Workshop on Multimedia Data Mining.
:9:1–9:10.

2009


Tom Yeh, Katz B.  2009.  Searching documentation using text, OCR, and image. Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval.
:776-777.

Tom Yeh, Chang T-H, Miller RC.  2009.  Sikuli: using GUI screenshots for search and automation. Proceedings of the 22nd annual ACM symposium on User interface software and technology.
:183-192.

Tom Yeh, Lee JJ, Darrell T.  2009.  Fast concurrent object localization and recognition. Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on.
:280-287.

2008


Tom Yeh, Darrell T.  2008.  Multimodal question answering for mobile devices. Proceedings of the 13th international conference on Intelligent user interfaces.
:405-408.

Tom Yeh, Lee JJ, Darrell T.  2008.  Photo-based question answering. Proceedings of the 16th ACM international conference on Multimedia.
:389-398.

Tom Yeh, Lee JJ, Darrell T.  2008.  Scalable classifiers for Internet vision tasks. Computer Vision and Pattern Recognition Workshops, 2008. CVPRW'08. IEEE Computer Society Conference on.
:1-8.

Tom Yeh, Darrell T.  2008.  Dynamic visual category learning. Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on.
:1-8.

Tom Yeh, Lee JJ, Darrell T.  2008.  Fast concurrent object classification and localization. CSAIL Technical Reports (July 1, 2003 - present).

2007


Tom Yeh, Lee J, Darrell T.  2007.  Adaptive Vocabulary Forests for Dynamic Indexing and Category Learning. Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on.
:1-8.

2005


Tom Yeh, Grauman K, Tollmar K, Darrell T.  2005.  A picture is worth a thousand keywords: image-based object search on a mobile platform. CHI '05 extended abstracts on Human factors in computing systems.
:2025-2028.

Tom Yeh, Darrell T.  2005.  Doubleshot: an interactive user-aided segmentation tool. Proceedings of the 10th international conference on Intelligent user interfaces.
:287-289.

2004


Tom Yeh, Tollmar K, Darrell T.  2004.  Searching the web with mobile images for location recognition. Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on. 2:76-81.

Tollmar K, Tom Yeh, Darrell T.  2004.  IDeixis: image-based deixis for finding location-based information. Mobile HCI, Vienna, Austria.
:781-782.

Tollmar K, Tom Yeh, Darrell T.  2004.  IDeixis–Searching the Web with Mobile Images for Location-Based Information. Mobile Human-Computer Interaction–MobileHCI 2004.
:61-125.

Tom Yeh, Tollmar K, Darrell T.  2004.  IDeixis: image-based Deixis for finding location-based information. CHI '04 extended abstracts on Human factors in computing systems.
:781-782.

2000


Voll K, Tom Yeh, Dahl V.  2000.  An assumptive logic programming methodology for parsing. Tools with Artificial Intelligence, 2000. ICTAI 2000. Proceedings. 12th IEEE International Conference on.
:11-18.