TY - JOUR
T1 - Combining preference and absolute judgements in a crowd-sourced setting
JF - ics.uci.edu
Y1 - Submitted
A1 - Ye, P
A1 - Doermann, David
AB - This paper addresses the problem of obtaining gold-standard labels of objects based on subjective judgements provided by humans. Assuming each object can be associated with an underlying score, the objective of this work is to predict the underlying ...
UR - https://www.ics.uci.edu/~qliu1/MLcrowd_ICML_workshop/Papers/ActivePaper5.pdf
ER -
TY - JOUR
T1 - Comparison of Infant Gut and Skin Microbiota, Resistome and Virulome Between Neonatal Intensive Care Unit (NICU) Environments
JF - Frontiers in Microbiology
Y1 - 2018
A1 - Hourigan, Suchitra K.
A1 - Subramanian, Poorani
A1 - Hasan, Nur A.
A1 - Ta, Allison
A1 - Klein, Elisabeth
A1 - Chettout, Nassim
A1 - Huddleston, Kathi
A1 - Deopujari, Varsha
A1 - Levy, Shira
A1 - Baveja, R
A1 - Clemency, Nicole C.
A1 - Baker, Robin L.
A1 - Niederhuber, John E.
A1 - Colwell, Rita R.
UR - https://www.frontiersin.org/article/10.3389/fmicb.2018.01361/full
J1 - Front. Microbiol.
M3 - 10.3389/fmicb.2018.01361
ER -
TY - JOUR
T1 - Application of a paper based device containing a new culture medium to detect Vibrio cholerae in water samples collected in Haiti
JF - Journal of Microbiological Methods
Y1 - 2017
A1 - Briquaire, Romain
A1 - Colwell, Rita R.
A1 - Boncy, Jacques
A1 - Rossignol, Emmanuel
A1 - Dardy, Aline
A1 - Pandini, Isabelle
A1 - Villeval, François
A1 - Machuron, Jean-Louis
A1 - Huq, Anwar
A1 - Rashed, Shah
A1 - Vandevelde, Thierry
A1 - Rozand, Christine
VL - 133
UR - https://www.sciencedirect.com/science/article/pii/S0167701216303578?via%3Dihub
J1 - Journal of Microbiological Methods
M3 - 10.1016/j.mimet.2016.12.014
ER -
TY - JOUR
T1 - Comprehensive benchmarking and ensemble approaches for metagenomic classifiers
JF - Genome Biology
Y1 - 2017
A1 - McIntyre, Alexa B. R.
A1 - Ounit, Rachid
A1 - Afshinnekoo, Ebrahim
A1 - Prill, Robert J.
A1 - Hénaff, Elizabeth
A1 - Alexander, Noah
A1 - Minot, Samuel S.
A1 - Danko, David
A1 - Foox, Jonathan
A1 - Ahsanuddin, Sofia
A1 - Tighe, Scott
A1 - Hasan, Nur A.
A1 - Subramanian, Poorani
A1 - Moffat, Kelly
A1 - Levy, Shawn
A1 - Lonardi, Stefano
A1 - Greenfield, Nick
A1 - Colwell, Rita R.
A1 - Rosen, Gail L.
A1 - Mason, Christopher E.
UR - http://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1299-7
CP - 1210
J1 - Genome Biol
M3 - 10.1186/s13059-017-1299-7
ER -
TY - JOUR
T1 - CRISPR-Cas and Contact-Dependent Secretion Systems Present on Excisable Pathogenicity Islands with Conserved Recombination Modules
JF - Journal of Bacteriology
Y1 - 2017
A1 - Carpenter, Megan R.
A1 - Kalburge, Sai S.
A1 - Borowski, Joseph D.
A1 - Peters, Molly C.
A1 - Colwell, Rita R.
A1 - Boyd, E. Fidelma
ED - DiRita, Victor J.
AB - Pathogenicity islands (PAIs) are mobile integrated genetic elements that contain a diverse range of virulence factors. PAIs integrate into the host chromosome at a tRNA locus that contains their specific bacterial attachment site, attB, via integrase-mediated site-specific recombination generating attL and attR sites. We identified conserved recombination modules (integrases and att sites) previously described in choleragenic Vibrio cholerae PAIs but with novel cargo genes. Clustered regularly interspaced short palindromic repeat (CRISPR)-associated proteins (Cas proteins) and a type VI secretion system (T6SS) gene cluster were identified at the Vibrio pathogenicity island 1 (VPI-1) insertion site in 19 V. cholerae strains and contained the same recombination module. Two divergent type I-F CRISPR-Cas systems were identified, which differed in Cas protein homology and content. The CRISPR repeat sequence was identical among all V. cholerae strains, but the CRISPR spacer sequences and the number of spacers varied. In silico analysis suggests that the CRISPR-Cas systems were active against phages and plasmids.
A type III secretion system (T3SS) was present in 12 V. cholerae strains on a 68-kb island inserted at the same tRNA-serine insertion site as VPI-2 and contained the same recombination module. Bioinformatics analysis showed that two divergent T3SSs exist among the strains examined. Both the CRISPR and T3SS islands excised site specifically from the bacterial chromosome as complete units, and the cognate integrases were essential for this excision. These data demonstrated that identical recombination modules that catalyze integration and excision from the chromosome can acquire diverse cargo genes, signifying a novel method of acquisition for both CRISPR-Cas systems and T3SSs.
UR - http://jb.asm.org/lookup/doi/10.1128/JB.00842-16
CP - 10
J1 - J. Bacteriol.
M3 - 10.1128/JB.00842-16
ER -
TY - JOUR
T1 - Genomic Methods and Microbiological Technologies for Profiling Novel and Extreme Environments for the Extreme Microbiome Project (XMP)
JF - Journal of Biomolecular Techniques : JBT
Y1 - 2017
A1 - Tighe, Scott
A1 - Afshinnekoo, Ebrahim
A1 - Rock, Tara M.
A1 - McGrath, Ken
A1 - Alexander, Noah
A1 - McIntyre, Alexa
A1 - Ahsanuddin, Sofia
A1 - Bezdan, Daniela
A1 - Green, Stefan J.
A1 - Joye, Samantha
A1 - Stewart Johnson, Sarah
A1 - Baldwin, Don A.
A1 - Bivens, Nathan
A1 - Ajami, Nadim
A1 - Carmical, Joseph R.
A1 - Herriott, Ian Charold
A1 - Colwell, Rita R.
A1 - Donia, Mohamed
A1 - Foox, Jonathan
A1 - Greenfield, Nick
A1 - Hunter, Tim
A1 - Hoffman, Jessica
A1 - Hyman, Joshua
A1 - Jorgensen, Ellen
A1 - Krawczyk, Diana
A1 - Lee, Jodie
A1 - Levy, Shawn
A1 - Garcia-Reyero, Natàlia
A1 - Settles, Matthew
A1 - Thomas, Kelley
A1 - Gómez, Felipe
A1 - Schriml, Lynn
A1 - Kyrpides, Nikos
A1 - Zaikova, Elena
A1 - Penterman, Jon
A1 - Mason, Christopher E.
AB - The Extreme Microbiome Project (XMP) is a project launched by the Association of Biomolecular Resource Facilities Metagenomics Research Group (ABRF MGRG) that focuses on whole genome shotgun sequencing of extreme and unique environments using a wide variety of biomolecular techniques. The goals are multifaceted, including development and refinement of new techniques for the following: 1) the detection and characterization of novel microbes, 2) the evaluation of nucleic acid techniques for extremophilic samples, and 3) the identification and implementation of the appropriate bioinformatics pipelines. Here, we highlight the different ongoing projects that we have been working on, as well as details on the various methods we use to characterize the microbiome and metagenome of these complex samples. In particular, we present data from a novel multienzyme extraction protocol that we developed, called Polyzyme or MetaPolyZyme. Presently, the XMP is characterizing sample sites around the world with the intent of discovering new species, genes, and gene clusters. Once a project site is complete, the resulting data will be publicly available. Sites include Lake Hillier in Western Australia, the “Door to Hell” crater in Turkmenistan, deep ocean brine lakes of the Gulf of Mexico, deep ocean sediments from Greenland, permafrost tunnels in Alaska, ancient microbial biofilms from Antarctica, the Blue Lagoon in Iceland, Ethiopian toxic hot springs, and the acidic hypersaline ponds in Western Australia.
VL - 28
UR - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5345951/
CP - 1
J1 - J Biomol Tech
M3 - 10.7171/jbt.17-2801-004
ER -
TY - JOUR
T1 - The microbiomes of blowflies and houseflies as bacterial transmission reservoirs
JF - Scientific Reports
Y1 - 2017
A1 - Junqueira, AC
A1 - Ratan, Aakrosh
A1 - Acerbi, Enzo
A1 - Drautz-Moses, Daniela I.
A1 - Premkrishnan, BNV
A1 - Costea, PI
A1 - Linz, Bodo
A1 - Purbojati, Rikky W.
A1 - Paulo, Daniel F.
A1 - Gaultier, Nicolas E.
A1 - Subramanian, Poorani
A1 - Hasan, Nur A.
A1 - Colwell, Rita R.
A1 - Bork, Peer
A1 - Azeredo-Espin, Ana Maria L.
A1 - Bryant, Donald A.
A1 - Schuster, Stephan C.
AB - Blowflies and houseflies are mechanical vectors inhabiting synanthropic environments around the world. They feed and breed in fecal and decaying organic matter, but the microbiome they harbour and transport is largely uncharacterized. We sampled 116 individual houseflies and blowflies from varying habitats on three continents and subjected them to high-coverage, whole-genome shotgun sequencing. This allowed for genomic and metagenomic analyses of the host-associated microbiome at the species level. Both fly host species segregate based on principal coordinate analysis of their microbial communities, but they also show an overlapping core microbiome. Legs and wings displayed the largest microbial diversity and were shown to be an important route for microbial dispersion. The environmental sequencing approach presented here detected a stochastic distribution of human pathogens, such as Helicobacter pylori, thereby demonstrating the potential of flies as proxies for environmental and public health surveillance.
UR - http://www.nature.com/articles/s41598-017-16353-x
CP - 1
J1 - Sci Rep
M3 - 10.1038/s41598-017-16353-x
ER -
TY - JOUR
T1 - Enrichment dynamics of Listeria monocytogenes and the associated microbiome from naturally contaminated ice cream linked to a listeriosis outbreak
JF - BMC Microbiology
Y1 - 2016
A1 - Ottesen, Andrea
A1 - Ramachandran, Padmini
A1 - Reed, Elizabeth
A1 - White, James R.
A1 - Hasan, Nur
A1 - Subramanian, Poorani
A1 - Ryan, Gina
A1 - Jarvis, Karen
A1 - Grim, Christopher
A1 - Daquiqan, Ninalynn
A1 - Hanes, Darcy
A1 - Allard, Marc
A1 - Colwell, Rita R.
A1 - Brown, Eric
A1 - Chen, Yi
UR - http://bmcmicrobiol.biomedcentral.com/articles/10.1186/s12866-016-0894-1
J1 - BMC Microbiol
M3 - 10.1186/s12866-016-0894-1
ER -
TY - JOUR
T1 - Deep-sea hydrothermal vent bacteria related to human pathogenic Vibrio species
JF - Proceedings of the National Academy of Sciences
Y1 - 2015
A1 - Hasan, Nur A.
A1 - Grim, Christopher J.
A1 - Lipp, Erin K.
A1 - Rivera, Irma N. G.
A1 - Chun, Jongsik
A1 - Haley, Bradd J.
A1 - Taviani, Elisa
A1 - Choi, Seon Young
A1 - Hoq, Mozammel
A1 - Munk, A. Christine
A1 - Brettin, Thomas S.
A1 - Bruce, David
A1 - Challacombe, Jean F.
A1 - Detter, J. Chris
A1 - Han, Cliff S.
A1 - Eisen, Jonathan A.
A1 - Huq, Anwar
A1 - Colwell, Rita R.
AB - Vibrio species are both ubiquitous and abundant in marine coastal waters, estuaries, ocean sediment, and aquaculture settings worldwide. We report here the isolation, characterization, and genome sequence of a novel Vibrio species, Vibrio antiquarius, isolated from a mesophilic bacterial community associated with hydrothermal vents located along the East Pacific Rise, near the southwest coast of Mexico. Genomic and phenotypic analysis revealed V. antiquarius is closely related to pathogenic Vibrio species, namely Vibrio alginolyticus, Vibrio parahaemolyticus, Vibrio harveyi, and Vibrio vulnificus, but sufficiently divergent to warrant a separate species status. The V. antiquarius genome encodes genes and operons with ecological functions relevant to the environmental conditions of the deep sea and also harbors factors known to be involved in human disease caused by freshwater, coastal, and brackish water vibrios. The presence of virulence factors in this deep-sea Vibrio species suggests a far more fundamental role of these factors for their bacterial host.
Comparative genomics revealed a variety of genomic events that may have provided an important driving force in V. antiquarius evolution, facilitating response to environmental conditions of the deep sea.
UR - http://www.pnas.org/lookup/doi/10.1073/pnas.1503928112
CP - 21
J1 - Proc Natl Acad Sci USA
M3 - 10.1073/pnas.1503928112
ER -
TY - JOUR
T1 - Environmental Surveillance for Toxigenic Vibrio cholerae in Surface Waters of Haiti
JF - The American Journal of Tropical Medicine and Hygiene
Y1 - 2015
A1 - Hill, Vincent R.
A1 - Humphrys, Michael S.
A1 - Kahler, Amy M.
A1 - Boncy, Jacques
A1 - Tarr, Cheryl L.
A1 - Huq, Anwar
A1 - Chen, Arlene
A1 - Katz, Lee S.
A1 - Mull, Bonnie J.
A1 - Derado, Gordana
A1 - Haley, Bradd J.
A1 - Freeman, Nicole
A1 - Colwell, Rita R.
A1 - Turnsek, Maryann
AB - Epidemic cholera was reported in Haiti in 2010, with no information available on the occurrence or geographic distribution of toxigenic Vibrio cholerae in Haitian waters. In a series of field visits conducted in Haiti between 2011 and 2013, water and plankton samples were collected at 19 sites. Vibrio cholerae was detected using culture, polymerase chain reaction, and direct fluorescent antibody–direct viable count (DFA-DVC) methods. Cholera toxin genes were detected by polymerase chain reaction in broth enrichments of samples collected in all visits except March 2012. Toxigenic V. cholerae was isolated from river water in 2011 and 2013. Whole genome sequencing revealed that these isolates were a match to the outbreak strain. The DFA-DVC tests were positive for V. cholerae O1 in plankton samples collected from multiple sites. Results of this survey show that toxigenic V. cholerae could be recovered from surface waters in Haiti more than 2 years after the onset of the epidemic.
VL - 92
UR - http://www.ajtmh.org/content/journals/10.4269/ajtmh.13-0601
CP - 1
M3 - 10.4269/ajtmh.13-0601
ER -
TY - JOUR
T1 - A unified initiative to harness Earth's microbiomes
JF - Science
Y1 - 2015
A1 - Alivisatos, A. P.
A1 - Blaser, M. J.
A1 - Brodie, E. L.
A1 - Chun, M.
A1 - Dangl, J. L.
A1 - Donohue, T. J.
A1 - Dorrestein, P. C.
A1 - Gilbert, J. A.
A1 - Green, J. L.
A1 - Jansson, J. K.
A1 - Knight, R.
A1 - Maxon, M. E.
A1 - McFall-Ngai, M. J.
A1 - Miller, J. F.
A1 - Pollard, K. S.
A1 - Ruby, E. G.
A1 - Taha, S. A.
A1 - Colwell, Rita R.
UR - http://www.sciencemag.org/cgi/doi/10.1126/science.aac8480
CP - 6260
J1 - Science
M3 - 10.1126/science.aac8480
ER -
TY - CHAP
T1 - Can Optimally-Fair Coin Tossing Be Based on One-Way Functions?
T2 - Theory of Cryptography
Y1 - 2014
A1 - Dachman-Soled, Dana
A1 - Mahmoody, Mohammad
A1 - Malkin, Tal
ED - Lindell, Yehuda
KW - Algorithm Analysis and Problem Complexity
KW - black-box separations
KW - Coin-Tossing
KW - Computation by Abstract Devices
KW - Data Encryption
KW - Discrete Mathematics in Computer Science
KW - One-Way Functions
KW - Systems and Data Security
AB - Coin tossing is a basic cryptographic task that allows two distrustful parties to obtain an unbiased random bit in a way that neither party can bias the output by deviating from the protocol or halting the execution. Cleve [STOC’86] showed that in any r-round coin tossing protocol one of the parties can bias the output by Ω(1/r) through a “fail-stop” attack; namely, they simply execute the protocol honestly and halt at some chosen point. In addition, relying on an earlier work of Blum [COMPCON’82], Cleve presented an r-round protocol based on one-way functions that was resilient to bias of at most O(1/√r). Cleve’s work left open whether “optimally-fair” coin tossing (i.e., achieving bias O(1/r) in r rounds) is possible. Recently, Moran, Naor, and Segev [TCC’09] showed how to construct optimally-fair coin tossing based on oblivious transfer; however, it was left open to find the minimal assumptions necessary for optimally-fair coin tossing. The work of Dachman-Soled et al.
[TCC’11] took a step toward answering this question by showing that any black-box construction of optimally-fair coin tossing based on one-way functions with n-bit input and output needs Ω(n/log n) rounds. In this work we take another step towards understanding the complexity of optimally-fair coin tossing by showing that this task (with an arbitrary number of rounds) cannot be based on one-way functions in a black-box way, as long as the protocol is “oblivious” to the implementation of the one-way function. Namely, we consider a natural class of black-box constructions based on one-way functions, called function oblivious, in which the output of the protocol does not depend on the specific implementation of the one-way function and only depends on the randomness of the parties. Other than being a natural notion on its own, the known coin tossing protocols of Blum and Cleve (both based on one-way functions) are indeed function oblivious. Thus, we believe our lower bound for function-oblivious constructions is a meaningful step towards resolving the fundamental open question of the complexity of optimally-fair coin tossing.
JA - Theory of Cryptography
T3 - Lecture Notes in Computer Science
PB - Springer Berlin Heidelberg
SN - 978-3-642-54241-1, 978-3-642-54242-8
UR - http://link.springer.com/chapter/10.1007/978-3-642-54242-8_10
ER -
TY - JOUR
T1 - Microbial Community Profiling of Human Saliva Using Shotgun Metagenomic Sequencing
JF - PLoS ONE
Y1 - 2014
A1 - Hasan, Nur A.
A1 - Young, Brian A.
A1 - Minard-Smith, Angela T.
A1 - Saeed, Kelly
A1 - Li, Huai
A1 - Heizer, Esley M.
A1 - McMillan, Nancy J.
A1 - Isom, Richard
A1 - Abdullah, Abdul Shakur
A1 - Bornman, Daniel M.
A1 - Faith, Seth A.
A1 - Choi, Seon Young
A1 - Dickens, Michael L.
A1 - Cebula, Thomas A.
A1 - Colwell, Rita R.
ED - Ahmed, Niyaz
AB - Human saliva is clinically informative of both oral and general health.
Since next-generation shotgun sequencing (NGS) is now widely used to identify and quantify bacteria, we investigated the bacterial flora of saliva microbiomes of two healthy volunteers and five datasets from the Human Microbiome Project, along with a control dataset containing short NGS reads from bacterial species representative of the bacterial flora of human saliva. GENIUS, a system designed to identify and quantify bacterial species using unassembled short NGS reads, was used to identify the bacterial species comprising the microbiomes of the saliva samples and datasets. Results, achieved within minutes and at greater than 90% accuracy, showed more than 175 bacterial species comprised the bacterial flora of human saliva, including bacteria known to be commensal human flora but also Haemophilus influenzae, Neisseria meningitidis, Streptococcus pneumoniae, and Gammaproteobacteria. Basic Local Alignment Search Tool (BLASTn) analysis run in parallel reported ca. five times more species than those actually comprising the in silico sample. Both GENIUS and BLAST analyses of saliva samples identified major genera comprising the bacterial flora of saliva, but GENIUS provided a more precise description of species composition, identifying to the strain level in most cases, and delivered results at least 10,000 times faster. Therefore, GENIUS offers a facile and accurate system for identification and quantification of bacterial species and/or strains in metagenomic samples.
UR - https://dx.plos.org/10.1371/journal.pone.0097699
CP - 5
J1 - PLoS ONE
M3 - 10.1371/journal.pone.0097699
ER -
TY - CHAP
T1 - On Minimal Assumptions for Sender-Deniable Public Key Encryption
T2 - Public-Key Cryptography – PKC 2014
Y1 - 2014
A1 - Dachman-Soled, Dana
ED - Krawczyk, Hugo
KW - Algorithm Analysis and Problem Complexity
KW - black-box separation
KW - Coding and Information Theory
KW - Data Encryption
KW - sender-deniable encryption
KW - simulatable PKE
KW - Systems and Data Security
AB - The primitive of deniable encryption was introduced by Canetti et al. (CRYPTO, 1997). Deniable encryption is an encryption scheme with the added feature that after transmitting a message m, both sender and receiver may produce random coins showing that the transmitted ciphertext was an encryption of any message m′ in the message space. Deniable encryption is a key tool for constructing incoercible protocols, since it allows a party to send one message and later provide apparent evidence to a coercer that a different message was sent. In addition, deniable encryption may be used to obtain adaptively-secure multiparty computation (MPC) protocols and is secure under selective-opening attacks. Different flavors such as sender-deniable and receiver-deniable encryption, where only the sender or receiver produce fake random coins, have been considered. Recently, over 15 years after the primitive was first introduced, Sahai and Waters (IACR Cryptology ePrint Archive, 2013), gave the first construction of sender-deniable encryption schemes with super-polynomial security, where an adversary has negligible advantage in distinguishing real and fake openings. Their construction is based on the construction of an indistinguishability obfuscator for general programs recently introduced in a breakthrough result of Garg et al. (FOCS, 2013).
Although feasibility has now been demonstrated, the question of determining the minimal assumptions necessary for sender-deniable encryption with super-polynomial security remains open. The primitive of simulatable public key encryption (PKE), introduced by Damgård and Nielsen (CRYPTO, 2000), is a public key encryption scheme with additional properties that allow oblivious sampling of public keys and ciphertexts. It is one of the low-level primitives used to construct adaptively-secure MPC protocols and was used by O’Neill et al. in their construction of bi-deniable encryption in the multi-distributional model (CRYPTO, 2011). Moreover, the original construction of sender-deniable encryption with polynomial security given by Canetti et al. can be instantiated with simulatable PKE. Thus, a natural question to ask is whether it is possible to construct sender-deniable encryption with super-polynomial security from simulatable PKE. In this work, we investigate the possibility of constructing sender-deniable public key encryption from simulatable PKE in a black-box manner. We show that there is no black-box construction of sender-deniable public key encryption with super-polynomial security from simulatable PKE. This indicates that improving on the original construction of Canetti et al. requires the use of non-black-box techniques, stronger assumptions, or interaction, thus giving some evidence that strong assumptions such as those used by Sahai and Waters are necessary. 
JA - Public-Key Cryptography – PKC 2014
T3 - Lecture Notes in Computer Science
PB - Springer Berlin Heidelberg
SN - 978-3-642-54630-3, 978-3-642-54631-0
UR - http://link.springer.com/chapter/10.1007/978-3-642-54631-0_33
ER -
TY - JOUR
T1 - Occurrence in Mexico, 1998–2008, of Vibrio cholerae CTX + El Tor carrying an additional truncated CTX prophage
JF - Proceedings of the National Academy of Sciences
Y1 - 2014
A1 - Alam, Munirul
A1 - Rashed, Shah M
A1 - Mannan, Shahnewaj Bin
A1 - Islam, Tarequl
A1 - Lizarraga-Partida, Marcial L.
A1 - Delgado, Gabriela
A1 - Morales-Espinosa, Rosario
A1 - Mendez, Jose Luis
A1 - Navarro, Armando
A1 - Watanabe, Haruo
A1 - Ohnishi, Makoto
A1 - Hasan, Nur A.
A1 - Huq, Anwar
A1 - Sack, R. Bradley
A1 - Colwell, Rita R.
A1 - Cravioto, Alejandro
VL - 111
UR - http://www.pnas.org/lookup/doi/10.1073/pnas.1323408111
CP - 27
J1 - Proc Natl Acad Sci USA
M3 - 10.1073/pnas.1323408111
ER -
TY - CHAP
T1 - Adaptive and Concurrent Secure Computation from New Adaptive, Non-malleable Commitments
T2 - Advances in Cryptology - ASIACRYPT 2013
Y1 - 2013
A1 - Dachman-Soled, Dana
A1 - Malkin, Tal
A1 - Raykova, Mariana
A1 - Venkitasubramaniam, Muthuramakrishnan
ED - Sako, Kazue
ED - Sarkar, Palash
KW - Algorithm Analysis and Problem Complexity
KW - Applications of Mathematics
KW - Data Encryption
KW - Discrete Mathematics in Computer Science
KW - Management of Computing and Information Systems
KW - Systems and Data Security
AB - We present a unified approach for obtaining general secure computation that achieves adaptive-Universally Composable (UC)-security. Using our approach we essentially obtain all previous results on adaptive concurrent secure computation, both in relaxed models (e.g., quasi-polynomial time simulation), as well as trusted setup models (e.g., the CRS model, the imperfect CRS model).
This provides conceptual simplicity and insight into what is required for adaptive and concurrent security, as well as yielding improvements to set-up assumptions and/or computational assumptions in known models. Additionally, we provide the first constructions of concurrent secure computation protocols that are adaptively secure in the timing model, and the non-uniform simulation model. As a corollary we also obtain the first adaptively secure multiparty computation protocol in the plain model that is secure under bounded-concurrency. Conceptually, our approach can be viewed as an adaptive analogue to the recent work of Lin, Pass and Venkitasubramaniam [STOC ‘09], who considered only non-adaptive adversaries. Their main insight was that the non-malleability requirement could be decoupled from the simulation requirement to achieve UC-security. A main conceptual contribution of this work is, quite surprisingly, that it is still the case even when considering adaptive security. A key element in our construction is a commitment scheme that satisfies a strong definition of non-malleability. Our new primitive of concurrent equivocal non-malleable commitments, intuitively, guarantees that even when a man-in-the-middle adversary observes concurrent equivocal commitments and decommitments, the binding property of the commitments continues to hold for commitments made by the adversary. This definition is stronger than previous ones, and may be of independent interest. Previous constructions that satisfy our definition have been constructed in setup models, but either require existence of stronger encryption schemes such as CCA-secure encryption or require independent “trapdoors” provided by the setup for every pair of parties to ensure non-malleability. A main technical contribution of this work is to provide a construction that eliminates these requirements and requires only a single trapdoor. 
JA - Advances in Cryptology - ASIACRYPT 2013
T3 - Lecture Notes in Computer Science
PB - Springer Berlin Heidelberg
SN - 978-3-642-42032-0, 978-3-642-42033-7
UR - http://link.springer.com/chapter/10.1007/978-3-642-42033-7_17
ER -
TY - CONF
T1 - Age-Related Differences in Performance with Touchscreens Compared to Traditional Mouse Input
T2 - CHI 2013 To Appear
Y1 - 2013
A1 - Findlater, L.
A1 - Froehlich, Jon
A1 - Fattal, K.
A1 - Wobbrock, J.O.
A1 - Dastyar, T.
AB - Despite the apparent popularity of touchscreens for older adults, little is known about older adults' psychomotor performance with these devices. We compare performance between older adults and younger adults in four desktop and touchscreen tasks: pointing, dragging, crossing and steering. On the touchscreen, we also examine pinch zoom. Our results show that while older adults were significantly slower than younger adults in general, the touchscreen reduced the performance gap relative to the desktop and mouse. Indeed, the touchscreen resulted in a significant movement time reduction of 35% over the mouse for older adults, compared to only 16% for younger adults. Error rates also decreased.
JA - CHI 2013 To Appear
ER -
TY - JOUR
T1 - Clutter noise removal in binary document images
JF - International Journal on Document Analysis and Recognition (IJDAR)
Y1 - 2013
A1 - Agrawal, Mudit
A1 - Doermann, David
AB - The paper presents a clutter detection and removal algorithm for complex document images. This distance transform based technique aims to remove irregular and independent unwanted clutter while preserving the text content. The novelty of this approach is in its ...
PB - Springer Berlin Heidelberg
VL - 16
UR - http://link.springer.com/article/10.1007/s10032-012-0196-6/fulltext.html
CP - 4
J1 - IJDAR
M3 - 10.1007/s10032-012-0196-6
ER -
TY - BOOK
T1 - Current Topics in Microbiology and Immunology One Health: The Human-Animal-Environment Interfaces in Emerging Infectious Diseases The Human Environment Interface: Applying Ecosystem Concepts to Health
Y1 - 2013
A1 - Preston, Nicholas D.
A1 - Daszak, Peter
A1 - Colwell, Rita R.
ED - Mackenzie, John S.
ED - Jeggo, Martyn
ED - Daszak, Peter
ED - Richt, Juergen A.
PB - Springer Berlin Heidelberg
CY - Berlin, Heidelberg
VL - 365
SN - 978-3-642-36888-2
UR - http://link.springer.com/10.1007/978-3-642-36889-9
M3 - 10.1007/978-3-642-36889-9
ER -
TY - JOUR
T1 - A Dataset for Quality Assessment of Camera Captured Document Images
Y1 - 2013
A1 - Kumar, J
A1 - Ye, P
A1 - Doermann, David
AB - With the proliferation of cameras on mobile devices there is an increased desire to image document pages as an alternative to scanning. However, the quality of captured document images is often lower than that of their scanned equivalents due to hardware limitations ...
UR - http://lampsrv02.umiacs.umd.edu/pubs/Papers/kumar-data-13/kumar-data-13.pdf
ER -
TY - CONF
T1 - Document Image Quality Assessment: A Brief Survey
T2 - International Conference on Document Analysis and Recognition (ICDAR)
Y1 - 2013
A1 - Ye, Peng
A1 - Doermann, David
JA - International Conference on Document Analysis and Recognition (ICDAR)
ER -
TY - JOUR
T1 - Engaging Actively with Issues in the Responsible Conduct of Science: Lessons from International Efforts Are Relevant for Undergraduate Education in the United States
JF - CBE—Life Sciences Education
Y1 - 2013
A1 - Clements, John D.
A1 - Connell, Nancy D.
A1 - Dirks, Clarissa
A1 - El-Faham, Mohamed
A1 - Hay, Alastair
A1 - Heitman, Elizabeth
A1 - Stith, James H.
A1 - Bond, Enriqueta C.
A1 - Colwell, Rita R.
A1 - Anestidou, Lida
A1 - Husbands, Jo L.
A1 - Labov, Jay B.
AB - Numerous studies are demonstrating that engaging undergraduate students in original research can improve their achievement in the science, technology, engineering, and mathematics (STEM) fields and increase the likelihood that some of them will decide to pursue careers in these disciplines. Associated with this increased prominence of research in the undergraduate curriculum are greater expectations from funders, colleges, and universities that faculty mentors will help those students, along with their graduate students and postdoctoral fellows, develop an understanding and sense of personal and collective obligation for responsible conduct of science (RCS). This Feature describes an ongoing National Research Council (NRC) project and a recent report about educating faculty members in culturally diverse settings (Middle East/North Africa and Asia) to employ active-learning strategies to engage their students and colleagues deeply in issues related to RCS. The NRC report describes the first phase of this project, which took place in Aqaba and Amman, Jordan, in September 2012 and April 2013, respectively. Here we highlight the findings from that report and our subsequent experience with a similar interactive institute in Kuala Lumpur, Malaysia. Our work provides insights and perspectives for faculty members in the United States as they engage undergraduate and graduate students, as well as postdoctoral fellows, to help them better understand the intricacies of and connections among various components of RCS. Further, our experiences can provide insights for those who may wish to establish “train-the-trainer” programs at their home institutions. 
UR - https://www.lifescied.org/doi/10.1187/cbe.13-09-0184
CP - 4
J1 - LSE
M3 - 10.1187/cbe.13-09-0184
ER -
TY - CONF
T1 - Large-Scale Signature Matching Using Multi-stage Hashing
T2 - Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Y1 - 2013
A1 - Du, Xianzhi
A1 - Abdalmageed, W
A1 - Doermann, David
AB - In this paper, we propose a fast large-scale signature matching method based on locality sensitive hashing (LSH). Shape Context features are used to describe the structure of signatures. Two stages of hashing are performed to find the nearest neighbours for query signatures. In the first stage, we use M randomly generated hyperplanes to separate shape context feature points into different bins and compute a term-frequency histogram to represent the feature point distribution as a feature vector. In the second stage, we again use LSH to categorize the high-level features into different classes. The experiments are carried out on two datasets: DS-I, a small dataset containing 189 signatures, and DS-II, a large dataset created by our group containing 26,000 signatures. We show that our algorithm achieves high accuracy even when few signatures are collected from the same person, and performs fast matching when dealing with a large dataset.
JA - Document Analysis and Recognition (ICDAR), 2013 12th International Conference on PB - IEEE SN - 978-0-7695-4999-6 UR - http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6628762 M3 - 10.1109/ICDAR.2013.197 ER - TY - CONF T1 - Real-time No-Reference Image Quality Assessment based on Filter Learning T2 - International Conference on Computer Vision and Pattern Recognition (CVPR) Y1 - 2013 A1 - Ye,Peng A1 - Kumar,Jayant A1 - Kang,Le A1 - David Doermann JA - International Conference on Computer Vision and Pattern Recognition (CVPR) ER - TY - JOUR T1 - Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus Y1 - 2013 A1 - Ye, Q A1 - David Doermann AB - In this paper, we propose an approach to scene text detection that leverages both the appearance and consensus of connected components. Component appearance is modeled with an SVM based dictionary classifier and the component consensus is ... UR - http://lampsrv02.umiacs.umd.edu/pubs/Papers/qixiangye-13/qixiangye-13.pdf ER - TY - JOUR T1 - Segmenting time-lapse phase contrast images of adjacent NIH 3T3 cells. JF - Journal of microscopy Y1 - 2013 A1 - Chalfoun, J A1 - Kociolek, M A1 - Dima, A A1 - Halter, M A1 - Cardone, Antonio A1 - Peskin, A A1 - Bajcsy, P A1 - Brady, M. KW - Animals KW - Cell Adhesion KW - Cell Count KW - Cell Division KW - Cell Shape KW - Computational Biology KW - Fibroblasts KW - Image Processing, Computer-Assisted KW - Mice KW - Microscopy, Phase-Contrast KW - NIH 3T3 Cells KW - Reproducibility of results KW - Sensitivity and Specificity KW - Time-Lapse Imaging AB - We present a new method for segmenting phase contrast images of NIH 3T3 fibroblast cells that is accurate even when cells are physically in contact with each other. The problem of segmentation, when cells are in contact, poses a challenge to the accurate automation of cell counting, tracking and lineage modelling in cell biology. 
The segmentation method presented in this paper consists of (1) background reconstruction to obtain noise-free foreground pixels and (2) incorporation of biological insight about dividing and nondividing cells into the segmentation process to achieve reliable separation of foreground pixels defined as pixels associated with individual cells. The segmentation results for a time-lapse image stack were compared against 238 manually segmented images (8219 cells) provided by experts, which we consider as reference data. We chose two metrics to measure the accuracy of segmentation: the 'Adjusted Rand Index', which compares similarities at a pixel level between masks resulting from manual and automated segmentation, and the 'Number of Cells per Field' (NCF), which compares the number of cells identified in the field by manual versus automated analysis. Our results show that the automated segmentation compared to manual segmentation has an average Adjusted Rand Index of 0.96 (1 being a perfect match), with a standard deviation of 0.03, and an average difference in NCF of 5.39%, with a standard deviation of 4.6%. VL - 249 CP - 1 U1 - http://www.ncbi.nlm.nih.gov/pubmed/23126432?dopt=Abstract M3 - 10.1111/j.1365-2818.2012.03678.x ER - TY - JOUR T1 - Structural similarity for document image classification and retrieval JF - Pattern Recognition Letters Y1 - 2013 A1 - Kumar, Jayant A1 - Ye, Peng A1 - David Doermann AB - This paper presents a novel approach to defining document image structural similarity for the applications of classification and retrieval. We first build a codebook of SURF descriptors extracted from a set of representative training images. We then encode ... 
UR - http://linkinghub.elsevier.com/retrieve/pii/S0167865513004224 J1 - Pattern Recognition Letters M3 - 10.1016/j.patrec.2013.10.030 ER - TY - JOUR T1 - A study of unpredictability in fault-tolerant middleware JF - Computer Networks Y1 - 2013 A1 - Tudor Dumitras A1 - Narasimhan, Priya KW - Fault tolerance KW - latency KW - Middleware KW - Remote procedure call KW - Unpredictability AB - In enterprise applications relying on fault-tolerant middleware, it is a common engineering practice to establish service-level agreements (SLAs) based on the 95th or the 99th percentiles of the latency, to allow a margin for unexpected variability. However, the extent of this unpredictability has not been studied systematically. We present an extensive empirical study of unpredictability in 16 distributed systems, ranging from simple transport protocols to fault-tolerant, middleware-based enterprise applications, and we show that the inherent unpredictability in the systems examined arises from at most 1% of the remote invocations. In the normal, fault-free operating mode most remote invocations have a predictable end-to-end latency, but the maximum latency follows unpredictable trends and is comparable with the time needed to recover from a fault. The maximum latency is not influenced by the system’s workload, cannot be regulated through configuration parameters and is not correlated with the system’s resource consumption. The high-latency outliers (up to three orders of magnitude higher than the average latency) have multiple causes and may originate in any component of the system. However, after filtering out 1% of the invocations with the highest recorded response-times, the latency becomes bounded with high statistical confidence (p < 0.01). 
We have verified this result on different operating systems (Linux 2.4, Linux 2.6, Linux-rt, TimeSys), middleware platforms (CORBA and EJB), programming languages (C, C++ and Java), replication styles (active and warm passive) and applications (e-commerce and online gaming). Moreover, this phenomenon occurs at all the layers of middleware-based systems, from the communication protocols to the business logic. VL - 57 SN - 1389-1286 UR - http://www.sciencedirect.com/science/article/pii/S1389128612003696 CP - 3 J1 - Computer Networks ER - TY - CONF T1 - Unsupervised Classification of Structurally Similar Document Images T2 - International Conference on Document Analysis and Recognition Y1 - 2013 A1 - Kumar,Jayant A1 - David Doermann JA - International Conference on Document Analysis and Recognition ER - TY - CHAP T1 - Why “Fiat-Shamir for Proofs” Lacks a Proof T2 - Theory of Cryptography Y1 - 2013 A1 - Bitansky, Nir A1 - Dana Dachman-Soled A1 - Garg, Sanjam A1 - Jain, Abhishek A1 - Kalai, Yael Tauman A1 - López-Alt, Adriana A1 - Wichs, Daniel ED - Sahai, Amit KW - Algorithm Analysis and Problem Complexity KW - Computation by Abstract Devices KW - Data Encryption KW - Systems and Data Security AB - The Fiat-Shamir heuristic [CRYPTO ’86] is used to convert any 3-message public-coin proof or argument system into a non-interactive argument, by hashing the prover’s first message to select the verifier’s challenge. It is known that this heuristic is sound when the hash function is modeled as a random oracle. On the other hand, the surprising result of Goldwasser and Kalai [FOCS ’03] shows that there exists a computationally sound argument on which the Fiat-Shamir heuristic is never sound, when instantiated with any actual efficient hash function. 
This leaves us with the following interesting possibility: perhaps we can securely instantiate the Fiat-Shamir heuristic for all 3-message public-coin statistically sound proofs, even if we must fail for some computationally sound arguments. Indeed, this has been conjectured to be the case by Barak, Lindell and Vadhan [FOCS ’03], but we do not have any provably secure instantiation under any “standard assumption”. In this work, we give a broad black-box separation result showing that the security of the Fiat-Shamir heuristic for statistically sound proofs cannot be proved under virtually any standard assumption via a black-box reduction. More precisely: –If we want to have a “universal” instantiation of the Fiat-Shamir heuristic that works for all 3-message public-coin proofs, then we cannot prove its security via a black-box reduction from any assumption that has the format of a “cryptographic game”. –For many concrete proof systems, if we want to have a “specific” instantiation of the Fiat-Shamir heuristic for that proof system, then we cannot prove its security via a black-box reduction from any “falsifiable assumption” that has the format of a cryptographic game with an efficient challenger. JA - Theory of Cryptography T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-36593-5, 978-3-642-36594-2 UR - http://link.springer.com/chapter/10.1007/978-3-642-36594-2_11 ER - TY - CONF T1 - Ask WINE: Are We Safer Today? Evaluating Operating System Security Through Big Data Analysis T2 - LEET'12 Proceedings of the 5th USENIX conference on Large-Scale Exploits and Emergent Threats Y1 - 2012 A1 - Tudor Dumitras A1 - Efstathopoulos, Petros AB - The Internet can be a dangerous place: 800,000 new malware variants are detected each day, and this number is growing at an exponential rate--driven by the quest for economic gains. 
However, over the past ten years operating-system vendors have introduced a number of security technologies that aim to make exploits harder and to reduce the attack surface of the platform. Faced with these two conflicting trends, it is difficult for end-users to determine what techniques make them safer from Internet attacks. In this position paper, we argue that to answer this question conclusively we must analyze field data collected on real hosts that are targeted by attacks--e.g., the approximately 50 million records of anti-virus telemetry available through Symantec's WINE platform. Such studies can characterize the factors that drive the production of malware, can help us understand the impact of security technologies in the real world and can suggest new security metrics, derived from field observations rather than small lab experiments, indicating how susceptible to attacks a computing platform may be. JA - LEET'12 Proceedings of the 5th USENIX conference on Large-Scale Exploits and Emergent Threats T3 - LEET'12 PB - USENIX Association UR - http://dl.acm.org/citation.cfm?id=2228340.2228356 ER - TY - JOUR T1 - Automatic Authentication of Banknotes JF - IEEE Transactions on Information Forensics & Security Y1 - 2012 A1 - Roy,Ankush A1 - Halder,Biswaiit A1 - Garain,Utpal A1 - David Doermann AB - In this paper, we address the problem of the automatic authentication of paper currency. Indian banknotes are used to show how a system can be developed for discriminating counterfeit notes from genuine notes. Image processing and pattern recognition techniques are designed to carefully analyze embedded security features. Experiments conducted on forensic samples show that a high precision low cost machine can be developed to address this problem. The analysis of current security features’ ability to protect against counterfeiting also suggest topics that should be considered in designing future currency notes. 
ER - TY - JOUR T1 - BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics JF - Systematic Biology Y1 - 2012 A1 - Ayres,Daniel L A1 - Darling,Aaron A1 - Zwickl,Derrick J A1 - Beerli,Peter A1 - Holder,Mark T A1 - Lewis,Paul O A1 - Huelsenbeck,John P A1 - Ronquist,Fredrik A1 - Swofford,David L A1 - Cummings, Michael P. A1 - Rambaut,Andrew A1 - Suchard,Marc A KW - Bayesian phylogenetics KW - GPU KW - maximum likelihood KW - parallel computing AB - Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. 
The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software. VL - 61 SN - 1063-5157, 1076-836X UR - http://sysbio.oxfordjournals.org/content/61/1/170 CP - 1 M3 - 10.1093/sysbio/syr100 ER - TY - CONF T1 - Before We Knew It: An Empirical Study of Zero-day Attacks in the Real World T2 - CCS '12 Proceedings of the 2012 ACM conference on Computer and Communications Security Y1 - 2012 A1 - Bilge, Leyla A1 - Tudor Dumitras KW - full disclosure KW - vulnerabilities KW - zero-day attacks AB - Little is known about the duration and prevalence of zero-day attacks, which exploit vulnerabilities that have not been disclosed publicly. Knowledge of new vulnerabilities gives cyber criminals a free pass to attack any target of their choosing, while remaining undetected. Unfortunately, these serious threats are difficult to analyze, because, in general, data is not available until after an attack is discovered. Moreover, zero-day attacks are rare events that are unlikely to be observed in honeypots or in lab experiments. In this paper, we describe a method for automatically identifying zero-day attacks from field-gathered data that records when benign and malicious binaries are downloaded on 11 million real hosts around the world. Searching this data set for malicious files that exploit known vulnerabilities indicates which files appeared on the Internet before the corresponding vulnerabilities were disclosed. 
We identify 18 vulnerabilities exploited before disclosure, of which 11 were not previously known to have been employed in zero-day attacks. We also find that a typical zero-day attack lasts 312 days on average and that, after vulnerabilities are disclosed publicly, the volume of attacks exploiting them increases by up to 5 orders of magnitude. JA - CCS '12 Proceedings of the 2012 ACM conference on Computer and Communications Security T3 - CCS '12 PB - ACM SN - 978-1-4503-1651-4 UR - http://doi.acm.org/10.1145/2382196.2382284 ER - TY - CHAP T1 - On the Centrality of Off-Line E-Cash to Concrete Partial Information Games T2 - Security and Cryptography for Networks Y1 - 2012 A1 - Choi, Seung Geol A1 - Dana Dachman-Soled A1 - Yung, Moti ED - Visconti, Ivan ED - Prisco, Roberto De KW - Computer Appl. in Administrative Data Processing KW - Computer Communication Networks KW - Data Encryption KW - Management of Computing and Information Systems KW - Systems and Data Security AB - Cryptography has developed numerous protocols for solving “partial information games” that are seemingly paradoxical. Some protocols are generic (e.g., secure multi-party computation) and others, due to the importance of the scenario they represent, are designed to solve a concrete problem directly. Designing efficient and secure protocols for (off-line) e-cash, e-voting, and e-auction are some of the most heavily researched concrete problems, representing various settings where privacy and correctness of the procedure is highly important. In this work, we initiate the exploration of the relationships among e-cash, e-voting and e-auction in the universal composability (UC) framework, by considering general variants of the three problems. In particular, we first define ideal functionalities for e-cash, e-voting, and e-auction, and then give a construction of a protocol that UC-realizes the e-voting (resp., e-auction) functionality in the e-cash hybrid model. 
This (black-box) reducibility demonstrates the centrality of off-line e-cash and implies that designing a solution to e-cash may bear fruit in other areas. Constructing a solution to one protocol problem based on a second protocol problem has been traditional in cryptography, but typically has concentrated on building complex protocols on simple primitives (e.g., secure multi-party computation from Oblivious Transfer, signature from one-way functions, etc.). The novelty here is reducibility among mature protocols and using the ideal functionality as a design tool in realizing other ideal functionalities. We suggest this new approach, and we only consider the very basic general properties from the various primitives to demonstrate its viability. Namely, we only consider the basic coin e-cash model, the e-voting that is correct and private and relies on trusted registration, and e-auction relying on a trusted auctioneer. Naturally, relationships among protocols with further properties (i.e., extended functionalities), using the approach advocated herein, are left as open questions. JA - Security and Cryptography for Networks T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-32927-2, 978-3-642-32928-9 UR - http://link.springer.com/chapter/10.1007/978-3-642-32928-9_15 ER - TY - JOUR T1 - Class consistent k-means: Application to face and action recognition JF - Computer Vision and Image Understanding Y1 - 2012 A1 - Zhuolin Jiang A1 - Zhe Lin A1 - Davis, Larry S. KW - Action recognition KW - Class consistent k-means KW - Discriminative tree classifier KW - face recognition KW - Supervised clustering AB - A class-consistent k-means clustering algorithm (CCKM) and its hierarchical extension (Hierarchical CCKM) are presented for generating discriminative visual words for recognition problems. 
In addition to using the labels of training data themselves, we associate a class label with each cluster center to enforce discriminability in the resulting visual words. Our algorithms encourage data points from the same class to be assigned to the same visual word, and those from different classes to be assigned to different visual words. More specifically, we introduce a class consistency term in the clustering process which penalizes assignment of data points from different classes to the same cluster. The optimization process is efficient and bounded by the complexity of k-means clustering. A very efficient and discriminative tree classifier can be learned for various recognition tasks via the Hierarchical CCKM. The effectiveness of the proposed algorithms is validated on two public face datasets and four benchmark action datasets. VL - 116 SN - 1077-3142 UR - http://www.sciencedirect.com/science/article/pii/S1077314212000367 CP - 6 M3 - 10.1016/j.cviu.2012.02.004 ER - TY - BOOK T1 - Computational Analysis of Terrorist Groups: Lashkar-e-Taiba Y1 - 2012 A1 - V.S. Subrahmanian A1 - Mannes, Aaron A1 - Sliva,Amy A1 - Shakarian, Jana A1 - Dickerson, John P. AB - Computational Analysis of Terrorist Groups: Lashkar-e-Taiba provides an in-depth look at Web intelligence, and how advanced mathematics and modern computing technology can influence the insights we have on terrorist groups. This book primarily focuses on one famous terrorist group known as Lashkar-e-Taiba (or LeT), and how it operates. After 10 years of counter Al Qaeda operations, LeT is considered by many in the counter-terrorism community to be an even greater threat to the US and world peace than Al Qaeda. Computational Analysis of Terrorist Groups: Lashkar-e-Taiba is the first book that demonstrates how to use modern computational analysis techniques including methods for big data analysis. 
This book shows how to quantify both the environment in which LeT operates and the actions it took over a 20-year period, and how to represent them as a relational database table. This table is then mined using sophisticated data mining algorithms in order to gain detailed, mathematical, computational and statistical insights into LeT and its operations. This book also provides a detailed history of Lashkar-e-Taiba based on extensive analysis conducted by using open source information and public statements. Each chapter includes a case study, as well as a slide describing the key results, which are available on the authors' websites. Computational Analysis of Terrorist Groups: Lashkar-e-Taiba is designed for a professional market composed of government or military workers, researchers and computer scientists working in the web intelligence field. Advanced-level students in computer science will also find this valuable as a reference book. PB - Springer Publishing Company, Incorporated SN - 1461447682, 9781461447689 ER - TY - CHAP T1 - Computational Extractors and Pseudorandomness T2 - Theory of Cryptography Y1 - 2012 A1 - Dana Dachman-Soled A1 - Gennaro, Rosario A1 - Krawczyk, Hugo A1 - Malkin, Tal ED - Cramer, Ronald KW - Algorithm Analysis and Problem Complexity KW - Coding and Information Theory KW - Computer Communication Networks KW - Data Encryption KW - Math Applications in Computer Science KW - Systems and Data Security AB - Computational extractors are efficient procedures that map a source of sufficiently high min-entropy to an output that is computationally indistinguishable from uniform. By relaxing the statistical closeness property of traditional randomness extractors one hopes to improve the efficiency and entropy parameters of these extractors, while keeping their utility for cryptographic applications. 
In this work we investigate computational extractors and consider questions of existence and inherent complexity from the theoretical and practical angles, with particular focus on the relationship to pseudorandomness. An obvious way to build a computational extractor is via the “extract-then-prg” method: apply a statistical extractor and use its output to seed a PRG. This approach carries with it the entropy cost inherent to implementing statistical extractors, namely, the source entropy needs to be substantially higher than the PRG’s seed length. It also requires a PRG and thus relies on one-way functions. We study the necessity of one-way functions in the construction of computational extractors and determine matching lower and upper bounds on the “black-box efficiency” of generic constructions of computational extractors that use a one-way permutation as an oracle. Under this efficiency measure we prove a direct correspondence between the complexity of computational extractors and that of pseudorandom generators, showing the optimality of the extract-then-prg approach for generic constructions of computational extractors and confirming the intuition that to build a computational extractor via a PRG one needs to make up for the entropy gap intrinsic to statistical extractors. On the other hand, we show that with stronger cryptographic primitives one can have more entropy- and computationally-efficient constructions. In particular, we show a construction of a very practical computational extractor from any weak PRF without resorting to statistical extractors. 
JA - Theory of Cryptography T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-28913-2, 978-3-642-28914-9 UR - http://link.springer.com/chapter/10.1007/978-3-642-28914-9_22 ER - TY - JOUR T1 - Density-Based Multifeature Background Subtraction with Support Vector Machine JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2012 A1 - Han,Bohyung A1 - Davis, Larry S. KW - background subtraction KW - computer vision KW - Haar-like features KW - illumination variation KW - image segmentation KW - kernel density approximation KW - object detection KW - support vector machines AB - Background modeling and subtraction is a natural technique for object detection in videos captured by a static camera, and also a critical preprocessing step in various high-level computer vision applications. However, there have not been many studies concerning useful features and binary segmentation algorithms for this problem. We propose a pixelwise background modeling and subtraction technique using multiple features, where generative and discriminative techniques are combined for classification. In our algorithm, color, gradient, and Haar-like features are integrated to handle spatio-temporal variations for each pixel. A pixelwise generative background model is obtained for each feature efficiently and effectively by Kernel Density Approximation (KDA). 
Background subtraction is performed in a discriminative manner using a Support Vector Machine (SVM) over background likelihood vectors for a set of features. The proposed algorithm is robust to shadows, illumination changes, and spatial variations of the background. We compare the performance of the algorithm with other density-based methods using several different feature combinations and modeling techniques, both quantitatively and qualitatively. VL - 34 SN - 0162-8828 CP - 5 M3 - 10.1109/TPAMI.2011.243 ER - TY - JOUR T1 - Dimension-independent multi-resolution Morse complexes JF - Computers & Graphics Y1 - 2012 A1 - De Floriani, Leila A1 - Čomić,L. ER - TY - JOUR T1 - Efficient FMM accelerated vortex methods in three dimensions via the Lamb-Helmholtz decomposition JF - arXiv:1201.5430 Y1 - 2012 A1 - Gumerov, Nail A. A1 - Duraiswami, Ramani KW - Computer Science - Numerical Analysis KW - Mathematical Physics KW - Physics - Computational Physics KW - Physics - Fluid Dynamics AB - Vortex element methods are often used to efficiently simulate incompressible flows using Lagrangian techniques. Use of the FMM (Fast Multipole Method) allows considerable speed up of both velocity evaluation and vorticity evolution terms in these methods. Both equations require field evaluation of constrained (divergence free) vector valued quantities (velocity, vorticity) and cross terms from these. These are usually evaluated by performing several FMM accelerated sums of scalar harmonic functions. We present a formulation of the vortex methods based on the Lamb-Helmholtz decomposition of the velocity in terms of two scalar potentials. In its original form, this decomposition is not invariant with respect to translation, violating a key requirement for the FMM. One of the key contributions of this paper is a theory for translation for this representation. 
The translation theory is developed by introducing "conversion" operators, which enable the representation to be restored in an arbitrary reference frame. Using this form, extremely efficient vortex element computations can be made, which need evaluation of just two scalar harmonic FMM sums for evaluating the velocity and vorticity evolution terms. Details of the decomposition, translation and conversion formulae, and sample numerical results are presented. UR - http://arxiv.org/abs/1201.5430 ER - TY - CHAP T1 - Efficient Password Authenticated Key Exchange via Oblivious Transfer T2 - Public Key Cryptography – PKC 2012 Y1 - 2012 A1 - Canetti, Ran A1 - Dana Dachman-Soled A1 - Vaikuntanathan, Vinod A1 - Wee, Hoeteck ED - Fischlin, Marc ED - Buchmann, Johannes ED - Manulis, Mark KW - adaptive security KW - Algorithm Analysis and Problem Complexity KW - Computer Communication Networks KW - Data Encryption KW - Discrete Mathematics in Computer Science KW - Management of Computing and Information Systems KW - oblivious transfer KW - Password Authenticated Key Exchange KW - search assumptions KW - Systems and Data Security KW - UC security AB - We present a new framework for constructing efficient password authenticated key exchange (PAKE) protocols based on oblivious transfer (OT). Using this framework, we obtain: an efficient and simple UC-secure PAKE protocol that is secure against adaptive corruptions without erasures. efficient and simple PAKE protocols under the Computational Diffie-Hellman (CDH) assumption and the hardness of factoring. (Previous efficient constructions rely on hash proof systems, which appears to be inherently limited to decisional assumptions.) All of our constructions assume a common reference string (CRS) but do not rely on random oracles. 
JA - Public Key Cryptography – PKC 2012 T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-30056-1, 978-3-642-30057-8 UR - http://link.springer.com/chapter/10.1007/978-3-642-30057-8_27 ER - TY - JOUR T1 - Face Identification Using Large Feature Sets JF - Image Processing, IEEE Transactions on Y1 - 2012 A1 - Schwartz, W.R. A1 - Guo,Huimin A1 - Choi,Jonghyun A1 - Davis, Larry S. KW - face identification KW - face recognition KW - feature weighting KW - FERET KW - FRGC KW - partial least squares KW - tree-based discriminative structure KW - uncontrolled environments AB - With the goal of matching unknown faces against a gallery of known people, the face identification task has been studied for several decades. There are very accurate techniques to perform face identification in controlled environments, particularly when large numbers of samples are available for each face. However, face identification under uncontrolled environments or with a lack of training data is still an unsolved problem. We employ a large and rich set of feature descriptors (with more than 70 000 descriptors) for face identification using partial least squares to perform multichannel feature weighting. Then, we extend the method to a tree-based discriminative structure to reduce the time required to evaluate probe samples. The method is evaluated on Facial Recognition Technology (FERET) and Face Recognition Grand Challenge (FRGC) data sets. Experiments show that our identification method outperforms current state-of-the-art results, particularly for identifying faces acquired across varying conditions. 
VL - 21 SN - 1057-7149 CP - 4 M3 - 10.1109/TIP.2011.2176951 ER - TY - JOUR T1 - GAGE: A Critical Evaluation of Genome Assemblies and Assembly Algorithms JF - Genome Research Y1 - 2012 A1 - Salzberg,Steven L. A1 - Phillippy,Adam M A1 - Zimin,Aleksey A1 - Puiu,Daniela A1 - Magoc,Tanja A1 - Koren,Sergey A1 - Treangen,Todd J A1 - Schatz,Michael C A1 - Delcher,Arthur L. A1 - Roberts,Michael A1 - Marçais,Guillaume A1 - Pop, Mihai A1 - Yorke,James A. AB - New sequencing technology has dramatically altered the landscape of whole-genome sequencing, allowing scientists to initiate numerous projects to decode the genomes of previously unsequenced organisms. The lowest-cost technology can generate deep coverage of most species, including mammals, in just a few days. The sequence data generated by one of these projects consist of millions or billions of short DNA sequences (reads) that range from 50 to 150 nt in length. These sequences must then be assembled de novo before most genome analyses can begin. Unfortunately, genome assembly remains a very difficult problem, made more difficult by shorter reads and unreliable long-range linking information. In this study, we evaluated several of the leading de novo assembly algorithms on four different short-read data sets, all generated by Illumina sequencers. Our results describe the relative performance of the different assemblers as well as other significant differences in assembly difficulty that appear to be inherent in the genomes themselves. Three overarching conclusions are apparent: first, that data quality, rather than the assembler itself, has a dramatic effect on the quality of an assembled genome; second, that the degree of contiguity of an assembly varies enormously among different assemblers and different genomes; and third, that the correctness of an assembly also varies widely and is not well correlated with statistics on contiguity. 
To enable others to replicate our results, all of our data and methods are freely available, as are all assemblers used in this study. VL - 22 UR - http://genome.cshlp.org/content/22/3/557 CP - 3 M3 - 10.1101/gr.131383.111 ER - TY - JOUR T1 - Gene Prediction with Glimmer for Metagenomic Sequences Augmented by Classification and Clustering JF - Nucleic Acids Research Y1 - 2012 A1 - Kelley,David R A1 - Liu,Bo A1 - Delcher,Arthur L. A1 - Pop, Mihai A1 - Salzberg,Steven L. AB - Environmental shotgun sequencing (or metagenomics) is widely used to survey the communities of microbial organisms that live in many diverse ecosystems, such as the human body. Finding the protein-coding genes within the sequences is an important step for assessing the functional capacity of a metagenome. In this work, we developed a metagenomic gene prediction system, Glimmer-MG, that achieves significantly greater accuracy than previous systems via novel approaches to a number of important prediction subtasks. First, we introduce the use of phylogenetic classifications of the sequences to model parameterization. We also cluster the sequences, grouping together those that likely originated from the same organism. Analogous to iterative schemes that are useful for whole genomes, we retrain our models within each cluster on the initial gene predictions before making final predictions. Finally, we model both insertion/deletion and substitution sequencing errors using a different approach than previous software, allowing Glimmer-MG to change coding frame or pass through stop codons by predicting an error. In a comparison among multiple gene finding methods, Glimmer-MG makes the most sensitive and precise predictions on simulated and real metagenomes for all read lengths and error rates tested. 
VL - 40 SN - 0305-1048, 1362-4962 UR - http://nar.oxfordjournals.org/content/40/1/e9 CP - 1 M3 - 10.1093/nar/gkr1067 ER - TY - JOUR T1 - Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage. JF - ISME J Y1 - 2012 A1 - Dupont, Chris L A1 - Rusch, Douglas B A1 - Yooseph, Shibu A1 - Lombardo, Mary-Jane A1 - Richter, R Alexander A1 - Valas, Ruben A1 - Novotny, Mark A1 - Yee-Greenbaum, Joyclyn A1 - Jeremy D Selengut A1 - Haft, Dan H A1 - Halpern, Aaron L A1 - Lasken, Roger S A1 - Nealson, Kenneth A1 - Friedman, Robert A1 - Venter, J Craig KW - Computational Biology KW - Gammaproteobacteria KW - Genome, Bacterial KW - Genomic Library KW - metagenomics KW - Oceans and Seas KW - Phylogeny KW - plankton KW - Rhodopsin KW - RNA, Ribosomal, 16S KW - Seawater AB -

Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes, and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25-1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of TonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition.

VL - 6 CP - 6 M3 - 10.1038/ismej.2011.189 ER - TY - RPRT T1 - A Hybrid System for Error Detection in Electronic Dictionaries Y1 - 2012 A1 - Zajic, David A1 - David Doermann A1 - Bloodgood,Michael A1 - Rodrigues,Paul A1 - Ye,Peng A1 - Zotkina,Elena AB - A progress report on CASL’s research on error detection in electronic dictionaries, including a hybrid system, application and evaluation on a second dictionary, and a graphical user interface. JA - Technical Reports of the Center for the Advanced Study of Language ER - TY - JOUR T1 - Identification of Coli Surface Antigen 23, a Novel Adhesin of Enterotoxigenic Escherichia coli JF - Infection and Immunity Y1 - 2012 A1 - Del Canto, F. A1 - Botkin, D.J. A1 - Valenzuela, P. A1 - Popov, V. A1 - Ruiz-Perez, F. A1 - Nataro, J.P. A1 - Levine, M.M. A1 - Stine, O.C. A1 - Pop, Mihai A1 - Torres, A.G. A1 - others AB - Enterotoxigenic Escherichia coli (ETEC) is an important cause of diarrhea, mainly in developing countries. Although there are 25 different ETEC adhesins described in strains affecting humans, between 15% and 50% of the clinical isolates from different geographical regions are negative for these adhesins, suggesting that additional unidentified adhesion determinants might be present. Here, we report the discovery of Coli Surface Antigen 23 (CS23), a novel adhesin expressed by an ETEC serogroup O4 strain (ETEC 1766a), which was negative for the previously known ETEC adhesins although it has the ability to adhere to Caco-2 cells. CS23 is encoded by an 8.8-kb locus which contains 9 open reading frames (ORFs), 7 of them sharing significant identity with genes required for assembly of K88-related fimbriae. This gene locus, named aal (adhesion-associated locus), is required for the adhesion ability of ETEC 1766a and was able to confer this adhesive phenotype to a nonadherent E. coli HB101 strain. 
The CS23 major structural subunit, AalE, shares limited identity with known pilin proteins, and it is more closely related to the CS13 pilin protein CshE, carried by human ETEC strains. Our data indicate that CS23 is a new member of the diverse adhesin repertoire used by ETEC strains. VL - 80 CP - 8 ER - TY - JOUR T1 - InterPro in 2011: new developments in the family and domain prediction database. JF - Nucleic Acids Res Y1 - 2012 A1 - Hunter, Sarah A1 - Jones, Philip A1 - Mitchell, Alex A1 - Apweiler, Rolf A1 - Attwood, Teresa K A1 - Bateman, Alex A1 - Bernard, Thomas A1 - Binns, David A1 - Bork, Peer A1 - Burge, Sarah A1 - de Castro, Edouard A1 - Coggill, Penny A1 - Corbett, Matthew A1 - Das, Ujjwal A1 - Daugherty, Louise A1 - Duquenne, Lauranne A1 - Finn, Robert D A1 - Fraser, Matthew A1 - Gough, Julian A1 - Haft, Daniel A1 - Hulo, Nicolas A1 - Kahn, Daniel A1 - Kelly, Elizabeth A1 - Letunic, Ivica A1 - Lonsdale, David A1 - Lopez, Rodrigo A1 - Madera, Martin A1 - Maslen, John A1 - McAnulla, Craig A1 - McDowall, Jennifer A1 - McMenamin, Conor A1 - Mi, Huaiyu A1 - Mutowo-Muellenet, Prudence A1 - Mulder, Nicola A1 - Natale, Darren A1 - Orengo, Christine A1 - Pesseat, Sebastien A1 - Punta, Marco A1 - Quinn, Antony F A1 - Rivoire, Catherine A1 - Sangrador-Vegas, Amaia A1 - Jeremy D Selengut A1 - Sigrist, Christian J A A1 - Scheremetjew, Maxim A1 - Tate, John A1 - Thimmajanarthanan, Manjulapramila A1 - Thomas, Paul D A1 - Wu, Cathy H A1 - Yeats, Corin A1 - Yong, Siew-Yit KW - Databases, Protein KW - Protein Structure, Tertiary KW - Proteins KW - Sequence Analysis, Protein KW - software KW - Terminology as Topic KW - User-Computer Interface AB -

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and metagenomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.

VL - 40 CP - Database issue M3 - 10.1093/nar/gkr948 ER - TY - JOUR T1 - Vibrio cholerae Classical Biotype Strains Reveal Distinct Signatures in Mexico JF - Journal of Clinical Microbiology Y1 - 2012 A1 - Alam,Munirul A1 - Islam,M. Tarequl A1 - Rashed,Shah Manzur A1 - Johura,Fatema-Tuz A1 - Bhuiyan,Nurul A. A1 - Delgado,Gabriela A1 - Morales,Rosario A1 - Mendez,Jose Luis A1 - Navarro,Armando A1 - Watanabe,Haruo A1 - Hasan,Nur-A. A1 - Rita R Colwell A1 - Cravioto,Alejandro AB - Vibrio cholerae O1 Classical (CL) biotype caused the 5th and 6th, and probably the earlier cholera pandemics, before the El Tor (ET) biotype initiated the 7th pandemic in Asia in the 1970s by completely displacing the CL biotype. Although the CL biotype was thought to be extinct in Asia, and it had never been reported from Latin America, V. cholerae CL and ET biotypes, including hybrid ET, were found associated with endemic cholera in Mexico between 1991 and 1997. In this study, CL biotype strains isolated from endemic cholera in Mexico between 1983 and 1997 were characterized in terms of major phenotypic and genetic traits, and compared with CL biotype strains isolated in Bangladesh between 1962 and 1989. According to sero- and bio-typing data, all V. cholerae strains tested had the major phenotypic and genotypic characteristics specific for the CL biotype. Antibiograms revealed the majority of the Bangladeshi strains to be resistant to trimethoprim/sulfamethoxazole, furazolidone, ampicillin, and gentamycin, while the Mexican strains were sensitive to all of these drugs, as well as to ciprofloxacin, erythromycin, and tetracycline. Pulsed-field gel electrophoresis (PFGE) of NotI-digested genomic DNA revealed characteristic banding patterns for all the CL biotype strains, although the Mexican strains differed from the Bangladeshi strains in 1-2 DNA bands. 
The difference may be subtle, but consistent, as confirmed by the sub-clustering patterns in the PFGE-based dendrogram, and can serve as a regional signature, suggesting pre-1991 existence and evolution of the CL biotype strains in the Americas, independent from that of Asia. SN - 0095-1137, 1098-660X UR - http://jcm.asm.org/content/early/2012/04/12/JCM.00189-12 M3 - 10.1128/JCM.00189-12 ER - TY - JOUR T1 - Vibrio cholerae in a historically cholera-free country JF - Environmental Microbiology Reports Y1 - 2012 A1 - Haley, Bradd J. A1 - Chen, Arlene A1 - Grim, Christopher J. A1 - Clark, Philip A1 - Diaz, Celia M A1 - Taviani, Elisa A1 - Hasan, Nur A. A1 - Sancomb, Elizabeth A1 - Elnemr, Wessam M A1 - Islam, Muhammad A. A1 - Huq, Anwar A1 - Rita R Colwell A1 - Benediktsdóttir, Eva AB - We report the autochthonous existence of Vibrio cholerae in coastal waters of Iceland, a geothermally active country where cholera is absent and has never been reported. Seawater, mussel and macroalgae samples were collected close to, and distant from, sites where geothermal activity causes a significant increase in water temperature during low tides. Vibrio cholerae was detected only at geothermal-influenced sites during low tides. None of the V. cholerae isolates encoded cholera toxin (ctxAB) and all were non-O1/non-O139 serogroups. However, all isolates encoded other virulence factors that are associated with cholera as well as extra-intestinal V. cholerae infections. The virulence factors were functional at temperatures of coastal waters of Iceland, suggesting an ecological role. It is noteworthy that V. cholerae was isolated from samples collected at sites distant from anthropogenic influence, supporting the conclusion that V. cholerae is autochthonous to the aquatic environment of Iceland. 
UR - http://doi.wiley.com/10.1111/j.1758-2229.2012.00332.x CP - 4 M3 - 10.1111/j.1758-2229.2012.00332.x ER - TY - CONF T1 - Learning Document Structure for Retrieval and Classification T2 - International Conference on Pattern Recognition (ICPR 2012) Y1 - 2012 A1 - Kumar,Jayant A1 - Ye,Peng A1 - David Doermann AB - In this paper, we present a method for the retrieval of document images with chosen layout characteristics. The proposed method is based on statistics of patch codewords over different regions of the image. We begin with a set of wanted and a random set of unwanted images representative of a large heterogeneous collection. We then use raw-image patches extracted from the unlabeled images to learn a codebook. To model the spatial relationships between patches, the image is recursively partitioned horizontally and vertically, and a histogram of patch-codewords is computed in each partition. The resulting set of features gives high precision and recall for the retrieval of hand-drawn and machine-print table-documents, and unconstrained mixed form-type documents, when trained using a random forest classifier. We compare our method to the spatial-pyramid method, and show that the proposed approach for learning layout characteristics is competitive for document images. JA - International Conference on Pattern Recognition (ICPR 2012) ER - TY - CONF T1 - Learning features for predicting OCR accuracy T2 - International Conference on Pattern Recognition (ICPR) Y1 - 2012 A1 - Ye,Peng A1 - David Doermann AB - In this paper, we present a new method for assessing the quality of degraded document images using unsupervised feature learning. The goal is to build a computational model to automatically predict OCR accuracy of a degraded document image without a reference image. Current approaches for this problem typically rely on hand-crafted features whose design is based on heuristic rules that may not be generalizable. 
In contrast, we explore an unsupervised feature learning framework to learn effective and efficient features for predicting OCR accuracy. Our experimental results, on a set of historic newspaper images, show that the proposed method outperforms a baseline method which combines features from previous works. JA - International Conference on Pattern Recognition (ICPR) ER - TY - CONF T1 - Learning Text-line Segmentation using Codebooks and Graph Partitioning T2 - International Conference on Frontiers in Handwriting Recognition (ICFHR) Y1 - 2012 A1 - Kang,Le A1 - Kumar,Jayant A1 - Ye,Peng A1 - David Doermann AB - In this paper, we present a codebook based method for handwritten text-line segmentation which uses image patches in the training data to learn a graph-based similarity for clustering. We first construct a codebook of image-patches using K-medoids, and obtain exemplars which encode local evidence. We then obtain the corresponding codewords for all patches extracted from a given image and construct a similarity graph using the learned evidence, which is partitioned to obtain text-lines. Our learning based approach performs well on a field dataset containing degraded and unconstrained handwritten Arabic document images. Results on the ICDAR 2009 segmentation contest dataset show that the method is competitive with previous approaches. 
JA - International Conference on Frontiers in Handwriting Recognition (ICFHR) ER - TY - CONF T1 - Linguistic Resources for Handwriting Recognition and Translation Evaluation T2 - Eighth International Conference on Language Resources and Evaluation (LREC'12) Y1 - 2012 A1 - Song, Zhiyi A1 - Ismael, Safa A1 - Grimes, Stephen A1 - David Doermann A1 - Strassel,Stephanie JA - Eighth International Conference on Language Resources and Evaluation (LREC'12) ER - TY - CONF T1 - Local Segmentation of Touching Characters using Contour based Shape Decomposition T2 - Document Analysis Systems Y1 - 2012 A1 - Kang,Le A1 - David Doermann A1 - Cao,Huaigu A1 - Prasad,Rohit A1 - Natarajan,Prem AB - We propose a contour based shape decomposition approach that provides local segmentation of touching characters. The shape contour is linearized into edgelets, and edgelets are merged into boundary fragments. The connection cost between boundary fragments is obtained by considering local smoothness, connection length, and a stroke-level property, Similar Stroke Rate. Samples of connections among boundary fragments are randomly generated, and the one with the minimum global cost is selected to produce an optimal segmentation of the shape. To obtain a binary segmentation using this approach, we make an iterative search for the parameters that yield two components on a shape. Experimental results on a number of synthetic shape images and the LTP dataset showed that this contour based shape decomposition technique is promising and effective in providing local segmentation of touching characters. JA - Document Analysis Systems ER - TY - CONF T1 - Logo Retrieval in Document Images T2 - Document Analysis Systems Y1 - 2012 A1 - Jain,Rajiv A1 - David Doermann AB - This paper presents a scalable algorithm for segmentation free logo retrieval in document images. 
The contributions of this paper include the use of the SURF feature for logo retrieval, a novel indexing algorithm for efficient retrieval of SURF features, and a method to filter results using the orientation of local features and geometric constraints. Results demonstrate that logo retrieval can be performed with high accuracy and efficiently scaled to large datasets. JA - Document Analysis Systems ER - TY - JOUR T1 - Media, Aggregators and the Link Economy: Strategic Hyperlink Formation in Content Networks JF - Management Science Y1 - 2012 A1 - Dellarocas, Chris A1 - Katona,Zsolt A1 - Rand, William AB - A key property of the World Wide Web is the possibility for firms to place virtually costless links to third-party content as a substitute or complement to their own content. This ability to hyperlink has enabled new types of players, such as search engines and content aggregators, to successfully enter content ecosystems, attracting traffic and revenues by hosting links to the content of others. This, in turn, has sparked a heated controversy between content producers and aggregators regarding the legitimacy and social costs/benefits of uninhibited free linking. This work is the first to model the implications of interrelated and strategic hyperlinking and content investments. Our results provide a nuanced view of the much-touted “link economy,” highlighting both the beneficial consequences and the drawbacks of free hyperlinks for content producers and consumers. We show that content sites can reduce competition and improve profits by forming links to each other; in such networks one site makes high investments in content and other sites link to it. Interestingly, competitive dynamics often preclude the formation of link networks, even in settings where they would improve everyone’s profits. 
Furthermore, such networks improve economic efficiency only when all members have similar abilities to produce content; otherwise the less capable nodes can free-ride on the content of the more capable nodes, reducing profits for the capable nodes as well as the average content quality available to consumers. Within these networks, aggregators have both positive and negative effects. By making it easier for consumers to access good quality content, they increase the appeal of the entire content ecosystem relative to the alternatives. To the extent that this increases the total traffic flowing into the content ecosystem, aggregators can help increase the profits of the highest quality content sites. At the same time, however, the market entry of aggregators takes away some of the revenue that would otherwise go to content sites. Finally, by placing links to only a subset of available content, aggregators further increase competitive pressure on content sites. Interestingly, this can increase the likelihood that such sites will then attempt to alleviate the competitive pressure by forming link networks. ER - TY - JOUR T1 - Use of Modality and Negation in Semantically-Informed Syntactic MT JF - Computational Linguistics Y1 - 2012 A1 - Baker,Kathryn A1 - Bloodgood,Michael A1 - Dorr, Bonnie J A1 - Callison-Burch,Chris A1 - Filardo,Nathaniel W. A1 - Piatko,Christine A1 - Levin,Lori A1 - Miller,Scott AB - This paper describes the resource- and system-building efforts of an eight-week Johns Hopkins University Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically-Informed Machine Translation (SIMT). We describe a new modality/negation (MN) annotation scheme, the creation of a (publicly available) MN lexicon, and two automated MN taggers that we built using the annotation scheme and lexicon. 
Our annotation scheme isolates three components of modality and negation: a trigger (a word that conveys modality or negation), a target (an action associated with modality or negation) and a holder (an experiencer of modality). We describe how our MN lexicon was semi-automatically produced and we demonstrate that a structure-based MN tagger results in precision around 86% (depending on genre) for tagging of a standard LDC data set. SN - 0891-2017 UR - http://dx.doi.org/10.1162/COLI_a_00099 M3 - 10.1162/COLI_a_00099 ER - TY - JOUR T1 - No-Reference Image Quality Assessment using Visual Codebooks JF - Image Processing, IEEE Transactions on Y1 - 2012 A1 - Ye,P. A1 - David Doermann AB - The goal of no-reference objective image quality assessment (NR-IQA) is to develop a computational model that can predict the human perceived quality of distorted images accurately and automatically without any prior knowledge of reference images. Most existing NR-IQA approaches are distortion-specific (DS) and are typically limited to one or two specific types of distortions. In most practical applications, however, information about the distortion type is not really available. In this paper, we propose a general-purpose NR-IQA approach based on visual codebooks. A visual codebook consisting of Gabor filter based local features extracted from local image patches is used to capture complex statistics of a natural image. The codebook encodes statistics by quantizing the feature space and accumulating histograms of patch appearances. This method does not assume any specific types of distortions; however, when evaluating images with a particular type of distortion, it does require examples with the same or similar distortion for training. Experimental results demonstrate that the predicted quality score using our method is consistent with human perceived image quality. 
The proposed method is comparable to state-of-the-art general-purpose NR-IQA methods and outperforms the full-reference image quality metrics, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), on the LIVE image quality assessment database. VL - PP SN - 1057-7149 CP - 99 M3 - 10.1109/TIP.2012.2190086 ER - TY - CONF T1 - Optimizing epidemic protection for socially essential workers T2 - Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium Y1 - 2012 A1 - Barrett,Chris A1 - Beckman,Richard A1 - Bisset,Keith A1 - Chen,Jiangzhuo A1 - DuBois,Thomas A1 - Eubank,Stephen A1 - Kumar,V. S. Anil A1 - Lewis,Bryan A1 - Marathe,Madhav V. A1 - Srinivasan, Aravind A1 - Stretz,Paula E. KW - epidemiology KW - OPTIMIZATION KW - public health informatics AB - Public-health policy makers have many tools to mitigate an epidemic's effects. Most related research focuses on the direct effects on those infected (in terms of health, life, or productivity). Interventions including treatment, prophylaxis, quarantine, and social distancing are well studied in this context. These interventions do not address indirect effects due to the loss of critical services and infrastructures when too many of those responsible for their day-to-day operations fall ill. We examine, both analytically and through simulation, the protection of such essential subpopulations by sequestering them, effectively isolating them into groups during an epidemic. We develop a framework for studying the benefits of sequestering and heuristics for when to sequester. We also prove a key property of sequestering placement which helps partition the subpopulations optimally. Thus we provide a first step toward determining how to allocate resources between the direct protection of a population, and protection of those responsible for critical services. 
JA - Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium T3 - IHI '12 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0781-9 UR - http://doi.acm.org/10.1145/2110363.2110371 M3 - 10.1145/2110363.2110371 ER - TY - CONF T1 - The Provenance of WINE T2 - Dependable Computing Conference (EDCC), 2012 Ninth European Y1 - 2012 A1 - Tudor Dumitras A1 - Efstathopoulos, P. KW - Benchmark testing KW - CYBER SECURITY KW - cyber security experiments KW - data attacks KW - data collection KW - dependability benchmarking KW - distributed databases KW - distributed sensors KW - experimental research KW - field data KW - information quality KW - MALWARE KW - Pipelines KW - provenance KW - provenance information KW - raw data sharing KW - research groups KW - security of data KW - self-documenting experimental process KW - sensor fusion KW - software KW - variable standards KW - WINE KW - WINE benchmark AB - The results of cyber security experiments are often impossible to reproduce, owing to the lack of adequate descriptions of the data collection and experimental processes. Such provenance information is difficult to record consistently when collecting data from distributed sensors and when sharing raw data among research groups with variable standards for documenting the steps that produce the final experimental result. In the WINE benchmark, which provides field data for cyber security experiments, we aim to make the experimental process self-documenting. The data collected includes provenance information – such as when, where and how an attack was first observed or detected – and allows researchers to gauge information quality. Experiments are conducted on a common test bed, which provides tools for recording each procedural step. The ability to understand the provenance of research results enables rigorous cyber security experiments, conducted at scale. 
JA - Dependable Computing Conference (EDCC), 2012 Ninth European ER - TY - CONF T1 - A Random Forest System Combination Approach for Error Detection in Digital Dictionaries T2 - Innovative hybrid approaches to the processing of textual data, EACL 2012 Workshop Y1 - 2012 A1 - Bloodgood,Michael A1 - Ye,Peng A1 - Rodrigues,Paul A1 - Zajic, David A1 - David Doermann AB - When digitizing a print bilingual dictionary, whether via optical character recognition or manual entry, it is inevitable that errors are introduced into the electronic version that is created. We investigate automating the process of detecting errors in an XML representation of a digitized print dictionary using a hybrid approach that combines rule-based, feature-based, and language model-based methods. We investigate combining methods and show that using random forests is a promising approach. We find that in isolation, unsupervised methods rival the performance of supervised methods. Random forests typically require training data so we investigate how we can apply random forests to combine individual base methods that are themselves unsupervised without requiring large amounts of training data. Experiments reveal empirically that a relatively small amount of data is sufficient and can potentially be further reduced through specific selection criteria. JA - Innovative hybrid approaches to the processing of textual data, EACL 2012 Workshop ER - TY - JOUR T1 - Recognizing Human Actions by Learning and Matching Shape-Motion Prototype Trees JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2012 A1 - Zhuolin Jiang A1 - Zhe Lin A1 - Davis, Larry S. 
KW - action prototype KW - actor location KW - brute-force computation KW - CMU action data set KW - distance measures KW - dynamic backgrounds KW - dynamic prototype sequence matching KW - flexible action matching KW - frame-to-frame distances KW - frame-to-prototype correspondence KW - hierarchical k-means clustering KW - human action recognition KW - Image matching KW - image recognition KW - Image sequences KW - joint probability model KW - joint shape KW - KTH action data set KW - large gesture data set KW - learning KW - learning (artificial intelligence) KW - look-up table indexing KW - motion space KW - moving cameras KW - pattern clustering KW - prototype-to-prototype distances KW - shape-motion prototype-based approach KW - table lookup KW - training sequence KW - UCF sports data set KW - Video sequences KW - video signal processing KW - Weizmann action data set AB - A shape-motion prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, an action prototype tree is learned in a joint shape and motion space via hierarchical K-means clustering and each training sequence is represented as a labeled prototype sequence; then a look-up table of prototype-to-prototype distances is generated. During testing, based on a joint probability model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint probability, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance measures used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. 
Our approach enables robust action matching in challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 92.86 percent on a large gesture data set (with dynamic backgrounds), 100 percent on the Weizmann action data set, 95.77 percent on the KTH action data set, 88 percent on the UCF sports data set, and 87.27 percent on the CMU action data set. VL - 34 SN - 0162-8828 CP - 3 M3 - 10.1109/TPAMI.2011.147 ER - TY - CHAP T1 - Securing Circuits against Constant-Rate Tampering T2 - Advances in Cryptology – CRYPTO 2012 Y1 - 2012 A1 - Dana Dachman-Soled A1 - Kalai, Yael Tauman ED - Safavi-Naini, Reihaneh ED - Canetti, Ran KW - circuit compiler KW - Computer Communication Networks KW - computers and society KW - Data Encryption KW - Discrete Mathematics in Computer Science KW - Management of Computing and Information Systems KW - PCP of proximity KW - side-channel attacks KW - Systems and Data Security KW - tampering AB - We present a compiler that converts any circuit into one that remains secure even if a constant fraction of its wires are tampered with. Following the seminal work of Ishai et. al. (Eurocrypt 2006), we consider adversaries who may choose an arbitrary set of wires to corrupt, and may set each such wire to 0 or to 1, or may toggle with the wire. We prove that such adversaries, who continuously tamper with the circuit, can learn at most logarithmically many bits of secret information (in addition to black-box access to the circuit). Our results are information theoretic. 
JA - Advances in Cryptology – CRYPTO 2012 T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-32008-8, 978-3-642-32009-5 UR - http://link.springer.com/chapter/10.1007/978-3-642-32009-5_31 ER - TY - CONF T1 - Sharpness Estimation for Document and Scene Images T2 - International Conference on Pattern Recognition (ICPR 2012) Y1 - 2012 A1 - Kumar,Jayant A1 - Chen, Francine A1 - David Doermann AB - Images of document pages have different characteristics than images of natural scenes, and so the sharpness measures developed for natural scene images do not necessarily extend to document images primarily composed of text. We present an efficient and simple method for effectively estimating the sharpness/blurriness of document images that also performs well on natural scenes. Our method can be used to predict the sharpness in scenarios where images are blurred due to camera-motion (or hand-shake), defocus, or inherent properties of the imaging system. The proposed method outperforms the perceptually-based, no-reference sharpness work of [1] and [4], which was shown to perform better than 14 other no-reference sharpness measures on the LIVE dataset. JA - International Conference on Pattern Recognition (ICPR 2012) ER - TY - JOUR T1 - Speeding Up Particle Trajectory Simulations under Moving Force Fields using GPUs JF - Journal of Computing and Information Science in Engineering Y1 - 2012 A1 - Patro,R. A1 - Dickerson,J. P. A1 - Bista,S. A1 - Gupta,S.K. A1 - Varshney, Amitabh AB - In this paper, we introduce a GPU-based framework for simulating particle trajectories under both static and dynamic force fields. By exploiting the highly parallel nature of the problem and making efficient use of the available hardware, our simulator exhibits a significant speedup over its CPU-based analog. 
We apply our framework to a specific experimental simulation: the computation of trapping probabilities associated with micron-sized silica beads in optical trapping workbenches. When evaluating large numbers of trajectories (4096), we see approximately a 356 times speedup of the GPU-based simulator over its CPU-based counterpart. ER - TY - CONF T1 - Sub-cellular feature detection and automated extraction of collocalized actin and myosin regions T2 - Proceedings of the 2nd ACM SIGHIT symposium on International health informatics - IHI '12 Y1 - 2012 A1 - Martineau, Justin A1 - Mokashi,Ronil A1 - Chapman, David A1 - Grasso, Michael A1 - Brady,Mary A1 - Yesha,Yelena A1 - Yesha,Yaacov A1 - Cardone, Antonio A1 - Dima, Alden JA - Proceedings of the 2nd ACM SIGHIT symposium on International health informatics - IHI '12 PB - ACM Press CY - Miami, Florida, USA SN - 9781450307819 J1 - IHI '12 M3 - 10.1145/2110363.2110409 ER - TY - CONF T1 - Submodular Dictionary Learning for Sparse Coding T2 - IEEE conference on Computer Vision and Pattern Recognition Y1 - 2012 A1 - Zhuolin Jiang A1 - Zhang, G. A1 - Davis, Larry S. AB - A greedy-based approach to learn a compact and discriminative dictionary for sparse representation is presented. We propose an objective function consisting of two components: entropy rate of a random walk on a graph and a discriminative term. Dictionary learning is achieved by finding a graph topology which maximizes the objective function. By exploiting the monotonicity and submodularity properties of the objective function and the matroid constraint, we present a highly efficient greedy-based optimization algorithm. It is more than an order of magnitude faster than several recently proposed dictionary learning approaches. Moreover, the greedy algorithm gives a near-optimal solution with a (1/2)-approximation bound.
Our approach yields dictionaries having the property that feature points from the same class have very similar sparse codes. Experimental results demonstrate that our approach outperforms several recently proposed dictionary learning techniques for face, action and object category recognition. JA - IEEE conference on Computer Vision and Pattern Recognition ER - TY - JOUR T1 - A Temporal Pattern Search Algorithm for Personal History Event Visualization JF - Knowledge and Data Engineering, IEEE Transactions on Y1 - 2012 A1 - Wang,T. D A1 - Deshpande, Amol A1 - Shneiderman, Ben KW - automaton-based approach KW - binary search KW - bit-parallel approach KW - data visualisation KW - electronic health records KW - event array KW - finite automata KW - interactive visualization program KW - Lifelines2 visualization tool KW - medical information systems KW - NFA approach KW - nondeterministic finite automaton KW - O(m^2 n lg(n)) problem KW - pattern matching KW - personal history event visualization KW - Shift-And approach KW - temporal pattern search algorithm KW - time stamp AB - We present Temporal Pattern Search (TPS), a novel algorithm for searching for temporal patterns of events in historical personal histories. The traditional method of searching for such patterns uses an automaton-based approach over a single array of events, sorted by time stamps. Instead, TPS operates on a set of arrays, where each array contains all events of the same type, sorted by time stamps. TPS searches for a particular item in the pattern using a binary search over the appropriate arrays. Although binary search is considerably more expensive per item, it allows TPS to skip many unnecessary events in personal histories. We show that TPS's running time is bounded by O(m^2 n lg(n)), where m is the length (number of events) of a search pattern, and n is the number of events in a record (history).
Although the asymptotic running time of TPS is inferior to that of a nondeterministic finite automaton (NFA) approach (O(mn)), TPS performs better than NFA under our experimental conditions. We also show TPS is very competitive with Shift-And, a bit-parallel approach, with real data. Since the experimental conditions we describe here subsume the conditions under which analysts would typically use TPS (i.e., within an interactive visualization program), we argue that TPS is an appropriate design choice for us. VL - 24 SN - 1041-4347 CP - 5 M3 - 10.1109/TKDE.2010.257 ER - TY - JOUR T1 - Towards a cost model for network traffic JF - SIGCOMM Comput. Commun. Rev. Y1 - 2012 A1 - Motiwala,Murtaza A1 - Dhamdhere,Amogh A1 - Feamster, Nick A1 - Lakhina,Anukool KW - cost optimization KW - traffic cost model AB - We develop a holistic cost model that operators can use to help evaluate the costs of various routing and peering decisions. Using real traffic data from a large carrier network, we show how network operators can use this cost model to significantly reduce the cost of carrying traffic in their networks. We find that adjusting the routing for a small fraction of total flows (and total traffic volume) significantly reduces cost in many cases. We also show how operators can use the cost model both to evaluate potential peering arrangements and for other network operations problems. VL - 42 SN - 0146-4833 UR - http://doi.acm.org/10.1145/2096149.2096157 CP - 1 M3 - 10.1145/2096149.2096157 ER - TY - CONF T1 - Unsupervised Feature Learning Framework for No-reference Image Quality Assessment T2 - CVPR Y1 - 2012 A1 - Ye,Peng A1 - Kumar,Jayant A1 - Kang,Le A1 - David Doermann AB - In this paper, we present an efficient general-purpose objective no-reference (NR) image quality assessment (IQA) framework based on unsupervised feature learning. 
The goal is to build a computational model to automatically predict human perceived image quality without a reference image and without knowing the distortion present in the image. Previous approaches for this problem typically rely on hand-crafted features which are carefully designed based on prior knowledge. In contrast, we use raw image patches extracted from a set of unlabeled images to learn a dictionary in an unsupervised manner. We use soft-assignment coding with max pooling to obtain effective image representations for quality estimation. The proposed algorithm is computationally efficient, using raw image patches as local descriptors and soft-assignment for encoding. Furthermore, unlike previous methods, our unsupervised feature learning strategy enables our method to adapt to different domains. CORNIA (Codebook Representation for No-Reference Image Assessment) was tested on the LIVE database and shown to perform statistically better than the full-reference quality measure structural similarity index (SSIM), and to be comparable to state-of-the-art general-purpose NR-IQA algorithms. JA - CVPR ER - TY - CONF T1 - You're Capped: Understanding the Effects of Bandwidth Caps on Broadband Use in the Home T2 - SIGCHI '12 Y1 - 2012 A1 - Marshini Chetty A1 - Banks, Richard A1 - Brush, A.J. A1 - Donner, Jonathan A1 - Grinter, Rebecca KW - Bandwidth KW - bandwidth cap KW - data cap KW - Internet KW - metered use KW - pricing KW - usage-based billing KW - usage-based pricing AB - Bandwidth caps, a limit on the amount of data users can upload and download in a month, are common globally for both home and mobile Internet access. With caps, each bit of data consumed comes at a cost against a monthly quota or a running tab. Yet, relatively little work has considered the implications of this usage-based pricing model on the user experience. In this paper, we present results from a qualitative study of households living with bandwidth caps.
Our findings suggest home users grapple with three uncertainties regarding their bandwidth usage: invisible balances, mysterious processes, and multiple users. We discuss how these uncertainties impact their usage and describe the potential for better tools to help monitor and manage data caps. We conclude that as a community we need to cater for users under Internet cost constraints. JA - SIGCHI '12 T3 - CHI '12 PB - ACM SN - 978-1-4503-1015-4 UR - http://doi.acm.org/10.1145/2207676.2208714 ER - TY - JOUR T1 - Accelerated evolution of 3'avian FOXE1 genes, and thyroid and feather specific expression of chicken FoxE1 JF - BMC Evolutionary Biology Y1 - 2011 A1 - Yaklichkin,Sergey Yu A1 - Darnell,Diana K A1 - Pier,Maricela V A1 - Antin,Parker B A1 - Hannenhalli, Sridhar AB - The forkhead transcription factor gene E1 (FOXE1) plays an important role in regulation of thyroid development, palate formation and hair morphogenesis in mammals. However, avian FOXE1 genes have not been characterized and as such, codon evolution of FOXE1 orthologs in a broader evolutionary context of mammals and birds is not known. VL - 11 SN - 1471-2148 UR - http://www.biomedcentral.com/1471-2148/11/302 CP - 1 M3 - 10.1186/1471-2148-11-302 ER - TY - CONF T1 - Action recognition using Partial Least Squares and Support Vector Machines T2 - Image Processing (ICIP), 2011 18th IEEE International Conference on Y1 - 2011 A1 - Ramadan,S. A1 - Davis, Larry S. KW - action recognition KW - partial least squares KW - support vector machines KW - multiclass SVM KW - feature extraction KW - image recognition KW - spatio-temporal properties KW - high dimensional feature vectors KW - INRIA IXMAS dataset AB - We introduce an action recognition approach based on Partial Least Squares (PLS) and Support Vector Machines (SVM).
We extract very high dimensional feature vectors representing spatio-temporal properties of actions and use multiple PLS regressors to find relevant features that distinguish amongst action classes. Finally, we use a multi-class SVM to learn and classify those relevant features. We applied our approach to INRIA's IXMAS dataset. Experimental results show that our method is superior to other methods applied to the IXMAS dataset. JA - Image Processing (ICIP), 2011 18th IEEE International Conference on M3 - 10.1109/ICIP.2011.6116399 ER - TY - CONF T1 - Ant Colony Optimization in a Changing Environment T2 - 2011 AAAI Fall Symposium Series Y1 - 2011 A1 - Seymour,John Jefferson A1 - Tuzo,Joseph A1 - desJardins, Marie AB - Ant colony optimization (ACO) algorithms are computational problem-solving methods that are inspired by the complex behaviors of ant colonies; specifically, the ways in which ants interact with each other and their environment to optimize the overall performance of the ant colony. Our eventual goal is to develop and experiment with ACO methods that can more effectively adapt to dynamically changing environments and problems. We describe biological ant systems and the dynamics of their environments and behaviors. We then introduce a family of dynamic ACO algorithms that can handle dynamic modifications of their inputs. We report empirical results, showing that dynamic ACO algorithms can effectively adapt to time-varying environments. JA - 2011 AAAI Fall Symposium Series UR - http://www.aaai.org/ocs/index.php/FSS/FSS11/paper/viewPaper/4223 ER - TY - CHAP T1 - Automated Planning Logic Synthesis for Autonomous Unmanned Vehicles in Competitive Environments with Deceptive Adversaries T2 - New Horizons in Evolutionary Robotics Y1 - 2011 A1 - Svec,Petr A1 - Gupta, Satyandra K. 
ED - Doncieux,Stéphane ED - Bredèche,Nicolas ED - Mouret,Jean-Baptiste KW - engineering AB - We developed a new approach for automated synthesis of a planning logic for autonomous unmanned vehicles. This new approach can be viewed as an automated iterative process during which an initial version of a logic is synthesized and then gradually improved by detecting and fixing its shortcomings. This is achieved by combining data mining for extraction of the vehicle's states of failure with a Genetic Programming (GP) technique for synthesis of the corresponding navigation code. We verified the feasibility of the approach using an unmanned surface vehicle (USV) simulation. Our focus was specifically on the generation of a planning logic used for blocking the advancement of an intruder boat towards a valuable target. Developing autonomy logic for this behavior is challenging, as the intruder's attacking logic is human-competitive and deceptive, so the USV must learn specific maneuvers for specific situations to block successfully. We compared the performance of the generated blocking logic to the performance of logic that was manually implemented. Our results show that the new approach was able to synthesize a blocking logic with performance closely approaching that of the logic coded by hand. JA - New Horizons in Evolutionary Robotics T3 - Studies in Computational Intelligence PB - Springer Berlin / Heidelberg VL - 341 SN - 978-3-642-18271-6 UR - http://www.springerlink.com/content/f454477212518671/abstract/ ER - TY - CONF T1 - AVSS 2011 demo session: A large-scale benchmark dataset for event recognition in surveillance video T2 - Advanced Video and Signal-Based Surveillance (AVSS), 2011 8th IEEE International Conference on Y1 - 2011 A1 - Oh,Sangmin A1 - Hoogs,Anthony A1 - Perera,Amitha A1 - Cuntoor,Naresh A1 - Chen,Chia-Chih A1 - Lee,Jong Taek A1 - Mukherjee,Saurajit A1 - Aggarwal, JK A1 - Lee,Hyungtae A1 - Davis, Larry S.
A1 - Swears,Eran A1 - Wang,Xiaoyang A1 - Ji,Qiang A1 - Reddy,Kishore A1 - Shah,Mubarak A1 - Vondrick,Carl A1 - Pirsiavash,Hamed A1 - Ramanan,Deva A1 - Yuen,Jenny A1 - Torralba,Antonio A1 - Song,Bi A1 - Fong,Anesco A1 - Roy-Chowdhury,Amit A1 - Desai,Mita AB - We introduce to the surveillance community the VIRAT Video Dataset[1], which is a new large-scale surveillance video dataset designed to assess the performance of event recognition algorithms in realistic scenes. JA - Advanced Video and Signal-Based Surveillance (AVSS), 2011 8th IEEE International Conference on M3 - 10.1109/AVSS.2011.6027400 ER - TY - JOUR T1 - Bacillus Anthracis Comparative Genome Analysis in Support of the Amerithrax Investigation JF - Proceedings of the National Academy of Sciences Y1 - 2011 A1 - Rasko,David A A1 - Worsham,Patricia L A1 - Abshire,Terry G A1 - Stanley,Scott T A1 - Bannan,Jason D A1 - Wilson,Mark R A1 - Langham,Richard J A1 - Decker,R. Scott A1 - Jiang,Lingxia A1 - Read,Timothy D. A1 - Phillippy,Adam M A1 - Salzberg,Steven L. A1 - Pop, Mihai A1 - Van Ert,Matthew N A1 - Kenefic,Leo J A1 - Keim,Paul S A1 - Fraser-Liggett,Claire M A1 - Ravel,Jacques AB - Before the anthrax letter attacks of 2001, the developing field of microbial forensics relied on microbial genotyping schemes based on a small portion of a genome sequence. Amerithrax, the investigation into the anthrax letter attacks, applied high-resolution whole-genome sequencing and comparative genomics to identify key genetic features of the letters' Bacillus anthracis Ames strain. During systematic microbiological analysis of the spore material from the letters, we identified a number of morphological variants based on phenotypic characteristics and the ability to sporulate. The genomes of these morphological variants were sequenced and compared with that of the B. anthracis Ames ancestor, the progenitor of all B. anthracis Ames strains.
Through comparative genomics, we identified four distinct loci with verifiable genetic mutations. Three of the four mutations could be directly linked to sporulation pathways in B. anthracis and more specifically to the regulation of the phosphorylation state of Spo0F, a key regulatory protein in the initiation of the sporulation cascade, thus linking phenotype to genotype. None of these variant genotypes were identified in single-colony environmental B. anthracis Ames isolates associated with the investigation. These genotypes were identified only in B. anthracis morphotypes isolated from the letters, indicating that the variants were not prevalent in the environment, not even the environments associated with the investigation. This study demonstrates the forensic value of systematic microbiological analysis combined with whole-genome sequencing and comparative genomics. VL - 108 SN - 0027-8424, 1091-6490 UR - http://www.pnas.org/content/108/12/5027 CP - 12 M3 - 10.1073/pnas.1016657108 ER - TY - CONF T1 - Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance T2 - Computer Vision (ICCV), 2011 IEEE International Conference on Y1 - 2011 A1 - Farrell,R. A1 - Oza,O. A1 - Zhang,Ning A1 - Morariu,V.I. A1 - Darrell,T. A1 - Davis, Larry S. KW - subordinate-level categorization KW - pose-normalized appearance model KW - volumetric primitives KW - poselet scheme KW - pose estimation KW - part detectors KW - shape models KW - category taxonomy KW - feature extraction KW - salient distinctions KW - object recognition KW - computer vision KW - image resolution KW - information retrieval AB - Subordinate-level categorization typically rests on establishing salient distinctions between part-level characteristics of objects, in contrast to basic-level categorization, where the presence or absence of parts is determinative.
We develop an approach for subordinate categorization in vision, focusing on an avian domain due to the fine-grained structure of the category taxonomy for this domain. We explore a pose-normalized appearance model based on a volumetric poselet scheme. The variation in shape and appearance properties of these parts across a taxonomy provides the cues needed for subordinate categorization. Training pose detectors requires a relatively large amount of training data per category when done from scratch; using a subordinate-level approach, we exploit a pose classifier trained at the basic-level, and extract part appearance and shape information to build subordinate-level models. Our model associates the underlying image pattern parameters used for detection with corresponding volumetric part location, scale and orientation parameters. These parameters implicitly define a mapping from the image pixels into a pose-normalized appearance space, removing view and pose dependencies, facilitating fine-grained categorization from relatively few training examples. JA - Computer Vision (ICCV), 2011 IEEE International Conference on M3 - 10.1109/ICCV.2011.6126238 ER - TY - CHAP T1 - On the Black-Box Complexity of Optimally-Fair Coin Tossing T2 - Theory of Cryptography Y1 - 2011 A1 - Dana Dachman-Soled A1 - Lindell, Yehuda A1 - Mahmoody, Mohammad A1 - Malkin, Tal ED - Ishai, Yuval KW - Algorithm Analysis and Problem Complexity KW - black-box separations KW - Coding and Information Theory KW - coin tossing KW - Computer Communication Networks KW - Data Encryption KW - lower-bound KW - Math Applications in Computer Science KW - optimally-fair coin tossing KW - round-complexity KW - Systems and Data Security AB - A fair two-party coin tossing protocol is one in which both parties output the same bit that is almost uniformly distributed (i.e., it equals 0 and 1 with probability that is at most negligibly far from one half). 
It is well known that it is impossible to achieve fair coin tossing even in the presence of fail-stop adversaries (Cleve, FOCS 1986). In fact, Cleve showed that for every coin tossing protocol running for r rounds, an efficient fail-stop adversary can bias the output by Ω(1/r). Since this is the best possible, a protocol that limits the bias of any adversary to O(1/r) is called optimally-fair. The only optimally-fair protocol that is known to exist relies on the existence of oblivious transfer, because it uses general secure computation (Moran, Naor and Segev, TCC 2009). However, it is possible to achieve a bias of O(1/√r) in r rounds relying only on the assumption that there exist one-way functions. In this paper we show that it is impossible to achieve optimally-fair coin tossing via a black-box construction from one-way functions for r that is less than O(n/log n), where n is the input/output length of the one-way function used. An important corollary of this is that it is impossible to construct an optimally-fair coin tossing protocol via a black-box construction from one-way functions whose round complexity is independent of the security parameter n determining the security of the one-way function being used. Informally speaking, the main ingredient of our proof is to eliminate the random-oracle from "secure" protocols with "low round-complexity" and simulate the protocol securely against semi-honest adversaries in the plain model. We believe our simulation lemma to be of broader interest. JA - Theory of Cryptography T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-19570-9, 978-3-642-19571-6 UR - http://link.springer.com/chapter/10.1007/978-3-642-19571-6_27 ER - TY - JOUR T1 - Broadband internet performance: A view from the gateway JF - SIGCOMM-Computer Communication Review Y1 - 2011 A1 - Sundaresan,S. A1 - de Donato,W. A1 - Feamster, Nick A1 - Teixeira,R. A1 - Crawford,S. A1 - Pescapè,A.
AB - We present the first study of network access link performance measured directly from home gateway devices. Policymakers, ISPs, and users are increasingly interested in studying the performance of Internet access links. Because of many confounding factors in a home network or on end hosts, however, thoroughly understanding access network performance requires deploying measurement infrastructure in users' homes as gateway devices. In conjunction with the Federal Communication Commission's study of broadband Internet access in the United States, we study the throughput and latency of network access links using longitudinal measurements from nearly 4,000 gateway devices across 8 ISPs from a deployment of over 4,200 devices. We study the performance users achieve and how various factors ranging from the user's choice of modem to the ISP's traffic shaping policies can affect performance. Our study yields many important findings about the characteristics of existing access networks. Our findings also provide insights into the ways that access network performance should be measured and presented to users, which can help inform ongoing broader efforts to benchmark the performance of access networks. VL - 41 CP - 4 ER - TY - JOUR T1 - Can Deliberately Incomplete Gene Sample Augmentation Improve a Phylogeny Estimate for the Advanced Moths and Butterflies (Hexapoda: Lepidoptera)? JF - Systematic Biology Y1 - 2011 A1 - Cho,Soowon A1 - Zwick,Andreas A1 - Regier,Jerome C A1 - Mitter,Charles A1 - Cummings, Michael P.
A1 - Yao,Jianxiu A1 - Du,Zaile A1 - Zhao,Hong A1 - Kawahara,Akito Y A1 - Weller,Susan A1 - Davis,Donald R A1 - Baixeras,Joaquin A1 - Brown,John W A1 - Parr,Cynthia KW - Ditrysia KW - gene sampling KW - Hexapoda KW - Lepidoptera KW - missing data KW - molecular phylogenetics KW - nuclear genes KW - taxon sampling AB - This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from an ongoing effort to resolve poorly understood deeper relationships in the large clade Ditrysia ( > 150,000 species) of the insect order Lepidoptera (butterflies and moths). Seeking to remedy the overall weak support for deeper divergences in an initial study based on five nuclear genes (6.6 kb) in 123 exemplars, we nearly tripled the total gene sample (to 26 genes, 18.4 kb) but only in a third (41) of the taxa. The resulting partially augmented data matrix (45% intentionally missing data) consistently increased bootstrap support for groupings previously identified in the five-gene (nearly) complete matrix, while introducing no contradictory groupings of the kind that missing data have been predicted to produce. Our results add to growing evidence that data sets differing substantially in gene and taxon sampling can often be safely and profitably combined. The strongest overall support for nodes above the family level came from including all nucleotide changes, while partitioning sites into sets undergoing mostly nonsynonymous versus mostly synonymous change. In contrast, support for the deepest node for which any persuasive molecular evidence has yet emerged (78–85% bootstrap) was weak or nonexistent unless synonymous change was entirely excluded, a result plausibly attributed to compositional heterogeneity. 
This node (Gelechioidea + Apoditrysia), tentatively proposed by previous authors on the basis of four morphological synapomorphies, is the first major subset of ditrysian superfamilies to receive strong statistical support in any phylogenetic study. A “more-genes-only” data set (41 taxa×26 genes) also gave strong signal for a second deep grouping (Macrolepidoptera) that was obscured, but not strongly contradicted, in more taxon-rich analyses. VL - 60 SN - 1063-5157, 1076-836X UR - http://sysbio.oxfordjournals.org/content/60/6/782 CP - 6 M3 - 10.1093/sysbio/syr079 ER - TY - CHAP T1 - A Canonical Form for Testing Boolean Function Properties T2 - Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques Y1 - 2011 A1 - Dana Dachman-Soled A1 - Servedio, Rocco A. ED - Goldberg, Leslie Ann ED - Jansen, Klaus ED - Ravi, R. ED - Rolim, José D. P. KW - Algorithm Analysis and Problem Complexity KW - Boolean functions KW - Computation by Abstract Devices KW - Computer Communication Networks KW - Computer Graphics KW - Data structures KW - Discrete Mathematics in Computer Science KW - property testing AB - In a well-known result Goldreich and Trevisan (2003) showed that every testable graph property has a “canonical” tester in which a set of vertices is selected at random and the edges queried are the complete graph over the selected vertices. We define a similar-in-spirit canonical form for Boolean function testing algorithms, and show that under some mild conditions property testers for Boolean functions can be transformed into this canonical form. Our first main result shows, roughly speaking, that every “nice” family of Boolean functions that has low noise sensitivity and is testable by an “independent tester,” has a canonical testing algorithm. Our second main result is similar but holds instead for families of Boolean functions that are closed under ID-negative minors. 
Taken together, these two results cover almost all of the constant-query Boolean function testing algorithms that we know of in the literature, and show that all of these testing algorithms can be automatically converted into a canonical form. JA - Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-22934-3, 978-3-642-22935-0 UR - http://link.springer.com/chapter/10.1007/978-3-642-22935-0_39 ER - TY - CONF T1 - A case for query by image and text content: searching computer help using screenshots and keywords T2 - Proceedings of the 20th international conference on World wide web Y1 - 2011 A1 - Tom Yeh A1 - White,Brandyn A1 - San Pedro,Jose A1 - Katz,Boriz A1 - Davis, Larry S. KW - content-based image retrieval KW - multimodal search KW - online help AB - The multimedia information retrieval community has dedicated extensive research effort to the problem of content-based image retrieval (CBIR). However, these systems find their main limitation in the difficulty of creating pictorial queries. As a result, few systems offer the option of querying by visual examples, and rely on automatic concept detection and tagging techniques to provide support for searching visual content using textual queries. This paper proposes and studies a practical multimodal web search scenario, where CBIR fits intuitively to improve the retrieval of rich information queries. Many online articles contain useful know-how knowledge about computer applications. These articles tend to be richly illustrated by screenshots. We present a system to search for such software know-how articles that leverages the visual correspondences between screenshots. Users can naturally create pictorial queries simply by taking a screenshot of the application to retrieve a list of articles containing a matching screenshot.
We build a prototype comprising 150k articles that are classified into walkthrough, book, gallery, and general categories, and provide a comprehensive evaluation of this system, focusing on technical (accuracy of CBIR techniques) and usability (perceived system usefulness) aspects. We also consider the study of added value features of such a visual-supported search, including the ability to perform cross-lingual queries. We find that the system is able to retrieve matching screenshots for a wide variety of programs, across language boundaries, and provide subjectively more useful results than keyword-based web and image search engines. JA - Proceedings of the 20th international conference on World wide web T3 - WWW '11 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0632-4 UR - http://doi.acm.org/10.1145/1963405.1963513 M3 - 10.1145/1963405.1963513 ER - TY - JOUR T1 - Cell cycle dependent TN-C promoter activity determined by live cell imaging. JF - Cytometry. Part A : the journal of the International Society for Analytical Cytology Y1 - 2011 A1 - Halter, Michael A1 - Sisan, Daniel R A1 - Chalfoun, Joe A1 - Stottrup, Benjamin L A1 - Cardone, Antonio A1 - Dima,Alden A. A1 - Tona, Alessandro A1 - Plant,Anne L.
A1 - Elliott, John T KW - Animals KW - cell cycle KW - Gene Expression Regulation KW - Green Fluorescent Proteins KW - Image Processing, Computer-Assisted KW - Mice KW - Microscopy, Fluorescence KW - Microscopy, Phase-Contrast KW - NIH 3T3 Cells KW - Promoter Regions, Genetic KW - Tenascin AB - The extracellular matrix protein tenascin-C plays a critical role in development, wound healing, and cancer progression, but how it is controlled and how it exerts its physiological responses remain unclear. By quantifying the behavior of live cells with phase contrast and fluorescence microscopy, the dynamic regulation of TN-C promoter activity is examined. We employ an NIH 3T3 cell line stably transfected with the TN-C promoter ligated to the gene sequence for destabilized green fluorescent protein (GFP). Fully automated image analysis routines, validated by comparison with data derived from manual segmentation and tracking of single cells, are used to quantify changes in the cellular GFP in hundreds of individual cells throughout their cell cycle during live cell imaging experiments lasting 62 h. We find that individual cells vary substantially in their expression patterns over the cell cycle, but that on average TN-C promoter activity increases during the last 40% of the cell cycle. We also find that the increase in promoter activity is proportional to the activity earlier in the cell cycle. This work illustrates the application of live cell microscopy and automated image analysis of a promoter-driven GFP reporter cell line to identify subtle gene regulatory mechanisms that are difficult to uncover using population averaged measurements. 
VL - 79 CP - 3 U1 - http://www.ncbi.nlm.nih.gov/pubmed/22045641?dopt=Abstract M3 - 10.1002/cyto.a.21028 ER - TY - CONF T1 - CHI 2011 sustainability community invited panel: challenges ahead T2 - 2011 annual conference extended abstracts on Human factors in computing systems Y1 - 2011 A1 - Khan,Azam A1 - Bartram,Lyn A1 - Blevis,Eli A1 - DiSalvo,Carl A1 - Jon Froehlich A1 - Kurtenbach,Gordon KW - design KW - Environment KW - sustainability; community KW - user behavior AB - As part of a new CHI Sustainability Community, focused on environmental sustainability, this panel will discuss specific ways in which HCI research will be critical in finding solutions to this global challenge. While research to date has primarily focused on the end consumer, the panel will be challenged with enlarging the discussion to include the designer as a target user and to consider interfaces and interactions that support sustainable design and sustainable manufacturing, as well as sustainable consumption. Specifically, to make real progress, we seek to enumerate ways that HCI needs to grow, as well as to find ways that can help more HCI researchers to become involved. JA - 2011 annual conference extended abstracts on Human factors in computing systems T3 - CHI EA '11 PB - ACM CY - New York, NY, USA UR - http://doi.acm.org/10.1145/1979482.1979484 M3 - 10.1145/1979482.1979484 ER - TY - RPRT T1 - Citation Handling for Improved Summarization of Scientific Documents Y1 - 2011 A1 - Whidby,Michael A1 - Zajic, David A1 - Dorr, Bonnie J KW - Technical Report AB - In this paper we present the first steps toward improving summarization of scientific documents through citation analysis and parsing. Prior work (Mohammad et al., 2009) argues that citation texts (sentences that cite other papers) play a crucial role in automatic summarization of a topical area, but did not take into account the noise introduced by the citations themselves.
We demonstrate that it is possible to improve summarization output through careful handling of these citations. We base our experiments on the application of an improved trimming approach to summarization of citation texts extracted from Question-Answering and Dependency-Parsing documents. We demonstrate that confidence scores from the Stanford NLP Parser (Klein and Manning, 2003) are significantly improved, and that Trimmer (Zajic et al., 2007), a sentence-compression tool, is able to generate higher-quality candidates. Our summarization output is currently used as part of a larger system, Action Science Explorer (ASE) (Gove, 2011). PB - Institute for Advanced Computer Studies, University of Maryland, College Park UR - http://drum.lib.umd.edu/handle/1903/11822 ER - TY - CONF T1 - Cloud software upgrades: Challenges and opportunities T2 - 2011 International Workshop on the Maintenance and Evolution of Service-Oriented and Cloud-Based Systems (MESOCA) Y1 - 2011 A1 - Neamtiu, I. A1 - Tudor Dumitras KW - Availability KW - Cloud computing KW - cloud computing software KW - cloud upgrade frequency KW - collision course KW - Electronic publishing KW - Encyclopedias KW - service level agreement KW - service-oriented architecture AB - The fast evolution pace for cloud computing software is on a collision course with our growing reliance on cloud computing. On one hand, cloud software must have the agility to evolve rapidly, in order to remain competitive; on the other hand, more and more critical services become dependent on the cloud and demand high availability through firm Service Level Agreements (SLAs) for cloud infrastructures. This race between the needs to increase both the cloud upgrade frequency and the service availability is unsustainable. In this paper we highlight challenges and opportunities for upgrades in the cloud.
We survey the release histories of several cloud applications to analyze their evolution pace, and we discuss the shortcomings with current cloud upgrade mechanisms. We outline several solutions for sustaining this evolution while improving availability, by focusing on the novel characteristics of cloud computing. By discussing several promising directions for realizing this vision, we propose a research agenda for the future of software upgrades in the cloud. JA - 2011 International Workshop on the Maintenance and Evolution of Service-Oriented and Cloud-Based Systems (MESOCA) ER - TY - JOUR T1 - Computing Morse decompositions for triangulated terrains: an analysis and an experimental evaluation JF - Image Analysis and Processing–ICIAP 2011 Y1 - 2011 A1 - Vitali,M. A1 - De Floriani, Leila A1 - Magillo,P. AB - We consider the problem of extracting the morphology of a terrain discretized as a triangle mesh. We discuss first how to transpose Morse theory to the discrete case in order to describe the morphology of triangulated terrains. We review algorithms for computing Morse decompositions, that we have adapted and implemented for triangulated terrains. We compare the Morse decompositions produced by them, by considering two different metrics. M3 - 10.1007/978-3-642-24085-0_58 ER - TY - CONF T1 - A Corpus-Guided Framework for Robotic Visual Perception T2 - Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence Y1 - 2011 A1 - Teo,C.L. A1 - Yang, Y. A1 - Daumé, Hal A1 - Fermüller, Cornelia A1 - Aloimonos, J.
JA - Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence ER - TY - CONF T1 - Correcting Errors in Digital Lexicographic Resources Using a Dictionary Manipulation Language T2 - Electronic lexicography in the 21st century: new applications for new users (eLEX2011) Y1 - 2011 A1 - Zajic, David A1 - Maxwell,Michael A1 - David Doermann A1 - Rodrigues,Paul A1 - Bloodgood,Michael AB - We describe a paradigm for combining manual and automatic error correction of noisy structured lexicographic data. Modifications to the structure and underlying text of the lexicographic data are expressed in a simple, interpreted programming language. Dictionary Manipulation Language (DML) commands identify nodes by unique identifiers, and manipulations are performed using simple commands such as create, move, set text, etc. Corrected lexicons are produced by applying sequences of DML commands to the source version of the lexicon. DML commands can be written manually to repair one-off errors or generated automatically to correct recurring problems. We discuss advantages of the paradigm for the task of editing digital bilingual dictionaries. 
JA - Electronic lexicography in the 21st century: new applications for new users (eLEX2011) ER - TY - RPRT T1 - Countering Botnets: Anomaly-Based Detection, Comprehensive Analysis, and Efficient Mitigation Y1 - 2011 A1 - Lee,Wenke A1 - Dagon,David A1 - Giffin,Jon A1 - Feamster, Nick A1 - Ollman,Gunter A1 - Westby,Jody A1 - Wesson,Rick A1 - Vixie,Paul KW - *ELECTRONIC SECURITY KW - *INFORMATION SECURITY KW - *INTERNET KW - *INTRUSION DETECTION(COMPUTERS) KW - algorithms KW - BGP ROUTE INJECTION KW - BGP(BORDER GATEWAY PROTOCOLS) KW - BOTNET DETECTION KW - BOTNET TRACEBACK AND ATTRIBUTION KW - BOTNETS(MALWARE) KW - CLIENT SERVER SYSTEMS KW - COMMUNICATIONS PROTOCOLS KW - COMPUTER PROGRAMMING AND SOFTWARE KW - COMPUTER SYSTEMS MANAGEMENT AND STANDARDS KW - CYBER ATTACKS KW - CYBER SECURITY KW - CYBERNETICS KW - CYBERTERRORISM KW - CYBERWARFARE KW - DATA PROCESSING SECURITY KW - DNS BASED MONITORING KW - DNS BASED REDIRECTION KW - DNS(DOMAIN NAME SYSTEMS) KW - INFORMATION SCIENCE KW - INTERNET BROWSERS KW - ISP(INTERNET SERVICE PROVIDERS) KW - MALWARE KW - MALWARE ANALYSIS KW - Online Systems KW - WUAFRLDHS1BOTN AB - We cover five general areas: (1) botnet detection, (2) botnet analysis, (3) botnet mitigation, (4) add-on tasks to the original contract, including the Conficker Working Group Lessons Learned, Layer-8 Exploration of Botnet Organization, and DREN research, and (5) commercialization in this paper. We have successfully developed new botnet detection and analysis capabilities in this project. These algorithms have been evaluated using real-world data, and have been put into actual, deployed systems. The most significant technical developments include a new dynamic reputation system for DNS domains, a scalable anomaly detection system for botnet detection in very large networks, and a transparent malware analysis system. In addition, on several occasions we have used our botnet data and analysis to help law enforcement agencies arrest botmasters.
We also have had great success transitioning technologies to commercial products that are now used by government agencies, ISPs, and major corporations. PB - GEORGIA TECH RESEARCH CORP ATLANTA UR - http://stinet.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA543919 ER - TY - CONF T1 - Creating contextual help for GUIs using screenshots T2 - Proceedings of the 24th annual ACM symposium on User interface software and technology Y1 - 2011 A1 - Tom Yeh A1 - Chang,Tsung-Hsiang A1 - Xie,Bo A1 - Walsh,Greg A1 - Watkins,Ivan A1 - Wongsuphasawat,Krist A1 - Huang,Man A1 - Davis, Larry S. A1 - Bederson, Benjamin B. KW - contextual help KW - help KW - pixel analysis AB - Contextual help is effective for learning how to use GUIs by showing instructions and highlights on the actual interface rather than in a separate viewer. However, end-users and third-party tech support typically cannot create contextual help to assist other users because it requires programming skill and source code access. We present a creation tool for contextual help that allows users to apply common computer skills: taking screenshots and writing simple scripts. We perform pixel analysis on screenshots to make this tool applicable to a wide range of applications and platforms without source code access. We evaluated the tool's usability with three groups of participants: developers, instructors, and tech support. We further validated the applicability of our tool with 60 real tasks supported by the tech support of a university campus.
JA - Proceedings of the 24th annual ACM symposium on User interface software and technology T3 - UIST '11 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0716-1 UR - http://doi.acm.org/10.1145/2047196.2047214 M3 - 10.1145/2047196.2047214 ER - TY - CONF T1 - Cross Language Entity Linking T2 - IJCNLP: International Joint Conference on Natural Language Processing Y1 - 2011 A1 - McNamee,Paul A1 - Mayfield,James A1 - Lawrie,Dawn A1 - Oard, Douglas A1 - David Doermann AB - There has been substantial recent interest in aligning mentions of named entities in unstructured texts to knowledge base descriptors, a task commonly called entity linking. This technology is crucial for applications in knowledge discovery and text data mining. This paper presents experiments in the new problem of cross language entity linking, where documents and named entities are in a different language than that used for the content of the reference knowledge base. We have created a new test collection to evaluate cross-language entity linking performance in twenty-one languages. We present experiments that examine issues such as: the importance of transliteration; the utility of cross-language information retrieval; and, the potential benefit of multilingual named entity recognition. Our best model achieves performance which is 94% of a strong monolingual baseline. JA - IJCNLP: International Joint Conference on Natural Language Processing ER - TY - CONF T1 - Cross-Language Entity Linking in Maryland during a Hurricane T2 - TAC Y1 - 2011 A1 - McNamee,Paul A1 - Mayfield,James A1 - Oard, Douglas A1 - David Doermann A1 - Xu,Tan A1 - Wu,Ke AB - Our team from the JHU HLTCOE and the University of Maryland submitted runs for all three variants of the TAC-KBP entity linking task.
For the monolingual tasks, we essentially mirrored our HLTCOE TAC-KBP 2010 submission, making only modest changes to accommodate differences in 2011, namely the requirement to cluster NIL responses, and the change in evaluation measure. However, our work on the cross-lingual task was significantly more involved, requiring development of robust, multiphased transliteration software, use of techniques in cross-language information retrieval, and reliance on a Chinese-to-English statistical machine translation system. In this paper we describe our work for the 2011 evaluation and the results we obtained. JA - TAC ER - TY - CONF T1 - Declarative analysis of noisy information networks T2 - 2011 IEEE 27th International Conference on Data Engineering Workshops (ICDEW) Y1 - 2011 A1 - Moustafa,W. E A1 - Namata,G. A1 - Deshpande, Amol A1 - Getoor, Lise KW - Cleaning KW - Data analysis KW - data cleaning operations KW - data management system KW - data mining KW - Databases KW - Datalog KW - declarative analysis KW - graph structure KW - information networks KW - Noise measurement KW - noisy information networks KW - Prediction algorithms KW - semantics KW - Syntactics AB - There is a growing interest in methods for analyzing data describing networks of all types, including information, biological, physical, and social networks. Typically the data describing these networks is observational, and thus noisy and incomplete; it is often at the wrong level of fidelity and abstraction for meaningful data analysis. This has resulted in a growing body of work on extracting, cleaning, and annotating network data. Unfortunately, much of this work is ad hoc and domain-specific. In this paper, we present the architecture of a data management system that enables efficient, declarative analysis of large-scale information networks.
We identify a set of primitives to support the extraction and inference of a network from observational data, and describe a framework that enables a network analyst to easily implement and combine new extraction and analysis techniques, and efficiently apply them to large observation networks. The key insight behind our approach is to decouple, to the extent possible, (a) the operations that require traversing the graph structure (typically the computationally expensive step), from (b) the operations that do the modification and update of the extracted network. We present an analysis language based on Datalog, and show how to use it to cleanly achieve such decoupling. We briefly describe our prototype system that supports these abstractions. We include a preliminary performance evaluation of the system and show that our approach scales well and can efficiently handle a wide spectrum of data cleaning operations on network data. JA - 2011 IEEE 27th International Conference on Data Engineering Workshops (ICDEW) PB - IEEE SN - 978-1-4244-9195-7 M3 - 10.1109/ICDEW.2011.5767619 ER - TY - CONF T1 - A Decomposition-based Approach to Modeling and Understanding Arbitrary Shapes T2 - Eurographics Italian Chapter Conference 2011 Y1 - 2011 A1 - Canino,D. A1 - De Floriani, Leila JA - Eurographics Italian Chapter Conference 2011 ER - TY - JOUR T1 - Design of Revolute Joints for In-Mold Assembly Using Insert Molding JF - Journal of Mechanical Design Y1 - 2011 A1 - Ananthanarayanan,A. A1 - Ehrlich,L. A1 - Desai,J. P. A1 - Gupta,S.K. VL - 133 ER - TY - CONF T1 - Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling T2 - Electronic lexicography in the 21st century: new applications for new users (eLEX2011) Y1 - 2011 A1 - Rodrigues,Paul A1 - Zajic, David A1 - Bloodgood,Michael A1 - Ye,Peng A1 - David Doermann AB - Dictionaries are often developed using Extensible Markup Language (XML)-based standards. 
Very often, these standards allow many high-level repeating elements to represent lexical entries, and utilize descendants of these repeating elements to represent the structure within each lexical entry, in the form of an XML tree. In many cases, dictionaries are published that have errors and inconsistencies that would be too expensive to find manually. This paper discusses a method for dictionary writers to audit structural regularity across entries in a dictionary, quickly, by using statistical language modelling. The approach learns the patterns of XML nodes that could occur within an XML tree, and then calculates the probability of each XML tree in the dictionary against these patterns to look for entries that diverge from the norm. JA - Electronic lexicography in the 21st century: new applications for new users (eLEX2011) ER - TY - JOUR T1 - Dimension-independent simplification and refinement of Morse complexes JF - Graphical Models Y1 - 2011 A1 - Čomić,Lidija A1 - De Floriani, Leila KW - Morse complexes KW - Morse theory KW - Refinement KW - shape modeling KW - simplification KW - Topological representations AB - Ascending and descending Morse complexes, determined by a scalar field f defined over a manifold M, induce a subdivision of M into regions associated with critical points of f, and compactly represent the topology of M. We define two simplification operators on Morse complexes, which work in arbitrary dimensions, and we define their inverse refinement operators. We describe how simplification and refinement operators affect Morse complexes on M, and we show that these operators form a complete set of atomic operators to create and update Morse complexes on M. Thus, any operator that modifies Morse complexes on M can be expressed as a suitable sequence of the atomic simplification and refinement operators we have defined. 
The simplification and refinement operators also provide a suitable basis for the construction of a multi-resolution representation of Morse complexes. VL - 73 SN - 1524-0703 UR - http://www.sciencedirect.com/science/article/pii/S1524070311000154 CP - 5 M3 - 10.1016/j.gmod.2011.05.001 ER - TY - RPRT T1 - Discrete Curvature Estimators: an Experimental Evaluation Y1 - 2011 A1 - Mesmoudi,M. M. A1 - De Floriani, Leila A1 - Magillo,P. AB - In this paper, we consider a surface, embedded in 3D space, which is discretely approximated as a triangle mesh. We define a method for estimating mean and Gaussian curvatures of such a discrete surface, based on simulating the analytic definition of mean and Gaussian curvatures in the continuum. This method leads to what we call mean and Gaussian Ccurvatures. We present experiments evaluating such curvatures. PB - Department of Computer Science and Information Science, University of Genoa VL - DISI-TR-11-12 ER - TY - CONF T1 - Document Image Classification and Labeling using Multiple Instance Learning T2 - Intl. Conf. on Document Analysis and Recognition (ICDAR 11) Y1 - 2011 A1 - Kumar,Jayant A1 - Pillai,Jaishanker A1 - David Doermann AB - The labeling of large sets of images for training or testing analysis systems can be a very costly and time-consuming process. Multiple instance learning (MIL) is a generalization of traditional supervised learning which relaxes the need for exact labels on training instances. Instead, the labels are required only for a set of instances known as bags. In this paper, we apply MIL to the retrieval and localization of signatures and the retrieval of images containing machine-printed text, and show that a gain of 15-20% in performance can be achieved over supervised learning with weak labeling. We also compare our approach to supervised learning with fully annotated training data and report a competitive accuracy for MIL.
Using our experiments on real-world datasets, we show that MIL is a good alternative when the training data has only document-level annotation. JA - Intl. Conf. on Document Analysis and Recognition (ICDAR 11) ER - TY - JOUR T1 - EAAI-10: The First Symposium on Educational Advances in Artificial Intelligence JF - AI Magazine Y1 - 2011 A1 - desJardins, Marie A1 - Sahami,Mehran A1 - Wagstaff,Kiri KW - education, AI, symposium VL - 32 SN - 0738-4602 UR - http://www.aaai.org/ojs/index.php/aimagazine/article/view/2323 CP - 1 M3 - 10.1609/aimag.v32i1.2323 ER - TY - CONF T1 - Educational advances in artificial intelligence T2 - Proceedings of the 42nd ACM technical symposium on Computer science education Y1 - 2011 A1 - Sahami,Mehran A1 - desJardins, Marie A1 - Dodds,Zachary A1 - Neller,Todd KW - artificial intelligence education KW - model AI assignments AB - In 2010 a new annual symposium on Educational Advances in Artificial Intelligence (EAAI) was launched as part of the AAAI annual meeting. The event was held in cooperation with ACM SIGCSE and has many similar goals related to broadening and disseminating work in computer science education. EAAI has a particular focus, however, as the event is specific to educational work in Artificial Intelligence and collocated with a major research conference (AAAI) to promote more interaction between researchers and educators in that domain. This panel seeks to introduce participants to EAAI as a way of fostering more interaction between educational communities in computing. Specifically, the panel will discuss the goals of EAAI, provide an overview of the kinds of work presented at the symposium, and identify potential synergies between that EAAI and SIGCSE as a way of better linking the two communities going forward. 
JA - Proceedings of the 42nd ACM technical symposium on Computer science education T3 - SIGCSE '11 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0500-6 UR - http://doi.acm.org/10.1145/1953163.1953189 M3 - 10.1145/1953163.1953189 ER - TY - CONF T1 - Efficient and secure threshold-based event validation for VANETs Y1 - 2011 A1 - Hsiao, H. C. A1 - Studer, A. A1 - Dubey, R. A1 - Elaine Shi A1 - Perrig, A. AB - Determining whether the number of vehicles reporting an event is above a threshold is an important mechanism for VANETs, because many applications rely on a threshold number of notifications to reach agreement among vehicles, to determine the validity of an event, or to prevent the abuse of emergency alarms. We present the first efficient and secure threshold-based event validation protocol for VANETs. Quite counter-intuitively, we found that the z-smallest approach offers the best tradeoff between security and efficiency since other approaches perform better for probabilistic counting. Analysis and simulation show that our protocol provides > 99% accuracy despite the presence of attackers, collection and distribution of alerts in less than 1 second, and negligible impact on network performance. UR - http://www.eecs.berkeley.edu/~elaines/docs/hsiao_wisec2011.pdf ER - TY - JOUR T1 - Efficient Parallel Non-Negative Least Squares on Multi-Core Architectures JF - SIAM Journal on Scientific Computing Y1 - 2011 A1 - Luo,Y. A1 - Duraiswami, Ramani AB - We parallelize a version of the active-set iterative algorithm derived from the original works of Lawson and Hanson [Solving Least Squares Problems, Prentice-Hall, 1974] on multicore architectures. This algorithm requires the solution of an unconstrained least squares problem in every step of the iteration for a matrix composed of the passive columns of the original system matrix.
To achieve improved performance, we use parallelizable procedures to efficiently update and downdate the QR factorization of the matrix at each iteration, to account for inserted and removed columns. We use a reordering strategy of the columns in the decomposition to reduce computation and memory access costs. We consider graphics processing units (GPUs) as a new mode for efficient parallel computations and compare our implementations to those of multicore CPUs. Both synthetic and nonsynthetic data are used in the experiments. VL - 33 CP - 5 ER - TY - JOUR T1 - Energy efficient monitoring in sensor networks JF - Algorithmica Y1 - 2011 A1 - Deshpande, Amol A1 - Khuller, Samir A1 - Malekian,A. A1 - Toossi,M. AB - We study a set of problems related to efficient battery energy utilization for monitoring applications in a wireless sensor network with the goal to increase the sensor network lifetime. We study several generalizations of a basic problem called Set k-Cover. The problem can be described as follows: we are given a set of sensors, and a set of targets to be monitored. Each target can be monitored by a subset of the sensors. To increase the lifetime of the sensor network, we would like to partition the sensors into k sets (or time-slots), and activate each set of sensors in a different time-slot, thus extending the battery life of the sensors by a factor of k. The goal is to find a partitioning that maximizes the total coverage of the targets for a given k. This problem is known to be NP-hard. We develop an improved approximation algorithm for this problem using a reduction to Max k-Cut. Moreover, we are able to demonstrate that this algorithm is efficient, and yields almost optimal solutions in practice. We also consider generalizations of this problem in several different directions. First, we allow each sensor to be active in α different sets (time-slots).
This means that the battery life is extended by a factor of k/α, and allows for a richer space of solutions. We also consider different coverage requirements, such as requiring that all targets, or at least a certain number of targets, be covered in each time slot. In the Set k-Cover formulation, there is no requirement that a target be monitored at all, or in any number of time slots. We develop a randomized rounding algorithm for this problem. We also consider extensions where each sensor can monitor only a bounded number of targets in any time-slot, and not all the targets adjacent to it. This kind of problem may arise when a sensor has a directional camera, or some other physical constraint might prevent it from monitoring all adjacent targets even when it is active. We develop the first approximation algorithms for this problem. VL - 59 CP - 1 M3 - 10.1007/s00453-010-9407-z ER - TY - CONF T1 - Evaluating visual and statistical exploration of scientific literature networks T2 - 2011 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) Y1 - 2011 A1 - Gove,R. A1 - Dunne,C. A1 - Shneiderman, Ben A1 - Klavans,J.
A1 - Dorr, Bonnie J KW - abstracting KW - academic literature KW - action science explorer KW - automatic clustering KW - citation analysis KW - citation network visualization KW - Communities KW - Context KW - custom exploration goal KW - Data visualization KW - Databases KW - Document filtering KW - document handling KW - document ranking KW - easy-to-understand metrics KW - empirical evaluation KW - Google KW - Graphical user interfaces KW - Information filtering KW - Information Visualization KW - Libraries KW - literature exploration KW - network statistics KW - paper filtering KW - paper ranking KW - scientific literature network KW - statistical exploration KW - summarization technique KW - user-defined tasks KW - visual exploration KW - Visualization AB - Action Science Explorer (ASE) is a tool designed to support users in rapidly generating readily consumable summaries of academic literature. It uses citation network visualization, ranking and filtering papers by network statistics, and automatic clustering and summarization techniques. We describe how early formative evaluations of ASE led to a mature system evaluation, consisting of an in-depth empirical evaluation with four domain experts. The evaluation tasks were of two types: predefined tasks to test system performance in common scenarios, and user-defined tasks to test the system's usefulness for custom exploration goals. The primary contribution of this paper is a validation of the ASE design and recommendations to provide: easy-to-understand metrics for ranking and filtering documents, user control over which document sets to explore, and overviews of the document set in coordinated views along with details-on-demand of specific papers. We contribute a taxonomy of features for literature search and exploration tools and describe exploration goals identified by our participants. 
JA - 2011 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) PB - IEEE SN - 978-1-4577-1246-3 M3 - 10.1109/VLHCC.2011.6070403 ER - TY - CHAP T1 - Face Tracking and Recognition in Video T2 - Handbook of Face Recognition Y1 - 2011 A1 - Chellappa, Rama A1 - Du,Ming A1 - Turaga,Pavan A1 - Zhou,Shaohua Kevin ED - Li,Stan Z. ED - Jain,Anil K. AB - In this chapter, we describe the utility of videos in enhancing performance of image-based recognition tasks. We discuss a joint tracking-recognition framework that allows for using the motion information in a video to better localize and identify the person in the video using still galleries. We discuss how to jointly capture facial appearance and dynamics to obtain a parametric representation for video-to-video recognition. We discuss recognition in multi-camera networks where the probe and gallery both consist of multi-camera videos. Concluding remarks and directions for future research are provided. JA - Handbook of Face Recognition PB - Springer London SN - 978-0-85729-932-1 UR - http://dx.doi.org/10.1007/978-0-85729-932-1_13 ER - TY - CONF T1 - Face verification using large feature sets and one shot similarity T2 - Biometrics (IJCB), 2011 International Joint Conference on Y1 - 2011 A1 - Guo,Huimin A1 - Robson Schwartz,W. A1 - Davis, Larry S. KW - face recognition KW - least squares approximations KW - regression analysis KW - set theory KW - LFW KW - PLS KW - PLS regression KW - color information KW - texture information KW - face verification KW - facial descriptor KW - labeled face in the wild KW - large feature sets KW - one shot similarity KW - partial least squares KW - shape information AB - We present a method for face verification that combines Partial Least Squares (PLS) and the One-Shot similarity model [28]. First, a large feature set combining shape, texture and color information is used to describe a face.
Then PLS is applied to reduce the dimensionality of the feature set with multi-channel feature weighting. This provides a discriminative facial descriptor. PLS regression is used to compute the similarity score of an image pair by One-Shot learning. Given two feature vectors representing face images, the One-Shot algorithm learns discriminative models exclusively for the vectors being compared. A small set of unlabeled images, not containing images belonging to the people being compared, is used as a reference (negative) set. The approach is evaluated on the Labeled Face in the Wild (LFW) benchmark and shows very comparable results to the state-of-the-art methods (achieving 86.12% classification accuracy) while maintaining simplicity and good generalization ability. JA - Biometrics (IJCB), 2011 International Joint Conference on M3 - 10.1109/IJCB.2011.6117498 ER - TY - CONF T1 - Fast Rule-line Removal using Integral Images and Support Vector Machines T2 - Intl. Conf. on Document Analysis and Recognition (ICDAR 11) Y1 - 2011 A1 - Kumar,Jayant A1 - David Doermann AB - In this paper, we present a fast and effective method for removing pre-printed rule-lines in handwritten document images. We use an integral-image representation which allows fast computation of features and apply techniques for large scale Support Vector learning using a data selection strategy to sample a small subset of training data. Results on both constructed and real-world data sets show that the method is effective for rule-line removal. We compare our method to a subspace-based method and show that better accuracy can be achieved in considerably less time. The integral-image based features proposed in the paper are generic and can be applied to other problems as well. JA - Intl. Conf.
on Document Analysis and Recognition (ICDAR 11) ER - TY - JOUR T1 - Gene Coexpression Network Topology of Cardiac Development, Hypertrophy, and Failure JF - Circulation: Cardiovascular Genetics Y1 - 2011 A1 - Dewey,Frederick E A1 - Perez,Marco V A1 - Wheeler,Matthew T A1 - Watt,Clifton A1 - Spin,Joshua A1 - Langfelder,Peter A1 - Horvath,Steve A1 - Hannenhalli, Sridhar A1 - Cappola,Thomas P. A1 - Ashley,Euan A KW - fetal KW - Gene expression KW - heart failure KW - hypertrophy KW - myocardium AB - Background— Network analysis techniques allow a more accurate reflection of underlying systems biology to be realized than traditional unidimensional molecular biology approaches. Using gene coexpression network analysis, we define the gene expression network topology of cardiac hypertrophy and failure and the extent of recapitulation of fetal gene expression programs in failing and hypertrophied adult myocardium. Methods and Results— We assembled all myocardial transcript data in the Gene Expression Omnibus (n=1617). Because hierarchical analysis revealed species had primacy over disease clustering, we focused this analysis on the most complete (murine) dataset (n=478). Using gene coexpression network analysis, we derived functional modules, regulatory mediators, and higher-order topological relationships between genes and identified 50 gene coexpression modules in developing myocardium that were not present in normal adult tissue. We found that known gene expression markers of myocardial adaptation were members of upregulated modules but not hub genes. We identified ZIC2 as a novel transcription factor associated with coexpression modules common to developing and failing myocardium. Of 50 fetal gene coexpression modules, 3 (6%) were reproduced in hypertrophied myocardium and 7 (14%) were reproduced in failing myocardium. One fetal module was common to both failing and hypertrophied myocardium.
Conclusions— Network modeling allows systems analysis of cardiovascular development and disease. Although we did not find evidence for a global coordinated program of fetal gene expression in adult myocardial adaptation, our analysis revealed specific gene expression modules active during both development and disease and specific candidates for their regulation. VL - 4 SN - 1942-325X, 1942-3268 UR - http://circgenetics.ahajournals.org/content/4/1/26 CP - 1 M3 - 10.1161/CIRCGENETICS.110.941757 ER - TY - CONF T1 - GPU algorithms for diamond-based multiresolution terrain processing T2 - Eurographics Symposium on Parallel Graphics and Visualization Y1 - 2011 A1 - Yalçın,M. A. A1 - Weiss,K. A1 - De Floriani, Leila AB - We present parallel algorithms for processing, extracting and rendering adaptively sampled regular terrain datasets represented as a multiresolution model defined by a super-square-based diamond hierarchy. This model represents a terrain as a nested triangle mesh generated through a series of longest edge bisections and encoded in an implicit hierarchical structure, which clusters triangles into diamonds and diamonds into super-squares. We decompose the problem into three parallel algorithms for performing: generation of the diamond hierarchy from a regularly distributed terrain dataset, selective refinement on the diamond hierarchy and generation of the corresponding crack-free triangle mesh for processing and rendering. We avoid the data transfer bottleneck common to previous approaches by processing all data entirely on the GPU. We demonstrate that this parallel approach can be successfully applied to interactive terrain visualization with a high tessellation quality on commodity GPUs.
JA - Eurographics Symposium on Parallel Graphics and Visualization ER - TY - JOUR T1 - Hawkeye and AMOS: Visualizing and Assessing the Quality of Genome Assemblies JF - Briefings in Bioinformatics Y1 - 2011 A1 - Schatz,Michael C A1 - Phillippy,Adam M A1 - Sommer,Daniel D A1 - Delcher,Arthur L. A1 - Puiu,Daniela A1 - Narzisi,Giuseppe A1 - Salzberg,Steven L. A1 - Pop, Mihai KW - assembly forensics KW - DNA Sequencing KW - genome assembly KW - visual analytics AB - Since its launch in 2004, the open-source AMOS project has released several innovative DNA sequence analysis applications including: Hawkeye, a visual analytics tool for inspecting the structure of genome assemblies; the Assembly Forensics and FRCurve pipelines for systematically evaluating the quality of a genome assembly; and AMOScmp, the first comparative genome assembler. These applications have been used to assemble and analyze dozens of genomes ranging in complexity from simple microbial species through mammalian genomes. Recent efforts have been focused on enhancing support for new data characteristics brought on by second- and now third-generation sequencing. This review describes the major components of AMOS in light of these challenges, with an emphasis on methods for assessing assembly quality and the visual analytics capabilities of Hawkeye. These interactive graphical aspects are essential for navigating and understanding the complexities of a genome assembly, from the overall genome structure down to individual bases. Hawkeye and AMOS are available open source at http://amos.sourceforge.net. SN - 1467-5463, 1477-4054 UR - http://bib.oxfordjournals.org/content/early/2011/12/23/bib.bbr074 M3 - 10.1093/bib/bbr074 ER - TY - CONF T1 - Helping Users Shop for ISPs with Internet Nutrition Labels T2 - ACM SIGCOMM Workshop on Home Networks '11 Y1 - 2011 A1 - Sundaresan, Srikanth A1 - Feamster, Nick A1 - Teixeira, Renata A1 - Tang, Anthony A1 - Edwards, W. Keith A1 - Grinter, Rebecca E. 
A1 - Marshini Chetty A1 - de Donato, Walter KW - access networks KW - benchmarking KW - bismark KW - broadband networks KW - gateway measurements AB - When purchasing home broadband access from Internet service providers (ISPs), users must decide which service plans are most appropriate for their needs. Today, ISPs advertise their available service plans using only generic upload and download speeds. Unfortunately, these metrics do not always accurately reflect the varying performance that home users will experience for a wide range of applications. In this paper, we propose that each ISP service plan carry a "nutrition label" that conveys more comprehensive information about network metrics along many dimensions, including various aspects of throughput, latency, loss rate, and jitter. We first justify why these metrics should form the basis of a network nutrition label. Then, we demonstrate that current plans that are superficially similar with respect to advertised download rates may have different performance according to the label metrics. We close with a discussion of the challenges involved in presenting a nutrition label to users in a way that is both accurate and easy to understand. JA - ACM SIGCOMM Workshop on Home Networks '11 T3 - HomeNets '11 PB - ACM SN - 978-1-4503-0798-7 UR - http://doi.acm.org/10.1145/2018567.2018571 ER - TY - JOUR T1 - A Hierarchical Algorithm for Fast Debye Summation with Applications to Small Angle Scattering JF - Technical Reports from UMIACS Y1 - 2011 A1 - Gumerov, Nail A. A1 - Berlin,Konstantin A1 - Fushman, David A1 - Duraiswami, Ramani KW - Technical Report AB - Debye summation, which involves the summation of sinc functions of distances between all pair of atoms in three dimensional space, arises in computations performed in crystallography, small/wide angle X-ray scattering (SAXS/WAXS) and small angle neutron scattering (SANS). 
Direct evaluation of Debye summation has quadratic complexity, which results in a computational bottleneck when determining crystal properties, or running structure refinement protocols that involve SAXS or SANS, even for moderately sized molecules. We present a fast approximation algorithm that efficiently computes the summation to any prescribed accuracy epsilon in linear time. The algorithm is similar to the fast multipole method (FMM), and is based on a hierarchical spatial decomposition of the molecule coupled with local harmonic expansions and translation of these expansions. An even more efficient implementation is possible when the scattering profile is all that is required, as in small angle scattering reconstruction (SAS) of macromolecules. We examine the relationship of the proposed algorithm to existing approximate methods for profile computations, and provide a detailed description of the algorithm, including error bounds and algorithms for stable computation of the translation operators. Our theoretical and computational results show orders of magnitude improvement in computational complexity over existing methods, while maintaining prescribed accuracy. UR - http://drum.lib.umd.edu/handle/1903/11857 ER - TY - JOUR T1 - IA*: An adjacency-based representation for non-manifold simplicial shapes in arbitrary dimensions JF - Computers & Graphics Y1 - 2011 A1 - Canino,David A1 - De Floriani, Leila A1 - Weiss,Kenneth KW - Non-manifold data structures KW - simplicial complexes KW - Topological data structures AB - We propose a compact, dimension-independent data structure for manifold, non-manifold and non-regular simplicial complexes, that we call the Generalized Indexed Data Structure with Adjacencies (IA⁎ data structure). It encodes only top simplices, i.e. the ones that are not on the boundary of any other simplex, plus a suitable subset of the adjacency relations.
We describe the IA⁎ data structure in arbitrary dimensions, and compare the storage requirements of its 2D and 3D instances with both dimension-specific and dimension-independent representations. We show that the IA⁎ data structure is more cost effective than other dimension-independent representations and is even slightly more compact than the existing dimension-specific ones. We present efficient algorithms for navigating a simplicial complex described as an IA⁎ data structure. This shows that the IA⁎ data structure allows retrieving all topological relations of a given simplex by considering only its local neighborhood and thus it is a more efficient alternative to incidence-based representations when information does not need to be encoded for boundary simplices. VL - 35 SN - 0097-8493 UR - http://www.sciencedirect.com/science/article/pii/S0097849311000483 CP - 3 M3 - 10.1016/j.cag.2011.03.009 ER - TY - CONF T1 - Image ranking and retrieval based on multi-attribute queries T2 - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Y1 - 2011 A1 - Siddiquie,B. A1 - Feris,R.S. A1 - Davis, Larry S. KW - datasets;image KW - datasets;PASCAL KW - faces KW - FaceTracer KW - in KW - methods;labeled KW - queries;image KW - ranking;image KW - retrieval KW - retrieval; KW - the KW - VOC KW - wild;multiattribute AB - We propose a novel approach for ranking and retrieval of images based on multi-attribute queries. Existing image retrieval methods train separate classifiers for each word and heuristically combine their outputs for retrieving multiword queries. Moreover, these approaches also ignore the interdependencies among the query terms. In contrast, we propose a principled approach for multi-attribute retrieval which explicitly models the correlations that are present between the attributes. Given a multi-attribute query, we also utilize other attributes in the vocabulary which are not present in the query, for ranking/retrieval. 
Furthermore, we integrate ranking and retrieval within the same formulation, by posing them as structured prediction problems. Extensive experimental evaluation on the Labeled Faces in the Wild (LFW), FaceTracer and PASCAL VOC datasets shows that our approach significantly outperforms several state-of-the-art ranking and retrieval methods. JA - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on M3 - 10.1109/CVPR.2011.5995329 ER - TY - JOUR T1 - Increased gene sampling provides stronger support for higher-level groups within gracillariid leaf mining moths and relatives (Lepidoptera: Gracillariidae) JF - BMC Evol Biol Y1 - 2011 A1 - Kawahara,A. Y A1 - Ohshima,I A1 - Kawakita,A A1 - Regier,J. C A1 - Mitter,C A1 - Cummings, Michael P. A1 - Davis,DR A1 - Wagner,DL A1 - De Prins,J A1 - Lopez-Vaamonde,C VL - 11:182 ER - TY - JOUR T1 - Instrumenting home networks JF - SIGCOMM Comput. Commun. Rev. Y1 - 2011 A1 - Calvert,Kenneth L. A1 - Edwards,W. Keith A1 - Feamster, Nick A1 - Grinter,Rebecca E. A1 - Deng,Ye A1 - Zhou,Xuzi KW - home network management KW - home network troubleshooting AB - In managing and troubleshooting home networks, one of the challenges is in knowing what is actually happening. Availability of a record of events that occurred on the home network before trouble appeared would go a long way toward addressing that challenge. In this position/work-in-progress paper, we consider requirements for a general-purpose logging facility for home networks. Such a facility, if properly designed, would potentially have other uses. We describe several such uses and discuss requirements to be considered in the design of a logging platform that would be widely supported and accepted. We also report on our initial deployment of such a facility.
VL - 41 SN - 0146-4833 UR - http://doi.acm.org/10.1145/1925861.1925875 CP - 1 M3 - 10.1145/1925861.1925875 ER - TY - JOUR T1 - An iterative algorithm for homology computation on simplicial shapes JF - Computer-Aided Design Y1 - 2011 A1 - Boltcheva,Dobrina A1 - Canino,David A1 - Merino Aceituno,Sara A1 - Léon,Jean-Claude A1 - De Floriani, Leila A1 - Hétroy,Franck KW - Computational topology KW - Generators KW - Mayer–Vietoris sequence KW - shape decomposition KW - simplicial complexes KW - Z-homology AB - We propose a new iterative algorithm for computing the homology of arbitrary shapes discretized through simplicial complexes. We demonstrate how the simplicial homology of a shape can be effectively expressed in terms of the homology of its sub-components. The proposed algorithm retrieves the complete homological information of an input shape including the Betti numbers, the torsion coefficients and the representative homology generators. To the best of our knowledge, this is the first algorithm based on the constructive Mayer–Vietoris sequence, which relates the homology of a topological space to the homologies of its sub-spaces, i.e. the sub-components of the input shape and their intersections. We demonstrate the validity of our approach through a specific shape decomposition, based only on topological properties, which minimizes the size of the intersections between the sub-components and increases the efficiency of the algorithm. VL - 43 SN - 0010-4485 UR - http://www.sciencedirect.com/science/article/pii/S0010448511002144 CP - 11 M3 - 10.1016/j.cad.2011.08.015 ER - TY - CONF T1 - Kernel partial least squares for speaker recognition T2 - Twelfth Annual Conference of the International Speech Communication Association Y1 - 2011 A1 - Srinivasan,B.V. A1 - Garcia-Romero,D. A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani AB - I-vectors are a concise representation of speaker characteristics.
Recent advances in speaker recognition have utilized their ability to capture speaker and channel variability to develop efficient recognition engines. Inter-speaker relationships in the i-vector space are non-linear. Accomplishing effective speaker recognition requires a good modeling of these non-linearities and can be cast as a machine learning problem. In this paper, we propose a kernel partial least squares (kernel PLS, or KPLS) framework for modeling speakers in the i-vectors space. The resulting recognition system is tested across several conditions of the NIST SRE 2010 extended core data set and compared against state-of-the-art systems: Joint Factor Analysis (JFA), Probabilistic Linear Discriminant Analysis (PLDA), and Cosine Distance Scoring (CDS) classifiers. Improvements are shown. JA - Twelfth Annual Conference of the International Speech Communication Association ER - TY - CONF T1 - Kernel PLS regression for robust monocular pose estimation T2 - Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on Y1 - 2011 A1 - Dondera,R. A1 - Davis, Larry S. KW - (computer KW - 3D KW - analysis;rendering KW - correlations;projection KW - detection;monocular KW - detection;pose KW - estimation;Gaussian KW - estimation;nonlinear KW - estimation;regression KW - GP KW - graphics); KW - images;rendering KW - latent KW - monocular KW - PLS KW - pose KW - process;Kernel KW - processes;object KW - regression;Gaussian KW - regression;human KW - software;robust KW - structures;realistic KW - to AB - We evaluate the robustness of five regression techniques for monocular 3D pose estimation. While most of the discriminative pose estimation methods focus on overcoming the fundamental problem of insufficient training data, we are interested in characterizing performance improvement for increasingly large training sets. 
Commercially available rendering software allows us to efficiently generate large numbers of realistic images of poses from diverse actions. Inspired by recent work in human detection, we apply PLS and kPLS regression to pose estimation. We observe that kPLS regression incrementally approximates GP regression using the strongest nonlinear correlations between image features and pose. This provides robustness, and our experiments show kPLS regression is more robust than two GP-based state-of-the-art methods for pose estimation. JA - Computer Vision and Pattern Recognition Workshops (CVPRW), 2011 IEEE Computer Society Conference on M3 - 10.1109/CVPRW.2011.5981750 ER - TY - JOUR T1 - Kernelized Renyi distance for subset selection and similarity scoring JF - Technical Reports of the Computer Science Department Y1 - 2011 A1 - Srinivasan,Balaji Vasan A1 - Duraiswami, Ramani KW - Technical Report AB - Renyi entropy refers to a generalized class of entropies that have been used in several applications. In this work, we derive a non-parametric distance between distributions based on the quadratic Renyi entropy. The distributions are estimated via Parzen density estimates. The quadratic complexity of the distance evaluation is mitigated with GPU-based parallelization. This results in an efficiently evaluated non-parametric entropic distance - the kernelized Renyi distance or the KRD. We adapt the KRD into a similarity measure and show its application to speaker recognition. We further extend KRD to measure dissimilarities between distributions and illustrate its applications to statistical subset selection and dictionary learning for object recognition and pose estimation. UR - http://drum.lib.umd.edu/handle/1903/12132 ER - TY - CONF T1 - A large-scale benchmark dataset for event recognition in surveillance video T2 - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Y1 - 2011 A1 - Oh,Sangmin A1 - Hoogs, A. A1 - Perera,A. A1 - Cuntoor, N.
A1 - Chen,Chia-Chih A1 - Lee,Jong Taek A1 - Mukherjee,S. A1 - Aggarwal, JK A1 - Lee,Hyungtae A1 - Davis, Larry S. A1 - Swears,E. A1 - Wang,Xioyang A1 - Ji,Qiang A1 - Reddy,K. A1 - Shah,M. A1 - Vondrick,C. A1 - Pirsiavash,H. A1 - Ramanan,D. A1 - Yuen,J. A1 - Torralba,A. A1 - Song,Bi A1 - Fong,A. A1 - Roy-Chowdhury, A. A1 - Desai,M. KW - algorithm;evaluation KW - CVER KW - databases; KW - databases;video KW - dataset;moving KW - event KW - metrics;large-scale KW - object KW - recognition KW - recognition;diverse KW - recognition;video KW - scenes;surveillance KW - surveillance;visual KW - tasks;computer KW - tracks;outdoor KW - video KW - video;computer KW - vision;continuous KW - vision;image KW - visual AB - We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15, 8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not apply effectively to surveillance videos. Our dataset consists of many outdoor scenes with actions occurring naturally by non-actors in continuously captured videos of the real world. The dataset includes large numbers of instances for 23 event types distributed throughout 29 hours of video. This data is accompanied by detailed annotations which include both moving object tracks and event examples, which will provide a solid basis for large-scale evaluation. Additionally, we propose different types of evaluation modes for visual recognition tasks and evaluation metrics along with our preliminary experimental results. We believe that this dataset will stimulate diverse aspects of computer vision research and help us to advance the CVER tasks in the years ahead.
JA - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on M3 - 10.1109/CVPR.2011.5995586 ER - TY - CONF T1 - Learning a discriminative dictionary for sparse coding via label consistent K-SVD T2 - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Y1 - 2011 A1 - Zhuolin Jiang A1 - Zhe Lin A1 - Davis, Larry S. KW - classification error KW - Dictionaries KW - dictionary learning process KW - discriminative sparse code error KW - face recognition KW - image classification KW - Image coding KW - K-SVD KW - label consistent KW - learning (artificial intelligence) KW - object category recognition KW - Object recognition KW - optimal linear classifier KW - reconstruction error KW - singular value decomposition KW - Training data AB - A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistent constraint called `discriminative sparse-code error' and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single over-complete dictionary and an optimal linear classifier jointly. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse coding techniques for face and object category recognition under the same learning conditions. 
JA - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on M3 - 10.1109/CVPR.2011.5995354 ER - TY - JOUR T1 - Lightweight Graphical Models for Selectivity Estimation Without Independence Assumptions JF - Proceedings of the VLDB Endowment Y1 - 2011 A1 - Tzoumas,K. A1 - Deshpande, Amol A1 - Jensen,C. S AB - As a result of decades of research and industrial development, modern query optimizers are complex software artifacts. However, the quality of the query plan chosen by an optimizer is largely determined by the quality of the underlying statistical summaries. Small selectivity estimation errors, propagated exponentially, can lead to severely sub-optimal plans. Modern optimizers typically maintain one-dimensional statistical summaries and make the attribute value independence and join uniformity assumptions for efficiently estimating selectivities. Therefore, selectivity estimation errors in today's optimizers are frequently caused by missed correlations between attributes. We present a selectivity estimation approach that does not make the independence assumptions. By carefully using concepts from the field of graphical models, we are able to factor the joint probability distribution of all the attributes in the database into small, usually two-dimensional distributions. We describe several optimizations that can make selectivity estimation highly efficient, and we present a complete implementation inside PostgreSQL's query optimizer. Experimental results indicate an order of magnitude better selectivity estimates, while keeping optimization time in the range of tens of milliseconds. VL - 4 CP - 7 ER - TY - JOUR T1 - Linear versus Mel Frequency Cepstral Coefficients for Speaker Recognition JF - IEEE Automatic Speech Recognition and Understanding Workshop Y1 - 2011 A1 - Zhou,X. A1 - Garcia-Romero,D. A1 - Duraiswami, Ramani A1 - Espy-Wilson,C. A1 - Shamma,S.
AB - Mel-frequency cepstral coefficients (MFCC) have been dominantly used in speaker recognition as well as in speech recognition. However, based on theories in speech production, some speaker characteristics associated with the structure of the vocal tract, particularly the vocal tract length, are reflected more in the high frequency range of speech. This insight suggests that a linear scale in frequency may provide some advantages in speaker recognition over the mel scale. Based on two state-of-the-art speaker recognition back-end systems (one Joint Factor Analysis system and one Probabilistic Linear Discriminant Analysis system), this study compares the performances between MFCC and LFCC (Linear frequency cepstral coefficients) in the NIST SRE (Speaker Recognition Evaluation) 2010 extended-core task. Our results in SRE10 show that, while they are complementary to each other, LFCC consistently outperforms MFCC, mainly due to its better performance in the female trials. This can be explained by the relatively shorter vocal tract in females and the resulting higher formant frequencies in speech. LFCC benefits more in female speech by better capturing the spectral characteristics in the high frequency region. In addition, our results show some advantage of LFCC over MFCC in reverberant speech. LFCC is as robust as MFCC in babble noise, but not in white noise. It is concluded that LFCC should be more widely used, at least for the female trials, by the mainstream of the speaker recognition community. ER - TY - JOUR T1 - Local Response Context Applied to Pedestrian Detection JF - Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications Y1 - 2011 A1 - Schwartz,W. A1 - Davis, Larry S. A1 - Pedrini,H. AB - Appearing as an important task in computer vision, pedestrian detection has been widely investigated in recent years. To design a robust detector, we propose a feature descriptor called Local Response Context (LRC).
This descriptor captures discriminative information regarding the surrounding of the person’s location by sampling the response map obtained by a generic sliding window detector. A partial least squares regression model using LRC descriptors is learned and employed as a second classification stage (after the execution of the generic detector to obtain the response map). Experiments based on the ETHZ pedestrian dataset show that the proposed approach significantly improves the results achieved by the generic detector alone and is comparable to the state-of-the-art methods. ER - TY - CHAP T1 - Machine Translation Evaluation and Optimization T2 - Handbook of Natural Language Processing and Machine Translation Y1 - 2011 A1 - Dorr, Bonnie J A1 - Olive,Joseph A1 - McCary,John A1 - Christianson,Caitlin ED - Olive,Joseph ED - Christianson,Caitlin ED - McCary,John AB - The evaluation of machine translation (MT) systems is a vital field of research, both for determining the effectiveness of existing MT systems and for optimizing the performance of MT systems. This part describes a range of different evaluation approaches used in the GALE community and introduces evaluation protocols and methodologies used in the program. We discuss the development and use of automatic, human, task-based and semi-automatic (human-in-the-loop) methods of evaluating machine translation, focusing on the use of the human-mediated translation error rate (HTER) as the evaluation standard used in GALE. We discuss the workflow associated with the use of this measure, including post editing, quality control, and scoring. We document the evaluation tasks, data, protocols, and results of recent GALE MT Evaluations. In addition, we present a range of different approaches for optimizing MT systems on the basis of different measures.
We outline the requirements and specific problems when using different optimization approaches and describe how the characteristics of different MT metrics affect the optimization. Finally, we describe novel recent and ongoing work on the development of fully automatic MT evaluation metrics that have the potential to substantially improve the effectiveness of evaluation and optimization of MT systems. JA - Handbook of Natural Language Processing and Machine Translation PB - Springer New York SN - 978-1-4419-7713-7 UR - http://dx.doi.org/10.1007/978-1-4419-7713-7_5 ER - TY - CONF T1 - Maryland at FIRE 2011: Retrieval of OCRed Bengali T2 - FIRE Y1 - 2011 A1 - Garain,Utpal A1 - David Doermann A1 - Oard,Douglas D. AB - In this year's Forum for Information Retrieval Evaluation (FIRE), the University of Maryland participated in the Retrieval of Indic Script OCRed Text (RISOT) task to experiment with the retrieval of Bengali script OCR’d documents. The experiments focused on evaluating a retrieval strategy motivated by recent work on Cross-Language Information Retrieval (CLIR), but which makes use of OCR error modeling rather than parallel text alignment. The approach obtains a probability distribution over substitutions for the actual query terms that possibly correspond to terms in the document representation. The results reported indicate that this is a promising way of using OCR error modeling to improve CLIR.
JA - FIRE ER - TY - CONF T1 - Maximizing Expected Utility for Stochastic Combinatorial Optimization Problems T2 - 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS) Y1 - 2011 A1 - Li,Jian A1 - Deshpande, Amol KW - Approximation algorithms KW - Approximation methods KW - combinatorial problems KW - Fourier series KW - knapsack problems KW - optimisation KW - OPTIMIZATION KW - polynomial approximation KW - polynomial time approximation algorithm KW - Polynomials KW - Random variables KW - stochastic combinatorial optimization KW - stochastic knapsack KW - stochastic shortest path KW - stochastic spanning tree KW - vectors AB - We study the stochastic versions of a broad class of combinatorial problems where the weights of the elements in the input dataset are uncertain. The class of problems that we study includes shortest paths, minimum weight spanning trees, and minimum weight matchings over probabilistic graphs, and other combinatorial problems like knapsack. We observe that the expected value is inadequate in capturing different types of risk-averse or risk-prone behaviors, and instead we consider a more general objective which is to maximize the expected utility of the solution for some given utility function, rather than the expected weight (expected weight becomes a special case). We show that we can obtain a polynomial time approximation algorithm with additive error ϵ for any ϵ > 0, if there is a pseudopolynomial time algorithm for the exact version of the problem (this is true for the problems mentioned above) and the maximum value of the utility function is bounded by a constant. Our result generalizes several prior results on stochastic shortest path, stochastic spanning tree, and stochastic knapsack.
Our algorithm for utility maximization makes use of the separability of exponential utility and a technique to decompose a general utility function into exponential utility functions, which may be useful in other stochastic optimization problems. JA - 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS) PB - IEEE SN - 978-1-4577-1843-4 M3 - 10.1109/FOCS.2011.33 ER - TY - PAT T1 - Method and System for Dereverberation of Signals Propagating in ... Y1 - 2011 A1 - O'donovan,Adam A1 - Duraiswami, Ramani A1 - Zotkin,Dmitry N ED - University of Maryland AB - The dereverberation of signals in reverberating environments is carried out via acquiring the representation (image) of spatial distribution of the signals in space of interest and automatic identification of reflections of the source signal in the reverberative space. The technique relies on identification of prominent features at the image, as well as corresponding directions of propagation of signals manifested by the prominent features at the image, and computation of similarity metric between signals corresponding to the prominent features in the image. The time delays between the correlated signals (i.e., source signal and related reflections) are found and the signals are added coherently. Multiple beamformers operate on the source signal and corresponding reflections, enabling one to improve the signal-to-noise ratio in multi-path environments. VL - 13/047,311 UR - http://www.google.com/patents?id=gBHuAQAAEBAJ ER - TY - CONF T1 - Model AI Assignments 2011 T2 - Second AAAI Symposium on Educational Advances in Artificial Intelligence Y1 - 2011 A1 - Neller,Todd William A1 - desJardins, Marie A1 - Oates,Tim A1 - Taylor,Matthew E AB - The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. 
Recognizing that assignments form the core of the student learning experience, we here present abstracts of three AI assignments from the 2011 session that are easily adoptable, playfully engaging, and flexible for a variety of instructor needs. JA - Second AAAI Symposium on Educational Advances in Artificial Intelligence UR - http://www.aaai.org/ocs/index.php/EAAI/EAAI11/paper/viewPaper/3452 ER - TY - JOUR T1 - Modeling Multiresolution 3D Scalar Fields through Regular Simplex Bisection JF - Scientific Visualization: Interactions, Features, Metaphors Y1 - 2011 A1 - Weiss,K. A1 - De Floriani, Leila A1 - Hagen,H. AB - We review modeling techniques for multiresolution three-dimensional scalar fields based on a discretization of the field domain into nested tetrahedral meshes generated through regular simplex bisection. Such meshes are described through hierarchical data structures and their representation is characterized by the modeling primitive used. The primary conceptual distinction among the different approaches proposed in the literature is whether they treat tetrahedra or clusters of tetrahedra, called diamonds, as the modeling primitive. We first focus on representations for the modeling primitive and for nested meshes. Next, we survey the applications of these meshes to modeling multiresolution 3D scalar fields, with an emphasis on interactive visualization. We also consider the relationship of such meshes to octrees.
Finally, we discuss directions for further research. VL - 2 ER - TY - CONF T1 - MOMMIE Knows Best: Systematic Optimizations for Verifiable Distributed Algorithms T2 - HotOS'13 Proceedings of the 13th USENIX Conference on Hot Topics in Operating Systems Y1 - 2011 A1 - Maniatis, Petros A1 - Dietz, Michael A1 - Charalampos Papamanthou JA - HotOS'13 Proceedings of the 13th USENIX Conference on Hot Topics in Operating Systems T3 - HotOS'13 PB - USENIX Association UR - http://dl.acm.org/citation.cfm?id=1991596.1991636 ER - TY - CONF T1 - Multi-agent event recognition in structured scenarios T2 - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Y1 - 2011 A1 - Morariu,V.I. A1 - Davis, Larry S. KW - Allen KW - analysis;Markov KW - descriptions;video KW - event KW - grounding KW - inference;semantic KW - interval KW - Logic KW - logic;interval-based KW - logic;Markov KW - logic;multi-agent KW - logical KW - networks;bottom-up KW - processes;formal KW - processing; KW - reasoning;multiagent KW - reasoning;video KW - recognition;probabilistic KW - recognition;temporal KW - scheme;first-order KW - signal KW - spatio-temporal KW - systems;object KW - temporal AB - We present a framework for the automatic recognition of complex multi-agent events in settings where structure is imposed by rules that agents must follow while performing activities. Given semantic spatio-temporal descriptions of what generally happens (i.e., rules, event descriptions, physical constraints), and based on video analysis, we determine the events that occurred. Knowledge about spatio-temporal structure is encoded in first-order logic using an approach based on Allen's Interval Logic, and robustness to low-level observation uncertainty is provided by Markov Logic Networks (MLN). Our main contribution is that we integrate interval-based temporal reasoning with probabilistic logical inference, relying on an efficient bottom-up grounding scheme to avoid combinatorial explosion.
Applied to one-on-one basketball, our framework detects and tracks players, their hands and feet, and the ball, generates event observations from the resulting trajectories, and performs probabilistic logical inference to determine the most consistent sequence of events. We demonstrate our approach on 1hr (100,000 frames) of outdoor videos. JA - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on M3 - 10.1109/CVPR.2011.5995386 ER - TY - CONF T1 - NetVisia: Heat Map & Matrix Visualization of Dynamic Social Network Statistics & Content T2 - Privacy, Security, Risk and Trust (PASSAT), 2011 IEEE Third International Conference on and 2011 IEEE Third International Confernece on Social Computing (SocialCom) Y1 - 2011 A1 - Gove,R. A1 - Gramsky,N. A1 - Kirby,R. A1 - Sefer,E. A1 - Sopan,A. A1 - Dunne,C. A1 - Shneiderman, Ben A1 - Taieb-Maimon,M. KW - business intelligence concept KW - business intelligence entity KW - competitive intelligence KW - data visualisation KW - dynamic networks KW - dynamic social network KW - heat map KW - Heating KW - Image color analysis KW - Information Visualization KW - Layout KW - matrix visualization KW - measurement KW - NetVisia system KW - network evolution KW - network visualization KW - node-link diagrams KW - outlier node KW - social network content KW - Social network services KW - social network statistics KW - social networking (online) KW - social networks KW - static network visualization KW - time period KW - topological similarity KW - Training KW - usability KW - user evaluation KW - User interfaces AB - Visualizations of static networks in the form of node-link diagrams have evolved rapidly, though researchers are still grappling with how best to show evolution of nodes over time in these diagrams. This paper introduces NetVisia, a social network visualization system designed to support users in exploring temporal evolution in networks by using heat maps to display node attribute changes over time. 
NetVisia's novel contributions to network visualizations are to (1) cluster nodes in the heat map by similar metric values instead of by topological similarity, and (2) align nodes in the heat map by events. We compare NetVisia to existing systems and describe a formative user evaluation of a NetVisia prototype with four participants that emphasized the need for tool tips and coordinated views. Despite the presence of some usability issues, in 30-40 minutes the user evaluation participants discovered new insights about the data set which had not been discovered using other systems. We discuss implemented improvements to NetVisia, and analyze a co-occurrence network of 228 business intelligence concepts and entities. This analysis confirms the utility of a clustered heat map to discover outlier nodes and time periods. JA - Privacy, Security, Risk and Trust (PASSAT), 2011 IEEE Third International Conference on and 2011 IEEE Third International Conference on Social Computing (SocialCom) PB - IEEE SN - 978-1-4577-1931-8 M3 - 10.1109/PASSAT/SocialCom.2011.216 ER - TY - JOUR T1 - Network Clustering Approximation Algorithm Using One Pass Black Box Sampling JF - arXiv:1110.3563 Y1 - 2011 A1 - DuBois,Thomas A1 - Golbeck,Jennifer A1 - Srinivasan, Aravind KW - Computer Science - Social and Information Networks KW - Physics - Physics and Society AB - Finding a good clustering of vertices in a network, where vertices in the same cluster are more tightly connected than those in different clusters, is a useful, important, and well-studied task. Many clustering algorithms scale well; however, they are not designed to operate upon internet-scale networks with billions of nodes or more. We study one of the fastest and most memory-efficient algorithms possible - clustering based on the connected components in a random edge-induced subgraph.
When defining the cost of a clustering to be its distance from such a random clustering, we show that this surprisingly simple algorithm gives a solution that is within an expected factor of two or three of optimal with either of two natural distance functions. In fact, this approximation guarantee works for any problem where there is a probability distribution on clusterings. We then examine the behavior of this algorithm in the context of social network trust inference. UR - http://arxiv.org/abs/1110.3563 ER - TY - CONF T1 - No-reference image quality assessment based on visual codebook T2 - Image Processing (ICIP), 2011 18th IEEE International Conference on Y1 - 2011 A1 - Ye,Peng A1 - David Doermann KW - assessment;quality KW - codebook;Gabor KW - descriptors;complex KW - estimation;visual KW - extraction;image KW - filter;appearance KW - filters;feature KW - gabor KW - image KW - image;no-reference KW - patches;natural KW - QUALITY KW - statistics;local KW - texture; AB - In this paper, we propose a new learning based No-Reference Image Quality Assessment (NR-IQA) algorithm, which uses a visual codebook consisting of robust appearance descriptors extracted from local image patches to capture complex statistics of natural image for quality estimation. We use Gabor filter based local features as appearance descriptors and the codebook method encodes the statistics of natural image classes by vector quantizing the feature space and accumulating histograms of patch appearances based on this coding. This method does not assume any specific types of distortion and experimental results on the LIVE image quality assessment database show that this method provides consistent and reliable performance in quality estimation that exceeds other state-of-the-art NR-IQA approaches and is competitive with the full reference measure PSNR. 
JA - Image Processing (ICIP), 2011 18th IEEE International Conference on M3 - 10.1109/ICIP.2011.6116318 ER - TY - CONF T1 - A novel feature descriptor based on the shearlet transform T2 - Image Processing (ICIP), 2011 18th IEEE International Conference on Y1 - 2011 A1 - Schwartz, W.R. A1 - da Silva,R.D. A1 - Davis, Larry S. A1 - Pedrini,H. KW - analysis;multiscale KW - analysis;object KW - classification;face KW - classification;image KW - classification;intensity KW - coefficients;image KW - descriptor;feature KW - detection;object KW - distribution KW - edge KW - EXTRACTION KW - extraction;image KW - gradient KW - gradients;histograms KW - identification;feature KW - methods;histograms KW - of KW - orientations;face KW - oriented KW - recognition;feature KW - recognition;shearlet KW - recognition;transforms; KW - shearlet KW - singularities;texture KW - texture;object KW - transform;signal AB - Problems such as image classification, object detection and recognition rely on low-level feature descriptors to represent visual information. Several feature extraction methods have been proposed, including the Histograms of Oriented Gradients (HOG), which captures edge information by analyzing the distribution of intensity gradients and their directions. In addition to directions, the analysis of edge at different scales provides valuable information. Shearlet transforms provide a general framework for analyzing and representing data with anisotropic information at multiple scales. As a consequence, signal singularities, such as edges, can be precisely detected and located in images. Based on the idea of employing histograms to estimate the distribution of edge orientations and on the accurate multi-scale analysis provided by shearlet transforms, we propose a feature descriptor called Histograms of Shearlet Coefficients (HSC). 
Experimental results comparing HOG with HSC show that HSC provides significantly better results for the problems of texture classification and face identification. JA - Image Processing (ICIP), 2011 18th IEEE International Conference on M3 - 10.1109/ICIP.2011.6115600 ER - TY - CONF T1 - NSF/IEEE-TCPP curriculum initiative on parallel and distributed computing: core topics for undergraduates T2 - Proceedings of the 42nd ACM technical symposium on Computer science education Y1 - 2011 A1 - Prasad,S. K. A1 - Chtchelkanova,A. A1 - Das,S. A1 - Dehne,F. A1 - Gouda,M. A1 - Gupta,A. A1 - JaJa, Joseph F. A1 - Kant,K. A1 - La Salle,A. A1 - LeBlanc,R. A1 - others JA - Proceedings of the 42nd ACM technical symposium on Computer science education ER - TY - JOUR T1 - Object Detection and Tracking for Intelligent Video Surveillance JF - Multimedia Analysis, Processing and Communications Y1 - 2011 A1 - Kim,K. A1 - Davis, Larry S. AB - Appearing as an important task in computer vision, pedestrian detection has been widely investigated in recent years. To design a robust detector, we propose a feature descriptor called Local Response Context (LRC). This descriptor captures discriminative information regarding the surroundings of the person’s location by sampling the response map obtained by a generic sliding window detector. A partial least squares regression model using LRC descriptors is learned and employed as a second classification stage (after the execution of the generic detector to obtain the response map). Experiments based on the ETHZ pedestrian dataset show that the proposed approach significantly improves the results achieved by the generic detector alone and is comparable to the state-of-the-art methods.
ER - TY - CONF T1 - Offline Writer Identification using K-Adjacent Segments T2 - International Conference on Document Analysis and Recognition Y1 - 2011 A1 - Jain,Rajiv A1 - David Doermann AB - This paper presents a method for performing offline writer identification by using K-adjacent segment (KAS) features in a bag-of-features framework to model a user’s handwriting. This approach achieves a top-1 recognition rate of 93% on the benchmark IAMEnglish handwriting dataset, which outperforms current state-of-the-art features. Results further demonstrate that identification performance improves as the number of training samples increases, and additionally, that the performance of the KAS features extends to Arabic handwriting found in the MADCAT dataset. JA - International Conference on Document Analysis and Recognition ER - TY - CONF T1 - Overview of the FIRE 2011 RISOT Task T2 - FIRE Y1 - 2011 A1 - Garain,Utpal A1 - Paik,Jiaul A1 - Pal,Tamaltaru A1 - Majumder,Prasenjit A1 - David Doermann A1 - Oard, Douglas AB - RISOT was a pilot task in FIRE 2011 which focused on the retrieval of automatically recognized text from machine printed sources. The collection used for search was a subset of the FIRE 2008 and 2010 Bengali test collections that contained 92 topics and 62,825 documents. Two teams participated, submitting a total of 11 monolingual runs. JA - FIRE ER - TY - JOUR T1 - Partial least squares based speaker recognition system JF - Snowbird Learning Workshop Y1 - 2011 A1 - Srinivasan,B.V. A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani ER - TY - CONF T1 - A partial least squares framework for speaker recognition T2 - Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on Y1 - 2011 A1 - Srinivasan,B.V.
A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani KW - approximations;speaker KW - attribute KW - background KW - GMM;Gaussian KW - least KW - mixture KW - model;Gaussian KW - model;NIST KW - modeling KW - processes;least KW - recognition; KW - recognition;speaker KW - separability;latent KW - squares KW - squares;partial-least-squares;speaker KW - SRE;interclass KW - technique;multiple KW - utterances;nuisance KW - variability;partial KW - variable KW - verification;universal AB - Modern approaches to speaker recognition (verification) operate in a space of "supervectors" created via concatenation of the mean vectors of a Gaussian mixture model (GMM) adapted from a universal background model (UBM). In this space, a number of approaches to model inter-class separability and nuisance attribute variability have been proposed. We develop a method for modeling the variability associated with each class (speaker) by using partial-least-squares - a latent variable modeling technique, which isolates the most informative subspace for each speaker. The method is tested on NIST SRE 2008 data and provides promising results. The method is shown to be noise-robust and to be able to efficiently learn the subspace corresponding to a speaker on training data consisting of multiple utterances. JA - Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on M3 - 10.1109/ICASSP.2011.5947548 ER - TY - PAT T1 - Photo-based mobile deixis system and related techniques Y1 - 2011 A1 - Darrell,Trevor J A1 - Tom Yeh A1 - Tollmar,Konrad ED - Massachusetts Institute of Technology AB - A mobile deixis device includes a camera to capture an image and a wireless handheld device, coupled to the camera and to a wireless network, to communicate the image with existing databases to find similar images. 
The mobile deixis device further includes a processor, coupled to the device, to process found database records related to similar images and a display to view found database records that include web pages including images. With such an arrangement, users can specify a location of interest by simply pointing a camera-equipped cellular phone at the object of interest and by searching an image database or relevant web resources, users can quickly identify good matches from several close ones to find an object of interest. VL - 10/762,941 UR - http://www.google.com/patents?id=jeXwAAAAEBAJ CP - 7872669 ER - TY - CONF T1 - Piecing together the segmentation jigsaw using context T2 - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Y1 - 2011 A1 - Chen,Xi A1 - Jain, A. A1 - Gupta,A. A1 - Davis, Larry S. KW - algorithms;image KW - approximation KW - formulation;greedy KW - function KW - information;cost KW - manner;image KW - programming;approximation KW - recognition;image KW - segmentation; KW - segmentation;jigsaw KW - segmentation;quadratic KW - solution;contextual KW - theory;greedy AB - We present an approach to jointly solve the segmentation and recognition problem using a multiple segmentation framework. We formulate the problem as segment selection from a pool of segments, assigning each selected segment a class label. Previous multiple segmentation approaches used local appearance matching to select segments in a greedy manner. In contrast, our approach formulates a cost function based on contextual information in conjunction with appearance matching. This relaxed cost function formulation is minimized using an efficient quadratic programming solver and an approximate solution is obtained by discretizing the relaxed solution. Our approach improves labeling performance compared to other segmentation based recognition approaches. 
JA - Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on M3 - 10.1109/CVPR.2011.5995367 ER - TY - CONF T1 - Playing to Program: Towards an Intelligent Programming Tutor for RUR-PLE T2 - Second AAAI Symposium on Educational Advances in Artificial Intelligence Y1 - 2011 A1 - desJardins, Marie A1 - Ciavolino,Amy A1 - Deloatch,Robert A1 - Feasley,Eliana AB - Intelligent tutoring systems (ITSs) provide students with a one-on-one tutor, allowing them to work at their own pace, and helping them to focus on their weaker areas. The RUR–Python Learning Environment (RUR-PLE), a game-like virtual environment to help students learn to program, provides an interface for students to write their own Python code and visualize the code execution (Roberge 2005). RUR-PLE provides a fixed sequence of learning lessons for students to explore. We are extending RUR-PLE to develop the Playing to Program (PtP) ITS, which consists of three components: (1) a Bayesian student model that tracks student competence, (2) a diagnosis module that provides tailored feedback to students, and (3) a problem selection module that guides the student’s learning process. In this paper, we summarize RUR-PLE and the PtP design, and describe an ongoing user study to evaluate the predictive accuracy of our student modeling approach. JA - Second AAAI Symposium on Educational Advances in Artificial Intelligence UR - http://www.aaai.org/ocs/index.php/EAAI/EAAI11/paper/viewPaper/3497 ER - TY - CONF T1 - Predicting Trust and Distrust in Social Networks T2 - Privacy, Security, Risk and Trust (PASSAT), 2011 IEEE Third International Conference on and 2011 IEEE Third International Conference on Social Computing (SocialCom) Y1 - 2011 A1 - DuBois,T. A1 - Golbeck,J.
A1 - Srinivasan, Aravind KW - distrust prediction KW - Electronic publishing KW - Encyclopedias KW - graph theory KW - inference algorithm KW - Inference algorithms KW - inference mechanisms KW - Internet KW - negative trust KW - online social networks KW - positive trust KW - Prediction algorithms KW - probability KW - random graphs KW - security of data KW - social media KW - social networking (online) KW - spring-embedding algorithm KW - Training KW - trust inference KW - trust probabilistic interpretation KW - user behavior KW - user satisfaction KW - user-generated content KW - user-generated interactions AB - As user-generated content and interactions have overtaken the web as the default mode of use, questions of whom and what to trust have become increasingly important. Fortunately, online social networks and social media have made it easy for users to indicate whom they trust and whom they do not. However, this does not solve the problem: since each user is likely to know only a tiny fraction of other users, we must have methods for inferring trust - and distrust - between users who do not know one another. In this paper, we present a new method for computing both trust and distrust (i.e., positive and negative trust). We do this by combining an inference algorithm that relies on a probabilistic interpretation of trust based on random graphs with a modified spring-embedding algorithm. Our algorithm correctly classifies hidden trust edges as positive or negative with high accuracy. These results are useful in a wide range of social web applications where trust is important to user behavior and satisfaction.
JA - Privacy, Security, Risk and Trust (PASSAT), 2011 IEEE Third International Conference on and 2011 IEEE Third International Conference on Social Computing (SocialCom) PB - IEEE SN - 978-1-4577-1931-8 M3 - 10.1109/PASSAT/SocialCom.2011.56 ER - TY - CONF T1 - The PR-star octree: a spatio-topological data structure for tetrahedral meshes T2 - Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems Y1 - 2011 A1 - Weiss,Kenneth A1 - De Floriani, Leila A1 - Fellegara,Riccardo A1 - Velloso,Marcelo AB - We propose the PR-star octree as a combined spatial data structure for performing efficient topological queries on tetrahedral meshes. The PR-star octree augments the Point Region octree (PR Octree) with a list of tetrahedra incident to its indexed vertices, i.e. those in the star of its vertices. Thus, each leaf node encodes the minimal amount of information necessary to locally reconstruct the topological connectivity of its indexed elements. This provides the flexibility to efficiently construct the optimal data structure to solve the task at hand using a fraction of the memory required for a corresponding data structure on the global tetrahedral mesh. Due to the spatial locality of successive queries in typical GIS applications, the construction costs of these runtime data structures are amortized over multiple accesses while processing each node. We demonstrate the advantages of the PR-star octree representation in several typical GIS applications, including detection of the domain boundaries, computation of local curvature estimates and mesh simplification.
JA - Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems T3 - GIS '11 PB - ACM CY - New York, NY, USA SN - 978-1-4503-1031-4 UR - http://doi.acm.org/10.1145/2093973.2093987 M3 - 10.1145/2093973.2093987 ER - TY - JOUR T1 - Rapid understanding of scientific paper collections: integrating statistics, text analysis, and visualization JF - University of Maryland, Human-Computer Interaction Lab Tech Report HCIL-2011 Y1 - 2011 A1 - Dunne,C. A1 - Shneiderman, Ben A1 - Gove,R. A1 - Klavans,J. A1 - Dorr, Bonnie J AB - Keeping up with rapidly growing research fields, especially when there are multiple interdisciplinary sources, requires substantial effort for researchers, program managers, or venture capital investors. Current theories and tools are directed at finding a paper or website, not gaining an understanding of the key papers, authors, controversies, and hypotheses. This report presents an effort to integrate statistics, text analytics, and visualization in a multiple coordinated window environment that supports exploration. Our prototype system, Action Science Explorer (ASE), provides an environment for demonstrating principles of coordination and conducting iterative usability tests of them with interested and knowledgeable users. We developed an understanding of the value of reference management, statistics, citation context extraction, natural language summarization for single and multiple documents, filters to interactively select key papers, and network visualization to see citation patterns and identify clusters. The three-phase usability study guided our revisions to ASE and led us to improve the testing methods. ER - TY - JOUR T1 - Role of Zooplankton Diversity in Vibrio Cholerae Population Dynamics and in the Incidence of Cholera in the Bangladesh Sundarbans JF - Applied and Environmental Microbiology Y1 - 2011 A1 - De Magny,Guillaume Constantin A1 - Mozumder,Pronob K. A1 - Grim,Christopher J.
A1 - Hasan,Nur A. A1 - Naser,M. Niamul A1 - Alam,Munirul A1 - Sack,R. Bradley A1 - Huq,Anwar A1 - Rita R Colwell AB - Vibrio cholerae, a bacterium autochthonous to the aquatic environment, is the causative agent of cholera, a severe watery, life-threatening diarrheal disease occurring predominantly in developing countries. V. cholerae, including both serogroups O1 and O139, is found in association with crustacean zooplankton, mainly copepods, and notably in ponds, rivers, and estuarine systems globally. The incidence of cholera and occurrence of pathogenic V. cholerae strains with zooplankton were studied in two areas of Bangladesh: Bakerganj and Mathbaria. Chitinous zooplankton communities of several bodies of water were analyzed in order to understand the interaction of the zooplankton population composition with the population dynamics of pathogenic V. cholerae and incidence of cholera. Two dominant zooplankton groups were found to be consistently associated with detection of V. cholerae and/or occurrence of cholera cases, namely, rotifers and cladocerans, in addition to copepods. Local differences indicate there are subtle ecological factors that can influence interactions between V. cholerae, its plankton hosts, and the incidence of cholera. VL - 77 SN - 0099-2240, 1098-5336 UR - http://aem.asm.org/content/77/17/6125 CP - 17 M3 - 10.1128/AEM.01472-10 ER - TY - CONF T1 - Scalable fast multipole methods on distributed heterogeneous architectures T2 - High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for Y1 - 2011 A1 - Hu,Qi A1 - Gumerov, Nail A. 
A1 - Duraiswami, Ramani KW - accelerators;OpenMP;analysis KW - algorithm;iterative KW - and KW - architecture;CUDA;FMM KW - architectures; KW - architectures;divide-and-conquer KW - based KW - conquer KW - CPU-GPU KW - CPU;scalable KW - data KW - fast KW - heterogeneous KW - loop;data KW - loop;multicore KW - methods;graphics KW - methods;multiprocessing KW - methods;time KW - multipole KW - parts;distributed KW - PROCESSING KW - stepping KW - structures;divide KW - structures;GPU KW - systems;parallel KW - translation KW - units;iterative AB - We fundamentally reconsider implementation of the Fast Multipole Method (FMM) on a computing node with a heterogeneous CPU-GPU architecture with multicore CPU(s) and one or more GPU accelerators, as well as on an interconnected cluster of such nodes. The FMM is a divide-and-conquer algorithm that performs a fast N-body sum using a spatial decomposition and is often used in a time-stepping or iterative loop. Using the observation that the local summation and the analysis-based translation parts of the FMM are independent, we map these respectively to the GPUs and CPUs. Careful analysis of the FMM is performed to distribute work optimally between the multicore CPUs and the GPU accelerators. We first develop a single node version where the CPU part is parallelized using OpenMP and the GPU version via CUDA. New parallel algorithms for creating FMM data structures are presented together with load balancing strategies for the single node and distributed multiple-node versions. Our implementation can perform the N-body sum for 128M particles on 16 nodes in 4.23 seconds, a performance not achieved by others in the literature on such clusters.
JA - High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for ER - TY - CHAP T1 - Secure Efficient Multiparty Computing of Multivariate Polynomials and Applications T2 - Applied Cryptography and Network Security Y1 - 2011 A1 - Dana Dachman-Soled A1 - Malkin, Tal A1 - Raykova, Mariana A1 - Yung, Moti ED - Lopez, Javier ED - Tsudik, Gene KW - additive homomorphic encryption KW - Algorithm Analysis and Problem Complexity KW - Computer Communication Networks KW - Data Encryption KW - Discrete Mathematics in Computer Science KW - Management of Computing and Information Systems KW - multiparty set intersection KW - multivariate polynomial evaluation KW - secret sharing KW - secure multiparty computation KW - Systems and Data Security KW - threshold cryptosystems AB - We present a robust secure methodology for computing functions that are represented as multivariate polynomials where parties hold different variables as private inputs. Our generic efficient protocols are fully black-box and employ threshold additive homomorphic encryption; they do not assume honest majority, yet are robust in detecting any misbehavior. We achieve solutions that take advantage of the algebraic structure of the polynomials, and are polynomial-time in all parameters (security parameter, polynomial size, polynomial degree, number of parties). We further exploit a “round table” communication paradigm to reduce the complexity in the number of parties. A large collection of problems are naturally and efficiently represented as multivariate polynomials over a field or a ring: problems from linear algebra, statistics, logic, as well as operations on sets represented as polynomials. In particular, we present a new efficient solution to the multi-party set intersection problem, and a solution to a multi-party variant of the polynomial reconstruction problem. 
JA - Applied Cryptography and Network Security T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-21553-7, 978-3-642-21554-4 UR - http://link.springer.com/chapter/10.1007/978-3-642-21554-4_8 ER - TY - CONF T1 - Segmentation of Handwritten Textlines in Presence of Touching Components T2 - Intl. Conf. on Document Analysis and Recognition (ICDAR 11) Y1 - 2011 A1 - Kumar,Jayant A1 - Kang,Le A1 - David Doermann A1 - Abd-Almageed, Wael AB - This paper presents an approach to textline extraction in handwritten document images which combines local and global techniques. We propose a graph-based technique to detect touching and proximity errors that are common with handwritten text lines. In a refinement step, we use Expectation-Maximization (EM) to iteratively split the error segments to obtain correct text-lines. We show improvement in accuracies using our correction method on datasets of Arabic document images. Results on a set of artificially generated proximity images show that the method is effective for handling touching errors in handwritten document images. JA - Intl. Conf. on Document Analysis and Recognition (ICDAR 11) ER - TY - CONF T1 - Selective transfer between learning tasks using task-based boosting T2 - Twenty-Fifth AAAI Conference on Artificial Intelligence Y1 - 2011 A1 - Eaton,E. A1 - desJardins, Marie A1 - others JA - Twenty-Fifth AAAI Conference on Artificial Intelligence ER - TY - CONF T1 - Sensitivity analysis and explanations for robust query evaluation in probabilistic databases T2 - Proceedings of the 2011 international conference on Management of data Y1 - 2011 A1 - Kanagal,B. A1 - Li,J. A1 - Deshpande, Amol AB - Probabilistic database systems have successfully established themselves as a tool for managing uncertain data.
However, much of the research in this area has focused on efficient query evaluation and has largely ignored two key issues that commonly arise in uncertain data management: First, how to provide explanations for query results, e.g., “Why is this tuple in my result?” or “Why does this output tuple have such high probability?”. Second, the problem of determining the sensitive input tuples for the given query, e.g., users are interested in knowing the input tuples that can substantially alter the output when their probabilities are modified (since they may be unsure about the input probability values). Existing systems provide the lineage/provenance of each of the output tuples in addition to the output probabilities, which is a boolean formula indicating the dependence of the output tuple on the input tuples. However, lineage does not immediately provide a quantitative relationship and it is not informative when we have multiple output tuples. In this paper, we propose a unified framework that can handle both the issues mentioned above to facilitate robust query processing. We formally define the notions of influence and explanations and provide algorithms to determine the top-l influential set of variables and the top-l set of explanations for a variety of queries, including conjunctive queries, probabilistic threshold queries, top-k queries and aggregation queries. Further, our framework naturally enables highly efficient incremental evaluation when input probabilities are modified (e.g., if uncertainty is resolved). Our preliminary experimental results demonstrate the benefits of our framework for performing robust query processing over probabilistic databases. 
JA - Proceedings of the 2011 international conference on Management of data ER - TY - CONF T1 - Shape Codebook based Handwritten and Machine Printed Text Zone Extraction T2 - Document Recognition and Retrieval Y1 - 2011 A1 - Kumar,Jayant A1 - Prasad,Rohit A1 - Cao,Huiagu A1 - Abd-Almageed, Wael A1 - David Doermann A1 - Natarajan,Prem AB - We present a novel method for extracting handwritten and printed text zones from noisy document images with mixed content. We use Triple-Adjacent-Segment (TAS) based features which encode local shape characteristics of text in a consistent manner. We first construct two different codebooks of the shape features extracted from a set of handwritten and printed text documents. In the next step, we compute the normalized histogram of codewords for each segmented zone and use it to train a Support Vector Machine (SVM) classifier. Due to the codebook-based approach, our method is robust to the background noise present in the image. The TAS features used are invariant to translation, scale and rotation of text. In our experimental results, we show that a pixel-weighted zone classification accuracy of 98% can be achieved for noisy Arabic documents. Further, we demonstrate the effectiveness of our method for document page classification and show that a high precision can be achieved for machine printed documents. The proposed method is robust to the size of zones, which may contain text content at word, line or paragraph level. JA - Document Recognition and Retrieval CY - San Francisco ER - TY - JOUR T1 - Simplex and Diamond Hierarchies: Models and Applications JF - Computer Graphics Forum Y1 - 2011 A1 - Weiss,K. 
A1 - De Floriani, Leila KW - hierarchy of diamonds KW - hierarchy of simplices KW - I.3.5 [Computer Graphics]: Computational Geometry and Object Modelling— Hierarchy and geometric transformations KW - I.3.6 [Computer Graphics]: Methodology and Techniques—Graphics data structures and data types KW - interactive terrain visualization KW - mesh‐based multiresolution models KW - multiresolution isosurfaces KW - nested refinement schemes KW - Regular simplex bisection KW - scalar field visualization KW - spatial access structures AB - Hierarchical spatial decompositions are a basic modelling tool in a variety of application domains. Several papers on this subject deal with hierarchical simplicial decompositions generated through regular simplex bisection. Such decompositions, originally developed for finite elements, are extensively used as the basis for multi-resolution models of scalar fields, such as terrains, and static or time-varying volume data. They have also been used as an alternative to quadtrees and octrees as spatial access structures. The primary distinction among all such approaches is whether they treat the simplex or clusters of simplices, called diamonds, as the modelling primitive. This leads to two classes of data structures and to different query approaches. We present the hierarchical models in a dimension-independent manner, and organize the description of the various applications, primarily interactive terrain rendering and isosurface extraction, according to the dimension of the domain. 
VL - 30 SN - 1467-8659 UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8659.2011.01853.x/abstract?userIsAuthenticated=false&deniedAccessCustomisedMessage= CP - 8 M3 - 10.1111/j.1467-8659.2011.01853.x ER - TY - CONF T1 - Simplifying morphological representations of 2D and 3D scalar fields T2 - Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems Y1 - 2011 A1 - Čomić,Lidija A1 - De Floriani, Leila A1 - Iuricich,Federico KW - morphological representations KW - Morse complexes KW - multi-dimensional data sets KW - simplification AB - We describe a dual graph-based representation for the ascending and descending Morse complexes of a scalar field, and a compact and dimension-independent data structure based on it, which assumes a discrete representation of the field as a simplicial mesh. We present atomic dimension-independent simplification operators on the graph-based representation. Based on such operators, we have developed a simplification algorithm, which allows generalization of the ascending and descending Morse complexes at different levels of resolution. We show here the results of our implementation, discussing the computation times and the size of the resulting simplified graphs, also in comparison with the size of the original full-resolution graph. JA - Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems T3 - GIS '11 PB - ACM CY - New York, NY, USA SN - 978-1-4503-1031-4 UR - http://doi.acm.org/10.1145/2093973.2094042 M3 - 10.1145/2093973.2094042 ER - TY - CONF T1 - Smale-like decomposition and forman theory for discrete scalar fields T2 - Discrete Geometry for Computer Imagery Y1 - 2011 A1 - Čomić,L. A1 - Mesmoudi,M. 
A1 - De Floriani, Leila AB - Forman theory, which is a discrete alternative for cell complexes to the well-known Morse theory, is currently finding several applications in areas where the data to be handled are discrete, such as image processing and computer graphics. Here, we show that a discrete scalar field f, defined on the vertices of a triangulated multidimensional domain Σ, and its gradient vector field Grad f through the Smale-like decomposition of f [6], are both the restriction of a Forman function F and its gradient field Grad F that extends f over all the simplexes of Σ. We present an algorithm that gives an explicit construction of such an extension. Hence, the scalar field f inherits the properties of Forman gradient vector fields and functions from field Grad F and function F. JA - Discrete Geometry for Computer Imagery M3 - 10.1007/978-3-642-19867-0_40 ER - TY - PAT T1 - Smart task-driven video collection Y1 - 2011 A1 - Lim,Ser-Nam A1 - Mittal,Anurag A1 - Davis, Larry S. ED - Siemens Corporation AB - A multi-camera system that collects images and videos of moving objects in dynamic and crowded scenes, subject to task constraints is disclosed. The system constructs “task visibility intervals” comprising information about what can be sensed in future time intervals. Methods for constructing these intervals applying prediction of future object motion and including consideration of factors such as object occlusion and camera control parameters are also disclosed. Using a plane-sweep algorithm, these atomic intervals can be combined in a method to form multi-task intervals, during which a single camera can collect videos suitable for multiple tasks simultaneously. Methods for fast camera scheduling that yield solutions within a small constant factor of an optimal solution are also disclosed. 
VL - 11/649,393 UR - http://www.google.com/patents?id=pgPnAQAAEBAJ CP - 7961215 ER - TY - CONF T1 - Spam or ham?: characterizing and detecting fraudulent "not spam" reports in web mail systems T2 - Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference Y1 - 2011 A1 - Ramachandran,Anirudh A1 - Dasgupta,Anirban A1 - Feamster, Nick A1 - Weinberger,Kilian AB - Web mail providers rely on users to "vote" to quickly and collaboratively identify spam messages. Unfortunately, spammers have begun to use bots to control large collections of compromised Web mail accounts not just to send spam, but also to vote "not spam" on incoming spam emails in an attempt to thwart collaborative filtering. We call this practice a vote gaming attack. This attack confuses spam filters, since it causes spam messages to be mislabeled as legitimate; thus, spammer IP addresses can continue sending spam for longer. In this paper, we introduce the vote gaming attack and study the extent of these attacks in practice, using four months of email voting data from a large Web mail provider. We develop a model for vote gaming attacks, explain why existing detection mechanisms cannot detect them, and develop a new, scalable clustering-based detection method that identifies compromised accounts that engage in vote-gaming attacks. Our method detected 1.1 million potentially compromised accounts with only a 0.17% false positive rate, which is nearly 10 times more effective than existing clustering methods used to detect bots that send spam from compromised Web mail accounts. 
JA - Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference T3 - CEAS '11 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0788-8 UR - http://doi.acm.org/10.1145/2030376.2030401 M3 - 10.1145/2030376.2030401 ER - TY - CONF T1 - Stroke-like Pattern Noise Removal in Binary Document Images T2 - International Conference on Document Analysis and Recognition Y1 - 2011 A1 - Agrawal,Mudit A1 - David Doermann AB - This paper presents a two-phased stroke-like pattern noise (SPN) removal algorithm for binary document images. The proposed approach aims at understanding script-independent prominent text component features using supervised classification as a first step. It then uses their cohesiveness and stroke-width properties to filter and associate smaller text components with them using an unsupervised classification technique. In order to perform text extraction, and hence noise removal, at diacritic-level, this divide-and-conquer technique does not assume the availability of accurate and large amounts of ground-truth data at component-level for training purposes. The method was tested on a collection of degraded and noisy, machine-printed and handwritten binary Arabic text documents. Results show pixel-level precision and recall of 98% and 97% respectively. JA - International Conference on Document Analysis and Recognition ER - TY - JOUR T1 - Temperature regulation of virulence factors in the pathogen Vibrio coralliilyticus JF - The ISME Journal Y1 - 2011 A1 - Kimes,Nikole E. A1 - Grim,Christopher J. A1 - Johnson,Wesley R. A1 - Hasan,Nur A. A1 - Tall,Ben D. A1 - Kothary,Mahendra H. A1 - Kiss,Hajnalka A1 - Munk,A. Christine A1 - Tapia,Roxanne A1 - Green,Lance A1 - Detter,Chris A1 - Bruce,David C. A1 - Brettin,Thomas S. A1 - Rita R Colwell A1 - Morris,Pamela J. 
KW - ecophysiology KW - ecosystems KW - environmental biotechnology KW - geomicrobiology KW - ISME J KW - microbe interactions KW - microbial communities KW - microbial ecology KW - microbial engineering KW - microbial epidemiology KW - microbial genomics KW - microorganisms AB - Sea surface temperatures (SST) are rising because of global climate change. As a result, pathogenic Vibrio species that infect humans and marine organisms during warmer summer months are of growing concern. Coral reefs, in particular, are already experiencing unprecedented degradation worldwide due in part to infectious disease outbreaks and bleaching episodes that are exacerbated by increasing SST. For example, Vibrio coralliilyticus, a globally distributed bacterium associated with multiple coral diseases, infects corals at temperatures above 27 °C. The mechanisms underlying this temperature-dependent pathogenicity, however, are unknown. In this study, we identify potential virulence mechanisms using whole genome sequencing of V. coralliilyticus ATCC (American Type Culture Collection) BAA-450. Furthermore, we demonstrate direct temperature regulation of numerous virulence factors using proteomic analysis and bioassays. Virulence factors involved in motility, host degradation, secretion, antimicrobial resistance and transcriptional regulation are upregulated at the higher virulent temperature of 27 °C, concurrent with phenotypic changes in motility, antibiotic resistance, hemolysis, cytotoxicity and bioluminescence. These results provide evidence that temperature regulates multiple virulence mechanisms in V. coralliilyticus, independent of abundance. The ecological and biological significance of this temperature-dependent virulence response is reinforced by climate change models that predict tropical SST to consistently exceed 27 °C during the spring, summer and fall seasons. We propose V. 
coralliilyticus as a model Gram-negative bacterium to study temperature-dependent pathogenicity in Vibrio-related diseases. VL - 6 SN - 1751-7362 UR - http://www.nature.com/ismej/journal/v6/n4/full/ismej2011154a.html CP - 4 M3 - 10.1038/ismej.2011.154 ER - TY - CONF T1 - Template based Segmentation of Touching Components in Handwritten Text Lines T2 - 11th Intl. Conf. on Document Analysis and Recognition (ICDAR'11) Y1 - 2011 A1 - Kang,Le A1 - David Doermann AB - In this paper, we present a template based approach to the segmentation of touching components in handwritten text lines. Local patches around touching components are identified and a dictionary is created consisting of template patches together with correct segmentations. We use two shape context based methods to compute similarity between input patches and dictionary templates to find the best match. The template's known segmentation is then transformed to segment the input patch. Experiments are carried on a dataset of touching text lines. JA - 11th Intl. Conf. on Document Analysis and Recognition (ICDAR'11) ER - TY - CONF T1 - Toward a Standard Benchmark for Computer Security Research: The Worldwide Intelligence Network Environment (WINE) T2 - BADGERS '11 Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security Y1 - 2011 A1 - Tudor Dumitras A1 - Shou, Darren AB - Unlike benchmarks that focus on performance or reliability evaluations, a benchmark for computer security must necessarily include sensitive code and data. Because these artifacts could damage systems or reveal personally identifiable information about the users affected by cyber attacks, publicly disseminating such a benchmark raises several scientific, ethical and legal challenges. We propose the Worldwide Intelligence Network Environment (WINE), a security-benchmarking approach based on rigorous experimental methods. 
WINE includes representative field data, collected worldwide from 240,000 sensors, for new empirical studies, and it will enable the validation of research on all the phases in the lifecycle of security threats. We tackle the key challenges for security benchmarking by designing a platform for repeatable experimentation on the WINE data sets and by collecting the metadata required for understanding the results. In this paper, we review the unique characteristics of the WINE data, we discuss why rigorous benchmarking will provide fresh insights on the security arms race and we propose a research agenda for this area. JA - BADGERS '11 Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security T3 - BADGERS '11 PB - ACM SN - 978-1-4503-0768-0 UR - http://doi.acm.org/10.1145/1978672.1978683 ER - TY - JOUR T1 - Toward improved aeromechanics simulations using recent advancements in scientific computing JF - Proceedings 67th Annual Forum of the American Helicopter Society Y1 - 2011 A1 - Hu,Q. A1 - Syal,M. A1 - Gumerov, Nail A. A1 - Duraiswami, Ramani A1 - Leishman,J.G. ER - TY - CONF T1 - A trust model for supply chain management T2 - The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume 3 Y1 - 2011 A1 - Haghpanah,Y. A1 - desJardins, Marie A1 - others JA - The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume 3 ER - TY - CONF T1 - A Two Degree of Freedom Nanopositioner With Electrothermal Actuator for Decoupled Motion Y1 - 2011 A1 - Kim,Yong-Sik A1 - Dagalakis,Nicholas G. A1 - Gupta, Satyandra K. AB - Building a two degree-of-freedom (2 DOF) MEMS nanopositioner with decoupled X-Y motion has been a challenge in nanopositioner design. In this paper a novel design concept on making the decoupled motion of the MEMS nanopositioner is suggested. 
The suggested nanopositioner has two electrothermal actuators and employs a fully nested motion platform with suspended anchors. The suggested MEMS nanopositioner is capable of delivering displacement from the electrothermal actuator to the motion platform without coupled motion between the two X-Y axes. The design concept, finite element analysis (FEA) results, fabrication procedures and the performance of the 2 DOF nanopositioner are presented. In order to test the decoupled motion of the nanopositioner's moving platform, one actuator moves the platform by 60 µm, while the other actuator is kept at the same position. The platform position cross talk error was measured to be less than 1 µm standard deviation. PB - ASME SN - 978-0-7918-5484-6 UR - http://link.aip.org/link/ASMECP/v2011/i54846/p447/s1&Agg=doi M3 - 10.1115/DETC2011-48619 ER - TY - RPRT T1 - Understanding Scientific Literature Networks: An Evaluation of Action Science Explorer Y1 - 2011 A1 - Gove,R. A1 - Dunne,C. A1 - Shneiderman, Ben A1 - Klavans,J. A1 - Dorr, Bonnie J AB - Action Science Explorer (ASE) is a tool designed to support users in rapidly generating readily consumable summaries of academic literature. The authors describe ASE and report on how early formative evaluations led to a mature system evaluation, consisting of an in-depth empirical evaluation with 4 domain expert participants. The user study tasks were of two types: predefined tasks to test system performance in common scenarios, and user-defined tasks to test the system’s usefulness for custom exploration goals. This paper concludes by describing ASE’s attribute ranking capability, which is a novel contribution for exploring scientific literature networks. It makes design recommendations: give users control over which documents to explore, provide easy-to-understand metrics for ranking documents, and offer overviews of the document set in coordinated views along with details-on-demand of specific papers. 
PB - University of Maryland at College Park ER - TY - JOUR T1 - A unified approach to ranking in probabilistic databases JF - The VLDB Journal Y1 - 2011 A1 - Li,Jian A1 - Saha,Barna A1 - Deshpande, Amol KW - Approximation techniques KW - Graphical models KW - Learning to rank KW - Probabilistic databases KW - Ranking AB - Ranking is a fundamental operation in data analysis and decision support and plays an even more crucial role if the dataset being explored exhibits uncertainty. This has led to much work in understanding how to rank the tuples in a probabilistic dataset in recent years. In this article, we present a unified approach to ranking and top-k query processing in probabilistic databases by viewing it as a multi-criterion optimization problem and by deriving a set of features that capture the key properties of a probabilistic dataset that dictate the ranked result. We contend that a single, specific ranking function may not suffice for probabilistic databases, and we instead propose two parameterized ranking functions, called PRF ω and PRF e, that generalize or can approximate many of the previously proposed ranking functions. We present novel generating functions-based algorithms for efficiently ranking large datasets according to these ranking functions, even if the datasets exhibit complex correlations modeled using probabilistic and/xor trees or Markov networks. We further propose that the parameters of the ranking function be learned from user preferences, and we develop an approach to learn those parameters. Finally, we present a comprehensive experimental study that illustrates the effectiveness of our parameterized ranking functions, especially PRF e, at approximating other ranking functions and the scalability of our proposed algorithms for exact or approximate ranking. 
VL - 20 SN - 1066-8888 UR - http://dx.doi.org/10.1007/s00778-011-0220-3 CP - 2 M3 - 10.1007/s00778-011-0220-3 ER - TY - CONF T1 - Using Agent-Based Simulation to Determine an Optimal Lane-Changing Strategy on a Multi-Lane Highway T2 - 2011 AAAI Fall Symposium Series Y1 - 2011 A1 - Tuzo,J. A1 - Seymour,J. A1 - desJardins, Marie A1 - others JA - 2011 AAAI Fall Symposium Series ER - TY - JOUR T1 - Using classifier cascades for scalable e-mail classification JF - Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference, ACM International Conference Proceedings Series Y1 - 2011 A1 - Pujara,J. A1 - Daumé, Hal A1 - Getoor, Lise AB - In many real-world scenarios, we must make judgments in the presence of computational constraints. One common computational constraint arises when the features used to make a judgment each have differing acquisition costs, but there is a fixed total budget for a set of judgments. Particularly when there are a large number of classifications that must be made in real time, an intelligent strategy for optimizing accuracy versus computational costs is essential. E-mail classification is an area where accurate and timely results require such a trade-off. We identify two scenarios where intelligent feature acquisition can improve classifier performance. In granular classification we seek to classify e-mails with increasingly specific labels structured in a hierarchy, where each level of the hierarchy requires a different trade-off between cost and accuracy. In load-sensitive classification, we classify a set of instances within an arbitrary total budget for acquiring features. Our method, Adaptive Classifier Cascades (ACC), designs a policy to combine a series of base classifiers with increasing computational costs given a desired trade-off between cost and accuracy. 
Using this method, we learn a relationship between feature costs and label hierarchies for granular classification, and between feature costs and cost budgets for load-sensitive classification. We evaluate our method on real-world e-mail datasets with realistic estimates of feature acquisition cost, and we demonstrate superior results when compared to baseline classifiers that do not have a granular, cost-sensitive feature acquisition policy. ER - TY - JOUR T1 - Vehicle Detection Using Partial Least Squares JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2011 A1 - Kembhavi,A. A1 - Harwood,D. A1 - Davis, Larry S. KW - aerial images;color probability maps;oriented gradients feature;partial least squares;powerful feature selection analysis;urban planning;vehicle detection;visual surveillance;gradient methods;image colour analysis;least squares approximations;road vehicle KW - Automated; AB - Detecting vehicles in aerial images has a wide range of applications, from urban planning to visual surveillance. We describe a vehicle detector that improves upon previous approaches by incorporating a very large and rich set of image descriptors. A new feature set called Color Probability Maps is used to capture the color statistics of vehicles and their surroundings, along with the Histograms of Oriented Gradients feature and a simple yet powerful image descriptor that captures the structural characteristics of objects named Pairs of Pixels. The combination of these features leads to an extremely high-dimensional feature set (approximately 70,000 elements). Partial Least Squares is first used to project the data onto a much lower dimensional subspace. Then, a powerful feature selection analysis is employed to improve the performance while vastly reducing the number of features that must be calculated. We compare our system to previous approaches on two challenging data sets and show superior performance. 
VL - 33 SN - 0162-8828 CP - 6 M3 - 10.1109/TPAMI.2010.182 ER - TY - JOUR T1 - Visualizing Missing Data: Graph Interpretation User Study JF - IFIP Lecture Notes in Computer Science (LNCS) Y1 - 2011 A1 - Drizd,Terence A1 - Eaton,Cyntrica A1 - Plaisant, Catherine AB - Visualizing Missing Data: Graph Interpretation User Study VL - 3585 UR - http://dl.ifip.org/index.php/lncs/article/view/25927 CP - 3585 ER - TY - JOUR T1 - Advanced tracking systems: computational approaches to be introduced to new series JF - Augmented vision & reality Y1 - 2010 A1 - HAMMOUD,R. A1 - Porikli, F. A1 - Davis, Larry S. AB - Modern visual tracking systems implement a computational process that is often divided into several modules such as localization, tracking, recognition, behavior analysis and classification of events. This book will focus on recent advances in computational approaches for detection and tracking of human body, road boundaries and lane markers as well as on recognition of human activities, drowsiness and distraction state. This book is composed of seven distinct parts. Part I covers people localization algorithms in video sequences. Part II describes successful approaches for tracking people and body parts. The third part focuses on tracking of pedestrians and vehicles in outdoor images. Part IV describes recent methods to track lane markers and road boundaries. In part V, methods to track head, hand and facial features are reviewed. The last two parts cover the topics of automatic recognition and classification of activity, gesture, behavior, drowsiness and visual distraction state of humans. VL - 1 ER - TY - CHAP T1 - Aligning Spatio-Temporal Signals on a Special Manifold T2 - Computer Vision – ECCV 2010 Y1 - 2010 A1 - Ruonan Li A1 - Chellapa, Rama ED - Daniilidis,Kostas ED - Maragos,Petros ED - Paragios,Nikos AB - We investigate the spatio-temporal alignment of videos or features/signals extracted from them. 
Specifically, we formally define an alignment manifold and formulate the alignment problem as an optimization procedure on this non-linear space by exploiting its intrinsic geometry. We focus our attention on semantically meaningful videos or signals, e.g., those describing or capturing human motion or activities, and propose a new formalism for temporal alignment accounting for execution rate variations among realizations of the same video event. By construction, we address this static and deterministic alignment task in a dynamic and stochastic manner: we regard the search for optimal alignment parameters as a recursive state estimation problem for a particular dynamic system evolving on the alignment manifold. Consequently, a Sequential Importance Sampling iteration on the alignment manifold is designed for effective and efficient alignment. We demonstrate the performance on several types of input data that arise in vision problems. JA - Computer Vision – ECCV 2010 T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6315 SN - 978-3-642-15554-3 UR - http://dx.doi.org/10.1007/978-3-642-15555-0_40 ER - TY - CONF T1 - An animated multivariate visualization for physiological and clinical data in the ICU T2 - Proceedings of the 1st ACM International Health Informatics Symposium Y1 - 2010 A1 - Ordóñez,Patricia A1 - desJardins, Marie A1 - Lombardi,Michael A1 - Lehmann,Christoph U. A1 - Fackler,Jim KW - computational physiology KW - Information Visualization KW - multivariate KW - time series AB - Current visualizations of electronic medical data in the Intensive Care Unit (ICU) consist of stacked univariate plots of variables over time and a tabular display of the current numeric values for the corresponding variables and occasionally an alarm limit. The value of information is dependent upon knowledge of historic values to determine a change in state. 
With the ability to acquire more historic information, providers need more sophisticated visualization tools to assist them in analyzing the data in a multivariate fashion over time. We present a multivariate time series visualization that is interactive and animated, and has proven to be as effective as current methods in the ICU for predicting an episode of acute hypotension in terms of accuracy, confidence, and efficiency with only 30-60 minutes of training. JA - Proceedings of the 1st ACM International Health Informatics Symposium T3 - IHI '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0030-8 UR - http://doi.acm.org/10.1145/1882992.1883109 M3 - 10.1145/1882992.1883109 ER - TY - CHAP T1 - Articulation-Invariant Representation of Non-planar Shapes T2 - Computer Vision – ECCV 2010 Y1 - 2010 A1 - Gopalan,Raghuraman A1 - Turaga,Pavan A1 - Chellapa, Rama ED - Daniilidis,Kostas ED - Maragos,Petros ED - Paragios,Nikos AB - Given a set of points corresponding to a 2D projection of a non-planar shape, we would like to obtain a representation invariant to articulations (under no self-occlusions). It is a challenging problem since we need to account for the changes in 2D shape due to 3D articulations, viewpoint variations, as well as the varying effects of the imaging process on different regions of the shape due to its non-planarity. By modeling an articulating shape as a combination of approximate convex parts connected by non-convex junctions, we propose to preserve distances between a pair of points by (i) estimating the parts of the shape through approximate convex decomposition, by introducing a robust measure of convexity and (ii) performing part-wise affine normalization by assuming a weak perspective camera model, and then relating the points using the inner distance which is insensitive to planar articulations. 
We demonstrate the effectiveness of our representation on a dataset with non-planar articulations, and on standard shape retrieval datasets like MPEG-7. JA - Computer Vision – ECCV 2010 T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6313 SN - 978-3-642-15557-4 UR - http://dx.doi.org/10.1007/978-3-642-15558-1_21 ER - TY - JOUR T1 - Audio visual scene analysis using spherical arrays and cameras. JF - The Journal of the Acoustical Society of America Y1 - 2010 A1 - O'donovan,Adam A1 - Duraiswami, Ramani A1 - Zotkin,Dmitry N A1 - Gumerov, Nail A. AB - While audition and vision are used together by living beings to make sense of the world, the observation of the world using machines in applications such as surveillance and robotics has proceeded largely separately. We describe the use of spherical microphone arrays as “audio cameras” and a spherical array of video cameras as a tool to perform multi‐modal scene analysis that attempts to answer questions such as “Who?,” “What?,” “Where?,” and “Why?.” Signal processing algorithms to identify the number of people and their identities and to isolate and dereverberate their speech using multi‐modal processing will be described. The use of graphics processor based signal processing allows for real‐time implementation of these algorithms. [Work supported by ONR.] VL - 127 UR - http://link.aip.org/link/?JAS/127/1979/3 CP - 3 M3 - 10.1121/1.3385079 ER - TY - CONF T1 - Automatic matched filter recovery via the audio camera T2 - Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on Y1 - 2010 A1 - O'Donovan,A.E. 
A1 - Duraiswami, Ramani A1 - Zotkin,Dmitry N KW - matched filter recovery KW - audio camera KW - microphone array KW - microphone arrays KW - beamforming KW - geometric constraints KW - real-time audio images KW - room impulse response KW - acoustic sensor KW - sound source KW - source/receiver positions KW - acoustic sources KW - acoustic radiators KW - array signal processing KW - cameras KW - matched filters KW - transient response AB - The sound reaching the acoustic sensor in a realistic environment contains not only the part arriving directly from the sound source but also a number of environmental reflections. The effect of those on the sound is equivalent to a convolution with the room impulse response and can be undone via deconvolution - a technique known as matched filter processing. However, the filter is usually pre-computed in advance using known room geometry and source/receiver positions, and any deviations from those cause the performance to degrade significantly. In this work, an algorithm is proposed to compute the matched filter automatically using an audio camera - a microphone array based system that provides real-time audio images (essentially plots of steered response power in various directions) of the environment. Acoustic sources, as well as their significant reflections, are revealed as peaks in the audio image. The reflections are associated with sound source(s) using an acoustic similarity metric, and an approximate matched filter is computed to align the reflections in time with the direct arrival. Preliminary experimental evaluation of the method is performed. It is shown that in the case of two sources the reflections are identified correctly, the time delays recovered agree well with those computed from geometric constraints, and that the output SNR improves when the reflections are added coherently to the signal obtained by beamforming directly at the source.
JA - Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on M3 - 10.1109/ICASSP.2010.5496187 ER - TY - CONF T1 - Broadening student enthusiasm for computer science with a great insights course T2 - Proceedings of the 41st ACM technical symposium on Computer science education Y1 - 2010 A1 - desJardins, Marie A1 - Littman,Michael KW - attitudes towards computing KW - introductory courses AB - We describe the "Great Insights in Computer Science" courses that are taught at Rutgers and UMBC. These courses were designed independently, but have in common a broad, engaging introduction to computing for non-majors. Both courses include a programming component to help the students gain an intuition for computational concepts, but neither is primarily programming focused. We present data to show that these courses attract a diverse group of students; are rated positively; and increase students' understanding of, and attitudes towards, computing and computational issues. JA - Proceedings of the 41st ACM technical symposium on Computer science education T3 - SIGCSE '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0006-3 UR - http://doi.acm.org/10.1145/1734263.1734317 M3 - 10.1145/1734263.1734317 ER - TY - JOUR T1 - Building a dynamic reputation system for DNS JF - 19th Usenix Security Symposium Y1 - 2010 A1 - Antonakakis,M. A1 - Perdisci,R. A1 - Dagon,D. A1 - Lee,W. A1 - Feamster, Nick AB - The Domain Name System (DNS) is an essential protocol used by both legitimate Internet applications and cyber attacks. For example, botnets rely on DNS to support agile command and control infrastructures. An effective way to disrupt these attacks is to place malicious domains on a “blocklist” (or “blacklist”) or to add a filtering rule in a firewall or network intrusion detection system. To evade such security countermeasures, attackers have used DNS agility, e.g., by using new domains daily to evade static blacklists and firewalls.
In this paper we propose Notos, a dynamic reputation system for DNS. The premise of this system is that malicious, agile use of DNS has unique characteristics and can be distinguished from legitimate, professionally provisioned DNS services. Notos uses passive DNS query data and analyzes the network and zone features of domains. It builds models of known legitimate domains and malicious domains, and uses these models to compute a reputation score for a new domain indicative of whether the domain is malicious or legitimate. We have evaluated Notos in a large ISP’s network with DNS traffic from 1.4 million users. Our results show that Notos can identify malicious domains with high accuracy (true positive rate of 96.8%) and low false positive rate (0.38%), and can identify these domains weeks or even months before they appear in public blacklists. ER - TY - CONF T1 - cdec: A decoder, alignment, and learning framework for finite-state and context-free translation models T2 - Proceedings of the ACL 2010 System Demonstrations Y1 - 2010 A1 - Dyer,C. A1 - Weese,J. A1 - Setiawan,H. A1 - Lopez,A. A1 - Ture,F. A1 - Eidelman,V. A1 - Ganitkevitch,J. A1 - Blunsom,P. A1 - Resnik, Philip JA - Proceedings of the ACL 2010 System Demonstrations ER - TY - JOUR T1 - Characterizing scattering coefficients numerically via the fast multipole accelerated boundary element method. JF - The Journal of the Acoustical Society of America Y1 - 2010 A1 - Gumerov, Nail A. A1 - Duraiswami, Ramani AB - Various panels are used in acoustical installations to provide desired characteristics to spaces to be used for listening. These panels have complex shapes with geometrical features that have sizes corresponding to wavelengths of sounds of interest. The complex interaction of acoustical waves with these shapes is what gives these surfaces their desirable properties.
Experimental characterization of the acoustical properties of these surfaces under random and specular incidence is relatively time-consuming. An alternate procedure is to numerically simulate the scattering behavior, and then compute the coefficients of interest from the simulation. A significant obstacle to such computations is the time taken for simulation. Fast multipole acceleration of boundary element methods [Gumerov & Duraiswami, J. Acoust. Soc. Am. 125 (2009)] is a promising approach to speeding up computations. We report on the application of this method to the computation of various scattering coefficients. VL - 127 UR - http://link.aip.org/link/?JAS/127/1751/3 CP - 3 M3 - 10.1121/1.3383664 ER - TY - JOUR T1 - Children as codesigners of new technologies: Valuing the imagination to transform what is possible JF - New Directions for Youth Development Y1 - 2010 A1 - Druin, Allison VL - 2010 CP - 128 ER - TY - CONF T1 - Children's roles using keyword search interfaces at home T2 - Proceedings of the 28th international conference on Human factors in computing systems Y1 - 2010 A1 - Druin, Allison A1 - Foss,E. A1 - Hutchinson,H. A1 - Golub,E. A1 - Hatley,L. JA - Proceedings of the 28th international conference on Human factors in computing systems ER - TY - CONF T1 - Clear Panels: a technique to design mobile application interactivity T2 - Proceedings of the 8th ACM Conference on Designing Interactive Systems Y1 - 2010 A1 - Brown,Q. A1 - Bonsignore,E. A1 - Hatley,L. A1 - Druin, Allison A1 - Walsh,G. A1 - Foss,E. A1 - Brewer,R. A1 - Hammer,J. A1 - Golub,E. JA - Proceedings of the 8th ACM Conference on Designing Interactive Systems ER - TY - JOUR T1 - Comparative genomic analysis reveals evidence of two novel Vibrio species closely related to V. cholerae JF - BMC Microbiology Y1 - 2010 A1 - Haley,Bradd A1 - Grim,Christopher A1 - Hasan,Nur A1 - Choi,Seon-Young A1 - Chun,Jongsik A1 - Brettin,Thomas A1 - Bruce,David A1 - Challacombe,Jean A1 - Detter,J. Chris A1 - Han,Cliff
A1 - Rita R Colwell AB - In recent years genome sequencing has been used to characterize new bacterial species, a method of analysis available as a result of improved methodology and reduced cost. Included in a constantly expanding list of Vibrio species are several that have been reclassified as novel members of the Vibrionaceae. This study describes two putative new Vibrio species, Vibrio sp. RC341 and Vibrio sp. RC586, previously characterized as non-toxigenic environmental variants of V. cholerae, for which we propose the names V. metecus and V. parilis, respectively. Results: Based on results of whole-genome average nucleotide identity (ANI), average amino acid identity (AAI), rpoB similarity, MLSA, and phylogenetic analysis, the new species are concluded to be phylogenetically closely related to V. cholerae and V. mimicus. Vibrio sp. RC341 and Vibrio sp. RC586 demonstrate features characteristic of V. cholerae and V. mimicus, respectively, on differential and selective media, but their genomes show a 12 to 15% divergence (88 to 85% ANI and 92 to 91% AAI) compared to the sequences of V. cholerae and V. mimicus genomes (ANI <95% and AAI <96% indicative of separate species). Vibrio sp. RC341 and Vibrio sp. RC586 share 2104 ORFs (59%) and 2058 ORFs (56%) with the published core genome of V. cholerae and 2956 (82%) and 3048 ORFs (84%) with V. mimicus MB-451, respectively. The novel species share 2926 ORFs with each other (81% Vibrio sp. RC341 and 81% Vibrio sp. RC586). Virulence-associated factors and genomic islands of V. cholerae and V. mimicus, including VSP-I and II, were found in these environmental Vibrio spp. Conclusions: Results of this analysis demonstrate that these two environmental vibrios, previously characterized as variant V. cholerae strains, are new species that have evolved from ancestral lineages of the V. cholerae and V. mimicus clade.
The presence of conserved integration loci for genomic islands as well as evidence of horizontal gene transfer between these two new species, V. cholerae, and V. mimicus suggests that genomic islands and virulence factors are transferred between these species. VL - 10 ER - TY - CHAP T1 - Compressive Acquisition of Dynamic Scenes T2 - Computer Vision – ECCV 2010 Y1 - 2010 A1 - Sankaranarayanan,Aswin A1 - Turaga,Pavan A1 - Baraniuk,Richard A1 - Chellappa, Rama ED - Daniilidis,Kostas ED - Maragos,Petros ED - Paragios,Nikos AB - Compressive sensing (CS) is a new approach for the acquisition and recovery of sparse signals and images that enables sampling rates significantly below the classical Nyquist rate. Despite significant progress in the theory and methods of CS, little headway has been made in compressive video acquisition and recovery. Video CS is complicated by the ephemeral nature of dynamic events, which makes direct extensions of standard CS imaging architectures and signal models infeasible. In this paper, we develop a new framework for video CS for dynamic textured scenes that models the evolution of the scene as a linear dynamical system (LDS). This reduces the video recovery problem to first estimating the model parameters of the LDS from compressive measurements, from which the image frames are then reconstructed. We exploit the low-dimensional dynamic parameters (the state sequence) and high-dimensional static parameters (the observation matrix) of the LDS to devise a novel compressive measurement strategy that measures only the dynamic part of the scene at each instant and accumulates measurements over time to estimate the static parameters. This enables us to considerably lower the compressive measurement rate. We validate our approach with a range of experiments, including classification experiments that highlight the effectiveness of the proposed approach.
JA - Computer Vision – ECCV 2010 T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6311 SN - 978-3-642-15548-2 UR - http://dx.doi.org/10.1007/978-3-642-15549-9_10 ER - TY - JOUR T1 - Computation of the head-related transfer function via the fast multipole accelerated boundary element method and its spherical harmonic representation JF - The Journal of the Acoustical Society of America Y1 - 2010 A1 - Gumerov, Nail A. A1 - O'Donovan,Adam E. A1 - Duraiswami, Ramani A1 - Zotkin,Dmitry N KW - auditory evoked potentials KW - bioacoustics KW - boundary-elements methods KW - Ear KW - Harmonic analysis AB - The head-related transfer function (HRTF) is computed using the fast multipole accelerated boundary element method. For efficiency, the HRTF is computed using the reciprocity principle by placing a source at the ear and computing its field. Analysis is presented to modify the boundary value problem accordingly. To compute the HRTF corresponding to different ranges via a single computation, a compact and accurate representation of the HRTF, termed the spherical spectrum, is developed. Computations are reduced to a two-stage process: the computation of the spherical spectrum and a subsequent evaluation of the HRTF. This representation allows easy interpolation and range extrapolation of HRTFs. HRTF computations are performed for the range of audible frequencies up to 20 kHz for several models including a sphere, human head models [the Neumann KU-100 (“Fritz”) and the Knowles KEMAR (“Kemar”) manikins], and a head-and-torso model (the Kemar manikin). Comparisons between the different cases are provided. Comparisons with the computational data of other authors and available experimental data are conducted and show satisfactory agreement for the frequencies for which reliable experimental data are available.
Results show that, given a good mesh, it is feasible to compute the HRTF over the full audible range on a regular personal computer. VL - 127 UR - http://link.aip.org/link/?JAS/127/370/1 CP - 1 M3 - 10.1121/1.3257598 ER - TY - CONF T1 - On Computing Compression Trees for Data Collection in Wireless Sensor Networks T2 - 2010 Proceedings IEEE INFOCOM Y1 - 2010 A1 - Li,Jian A1 - Deshpande, Amol A1 - Khuller, Samir KW - Approximation algorithms KW - Base stations KW - Communications Society KW - Computer networks KW - Computer science KW - computing compression trees KW - Costs KW - data collection KW - Data communication KW - data compression KW - designing algorithms KW - Educational institutions KW - Entropy KW - graph concept KW - Monitoring KW - Protocols KW - trees (mathematics) KW - weakly connected dominating sets KW - Wireless sensor networks AB - We address the problem of efficiently gathering correlated data from a wireless sensor network, with the aim of designing algorithms with provable optimality guarantees, and understanding how close we can get to the known theoretical lower bounds. Our proposed approach is based on finding an optimal or a near-optimal compression tree for a given sensor network: a compression tree is a directed tree over the sensor network nodes such that the value of a node is compressed using the value of its parent. We focus on the broadcast communication model in this paper, but our results are more generally applicable to a unicast communication model as well. We draw connections between the data collection problem and a previously studied graph concept called weakly connected dominating sets, and we use this to develop novel approximation algorithms for the problem. We present comparative results on several synthetic and real-world datasets showing that our algorithms construct near-optimal compression trees that yield a significant reduction in the data collection cost.
JA - 2010 Proceedings IEEE INFOCOM PB - IEEE SN - 978-1-4244-5836-3 M3 - 10.1109/INFCOM.2010.5462035 ER - TY - JOUR T1 - Confidence-based feature acquisition to minimize training and test costs JF - Proceedings of the SIAM Conference on Data Mining Y1 - 2010 A1 - desJardins, Marie A1 - MacGlashan,James A1 - Wagstaff,Kiri L AB - We present Confidence-based Feature Acquisition (CFA), a novel supervised learning method for acquiring missing feature values when there is missing data at both training and test time. Previous work has considered the cases of missing data at training time (e.g., Active Feature Acquisition, AFA [8]), or at test time (e.g., Cost-Sensitive Naive Bayes, CSNB [2]), but not both. At training time, CFA constructs a cascaded ensemble of classifiers, starting with the zero-cost features and adding a single feature for each successive model. For each model, CFA selects a subset of training instances for which the added feature should be acquired. At test time, the set of models is applied sequentially (as a cascade), stopping when a user-supplied confidence threshold is met. We compare CFA to AFA, CSNB, and several other baselines, and find that CFA's accuracy is at least as high as the other methods, while incurring significantly lower feature acquisition costs. VL - 76 CP - 373 M3 - 10.2307/2287057 ER - TY - JOUR T1 - Connecting generations: developing co-design methods for older adults and children JF - Behaviour & Information Technology Y1 - 2010 A1 - Xie,B. A1 - Druin, Allison A1 - Fails,J. A1 - Massey,S. A1 - Golub,E. A1 - Franckel,S. A1 - Schneider,K. VL - 99999 CP - 1 ER - TY - CONF T1 - Context-Aware and Content-Based Dynamic Voronoi Page Segmentation T2 - The Ninth IAPR International Workshop on Document Analysis Systems Y1 - 2010 A1 - Agrawal,Mudit A1 - David Doermann AB - This paper presents a dynamic approach to document page segmentation based on inter-component relationships, local patterns and context features.
State-of-the-art page segmentation algorithms segment zones based on local properties of neighboring connected components, such as distance and orientation, and do not typically consider additional properties other than size. Our proposed approach uses a contextually aware and dynamically adaptive page segmentation scheme. The page is first over-segmented using a dynamically adaptive scheme of separation features based on [2] and adapted from [13]. A decision to form zones is then based on the context built from these local separation features and high-level content features. Zone-based evaluation was performed on sets of printed and handwritten documents in English and Arabic scripts with multiple font types and sizes, and we achieved an increase of 15% over the accuracy reported in [2]. JA - The Ninth IAPR International Workshop on Document Analysis Systems ER - TY - JOUR T1 - Deploying sensor networks with guaranteed fault tolerance JF - IEEE/ACM Transactions on Networking (TON) Y1 - 2010 A1 - Bredin,J. L A1 - Demaine,E. D A1 - Hajiaghayi, Mohammad T. A1 - Rus,D. VL - 18 CP - 1 ER - TY - RPRT T1 - Design, Analysis and Comparison of Spatial Indexes for Tetrahedral Meshes Y1 - 2010 A1 - De Floriani, Leila A1 - Fellegara,R. A1 - Magillo,P. AB - We address the problem of performing spatial queries on tetrahedral meshes. These arise in several application domains including 3D GIS, scientific visualization, and finite element analysis. We have defined and implemented a family of spatial indexes that we call tetrahedral trees. Tetrahedral trees are based on a subdivision of a cubic domain containing the mesh, defined either by an octree or a 3D kd-tree. For each of them, we have four variants of the spatial index, depending on four different subdivision criteria. Here, we present such indexes, discuss how to construct them, and perform classical spatial queries such as point location and window queries.
We compare the various tetrahedral trees based on memory usage, performance in spatial queries, and the computation times for constructing them. PB - Department of Computer Science and Information Science, University of Genoa VL - DISI-TR-2010-05 ER - TY - CHAP T1 - Dynamic and Multidimensional Dataflow Graphs T2 - Handbook of Signal Processing Systems Y1 - 2010 A1 - Bhattacharyya, Shuvra S. A1 - Deprettere, Ed F. A1 - Keinert, Joachim ED - Bhattacharyya, Shuvra S. ED - Deprettere, Ed F. ED - Leupers, Rainer ED - Takala, Jarmo KW - Communications Engineering, Networks KW - Computer Systems Organization and Communication Networks KW - Processor Architectures KW - Signal, Image and Speech Processing AB - Much of the work to date on dataflow models for signal processing system design has focused on decidable dataflow models that are best suited for one-dimensional signal processing. In this chapter, we review more general dataflow modeling techniques that are targeted to applications that include multidimensional signal processing and dynamic dataflow behavior. As dataflow techniques are applied to signal processing systems that are more complex, and demand increasing degrees of agility and flexibility, these classes of more general dataflow models are of correspondingly increasing interest. We begin with a discussion of two dataflow modeling techniques - multi-dimensional synchronous dataflow and windowed dataflow - that are targeted towards multidimensional signal processing applications. We then provide a motivation for dynamic dataflow models of computation, and review a number of specific methods that have emerged in this class of models. Our coverage of dynamic dataflow models in this chapter includes Boolean dataflow, the stream-based function model, CAL, parameterized dataflow, and enable-invoke dataflow.
JA - Handbook of Signal Processing Systems PB - Springer US SN - 978-1-4419-6344-4, 978-1-4419-6345-1 UR - http://link.springer.com/chapter/10.1007/978-1-4419-6345-1_32 ER - TY - JOUR T1 - Efficient kriging for real-time spatio-temporal interpolation JF - Proceedings of the 20th Conference on Probability and Statistics in the Atmospheric Sciences Y1 - 2010 A1 - Srinivasan,B.V. A1 - Duraiswami, Ramani A1 - Murtugudde,R. AB - Atmospheric data is often recorded at scattered station locations. While the data is generally available over a long period of time, it cannot be used directly for extracting coherent patterns and mechanistic correlations. The only recourse is to spatially and temporally interpolate the data, both to organize the station recordings to a regular grid and to query the data for predictions at a particular location or time of interest. Spatio-temporal interpolation approaches require the evaluation of weights at each point of interest. A widely used interpolation approach is kriging. However, kriging has a computational cost that scales as the cube of the number of data points N, resulting in cubic time complexity for each point of interest, which leads to a time complexity of O(N^4) for interpolation at O(N) points. In this work, we formulate the kriging problem to first reduce the computational cost to O(N^3). We use an iterative solver (Saad, 2003), and further accelerate the solver using fast summation algorithms like GPUML (Srinivasan and Duraiswami, 2009) or FIGTREE (Morariu et al., 2008). We illustrate the speedup on synthetic data and compare the performance with other standard kriging approaches to demonstrate substantial improvement in the performance of our approach. We then apply the developed approach to ocean color data from the Chesapeake Bay and present some quantitative analysis of the kriged results.
ER - TY - CHAP T1 - Evolutionary framework for Lepidoptera model systems T2 - Genetics and Molecular Biology of Lepidoptera Y1 - 2010 A1 - Roe,A. A1 - Weller,S. A1 - Baixeras,J. A1 - Brown,J. W A1 - Cummings, Michael P. A1 - Davis,DR A1 - Horak,M A1 - Kawahara,A. Y A1 - Mitter,C A1 - Parr,C.S. A1 - Regier,J. C A1 - Rubinoff,D A1 - Simonsen,TJ A1 - Wahlberg,N A1 - Zwick,A. ED - Goldsmith,M ED - Marec,F AB - “Model systems” are specific organisms upon which detailed studies have been conducted examining a fundamental biological question. If the studies are robust, their results can be extrapolated among an array of organisms that possess features in common with the subject organism. The true power of model systems lies in the ability to extrapolate these details across larger groups of organisms. In order to generalize these results, comparative studies are essential and require that model systems be placed into their evolutionary or phylogenetic context. This chapter examines model systems in the insect order Lepidoptera from the perspective of several different superfamilies. Historically, many species of Lepidoptera have been essential in the development of invaluable model systems in the fields of developmental biology, genetics, molecular biology, physiology, co-evolution, population dynamics, and ecology. JA - Genetics and Molecular Biology of Lepidoptera PB - Taylor & Francis CY - Boca Raton ER - TY - CHAP T1 - An Experimental Study of Color-Based Segmentation Algorithms Based on the Mean-Shift Concept T2 - Computer Vision – ECCV 2010 Y1 - 2010 A1 - Bitsakos,K. A1 - Fermüller, Cornelia A1 - Aloimonos, J.
ED - Daniilidis,Kostas ED - Maragos,Petros ED - Paragios,Nikos AB - We point out a difference between the original mean-shift formulation of Fukunaga and Hostetler and the common variant in the computer vision community, namely whether the pairwise comparison is performed with the original or with the filtered image of the previous iteration. This leads to a new hybrid algorithm, called Color Mean Shift, that, roughly speaking, treats color as Fukunaga’s algorithm and spatial coordinates as Comaniciu’s algorithm. We perform experiments to evaluate how different kernel functions and color spaces affect the final filtering and segmentation results, and the computational speed, using the Berkeley and Weizmann segmentation databases. We conclude that the new method gives better results than existing mean shift ones on four standard comparison measures ( improvement on RAND and BDE measures respectively for color images), with slightly higher running times ( ). Overall, the new method produces segmentations comparable in quality to the ones obtained with current state-of-the-art segmentation algorithms. JA - Computer Vision – ECCV 2010 T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6312 SN - 978-3-642-15551-2 UR - http://dx.doi.org/10.1007/978-3-642-15552-9_37 ER - TY - JOUR T1 - Fast Computation of Kernel Estimators JF - Journal of Computational and Graphical Statistics Y1 - 2010 A1 - Raykar,Vikas C. A1 - Duraiswami, Ramani A1 - Zhao,Linda H. AB - The computational complexity of evaluating the kernel density estimate (or its derivatives) at m evaluation points given n sample points scales quadratically as O(nm), making it prohibitively expensive for large datasets. While approximate methods like binning could speed up the computation, they lack a precise control over the accuracy of the approximation.
There is no straightforward way of choosing the binning parameters a priori in order to achieve a desired approximation error. We propose a novel computationally efficient ε-exact approximation algorithm for the univariate Gaussian kernel-based density derivative estimation that reduces the computational complexity from O(nm) to linear O(n+m). The user can specify a desired accuracy ε. The algorithm guarantees that the actual error between the approximation and the original kernel estimate will always be less than ε. We also apply our proposed fast algorithm to speed up automatic bandwidth selection procedures. We compare our method to the best available binning methods in terms of speed and accuracy. Our experimental results show that the proposed method is almost twice as fast as the best binning methods and is around five orders of magnitude more accurate. The software for the proposed method is available online. VL - 19 SN - 1061-8600 UR - http://www.tandfonline.com/doi/abs/10.1198/jcgs.2010.09046 CP - 1 M3 - 10.1198/jcgs.2010.09046 ER - TY - JOUR T1 - Fast matrix-vector product based fgmres for kernel machines JF - Copper Mountain Conference on Iterative Methods Y1 - 2010 A1 - Srinivasan,B.V. A1 - Duraiswami, Ramani A1 - Gumerov, Nail A. AB - Kernel based approaches for machine learning have gained huge interest in the past decades because of their robustness. In some algorithms, the primary problem is the solution of a linear system involving the kernel matrix. Iterative Krylov approaches are often used to solve these efficiently [2, 3]. Fast matrix-vector products can be used to accelerate each Krylov iteration to further optimize the performance. In order to reduce the number of iterations of the Krylov approach, a preconditioner becomes necessary in many cases.
Several researchers have proposed flexible preconditioning methods where the preconditioner changes with each iteration, and this class of preconditioners has been shown to have good performance [6, 12]. In this paper, we use a Tikhonov-regularized kernel matrix as a preconditioner for flexible GMRES [12] to solve kernel matrix based systems of equations. We use a truncated conjugate gradient (CG) method to solve the preconditioner system and further accelerate each CG iteration using fast matrix-vector products. The convergence of the proposed preconditioned GMRES is shown on synthetic data. The performance is further validated on problems in Gaussian process regression and radial basis function interpolation. Improvements are seen in each case. VL - 2 ER - TY - JOUR T1 - Fast radial basis function interpolation via preconditioned Krylov iteration JF - SIAM Journal on Scientific Computing Y1 - 2010 A1 - Gumerov, Nail A. A1 - Duraiswami, Ramani AB - We consider a preconditioned Krylov subspace iterative algorithm presented by Faul, Goodsell, and Powell (IMA J. Numer. Anal. 25 (2005), pp. 1–24) for computing the coefficients of a radial basis function interpolant over N data points. This preconditioned Krylov iteration has been demonstrated to be extremely robust to the distribution of the points and the iteration to be rapidly convergent. However, the iterative method has several steps whose computational and memory costs scale as O(N^2), both in preliminary computations that compute the preconditioner and in the matrix-vector product involved in each step of the iteration. We effectively accelerate the iterative method to achieve an overall cost of O(N log N). The matrix-vector product is accelerated via the use of the fast multipole method. The preconditioner requires the computation of a set of closest points to each point. We develop an O(N log N) algorithm for this step as well. Results are presented for multiquadric interpolation in R2 and biharmonic interpolation in R3.
A novel FMM algorithm for the evaluation of sums involving multiquadric functions in R2 is presented as well. VL - 29 CP - 5 ER - TY - JOUR T1 - Finishing genomes with limited resources: lessons from an ensemble of microbial genomes JF - BMC Genomics Y1 - 2010 A1 - Nagarajan,Niranjan A1 - Cook,Christopher A1 - Di Bonaventura,Maria Pia A1 - Ge,Hong A1 - Richards,Allen A1 - Bishop-Lilly,Kimberly A A1 - DeSalle,Robert A1 - Read,Timothy D. A1 - Pop, Mihai AB - While new sequencing technologies have ushered in an era where microbial genomes can be easily sequenced, the goal of routinely producing high-quality draft and finished genomes in a cost-effective fashion has still remained elusive. Due to shorter read lengths and limitations in library construction protocols, shotgun sequencing and assembly based on these technologies often result in fragmented assemblies. Correspondingly, while draft assemblies can be obtained in days, finishing can take many months, and hence the time and effort can only be justified for high-priority genomes and in large sequencing centers. In this work, we revisit this issue in light of our own experience in producing finished and nearly-finished genomes for a range of microbial species in a small-lab setting. These genomes were finished with surprisingly little investment in terms of time, computational effort, and lab work, suggesting that the increased access to sequencing might also eventually lead to a greater proportion of finished genomes from small labs and genomics cores.
VL - 11 SN - 1471-2164 UR - http://www.biomedcentral.com/1471-2164/11/242 CP - 1 M3 - 10.1186/1471-2164-11-242 ER - TY - CONF T1 - GEDI - A Groundtruthing Environment for Document Images T2 - Ninth IAPR International Workshop on Document Analysis Systems (DAS 2010) Y1 - 2010 A1 - David Doermann A1 - Zotkina, Elena A1 - Huiping Li AB - In this paper, we describe a freely available highly configurable document image annotation tool called GEDI – Groundtruthing Environment for Document Images. Its basic structure involves two types of files, an Image file, and a corresponding .xml file in GEDI format. When users begin ground truthing an image, they can configure the interface to allow the creation of different types of zones, each of which may have a custom set of “attributes”. The output is compatible with the UMDDocLib architecture [2] and has been used in numerous funded and unfunded programs to create datasets in multiple languages. GEDI has been developed and released to the community as a comprehensive tool that we hope will ease the burden of document annotation and encourage additional sharing of data. JA - Ninth IAPR International Workshop on Document Analysis Systems (DAS 2010) ER - TY - JOUR T1 - Generating Phrasal and Sentential Paraphrases: A Survey of Data-Driven Methods JF - Computational Linguistics Y1 - 2010 A1 - Madnani, Nitin A1 - Dorr, Bonnie J AB - The task of paraphrasing is inherently familiar to speakers of all languages. Moreover, the task of automatically generating or extracting semantic equivalences for the various units of language—words, phrases, and sentences—is an important part of natural language processing (NLP) and is being increasingly employed to improve the performance of several NLP applications.
In this article, we attempt to conduct a comprehensive and application-independent survey of data-driven phrasal and sentential paraphrase generation methods, while also conveying an appreciation for the importance and potential use of paraphrases in the field of NLP research. Recent work done in manual and automatic construction of paraphrase corpora is also examined. We also discuss the strategies used for evaluating paraphrase generation techniques and briefly explore some future trends in paraphrase generation. VL - 36 SN - 0891-2017 UR - http://dx.doi.org/10.1162/coli_a_00002 CP - 3 M3 - 10.1162/coli_a_00002 ER - TY - JOUR T1 - Genome Sequence of Hybrid Vibrio cholerae O1 MJ-1236, B-33, and CIRS101 and Comparative Genomics with V. cholerae JF - Journal of Bacteriology Y1 - 2010 A1 - Grim, Christopher J. A1 - Hasan, Nur A. A1 - Taviani, Elisa A1 - Haley, Bradd A1 - Jongsik Chun A1 - Brettin, Thomas S. A1 - Bruce, David C. A1 - Detter, J. Chris A1 - Han, Cliff S. A1 - Chertkov, Olga A1 - Challacombe, Jean A1 - Huq, Anwar A1 - Nair, G. Balakrish A1 - Rita R Colwell AB - The genomes of Vibrio cholerae O1 Matlab variant MJ-1236, Mozambique O1 El Tor variant B33, and altered O1 El Tor CIRS101 were sequenced. All three strains were found to belong to the phylocore group 1 clade of V. cholerae, which includes the 7th-pandemic O1 El Tor and serogroup O139 isolates, despite displaying certain characteristics of the classical biotype. All three strains were found to harbor a hybrid variant of CTXΦ and an integrative conjugative element (ICE), leading to their establishment as successful clinical clones and the displacement of prototypical O1 El Tor. The absence of strain- and group-specific genomic islands, some of which appear to be prophages and phage-like elements, seems to be the most likely factor in the recent establishment of dominance of V. cholerae CIRS101 over the other two hybrid strains.
VL - 192 SN - 0021-9193, 1098-5530 UR - http://jb.asm.org/content/192/13/3524 CP - 13 M3 - 10.1128/JB.00040-10 ER - TY - CONF T1 - A geometric approach to curvature estimation on triangulated 3D shapes T2 - Int. Conf. on Computer Graphics Theory and Applications (GRAPP) Y1 - 2010 A1 - Mesmoudi, M. M. A1 - De Floriani, Leila A1 - Magillo, P. AB - We present a geometric approach to define discrete normal, principal, Gaussian and mean curvatures, that we call Ccurvature. Our approach is based on the notion of concentrated curvature of a polygonal line and a simulation of rotation of the normal plane of the surface at a point. The advantages of our approach are its simplicity and its natural meaning. A comparison with widely-used discrete methods is presented. JA - Int. Conf. on Computer Graphics Theory and Applications (GRAPP) ER - TY - JOUR T1 - GPUML: Graphical processors for speeding up kernel machines JF - Workshop on High Performance Analytics-Algorithms, Implementations, and Applications Y1 - 2010 A1 - Srinivasan, B.V. A1 - Hu, Q. A1 - Duraiswami, Ramani AB - Algorithms based on kernel methods play a central role in statistical machine learning. At their core are a number of linear algebra operations on matrices of kernel functions which take as arguments the training and testing data. These range from the simple matrix-vector product, to more complex matrix decompositions, and iterative formulations of these. Often the algorithms scale quadratically or cubically, both in memory and operational complexity, and as data sizes increase, kernel methods scale poorly. We use parallelized approaches on a multi-core graphical processor (GPU) to partially address this lack of scalability. GPUs are used to scale three different classes of problems: a simple kernel-matrix-vector product, iterative solution of linear systems of kernel functions, and QR and Cholesky decomposition of kernel matrices.
Application of these accelerated approaches to scaling several kernel based learning approaches is shown, and in each case substantial speedups are obtained. The core software is released as an open source package, GPUML. ER - TY - CONF T1 - A graph-theoretic approach to protect static and moving targets from adversaries T2 - Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1 Y1 - 2010 A1 - Dickerson, J. P. A1 - Simari, G. I A1 - V.S. Subrahmanian A1 - Kraus, Sarit KW - adversarial reasoning KW - agent systems KW - game theory AB - The static asset protection problem (SAP) in a road network is that of allocating resources to protect vertices, given any possible behavior by an adversary determined to attack those assets. The dynamic asset protection (DAP) problem is a version of SAP where the asset is following a fixed and widely known route (e.g., a parade route) and needs to be protected. We formalize what it means for a given allocation of resources to be "optimal" for protecting a desired set of assets, and show that randomly allocating resources to a single edge cut in the road network solves this problem. Unlike SAP, we show that DAP is not only an NP-complete problem, but that approximating DAP is also NP-hard. We provide the GreedyDAP heuristic algorithm to solve DAP and show experimentally that it works well in practice, using road network data for real cities. JA - Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1 T3 - AAMAS '10 PB - International Foundation for Autonomous Agents and Multiagent Systems CY - Richland, SC SN - 978-0-9826571-1-9 UR - http://dl.acm.org/citation.cfm?id=1838206.1838248 ER - TY - BOOK T1 - Handbook of Signal Processing Systems Y1 - 2010 A1 - Bhattacharyya, Shuvra S. A1 - Deprettere, Ed F.
KW - Computers / Information Theory KW - Technology & Engineering / Electrical KW - Technology & Engineering / Signals & Signal Processing AB - The Handbook is organized in four parts. The first part motivates representative applications that drive and apply state-of-the-art methods for design and implementation of signal processing systems; the second part discusses architectures for implementing these applications; the third part focuses on compilers and simulation tools; and the fourth part describes models of computation and their associated design tools and methodologies. PB - Springer SN - 9781441963451 ER - TY - CONF T1 - Handwritten Arabic text line segmentation using affinity propagation T2 - Proceedings of the 9th IAPR International Workshop on Document Analysis Systems Y1 - 2010 A1 - Kumar, Jayant A1 - Abd-Almageed, Wael A1 - Kang, Le A1 - David Doermann KW - affinity propagation KW - arabic KW - arabic documents KW - breadth-first search KW - clustering KW - dijkstra's shortest path algorithm KW - handwritten documents KW - line detection KW - text line segmentation AB - In this paper, we present a novel graph-based method for extracting handwritten text lines in monochromatic Arabic document images. Our approach consists of two steps: coarse text line estimation using primary components, which define the line, and assignment of diacritic components, which are more difficult to associate with a given line. We first estimate local orientation at each primary component to build a sparse similarity graph. We then use a shortest path algorithm to compute similarities between non-neighboring components. From this graph, we obtain coarse text lines using two estimates obtained from Affinity propagation and Breadth-first search. In the second step, we assign secondary components to each text line. The proposed method is very fast and robust to non-uniform skew and character size variations, normally present in handwritten text lines.
We evaluate our method using a pixel-matching criterion, and report 96% accuracy on a dataset of 125 Arabic document images. We also present a proximity analysis on datasets generated by artificially decreasing the spacing between text lines to demonstrate the robustness of our approach. JA - Proceedings of the 9th IAPR International Workshop on Document Analysis Systems T3 - DAS '10 PB - ACM CY - New York, NY, USA SN - 978-1-60558-773-8 UR - http://doi.acm.org/10.1145/1815330.1815348 M3 - 10.1145/1815330.1815348 ER - TY - CONF T1 - Image classification of vascular smooth muscle cells T2 - Proceedings of the 1st ACM International Health Informatics Symposium Y1 - 2010 A1 - Grasso, Michael A. A1 - Mokashi, Ronil A1 - Dalvi, Darshana A1 - Cardone, Antonio A1 - Dima, Alden A. A1 - Bhadriraju, Kiran A1 - Plant, Anne L. A1 - Brady, Mary A1 - Yesha, Yaacov A1 - Yesha, Yelena KW - cell biology KW - digital image processing KW - machine learning JA - Proceedings of the 1st ACM International Health Informatics Symposium T3 - IHI '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0030-8 UR - http://doi.acm.org/10.1145/1882992.1883068 M3 - 10.1145/1882992.1883068 ER - TY - JOUR T1 - On the importance of sharing negative results JF - SIGKDD explorations Y1 - 2010 A1 - Giraud-Carrier, C. A1 - Dunham, M.H. A1 - Atreya, A. A1 - Elkan, C. A1 - Perlich, C. A1 - Swirszcz, G. A1 - Shi, X. A1 - Philip, S.Y. A1 - Fürnkranz, J. A1 - Sima, J.F. VL - 12 CP - 2 ER - TY - THES T1 - Improving the Dependability of Distributed Systems Through AIR Software Upgrades Y1 - 2010 A1 - Tudor Dumitras AB - Traditional fault-tolerance mechanisms concentrate almost entirely on responding to, avoiding, or tolerating unexpected faults or security violations. However, scheduled events, such as software upgrades, account for most of the system unavailability and often introduce data corruption or latent errors.
Through two empirical studies, this dissertation identifies the leading causes of upgrade failure—breaking hidden dependencies—and of planned downtime—complex data conversions—in distributed enterprise systems. These findings represent the foundation of a new benchmark for software-upgrade dependability. This dissertation further introduces the AIR properties—ATOMICITY, ISOLATION and RUNTIME-TESTING—required for improving the dependability of distributed systems that undergo major software upgrades. The AIR properties are realized in Imago, a system designed to reduce both planned and unplanned downtime by upgrading distributed systems end-to-end. Imago builds upon the idea of isolating the production system from the upgrade operations, in order to avoid breaking hidden dependencies and to decouple the data conversions from the normal system operation. Imago includes novel mechanisms, such as providing a parallel universe for the new version, performing data conversions opportunistically, intercepting the live workload at the ingress and egress points, or executing an atomic switchover to the new version, which allow it to deliver the AIR properties. Imago harnesses opportunities provided by the emerging cloud-computing technologies, by trading resource overhead (needed by the parallel universe) for an improved dependability of the software upgrades. This approach separates the functional aspects of the upgrade from the mechanisms for online upgrade, enabling an upgrade-as-a-service model. This dissertation also describes techniques for assessing the impact of software upgrades, in order to reason about the implications of relaxing the AIR guarantees.
PB - Carnegie Mellon University CY - Pittsburgh, PA, USA ER - TY - CONF T1 - Increasing representational power and scaling reasoning in probabilistic databases T2 - Proceedings of the 13th International Conference on Database Theory Y1 - 2010 A1 - Deshpande, Amol AB - Increasing numbers of real-world application domains are generating data that is inherently noisy, incomplete, and probabilistic in nature. Statistical analysis and probabilistic inference, widely used in those domains, often introduce additional layers of uncertainty. Examples include sensor data analysis, data integration and information extraction on the Web, social network analysis, and scientific and biomedical data management. Managing and querying such data requires us to combine the tools and the techniques from a variety of disciplines including databases, first-order logic, and probabilistic reasoning. There has been much work at the intersection of these research areas in recent years. The work on probabilistic databases has made great advances in efficiently executing SQL and inference queries over large-scale uncertain datasets [2, 1]. The research in first-order probabilistic models like probabilistic relational models [5], Markov logic networks [10] etc. (see Getoor and Taskar [6] for a comprehensive overview), and the work on lifted inference [9, 3, 8, 11] has resulted in several techniques for efficiently integrating first-order logic and probabilistic reasoning. In this talk, I will present some of the foundations of large-scale probabilistic data management, and the challenges in scaling the representational power and the reasoning capabilities of probabilistic databases. I will use the PrDB probabilistic data management system being developed at the University of Maryland as a case study for this purpose [4, 7, 12]. 
Unlike the other recent work on probabilistic databases, PrDB is designed to represent uncertain data with rich correlation structures, and it uses probabilistic graphical models as the basic representation model. I will discuss how PrDB supports compact specification of uncertainties at different abstraction levels, from "schema-level" uncertainties that apply to entire relations to "tuple-specific" uncertainties that apply to a specific tuple or a specific set of tuples; I will also discuss how this relates to the work on first-order probabilistic models. Query evaluation in PrDB can be formulated as inference in appropriately constructed graphical models, and I will briefly present some of the key novel techniques that we have developed for efficient query evaluation, and their relationship to recent work on efficient lifted inference. I will conclude with a discussion of some of the open research challenges moving forward. JA - Proceedings of the 13th International Conference on Database Theory T3 - ICDT '10 PB - ACM CY - New York, NY, USA SN - 978-1-60558-947-3 UR - http://doi.acm.org/10.1145/1804669.1804671 M3 - 10.1145/1804669.1804671 ER - TY - JOUR T1 - Insights into head-related transfer function: Spatial dimensionality and continuous representation JF - The Journal of the Acoustical Society of America Y1 - 2010 A1 - Zhang,Wen A1 - Abhayapala,Thushara D. A1 - Kennedy,Rodney A. A1 - Duraiswami, Ramani KW - acoustic signal processing KW - Bessel functions KW - Fourier series KW - hearing KW - Transfer functions AB - This paper studies head-related transfer function (HRTF) sampling and synthesis in a three-dimensional auditory scene based on a general modal decomposition of the HRTF in all frequency-range-angle domains. The main finding is that the HRTF decomposition with the derived spatial basis function modes can be well approximated by a finite number, which is defined as the spatial dimensionality of the HRTF. 
The dimensionality determines the minimum number of parameters to represent the HRTF corresponding to all directions and also the required spatial resolution in HRTF measurement. The general model is further developed to a continuous HRTF representation, in which the normalized spatial modes can achieve HRTF near-field and far-field representations in one formulation. The remaining HRTF spectral components are compactly represented using a Fourier spherical Bessel series, where the aim is to generate the HRTF with much higher spectral resolution in fewer parameters from typical measurements, which usually have limited spectral resolution constrained by sampling conditions. A low-computation algorithm is developed to obtain the model coefficients from the existing measurements. The HRTF synthesis using the proposed model is validated by three sets of data: (i) synthetic HRTFs from the spherical head model, (ii) the MIT KEMAR (Knowles Electronics Mannequin for Acoustics Research) data, and (iii) 45-subject CIPIC HRTF measurements. VL - 127 UR - http://link.aip.org/link/?JAS/127/2347/1 CP - 4 M3 - 10.1121/1.3336399 ER - TY - JOUR T1 - Interlingual Annotation of Parallel Text Corpora: A New Framework for Annotation and Evaluation JF - Natural Language Engineering Y1 - 2010 A1 - Dorr, Bonnie J A1 - Passonneau,Rebecca J. A1 - Farwell,David A1 - Green,Rebecca A1 - Habash,Nizar A1 - Helmreich,Stephen A1 - Hovy,Eduard A1 - Levin,Lori A1 - Miller,Keith J. A1 - Mitamura,Teruko A1 - Rambow,Owen A1 - Siddharthan,Advaith AB - This paper focuses on an important step in the creation of a system of meaning representation and the development of semantically annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to annotate multiple translations of foreign-language texts with interlingual content. 
Three levels of representation are introduced: deep syntactic dependencies (IL0), intermediate semantic representations (IL1), and a normalized representation that unifies conversives, nonliteral language, and paraphrase (IL2). The resulting annotated, multilingually induced, parallel corpora will be useful as an empirical basis for a wide range of research, including the development and evaluation of interlingual NLP systems and paraphrase-extraction systems as well as a host of other research and development efforts in theoretical and applied linguistics, foreign language pedagogy, translation studies, and other related disciplines. VL - 16 CP - 03 M3 - 10.1017/S1351324910000070 ER - TY - CONF T1 - Investigating the impact of design processes on children T2 - Proceedings of the 9th International Conference on Interaction Design and Children Y1 - 2010 A1 - Guha,M.L. A1 - Druin, Allison A1 - Fails,J. A JA - Proceedings of the 9th International Conference on Interaction Design and Children ER - TY - JOUR T1 - iOpener Workbench: Tools for rapid understanding of scientific literature JF - Human-Computer Interaction Lab 27th Annual Symposium, University of Maryland, College Park, MD Y1 - 2010 A1 - Dunne,C. A1 - Shneiderman, Ben A1 - Dorr, Bonnie J A1 - Klavans,J. ER - TY - JOUR T1 - Isodiamond Hierarchies: An Efficient Multiresolution Representation for Isosurfaces and Interval Volumes JF - Visualization and Computer Graphics, IEEE Transactions on Y1 - 2010 A1 - Weiss,K. 
A1 - De Floriani, Leila KW - Image Processing, Computer-Assisted KW - Imaging, Three-Dimensional KW - Models, Theoretical KW - User-Computer Interface KW - data processing speed KW - edge bisection KW - encoding KW - interval volumes KW - isodiamond hierarchies KW - isosurfaces KW - mesh representation KW - minimal isodiamond hierarchy KW - multiresolution representation KW - multiresolution scalar field model KW - relevant isodiamond hierarchy KW - volume data AB - Efficient multiresolution representations for isosurfaces and interval volumes are becoming increasingly important as the gap between volume data sizes and processing speed continues to widen. Our multiresolution scalar field model is a hierarchy of tetrahedral clusters generated by longest edge bisection that we call a hierarchy of diamonds. We propose two multiresolution models for representing isosurfaces, or interval volumes, extracted from a hierarchy of diamonds which exploit its regular structure. These models are defined by subsets of diamonds in the hierarchy that we call isodiamonds, which are enhanced with geometric and topological information for encoding the relation between the isosurface, or interval volume, and the diamond itself. The first multiresolution model, called a relevant isodiamond hierarchy, encodes the isodiamonds intersected by the isosurface, or interval volume, as well as their nonintersected ancestors, while the second model, called a minimal isodiamond hierarchy, encodes only the intersected isodiamonds. Since both models operate directly on the extracted isosurface or interval volume, they require significantly less memory and support faster selective refinement queries than the original multiresolution scalar field, but do not support dynamic isovalue modifications. Moreover, since a minimal isodiamond hierarchy only encodes intersected isodiamonds, its extracted meshes require significantly less memory than those extracted from a relevant isodiamond hierarchy.
We demonstrate the compactness of isodiamond hierarchies by comparing them to an indexed representation of the mesh at full resolution. VL - 16 SN - 1077-2626 CP - 4 M3 - 10.1109/TVCG.2010.29 ER - TY - CONF T1 - Kernelized Rényi distance for speaker recognition T2 - Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on Y1 - 2010 A1 - Vasan Srinivasan, B. A1 - Duraiswami, Ramani A1 - Zotkin, Dmitry N KW - Rényi entropy KW - kernelized Rényi distance KW - information theoretic approach KW - input signals KW - reference signals KW - graphical processor KW - computer graphic equipment KW - speaker recognition KW - speaker identification KW - speaker verification AB - Speaker recognition systems classify a test signal as a speaker or an imposter by evaluating a matching score between input and reference signals. We propose a new information theoretic approach for computation of the matching score using the Rényi entropy. The proposed entropic distance, the Kernelized Rényi distance (KRD), is formulated in a non-parametric way and the resulting measure is efficiently evaluated in a parallelized fashion on a graphical processor. The distance is then adapted as a scoring function and its performance compared with other popular scoring approaches in a speaker identification and speaker verification framework. JA - Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on M3 - 10.1109/ICASSP.2010.5495587 ER - TY - JOUR T1 - Learning algorithms for link prediction based on chance constraints JF - Machine Learning and Knowledge Discovery in Databases Y1 - 2010 A1 - Doppa, J. A1 - Yu, J. A1 - Tadepalli, P. A1 - Getoor, Lise AB - In this paper, we consider the link prediction problem, where we are given a partial snapshot of a network at some time and the goal is to predict the additional links formed at a later time.
The accuracy of current prediction methods is quite low due to the extreme class skew and the large number of potential links. Here, we describe learning algorithms based on chance constrained programs and show that they exhibit all the properties needed for a good link predictor, namely, they allow preferential bias to positive or negative class; handle skewness in the data; and scale to large networks. Our experimental results on three real-world domains—co-authorship networks, biological networks and citation networks—show significant performance improvement over baseline algorithms. We conclude by briefly describing some promising future directions based on this work. M3 - 10.1007/978-3-642-15880-3_28 ER - TY - JOUR T1 - Learning what and how of contextual models for scene labeling JF - Computer Vision–ECCV 2010 Y1 - 2010 A1 - Jain, A. A1 - Gupta, A. A1 - Davis, Larry S. AB - We present a data-driven approach to predict the importance of edges and construct a Markov network for image analysis based on statistical models of global and local image features. We also address the coupled problem of predicting the feature weights associated with each edge of a Markov network for evaluation of context. Experimental results indicate that this scene-dependent structure construction model eliminates spurious edges and improves performance over fully-connected and neighborhood-connected Markov networks. ER - TY - CONF T1 - Lineage processing over correlated probabilistic databases T2 - Proceedings of the 2010 international conference on Management of data Y1 - 2010 A1 - Kanagal, Bhargav A1 - Deshpande, Amol KW - conjunctive queries KW - Indexing KW - junction trees KW - lineage KW - Probabilistic databases AB - In this paper, we address the problem of scalably evaluating conjunctive queries over correlated probabilistic databases containing tuple or attribute uncertainties.
Like previous work, we adopt a two-phase approach where we first compute lineages of the output tuples, and then compute the probabilities of the lineage formulas. However, unlike previous work, we allow for arbitrary and complex correlations to be present in the data, captured via a forest of junction trees. We observe that evaluating even read-once (tree structured) lineages (e.g., those generated by hierarchical conjunctive queries), which are polynomially computable over tuple-independent probabilistic databases, is #P-complete for lightly correlated probabilistic databases like Markov sequences. We characterize the complexity of exact computation of the probability of the lineage formula on a correlated database using a parameter called lwidth (analogous to the notion of treewidth). For lineages that result in low lwidth, we compute exact probabilities using a novel message passing algorithm, and for lineages that induce large lwidths, we develop approximate Monte Carlo algorithms to estimate the result probabilities. We scale our algorithms to very large correlated probabilistic databases using the previously proposed INDSEP data structure. To mitigate the complexity of lineage evaluation, we develop optimization techniques to process a batch of lineages by sharing computation across formulas, and to exploit any independence relationships that may exist in the data. Our experimental study illustrates the benefits of using our algorithms for processing lineage formulas over correlated probabilistic databases.
JA - Proceedings of the 2010 international conference on Management of data T3 - SIGMOD '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0032-2 UR - http://doi.acm.org/10.1145/1807167.1807241 M3 - 10.1145/1807167.1807241 ER - TY - JOUR T1 - Loudspeaker and Microphone Array Signal Processing-Plane-Wave Decomposition of Acoustical Scenes Via Spherical and Cylindrical Microphone Arrays JF - IEEE transactions on audio, speech, and language processing Y1 - 2010 A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani A1 - Gumerov, Nail A. VL - 20 CP - 1 ER - TY - CONF T1 - The Metacognitive Loop: An Architecture for Building Robust Intelligent Systems T2 - 2010 AAAI Fall Symposium Series Y1 - 2010 A1 - Shahri,Hamid Haidarian A1 - Dinalankara,Wikum A1 - Fults,Scott A1 - Wilson,Shomir A1 - Perlis, Don A1 - Schmill,Matt A1 - Oates,Tim A1 - Josyula,Darsana A1 - Anderson,Michael KW - commonsense KW - ontologies KW - robust intelligent systems AB - The Metacognitive Loop: An Architecture for Building Robust Intelligent Systems JA - 2010 AAAI Fall Symposium Series UR - http://www.aaai.org/ocs/index.php/FSS/FSS10/paper/view/2161 ER - TY - PAT T1 - Method for measurement of head related transfer functions Y1 - 2010 A1 - Duraiswami, Ramani A1 - Gumerov, Nail A. ED - University of Maryland AB - Head Related Transfer Functions (HRTFs) of an individual are measured in rapid fashion in an arrangement where a sound source is positioned in the individual's ear channel, while microphones are arranged in the microphone array enveloping the individual's head. The pressure waves generated by the sounds emanating from the sound source reach the microphones and are converted into corresponding electrical signals which are further processed in a processing system to extract HRTFs, which may then be used to synthesize a spatial audio scene. The acoustic field generated by the sounds from the sound source can be evaluated at any desired point inside or outside the microphone array. 
VL - 10/702,465 UR - http://www.google.com/patents?id=JHrQAAAAEBAJ CP - 7720229 ER - TY - CONF T1 - Mobile collaboration: collaboratively reading and creating children's stories on mobile devices T2 - Proceedings of the 9th International Conference on Interaction Design and Children Y1 - 2010 A1 - Fails, J. A A1 - Druin, Allison A1 - Guha, M.L. JA - Proceedings of the 9th International Conference on Interaction Design and Children ER - TY - CHAP T1 - Mobile Visual Aid Tools for Users with Visual Impairments T2 - Mobile Multimedia Processing Y1 - 2010 A1 - Liu, Xu A1 - David Doermann A1 - Huiping Li AB - In this chapter we describe “MobileEye”, a software suite which converts a camera enabled mobile device into a multi-function vision tool that can assist the visually impaired in their daily activities. MobileEye consists of four subsystems, each customized for a specific type of visual disability: a color channel mapper which can tell the visually impaired different colors; a software based magnifier which provides image magnification as well as enhancement; a pattern recognizer which can read currencies; and a document retriever which allows access to printed materials. We developed cutting edge computer vision and image processing technologies, and tackled the challenges of implementing them on mobile devices with limited computational resources and low image quality. The system minimizes keyboard operation for the usability of users with visual impairments. Currently the software suite runs on Symbian and Windows Mobile handsets. In this chapter we provide a high level overview of the system, and then discuss the pattern recognizer in detail. The challenge is how to build a real-time recognition system on mobile devices, and we present our detailed solutions.
JA - Mobile Multimedia Processing PB - LNCS 5960 ER - TY - JOUR T1 - A modality lexicon and its use in automatic tagging JF - Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC’10) Y1 - 2010 A1 - Baker, K. A1 - Bloodgood, M. A1 - Dorr, Bonnie J A1 - Filardo, N.W. A1 - Levin, L. A1 - Piatko, C. AB - This paper describes our resource-building results for an eight-week JHU Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically-Informed Machine Translation. Specifically, we describe the construction of a modality annotation scheme, a modality lexicon, and two automated modality taggers that were built using the lexicon and annotation scheme. Our annotation scheme is based on identifying three components of modality: a trigger, a target and a holder. We describe how our modality lexicon was produced semi-automatically, expanding from an initial hand-selected list of modality trigger words and phrases. The resulting expanded modality lexicon is being made publicly available. We demonstrate that one tagger—a structure-based tagger—results in precision around 86% (depending on genre) for tagging of a standard LDC data set. In a machine translation application, using the structure-based tagger to annotate English modalities on an English-Urdu training corpus improved the translation quality score for Urdu by 0.3 BLEU points in the face of sparse training data. ER - TY - CONF T1 - Modeling and generalization of discrete Morse terrain decompositions T2 - Proc. 20th Int. Conf. on Pattern Recognition, ICPR Y1 - 2010 A1 - De Floriani, Leila A1 - Magillo, P. A1 - Vitali, M. AB - We address the problem of morphological analysis of real terrains. We describe a morphological model for a terrain by considering extensions of Morse theory to the discrete case.
We propose a two-level model of the morphology of a terrain based on a graph joining the critical points of the terrain through integral lines. We present a new set of generalization operators specific for discrete piecewise linear terrain models, which are used to reduce noise and the size of the morphological representation. We show results of our approach on real terrains. JA - Proc. 20th Int. Conf. on Pattern Recognition, ICPR VL - 10 ER - TY - JOUR T1 - Multi-Camera Tracking with Adaptive Resource Allocation JF - International Journal of Computer Vision Y1 - 2010 A1 - Han,B. A1 - Joo, S.W. A1 - Davis, Larry S. ER - TY - CONF T1 - Multiresolution analysis of 3D images based on discrete distortion T2 - International Conference on Pattern Recognition (ICPR) Y1 - 2010 A1 - Weiss,K. A1 - De Floriani, Leila A1 - Mesmoudi,M. M. AB - We consider a model of a 3D image obtained by discretizing it into a multiresolution tetrahedral mesh known as a hierarchy of diamonds. This model enables us to extract crack-free approximations of the 3D image at any uniform or variable resolution, thus reducing the size of the data set without reducing the accuracy. A 3D intensity image is a scalar field (the intensity field) defined at the vertices of a 3D regular grid and thus the graph of the image is a hypersurface in R4. We measure the discrete distortion, a generalization of the notion of curvature, of the transformation which maps the tetrahedralized 3D grid onto its graph in R4. We evaluate the use of a hierarchy of diamonds to analyze properties of a 3D image, such as its discrete distortion, directly on lower resolution approximations. Our results indicate that distortion-guided extractions focus the resolution of approximated images on the salient features of the intensity image. 
JA - International Conference on Pattern Recognition (ICPR) ER - TY - CONF T1 - Multiresolution morse triangulations T2 - Proceedings of the 14th ACM Symposium on Solid and Physical Modeling Y1 - 2010 A1 - Danovaro,Emanuele A1 - De Floriani, Leila A1 - Magillo,Paola A1 - Vitali,Maria AB - We address the problem of representing the geometry and the morphology of a triangulated surface endowed with a scalar field in a combined geometric and topological multiresolution model. The model, called a Multiresolution Morse Triangulation (MMT), is composed of a multiresolution triangle mesh, and of a multiresolution Morse complex describing the morphology of the field. The MMT is built through a combined morphological and geometrical generalization, and supports queries to extract consistent geometrical and morphological representations of the field at both uniform and variable resolutions. JA - Proceedings of the 14th ACM Symposium on Solid and Physical Modeling T3 - SPM '10 PB - ACM CY - New York, NY, USA SN - 978-1-60558-984-8 UR - http://doi.acm.org/10.1145/1839778.1839806 M3 - 10.1145/1839778.1839806 ER - TY - CONF T1 - Multi-view clustering with constraint propagation for learning with an incomplete mapping between views T2 - Proceedings of the 19th ACM international conference on Information and knowledge management Y1 - 2010 A1 - Eaton,Eric A1 - desJardins, Marie A1 - Jacob,Sara KW - constrained clustering KW - multi-view learning KW - semi-supervised learning AB - Multi-view learning algorithms typically assume a complete bipartite mapping between the different views in order to exchange information during the learning process. However, many applications provide only a partial mapping between the views, creating a challenge for current methods. To address this problem, we propose a multi-view algorithm based on constrained clustering that can operate with an incomplete mapping. 
Given a set of pairwise constraints in each view, our approach propagates these constraints using a local similarity measure to those instances that can be mapped to the other views, allowing the propagated constraints to be transferred across views via the partial mapping. It uses co-EM to iteratively estimate the propagation within each view based on the current clustering model, transfer the constraints across views, and update the clustering model, thereby learning a unified model for all views. We show that this approach significantly improves clustering performance over several other methods for transferring constraints and allows multi-view clustering to be reliably applied when given a limited mapping between the views. JA - Proceedings of the 19th ACM international conference on Information and knowledge management T3 - CIKM '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0099-5 UR - http://doi.acm.org/10.1145/1871437.1871489 M3 - 10.1145/1871437.1871489 ER - TY - CONF T1 - Nested refinement domains for tetrahedral and diamond hierarchies T2 - IEEE Visualization Y1 - 2010 A1 - Weiss,K. A1 - De Floriani, Leila AB - Three nested refinement domains for hierarchies of tetrahedra and diamonds. The descendant domain (left) is the limit shape of the domain covered by all descendants of a given diamond (colored). Due to the fractal nature of these shapes, we introduce the more conservative convex descendant domain (middle) and bounding box descendant domain (right) to simplify the computation while still tightly covering the descendant domain. In each case, the refinement domain of one of the diamond’s parents, grandparents and great-grandparents is shown. JA - IEEE Visualization ER - TY - JOUR T1 - Non-visual exploration of geographic maps: Does sonification help? 
JF - Disability & Rehabilitation: Assistive Technology Y1 - 2010 A1 - Delogu,Franco A1 - Palmiero,Massimiliano A1 - Federici,Stefano A1 - Plaisant, Catherine A1 - Zhao,Haixia A1 - Belardinelli,Olivetti AB - Purpose. This study aims at evaluating the effectiveness of sonification as a means of providing access to geo-referenced information to users with visual impairments. Method. Thirty-five participants (10 congenitally blind, 10 with acquired blindness and 15 blindfolded sighted) completed four tasks of progressive difficulty. During each task, participants first explored a sonified map by using either a tablet or a keyboard to move across regions and listened to sounds giving information about the current location. Then the participants were asked to identify, among four tactile maps, the one that crossmodally corresponds to the sonified map they just explored. Finally, participants answered a self-report questionnaire of understanding and satisfaction. Results. Participants achieved high accuracy in all of the four tactile map discrimination tasks. No significant performance difference was found either between subjects who used the keyboard or the tablet, or among the three groups of blind and sighted participants. Differences between groups and interfaces were found in the usage strategies. High levels of satisfaction and understanding of the tools and tasks emerged from users' reports. VL - 5 SN - 1748-3107, 1748-3115 UR - http://informahealthcare.com/doi/abs/10.3109/17483100903100277 CP - 3 M3 - 10.3109/17483100903100277 ER - TY - CONF T1 - Obtaining valid safety data for software safety measurement and process improvement T2 - Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement Y1 - 2010 A1 - Basili, Victor R. 
A1 - Zelkowitz, Marvin V A1 - Layman,Lucas A1 - Dangle,Kathleen A1 - Diep,Madeline KW - case study KW - NASA KW - risk analysis KW - safety metrics AB - We report on a preliminary case study to examine software safety risk in the early design phase of the NASA Constellation spaceflight program. Our goal is to provide NASA quality assurance managers with information regarding the ongoing state of software safety across the program. We examined 154 hazard reports created during the preliminary design phase of three major flight hardware systems within the Constellation program. Our purpose was two-fold: 1) to quantify the relative importance of software with respect to system safety; and 2) to identify potential risks due to incorrect application of the safety process, deficiencies in the safety process, or the lack of a defined process. One early outcome of this work was to show that there are structural deficiencies in collecting valid safety data that make software safety different from hardware safety. In our conclusions we present some of these deficiencies. JA - Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement T3 - ESEM '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0039-1 UR - http://doi.acm.org/10.1145/1852786.1852846 M3 - 10.1145/1852786.1852846 ER - TY - JOUR T1 - Overlap-based cell tracker JF - The Journal of Research of the National Institute of Standards and Technology Y1 - 2010 A1 - Chalfoun, J A1 - Cardone, Antonio A1 - Dima, A.A. A1 - Allen, D.P. A1 - Halter, M.W. AB - In order to facilitate the extraction of quantitative data from live cell image sets, automated image analysis methods are needed. This paper presents an introduction to the general principle of an overlap cell tracking software developed by the National Institute of Standards and Technology (NIST). 
This cell tracker has the ability to track cells across a set of time-lapse images acquired at high rates based on the amount of overlap between cellular regions in consecutive frames. It is designed to be highly flexible, requires little user parameterization, and has a fast execution time. VL - 115 CP - 6 ER - TY - JOUR T1 - Partial least squares on graphical processor for efficient pattern recognition JF - Technical Reports of the Computer Science Department Y1 - 2010 A1 - Srinivasan,Balaji Vasan A1 - Schwartz,William Robson A1 - Duraiswami, Ramani A1 - Davis, Larry S. KW - Technical Report AB - Partial least squares (PLS) methods have recently been used for many pattern recognition problems in computer vision. Here, PLS is primarily used as a supervised dimensionality reduction tool to obtain effective feature combinations for better learning. However, application of PLS to large datasets is hindered by its high computational cost. We propose an approach to accelerate the classical PLS algorithm on graphical processors to obtain the same performance at a reduced cost. Although PLS modeling is practically an offline training process, accelerating it helps large-scale modeling. The proposed acceleration is shown to perform well, yielding up to a ~30X speedup. It is applied to standard datasets in human detection and face recognition. UR - http://drum.lib.umd.edu/handle/1903/10975 ER - TY - CONF T1 - Performance Evaluation Tools for Zone Segmentation and Classification (PETS) T2 - International Conference on Pattern Recognition Y1 - 2010 A1 - Seo,W. A1 - Agrawal,Mudit A1 - David Doermann AB - This paper overviews a set of Performance Evaluation ToolS (PETS) for zone segmentation and classification. The tools allow researchers and developers to evaluate, optimize and compare their algorithms by providing a variety of quantitative performance metrics. 
The evaluation of segmentation quality is based on the pixel-based overlaps between two sets of regions proposed by Randriamasy and Vincent. PETS extends the approach by providing a set of metrics for overlap analysis, RLE, and polygonal representations of regions, and introduces type-matching to evaluate zone classification. The software is available for research use. JA - International Conference on Pattern Recognition ER - TY - JOUR T1 - Perturbing the Ubiquitin Pathway Reveals How Mitosis Is Hijacked to Denucleate and Regulate Cell Proliferation and Differentiation In Vivo JF - PLoS ONE Y1 - 2010 A1 - Caceres,Andrea A1 - Shang,Fu A1 - Wawrousek,Eric A1 - Liu,Qing A1 - Avidan,Orna A1 - Cvekl,Ales A1 - Yang,Ying A1 - Haririnia,Aydin A1 - Storaska,Andrew A1 - Fushman, David A1 - Kuszak,Jer A1 - Dudek,Edward A1 - Smith,Donald A1 - Taylor,Allen AB - Background: The eye lens presents a unique opportunity to explore roles for specific molecules in cell proliferation, differentiation and development because cells remain in place throughout life and, like red blood cells and keratinocytes, they go through the most extreme differentiation, including removal of nuclei and cessation of protein synthesis. Ubiquitination controls many critical cellular processes, most of which require specific lysines on ubiquitin (Ub). Of the 7 lysines (K), least is known about the effects of modification of K6. Methodology and Principal Findings: We replaced K6 with tryptophan (W) because K6 is the most readily modified K and W is the most structurally similar residue to biotin. The backbone of K6W-Ub is indistinguishable from that of Wt-Ub. K6W-Ub is effectively conjugated and deconjugated but the conjugates are not degraded via the ubiquitin proteasome pathways (UPP). 
Expression of K6W-ubiquitin in the lens and lens cells results in accumulation of intracellular aggregates and also slows cell proliferation and the differentiation program, including expression of lens-specific proteins, differentiation of epithelial cells into fibers, achieving proper fiber cell morphology, and removal of nuclei. The latter is critical for transparency, but the mechanism by which cell nuclei are removed has remained an age-old enigma. This was also solved by expressing K6W-Ub. p27kip, a UPP substrate, accumulates in lenses which express K6W-Ub. This precludes phosphorylation of nuclear lamin by the mitotic kinase, a prerequisite for disassembly of the nuclear membrane. Thus the nucleus remains intact and DNAseIIβ neither gains entry to the nucleus nor degrades the DNA. These results could not be obtained using chemical proteasome inhibitors that cannot be directed to specific tissues. Conclusions and Significance: K6W-Ub provides a novel, genetic means to study functions of the UPP because it can be targeted to specific cells and tissues. A fully functional UPP is required to execute most stages of lens differentiation, specifically removal of cell nuclei. In the absence of a functional UPP, small, aggregate-prone, cataractous lenses are formed. VL - 5 UR - http://dx.doi.org/10.1371/journal.pone.0013331 CP - 10 M3 - 10.1371/journal.pone.0013331 ER - TY - JOUR T1 - Plane-Wave Decomposition of Acoustical Scenes Via Spherical and Cylindrical Microphone Arrays JF - IEEE Transactions on Audio, Speech, and Language Processing Y1 - 2010 A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani A1 - Gumerov, Nail A. 
KW - Acoustic fields KW - acoustic position measurement KW - acoustic signal processing KW - acoustic waves KW - acoustical scene analysis KW - array signal processing KW - circular arrays KW - cylindrical microphone arrays KW - direction-independent acoustic behavior KW - microphone arrays KW - orthogonal basis functions KW - plane-wave decomposition KW - Position measurement KW - signal reconstruction KW - sound field reconstruction KW - sound field representation KW - source localization KW - spatial audio playback KW - spherical harmonics based beamforming algorithm KW - spherical microphone arrays AB - Spherical and cylindrical microphone arrays offer a number of attractive properties such as direction-independent acoustic behavior and ability to reconstruct the sound field in the vicinity of the array. Beamforming and scene analysis for such arrays is typically done using sound field representation in terms of orthogonal basis functions (spherical/cylindrical harmonics). In this paper, an alternative sound field representation in terms of plane waves is described, and a method for estimating it directly from measurements at microphones is proposed. It is shown that representing a field as a collection of plane waves arriving from various directions simplifies source localization, beamforming, and spatial audio playback. A comparison of the new method with the well-known spherical harmonics based beamforming algorithm is done, and it is shown that both algorithms can be expressed in the same framework but with weights computed differently. It is also shown that the proposed method can be extended to cylindrical arrays. A number of features important for the design and operation of spherical microphone arrays in real applications are revealed. Results indicate that it is possible to reconstruct the sound scene up to order p with a spherical array of p^2 microphones. 
VL - 18 SN - 1558-7916 CP - 1 M3 - 10.1109/TASL.2009.2022000 ER - TY - PAT T1 - Plasmonic Systems and Devices Utilizing Surface Plasmon Polariton Y1 - 2010 A1 - Smolyaninov,Igor I. A1 - Vishkin, Uzi A1 - Davis,Christopher C. AB - Plasmonic systems and devices that utilize surface plasmon polaritons (or “plasmons”) for inter-chip and/or intra-chip communications are provided. A plasmonic system includes a microchip that has an integrated circuit module and a plasmonic device configured to interface with the integrated circuit module. The plasmonic device includes a first electrode, a second electrode positioned at a non-contact distance from the first electrode, and a tunneling-junction configured to create a plasmon when a potential difference is created between the first electrode and the second electrode. VL - 12/697,595 UR - http://www.google.com/patents?id=2VnRAAAAEBAJ ER - TY - JOUR T1 - The pre‐seventh pandemic Vibrio cholerae BX 330286 El Tor genome: evidence for the environment as a genome reservoir JF - Environmental Microbiology Reports Y1 - 2010 A1 - Haley,Bradd J. A1 - Grim,Christopher J. A1 - Hasan,Nur A. A1 - Taviani,Elisa A1 - Jongsik Chun A1 - Brettin,Thomas S. A1 - Bruce,David C. A1 - Challacombe,Jean F. A1 - Detter,J. Chris A1 - Han,Cliff S. A1 - Huq,Anwar A1 - Nair,G. Balakrish A1 - Rita R Colwell AB - Vibrio cholerae O1 El Tor BX 330286 was isolated from a water sample in Australia in 1986, 9 years after an indigenous outbreak of cholera occurred in that region. This environmental strain encodes virulence factors highly similar to those of clinical strains, suggesting an ability to cause disease in humans. We demonstrate its high similarity in gene content and genome-wide nucleotide sequence to clinical V. cholerae strains, notably to pre-seventh pandemic O1 El Tor strains isolated in 1910 (V. cholerae NCTC 8457) and 1937 (V. cholerae MAK 757), as well as seventh pandemic strains isolated after 1960 globally. 
Here we demonstrate that this strain represents a transitory clone with shared characteristics between pre-seventh and seventh pandemic strains of V. cholerae. Interestingly, this strain was isolated 25 years after the beginning of the seventh pandemic, suggesting the environment as a genome reservoir in areas where cholera does not occur in sporadic, endemic or epidemic form. VL - 2 SN - 1758-2229 UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1758-2229.2010.00141.x/abstract?userIsAuthenticated=false&deniedAccessCustomisedMessage= CP - 1 M3 - 10.1111/j.1758-2229.2010.00141.x ER - TY - CONF T1 - Putting the user in the loop: interactive Maximal Marginal Relevance for query-focused summarization T2 - Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics Y1 - 2010 A1 - Jimmy Lin A1 - Madnani,Nitin A1 - Dorr, Bonnie J AB - This work represents an initial attempt to move beyond "single-shot" summarization to interactive summarization. We present an extension to the classic Maximal Marginal Relevance (MMR) algorithm that places a user "in the loop" to assist in candidate selection. Experiments in the complex interactive Question Answering (ciQA) task at TREC 2007 show that interactively-constructed responses are significantly higher in quality than automatically-generated ones. This novel algorithm provides a starting point for future work on interactive summarization. 
JA - Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics T3 - HLT '10 PB - Association for Computational Linguistics CY - Stroudsburg, PA, USA SN - 1-932432-65-5 UR - http://dl.acm.org/citation.cfm?id=1857999.1858040 ER - TY - JOUR T1 - Random sampling for estimating the performance of fast summations JF - Technical Reports of the Computer Science Department Y1 - 2010 A1 - Srinivasan,Balaji Vasan A1 - Duraiswami, Ramani KW - Technical Report AB - Summation of functions of N source points evaluated at M target points occurs commonly in many applications. To scale these approaches for large datasets, many fast algorithms have been proposed. In this technical report, we propose an efficient Chernoff-bound-based approach to test the performance of fast summation algorithms, providing probabilistic accuracy guarantees. We further validate and use our approach in separate comparisons. UR - http://drum.lib.umd.edu/handle/1903/10976 ER - TY - JOUR T1 - Ranking continuous probabilistic datasets JF - Proc. VLDB Endow. Y1 - 2010 A1 - Li,Jian A1 - Deshpande, Amol AB - Ranking is a fundamental operation in data analysis and decision support, and plays an even more crucial role if the dataset being explored exhibits uncertainty. This has led to much work in understanding how to rank uncertain datasets in recent years. In this paper, we address the problem of ranking when the tuple scores are uncertain, and the uncertainty is captured using continuous probability distributions (e.g. Gaussian distributions). We present a comprehensive solution to compute the values of a parameterized ranking function (PRF) [18] for arbitrary continuous probability distributions (and thus rank the uncertain dataset); PRF can be used to simulate or approximate many other ranking functions proposed in prior work. 
We develop exact polynomial-time algorithms for some continuous probability distribution classes, and efficient approximation schemes with provable guarantees for arbitrary probability distributions. Our algorithms can also be used for exact or approximate evaluation of k-nearest neighbor queries over uncertain objects, whose positions are modeled using continuous probability distributions. Our experimental evaluation over several datasets illustrates the effectiveness of our approach at efficiently ranking uncertain datasets with continuous attribute uncertainty. VL - 3 SN - 2150-8097 UR - http://dl.acm.org/citation.cfm?id=1920841.1920923 CP - 1-2 ER - TY - JOUR T1 - Read-once functions and query evaluation in probabilistic databases JF - Proc. VLDB Endow. Y1 - 2010 A1 - Sen,Prithviraj A1 - Deshpande, Amol A1 - Getoor, Lise AB - Probabilistic databases hold the promise of being a viable means for large-scale uncertainty management, increasingly needed in a number of real-world application domains. However, query evaluation in probabilistic databases remains a computational challenge. Prior work on efficient exact query evaluation in probabilistic databases has largely concentrated on query-centric formulations (e.g., safe plans, hierarchical queries), in that they consider only characteristics of the query and not the data in the database. It is easy to construct examples where a supposedly hard query run on an appropriate database gives rise to a tractable query evaluation problem. In this paper, we develop efficient query evaluation techniques that leverage characteristics of both the query and the data in the database. We focus on tuple-independent databases where the query evaluation problem is equivalent to computing marginal probabilities of Boolean formulas associated with the result tuples. This latter task is easy if the Boolean formulas can be factorized into a form that has every variable appearing at most once (called read-once). 
However, a naive approach that directly uses previously developed Boolean formula factorization algorithms is inefficient, because those algorithms require the input formulas to be in the disjunctive normal form (DNF). We instead develop novel, more efficient factorization algorithms that directly construct the read-once expression for a result tuple Boolean formula (if one exists), for a large subclass of queries (specifically, conjunctive queries without self-joins). We empirically demonstrate that (1) our proposed techniques are orders of magnitude faster than generic inference algorithms for queries where the result Boolean formulas can be factorized into read-once expressions, and (2) for the special case of hierarchical queries, they rival the efficiency of prior techniques specifically designed to handle such queries. VL - 3 SN - 2150-8097 UR - http://dl.acm.org/citation.cfm?id=1920841.1920975 CP - 1-2 ER - TY - JOUR T1 - A robust and scalable approach to face identification JF - Computer Vision–ECCV 2010 Y1 - 2010 A1 - Schwartz,W. A1 - Guo,H. A1 - Davis, Larry S. AB - The problem of face identification has received significant attention over the years. For a given probe face, the goal of face identification is to match this unknown face against a gallery of known people. Due to the availability of large amounts of data acquired in a variety of conditions, techniques that are both robust to uncontrolled acquisition conditions and scalable to large gallery sizes, which may need to be incrementally built, are challenges. In this work we tackle both problems. Initially, we propose a novel approach to robust face identification based on Partial Least Squares (PLS) to perform multi-channel feature weighting. Then, we extend the method to a tree-based discriminative structure aiming at reducing the time required to evaluate novel probe samples. The method is evaluated through experiments on FERET and FRGC datasets. 
In most of the comparisons our method outperforms state-of-the-art face identification techniques. Furthermore, our method scales to large datasets. ER - TY - JOUR T1 - Shape analysis and its applications in image understanding JF - IEEE transactions on pattern analysis and machine intelligence Y1 - 2010 A1 - Zhe Lin A1 - Davis, Larry S. AB - We propose a shape-based, hierarchical part-template matching approach to simultaneous human detection and segmentation combining local part-based and global shape-template-based schemes. The approach relies on the key idea of matching a part-template tree to images hierarchically to detect humans and estimate their poses. For learning a generic human detector, a pose-adaptive feature computation scheme is developed based on a tree matching approach. Instead of traditional concatenation-style image location-based feature encoding, we extract features adaptively in the context of human poses and train a kernel-SVM classifier to separate human/nonhuman patterns. Specifically, the features are collected in the local context of poses by tracing around the estimated shape boundaries. We also introduce an approach to multiple occluded human detection and segmentation based on an iterative occlusion compensation scheme. The output of our learned generic human detector can be used as an initial set of human hypotheses for the iterative optimization. We evaluate our approaches on three public pedestrian data sets (INRIA, MIT-CBCL, and USC-B) and two crowded sequences from Caviar Benchmark and Munich Airport data sets. VL - 32 CP - 4 ER - TY - JOUR T1 - Shape-based human detection and segmentation via hierarchical part-template matching JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2010 A1 - Lin,Z. A1 - Davis, Larry S. 
AB - We propose a shape-based, hierarchical part-template matching approach to simultaneous human detection and segmentation combining local part-based and global shape-template-based schemes. The approach relies on the key idea of matching a part-template tree to images hierarchically to detect humans and estimate their poses. For learning a generic human detector, a pose-adaptive feature computation scheme is developed based on a tree matching approach. Instead of traditional concatenation-style image location-based feature encoding, we extract features adaptively in the context of human poses and train a kernel-SVM classifier to separate human/nonhuman patterns. Specifically, the features are collected in the local context of poses by tracing around the estimated shape boundaries. We also introduce an approach to multiple occluded human detection and segmentation based on an iterative occlusion compensation scheme. The output of our learned generic human detector can be used as an initial set of human hypotheses for the iterative optimization. We evaluate our approaches on three public pedestrian data sets (INRIA, MIT-CBCL, and USC-B) and two crowded sequences from Caviar Benchmark and Munich Airport data sets. VL - 32 CP - 4 ER - TY - JOUR T1 - Sharing-aware horizontal partitioning for exploiting correlations during query processing JF - Proc. VLDB Endow. Y1 - 2010 A1 - Tzoumas,Kostas A1 - Deshpande, Amol A1 - Jensen,Christian S. AB - Optimization of join queries based on average selectivities is suboptimal in highly correlated databases. In such databases, relations are naturally divided into partitions, each partition having substantially different statistical characteristics. It is very compelling to discover such data partitions during query optimization and create multiple plans for a given query, one plan being optimal for a particular combination of data partitions. 
This scenario calls for the sharing of state among plans, so that common intermediate results are not recomputed. We study this problem in a setting with a routing-based query execution engine based on eddies [1]. Eddies naturally encapsulate horizontal partitioning and maximal state sharing across multiple plans. We define the notion of a conditional join plan, a novel representation of the search space that enables us to address the problem in a principled way. We present a low-overhead greedy algorithm that uses statistical summaries based on graphical models. Experimental results suggest an order of magnitude faster execution time over traditional optimization for high correlations, while maintaining the same performance for low correlations. VL - 3 SN - 2150-8097 UR - http://dl.acm.org/citation.cfm?id=1920841.1920911 CP - 1-2 ER - TY - JOUR T1 - Signal Processing for Audio HCI JF - Handbook of Signal Processing Systems Y1 - 2010 A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani AB - This chapter reviews recent advances in computer audio processing from the viewpoint of improving the human-computer interface. Microphone arrays are described as basic tools for untethered audio acquisition, and principles for the synthesis of realistic virtual audio are outlined. The influence of room acoustics on audio acquisition and production is also considered. The chapter finishes with a review of several relevant signal processing systems, including a fast head-related transfer function (HRTF) measurement system and a complete system for capture, visualization, and reproduction of auditory scenes. 
ER - TY - CONF T1 - Spatial indexing on tetrahedral meshes T2 - Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems Y1 - 2010 A1 - De Floriani, Leila A1 - Fellegara,Riccardo A1 - Magillo,Paola KW - kd-trees KW - octrees KW - spatial indexes KW - tetrahedral meshes AB - We address the problem of performing spatial queries on tetrahedral meshes. The latter arise in several application domains, including 3D GIS, scientific visualization, and finite element analysis. We have defined and implemented a family of spatial indexes that we call tetrahedral trees. Tetrahedral trees subdivide a cubic domain containing the mesh in an octree or 3D kd-tree fashion, with three different subdivision criteria. Here, we present and compare such indexes, their memory usage, and spatial queries on them. JA - Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems T3 - GIS '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0428-3 UR - http://doi.acm.org/10.1145/1869790.1869873 M3 - 10.1145/1869790.1869873 ER - TY - JOUR T1 - Special Section on Shape Analysis and Its Applications in Image Understanding JF - IEEE Transactions on Pattern Analysis and Machine Intelligence Y1 - 2010 A1 - Srivastava, A. A1 - Damon,J.N. A1 - Dryden,I.L. A1 - Jermyn,I.H. A1 - Das,S. A1 - Vaswani, N. A1 - Huckemann,S. A1 - Hotz,T. A1 - Munk,A. A1 - Lin,Z. A1 - others VL - 32 CP - 4 ER - TY - PAT T1 - System and method for analysis of an opinion expressed in documents with regard to a particular topic Y1 - 2010 A1 - V.S. Subrahmanian A1 - Picariello,Antonio A1 - Dorr, Bonnie J A1 - Reforgiato,Diego Recupero A1 - Cesarano,Carmine A1 - Sagoff,Amelia AB - System and method for analysis of an opinion expressed in documents on a particular topic computes opinion strength on a continuous numeric scale, or qualitatively. 
A variety of opinion scoring techniques are plugged in to score opinion-expressing words and sentences in documents. These scores are aggregated to measure the opinion intensity of documents. Multilingual opinion analysis is supported by the capability to concurrently identify and visualize the opinion intensity expressed in documents in multiple languages. A multi-dimensional representation of the measured opinion intensity is generated which is compatible with the multilingual domain. VL - 11/808,278 UR - http://www.google.com/patents?id=j9fLAAAAEBAJ ER - TY - CONF T1 - To Upgrade or Not to Upgrade: Impact of Online Upgrades Across Multiple Administrative Domains T2 - OOPSLA'10 Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications Y1 - 2010 A1 - Tudor Dumitras A1 - Narasimhan, Priya A1 - Tilevich, Eli KW - mixed-version race KW - multiple administrative domains KW - online upgrade KW - risk assessment AB - Online software upgrades are often plagued by runtime behaviors that are poorly understood and difficult to ascertain. For example, the interactions among multiple versions of the software expose the system to race conditions that can introduce latent errors or data corruption. Moreover, industry trends suggest that online upgrades are currently needed in large-scale enterprise systems, which often span multiple administrative domains (e.g., Web 2.0 applications that rely on AJAX client-side code or systems that lease cloud-computing resources). In such systems, the enterprise does not control all the tiers of the system and cannot coordinate the upgrade process, making existing techniques inadequate to prevent mixed-version races. In this paper, we present an analytical framework for impact assessment, which allows system administrators to directly compare the risk of following an online-upgrade plan with the risk of delaying or canceling the upgrade.
We also describe an executable model that implements our formal impact assessment and enables a systematic approach for deciding whether an online upgrade is appropriate. Our model provides a method of last resort for avoiding undesirable program behaviors, in situations where mixed-version races cannot be avoided through other technical means. JA - OOPSLA'10 Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications T3 - OOPSLA '10 PB - ACM SN - 978-1-4503-0203-6 UR - http://doi.acm.org/10.1145/1869459.1869530 ER - TY - CONF T1 - Toque: designing a cooking-based programming language for and with children T2 - Proceedings of the 28th international conference on Human factors in computing systems Y1 - 2010 A1 - Tarkan,S. A1 - Sazawal,V. A1 - Druin, Allison A1 - Golub,E. A1 - Bonsignore,E. M A1 - Walsh,G. A1 - Atrash,Z. JA - Proceedings of the 28th international conference on Human factors in computing systems ER - TY - JOUR T1 - A tree-based approach to integrated action localization, recognition and segmentation JF - ECCV Workshops Y1 - 2010 A1 - Zhuolin Jiang A1 - Lin,Z. A1 - Davis, Larry S. AB - A tree-based approach to integrated action segmentation,localization and recognition is proposed. An action is represented as a sequence of joint hog-flow descriptors extracted independently from each frame. During training, a set of action prototypes is first learned based on a k-means clustering, and then a binary tree model is constructed from the set of action prototypes based on hierarchical k-means cluster- ing. Each tree node is characterized by a shape-motion descriptor and a rejection threshold, and an action segmentation mask is defined for leaf nodes (corresponding to a prototype). During testing, an action is local- ized by mapping each test frame to a nearest neighbor prototype using a fast matching method to search the learned tree, followed by global fil- tering refinement. 
An action is recognized by maximizing the sum of the joint probabilities of the action category and action prototype over test frames. Our approach does not explicitly rely on human tracking and background subtraction, and enables action localization and recognition in realistic and challenging conditions (such as crowded backgrounds). Experimental results show that our approach can achieve recognition rates of 100% on the CMU action dataset and 100% on the Weizmann dataset. ER - TY - CONF T1 - Using a Trust Model in Decision Making for Supply Chain Management T2 - Workshops at the Twenty-Fourth AAAI Conference on Artificial Intelligence Y1 - 2010 A1 - Haghpanah,Y. A1 - desJardins, Marie A1 - others JA - Workshops at the Twenty-Fourth AAAI Conference on Artificial Intelligence ER - TY - JOUR T1 - Utilizing Hierarchical Multiprocessing for Medical Image Registration JF - IEEE Signal Processing Magazine Y1 - 2010 A1 - Plishker,W. A1 - Dandekar,O. A1 - Bhattacharyya, Shuvra S. A1 - Shekhar,R. KW - Acceleration KW - application parallelism KW - Biomedical imaging KW - domain-specific taxonomy KW - GPU acceleration KW - gradient descent approach KW - Graphics processing unit KW - hierarchical multiprocessing KW - image registration KW - Magnetic resonance imaging KW - Medical diagnostic imaging KW - medical image processing KW - medical image registration KW - multicore platform set KW - Multicore processing KW - PARALLEL PROCESSING KW - parallel programming KW - Robustness KW - Signal processing algorithms KW - Ultrasonic imaging AB - This work discusses an approach to utilize hierarchical multiprocessing in the context of medical image registration. 
By first organizing application parallelism into a domain-specific taxonomy, an algorithm is structured to target a set of multicore platforms. The approach is demonstrated on a cluster of graphics processing units (GPUs), requiring the use of two parallel programming environments to achieve fast execution times. There is negligible loss in accuracy for rigid registration when employing GPU acceleration, but it does adversely affect our nonrigid registration implementation due to our use of a gradient descent approach. VL - 27 SN - 1053-5888 CP - 2 ER - TY - CONF T1 - Web-scale computer vision using MapReduce for multimedia data mining T2 - Proceedings of the Tenth International Workshop on Multimedia Data Mining Y1 - 2010 A1 - White,Brandyn A1 - Tom Yeh A1 - Jimmy Lin A1 - Davis, Larry S. KW - background subtraction KW - bag-of-features KW - Cloud computing KW - clustering KW - Computer vision KW - image registration KW - MapReduce AB - This work explores computer vision applications of the MapReduce framework that are relevant to the data mining community. An overview of MapReduce and common design patterns are provided for those with limited MapReduce background. We discuss both the high level theory and the low level implementation for several computer vision algorithms: classifier training, sliding windows, clustering, bag-of-features, background subtraction, and image registration. Experimental results for the k-means clustering and single Gaussian background subtraction algorithms are performed on a 410 node Hadoop cluster. JA - Proceedings of the Tenth International Workshop on Multimedia Data Mining T3 - MDMKDD '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0220-3 UR - http://doi.acm.org/10.1145/1814245.1814254 M3 - 10.1145/1814245.1814254 ER - TY - RPRT T1 - Which Factors Affect Access Network Performance? Y1 - 2010 A1 - Sundaresan,S. A1 - Feamster, Nick A1 - Dicioccio,L. A1 - Teixeira,R.
AB - This paper presents an analysis of the performance of residential access networks using over four months of round-trip, download, and upload measurements from more than 7,000 users across four ADSL and cable providers in France. Previous studies have characterized residential access network performance, but this paper presents the first study of how access network performance relates to other factors, such as choice of access provider, service-level agreement, and geographic location. We first explore the extent to which user performance matches the capacity advertised by an access provider, and whether the ability to achieve this capacity depends on the user’s access network. We then analyze the extent to which various factors influence the performance that users experience. Finally, we explore how different groups of users experience simultaneous performance anomalies and analyze the common characteristics of users that share fate (e.g., whether users that experience simultaneous performance degradation share the same provider, city). Our analysis informs both users and designers of networked services who wish to improve the reliability and performance of access networks through multihoming and may also assist operators with troubleshooting network issues by narrowing down likely causes. PB - Georgia Institute of Technology VL - GT-CS-10-04 UR - http://hdl.handle.net/1853/37336 ER - TY - CHAP T1 - Why Did the Person Cross the Road (There)? Scene Understanding Using Probabilistic Logic Models and Common Sense Reasoning T2 - Computer Vision – ECCV 2010 Y1 - 2010 A1 - Kembhavi,Aniruddha A1 - Tom Yeh A1 - Davis, Larry S. ED - Daniilidis,Kostas ED - Maragos,Petros ED - Paragios,Nikos AB - We develop a video understanding system for scene elements, such as bus stops, crosswalks, and intersections, that are characterized more by qualitative activities and geometry than by intrinsic appearance. 
The domain models for scene elements are not learned from a corpus of video, but instead, naturally elicited by humans, and represented as probabilistic logic rules within a Markov Logic Network framework. Human elicited models, however, represent object interactions as they occur in the 3D world rather than describing their appearance projection in some specific 2D image plane. We bridge this gap by recovering qualitative scene geometry to analyze object interactions in the 3D world and then reasoning about scene geometry, occlusions and common sense domain knowledge using a set of meta-rules. The effectiveness of this approach is demonstrated on a set of videos of public spaces. JA - Computer Vision – ECCV 2010 T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 6312 SN - 978-3-642-15551-2 UR - http://dx.doi.org/10.1007/978-3-642-15552-9_50 ER - TY - CONF T1 - Workshop on experimental evaluation of software and systems in computer science (Evaluate 2010) T2 - Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion Y1 - 2010 A1 - Blackburn,Steven M. A1 - Diwan,Amer A1 - Hauswirth,Matthias A1 - Memon, Atif M. A1 - Sweeney,Peter F. KW - Evaluation KW - METHODOLOGY AB - We call ourselves 'computer scientists', but are we scientists? If we are scientists, then we must practice the scientific method. This includes a solid experimental evaluation. In our experience, our experimental methodology is ad hoc at best, and nonexistent at worst. 
This workshop brings together experts from different areas of computer science to discuss, explore, and attempt to identify the principles of sound experimental evaluation JA - Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion T3 - SPLASH '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0240-1 UR - http://doi.acm.org/10.1145/1869542.1869618 M3 - 10.1145/1869542.1869618 ER - TY - CONF T1 - Action recognition based on human movement characteristics T2 - Motion and Video Computing, 2009. WMVC '09. Workshop on Y1 - 2009 A1 - Dondera,R. A1 - David Doermann A1 - Davis, Larry S. KW - action KW - ballistic KW - characteristics;motion KW - correlated KW - cost;human KW - data;probability KW - databases; KW - density KW - descriptor;motion KW - dynamics;computational KW - function;robustness;shape KW - information;short KW - linear KW - Movement KW - movements;computer KW - recognition;human KW - recognition;stability;visual KW - vector KW - vision;pattern AB - We present a motion descriptor for human action recognition where appearance and shape information are unreliable. Unlike other motion-based approaches, we leverage image characteristics specific to human movement to achieve better robustness and lower computational cost. Drawing on recent work on motion recognition with ballistic dynamics, an action is modeled as a series of short correlated linear movements and represented with a probability density function over motion vector data. We are targeting common human actions composed of ballistic movements, and our descriptor can handle both short actions (e.g. reaching with the hand) and long actions with events at relatively stable time offsets (e.g. walking). The proposed descriptor is used for both classification and detection of action instances, in a nearest-neighbor framework. 
We evaluate the descriptor on the KTH action database and obtain a recognition rate of 90% in a relevant test setting, comparable to the state-of-the-art approaches that use other cues in addition to motion. We also acquired a database of actions with slight occlusion and a human actor manipulating objects of various shapes and appearances. This database makes the use of appearance and shape information problematic, but we obtain a recognition rate of 95%. Our work demonstrates that human movement has distinctive patterns, and that these patterns can be used effectively for action recognition. JA - Motion and Video Computing, 2009. WMVC '09. Workshop on M3 - 10.1109/WMVC.2009.5399233 ER - TY - JOUR T1 - Algorithms for distributional and adversarial pipelined filter ordering problems JF - ACM Trans. Algorithms Y1 - 2009 A1 - Condon,Anne A1 - Deshpande, Amol A1 - Hellerstein,Lisa A1 - Wu,Ning KW - flow algorithms KW - Pipelined filter ordering KW - Query optimization KW - selection ordering AB - Pipelined filter ordering is a central problem in database query optimization. The problem is to determine the optimal order in which to apply a given set of commutative filters (predicates) to a set of elements (the tuples of a relation), so as to find, as efficiently as possible, the tuples that satisfy all of the filters. Optimization of pipelined filter ordering has recently received renewed attention in the context of environments such as the Web, continuous high-speed data streams, and sensor networks. Pipelined filter ordering problems are also studied in areas such as fault detection and machine learning under names such as learning with attribute costs, minimum-sum set cover, and satisficing search. 
We present algorithms for two natural extensions of the classical pipelined filter ordering problem: (1) a distributional-type problem where the filters run in parallel and the goal is to maximize throughput, and (2) an adversarial-type problem where the goal is to minimize the expected value of multiplicative regret. We present two related algorithms for solving (1), both running in time O(n^2), which improve on the O(n^3 log n) algorithm of Kodialam. We use techniques from our algorithms for (1) to obtain an algorithm for (2). VL - 5 SN - 1549-6325 UR - http://doi.acm.org/10.1145/1497290.1497300 CP - 2 M3 - 10.1145/1497290.1497300 ER - TY - CONF T1 - Analyzing (social media) networks with NodeXL T2 - Proceedings of the fourth international conference on Communities and technologies Y1 - 2009 A1 - Smith,Marc A. A1 - Shneiderman, Ben A1 - Milic-Frayling,Natasa A1 - Mendes Rodrigues,Eduarda A1 - Barash,Vladimir A1 - Dunne,Cody A1 - Capone,Tony A1 - Perer,Adam A1 - Gleave,Eric KW - excel KW - network analysis KW - social media KW - social network KW - spreadsheet KW - Visualization AB - We present NodeXL, an extendible toolkit for network overview, discovery and exploration implemented as an add-in to the Microsoft Excel 2007 spreadsheet software. We demonstrate NodeXL data analysis and visualization features with a social media data sample drawn from an enterprise intranet social network. A sequence of NodeXL operations from data import to computation of network statistics and refinement of network visualization through sorting, filtering, and clustering functions is described. These operations reveal sociologically relevant differences in the patterns of interconnection among employee participants in the social media space. The tool and method can be broadly applied.
JA - Proceedings of the fourth international conference on Communities and technologies T3 - C&T '09 PB - ACM CY - New York, NY, USA SN - 978-1-60558-713-4 UR - http://doi.acm.org/10.1145/1556460.1556497 M3 - 10.1145/1556460.1556497 ER - TY - CONF T1 - Assigning cameras to subjects in video surveillance systems T2 - Robotics and Automation, 2009. ICRA '09. IEEE International Conference on Y1 - 2009 A1 - El-Alfy,H. A1 - Jacobs, David W. A1 - Davis, Larry S. KW - agent KW - algorithm;multiple KW - assignment;computation KW - augmenting KW - cameras;video KW - cost KW - detection;video KW - graph;camera KW - MATCHING KW - matching;minimum KW - matching;target KW - path;bipartite KW - reduction;maximum KW - segment;video KW - Surveillance KW - surveillance; KW - system;graph KW - theory;image KW - TIME KW - tracking;obstacle KW - tracking;video KW - video AB - We consider the problem of tracking multiple agents moving amongst obstacles, using multiple cameras. Given an environment with obstacles, and many people moving through it, we construct a separate narrow field of view video for as many people as possible, by stitching together video segments from multiple cameras over time. We employ a novel approach to assign cameras to people as a function of time, with camera switches when needed. The problem is modeled as a bipartite graph and the solution corresponds to a maximum matching. As people move, the solution is efficiently updated by computing an augmenting path rather than by solving for a new matching. This reduces computation time by an order of magnitude. In addition, solving for the shortest augmenting path minimizes the number of camera switches at each update. When not all people can be covered by the available cameras, we cluster as many people as possible into small groups, then assign cameras to groups using a minimum cost matching algorithm. We test our method using numerous runs from different simulators. JA - Robotics and Automation, 2009. ICRA '09. 
IEEE International Conference on M3 - 10.1109/ROBOT.2009.5152753 ER - TY - JOUR T1 - Avid interactions underlie the Lys63-linked polyubiquitin binding specificities observed for UBA domains JF - Nature Structural & Molecular Biology Y1 - 2009 A1 - Sims,Joshua J. A1 - Haririnia,Aydin A1 - Dickinson,Bryan C. A1 - Fushman, David A1 - Cohen,Robert E. KW - apoptosis KW - basic cellular processes KW - Biochemistry KW - biophysics KW - cell biology KW - cell cycle KW - cell surface proteins KW - cell-cell interactions KW - checkpoints KW - chromatin KW - chromatin remodeling KW - chromatin structure KW - content KW - DNA recombination KW - DNA repair KW - DNA replication KW - Gene expression KW - Genetics KW - intracellular signaling KW - journal KW - macromolecules KW - mechanism KW - membrane processes KW - molecular KW - molecular basis of disease KW - molecular biology KW - molecular interactions KW - multi-component complexes KW - nature publishing group KW - nature structural molecular biology KW - nucleic acids KW - protein degradation KW - protein folding KW - protein processing KW - Proteins KW - regulation of transcription KW - regulation of translation KW - RNA KW - RNA processing KW - RNAi KW - signal transduction KW - single molecule studies KW - structure and function of proteins KW - transcription KW - translation AB - Ubiquitin (denoted Ub) receptor proteins as a group must contain a diverse set of binding specificities to distinguish the many forms of polyubiquitin (polyUb) signals. Previous studies suggested that the large class of ubiquitin-associated (UBA) domains contains members with intrinsic specificity for Lys63-linked polyUb or Lys48-linked polyUb, thus explaining how UBA-containing proteins can mediate diverse signaling events. 
Here we show that previously observed Lys63-polyUb selectivity in UBA domains is the result of an artifact in which the dimeric fusion partner, glutathione S-transferase (GST), positions two UBAs for higher affinity, avid interactions with Lys63-polyUb, but not with Lys48-polyUb. Freed from GST, these UBAs are either nonselective or prefer Lys48-polyUb. Accordingly, NMR experiments reveal no Lys63-polyUb–specific binding epitopes for these UBAs. We reexamine previous conclusions based on GST-UBAs and present an alternative model for how UBAs achieve a diverse range of linkage specificities. VL - 16 SN - 1545-9993 UR - http://www.nature.com/nsmb/journal/v16/n8/abs/nsmb.1637.html CP - 8 M3 - 10.1038/nsmb.1637 ER - TY - CONF T1 - Bayesian multitask learning with latent hierarchies T2 - Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence Y1 - 2009 A1 - Daumé, Hal JA - Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence ER - TY - CONF T1 - Bisimulation-based approximate lifted inference T2 - Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence Y1 - 2009 A1 - Sen,Prithviraj A1 - Deshpande, Amol A1 - Getoor, Lise AB - There has been a great deal of recent interest in methods for performing lifted inference; however, most of this work assumes that the first-order model is given as input to the system. Here, we describe lifted inference algorithms that determine symmetries and automatically lift the probabilistic model to speedup inference. In particular, we describe approximate lifted inference techniques that allow the user to trade off inference accuracy for computational efficiency by using a handful of tunable parameters, while keeping the error bounded. Our algorithms are closely related to the graph-theoretic concept of bisimulation. 
We report experiments on both synthetic and real data to show that in the presence of symmetries, run-times for inference can be improved significantly, with approximate lifted inference providing orders of magnitude speedup over ground inference. JA - Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence T3 - UAI '09 PB - AUAI Press CY - Arlington, Virginia, United States SN - 978-0-9749039-5-8 UR - http://dl.acm.org/citation.cfm?id=1795114.1795172 ER - TY - JOUR T1 - A broadband fast multipole accelerated boundary element method for the three dimensional Helmholtz equation JF - The Journal of the Acoustical Society of America Y1 - 2009 A1 - Gumerov, Nail A. A1 - Duraiswami, Ramani KW - acoustic wave scattering KW - boundary-elements methods KW - boundary-value problems KW - Helmholtz equations KW - iterative methods AB - The development of a fast multipole method (FMM) accelerated iterative solution of the boundary element method (BEM) for the Helmholtz equations in three dimensions is described. The FMM for the Helmholtz equation is significantly different for problems with low and high kD (where k is the wavenumber and D the domain size), and for large problems the method must be switched between levels of the hierarchy. The BEM requires several approximate computations (numerical quadrature, approximations of the boundary shapes using elements), and these errors must be balanced against approximations introduced by the FMM and the convergence criterion for iterative solution. These different errors must all be chosen in a way that, on the one hand, excess work is not done and, on the other, that the error achieved by the overall computation is acceptable. Details of translation operators for low and high kD, choice of representations, and BEM quadrature schemes, all consistent with these approximations, are described. A novel preconditioner using a low accuracy FMM accelerated solver as a right preconditioner is also described. 
Results of the developed solvers for large boundary value problems with 0.0001≲kD≲500 are presented and shown to perform close to theoretical expectations. VL - 125 UR - http://link.aip.org/link/?JAS/125/191/1 CP - 1 M3 - 10.1121/1.3021297 ER - TY - JOUR T1 - Call for Papers: Special Issue of the Journal of Parallel and Distributed Computing: Cloud Computing JF - J. Parallel Distrib. Comput. Y1 - 2009 A1 - Chockler,Gregory A1 - Dekel,Eliezer A1 - JaJa, Joseph F. A1 - Jimmy Lin VL - 69 SN - 0743-7315 UR - http://dx.doi.org/10.1016/j.jpdc.2009.07.002 CP - 9 M3 - 10.1016/j.jpdc.2009.07.002 ER - TY - JOUR T1 - Chance-constrained programs for link prediction JF - Proceedings of Workshop on Analyzing Networks and Learning with Graphs at NIPS Conference Y1 - 2009 A1 - Doppa,J.R. A1 - Yu,J. A1 - Tadepalli,P. A1 - Getoor, Lise AB - In this paper, we consider the link prediction problem, where we are given a par-tial snapshot of a network at some time and the goal is to predict additional links at a later time. The accuracy of the current prediction methods is quite low due to the extreme class skew and the large number of potential links. In this paper, we describe learning algorithms based on chance constrained programs and show that they exhibit all the properties needed for a good link predictor, namely, al- low preferential bias to positive or negative class; handle skewness in the data; and scale to large networks. Our experimental results on three real-world co- authorship networks show significant improvement in prediction accuracy over baseline algorithms. ER - TY - CONF T1 - Classification of non-manifold singularities from transformations of 2-manifolds T2 - Shape Modeling and Applications, 2009. SMI 2009. IEEE International Conference on Y1 - 2009 A1 - Leon,J.-C. A1 - De Floriani, Leila A1 - Hetroy,F. 
KW - classification;search KW - Computer KW - graphic;continuous KW - graphics;pattern KW - model;nonmanifold KW - problems;topology; KW - property;computer KW - SHAPE KW - singularity;topological KW - transformation;nonmanifold AB - Non-manifold models are frequently encountered in engineering simulations and design as well as in computer graphics. However, these models lack shape characterization for modelling and searching purposes. Topological properties act as a kernel for deriving key features of objects. Here we propose a classification for the non-manifold singularities of non-manifold objects through continuous shape transformations of 2-manifolds without boundary up to the creation of non-manifold singularities. As a result, the non-manifold objects thus created can be categorized and contribute to the definition of a general purpose taxonomy for non-manifold shapes. JA - Shape Modeling and Applications, 2009. SMI 2009. IEEE International Conference on M3 - 10.1109/SMI.2009.5170146 ER - TY - CONF T1 - Clutter Noise Removal in Binary Document Images T2 - International Conference on Document Analysis and Recognition (ICDAR '09) Y1 - 2009 A1 - Agrawal,Mudit A1 - David Doermann AB - The paper presents a clutter detection and removal algorithm for complex document images. The distance transform based approach is independent of clutter's position, size, shape and connectivity with text. Features are based on a new technique called `nth erosion' and clutter elements are identified with an SVM classifier. Removal is restrictive, so text attached to the clutter is not deleted in the process. The method was tested on a mix of degraded and noisy, machine-printed and handwritten Arabic and English text documents. Results show pixel-level accuracies of 97.5% and 95% for clutter detection and removal respectively. This approach was also extended with a noise detection and removal model for documents having a mix of clutter and salt-n-pepper noise. 
JA - International Conference on Document Analysis and Recognition (ICDAR '09) ER - TY - CONF T1 - Combining multiple kernels for efficient image classification T2 - Applications of Computer Vision (WACV), 2009 Workshop on Y1 - 2009 A1 - Siddiquie,B. A1 - Vitaladevuni,S.N. A1 - Davis, Larry S. KW - (artificial KW - AdaBoost;base KW - channels;multiple KW - classification;kernel KW - classification;learning KW - decision KW - feature KW - function;discriminative KW - intelligence);support KW - Kernel KW - kernel;image KW - kernels;composite KW - learning;support KW - machine;image KW - machines; KW - similarity;multiple KW - vector AB - We investigate the problem of combining multiple feature channels for the purpose of efficient image classification. Discriminative kernel based methods, such as SVMs, have been shown to be quite effective for image classification. To use these methods with several feature channels, one needs to combine base kernels computed from them. Multiple kernel learning is an effective method for combining the base kernels. However, the cost of computing the kernel similarities of a test image with each of the support vectors for all feature channels is extremely high. We propose an alternate method, where training data instances are selected, using AdaBoost, for each of the base kernels. A composite decision function, which can be evaluated by computing kernel similarities with respect to only these chosen instances, is learnt. This method significantly reduces the number of kernel computations required during testing. Experimental results on the benchmark UCI datasets, as well as on a challenging painting dataset, are included to demonstrate the effectiveness of our method. JA - Applications of Computer Vision (WACV), 2009 Workshop on M3 - 10.1109/WACV.2009.5403040 ER - TY - JOUR T1 - Complete Genome Sequence of Aggregatibacter (Haemophilus) Aphrophilus NJ8700 JF - Journal of Bacteriology J1 - J. Bacteriol.
Y1 - 2009 A1 - Di Bonaventura,Maria Pia A1 - DeSalle,Rob A1 - Pop, Mihai A1 - Nagarajan,Niranjan A1 - Figurski,David H A1 - Fine,Daniel H A1 - Kaplan,Jeffrey B A1 - Planet,Paul J AB - We report the finished and annotated genome sequence of Aggregatibacter aphrophilus strain NJ8700, a strain isolated from the oral flora of a healthy individual, and discuss characteristics that may affect its dual roles in human health and disease. This strain has a rough appearance, and its genome contains genes encoding a type VI secretion system and several factors that may participate in host colonization. VL - 191 SN - 0021-9193, 1098-5530 UR - http://jb.asm.org/content/191/14/4693 CP - 14 M3 - 10.1128/JB.00447-09 ER - TY - JOUR T1 - Composability and on-line deniability of authentication JF - Theory of Cryptography Y1 - 2009 A1 - Dodis,Y. A1 - Katz, Jonathan A1 - Smith,A. A1 - Walfish,S. AB - Protocols for deniable authentication achieve seemingly paradoxical guarantees: upon completion of the protocol the receiver is convinced that the sender authenticated the message, but neither party can convince anyone else that the other party took part in the protocol. We introduce and study on-line deniability, where deniability should hold even when one of the parties colludes with a third party during execution of the protocol. This turns out to generalize several realistic scenarios that are outside the scope of previous models. We show that a protocol achieves our definition of on-line deniability if and only if it realizes the message authentication functionality in the generalized universal composability framework; any protocol satisfying our definition thus automatically inherits strong composability guarantees. Unfortunately, we show that our definition is impossible to realize in the PKI model if adaptive corruptions are allowed (even if secure erasure is assumed).
On the other hand, we show feasibility with respect to static corruptions (giving the first separation in terms of feasibility between the static and adaptive setting), and show how to realize a relaxation termed deniability with incriminating abort under adaptive corruptions. M3 - 10.1007/978-3-642-00457-5_10 ER - TY - JOUR T1 - A Comprehensive Evaluation Framework and a Comparative Study for Human Detectors JF - Intelligent Transportation Systems, IEEE Transactions on Y1 - 2009 A1 - Hussein,M. A1 - Porikli, F. A1 - Davis, Larry S. KW - classification;infrared KW - classifier;comprehensive KW - curve;INRIA KW - dataset;average KW - DET KW - DETECTION KW - detection; KW - detector;image KW - error KW - Evaluation KW - framework;cropped KW - image;image KW - imaging;object KW - log KW - miss KW - person KW - rate;cascade KW - rate;feature KW - rate;multisize KW - resize;human KW - resize;miss KW - scanning;near-infrared KW - sliding-window KW - tradeoff;false-alarm KW - window;detection AB - We introduce a framework for evaluating human detectors that considers the practical application of a detector on a full image using multisize sliding-window scanning. We produce detection error tradeoff (DET) curves relating the miss detection rate and the false-alarm rate computed by deploying the detector on cropped windows and whole images, using, in the latter, either image resize or feature resize. Plots for cascade classifiers are generated based on confidence scores instead of on variation of the number of layers. To assess a method's overall performance on a given test, we use the average log miss rate (ALMR) as an aggregate performance score. To analyze the significance of the obtained results, we conduct 10-fold cross-validation experiments. We applied our evaluation framework to two state-of-the-art cascade-based detectors on the standard INRIA person dataset and a local dataset of near-infrared images. 
We used our evaluation framework to study the differences between the two detectors on the two datasets with different evaluation methods. Our results show the utility of our framework. They also suggest that the descriptors used to represent features and the training window size are more important in predicting the detection performance than the nature of the imaging process, and that the choice between resizing images or features can have serious consequences. VL - 10 SN - 1524-9050 CP - 3 M3 - 10.1109/TITS.2009.2026670 ER - TY - JOUR T1 - Computation of singular and hypersingular boundary integrals by Green identity and application to boundary value problems JF - Engineering Analysis with Boundary Elements Y1 - 2009 A1 - Seydou,F. A1 - Duraiswami, Ramani A1 - Seppänen,T. A1 - Gumerov, Nail A. KW - Boundary integral method KW - Green identity KW - Hypersingular integrals KW - Nyström method KW - Singular integrals AB - The problem of computing singular and hypersingular integrals involved in a large class of boundary value problems is considered. The method is based on Green's theorem for calculating the diagonal elements of the resulting discretized matrix using the Nyström discretization method. The method is successfully applied to classical boundary value problems. Convergence of the method is also discussed. VL - 33 SN - 0955-7997 UR - http://www.sciencedirect.com/science/article/pii/S0955799709000447 CP - 8–9 M3 - 10.1016/j.enganabound.2009.02.004 ER - TY - JOUR T1 - Computation of the head-related transfer function via the boundary element method and representation via the spherical harmonic spectrum JF - Technical Reports from UMIACS UMIACS-TR-2009-06 Y1 - 2009 A1 - Gumerov, Nail A. A1 - O'donovan,Adam A1 - Duraiswami, Ramani A1 - Zotkin,Dmitry N KW - Technical Report AB - The head-related transfer function (HRTF) is computed using the fast multipole accelerated boundary element method.
For efficiency, the HRTF is computed using the reciprocity principle, by placing a source at the ear and computing its field. Analysis is presented to modify the boundary value problem accordingly. To compute the HRTF corresponding to different ranges via a single computation, a compact and accurate representation of the HRTF, termed the spherical spectrum, is developed. Computations are reduced to a two-stage process: the computation of the spherical spectrum and a subsequent evaluation of the HRTF. This representation allows easy interpolation and range extrapolation of HRTFs. HRTF computations are performed for the range of audible frequencies up to 20 kHz for several models including a sphere, human head models (for the “Fritz” and “Kemar”), and head and torso model (the Kemar manikin). Comparisons between the different cases and analysis of limiting cases are provided. Comparisons with the computational data of other authors and available experimental data are conducted and show satisfactory agreement for the frequencies for which reliable experimental data is available. Our results show that, given a good mesh, it is feasible to compute the HRTF over the full audible range on a regular personal computer. UR - http://drum.lib.umd.edu/handle/1903/9085 ER - TY - JOUR T1 - Computing and visualizing a graph-based decomposition for non-manifold shapes JF - Graph-Based Representations in Pattern Recognition Y1 - 2009 A1 - De Floriani, Leila A1 - Panozzo,D. A1 - Hui,A. AB - Modeling and understanding complex non-manifold shapes is a key issue in shape analysis and retrieval. The topological structure of a non-manifold shape can be analyzed through its decomposition into a collection of components with a simpler topology.
Here, we consider a decomposition of a non-manifold shape into components which are almost manifolds, and we present a novel graph representation which highlights the non-manifold singularities shared by the components as well as their connectivity relations. We describe an algorithm for computing the decomposition and its associated graph representation. We present a new tool for visualizing the shape decomposition and its graph as an effective support for modeling, analyzing, and understanding non-manifold shapes. M3 - 10.1007/978-3-642-02124-4_7 ER - TY - JOUR T1 - On Computing Compression Trees for Data Collection in Sensor Networks JF - arXiv:0907.5442 Y1 - 2009 A1 - Li,Jian A1 - Deshpande, Amol A1 - Khuller, Samir KW - Computer Science - Information Theory KW - Computer Science - Networking and Internet Architecture AB - We address the problem of efficiently gathering correlated data from a wired or a wireless sensor network, with the aim of designing algorithms with provable optimality guarantees, and understanding how close we can get to the known theoretical lower bounds. Our proposed approach is based on finding an optimal or a near-optimal {\em compression tree} for a given sensor network: a compression tree is a directed tree over the sensor network nodes such that the value of a node is compressed using the value of its parent. We consider this problem under different communication models, including the {\em broadcast communication} model that enables many new opportunities for energy-efficient data collection. We draw connections between the data collection problem and a previously studied graph concept, called {\em weakly connected dominating sets}, and we use this to develop novel approximation algorithms for the problem. We present comparative results on several synthetic and real-world datasets showing that our algorithms construct near-optimal compression trees that yield a significant reduction in the data collection cost.
UR - http://arxiv.org/abs/0907.5442 ER - TY - CONF T1 - Concurrent transition and shot detection in football videos using Fuzzy Logic T2 - Image Processing (ICIP), 2009 16th IEEE International Conference on Y1 - 2009 A1 - Refaey,M.A. A1 - Elsayed,K.M. A1 - Hanafy,S.M. A1 - Davis, Larry S. KW - analysis;inference KW - boundary;shot KW - Color KW - colour KW - detection;sports KW - functions;shot KW - histogram;concurrent KW - logic;image KW - logic;inference KW - mechanism;intensity KW - mechanisms;sport;video KW - processing; KW - processing;videonanalysis;fuzzy KW - signal KW - transition;edgeness;football KW - variance;membership KW - video;video KW - videos;fuzzy AB - Shot detection is a fundamental step in video processing and analysis that should be achieved with a high degree of accuracy. In this paper, we introduce a unified algorithm for shot detection in sports video using fuzzy logic as a powerful inference mechanism. Fuzzy logic overcomes the problems of hard cut thresholds and the need for large training data in previous work. The proposed algorithm integrates many features like color histogram, edgeness, intensity variance, etc. Membership functions to represent different features and transitions between shots have been developed to detect different shot boundary and transition types. We address the detection of cut, fade, dissolve, and wipe shot transitions. The results show that our algorithm achieves a high degree of accuracy.
JA - Image Processing (ICIP), 2009 16th IEEE International Conference on M3 - 10.1109/ICIP.2009.5413648 ER - TY - CONF T1 - Consensus answers for queries over probabilistic databases T2 - Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems Y1 - 2009 A1 - Li,Jian A1 - Deshpande, Amol KW - consensus answers KW - probabilistic and/xor tree KW - Probabilistic databases KW - Query processing KW - rank aggregation AB - We address the problem of finding a "best" deterministic query answer to a query over a probabilistic database. For this purpose, we propose the notion of a consensus world (or a consensus answer) which is a deterministic world (answer) that minimizes the expected distance to the possible worlds (answers). This problem can be seen as a generalization of the well-studied inconsistent information aggregation problems (e.g. rank aggregation) to probabilistic databases. We consider this problem for various types of queries including SPJ queries, Top-k ranking queries, group-by aggregate queries, and clustering. For different distance metrics, we obtain polynomial time optimal or approximation algorithms for computing the consensus answers (or prove NP-hardness). Most of our results are for a general probabilistic database model, called and/xor tree model, which significantly generalizes previous probabilistic database models like x-tuples and block-independent disjoint models, and is of independent interest. JA - Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems T3 - PODS '09 PB - ACM CY - New York, NY, USA SN - 978-1-60558-553-6 UR - http://doi.acm.org/10.1145/1559795.1559835 M3 - 10.1145/1559795.1559835 ER - TY - JOUR T1 - A cost-effective lexical acquisition process for large-scale thesaurus translation JF - Language resources and evaluation Y1 - 2009 A1 - Jimmy Lin A1 - Murray,G. C A1 - Dorr, Bonnie J A1 - Hajič,J. A1 - Pecina,P. 
AB - Thesauri and controlled vocabularies facilitate access to digital collections by explicitly representing the underlying principles of organization. Translation of such resources into multiple languages is an important component for providing multilingual access. However, the specificity of vocabulary terms in most thesauri precludes fully-automatic translation using general-domain lexical resources. In this paper, we present an efficient process for leveraging human translations to construct domain-specific lexical resources. This process is illustrated on a thesaurus of 56,000 concepts used to catalog a large archive of oral histories. We elicited human translations on a small subset of concepts, induced a probabilistic phrase dictionary from these translations, and used the resulting resource to automatically translate the rest of the thesaurus. Two separate evaluations demonstrate the acceptability of the automatic translations and the cost-effectiveness of our approach. VL - 43 CP - 1 ER - TY - CONF T1 - Cross-document coreference resolution: A key technology for learning by reading T2 - AAAI Spring Symposium on Learning by Reading and Learning to Read Y1 - 2009 A1 - Mayfield,J. A1 - Alexander,D. A1 - Dorr, Bonnie J A1 - Eisner,J. A1 - Elsayed,T. A1 - Finin,T. A1 - Fink,C. A1 - Freedman,M. A1 - Garera,N. A1 - McNamee,P. A1 - others JA - AAAI Spring Symposium on Learning by Reading and Learning to Read ER - TY - CONF T1 - Dependable, Online Upgrades in Enterprise Systems T2 - OOPSLA'09 Proceedings of the 24th ACM SIGPLAN Conference Companion on Object Oriented Programming Systems Languages and Applications Y1 - 2009 A1 - Tudor Dumitras KW - data migration KW - Dependability KW - hidden dependencies KW - online upgrades KW - software upgrades AB - Software upgrades are unreliable, often causing downtime or data loss. 
I propose Imago, an approach for removing the leading causes of upgrade failures (broken dependencies) and of planned downtime (data migrations). While imposing a higher resource overhead than previous techniques, Imago is more dependable and easier to use correctly. JA - OOPSLA'09 Proceedings of the 24th ACM SIGPLAN Conference Companion on Object Oriented Programming Systems Languages and Applications T3 - OOPSLA '09 PB - ACM SN - 978-1-60558-768-4 UR - http://doi.acm.org/10.1145/1639950.1639993 ER - TY - CONF T1 - Designing the reading experience for scanned multi-lingual picture books on mobile phones T2 - Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries Y1 - 2009 A1 - Bederson, Benjamin B. A1 - Quinn,A. A1 - Druin, Allison JA - Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries ER - TY - JOUR T1 - Diamond Hierarchies of Arbitrary Dimension JF - Computer Graphics Forum Y1 - 2009 A1 - Weiss,Kenneth A1 - De Floriani, Leila KW - I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling—Hierarchy and geometric transformations KW - I.3.6 [Computer Graphics]: Methodology and Techniques—Graphics data structures and data types AB - Nested simplicial meshes generated by the simplicial bisection decomposition proposed by Maubach [Mau95] have been widely used in 2D and 3D as multi-resolution models of terrains and three-dimensional scalar fields. They are an alternative to octree representation since they allow generating crack-free representations of the underlying field. On the other hand, this method generates conforming meshes only when all simplices sharing the bisection edge are subdivided concurrently. Thus, efficient representations have been proposed in 2D and 3D based on a clustering of the simplices sharing a common longest edge in what is called a diamond.
These representations exploit the regularity of the vertex distribution and the diamond structure to yield an implicit encoding of the hierarchical and geometric relationships among the triangles and tetrahedra, respectively. Here, we analyze properties of d-dimensional diamonds to better understand the hierarchical and geometric relationships among the simplices generated by Maubach's bisection scheme and derive closed-form equations for the number of vertices, simplices, parents and children of each type of diamond. We exploit these properties to yield an implicit pointerless representation for d-dimensional diamonds and reduce the number of required neighbor-finding accesses from O(d!) to O(d). VL - 28 SN - 1467-8659 UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8659.2009.01506.x/abstract?userIsAuthenticated=false&deniedAccessCustomisedMessage= CP - 5 M3 - 10.1111/j.1467-8659.2009.01506.x ER - TY - JOUR T1 - Digital forensics [From the Guest Editors] JF - Signal Processing Magazine, IEEE Y1 - 2009 A1 - Delp,E. A1 - Memon, N. A1 - Wu,M. AB - This special issue provides a comprehensive overview of recent developments and open problems in digital forensics that are amenable to signal processing techniques. VL - 26 SN - 1053-5888 CP - 2 M3 - 10.1109/MSP.2008.931089 ER - TY - RPRT T1 - A dimension-independent library for building and manipulating multiresolution triangulations Y1 - 2009 A1 - De Floriani, Leila A1 - Magillo,P. A1 - Puppo,E. AB - A Multi-Triangulation (MT) is a general multiresolution model for representing k-dimensional geometric objects through simplicial complexes. An MT integrates several alternative representations of an object, and provides simple methods for handling representations at variable resolution efficiently, thus offering a basis for the development of applications that need to manage the level-of-detail of complex objects.
In this paper, we present an object-oriented library that provides an open-ended tool for building and manipulating object representations based on the MT. PB - Department of Computer Science and Information Science, University of Genoa VL - DISI-TR-99-03 ER - TY - JOUR T1 - Discrete distortion for surface meshes JF - Image Analysis and Processing–ICIAP 2009 Y1 - 2009 A1 - Mesmoudi,M. A1 - De Floriani, Leila A1 - Magillo,P. AB - Discrete distortion for two- and three-dimensional combinatorial manifolds is a discrete alternative to Ricci curvature known for differentiable manifolds. Here, we show that distortion can be successfully used to estimate mean curvature at any point of a surface. We compare our approach with the continuous case and with a common discrete approximation of mean curvature, which depends on the area of the star of each vertex in the triangulated surface. This provides a new, area-independent, tool for curvature estimation and for morphological shape analysis. We illustrate our approach through experimental results showing the behavior of discrete distortion. M3 - 10.1007/978-3-642-04146-4_70 ER - TY - CONF T1 - Efficient Query Evaluation over Temporally Correlated Probabilistic Streams T2 - IEEE 25th International Conference on Data Engineering, 2009. ICDE '09 Y1 - 2009 A1 - Kanagal,B.
A1 - Deshpande, Amol KW - Birds KW - Computerized monitoring KW - correlated probabilistic streams KW - correlation structure KW - Data engineering KW - data mining KW - Databases KW - Event detection KW - Graphical models KW - inference mechanisms KW - Markov processes KW - polynomial time KW - probabilistic database KW - probabilistic graphical model KW - probabilistic query evaluation KW - query evaluation KW - query planning algorithm KW - query plans KW - Query processing KW - Random variables KW - stream processing operator KW - Streaming media AB - In this paper, we address the problem of efficient query evaluation over highly correlated probabilistic streams. We observe that although probabilistic streams tend to be strongly correlated in space and time, the correlations are usually quite structured (i.e., the same set of dependencies and independences repeat across time) and Markovian (i.e., the state at time "t+1" is independent of the states at previous times given the state at time "t"). We exploit this observation to compactly encode probabilistic streams by decoupling the correlation structure (the set of dependencies) from the actual probability values. We develop novel stream processing operators that can efficiently and incrementally process new data items; our operators are based on the previously proposed framework of viewing probabilistic query evaluation as inference over probabilistic graphical models (PGMs) [P. Sen and A. Deshpande, 2007]. We develop a query planning algorithm that constructs efficient query plans that are executable in polynomial-time whenever possible, and we characterize queries for which such plans are not possible. Finally we conduct an extensive experimental evaluation that illustrates the advantages of exploiting the structured nature of correlations in probabilistic streams. JA - IEEE 25th International Conference on Data Engineering, 2009. 
ICDE '09 PB - IEEE SN - 978-1-4244-3422-0 M3 - 10.1109/ICDE.2009.229 ER - TY - CHAP T1 - Efficient Robust Private Set Intersection T2 - Applied Cryptography and Network Security Y1 - 2009 A1 - Dana Dachman-Soled A1 - Malkin, Tal A1 - Raykova, Mariana A1 - Yung, Moti ED - Abdalla, Michel ED - Pointcheval, David ED - Fouque, Pierre-Alain ED - Vergnaud, Damien KW - Coding and Information Theory KW - Computer Communication Networks KW - Cryptographic protocols KW - Data Encryption KW - Data Structures, Cryptology and Information Theory KW - Information Systems Applications (incl.Internet) KW - Privacy Preserving Data Mining KW - Secure Two-party Computation KW - Set Intersection KW - Systems and Data Security AB - Computing Set Intersection privately and efficiently between two mutually mistrusting parties is an important basic procedure in the area of private data mining. Assuring robustness, namely, coping with potentially arbitrarily misbehaving (i.e., malicious) parties, while retaining protocol efficiency (rather than employing costly generic techniques) is an open problem. In this work the first solution to this problem is presented. JA - Applied Cryptography and Network Security T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-01956-2, 978-3-642-01957-9 UR - http://link.springer.com/chapter/10.1007/978-3-642-01957-9_8 ER - TY - CONF T1 - Efficient subset selection via the kernelized Rényi distance T2 - Computer Vision, 2009 IEEE 12th International Conference on Y1 - 2009 A1 - Srinivasan,B.V. A1 - Duraiswami, Ramani AB - With improved sensors, the amount of data available in many vision problems has increased dramatically and allows the use of sophisticated learning algorithms to perform inference on the data. However, since these algorithms scale with data size, pruning the data is sometimes necessary. 
The pruning procedure must be statistically valid and a representative subset of the data must be selected without introducing selection bias. Information theoretic measures have been used for sampling the data, retaining its original information content. We propose an efficient Rényi entropy based subset selection algorithm. The algorithm is first validated and then applied to two sample applications where machine learning and data pruning are used. In the first application, Gaussian process regression is used to learn object pose. Here it is shown that the algorithm combined with the subset selection is significantly more efficient. In the second application, our subset selection approach is used to replace vector quantization in a standard object recognition algorithm, and improvements are shown. JA - Computer Vision, 2009 IEEE 12th International Conference on ER - TY - CONF T1 - Energy loss in MEMS resonators and the impact on inertial and RF devices Y1 - 2009 A1 - Weinberg,M. A1 - Candler,R. A1 - Chandorkar,S. A1 - Varsanik,J. A1 - Kenny,T. A1 - Duwel,A. KW - energy loss KW - gyroscopes KW - inertial devices KW - MEMS gyros KW - MEMS resonators KW - micromachined devices KW - micromechanical resonators KW - NEMS devices KW - Q-factor KW - quality factor KW - radiofrequency applications KW - RF devices AB - In this paper, we review the current understanding of energy loss mechanisms in micromachined (MEMS and NEMS) devices. We describe the importance of high quality factor (Q) to the performance of MEMS gyros and MEMS resonators used in radio-frequency applications. M3 - 10.1109/SENSOR.2009.5285418 ER - TY - JOUR T1 - Evidence for Bidentate Substrate Binding as the Basis for the K48 Linkage Specificity of Otubain 1 JF - Journal of Molecular Biology Y1 - 2009 A1 - Wang,Tao A1 - Yin,Luming A1 - Cooper,Eric M. A1 - Lai,Ming-Yih A1 - Dickey,Seth A1 - Pickart,Cecile M. A1 - Fushman, David A1 - Wilkinson,Keith D. A1 - Cohen,Robert E. 
A1 - Wolberger,Cynthia KW - deubiquitination KW - isopeptide KW - linkage specificity KW - otubain KW - polyubiquitin AB - Otubain 1 belongs to the ovarian tumor (OTU) domain class of cysteine protease deubiquitinating enzymes. We show here that human otubain 1 (hOtu1) is highly linkage-specific, cleaving Lys48 (K48)-linked polyubiquitin but not K63-, K29-, K6-, or K11-linked polyubiquitin, or linear α-linked polyubiquitin. Cleavage is not limited to either end of a polyubiquitin chain, and both free and substrate-linked polyubiquitin are disassembled. Intriguingly, cleavage of K48-diubiquitin by hOtu1 can be inhibited by diubiquitins of various linkage types, as well as by monoubiquitin. NMR studies and activity assays suggest that both the proximal and distal units of K48-diubiquitin bind to hOtu1. Reaction of Cys23 with ubiquitin-vinylsulfone identified a ubiquitin binding site that is distinct from the active site, which includes Cys91. Occupancy of the active site is needed to enable tight binding to the second site. We propose that distinct binding sites for the ubiquitins on either side of the scissile bond allow hOtu1 to discriminate among different isopeptide linkages in polyubiquitin substrates. Bidentate binding may be a general strategy used to achieve linkage-specific deubiquitination. VL - 386 SN - 0022-2836 UR - http://www.sciencedirect.com/science/article/pii/S0022283608016124 CP - 4 M3 - 10.1016/j.jmb.2008.12.085 ER - TY - CONF T1 - Exponential family hybrid semi-supervised learning T2 - Proceedings of the 21st International Joint Conference on Artifical Intelligence (IJCAI-09) Y1 - 2009 A1 - Agarwal,A. A1 - Daumé, Hal JA - Proceedings of the 21st International Joint Conference on Artifical Intelligence (IJCAI-09) ER - TY - JOUR T1 - Extreme polymorphism in a vaccine antigen and risk of clinical malaria: implications for vaccine development JF - Sci Transl Med Y1 - 2009 A1 - Takala,S. L A1 - Coulibaly,D. A1 - Thera,M. A A1 - Batchelor,A. 
H A1 - Cummings, Michael P. A1 - Escalante,A. A A1 - Ouattara,A. A1 - Traoré,K. A1 - Niangaly,A. A1 - Djimdé,A. A A1 - Doumbo,OK A1 - Plowe,CV AB - Vaccines directed against the blood stages of Plasmodium falciparum malaria are intended to prevent the parasite from invading and replicating within host cells. No blood-stage malaria vaccine has shown clinical efficacy in humans. Most malaria vaccine antigens are parasite surface proteins that have evolved extensive genetic diversity, and this diversity could allow malaria parasites to escape vaccine-induced immunity. We examined the extent and within-host dynamics of genetic diversity in the blood-stage malaria vaccine antigen apical membrane antigen-1 in a longitudinal study in Mali. Two hundred and fourteen unique apical membrane antigen-1 haplotypes were identified among 506 human infections, and amino acid changes near a putative invasion machinery binding site were strongly associated with the development of clinical symptoms, suggesting that these residues may be important to consider in designing polyvalent apical membrane antigen-1 vaccines and in assessing vaccine efficacy in field trials. This extreme diversity may pose a serious obstacle to an effective polyvalent recombinant subunit apical membrane antigen-1 vaccine. VL - 1 CP - 2 M3 - 10.1126/scitranslmed.3000257 ER - TY - CONF T1 - Fast concurrent object localization and recognition T2 - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on Y1 - 2009 A1 - Tom Yeh A1 - Lee,J.J. A1 - Darrell,T. AB - Object localization and recognition are important problems in computer vision. However, in many applications, exhaustive search over all object models and image locations is computationally prohibitive. While several methods have been proposed to make either recognition or localization more efficient, few have dealt with both tasks simultaneously. 
This paper proposes an efficient method for concurrent object localization and recognition based on a data-dependent multi-class branch-and-bound formalism. Existing bag-of-features recognition techniques which can be expressed as weighted combinations of feature counts can be readily adapted to our method. We present experimental results that demonstrate the merit of our algorithm in terms of recognition accuracy, localization accuracy, and speed, compared to baseline approaches including exhaustive search, implicit-shape model (ISM), and efficient sub-window search (ESS). Moreover, we develop two extensions to consider non-rectangular bounding regions (composite boxes and polygons) and demonstrate their ability to achieve higher recognition scores compared to traditional rectangular bounding boxes. JA - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on PB - IEEE SN - 978-1-4244-3992-8 UR - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5206805 M3 - 10.1109/CVPR.2009.5206805 ER - TY - JOUR T1 - Fast search for Dirichlet process mixture models JF - Arxiv preprint arXiv:0907.1812 Y1 - 2009 A1 - Daumé, Hal ER - TY - CONF T1 - First Steps to Netviz Nirvana: Evaluating Social Network Analysis with NodeXL T2 - International Conference on Computational Science and Engineering, 2009. CSE '09 Y1 - 2009 A1 - Bonsignore,E. M A1 - Dunne,C. A1 - Rotman,D. A1 - Smith,M. A1 - Capone,T. A1 - Hansen,D.
L A1 - Shneiderman, Ben KW - Computer science KW - computer science education KW - data visualisation KW - Data visualization KW - Educational institutions KW - graph drawing KW - graph layout algorithm KW - Information services KW - Information Visualization KW - Internet KW - Libraries KW - Microsoft Excel open-source template KW - MILC KW - multi-dimensional in-depth long-term case studies KW - Netviz Nirvana KW - NodeXL KW - Open source software KW - Programming profession KW - SNA KW - social network analysis KW - Social network services KW - social networking (online) KW - spreadsheet programs KW - structural relationship KW - teaching KW - visual analytics KW - visualization tool KW - Web sites AB - Social Network Analysis (SNA) has evolved as a popular, standard method for modeling meaningful, often hidden structural relationships in communities. Existing SNA tools often involve extensive pre-processing or intensive programming skills that can challenge practitioners and students alike. NodeXL, an open-source template for Microsoft Excel, integrates a library of common network metrics and graph layout algorithms within the familiar spreadsheet format, offering a potentially low-barrier-to-entry framework for teaching and learning SNA. We present the preliminary findings of 2 user studies of 21 graduate students who engaged in SNA using NodeXL. The majority of students, while information professionals, had little technical background or experience with SNA techniques. Six of the participants had more technical backgrounds and were chosen specifically for their experience with graph drawing and information visualization. Our primary objectives were (1) to evaluate NodeXL as an SNA tool for a broad base of users and (2) to explore methods for teaching SNA. 
Our complementary dual case-study format demonstrates the usability of NodeXL for a diverse set of users, and significantly, the power of a tightly integrated metrics/visualization tool to spark insight and facilitate sense-making for students of SNA. JA - International Conference on Computational Science and Engineering, 2009. CSE '09 PB - IEEE VL - 4 SN - 978-1-4244-5334-4 M3 - 10.1109/CSE.2009.120 ER - TY - CONF T1 - Fluency, adequacy, or HTER?: exploring different human judgments with a tunable MT metric T2 - Proceedings of the Fourth Workshop on Statistical Machine Translation Y1 - 2009 A1 - Snover,Matthew A1 - Madnani,Nitin A1 - Dorr, Bonnie J A1 - Schwartz,Richard AB - Automatic Machine Translation (MT) evaluation metrics have traditionally been evaluated by the correlation of the scores they assign to MT output with human judgments of translation performance. Different types of human judgments, such as Fluency, Adequacy, and HTER, measure varying aspects of MT performance that can be captured by automatic MT metrics. We explore these differences through the use of a new tunable MT metric: TER-Plus, which extends the Translation Edit Rate evaluation metric with tunable parameters and the incorporation of morphology, synonymy and paraphrases. TER-Plus was shown to be one of the top metrics in NIST's Metrics MATR 2008 Challenge, having the highest average rank in terms of Pearson and Spearman correlation. Optimizing TER-Plus to different types of human judgments yields significantly improved correlations and meaningful changes in the weight of different types of edits, demonstrating significant differences between the types of human judgments. 
JA - Proceedings of the Fourth Workshop on Statistical Machine Translation T3 - StatMT '09 PB - Association for Computational Linguistics CY - Stroudsburg, PA, USA UR - http://dl.acm.org/citation.cfm?id=1626431.1626480 ER - TY - JOUR T1 - From New Zealand to Mongolia: Co-designing and deploying a digital library for the world's children JF - Special issue of Children, Youth and Environments: Children in Technological Environments: Interaction, Development, and Design Y1 - 2009 A1 - Druin, Allison A1 - Bederson, Benjamin B. A1 - Rose,A. A1 - Weeks,A. ER - TY - CONF T1 - Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus T2 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2 Y1 - 2009 A1 - Mohammad,Saif A1 - Dunne,Cody A1 - Dorr, Bonnie J AB - Sentiment analysis often relies on a semantic orientation lexicon of positive and negative words. A number of approaches have been proposed for creating such lexicons, but they tend to be computationally expensive, and usually rely on significant manual annotation and large corpora. Most of these methods use WordNet. In contrast, we propose a simple approach to generate a high-coverage semantic orientation lexicon, which includes both individual words and multi-word expressions, using only a Roget-like thesaurus and a handful of affixes. Further, the lexicon has properties that support the Pollyanna Hypothesis. Using the General Inquirer as gold standard, we show that our lexicon has 14 percentage points more correct entries than the leading WordNet-based high-coverage lexicon (SentiWordNet). In an extrinsic evaluation, we obtain significantly higher performance in determining phrase polarity using our thesaurus-based lexicon than with any other. Additionally, we explore the use of visualization techniques to gain insight into our algorithm beyond the evaluations mentioned above.
JA - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2 T3 - EMNLP '09 PB - Association for Computational Linguistics CY - Stroudsburg, PA, USA SN - 978-1-932432-62-6 UR - http://dl.acm.org/citation.cfm?id=1699571.1699591 ER - TY - JOUR T1 - Generating surveys of scientific paradigms JF - Proceedings of HLT-NAACL Y1 - 2009 A1 - Mohammad,S. A1 - Dorr, Bonnie J A1 - Egan,M. A1 - Hassan,A. A1 - Muthukrishnan,P. A1 - Qazvinian,V. A1 - Radev,D. A1 - Zajic, David AB - The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey. Specifically we explore the combination of citation information and summarization techniques. Even though prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that in the framework of multi-document survey creation, citation texts can play a crucial role. ER - TY - JOUR T1 - Genome assortment, not serogroup, defines Vibrio cholerae pandemic strains JF - Nature Y1 - 2009 A1 - Brettin,Thomas S[Los Alamos National Laboratory A1 - Bruce,David C[Los Alamos National Laboratory A1 - Challacombe,Jean F[Los Alamos National Laboratory A1 - Detter,John C[Los Alamos National Laboratory A1 - Han,Cliff S[Los Alamos National Laboratory A1 - Munk,A. 
C[Los Alamos National Laboratory A1 - Chertkov,Olga[Los Alamos National Laboratory A1 - Meincke,Linda[Los Alamos National Laboratory A1 - Saunders,Elizabeth[Los Alamos National Laboratory A1 - Choi,Seon Y[SEOUL NATL UNIV A1 - Haley,Bradd J[U MARYLAND A1 - Taviani,Elisa[U MARYLAND A1 - Jeon,Yoon-Seong[INTL VACCINE INST SEOUL A1 - Kim,Dong Wook[INTL VACCINE INST SEOUL A1 - Lee,Jae-Hak[SEOUL NATL UNIV A1 - Walters,Ronald A[PNNL A1 - Huq,Anwar[NATL INST CHOLERIC ENTERIC DIS A1 - Rita R Colwell KW - 59; CHOLERA; GENES; GENETICS; GENOTYPE; ISLANDS; ORIGIN; PHENOTYPE; PUBLIC HEALTH; RECOMBINATION; STRAINS; TOXINS AB - Vibrio cholerae, the causative agent of cholera, is a bacterium autochthonous to the aquatic environment, and a serious public health threat. V. cholerae serogroup O1 is responsible for the previous two cholera pandemics, in which classical and El Tor biotypes were dominant in the 6th and the current 7th pandemics, respectively. Cholera researchers continually face newly emerging and re-emerging pathogenic clones carrying combinations of new serogroups as well as of phenotypic and genotypic properties. These genotype and phenotype changes have hampered control of the disease. Here we compare the complete genome sequences of 23 strains of V. cholerae isolated from a variety of sources and geographical locations over the past 98 years in an effort to elucidate the evolutionary mechanisms governing genetic diversity and genesis of new pathogenic clones. The genome-based phylogeny revealed 12 distinct V. cholerae phyletic lineages, of which one, designated the V. cholerae core genome (CG), comprises both O1 classical and El Tor biotypes. All 7th pandemic clones share nearly identical gene content, i.e., the same genome backbone. The transition from 6th to 7th pandemic strains is defined here as a 'shift' between pathogenic clones belonging to the same O1 serogroup, but from significantly different phyletic lineages within the CG clade. 
In contrast, transition among clones during the present 7th pandemic period can be characterized as a 'drift' between clones, differentiated mainly by varying composition of laterally transferred genomic islands, resulting in emergence of variants, exemplified by V. cholerae serogroup O139 and V. cholerae O1 El Tor hybrid clones that produce cholera toxin of classical biotype. Based on the comprehensive comparative genomics presented in this study it is concluded that V. cholerae undergoes extensive genetic recombination via lateral gene transfer, and, therefore, genome assortment, not serogroup, should be used to define pathogenic V. cholerae clones. UR - http://www.osti.gov/energycitations/servlets/purl/962365-icnke9/ ER - TY - JOUR T1 - Graphical models for uncertain data JF - Managing and Mining Uncertain Data Y1 - 2009 A1 - Deshpande, Amol A1 - Getoor, Lise A1 - Sen,P. ER - TY - JOUR T1 - HEALTH AND WELLBEING JF - The fourth paradigm: data-intensive scientific discovery Y1 - 2009 A1 - Gillam,M. A1 - Feied,C. A1 - MOODY,E. A1 - Shneiderman, Ben A1 - Smith,M. A1 - DICKASON,J. ER - TY - CONF T1 - How children search the internet with keyword interfaces T2 - Proceedings of the 8th International Conference on Interaction Design and Children Y1 - 2009 A1 - Druin, Allison A1 - Foss,E. A1 - Hatley,L. A1 - Golub,E. A1 - Guha,M.L. A1 - Fails,J. A1 - Hutchinson,H. JA - Proceedings of the 8th International Conference on Interaction Design and Children ER - TY - CONF T1 - Human detection using partial least squares analysis T2 - Computer Vision, 2009 IEEE 12th International Conference on Y1 - 2009 A1 - Schwartz,William Robson A1 - Kembhavi,Aniruddha A1 - Harwood,David A1 - Davis, Larry S. AB - Significant research has been devoted to detecting people in images and videos. In this paper we describe a human detection method that augments widely used edge-based features with texture and color information, providing us with a much richer descriptor set. 
This augmentation results in an extremely high-dimensional feature space (more than 170,000 dimensions). In such high-dimensional spaces, classical machine learning algorithms such as SVMs are nearly intractable with respect to training. Furthermore, the number of training samples is much smaller than the dimensionality of the feature space, by at least an order of magnitude. Finally, the extraction of features from a densely sampled grid structure leads to a high degree of multicollinearity. To circumvent these data characteristics, we employ Partial Least Squares (PLS) analysis, an efficient dimensionality reduction technique, one which preserves significant discriminative information, to project the data onto a much lower dimensional subspace (20 dimensions, reduced from the original 170,000). Our human detection system, employing PLS analysis over the enriched descriptor set, is shown to outperform state-of-the-art techniques on three varied datasets including the popular INRIA pedestrian dataset, the low-resolution gray-scale DaimlerChrysler pedestrian dataset, and the ETHZ pedestrian dataset consisting of full-length videos of crowded scenes. JA - Computer Vision, 2009 IEEE 12th International Conference on M3 - 10.1109/ICCV.2009.5459205 ER - TY - JOUR T1 - Image Transformations and Blurring JF - IEEE Transactions on Pattern Analysis and Machine Intelligence Y1 - 2009 A1 - Domke, Justin A1 - Aloimonos, J. KW - reconstruction KW - restoration KW - sharpening and deblurring KW - smoothing. AB - Since cameras blur the incoming light during measurement, different images of the same surface do not contain the same information about that surface. Thus, in general, corresponding points in multiple views of a scene have different image intensities. While multiple-view geometry constrains the locations of corresponding points, it does not give relationships between the signals at corresponding locations. This paper offers an elementary treatment of these relationships. 
We first develop the notion of "ideal" and "real" images, corresponding to, respectively, the raw incoming light and the measured signal. This framework separates the filtering and geometric aspects of imaging. We then consider how to synthesize one view of a surface from another; if the transformation between the two views is affine, it emerges that this is possible if and only if the singular values of the affine matrix are positive. Next, we consider how to combine the information in several views of a surface into a single output image. By developing a new tool called "frequency segmentation," we show how this can be done despite not knowing the blurring kernel. VL - 31 SN - 0162-8828 CP - 5 M3 - http://doi.ieeecomputersociety.org/10.1109/TPAMI.2008.133 ER - TY - JOUR T1 - Imaging room acoustics with the audio camera. JF - The Journal of the Acoustical Society of America Y1 - 2009 A1 - O'Donovan,Adam A1 - Duraiswami, Ramani A1 - Gumerov, Nail A. A1 - Zotkin,Dmitry N AB - Using a spherical microphone array and real time signal processing using a graphical processing unit (GPU), an audio camera has been developed. This device provides images of the intensity of the sound field arriving at a point from a specified direction to the spherical array. Real-time performance is achieved via use of GPUs. The intensity can be displayed integrated over the whole frequency band of the array, or in false color, with different frequency bands mapped to different color bands. The resulting audio camera may be combined with video cameras to achieve multimodal scene capture and analysis. A theory of registration of audio camera images with video camera images is developed, and joint analysis of audio and video images performed. An interesting application of the audio camera is the imaging of concert hall acoustics. The individual reflections that constitute the impulse response measured at a particular seat may be imaged, and their spatial origin determined. 
Other applications of the audio camera to people tracking, noise suppression, and camera pointing are also presented. [Work partially supported by NVIDIA and the VA.] VL - 125 UR - http://link.aip.org/link/?JAS/125/2544/2 CP - 4 ER - TY - CHAP T1 - Improved Non-committing Encryption with Applications to Adaptively Secure Protocols T2 - Advances in Cryptology – ASIACRYPT 2009 Y1 - 2009 A1 - Choi, Seung Geol A1 - Dana Dachman-Soled A1 - Malkin, Tal A1 - Wee, Hoeteck ED - Matsui, Mitsuru KW - adaptive corruption KW - Algorithm Analysis and Problem Complexity KW - Applications of Mathematics KW - Data Encryption KW - Data Structures, Cryptology and Information Theory KW - Discrete Mathematics in Computer Science KW - non-committing encryption KW - public-key encryption KW - secure multi-party computation KW - Systems and Data Security AB - We present a new construction of non-committing encryption schemes. Unlike the previous constructions of Canetti et al. (STOC ’96) and of Damgård and Nielsen (Crypto ’00), our construction achieves all of the following properties: Optimal round complexity. Our encryption scheme is a 2-round protocol, matching the round complexity of Canetti et al. and improving upon that in Damgård and Nielsen. Weaker assumptions. Our construction is based on trapdoor simulatable cryptosystems, a new primitive that we introduce as a relaxation of those used in previous works. We also show how to realize this primitive based on hardness of factoring. Improved efficiency. The amortized complexity of encrypting a single bit is O(1) public key operations on a constant-sized plaintext in the underlying cryptosystem. As a result, we obtain the first non-committing public-key encryption schemes under hardness of factoring and worst-case lattice assumptions; previously, such schemes were only known under the CDH and RSA assumptions. 
Combined with existing work on secure multi-party computation, we obtain protocols for multi-party computation secure against a malicious adversary that may adaptively corrupt an arbitrary number of parties under weaker assumptions than were previously known. Specifically, we obtain the first adaptively secure multi-party protocols based on hardness of factoring in both the stand-alone setting and the UC setting with a common reference string. JA - Advances in Cryptology – ASIACRYPT 2009 T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-10365-0, 978-3-642-10366-7 UR - http://link.springer.com/chapter/10.1007/978-3-642-10366-7_17 ER - TY - JOUR T1 - Improving graph drawing readability by incorporating readability metrics: A software tool for network analysts JF - University of Maryland, HCIL Tech Report HCIL-2009-13 Y1 - 2009 A1 - Dunne,C. A1 - Shneiderman, Ben AB - Designing graph drawings that effectively communicate the underlying network is challenging as for every network there are many potential unintelligible or even misleading drawings. Automated graph layout algorithms have helped, but frequently generate ineffective drawings. In order to build awareness of effective graph drawing strategies, we detail readability metrics on a [0,1] continuous scale for node occlusion, edge crossing, edge crossing angle, and edge tunneling and summarize many more. Additionally, we define new node & edge readability metrics to provide more localized identification of where improvement is needed. These are implemented in SocialAction, a tool for social network analysis, in order to direct users towards poor areas of the drawing and provide real-time readability metric feedback as users manipulate it. These contributions are aimed at heightening the awareness of network analysts that the images they share or publish could be of higher quality, so that readers could extract relevant information. 
ER - TY - JOUR T1 - Improving recommendation accuracy by clustering social networks with trust JF - Recommender Systems & the Social Web Y1 - 2009 A1 - DuBois,T. A1 - Golbeck,J. A1 - Kleint,J. A1 - Srinivasan, Aravind AB - Social trust relationships between users in social networks speak to the similarity in opinions between the users, both in general and in important nuanced ways. They have been used in the past to make recommendations on the web. New trust metrics allow us to easily cluster users based on trust. In this paper, we investigate the use of trust clusters as a new way of improving recommendations. Previous work on the use of clusters has shown the technique to be relatively unsuccessful, but those clusters were based on similarity rather than trust. Our results show that when trust clusters are integrated into memory-based collaborative filtering algorithms, they lead to statistically significant improvements in accuracy. In this paper we discuss our methods, experiments, results, and potential future applications of the technique. ER - TY - CONF T1 - Incremental Multiple Kernel Learning for object recognition T2 - Computer Vision, 2009 IEEE 12th International Conference on Y1 - 2009 A1 - Kembhavi,Aniruddha A1 - Siddiquie,Behjat A1 - Miezianko,Roland A1 - McCloskey,Scott A1 - Davis, Larry S. AB - A good training dataset, representative of the test images expected in a given application, is critical for ensuring good performance of a visual categorization system. Obtaining task specific datasets of visual categories is, however, far more tedious than obtaining a generic dataset of the same classes. We propose an Incremental Multiple Kernel Learning (IMKL) approach to object recognition that initializes on a generic training database and then tunes itself to the classification task at hand. Our system simultaneously updates the training dataset as well as the weights used to combine multiple information sources. 
We demonstrate our system on a vehicle classification problem in a video stream overlooking a traffic intersection. Our system updates itself with images of vehicles in poses more commonly observed in the scene, as well as with image patches of the background, leading to an increase in performance. A considerable change in the kernel combination weights is observed as the system gathers scene specific training data over time. The system is also seen to adapt itself to the illumination change in the scene as day transitions to night. JA - Computer Vision, 2009 IEEE 12th International Conference on M3 - 10.1109/ICCV.2009.5459179 ER - TY - CONF T1 - Indexing correlated probabilistic databases T2 - Proceedings of the 35th SIGMOD international conference on Management of data Y1 - 2009 A1 - Kanagal,Bhargav A1 - Deshpande, Amol KW - caching KW - Indexing KW - inference queries KW - junction trees KW - Probabilistic databases AB - With large amounts of correlated probabilistic data being generated in a wide range of application domains including sensor networks, information extraction, event detection etc., effectively managing and querying them has become an important research direction. While there is an exhaustive body of literature on querying independent probabilistic data, supporting efficient queries over large-scale, correlated databases remains a challenge. In this paper, we develop efficient data structures and indexes for supporting inference and decision support queries over such databases. Our proposed hierarchical data structure is suitable both for in-memory and disk-resident databases. We represent the correlations in the probabilistic database using a junction tree over the tuple-existence or attribute-value random variables, and use tree partitioning techniques to build an index structure over it. We show how to efficiently answer inference and aggregation queries using such an index, resulting in orders of magnitude performance benefits in most cases. 
In addition, we develop novel algorithms for efficiently keeping the index structure up-to-date as changes (inserts, updates) are made to the probabilistic database. We present a comprehensive experimental study illustrating the benefits of our approach to query processing in probabilistic databases. JA - Proceedings of the 35th SIGMOD international conference on Management of data T3 - SIGMOD '09 PB - ACM CY - New York, NY, USA SN - 978-1-60558-551-2 UR - http://doi.acm.org/10.1145/1559845.1559894 M3 - 10.1145/1559845.1559894 ER - TY - JOUR T1 - The infinite hierarchical factor regression model JF - arXiv preprint arXiv:0908.0570 Y1 - 2009 A1 - Rai,P. A1 - Daumé, Hal ER - TY - CHAP T1 - Interlingual annotation of multilingual text corpora and FrameNet T2 - Multilingual FrameNets in Computational Lexicography Y1 - 2009 A1 - Farwell,David A1 - Dorr, Bonnie J A1 - Habash,Nizar A1 - Helmreich,Stephen A1 - Hovy,Eduard A1 - Green,Rebecca A1 - Levin,Lori A1 - Miller,Keith A1 - Mitamura,Teruko A1 - Rambow,Owen A1 - Reeder,Flo A1 - Siddharthan,Advaith ED - Bisang,Walter ED - Hock,Hans Henrich ED - Winter,Werner ED - Boas,Hans C. JA - Multilingual FrameNets in Computational Lexicography PB - Mouton de Gruyter CY - Berlin, New York VL - 200 SN - 978-3-11-021296-9, 978-3-11-021297-6 UR - http://www.degruyter.com/view/books/9783110212976/9783110212976.4.287/9783110212976.4.287.xml ER - TY - JOUR T1 - Language Identification for Handwritten Document Images Using A Shape Codebook JF - Pattern Recognition Y1 - 2009 A1 - Zhu,Guangyu A1 - Yu,Xiaodong A1 - Li,Yi A1 - David Doermann AB - Language identification for handwritten document images is an open document analysis problem. 
In this paper, we propose a novel approach to language identification for documents containing mixture of handwritten and machine printed text using image descriptors constructed from a codebook of shape features. We encode local text structures using scale and rotation invariant codewords, each representing a segmentation-free shape feature that is generic enough to be detected repeatably. We learn a concise, structurally indexed shape codebook from training by clustering and partitioning similar feature types through graph cuts. Our approach is easily extensible and does not require skew correction, scale normalization, or segmentation. We quantitatively evaluate our approach using a large real-world document image collection, which is composed of 1,512 documents in eight languages (Arabic, Chinese, English, Hindi, Japanese, Korean, Russian, and Thai) and contains a complex mixture of handwritten and machine printed content. Experiments demonstrate the robustness and flexibility of our approach, and show exceptional language identification performance that exceeds the state of the art. VL - 42 ER - TY - CONF T1 - Learning Discriminative Appearance-Based Models Using Partial Least Squares T2 - Computer Graphics and Image Processing (SIBGRAPI), 2009 XXII Brazilian Symposium on Y1 - 2009 A1 - Schwartz, W.R. A1 - Davis, Larry S. KW - appearance based discriminative models KW - feature descriptors KW - image colour analysis KW - learning (artificial intelligence) KW - least squares approximations KW - machine learning techniques KW - object recognition KW - partial least squares KW - person recognition AB - Appearance information is essential for applications such as tracking and people recognition. One of the main problems of using appearance-based discriminative models is the ambiguities among classes when the number of persons being considered increases. 
To reduce the amount of ambiguity, we propose the use of a rich set of feature descriptors based on color, textures and edges. Another issue regarding appearance modeling is the limited number of training samples available for each appearance. The discriminative models are created using a powerful statistical tool called partial least squares (PLS), responsible for weighting the features according to their discriminative power for each different appearance. The experimental results, based on appearance-based person recognition, demonstrate that the use of an enriched feature set analyzed by PLS reduces the ambiguity among different appearances and provides higher recognition rates when compared to other machine learning techniques. JA - Computer Graphics and Image Processing (SIBGRAPI), 2009 XXII Brazilian Symposium on M3 - 10.1109/SIBGRAPI.2009.42 ER - TY - JOUR T1 - Learning to trust in the competence and commitment of agents JF - Autonomous Agents and Multi-Agent Systems Y1 - 2009 A1 - Smith,Michael A1 - desJardins, Marie AB - For agents to collaborate in open multi-agent systems, each agent must trust in the other agents’ ability to complete tasks and willingness to cooperate. Agents need to decide between cooperative and opportunistic behavior based on their assessment of another agents’ trustworthiness. In particular, an agent can have two beliefs about a potential partner that tend to indicate trustworthiness: that the partner is competent and that the partner expects to engage in future interactions . This paper explores an approach that models competence as an agent’s probability of successfully performing an action, and models belief in future interactions as a discount factor. We evaluate the underlying decision framework’s performance given accurate knowledge of the model’s parameters in an evolutionary game setting. 
We then introduce a game-theoretic framework in which an agent can learn a model of another agent online, using the Harsanyi transformation. The learning agents evaluate a set of competing hypotheses about another agent during the simulated play of an indefinitely repeated game. The Harsanyi strategy is shown to demonstrate robust and successful online play against a variety of static, classic, and learning strategies in a variable-payoff Iterated Prisoner’s Dilemma setting. VL - 18 SN - 1387-2532 UR - http://dx.doi.org/10.1007/s10458-008-9055-8 CP - 1 ER - TY - CONF T1 - Logo Matching for Document Image Retrieval T2 - International Conference on Document Analysis and Recognition (ICDAR 2009) Y1 - 2009 A1 - Zhu,Guangyu A1 - David Doermann AB - Graphics detection and recognition are fundamental research problems in document image analysis and retrieval. As one of the most pervasive graphical elements in business and government documents, logos may enable immediate identification of organizational entities and serve extensively as a declaration of a document's source and ownership. In this work, we developed an automatic logo-based document image retrieval system that handles: 1) Logo detection and segmentation by boosting a cascade of classifiers across multiple image scales; and 2) Logo matching using translation, scale, and rotation invariant shape descriptors and matching algorithms. Our approach is segmentation free and layout independent and we address logo retrieval in an unconstrained setting of 2-D feature point matching. Finally, we quantitatively evaluate the effectiveness of our approach using large collections of real-world complex document images. 
JA - International Conference on Document Analysis and Recognition (ICDAR 2009) ER - TY - CONF T1 - Markov random topic fields T2 - Proceedings of the ACL-IJCNLP 2009 Conference Short Papers Y1 - 2009 A1 - Daumé, Hal JA - Proceedings of the ACL-IJCNLP 2009 Conference Short Papers ER - TY - JOUR T1 - The Maryland Modular Method: An Approach to Doctoral Education in Information Studies JF - Journal of education for library and information science Y1 - 2009 A1 - Druin, Allison A1 - Jaeger,P. T A1 - Golbeck,J. A1 - Fleischmann,K.R. A1 - Jimmy Lin A1 - Qu,Y. A1 - Wang,P. A1 - Xie,B. VL - 50 CP - 4 ER - TY - CONF T1 - Measurement methods for fast and accurate blackhole identification with binary tomography T2 - Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference Y1 - 2009 A1 - Cunha,Ítalo A1 - Teixeira,Renata A1 - Feamster, Nick A1 - Diot,Christophe KW - diagnosis KW - network tomography KW - troubleshooting AB - Binary tomography - the process of identifying faulty network links through coordinated end-to-end probes - is a promising method for detecting failures that the network does not automatically mask (e.g., network "blackholes"). Because tomography is sensitive to the quality of the input, however, naïve end-to-end measurements can introduce inaccuracies. This paper develops two methods for generating inputs to binary tomography algorithms that improve their inference speed and accuracy. Failure confirmation is a per-path probing technique to distinguish packet losses caused by congestion from persistent link or node failures. Aggregation strategies combine path measurements from unsynchronized monitors into a set of consistent observations. When used in conjunction with existing binary tomography algorithms, our methods identify all failures that are longer than two measurement cycles, while inducing relatively few false alarms. 
In two wide-area networks, our techniques decrease the number of alarms by as much as two orders of magnitude. Compared to the state of the art in binary tomography, our techniques increase the identification rate and avoid hundreds of false alarms. JA - Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference T3 - IMC '09 PB - ACM CY - New York, NY, USA SN - 978-1-60558-771-4 UR - http://doi.acm.org/10.1145/1644893.1644924 M3 - 10.1145/1644893.1644924 ER - TY - JOUR T1 - MEASURING 1ST ORDER STRETCH WITH A SINGLE FILTER JF - Relation Y1 - 2009 A1 - Bitsakos,K. A1 - Domke, J. A1 - Fermüller, Cornelia A1 - Aloimonos, J. AB - We analytically develop a filter that is able to measure the linear stretch of the transformation around a point, and present results of applying it to real signals. We show that this method is a real-time alternative solution for measuring local signal transformations. Experimentally, this method can accurately measure stretch; however, it is sensitive to shift. VL - 10 CP - 1.132 ER - TY - CONF T1 - Measuring General Relational Structure Using the Block Modularity Clustering Objective T2 - Twenty-Second International FLAIRS Conference Y1 - 2009 A1 - Anthony,Adam Paul A1 - desJardins, Marie A1 - Lombardi,Michael AB - The performance of all relational learning techniques has an implicit dependence on the underlying connectivity structure of the relations that are used as input. In this paper, we show how clustering can be used to develop an efficient optimization strategy that can be used to effectively measure the structure of a graph in the absence of labeled instances. JA - Twenty-Second International FLAIRS Conference UR - http://www.aaai.org/ocs/index.php/FLAIRS/2009/paper/viewPaper/46 ER - TY - CONF T1 - Minimizing Communication Cost in Distributed Multi-query Processing T2 - IEEE 25th International Conference on Data Engineering, 2009. 
ICDE '09 Y1 - 2009 A1 - Li,Jian A1 - Deshpande, Amol A1 - Khuller, Samir KW - Approximation algorithms KW - Communication networks KW - Computer science KW - Cost function KW - Data engineering KW - distributed communication network KW - distributed databases KW - distributed multi-query processing KW - grid computing KW - Large-scale systems KW - NP-hard KW - optimisation KW - Polynomials KW - Publish-subscribe KW - publish-subscribe systems KW - Query optimization KW - Query processing KW - sensor networks KW - Steiner tree problem KW - Tree graphs KW - trees (mathematics) AB - Increasing prevalence of large-scale distributed monitoring and computing environments such as sensor networks, scientific federations, Grids etc., has led to a renewed interest in the area of distributed query processing and optimization. In this paper we address a general, distributed multi-query processing problem motivated by the need to minimize the communication cost in these environments. Specifically we address the problem of optimally sharing data movement across the communication edges in a distributed communication network given a set of overlapping queries and query plans for them (specifying the operations to be executed). Most of the problem variations of our general problem can be shown to be NP-Hard by a reduction from the Steiner tree problem. However, we show that the problem can be solved optimally if the communication network is a tree, and present a novel algorithm for finding an optimal data movement plan. For general communication networks, we present efficient approximation algorithms for several variations of the problem. Finally, we present an experimental study over synthetic datasets showing both the need for exploiting the sharing of data movement and the effectiveness of our algorithms at finding such plans. JA - IEEE 25th International Conference on Data Engineering, 2009. 
ICDE '09 PB - IEEE SN - 978-1-4244-3422-0 M3 - 10.1109/ICDE.2009.85 ER - TY - CONF T1 - Modal expansion of HRTFs: Continuous representation in frequency-range-angle T2 - Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on Y1 - 2009 A1 - Zhang,Wen A1 - Abhayapala,T.D. A1 - Kennedy,R.A. A1 - Duraiswami, Ramani KW - Fourier series KW - HRTF KW - frequency domains KW - frequency-range-angle KW - head related transfer function KW - modal analysis KW - modal expansion KW - signal reconstruction KW - signal representation KW - spherical Bessel functions KW - transfer functions AB - This paper proposes a continuous HRTF representation in both 3D spatial and frequency domains. The method is based on the acoustic reciprocity principle and a modal expansion of the wave equation solution to represent the HRTF variations with different variables in separate basis functions. The derived spatial basis modes can achieve HRTF near-field and far-field representation in one formulation. The HRTF frequency components are expanded using Fourier Spherical Bessel series for compact representation. The proposed model can be used to reconstruct HRTFs at any arbitrary position in space and at any frequency point from a finite number of measurements. Analytical simulated and measured HRTFs from a KEMAR are used to validate the model. JA - Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. 
IEEE International Conference on M3 - 10.1109/ICASSP.2009.4959576 ER - TY - CONF T1 - Morphology analysis of 3D scalar fields based on Morse theory and discrete distortion T2 - Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems Y1 - 2009 A1 - Mesmoudi,Mohammed Mostefa A1 - De Floriani, Leila A1 - Magillo,Paola KW - curvature KW - morphological analysis KW - Morse theory KW - segmentation AB - We investigate a morphological approach to the analysis and understanding of 3D scalar fields defined by volume data sets. We consider a discrete model of the 3D field obtained by discretizing its domain into a tetrahedral mesh. We use Morse theory as the basic mathematical tool which provides a segmentation of the graph of the scalar field based on relevant morphological features (such as critical points). Since the graph of a discrete 3D field is a tetrahedral hypersurface in 4D space, we measure the distortion of the transformation which maps the tetrahedral decomposition of the domain of the scalar field into the tetrahedral mesh representing its graph in R4, and we call it discrete distortion. We develop a segmentation algorithm to produce Morse decompositions associated with the scalar field and its discrete distortion. We use a merging procedure to control the number of 3D regions in the segmentation output. Experimental results show the validity of our approach. JA - Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems T3 - GIS '09 PB - ACM CY - New York, NY, USA SN - 978-1-60558-649-6 UR - http://doi.acm.org/10.1145/1653771.1653799 M3 - 10.1145/1653771.1653799 ER - TY - JOUR T1 - Multi-label prediction via sparse infinite CCA JF - Advances in Neural Information Processing Systems Y1 - 2009 A1 - Rai,P. 
A1 - Daumé, Hal VL - 22 ER - TY - CONF T1 - Multiple instance Feature for robust part-based object detection T2 - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on Y1 - 2009 A1 - Lin,Z. A1 - Hua,G. A1 - Davis, Larry S. JA - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on ER - TY - JOUR T1 - No Downtime for Data Conversions: Rethinking Hot Upgrades (CMU-PDL-09-106) JF - Parallel Data Laboratory Y1 - 2009 A1 - Tudor Dumitras A1 - Narasimhan, Priya UR - http://repository.cmu.edu/pdl/74 ER - TY - CONF T1 - Non-parametric Bayesian areal linguistics T2 - Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics Y1 - 2009 A1 - Daumé, Hal JA - Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics ER - TY - CONF T1 - Object detection via boosted deformable features T2 - Image Processing (ICIP), 2009 16th IEEE International Conference on Y1 - 2009 A1 - Hussein,M. A1 - Porikli, F. A1 - Davis, Larry S. KW - boosted KW - detection;object KW - detection;statistics; KW - detection;visual KW - ensembles;deformable KW - evidence;feature KW - extraction;object KW - features;human AB - It is a common practice to model an object for detection tasks as a boosted ensemble of many models built on features of the object. In this context, features are defined as subregions with fixed relative locations and extents with respect to the object's image window. We introduce the use of deformable features with boosted ensembles. A deformable feature adapts its location depending on the visual evidence in order to match the corresponding physical feature. Therefore, deformable features can better handle deformable objects.
We empirically show that boosted ensembles of deformable features perform significantly better than boosted ensembles of fixed features for human detection. JA - Image Processing (ICIP), 2009 16th IEEE International Conference on M3 - 10.1109/ICIP.2009.5414561 ER - TY - JOUR T1 - Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2009 A1 - Gupta,A. A1 - Kembhavi,A. A1 - Davis, Larry S. KW - Automated;Recognition (Psychology);Video Recording; KW - Bayesian approach;functional compatibility;human perception;human-object interactions;objects recognition;psychological studies;spatial compatibility;Bayes methods;behavioural sciences;human factors;image recognition;motion estimation;object recognition;A KW - Biological;Movement;Pattern Recognition KW - Computer-Assisted;Models AB - Interpretation of images and videos containing humans interacting with different objects is a daunting task. It involves understanding scene or event, analyzing human movements, recognizing manipulable objects, and observing the effect of the human movement on those objects. While each of these perceptual tasks can be conducted independently, recognition rate improves when interactions between them are considered. Motivated by psychological studies of human perception, we present a Bayesian approach which integrates various perceptual tasks involved in understanding human-object interactions. Previous approaches to object and action recognition rely on static shape or appearance feature matching and motion analysis, respectively. Our approach goes beyond these traditional approaches and applies spatial and functional constraints on each of the perceptual elements for coherent semantic interpretation. Such constraints allow us to recognize objects and actions when the appearances are not discriminative enough. 
We also demonstrate the use of such constraints in recognition of actions from static images without using any motion information. VL - 31 SN - 0162-8828 CP - 10 M3 - 10.1109/TPAMI.2009.83 ER - TY - JOUR T1 - Off-Line Loop Investigation for Handwriting Analysis JF - IEEE Transactions on Pattern Analysis and Machine Intelligence Y1 - 2009 A1 - Steinherz,T. A1 - David Doermann A1 - Rivlin,E. A1 - Intrator,N. VL - 31 CP - 2 ER - TY - CONF T1 - Page Rule-Line Removal using Linear Subspaces in Monochromatic Handwritten Arabic Documents T2 - Intl. Conf. on Document Analysis and Recognition (ICDAR 09) Y1 - 2009 A1 - Abd-Almageed, Wael A1 - Kumar,Jayant A1 - David Doermann AB - In this paper we present a novel method for removing page rule lines in monochromatic handwritten Arabic documents using subspace methods with minimal effect on the quality of the foreground text. We use moment and histogram properties to extract features that represent the characteristics of the underlying rule lines. A linear subspace is incrementally built to obtain a line model that can be used to identify rule line pixels. We also introduce a novel scheme for evaluating noise removal algorithms in general and we use it to assess the quality of our rule line removal algorithm. Experimental results presented on a data set of 50 Arabic documents, handwritten by different writers, demonstrate the effectiveness of the proposed method. JA - Intl. Conf. on Document Analysis and Recognition (ICDAR 09) ER - TY - JOUR T1 - Perception and Navigation for Autonomous Vehicles JF - IEEE Transactions on Intelligent Transportation Systems Y1 - 2009 A1 - Hussein,M. A1 - Porikli, F. A1 - Davis, Larry S. AB - We introduce a framework for evaluating human detectors that considers the practical application of a detector on a full image using multisize sliding-window scanning.
We produce detection error tradeoff (DET) curves relating the miss detection rate and the false-alarm rate computed by deploying the detector on cropped windows and whole images, using, in the latter, either image resize or feature resize. Plots for cascade classifiers are generated based on confidence scores instead of on variation of the number of layers. To assess a method's overall performance on a given test, we use the average log miss rate (ALMR) as an aggregate performance score. To analyze the significance of the obtained results, we conduct 10-fold cross-validation experiments. We applied our evaluation framework to two state-of-the-art cascade-based detectors on the standard INRIA Person dataset and a local dataset of near-infrared images. We used our evaluation framework to study the differences between the two detectors on the two datasets with different evaluation methods. Our results show the utility of our framework. They also suggest that the descriptors used to represent features and the training window size are more important in predicting the detection performance than the nature of the imaging process, and that the choice between resizing images or features can have serious consequences. VL - 10 CP - 3 ER - TY - CONF T1 - Plane-wave decomposition of a sound scene using a cylindrical microphone array T2 - Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on Y1 - 2009 A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani KW - array;plane KW - arrays; KW - baffle;cylindrical KW - beamforming;cylindrical KW - decomposition;sound-hard KW - localization;array KW - microphone KW - plane-wave KW - processing;microphone KW - scene KW - signal KW - spherical;source KW - waves;sound AB - The analysis for microphone arrays formed by mounting microphones on a sound-hard spherical or cylindrical baffle is typically performed using a decomposition of the sound field in terms of orthogonal basis functions. 
An alternative representation in terms of plane waves and a method for obtaining the coefficients of such a representation directly from measurements was proposed recently for the case of a spherical array. It was shown that representing the field as a collection of plane waves arriving from various directions simplifies both source localization and beamforming. In this paper, these results are extended to the case of the cylindrical array. Similarly to the spherical array case, localization and beamforming based on plane-wave decomposition perform as well as the traditional orthogonal function based methods while being numerically more stable. Both simulated and experimental results are presented. JA - Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on M3 - 10.1109/ICASSP.2009.4959526 ER - TY - JOUR T1 - PrDB: managing and exploiting rich correlations in probabilistic databases JF - The VLDB Journal Y1 - 2009 A1 - Sen,Prithviraj A1 - Deshpande, Amol A1 - Getoor, Lise AB - Due to numerous applications producing noisy data, e.g., sensor data, experimental data, data from uncurated sources, information extraction, etc., there has been a surge of interest in the development of probabilistic databases. Most probabilistic database models proposed to date, however, fail to meet the challenges of real-world applications on two counts: (1) they often restrict the kinds of uncertainty that the user can represent; and (2) the query processing algorithms often cannot scale up to the needs of the application. In this work, we define a probabilistic database model, PrDB, that uses graphical models, a state-of-the-art probabilistic modeling technique developed within the statistics and machine learning community, to model uncertain data.
We show how this results in a rich, complex yet compact probabilistic database model, which can capture the commonly occurring uncertainty models (tuple uncertainty, attribute uncertainty), more complex models (correlated tuples and attributes) and allows compact representation (shared and schema-level correlations). In addition, we show how query evaluation in PrDB translates into inference in an appropriately augmented graphical model. This allows us to easily use any of a myriad of exact and approximate inference algorithms developed within the graphical modeling community. While probabilistic inference provides a generic approach to solving queries, we show how the use of shared correlations, together with a novel inference algorithm that we developed based on bisimulation, can speed query processing significantly. We present a comprehensive experimental evaluation of the proposed techniques and show that even with a few shared correlations, significant speedups are possible. VL - 18 SN - 1066-8888 UR - http://dx.doi.org/10.1007/s00778-009-0153-2 CP - 5 ER - TY - CHAP T1 - PrDB: Managing Large-Scale Correlated Probabilistic Databases (Abstract) T2 - Scalable Uncertainty Management Y1 - 2009 A1 - Deshpande, Amol ED - Godo,Lluís ED - Pugliese,Andrea AB - Increasing numbers of real-world application domains are generating data that is inherently noisy, incomplete, and probabilistic in nature. Statistical inference and probabilistic modeling often introduce another layer of uncertainty on top of that. Examples of such data include measurement data collected by sensor networks, observation data in the context of social networks, scientific and biomedical data, and data collected by various online cyber-sources. Over the last few years, numerous approaches have been proposed, and several systems built, to integrate uncertainty into databases.
However, these approaches typically make simplistic and restrictive assumptions concerning the types of uncertainties that can be represented. Most importantly, they often make highly restrictive independence assumptions, and cannot easily model rich correlations among the tuples or attribute values. Furthermore, they typically lack support for specifying uncertainties at different levels of abstractions, needed to handle large-scale uncertain datasets. JA - Scalable Uncertainty Management T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 5785 SN - 978-3-642-04387-1 UR - http://dx.doi.org/10.1007/978-3-642-04388-8_1 ER - TY - CONF T1 - Predicting and Controlling System-Level Parameters of Multi-Agent Systems T2 - 2009 AAAI Fall Symposium Series Y1 - 2009 A1 - Miner,Don A1 - desJardins, Marie KW - Complex Adaptive Systems KW - System-level behavior AB - Boid flocking is a system in which several individual agents follow three simple rules to generate swarm-level flocking behavior. To control this system, the user must adjust the agent program parameters, which indirectly modifies the flocking behavior. This is unintuitive because the properties of the flocking behavior are non-explicit in the agent program. In this paper, we discuss a domain-independent approach for detecting and controlling two emergent properties of boids: density and a qualitative threshold effect of swarming vs. flocking. Also, we discuss the possibility of applying this approach to detecting and controlling traffic jams in traffic simulations. JA - 2009 AAAI Fall Symposium Series UR - http://www.aaai.org/ocs/index.php/FSS/FSS09/paper/viewPaper/909 ER - TY - JOUR T1 - Probabilistic fusion-based parameter estimation for visual tracking JF - Computer Vision and Image Understanding Y1 - 2009 A1 - Han,Bohyung A1 - Davis, Larry S.
KW - Component-based tracking KW - Density-based fusion KW - Mean-shift KW - visual tracking AB - In object tracking, visual features may not be discriminative enough to estimate high dimensional motion parameters accurately, and complex motion estimation is computationally expensive due to a large search space. To tackle these problems, a reasonable strategy is to track small components within the target independently in lower dimensional motion parameter spaces (e.g., translation only) and then estimate the overall high dimensional motion (e.g., translation, scale and rotation) by statistically integrating the individual tracking results. Although tracking each component in a lower dimensional space is more reliable and faster, it is not trivial to combine the local motion information and estimate global parameters in a robust way because the individual component motions are frequently inconsistent. We propose a robust fusion algorithm to estimate the complex motion parameters using variable-bandwidth mean-shift. By employing correlation-based uncertainty modeling and fusion of individual components, the motion parameter that is robust to outliers can be detected with the variable-bandwidth density-based fusion (VBDF) algorithm. In addition, we describe a method to update the target appearance model for each component adaptively based on the component motion consistency. We present various tracking results and compare the performance of our algorithm with others using real video sequences. VL - 113 SN - 1077-3142 UR - http://www.sciencedirect.com/science/article/pii/S1077314208001872 CP - 4 M3 - 10.1016/j.cviu.2008.11.003 ER - TY - JOUR T1 - Probabilistic graphical models JF - IEEE Transactions on Pattern Analysis and Machine Intelligence Y1 - 2009 A1 - Gupta,A. A1 - Kembhavi,A. A1 - Davis, Larry S.
VL - 31 CP - 10 ER - TY - CONF T1 - Recognizing actions by shape-motion prototype trees T2 - Computer Vision, 2009 IEEE 12th International Conference on Y1 - 2009 A1 - Zhe Lin A1 - Zhuolin Jiang A1 - Davis, Larry S. AB - A prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, first, an action prototype tree is learned in a joint shape and motion space via hierarchical k-means clustering; then a lookup table of prototype-to-prototype distances is generated. During testing, based on a joint likelihood model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint likelihood, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance matrices used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in very challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 91.07% on a large gesture dataset (with dynamic backgrounds), 100% on the Weizmann action dataset and 95.77% on the KTH action dataset. JA - Computer Vision, 2009 IEEE 12th International Conference on M3 - 10.1109/ICCV.2009.5459184 ER - TY - CONF T1 - Regularized HRTF fitting using spherical harmonics T2 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09 Y1 - 2009 A1 - Zotkin,Dmitry N A1 - Duraiswami, Ramani A1 - Gumerov, Nail A. 
KW - Acoustic applications KW - acoustic field KW - Acoustic fields KW - acoustic intensity measurement KW - Acoustic measurements KW - acoustic signal processing KW - Acoustic testing KW - acoustic waves KW - array signal processing KW - audio acoustics KW - circular arrays KW - computational analysis KW - Ear KW - ear location KW - head-related transfer function KW - Helmholtz reciprocity principle KW - HRTF KW - HRTF fitting KW - Loudspeakers KW - Microphones KW - Position measurement KW - signal reconstruction KW - spatial audio KW - spectral reconstruction KW - spherical harmonics KW - Transfer functions AB - By the Helmholtz reciprocity principle, the head-related transfer function (HRTF) is equivalent to an acoustic field created by a transmitter placed at the ear location. Therefore, it can be represented as a spherical harmonics spectrum - a weighted sum of spherical harmonics. Such representations are useful in theoretical and computational analysis. Many different (often severely undersampled) grids are used for HRTF measurement, making the spectral reconstruction difficult. In this paper, two methods of obtaining the spectrum are presented and analyzed both on synthetic (ground-truth data available) and real HRTF measurements. JA - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA '09 PB - IEEE SN - 978-1-4244-3678-1 M3 - 10.1109/ASPAA.2009.5346521 ER - TY - CONF T1 - Rigorous Probabilistic Trust-Inference with Applications to Clustering T2 - IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies, 2009. 
WI-IAT '09 Y1 - 2009 A1 - DuBois,Thomas A1 - Golbeck,Jennifer A1 - Srinivasan, Aravind KW - Clustering algorithms KW - Conferences KW - Educational institutions KW - Extraterrestrial measurements KW - Inference algorithms KW - Intelligent agent KW - random graphs KW - Social network services KW - trust inference KW - Visualization KW - Voting KW - Web sites AB - The World Wide Web has transformed into an environment where users both produce and consume information. In order to judge the validity of information, it is important to know how trustworthy its creator is. Since no individual can have direct knowledge of more than a small fraction of information authors, methods for inferring trust are needed. We propose a new trust inference scheme based on the idea that a trust network can be viewed as a random graph, and a chain of trust as a path in that graph. In addition to having an intuitive interpretation, our algorithm has several advantages, noteworthy among which is the creation of an inferred trust-metric space where the shorter the distance between two people, the higher their trust. Metric spaces have rigorous algorithms for clustering, visualization, and related problems, any of which is directly applicable to our results. JA - IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09 PB - IEEE VL - 1 SN - 978-0-7695-3801-3 M3 - 10.1109/WI-IAT.2009.109 ER - TY - JOUR T1 - Robust human detection under occlusion by integrating face and person detectors JF - Proceedings of the Third International Conference on Advances in Biometrics Y1 - 2009 A1 - Schwartz, W.R. A1 - Gopalan,R. A1 - Chellappa, Rama A1 - Davis, Larry S. AB - Human detection under occlusion is a challenging problem in computer vision. We address this problem through a framework which integrates face detection and person detection.
We first investigate how the response of a face detector is correlated with the response of a person detector. From these observations, we formulate hypotheses that capture the intuitive feedback between the responses of face and person detectors and use it to verify if the individual detectors’ outputs are true or false. We illustrate the performance of our integration framework on challenging images that have a considerable amount of occlusion, and demonstrate its advantages over individual face and person detectors. ER - TY - JOUR T1 - Scaling kernel machine learning algorithm via the use of GPUs JF - GPU Technology Conference Y1 - 2009 A1 - Srinivasan,B.V. A1 - Duraiswami, Ramani ER - TY - JOUR T1 - Scene it or not? incremental multiple kernel learning for object detection JF - Proceedings of the International Conference on Computer Vision Y1 - 2009 A1 - Kembhavi,A. A1 - Siddiquie,B. A1 - Miezianko,R. A1 - McCloskey,S. A1 - Davis, Larry S. A1 - Schwartz, W.R. A1 - Harwood,D. A1 - Gupta,A. A1 - Farrell,R. A1 - Luo,Y. A1 - others ER - TY - CONF T1 - Second ACM Workshop on Hot Topics in Software Upgrades (HotSWUp 2009) Y1 - 2009 A1 - Tudor Dumitras A1 - Neamtiu, Iulian A1 - Tilevich, Eli KW - software KW - upgrades AB - The goal of HotSWUp is to identify cutting-edge research ideas for implementing software upgrades. In the presence of modified user requirements, deployment environments, and bug fixes, actively-used software must be upgraded continuously to ensure its utility and safety. The upgrades incorporate changes to the structure, behavior, configuration, data, or topology of a computer system. Whether applied offline or directly to a live system, such upgrades may have a significant impact on the performance and reliability of the software. Indeed, recent studies and a large body of anecdotal evidence suggest that, in practice, upgrades are failure-prone and can lead to outages, data corruption or latent errors.
These problems not only inconvenience end users - they create a significant burden for organizations due to the associated downtime and high administrative costs. T3 - OOPSLA '09 PB - ACM SN - 978-1-60558-768-4 UR - http://doi.acm.org/10.1145/1639950.1639974 ER - TY - JOUR T1 - Segmentation using appearance of mesostructure roughness JF - International journal of computer vision Y1 - 2009 A1 - Yacoob,Yaser A1 - Davis, Larry S. AB - This paper introduces mesostructure roughness as an effective cue in image segmentation. Mesostructure roughness corresponds to small-scale bumps on the macrostructure (i.e., geometry) of objects. Specifically, the focus is on the texture that is created by the projection of the mesostructure roughness on the camera plane. Three intrinsic images are derived: reflectance, smooth-surface shading and mesostructure roughness shading (meta-texture images). A constructive approach is proposed for computing a meta-texture image by preserving, equalizing and enhancing the underlying surface-roughness across color, brightness and illumination variations. We evaluate the performance on sample images and illustrate quantitatively that different patches of the same material, in an image, are normalized in their statistics despite variations in color, brightness and illumination. We develop an algorithm for segmentation of an image into patches that share salient mesostructure roughness. Finally, segmentation by line-based boundary-detection is proposed and results are provided and compared to known algorithms. VL - 83 CP - 3 ER - TY - JOUR T1 - Semantically informed machine translation (SIMT) JF - SCALE summer workshop final report, Human Language Technology Center Of Excellence Y1 - 2009 A1 - Baker,K. A1 - Bethard,S. A1 - Bloodgood,M. A1 - Brown,R. A1 - Callison-Burch,C. A1 - Coppersmith,G. A1 - Dorr, Bonnie J A1 - Filardo,W. A1 - Giles,K. A1 - Irvine,A. 
A1 - others ER - TY - JOUR T1 - Semantic-based segmentation and annotation of 3D models JF - Image Analysis and Processing–ICIAP 2009 Y1 - 2009 A1 - Papaleo,L. A1 - De Floriani, Leila AB - 3D objects have become widely available and used in different application domains. Thus, it is becoming fundamental to use, integrate and develop techniques for extracting and maintaining their embedded knowledge. These techniques should be encapsulated in portable and intelligent systems able to semantically annotate the 3D object models in order to improve their usability and indexing, especially in innovative web cooperative environments. Lately, we are moving in this direction, with the definition and development of data structures, methods and interfaces for structuring and semantically annotating 3D complex models (and scenes) - even changing in time - according to ontology-driven metadata and following ontology-driven processes. Here, we concentrate on the tools for segmenting manifold 3D models and on the underlying structural representation that we build and manipulate. We also describe the first prototype of an annotation tool which allows a hierarchical semantic-driven tagging of the segmented model and provides an interface from which the user can inspect and browse the entire segmentation graph. M3 - 10.1007/978-3-642-04146-4_13 ER - TY - CONF T1 - Semi-supervised or semi-unsupervised? T2 - Proceedings of the NAACL HLT Workshop on Semisupervised Learning for Natural Language Processing Y1 - 2009 A1 - Daumé, Hal JA - Proceedings of the NAACL HLT Workshop on Semisupervised Learning for Natural Language Processing ER - TY - RPRT T1 - A set of tools for Representing, Decomposing and Visualizing non manifold Cellular Complexes Y1 - 2009 A1 - De Floriani, Leila A1 - Panozzo,D. A1 - Hui,A. AB - Modeling and understanding complex non-manifold shapes is a key issue in shape analysis and retrieval.
The topological structure of a non-manifold shape can be analyzed through its decomposition into a collection of components with a simpler topology. Here, we consider a decomposition of a non-manifold shape into components which are almost manifolds, and we present a novel graph representation which highlights the non-manifold singularities shared by the components as well as their connectivity relations. We describe an algorithm for computing the decomposition and its associated graph representation. We present a new tool for visualizing the shape decomposition and its graph as an effective support to modeling, analyzing and understanding non-manifold shapes. We describe a new data structure for non-manifold simplicial complexes that we used in our decomposition software and we provide a complete description of all functionalities of the library we developed. PB - Department of Computer Science and Information Science, University of Genoa VL - DISI-TR-09-06 ER - TY - CONF T1 - Set-Based Boosting for Instance-Level Transfer T2 - Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on Y1 - 2009 A1 - Eaton,Eric A1 - desJardins, Marie AB - The success of transfer to improve learning on a target task is highly dependent on the selected source data. Instance-based transfer methods reuse data from the source tasks to augment the training data for the target task. If poorly chosen, this source data may inhibit learning, resulting in negative transfer. The current best performing algorithm for instance-based transfer, TrAdaBoost, performs poorly when given irrelevant source data. We present a novel set-based boosting technique for instance-based transfer. The proposed algorithm, TransferBoost, boosts both individual instances and collective sets of instances from each source task.
In effect, TransferBoost boosts each source task, assigning higher weight to those source tasks which show positive transferability to the target task, and then adjusts the weights of the instances within each source task via AdaBoost. The results demonstrate that TransferBoost significantly improves transfer performance over existing instance-based algorithms when given a mix of relevant and irrelevant source data. JA - Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on PB - IEEE SN - 978-1-4244-5384-9 UR - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5360442&tag=1 M3 - 10.1109/ICDMW.2009.97 ER - TY - JOUR T1 - Signature Detection and Matching for Document Image Retrieval JF - IEEE Transactions on Pattern Analysis and Machine Intelligence Y1 - 2009 A1 - Zhu,Guangyu A1 - Yefeng Zheng A1 - David Doermann A1 - Jaeger,Stefan AB - As one of the most pervasive methods of individual identification and document authentication, signatures present convincing evidence and provide an important form of indexing for effective document image processing and retrieval in a broad range of applications. However, detection and segmentation of free-form objects such as signatures from cluttered background is currently an open document analysis problem. In this paper, we focus on two fundamental problems in signature-based document image retrieval. First, we propose a novel multi-scale approach to jointly detecting and segmenting signatures from document images. Rather than focusing on local features that typically have large variations, our approach captures the structural saliency using a signature production model and computes the dynamic curvature of 2-D contour fragments over multiple scales. This detection framework is general and computationally tractable. Second, we treat the problem of signature retrieval in the unconstrained setting of translation, scale, and rotation invariant non-rigid shape matching.
We propose two novel measures of shape dissimilarity based on anisotropic scaling and registration residual error, and present a supervised learning framework for combining complementary shape information from different dissimilarity metrics using LDA. We quantitatively study state-of-the-art shape representations, shape matching algorithms, measures of dissimilarity, and the use of multiple instances as query in document image retrieval. We further demonstrate our matching techniques in off-line signature verification. Extensive experiments using large real world collections of English and Arabic machine printed and handwritten documents demonstrate the excellent performance of our approaches. VL - 31 CP - 11 ER - TY - CHAP T1 - Simple, Black-Box Constructions of Adaptively Secure Protocols T2 - Theory of Cryptography Y1 - 2009 A1 - Choi, Seung Geol A1 - Dana Dachman-Soled A1 - Malkin, Tal A1 - Wee, Hoeteck ED - Reingold, Omer KW - Algorithm Analysis and Problem Complexity KW - computers and society KW - Data Encryption KW - Discrete Mathematics in Computer Science KW - Management of Computing and Information Systems KW - Systems and Data Security AB - We present a compiler for transforming an oblivious transfer (OT) protocol secure against an adaptive semi-honest adversary into one that is secure against an adaptive malicious adversary. Our compiler achieves security in the universal composability framework, assuming access to an ideal commitment functionality, and improves over previous work achieving the same security guarantee in two ways: it uses black-box access to the underlying protocol and achieves a constant multiplicative overhead in the round complexity. As a corollary, we obtain the first constructions of adaptively secure protocols in the stand-alone model using black-box access to a low-level primitive. 
JA - Theory of Cryptography T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-642-00456-8, 978-3-642-00457-5 UR - http://link.springer.com/chapter/10.1007/978-3-642-00457-5_23 ER - TY - CONF T1 - Streamed learning: one-pass SVMs T2 - Proceedings of the 21st International Joint Conference on Artificial Intelligence Y1 - 2009 A1 - Rai,P. A1 - Daumé, Hal A1 - Venkatasubramanian,S. JA - Proceedings of the 21st International Joint Conference on Artificial Intelligence ER - TY - CONF T1 - Streaming for large scale NLP: Language modeling T2 - Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics Y1 - 2009 A1 - Goyal,A. A1 - Daumé, Hal A1 - Venkatasubramanian,S. JA - Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics ER - TY - JOUR T1 - Supercubes: A High-Level Primitive for Diamond Hierarchies JF - Visualization and Computer Graphics, IEEE Transactions on Y1 - 2009 A1 - Weiss,K. A1 - De Floriani, Leila KW - computational geometry KW - mesh generation KW - rendering (computer graphics) KW - volumetric datasets KW - multiresolution representations KW - nested decomposition KW - nested tetrahedral meshes KW - polyhedral meshes KW - longest edge bisection rule KW - highly adaptive crack-free representations KW - diamond hierarchies KW - edge sharing KW - supercubes AB - Volumetric datasets are often modeled using a multiresolution approach based on a nested decomposition of the domain into a polyhedral mesh. Nested tetrahedral meshes generated through the longest edge bisection rule are commonly used to decompose regular volumetric datasets since they produce highly adaptive crack-free representations.
Efficient representations for such models have been achieved by clustering the set of tetrahedra sharing a common longest edge into a structure called a diamond. The alignment and orientation of the longest edge can be used to implicitly determine the geometry of a diamond and its relations to the other diamonds within the hierarchy. We introduce the supercube as a high-level primitive within such meshes that encompasses all unique types of diamonds. A supercube is a coherent set of edges corresponding to three consecutive levels of subdivision. Diamonds are uniquely characterized by the longest edge of the tetrahedra forming them and are clustered in supercubes through the association of the longest edge of a diamond with a unique edge in a supercube. Supercubes are thus a compact and highly efficient means of associating information with a subset of the vertices, edges and tetrahedra of the meshes generated through longest edge bisection. We demonstrate the effectiveness of the supercube representation when encoding multiresolution diamond hierarchies built on a subset of the points of a regular grid. We also show how supercubes can be used to efficiently extract meshes from diamond hierarchies and to reduce the storage requirements of such variable-resolution meshes. VL - 15 SN - 1077-2626 CP - 6 M3 - 10.1109/TVCG.2009.186 ER - TY - JOUR T1 - Symbolic-to-statistical hybridization: extending generation-heavy machine translation JF - Machine Translation Y1 - 2009 A1 - Habash,Nizar A1 - Dorr, Bonnie J A1 - Monz,Christof AB - The last few years have witnessed an increasing interest in hybridizing surface-based statistical approaches and rule-based symbolic approaches to machine translation (MT). Much of that work is focused on extending statistical MT systems with symbolic knowledge and components. In the brand of hybridization discussed here, we go in the opposite direction: adding statistical bilingual components to a symbolic system. 
Our base system is Generation-heavy machine translation (GHMT), a primarily symbolic asymmetrical approach that addresses the issue of Interlingual MT resource poverty in source-poor/target-rich language pairs by exploiting symbolic and statistical target-language resources. GHMT’s statistical components are limited to target-language models, which arguably makes it a simple form of a hybrid system. We extend the hybrid nature of GHMT by adding statistical bilingual components. We also describe the details of retargeting it to Arabic–English MT. The morphological richness of Arabic brings several challenges to the hybridization task. We conduct an extensive evaluation of multiple system variants. Our evaluation shows that this new variant of GHMT—a primarily symbolic system extended with monolingual and bilingual statistical components—has a higher degree of grammaticality than a phrase-based statistical MT system, where grammaticality is measured in terms of correct verb-argument realization and long-distance dependency translation. VL - 23 SN - 0922-6567 UR - http://dx.doi.org/10.1007/s10590-009-9056-7 CP - 1 ER - TY - PAT T1 - System and method for fast illumination-invariant background subtraction ... Y1 - 2009 A1 - Lim,Ser-Nam A1 - Mittal,Anurag A1 - Davis, Larry S. ED - Siemens Corporate Research, Inc.
AB - A method for eliminating errors in foreground object detection in digitized images comprises providing a reference camera and a secondary camera, vertically aligning each said camera with a baseline that is approximately perpendicular to a ground plane, wherein said reference camera is placed lower than said secondary camera, selecting a foreground pixel in a reference view of a first point in a foreground object, finding a conjugate pixel of the foreground pixel in a secondary view, using the foreground and conjugate pixels to determine an image base pixel of a base point in the reference view, wherein said base point is a point on the ground plane below the first point, and using the foreground and image base pixels to find a location where the ground plane is first visible. VL - 11/282,513 UR - http://www.google.com/patents?id=8yK_AAAAEBAJ CP - 7512250 ER - TY - JOUR T1 - TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate JF - Machine Translation Y1 - 2009 A1 - Snover,Matthew A1 - Madnani,Nitin A1 - Dorr, Bonnie J A1 - Schwartz,Richard AB - This paper describes a new evaluation metric, TER-Plus (TERp) for automatic evaluation of machine translation (MT). TERp is an extension of Translation Edit Rate (TER). It builds on the success of TER as an evaluation metric and alignment tool and addresses several of its weaknesses through the use of paraphrases, stemming, synonyms, as well as edit costs that can be automatically optimized to correlate better with various types of human judgments. We present a correlation study comparing TERp to BLEU, METEOR and TER, and illustrate that TERp can better evaluate translation adequacy. VL - 23 SN - 0922-6567 UR - http://dx.doi.org/10.1007/s10590-009-9062-9 CP - 2 ER - TY - JOUR T1 - Three genomes from the phylum Acidobacteria provide insight into the lifestyles of these microorganisms in soils. 
JF - Appl Environ Microbiol Y1 - 2009 A1 - Ward, Naomi L A1 - Challacombe, Jean F A1 - Janssen, Peter H A1 - Henrissat, Bernard A1 - Coutinho, Pedro M A1 - Wu, Martin A1 - Xie, Gary A1 - Haft, Daniel H A1 - Sait, Michelle A1 - Badger, Jonathan A1 - Barabote, Ravi D A1 - Bradley, Brent A1 - Brettin, Thomas S A1 - Brinkac, Lauren M A1 - Bruce, David A1 - Creasy, Todd A1 - Daugherty, Sean C A1 - Davidsen, Tanja M A1 - DeBoy, Robert T A1 - Detter, J Chris A1 - Dodson, Robert J A1 - Durkin, A Scott A1 - Ganapathy, Anuradha A1 - Gwinn-Giglio, Michelle A1 - Han, Cliff S A1 - Khouri, Hoda A1 - Kiss, Hajnalka A1 - Kothari, Sagar P A1 - Madupu, Ramana A1 - Nelson, Karen E A1 - Nelson, William C A1 - Paulsen, Ian A1 - Penn, Kevin A1 - Ren, Qinghu A1 - Rosovitz, M J A1 - Jeremy D Selengut A1 - Shrivastava, Susmita A1 - Sullivan, Steven A A1 - Tapia, Roxanne A1 - Thompson, L Sue A1 - Watkins, Kisha L A1 - Yang, Qi A1 - Yu, Chunhui A1 - Zafar, Nikhat A1 - Zhou, Liwei A1 - Kuske, Cheryl R KW - Anti-Bacterial Agents KW - bacteria KW - Biological Transport KW - Carbohydrate Metabolism KW - Cyanobacteria KW - DNA, Bacterial KW - Fungi KW - Genome, Bacterial KW - Macrolides KW - Molecular Sequence Data KW - Nitrogen KW - Phylogeny KW - Proteobacteria KW - Sequence Analysis, DNA KW - sequence homology KW - Soil Microbiology AB -
The complete genomes of three strains from the phylum Acidobacteria were compared. Phylogenetic analysis placed them as a unique phylum. They share genomic traits with members of the Proteobacteria, the Cyanobacteria, and the Fungi. The three strains appear to be versatile heterotrophs. Genomic and culture traits indicate the use of carbon sources that span simple sugars to more complex substrates such as hemicellulose, cellulose, and chitin. The genomes encode low-specificity major facilitator superfamily transporters and high-affinity ABC transporters for sugars, suggesting that they are best suited to low-nutrient conditions. They appear capable of nitrate and nitrite reduction but not N₂ fixation or denitrification. The genomes contained numerous genes that encode siderophore receptors, but no evidence of siderophore production was found, suggesting that they may obtain iron via interaction with other microorganisms. The presence of cellulose synthesis genes and a large class of novel high-molecular-weight excreted proteins suggests potential traits for desiccation resistance, biofilm formation, and/or contribution to soil structure. Polyketide synthase and macrolide glycosylation genes suggest the production of novel antimicrobial compounds. Genes that encode a variety of novel proteins were also identified. The abundance of acidobacteria in soils worldwide and the breadth of potential carbon use by the sequenced strains suggest significant and previously unrecognized contributions to the terrestrial carbon cycle. Combining our genomic evidence with available culture traits, we postulate that cells of these isolates are long-lived, divide slowly, exhibit slow metabolic rates under low-nutrient conditions, and are well equipped to tolerate fluctuations in soil hydration.
VL - 75 CP - 7 M3 - 10.1128/AEM.02294-08 ER - TY - JOUR T1 - Toward reconstructing the evolution of advanced moths and butterflies (Lepidoptera: Ditrysia): an initial molecular study JF - BMC Evol Biol Y1 - 2009 A1 - Regier,J. C A1 - Zwick,A. A1 - Cummings, Michael P. A1 - Kawahara,A. Y A1 - Cho,S. A1 - Weller,S. A1 - Roe,A. A1 - Baixeras,J. A1 - Brown,J. W A1 - Parr,C. A1 - Davis,DR A1 - Epstein,M A1 - Hallwachs,W A1 - Hausmann,A A1 - Janzen,DH A1 - Kitching,IJ A1 - Solis,MA A1 - Yen,S-H A1 - Bazinet,A. L A1 - Mitter,C AB - BACKGROUND: In the mega-diverse insect order Lepidoptera (butterflies and moths; 165,000 described species), deeper relationships are little understood within the clade Ditrysia, to which 98% of the species belong. To begin addressing this problem, we tested the ability of five protein-coding nuclear genes (6.7 kb total), and character subsets therein, to resolve relationships among 123 species representing 27 (of 33) superfamilies and 55 (of 100) families of Ditrysia under maximum likelihood analysis. RESULTS: Our trees show broad concordance with previous morphological hypotheses of ditrysian phylogeny, although most relationships among superfamilies are weakly supported. There are also notable surprises, such as a consistently closer relationship of Pyraloidea than of butterflies to most Macrolepidoptera. Monophyly is significantly rejected by one or more character sets for the putative clades Macrolepidoptera as currently defined (P < 0.05) and Macrolepidoptera excluding Noctuoidea and Bombycoidea sensu lato (P ≤ 0.005), and nearly so for the superfamily Drepanoidea as currently defined (P < 0.08). Superfamilies are typically recovered or nearly so, but usually without strong support. Relationships within superfamilies and families, however, are often robustly resolved.
We provide some of the first strong molecular evidence on deeper splits within Pyraloidea, Tortricoidea, Geometroidea, Noctuoidea and others. Separate analyses of mostly synonymous versus non-synonymous character sets revealed notable differences (though not strong conflict), including a marked influence of compositional heterogeneity on apparent signal in the third codon position (nt3). As available model partitioning methods cannot correct for this variation, we assessed overall phylogeny resolution through separate examination of trees from each character set. Exploration of "tree space" with GARLI, using grid computing, showed that hundreds of searches are typically needed to find the best-feasible phylogeny estimate for these data. CONCLUSION: Our results (a) corroborate the broad outlines of the current working phylogenetic hypothesis for Ditrysia, (b) demonstrate that some prominent features of that hypothesis, including the position of the butterflies, need revision, and (c) resolve the majority of family and subfamily relationships within superfamilies as thus far sampled. Much further gene and taxon sampling will be needed, however, to strongly resolve individual deeper nodes. VL - 9 M3 - 10.1186/1471-2148-9-280 ER - TY - CONF T1 - Toward Upgrades-as-a-service in Distributed Systems T2 - Middleware'09 Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware Y1 - 2009 A1 - Tudor Dumitras A1 - Narasimhan, Priya AB - Unavailability in distributed enterprise systems is usually the result of planned events, such as upgrades, rather than failures. Major system upgrades entail complex data conversions that are difficult to perform on the fly, in the face of live workloads. Minimizing the downtime imposed by such conversions is a time-intensive and error-prone manual process.
We propose upgrades-as-a-service, a novel approach that can eliminate all the causes of planned downtime recorded during the upgrade history of one of the ten most popular websites. Building on the lessons learned from past research on live upgrades in middleware systems, upgrades-as-a-service trades off the need for additional hardware resources during the upgrade for the ability to perform end-to-end upgrades online, with minimal application-specific knowledge. JA - Middleware'09 Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware T3 - Middleware '09 PB - Springer-Verlag New York, Inc. UR - http://dl.acm.org/citation.cfm?id=1656980.1657019 ER - TY - JOUR T1 - Tree-Based Encoding for Cancellations on Morse Complexes JF - Combinatorial Image Analysis Y1 - 2009 A1 - Čomić,L. A1 - De Floriani, Leila AB - A scalar function f, defined on a manifold M, can be simplified by applying a sequence of removal and contraction operators, which eliminate its critical points in pairs, and simplify the topological representation of M, provided by Morse complexes of f. The inverse refinement operators, together with a dependency relation between them, enable a construction of a multi-resolution representation of such complexes. Here, we encode a sequence of simplification operators in a data structure called an augmented cancellation forest, which will enable procedural encoding of the inverse refinement operators, and reduce the dependency relation between modifications of the Morse complexes. In this way, this representation will induce a high flexibility of the hierarchical representation of the Morse complexes, producing a large number of Morse complexes at different resolutions that can be obtained from the hierarchy.
M3 - 10.1007/978-3-642-10210-3_26 ER - TY - CONF T1 - UbiGreen: investigating a mobile tool for tracking and supporting green transportation habits T2 - Proceedings of the 27th international conference on Human factors in computing systems Y1 - 2009 A1 - Jon Froehlich A1 - Dillahunt,Tawanna A1 - Klasnja,Predrag A1 - Mankoff,Jennifer A1 - Consolvo,Sunny A1 - Harrison,Beverly A1 - Landay,James A. KW - ambient displays KW - mobile phones KW - sensing KW - Sustainability KW - transportation KW - ubicomp JA - Proceedings of the 27th international conference on Human factors in computing systems T3 - CHI '09 PB - ACM CY - New York, NY, USA SN - 978-1-60558-246-7 UR - http://doi.acm.org/10.1145/1518701.1518861 M3 - 10.1145/1518701.1518861 ER - TY - RPRT T1 - Understanding the design tradeoffs for cooperative streaming multicast Y1 - 2009 A1 - Nandi,A. A1 - Bhattacharjee, Bobby A1 - Druschel,P. AB - Video streaming over the Internet is rapidly increasing in popularity, but the availability and quality of the content is limited by the high bandwidth cost for server-based solutions. Cooperative end-system multicast (CEM) has emerged as a promising paradigm for content distribution in the Internet, because the bandwidth overhead of disseminating content is shared among the participants of the CEM overlay network. Several CEM systems have been proposed and deployed, but the tradeoffs inherent in the different designs are not well understood. In this work, we provide a common framework in which different CEM design choices can be empirically and systematically evaluated. Our results show that all CEM protocols are inherently limited in certain aspects of their performance. We distill our observations into a novel model that explains the inherent tradeoffs of CEM design choices and provides bounds on the practical performance limits of any future CEM protocol.
In particular, the model conjectures that no CEM design can simultaneously achieve all three of low overhead, low lag, and high streaming quality. PB - Technical report MPI-SWS-2009-002, Max Planck Institute for Software Systems ER - TY - CONF T1 - Understanding videos, constructing plots: learning a visually grounded storyline model from annotated videos T2 - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on Y1 - 2009 A1 - Gupta,A. A1 - Srinivasan,P. A1 - Shi,Jianbo A1 - Davis, Larry S. KW - learning (artificial intelligence) KW - graph theory KW - image representation KW - integer programming KW - spatio-temporal phenomena KW - human action recognition KW - activity analysis KW - video annotation KW - video understanding KW - semantic meaning KW - AND-OR graph KW - visually grounded storyline model AB - Analyzing videos of human activities involves not only recognizing actions (typically based on their appearances), but also determining the story/plot of the video. The storyline of a video describes causal relationships between actions. Beyond recognition of individual actions, discovering causal relationships helps to better understand the semantic meaning of the activities. We present an approach to learn a visually grounded storyline model of videos directly from weakly labeled data. The storyline model is represented as an AND-OR graph, a structure that can compactly encode storyline variation across videos. The edges in the AND-OR graph correspond to causal relationships which are represented in terms of spatio-temporal constraints. We formulate an Integer Programming framework for action recognition and storyline extraction using the storyline model and visual groundings learned from training data.
JA - Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on M3 - 10.1109/CVPR.2009.5206492 ER - TY - JOUR T1 - A unified approach to ranking in probabilistic databases JF - Proceedings of the VLDB Endowment Y1 - 2009 A1 - Li,Jian A1 - Saha,Barna A1 - Deshpande, Amol AB - The dramatic growth in the number of application domains that naturally generate probabilistic, uncertain data has resulted in a need for efficiently supporting complex querying and decision-making over such data. In this paper, we present a unified approach to ranking and top-k query processing in probabilistic databases by viewing it as a multi-criteria optimization problem, and by deriving a set of features that capture the key properties of a probabilistic dataset that dictate the ranked result. We contend that a single, specific ranking function may not suffice for probabilistic databases, and we instead propose two parameterized ranking functions, called PRFω and PRFe, that generalize or can approximate many of the previously proposed ranking functions. We present novel generating functions-based algorithms for efficiently ranking large datasets according to these ranking functions, even if the datasets exhibit complex correlations modeled using probabilistic and/xor trees or Markov networks. We further propose that the parameters of the ranking function be learned from user preferences, and we develop an approach to learn those parameters. Finally, we present a comprehensive experimental study that illustrates the effectiveness of our parameterized ranking functions, especially PRFe, at approximating other ranking functions and the scalability of our proposed algorithms for exact or approximate ranking. 
VL - 2 SN - 2150-8097 UR - http://dl.acm.org/citation.cfm?id=1687627.1687685 CP - 1 ER - TY - CONF T1 - The University of Maryland statistical machine translation system for the Fourth Workshop on Machine Translation T2 - Proceedings of the Fourth Workshop on Statistical Machine Translation Y1 - 2009 A1 - Dyer,C. A1 - Setiawan,H. A1 - Marton,Y. A1 - Resnik, Philip JA - Proceedings of the Fourth Workshop on Statistical Machine Translation ER - TY - CONF T1 - Unsupervised search-based structured prediction T2 - Proceedings of the 26th Annual International Conference on Machine Learning Y1 - 2009 A1 - Daumé, Hal AB - We describe an adaptation and application of a search-based structured prediction algorithm "Searn" to unsupervised learning problems. We show that it is possible to reduce unsupervised learning to supervised learning and demonstrate a high-quality un-supervised shift-reduce parsing model. We additionally show a close connection between unsupervised Searn and expectation maximization. Finally, we demonstrate the efficacy of a semi-supervised extension. The key idea that enables this is an application of the predict-self idea for unsupervised learning. JA - Proceedings of the 26th Annual International Conference on Machine Learning T3 - ICML '09 PB - ACM CY - New York, NY, USA SN - 978-1-60558-516-1 UR - http://doi.acm.org/10.1145/1553374.1553401 M3 - 10.1145/1553374.1553401 ER - TY - CONF T1 - Using citations to generate surveys of scientific paradigms T2 - Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics Y1 - 2009 A1 - Mohammad,Saif A1 - Dorr, Bonnie J A1 - Egan,Melissa A1 - Hassan,Ahmed A1 - Muthukrishan,Pradeep A1 - Qazvinian,Vahed A1 - Radev,Dragomir A1 - Zajic, David AB - The number of research publications in various disciplines is growing exponentially. 
Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey. Specifically we explore the combination of citation information and summarization techniques. Even though prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that in the framework of multi-document survey creation, citation texts can play a crucial role. JA - Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics T3 - NAACL '09 PB - Association for Computational Linguistics CY - Stroudsburg, PA, USA SN - 978-1-932432-41-1 UR - http://dl.acm.org/citation.cfm?id=1620754.1620839 ER - TY - JOUR T1 - Using formal specifications to support testing JF - ACM Computing Surveys Y1 - 2009 A1 - Hierons,Robert M. A1 - Krause,Paul A1 - Lüttgen,Gerald A1 - Simons,Anthony J. H. A1 - Vilkomir,Sergiy A1 - Woodward,Martin R. A1 - Zedan,Hussein A1 - Bogdanov,Kirill A1 - Bowen,Jonathan P. A1 - Cleaveland, Rance A1 - Derrick,John A1 - Dick,Jeremy A1 - Gheorghe,Marian A1 - Harman,Mark A1 - Kapoor,Kalpesh VL - 41 SN - 03600300 UR - http://dl.acm.org/citation.cfm?id=1459352.1459354 M3 - 10.1145/1459352.1459354 ER - TY - JOUR T1 - Using Graphics Processors for High-Performance Computation and Visualization of Plasma Turbulence JF - Computing in Science Engineering Y1 - 2009 A1 - Stantchev,G. A1 - Juba,D. A1 - Dorland,W. 
A1 - Varshney, Amitabh KW - direct numerical simulation KW - plasma turbulence KW - graphics processing units KW - parallel processing KW - parallel computing KW - high-performance computation KW - data analysis KW - data visualisation KW - multiprocessing systems KW - single-program-multiple-data KW - nuclear engineering computing AB - Direct numerical simulation (DNS) of turbulence is computationally intensive and typically relies on some form of parallel processing. The authors present techniques to map DNS computations to modern graphics processing units (GPUs), which are characterized by very high memory bandwidth and hundreds of SPMD (single-program-multiple-data) processors. VL - 11 SN - 1521-9615 CP - 2 M3 - 10.1109/MCSE.2009.42 ER - TY - JOUR T1 - Video Compression and Retrieval of Moving Object Location Applied to Surveillance JF - Image Analysis and Recognition Y1 - 2009 A1 - Schwartz,W. A1 - Pedrini,H. A1 - Davis, Larry S. AB - A major problem in surveillance systems is the storage requirements for video archival; videos are recorded continuously for long periods of time, resulting in large amounts of data. Therefore, it is essential to apply efficient compression techniques. Additionally, it is useful to be able to index the archived videos based on events. In general, such events are defined by the interaction among moving objects in the scene. Consequently, besides data compression, efficient ways of storing moving objects should be considered. We present a method that exploits both temporal and spatial redundancy of videos captured from static cameras to perform compression and subsequently allows fast retrieval of moving object locations directly from the compressed data.
Experimental results show that the approach achieves high compression ratios compared to other existing video compression techniques without significant quality degradation and is fast due to the simplicity of the operations required for compression and decompression. ER - TY - PAT T1 - Visual Tag Y1 - 2009 A1 - Bretey,Steve A1 - Kuhnly,Keith A1 - Davis, Larry S. AB - An improved animal tag for application to an animal's ear which comprises a male portion that includes a flat segment and a projection segment, a female portion that includes a generally square or generally oval shaped flat segment and a raised segment with an aperture; wherein the projection segment of the male portion is adapted to be inserted through the animal's ear and into the aperture when pressure is applied thereto. VL - 12/533,231 UR - http://www.google.com/patents?id=ZXTTAAAAEBAJ ER - TY - JOUR T1 - Visual Tracking by Continuous Density Propagation in Sequential Bayesian Filtering Framework JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2009 A1 - Han,Bohyung A1 - Zhu,Ying A1 - Comaniciu, D. A1 - Davis, Larry S. KW - Pattern Recognition, Automated KW - Signal Processing, Computer-Assisted KW - Subtraction Technique KW - Reproducibility of Results KW - Sensitivity and Specificity KW - Monte Carlo approach KW - continuous density propagation KW - density approximation KW - density interpolation KW - non-Gaussian dynamic systems KW - nonlinear dynamic systems KW - particle filtering KW - probability density functions KW - sequential Bayesian filtering framework KW - video sequences AB - Particle filtering is frequently used for visual tracking problems since it provides a general framework for estimating and propagating probability density functions for nonlinear and non-Gaussian dynamic systems. However, this algorithm is based on a Monte Carlo approach and the cost of sampling and measurement is a problematic issue, especially for high-dimensional problems.
We describe an alternative to the classical particle filter in which the underlying density function has an analytic representation for better approximation and effective propagation. The techniques of density interpolation and density approximation are introduced to represent the likelihood and the posterior densities with Gaussian mixtures, where all relevant parameters are automatically determined. The proposed analytic approach is shown to perform more efficiently in sampling in high-dimensional space. We apply the algorithm to real-time tracking problems and demonstrate its performance on real video sequences as well as synthetic examples. VL - 31 SN - 0162-8828 CP - 5 M3 - 10.1109/TPAMI.2008.134 ER - TY - CONF T1 - Voronoi++: A Dynamic Page Segmentation approach based on Voronoi and Docstrum features T2 - International Conference on Document Analysis and Recognition (ICDAR '09) Y1 - 2009 A1 - Agrawal,Mudit A1 - David Doermann AB - This paper presents a dynamic approach to document page segmentation. Current page segmentation algorithms lack the ability to dynamically adapt to local variations in the size, orientation and distance of components within a page. Our approach builds upon one of the best algorithms, Kise et al.'s work based on Area Voronoi Diagrams, which adapts globally to page content to determine algorithm parameters. In our approach, local thresholds are determined dynamically based on parabolic relations between components, and Docstrum-based angular and neighborhood features are integrated to improve accuracy. Zone-based evaluation was performed on four sets of printed and handwritten documents in English and Arabic scripts and an increase of 33% in accuracy is reported. JA - International Conference on Document Analysis and Recognition (ICDAR '09) ER - TY - JOUR T1 - What a mesh: understanding the design tradeoffs for streaming multicast JF - SIGMETRICS Perform. Eval. Rev.
Y1 - 2009 A1 - Nandi,Animesh A1 - Bhattacharjee, Bobby A1 - Druschel,Peter AB - Cooperative end-system multicast (CEM) is a promising paradigm for Internet video distribution. Several CEM systems have been proposed and deployed, but the tradeoffs inherent in the different designs are not well understood. In this work, we provide a common framework in which different CEM design choices can be empirically and systematically evaluated. Based on our results, we conjecture that all CEM systems must abide by a set of fundamental design constraints, which we express in a simple model. By necessity, existing system implementations couple the data- and control-planes and often use different transport protocols. VL - 37 SN - 0163-5999 UR - http://doi.acm.org/10.1145/1639562.1639598 CP - 2 M3 - 10.1145/1639562.1639598 ER - TY - CONF T1 - Why Do Upgrades Fail and What Can We Do About It?: Toward Dependable, Online Upgrades in Enterprise System T2 - Middleware'09 Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware Y1 - 2009 A1 - Tudor Dumitras A1 - Narasimhan, Priya AB - Enterprise-system upgrades are unreliable and often produce downtime or data loss. Errors in the upgrade procedure, such as broken dependencies, constitute the leading cause of upgrade failures. We propose a novel upgrade-centric fault model, based on data from three independent sources, which focuses on the impact of procedural errors rather than software defects. We show that current approaches for upgrading enterprise systems, such as rolling upgrades, are vulnerable to these faults because the upgrade is not an atomic operation and it risks breaking hidden dependencies among the distributed system components. We also present a mechanism for tolerating complex procedural errors during an upgrade.
Our system, called Imago, improves availability in the fault-free case, by performing an online upgrade, and in the faulty case, by reducing the risk of failure due to breaking hidden dependencies. Imago performs an end-to-end upgrade atomically and dependably, by dedicating separate resources to the new version and by isolating the old version from the upgrade procedure. Through fault injection, we show that Imago is more reliable than online-upgrade approaches that rely on dependency-tracking and that create system states with mixed versions. JA - Middleware'09 Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware T3 - Middleware '09 PB - Springer-Verlag New York, Inc. UR - http://dl.acm.org/citation.cfm?id=1656980.1657005 ER - TY - JOUR T1 - Wideband fast multipole accelerated boundary element methods for the three-dimensional Helmholtz equation. JF - The Journal of the Acoustical Society of America Y1 - 2009 A1 - Gumerov, Nail A. A1 - Duraiswami, Ramani AB - The development of a fast multipole method (FMM) accelerated iterative solution of the boundary element method (BEM) for the Helmholtz equations in three dimensions is described. The FMM for the Helmholtz equation is significantly different for problems with low and high kD, where k is the wave number and D the domain size, and for large problems, the method must be switched between levels of the hierarchy. The BEM requires several approximate computations: numerical quadrature and approximations of the boundary shapes using elements. These errors must be balanced against approximations introduced by the FMM and the convergence criterion for an iterative solution. These different errors must all be chosen in a way that, on the one hand, excess work is not done and, on the other, that the error achieved by the overall computation is acceptable. 
Details of translation operators for low and high kD, choice of representations, and BEM quadrature schemes, all consistent with these approximations, are described. A novel preconditioner using a low accuracy FMM accelerated solver as a right preconditioner is also described. Results of the developed solvers for large boundary value problems with 0.0001⩽kD⩽500 are presented and shown to perform close to theoretical expectations. VL - 125 UR - http://link.aip.org/link/?JAS/125/2566/4 CP - 4 ER - TY - CONF T1 - 1st ACM workshop on hot topics in software upgrades (HotSWUp 2008) T2 - OOPSLA Companion'08 Companion to the 23rd ACM SIGPLAN Conference on Object-Oriented Programming Systems Languages and Applications Y1 - 2008 A1 - Tudor Dumitras A1 - Dig, Danny A1 - Neamtiu, Iulian KW - software KW - upgrades AB - The goal of HotSWUp is to identify cutting-edge research ideas for implementing software upgrades. Actively-used software is upgraded regularly to incorporate bug fixes and security patches or to keep up with the evolving requirements. Whether upgrades are applied offline or online, they significantly impact the software's performance and reliability. Recently-introduced commercial products aim to address various aspects of this problem, e.g., programming language/framework/middleware support for online upgrade, large-scale dissemination of fine-grained updates, live data migration in storage-area networks. However, recent studies and a large body of anecdotal evidence suggest that, in practice, upgrades are failure-prone, tedious, and expensive.
JA - OOPSLA Companion'08 Companion to the 23rd ACM SIGPLAN Conference on Object-Oriented Programming Systems Languages and Applications T3 - OOPSLA Companion '08 PB - ACM SN - 978-1-60558-220-7 UR - http://doi.acm.org/10.1145/1449814.1449877 ER - TY - CONF T1 - A Camera Phone Based Currency Reader for the Visually Impaired T2 - The Tenth International ACM SIGACCESS Conference on Computers and Accessibility Y1 - 2008 A1 - Liu,Xu A1 - David Doermann AB - In this paper we present a camera phone-based currency reader for the visually impaired to denominate face values of U.S. paper dollars. Currently, bills can only be identified visually, and this situation will continue for the foreseeable future. Our solution harvests the imaging and computational power of camera phones to read bills. Considering that it is impractical for the visually impaired to capture high-quality images, our currency reader performs real-time processing on each captured frame as the camera approaches the bill. We develop efficient background subtraction and perspective correction algorithms and train our currency reader using the efficient AdaBoost framework. Our currency reader processes 10 frames/second and achieves a false positive rate of 10^-4. Major smart phone platforms, including Symbian and Windows Mobile, are supported. JA - The Tenth International ACM SIGACCESS Conference on Computers and Accessibility ER - TY - CONF T1 - A Camera-based Mobile Data Channel: Capacity and Analysis T2 - ACM International Conference on Multimedia Y1 - 2008 A1 - Liu,Xu A1 - David Doermann A1 - Li,H. AB - In this paper we propose a novel application, color Video Code (V-Code), and analyze its data transmission capacity through camera-based mobile data channels. Users can use the camera on a mobile device (PDA or camera phone) as a passive and pervasive data channel to download data encoded as a sequence of color visual patterns.
The color V-Code is animated on a display, acquired by the camera and decoded by pre-embedded software on the mobile device. An interesting question is what data transmission capacity can be achieved, both theoretically and practically. To answer this question we build a camera channel model to measure color degradation using information theory and show that the capacity of the camera channel can be improved through optimized color selection via color calibration. After initialization, color models are learned automatically as downloading proceeds. We address the problem of precise registration and implement a fast perspective correction method to run the decoder in real time on a resource-constrained device. With the optimized color set and efficient implementation we achieve a transmission bit rate of 15.4 kbps on a common iMate Jamin phone (200 MHz CPU). This is faster than the average GPRS bit rate (12 kbps). JA - ACM International Conference on Multimedia ER - TY - JOUR T1 - The ACL Anthology Reference Corpus: A reference dataset for bibliographic research in computational linguistics JF - Proc. of the 6th International Conference on Language Resources and Evaluation Conference (LREC’08) Y1 - 2008 A1 - Bird,S. A1 - Dale,R. A1 - Dorr, Bonnie J A1 - Gibson,B. A1 - Joseph,M.T. A1 - Kan,M.Y. A1 - Lee,D. A1 - Powley,B. A1 - Radev,D.R. A1 - Tan,Y.F. AB - The ACL Anthology is a digital archive of conference and journal papers in natural language processing and computational linguistics. Its primary purpose is to serve as a reference repository of research results, but we believe that it can also be an object of study and a platform for research in its own right. We describe an enriched and standardized reference corpus derived from the ACL Anthology that can be used for research in scholarly document processing.
This corpus, which we call the ACL Anthology Reference Corpus (ACL ARC), brings together the recent activities of a number of research groups around the world. Our goal is to make the corpus widely available, and to encourage other researchers to use it as a standard testbed for experiments in both bibliographic and bibliometric research. ER - TY - CONF T1 - Action recognition using ballistic dynamics T2 - Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on Y1 - 2008 A1 - Vitaladevuni,S.N. A1 - Kellokumpu,V. A1 - Davis, Larry S. KW - analysis;image KW - Bayesian KW - dynamics;gesture KW - feature;person-centric KW - framework;action KW - History KW - image KW - labels;psycho-kinesiological KW - morphological KW - MOTION KW - Movement KW - movements;motion KW - planning;interactive KW - processing; KW - recognition KW - recognition;ballistic KW - recognition;image KW - segmentation;video KW - signal KW - studies;image KW - task;human AB - We present a Bayesian framework for action recognition through ballistic dynamics. Psycho-kinesiological studies indicate that ballistic movements form the natural units for human movement planning. The framework leads to an efficient and robust algorithm for temporally segmenting videos into atomic movements. Individual movements are annotated with person-centric morphological labels called ballistic verbs. This is tested on a dataset of interactive movements, achieving high recognition rates. The approach is also applied on a gesture recognition task, improving a previously reported recognition rate from 84% to 92%. Consideration of ballistic dynamics enhances the performance of the popular Motion History Image feature. We also illustrate the approach's general utility on real-world videos. Experiments indicate that the method is robust to view, style and appearance variations. JA - Computer Vision and Pattern Recognition, 2008. CVPR 2008.
IEEE Conference on M3 - 10.1109/CVPR.2008.4587806 ER - TY - JOUR T1 - Applying automatically generated semantic knowledge: A case study in machine translation JF - NSF Symposium on Semantic Knowledge Discovery, Organization and Use Y1 - 2008 A1 - Madnani,N. A1 - Resnik, Philip A1 - Dorr, Bonnie J A1 - Schwartz,R. AB - In this paper, we discuss how we apply automatically generated semantic knowledge to benefit statistical machine translation (SMT). Currently, almost all statistical machine translation systems rely heavily on memorizing translations of phrases. Some systems attempt to go further and generalize these learned phrase translations into templates using empirically derived information about word alignments and, at most, a small amount of syntactic information. There are several issues in an SMT pipeline that could be addressed by the application of semantic knowledge, if such knowledge were easily available. One such issue, an important one, is that of reference sparsity. The fundamental problem that translation systems have to face is that there is no such thing as the correct translation for any sentence. In fact, any given source sentence can often be translated into the target language in many valid ways. Since there can be many “correct answers,” almost all models employed by SMT systems require, in addition to a large bitext, a held-out development set comprised of multiple high-quality, human-authored reference translations in the target language in order to tune their parameters relative to a translation quality metric. There are several reasons that this requirement is not an easy one to satisfy. First, with a few exceptions, notably NIST’s annual MT evaluations, most new MT research data sets are provided with only a single reference translation. Second, obtaining multiple reference translations in rapid development, low-density source language scenarios (e.g.
(Oard, 2003)) is likely to be severely limited (or made entirely impractical) by limitations of time, cost, and ready availability of qualified translators. ER - TY - JOUR T1 - Are multiple reference translations necessary? Investigating the value of paraphrased reference translations in parameter optimization JF - Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, October Y1 - 2008 A1 - Madnani,N. A1 - Resnik, Philip A1 - Dorr, Bonnie J A1 - Schwartz,R. AB - Most state-of-the-art statistical machine translation systems use log-linear models, which are defined in terms of hypothesis features and weights for those features. It is standard to tune the feature weights in order to maximize a translation quality metric, using held-out test sentences and their corresponding reference translations. However, obtaining reference translations is expensive. In our earlier work (Madnani et al., 2007), we introduced a new full-sentence paraphrase technique, based on English-to-English decoding with an MT system, and demonstrated that the resulting paraphrases can be used to cut the number of human reference translations needed in half. In this paper, we take the idea a step further, asking how far it is possible to get with just a single good reference translation for each item in the development set. Our analysis suggests that it is necessary to invest in four or more human translations in order to significantly improve on a single translation augmented by monolingual paraphrases. ER - TY - PAT T1 - Audio Camera Using Microphone Arrays for Real Time Capture of Audio Images ... Y1 - 2008 A1 - Duraiswami, Ramani A1 - O'donovan,Adam A1 - Neumann, Jan A1 - Gumerov, Nail A. AB - Spherical microphone arrays provide an ability to compute the acoustical intensity corresponding to different spatial directions in a given frame of audio data.
These intensities may be exhibited as an image, and if the data capture and intensity computations can be performed sufficiently quickly, such images can be generated at a high frame rate to achieve a video image, thereby creating a frame-rate audio camera. A description is provided herein regarding how such a camera is built and how the processing is done sufficiently quickly using graphics processors. The joint processing of captured frame-rate audio and video images enables applications such as visual identification of noise sources, beamforming and noise suppression in video conferencing, and others, by accounting for the spatial differences in the location of the audio and the video cameras. Based on the recognition that the spherical array can be viewed as a central projection camera, such joint analysis can be performed. VL - 12/127,451 UR - http://www.google.com/patents?id=gWKzAAAAEBAJ ER - TY - JOUR T1 - Automatic online tuning for fast Gaussian summation JF - Advances in Neural Information Processing Systems Y1 - 2008 A1 - Morariu,V. A1 - Srinivasan,B.V. A1 - Raykar,V.C. A1 - Duraiswami, Ramani A1 - Davis, Larry S. AB - Many machine learning algorithms require the summation of Gaussian kernel functions, an expensive operation if implemented straightforwardly. Several methods have been proposed to reduce the computational complexity of evaluating such sums, including tree and analysis based methods. These achieve varying speedups depending on the bandwidth, dimension, and prescribed error, making the choice between methods difficult for machine learning tasks. We provide an algorithm that combines tree methods with the Improved Fast Gauss Transform (IFGT). As originally proposed, the IFGT suffers from two problems: (1) the Taylor series expansion does not perform well for very low bandwidths, and (2) parameter selection is not trivial and can drastically affect performance and ease of use.
We address the first problem by employing a tree data structure, resulting in four evaluation methods whose performance varies based on the distribution of sources and targets and input parameters such as desired accuracy and bandwidth. To solve the second problem, we present an online tuning approach that results in a black box method that automatically chooses the evaluation method and its parameters to yield the best performance for the input data, desired accuracy, and bandwidth. In addition, the new IFGT parameter selection approach allows for tighter error bounds. Our approach chooses the fastest method at negligible additional cost, and has superior performance in comparisons with previous approaches. ER - TY - JOUR T1 - Bandwidth-constrained queries in sensor networks JF - The VLDB Journal Y1 - 2008 A1 - Deligiannakis,A. A1 - Kotidis,Y. A1 - Roussopoulos, Nick VL - 17 CP - 3 ER - TY - JOUR T1 - A Bayesian statistics approach to multiscale coarse graining JF - The Journal of chemical physics Y1 - 2008 A1 - Liu,P. A1 - Shi,Q. A1 - Daumé, Hal A1 - Voth,G.A. VL - 129 ER - TY - JOUR T1 - Beyond nouns: Exploiting prepositions and comparative adjectives for learning visual classifiers JF - Computer Vision–ECCV 2008 Y1 - 2008 A1 - Gupta,A. A1 - Davis, Larry S. AB - Learning visual classifiers for object recognition from weakly labeled data requires determining correspondence between image regions and semantic object classes. Most approaches use co-occurrence of “nouns” and image features over large datasets to determine the correspondence, but many correspondence ambiguities remain. We further constrain the correspondence problem by exploiting additional language constructs to improve the learning process from weakly labeled data. We consider both “prepositions” and “comparative adjectives” which are used to express relationships between objects. If the models of such relationships can be determined, they help resolve correspondence ambiguities.
However, learning models of these relationships requires solving the correspondence problem. We simultaneously learn the visual features defining “nouns” and the differential visual features defining such “binary-relationships” using an EM-based approach. ER - TY - CHAP T1 - Black-Box Construction of a Non-malleable Encryption Scheme from Any Semantically Secure One T2 - Theory of Cryptography Y1 - 2008 A1 - Choi, Seung Geol A1 - Dana Dachman-Soled A1 - Malkin, Tal A1 - Wee, Hoeteck ED - Canetti, Ran KW - Algorithm Analysis and Problem Complexity KW - black-box constructions KW - computers and society KW - Data Encryption KW - Discrete Mathematics in Computer Science KW - Management of Computing and Information Systems KW - non-malleability KW - public-key encryption KW - semantic security KW - Systems and Data Security AB - We show how to transform any semantically secure encryption scheme into a non-malleable one, with a black-box construction that achieves a quasi-linear blow-up in the size of the ciphertext. This improves upon the previous non-black-box construction of Pass, Shelat and Vaikuntanathan (Crypto ’06). Our construction also extends readily to guarantee non-malleability under a bounded-CCA2 attack, thereby simultaneously improving on both results in the work of Cramer et al. (Asiacrypt ’07). Our construction departs from the oft-used paradigm of re-encrypting the same message with different keys and then proving consistency of encryptions; instead, we encrypt an encoding of the message with certain locally testable and self-correcting properties. We exploit the fact that low-degree polynomials are simultaneously good error-correcting codes and a secret-sharing scheme. 
JA - Theory of Cryptography T3 - Lecture Notes in Computer Science PB - Springer Berlin Heidelberg SN - 978-3-540-78523-1, 978-3-540-78524-8 UR - http://link.springer.com/chapter/10.1007/978-3-540-78524-8_24 ER - TY - CONF T1 - Camera Phone Based Tools for People with Visual Impairments T2 - The First International Workshop on Mobile Multimedia Processing Y1 - 2008 A1 - Liu,Xu A1 - David Doermann A1 - Li,H. AB - In this paper we describe a set of applications that utilize a camera phone to help the visually impaired with daily tasks. Our “MobileEye” software suite turns a camera enabled mobile device into a multi-purpose vision tool that helps individuals with visual disabilities. MobileEye consists of four subsystems for different types of visual disabilities. A color channel mapper helps distinguish colors, a software magnifier helps low vision individuals see detail, a pattern recognizer helps the severely impaired recognize objects and a document retriever provides access to printed materials. We apply cutting edge computer vision and image processing technologies on the mobile devices and tackle the challenges of limited computational resources and low image quality. We also consider the usability for the visually impaired so our system requires minimum keyboard operation. We provide a full software solution which runs on Symbian and Windows Mobile handsets. This paper provides a high level overview of the system. JA - The First International Workshop on Mobile Multimedia Processing ER - TY - CONF T1 - Cancellation of critical points in 2D and 3D Morse and Morse-Smale complexes T2 - Discrete Geometry for Computer Imagery Y1 - 2008 A1 - Čomić,L. A1 - De Floriani, Leila AB - Morse theory studies the relationship between the topology of a manifold M and the critical points of a scalar function f defined on M. 
The Morse-Smale complex associated with f induces a subdivision of M into regions of uniform gradient flow, and represents the topology of M in a compact way. Function f can be simplified by cancelling its critical points in pairs, thus simplifying the topological representation of M provided by the Morse-Smale complex. Here, we investigate the effect of the cancellation of critical points of f in Morse-Smale complexes in two and three dimensions by showing how the change of connectivity of a Morse-Smale complex induced by a cancellation can be interpreted and understood in a more intuitive and straightforward way as a change of connectivity in the corresponding ascending and descending Morse complexes. We consider a discrete counterpart of the Morse-Smale complex, called a quasi-Morse complex, and we present a compact graph-based representation of such a complex and of its associated discrete Morse complexes, showing also how such a representation is affected by a cancellation. JA - Discrete Geometry for Computer Imagery M3 - 10.1007/978-3-540-79126-3_12 ER - TY - CONF T1 - Canny edge detection on NVIDIA CUDA T2 - Computer Vision and Pattern Recognition Workshops, 2008. CVPRW '08. IEEE Computer Society Conference on Y1 - 2008 A1 - Luo,Yuancheng A1 - Duraiswami, Ramani KW - algorithms;connected-component KW - analysis KW - application KW - Canny KW - CUDA;computer KW - detection;feature KW - detection;NVIDIA KW - detector;filtering;graphical KW - detector;non KW - edge KW - extraction;smoothing KW - feature KW - filter KW - layers;multistep KW - methods; KW - responses;nonmaxima KW - stage;edge KW - suppression;smoothing;computer KW - VISION KW - vision;edge AB - The Canny edge detector is a very popular and effective edge feature detector that is used as a pre-processing step in many computer vision algorithms.
It is a multi-step detector which performs smoothing and filtering, non-maxima suppression, followed by a connected-component analysis stage to detect "true" edges, while suppressing "false" non-edge filter responses. While there have been previous (partial) implementations of the Canny and other edge detectors on GPUs, they have been focused on the old style GPGPU computing with programming using graphical application layers. Using the more programmer-friendly CUDA framework, we are able to implement the entire Canny algorithm. Details are presented along with a comparison with CPU implementations. We also integrate our detector into MATLAB, a popular interactive simulation package often used by researchers. The source code will be made available as open source. JA - Computer Vision and Pattern Recognition Workshops, 2008. CVPRW '08. IEEE Computer Society Conference on M3 - 10.1109/CVPRW.2008.4563088 ER - TY - BOOK T1 - On classification, ranking, and probability estimation Y1 - 2008 A1 - Flach,P. A1 - Matsubara,E.T. A1 - De Raedt,L. A1 - Dietterich,T. A1 - Getoor, Lise A1 - Kersting,K. A1 - Muggleton,S. H AB - Given a binary classification task, a ranker is an algorithm that can sort a set of instances from highest to lowest expectation that the instance is positive. In contrast to a classifier, a ranker does not output class predictions – although it can be turned into a classifier with help of an additional procedure to split the ranked list into two. A straightforward way to compute rankings is to train a scoring classifier to assign numerical scores to instances, for example the predicted odds that an instance is positive. However, rankings can be computed without scores, as we demonstrate in this paper. We propose a lexicographic ranker, LexRank, whose rankings are derived not from scores, but from a simple ranking of attribute values obtained from the training data.
Although various metrics can be used, we show that by using the odds ratio to rank the attribute values we obtain a ranker that is conceptually close to the naive Bayes classifier, in the sense that for every instance of LexRank there exists an instance of naive Bayes that achieves the same ranking. However, the reverse is not true, which means that LexRank is more biased than naive Bayes. We systematically develop the relationships and differences between classification, ranking, and probability estimation, which leads to a novel connection between the Brier score and ROC curves. Combining LexRank with isotonic regression, which derives probability estimates from the ROC convex hull, results in the lexicographic probability estimator LexProb. T3 - Dagstuhl Seminar Proceedings PB - Internationales Begegnungs- und Forschungszentrum fur Informatik (IBFI), Schloss Dagstuhl, Germany ER - TY - CHAP T1 - Combining Classifiers with Informational Confidence T2 - Studies in Computational Intelligence: Machine Learning in Document Analysis and Recognition Y1 - 2008 A1 - Jaeger,Stefan A1 - Ma,Huanfeng A1 - David Doermann ED - Marinai, Simone ED - Fujisawa, Hiromichi AB - We propose a new statistical method for learning normalized confidence values in multiple classifier systems. Our main idea is to adjust confidence values so that their nominal values equal the information actually conveyed. In order to do so, we assume that information depends on the actual performance of each confidence value on an evaluation set. As information measure, we use Shannon's well-known logarithmic notion of information. With the confidence values matching their informational content, the classifier combination scheme reduces to the simple sum-rule, theoretically justifying this elementary combination scheme.
In experimental evaluations for script identification, and both handwritten and printed character recognition, we achieve a consistent improvement on the best single recognition rate. We cherish the hope that our information-theoretical framework helps fill the theoretical gap we still experience in classifier combination, and puts the excellent practical performance of multiple classifier systems on a more solid basis. JA - Studies in Computational Intelligence: Machine Learning in Document Analysis and Recognition PB - Springer ER - TY - CONF T1 - Combining open-source with research to re-engineer a hands-on introductory NLP course T2 - Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics Y1 - 2008 A1 - Madnani,Nitin A1 - Dorr, Bonnie J AB - We describe our first attempts to re-engineer the curriculum of our introductory NLP course by using two important building blocks: (1) access to an easy-to-learn programming language and framework to build hands-on programming assignments with real-world data and corpora, and (2) incorporation of interesting ideas from recent NLP research publications into assignment and examination problems. We believe that these are extremely important components of a curriculum aimed at a diverse audience consisting primarily of first-year graduate students from both linguistics and computer science. Based on overwhelmingly positive student feedback, we find that our attempts were hugely successful. JA - Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics T3 - TeachCL '08 PB - Association for Computational Linguistics CY - Stroudsburg, PA, USA SN - 978-1-932432-14-5 UR - http://dl.acm.org/citation.cfm?id=1627306.1627318 ER - TY - JOUR T1 - Compressive sensing for background subtraction JF - Computer Vision–ECCV 2008 Y1 - 2008 A1 - Cevher, V. A1 - Sankaranarayanan, A. A1 - Duarte, M.
A1 - Reddy, D. A1 - Baraniuk, R. A1 - Chellappa, Rama AB - Compressive sensing (CS) is an emerging field that provides a framework for image recovery using sub-Nyquist sampling rates. The CS theory shows that a signal can be reconstructed from a small set of random projections, provided that the signal is sparse in some basis, e.g., wavelets. In this paper, we describe a method to directly recover background subtracted images using CS and discuss its applications in some communication constrained multi-camera computer vision problems. We show how to apply the CS theory to recover object silhouettes (binary background subtracted images) when the objects of interest occupy a small portion of the camera view, i.e., when they are sparse in the spatial domain. We cast the background subtraction as a sparse approximation problem and provide different solutions based on convex optimization and total variation. In our method, as opposed to learning the background, we learn and adapt a low dimensional compressed representation of it, which is sufficient to determine spatial innovations; object silhouettes are then estimated directly using the compressive samples without any auxiliary image reconstruction. We also discuss simultaneous appearance recovery of the objects using compressive measurements. In this case, we show that it may be necessary to reconstruct one auxiliary image. To demonstrate the performance of the proposed algorithm, we provide results on data captured using a compressive single-pixel camera. We also illustrate that our approach is suitable for image coding in communication constrained problems by using data captured by multiple conventional cameras to provide 2D tracking and 3D shape reconstruction results with compressive measurements.
ER - TY - RPRT T1 - Computer Vision and Image Processing Techniques for Mobile Application Y1 - 2008 A1 - Liu,Xu A1 - David Doermann AB - Camera phones have penetrated every corner of society and have become a focal point for communications. In our research we extend the traditional use of such devices to help bridge the gap between physical and digital worlds. Their combined image acquisition, processing, storage, and communication capabilities in a compact, portable device make them an ideal platform for embedding computer vision and image processing capabilities in the pursuit of new mobile applications. This technical report is presented as a series of computer vision and image processing techniques together with their applications on the mobile device. We have developed a set of techniques for ego-motion estimation, enhancement, feature extraction, perspective correction, object detection, and document retrieval that serve as a basis for such applications. Our applications include a dynamic video barcode that can transfer significant amounts of information visually, a document retrieval system that can retrieve documents from low resolution snapshots, and a series of applications for the users with visual disabilities such as a currency reader. Solutions for mobile devices require a fundamentally different approach than traditional vision techniques that run on traditional computers, so we consider user-device interaction and the fact that these algorithms must execute in a resource constrained environment. For each problem we perform both theoretical and empirical analysis in an attempt to optimize performance and usability. The thesis makes contributions related to efficient implementation of image processing and computer vision techniques, analysis of information theory, feature extraction and analysis of low quality images, and device usability. 
PB - Center for Automation Research, University of Maryland VL - LAMP-TR-151 ER - TY - CONF T1 - Computing word-pair antonymy T2 - Proceedings of the Conference on Empirical Methods in Natural Language Processing Y1 - 2008 A1 - Mohammad,Saif A1 - Dorr, Bonnie J A1 - Hirst,Graeme AB - Knowing the degree of antonymy between words has widespread applications in natural language processing. Manually-created lexicons have limited coverage and do not include most semantically contrasting word pairs. We present a new automatic and empirical measure of antonymy that combines corpus statistics with the structure of a published thesaurus. The approach is evaluated on a set of closest-opposite questions, obtaining a precision of over 80%. Along the way, we discuss what humans consider antonymous and how antonymy manifests itself in utterances. JA - Proceedings of the Conference on Empirical Methods in Natural Language Processing T3 - EMNLP '08 PB - Association for Computational Linguistics CY - Stroudsburg, PA, USA UR - http://dl.acm.org/citation.cfm?id=1613715.1613843 ER - TY - JOUR T1 - Confluent Volumetric Visualization of Gyrokinetic Turbulence JF - Plasma Science, IEEE Transactions on Y1 - 2008 A1 - Stantchev,G. A1 - Juba,D. A1 - Dorland,W. A1 - Varshney, Amitabh KW - flow;plasma KW - geometry;plasma KW - gyrokinetic KW - simulation;nontrivial KW - simulation;plasma KW - turbulence; KW - turbulence;nonlinear KW - turbulence;volumetric KW - visualisation;plasma KW - visualization;flow AB - Data from gyrokinetic turbulence codes are often difficult to visualize due to their high dimensionality, the nontrivial geometry of the underlying grids, and the vast range of spatial scales. We present an interactive visualization framework that attempts to address these issues. Images from a nonlinear gyrokinetic simulation are presented. 
VL - 36 SN - 0093-3813 CP - 4 M3 - 10.1109/TPS.2008.924509 ER - TY - JOUR T1 - Constraint Integration for Efficient Multiview Pose Estimation with Self-Occlusions JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2008 A1 - Gupta,A. A1 - Mittal,A. A1 - Davis, Larry S. KW - Automated;Posture;Reproducibility of Results;Sensitivity and Specificity;Video Recording;Whole Body Imaging; KW - automatic initialization;constraint integration;graphical structure;human pose tracking;kinematic constraint;likelihood measure;location probability distribution;multiview pose estimation;nonparametric belief propagation;optimization;pose configuration;se KW - Computer-Assisted;Imaging KW - Three-Dimensional;Pattern Recognition AB - Automatic initialization and tracking of human pose is an important task in visual surveillance. We present a part-based approach that incorporates a variety of constraints in a unified framework. These constraints include the kinematic constraints between parts that are physically connected to each other, the occlusion of one part by another, and the high correlation between the appearance of certain parts, such as the arms. The location probability distribution of each part is determined by evaluating appropriate likelihood measures. The graphical (nontree) structure representing the interdependencies between parts is utilized to "connect" such part distributions via nonparametric belief propagation. Methods are also developed to perform this optimization efficiently in the large space of pose configurations. VL - 30 SN - 0162-8828 CP - 3 M3 - 10.1109/TPAMI.2007.1173 ER - TY - JOUR T1 - Content-based assembly search: A step towards assembly reuse JF - Computer-Aided Design Y1 - 2008 A1 - Deshmukh,Abhijit S. A1 - Banerjee,Ashis Gopal A1 - Gupta, Satyandra K. A1 - Sriram,Ram D. 
KW - Assembly characteristics KW - Assembly mating conditions KW - Content-based assembly search KW - Graph compatibility AB - The increased use of CAD systems by product development organizations has resulted in the creation of large databases of assemblies. This explosion of assembly data is likely to continue in the future. In many situations, a text-based search alone may not be sufficient to search for assemblies and it may be desirable to search for assemblies based on the content of the assembly models. The ability to perform content-based searches on these databases is expected to help the designers in the following two ways. First, it can facilitate the reuse of existing assembly designs, thereby reducing the design time. Second, a great deal of useful design-for-manufacturing and assembly knowledge is implicitly embedded in existing assemblies. Therefore, a capability to locate existing assemblies and examine them can be used as a learning tool by designers to learn from the existing assembly designs. This paper describes a system for performing content-based searches on assembly databases. We identify templates for comprehensive search definitions and describe algorithms to perform content-based searches for mechanical assemblies. We also illustrate the capabilities of our system through several examples. VL - 40 SN - 0010-4485 UR - http://www.sciencedirect.com/science/article/pii/S0010448507002424 CP - 2 M3 - 10.1016/j.cad.2007.10.012 ER - TY - CONF T1 - Context and observation driven latent variable model for human pose estimation T2 - Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on Y1 - 2008 A1 - Gupta,A. A1 - Chen,T. A1 - Chen,F. A1 - Kimber,D. A1 - Davis, Larry S. 
KW - estimation; KW - estimation;image KW - Gaussian KW - gestures;pose KW - latent KW - learning;parameterized KW - model;human KW - observations;integrated KW - pose KW - process KW - processes;gesture KW - processing;pose KW - recognition;image KW - tracking;Gaussian KW - variable AB - Current approaches to pose estimation and tracking can be classified into two categories: generative and discriminative. While generative approaches can accurately determine human pose from image observations, they are computationally expensive due to search in the high dimensional human pose space. On the other hand, discriminative approaches do not generalize well, but are computationally efficient. We present a hybrid model that combines the strengths of the two in an integrated learning and inference framework. We extend the Gaussian process latent variable model (GPLVM) to include an embedding from observation space (the space of image features) to the latent space. GPLVM is a generative model, but the inclusion of this mapping provides a discriminative component, making the model observation driven. Observation Driven GPLVM (OD-GPLVM) not only provides a faster inference approach, but also more accurate estimates (compared to GPLVM) in cases where dynamics are not sufficient for the initialization of search in the latent space. We also extend OD-GPLVM to learn and estimate poses from parameterized actions/gestures. Parameterized gestures are actions which exhibit large systematic variation in joint angle space for different instances due to differences in contextual variables. For example, the joint angles in a forehand tennis shot are a function of the height of the ball (Figure 2). We learn these systematic variations as a function of the contextual variables. We then present an approach to use information from scene/objects to provide context for human pose estimation for such parameterized actions. JA - Computer Vision and Pattern Recognition, 2008. CVPR 2008. 
IEEE Conference on M3 - 10.1109/CVPR.2008.4587511 ER - TY - JOUR T1 - Contribution to a Taxonomy of Non-Manifold Models Based on Topological Properties JF - ASME Conference Proceedings Y1 - 2008 A1 - Leon,Jean-Claude A1 - De Floriani, Leila AB - Non-manifold models have classically been used in Finite Element (FE) simulations to describe the shape of a mechanical part. Sketching the shape of an object is also important in a design stage, where non-manifold models could be of help for the designer. The purpose of the taxonomy proposed here is to be able either to accept or reject them based on the design context and on the type of product. This will contribute to the modelling of non-manifold objects and to their binary classification. The proposed taxonomy is also based on the topological properties of non-manifold objects, and especially on topological invariants. We will classify several topological properties as local or global ones, and we will illustrate how such properties can help either in modelling these objects or in searching for them in industrial databases when large numbers of FE models have been stored. VL - 2008 UR - http://link.aip.org/link/abstract/ASMECP/v2008/i43277/p187/s1 CP - 43277 M3 - 10.1115/DETC2008-49529 ER - TY - RPRT T1 - CrossTalk: The Journal of Defense Software Engineering. Volume 21, Number 10, October 2008 Y1 - 2008 A1 - Basili, Victor R. A1 - Dangle,K. A1 - Esker,L. A1 - Marotta,F. A1 - Rus,I. A1 - Brosgol,B. M A1 - Jamin,S. A1 - Arthur,J. D A1 - Ravichandar,R. A1 - Wisnosky,D. E PB - DTIC Document ER - TY - CONF T1 - Cross-task knowledge-constrained self training T2 - Proceedings of the Conference on Empirical Methods in Natural Language Processing Y1 - 2008 A1 - Daumé, Hal JA - Proceedings of the Conference on Empirical Methods in Natural Language Processing ER - TY - CONF T1 - Decentralized discovery of camera network topology T2 - Distributed Smart Cameras, 2008. ICDSC 2008. 
Second ACM/IEEE International Conference on Y1 - 2008 A1 - Farrell,R. A1 - Davis, Larry S. KW - adjacency;camera KW - Bayesian KW - camera KW - discovery;modified KW - distribution;sequential KW - estimation;Bayes KW - methods;distributed KW - multinomial KW - network KW - sensors; KW - sensors;image KW - topology;decentralized AB - One of the primary uses of camera networks is the observation and tracking of objects within some domain. Substantial research has gone into tracking objects within single and multiple views. However, few such approaches scale to large numbers of sensors, and those that do require an understanding of the network topology. Camera network topology models camera adjacency in the context of tracking: when an object/entity leaves one camera, which cameras could it appear at next? This paper presents a decentralized approach for estimating a camera network's topology based on sequential Bayesian estimation using a modified multinomial distribution. Central to this method is an information-theoretic appearance model for observation weighting. The distributed nature of the approach utilizes all of the sensors as processing agents in collectively recovering the network topology. Experimental results are presented using camera networks varying in size from 10-100 nodes. JA - Distributed Smart Cameras, 2008. ICDSC 2008. Second ACM/IEEE International Conference on M3 - 10.1109/ICDSC.2008.4635696 ER - TY - CHAP T1 - Delegating Capabilities in Predicate Encryption Systems T2 - Automata, Languages and Programming Y1 - 2008 A1 - Elaine Shi A1 - Waters,Brent ED - Aceto,Luca ED - Damgård,Ivan ED - Goldberg,Leslie ED - Halldórsson,Magnús ED - Ingólfsdóttir,Anna ED - Walukiewicz,Igor KW - Computer science AB - In predicate encryption systems, given a capability, one can evaluate one or more predicates on the plaintext encrypted, while all other information about the plaintext remains hidden. 
We consider the role of delegation in such predicate encryption systems. Suppose Alice has a capability, and she wishes to delegate to Bob a more restrictive capability allowing the decryption of a subset of the information Alice can learn about the plaintext encrypted. We formally define delegation in predicate encryption systems, propose a new security definition for delegation, and give an efficient construction supporting conjunctive queries. The security of our construction can be reduced to the general 3-party Bilinear Diffie-Hellman assumption, and the Bilinear Decisional Diffie-Hellman assumption in composite order bilinear groups. JA - Automata, Languages and Programming T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 5126 SN - 978-3-540-70582-6 UR - http://www.springerlink.com/content/w320422h15050004/abstract/ ER - TY - JOUR T1 - Detecting botnet membership with dnsbl counterintelligence JF - Botnet Detection Y1 - 2008 A1 - Ramachandran,A. A1 - Feamster, Nick A1 - Dagon,D. M3 - DOI: 10.1007/978-0-387-68768-1_7 ER - TY - JOUR T1 - Discrete Distortion in Triangulated 3‐Manifolds JF - Computer Graphics Forum Y1 - 2008 A1 - Mesmoudi,Mohammed Mostefa A1 - De Floriani, Leila A1 - Port,Umberto KW - I.3.3 [Computer Graphics] KW - I.3.5 [Computational Geometry and Object Modeling] AB - We introduce a novel notion, that we call discrete distortion, for a triangulated 3-manifold. Discrete distortion naturally generalizes the notion of concentrated curvature defined for triangulated surfaces and provides a powerful tool to understand the local geometry and topology of 3-manifolds. Discrete distortion can be viewed as a discrete approach to Ricci curvature for singular flat manifolds. We distinguish between two kinds of distortion, namely, vertex distortion, which is associated with the vertices of the tetrahedral mesh decomposing the 3-manifold, and bond distortion, which is associated with the edges of the tetrahedral mesh. 
We investigate properties of vertex and bond distortions. As an example, we visualize vertex distortion on manifold hypersurfaces in R4 defined by a scalar field on a 3D mesh. VL - 27 SN - 1467-8659 UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8659.2008.01272.x/abstract?userIsAuthenticated=false&deniedAccessCustomisedMessage= CP - 5 M3 - 10.1111/j.1467-8659.2008.01272.x ER - TY - CONF T1 - Distinguishing persistent failures from transient losses T2 - Proceedings of the 2008 ACM CoNEXT Conference Y1 - 2008 A1 - Cunha,Ítalo A1 - Teixeira,Renata A1 - Feamster, Nick A1 - Diot,Christophe AB - Network tomography is a promising technique to identify the location of IP faults. The goal of tomography is to infer the status of network internal characteristics based on end-to-end observations. In particular, binary tomography identifies the set of failed links from end-to-end path measurements. Upon detecting the failure of one or more of the monitored paths, a monitor sends its measurements to a central coordinator. The coordinator then runs the binary tomography algorithm, which takes as input the topology of the network and the status (i.e., up or down) of all monitored paths and finds the minimum set of links that explain the observations. JA - Proceedings of the 2008 ACM CoNEXT Conference T3 - CoNEXT '08 PB - ACM CY - New York, NY, USA SN - 978-1-60558-210-8 UR - http://doi.acm.org/10.1145/1544012.1544062 M3 - 10.1145/1544012.1544062 ER - TY - CONF T1 - Document Zone Classification Using Partial Least Squares and Hybrid Classifiers T2 - ICPR 2008. 19th International Conference on Pattern Recognition, 2008. Y1 - 2008 A1 - Abd-Almageed, Wael A1 - Agrawal,Mudit A1 - Seo,W. A1 - David Doermann AB - This paper introduces a novel document-zone classification algorithm. Low level image features are first extracted from document zones and partial least squares is used on pairs of classes to compute discriminating pairwise features. 
Rather than using the popular one-against-all and one-against-one voting schemes, we introduce a novel hybrid method which combines the benefits of the two schemes. The algorithm is applied on the University of Washington dataset and 97.3% classification accuracy is obtained. JA - ICPR 2008. 19th International Conference on Pattern Recognition, 2008. ER - TY - JOUR T1 - The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus) JF - Nature Y1 - 2008 A1 - Ming,R. A1 - Hou,S. A1 - Feng,Y. A1 - Yu,Q. A1 - Dionne-Laporte,A. A1 - Saw,J.H. A1 - Senin,P. A1 - Wang,W. A1 - Ly,B.V. A1 - Lewis,K.L.T. A1 - others AB - Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3× draft genome sequence of 'SunUp' papaya, the first commercial virus-resistant transgenic fruit tree to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far, may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica's distinguishing morpho-physiological, medicinal and nutritional properties. 
VL - 452 SN - 0028-0836 UR - http://www.nature.com/nature/journal/v452/n7190/full/nature06856.html CP - 7190 M3 - 10.1038/nature06856 ER - TY - CONF T1 - Dynamic visual category learning T2 - Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on Y1 - 2008 A1 - Tom Yeh A1 - Darrell,Trevor AB - Dynamic visual category learning calls for efficient adaptation as new training images become available or new categories are defined, existing training images or categories become modified or obsolete, or when categories are divided into subcategories or merged together. We develop novel methods for efficient incremental learning of SVM-based visual category classifiers to handle such dynamic tasks. Our method exploits previous classifier estimates to more efficiently learn the optimal parameters for the current set of training images and categories. We show empirically that for dynamic visual category tasks, our incremental learning methods are significantly faster than batch retraining. JA - Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on PB - IEEE SN - 978-1-4244-2242-5 UR - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4587616 M3 - 10.1109/CVPR.2008.4587616 ER - TY - JOUR T1 - THE EFFECT OF NETWORK STRUCTURE ON DYNAMIC TEAM FORMATION IN MULTI‐AGENT SYSTEMS JF - Computational Intelligence Y1 - 2008 A1 - Gaston,Matthew E. A1 - desJardins, Marie KW - dynamics of networked systems KW - multi‐agent systems KW - network structure KW - team formation AB - Previous studies of team formation in multi-agent systems have typically assumed that the agent social network underlying the agent organization is either not explicitly described or the social network is assumed to take on some regular structure such as a fully connected network or a hierarchy. However, recent studies have shown that real-world networks have a rich and purposeful structure, with common properties being observed in many different types of networks. 
As multi-agent systems continue to grow in size and complexity, the network structure of such systems will become increasingly important for designing efficient, effective agent communities. We present a simple agent-based computational model of team formation, and analyze the theoretical performance of team formation in two simple classes of networks (ring and star topologies). We then give empirical results for team formation in more complex networks under a variety of conditions. From these experiments, we conclude that a key factor in effective team formation is the underlying agent interaction topology that determines the direct interconnections among agents. Specifically, we identify the property of diversity support as a key factor in the effectiveness of network structures for team formation. Scale-free networks, which were developed as a way to model real-world networks, exhibit short average path lengths and hub-like structures. We show that these properties, in turn, result in higher diversity support; as a result, scale-free networks yield higher organizational efficiency than the other classes of networks we have studied. VL - 24 SN - 1467-8640 UR - http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8640.2008.00325.x/abstract CP - 2 M3 - 10.1111/j.1467-8640.2008.00325.x ER - TY - CONF T1 - Efficient Kriging via Fast Matrix-Vector Products T2 - Aerospace Conference, 2008 IEEE Y1 - 2008 A1 - Memarsadeghi,N. A1 - Raykar,V.C. A1 - Duraiswami, Ramani A1 - Mount, Dave KW - cokriging KW - data KW - data;scattered KW - efficiency;geophysical KW - estimator;remotely KW - fusion; KW - fusion;interpolation;iterative KW - matrix-vector KW - methods;image KW - methods;nearest KW - methods;remote KW - multipole KW - neighbor KW - points;time KW - products;fast KW - scattered KW - searching;optimal KW - sensed KW - sensing;sensor KW - technique;fast KW - techniques;iterative AB - Interpolating scattered data points is a problem of wide ranging interest. 
Ordinary kriging is an optimal scattered data estimator, widely used in geosciences and remote sensing. A generalized version of this technique, called cokriging, can be used for image fusion of remotely sensed data. However, it is computationally very expensive for large data sets. We demonstrate the time efficiency and accuracy of approximating ordinary kriging through the use of fast matrix-vector products combined with iterative methods. We used methods based on the fast Multipole methods and nearest neighbor searching techniques for implementations of the fast matrix-vector products. JA - Aerospace Conference, 2008 IEEE M3 - 10.1109/AERO.2008.4526433 ER - TY - CHAP T1 - Energy Efficient Monitoring in Sensor Networks T2 - LATIN 2008: Theoretical Informatics Y1 - 2008 A1 - Deshpande, Amol A1 - Khuller, Samir A1 - Malekian,Azarakhsh A1 - Toossi,Mohammed ED - Laber,Eduardo ED - Bornstein,Claudson ED - Nogueira,Loana ED - Faria,Luerbio AB - In this paper we study a set of problems related to efficient energy management for monitoring applications in wireless sensor networks. We study several generalizations of a basic problem called Set k -Cover, which can be described as follows: we are given a set of sensors, and a set of regions to be monitored. Each region can be monitored by a subset of the sensors. To increase the lifetime of the sensor network, we would like to partition the sensors into k sets (or time-slots) and activate each partition in a different time-slot. The goal is to find the partitioning that maximizes the coverage of the regions. This problem is known to be NP -hard. We first develop improved approximation algorithms for this problem based on its similarities to the max k -cut problem. We then consider a variation, called Set ( k , α )-cover, where each sensor is allowed to be active in α different time-slots. We develop a randomized routing algorithm for this problem. We then consider extensions where each sensor can monitor only a bounded number of regions in any time-slot. We develop the first approximation algorithms for this problem. An experimental evaluation of the algorithms we propose can be found in the full version of the paper. 
JA - LATIN 2008: Theoretical Informatics T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 4957 SN - 978-3-540-78772-3 UR - http://dx.doi.org/10.1007/978-3-540-78773-0_38 ER - TY - JOUR T1 - Event modeling and recognition using Markov logic networks JF - Computer Vision–ECCV 2008 Y1 - 2008 A1 - Tran,S. A1 - Davis, Larry S. AB - We address the problem of visual event recognition in surveillance where noise and missing observations are serious problems. Common sense domain knowledge is exploited to overcome them. The knowledge is represented as first-order logic production rules with associated weights to indicate their confidence. These rules are used in combination with a relaxed deduction algorithm to construct a network of grounded atoms, the Markov Logic Network. The network is used to perform probabilistic inference for input queries about events of interest. The system’s performance is demonstrated on a number of videos from a parking lot domain that contains complex interactions of people and vehicles. ER - TY - JOUR T1 - Exploiting Prior Knowledge in Intelligent Assistants: Combining Relational Models with Hierarchies JF - Probabilistic, Logical and Relational Learning - A Further Synthesis Y1 - 2008 A1 - Natarajan,S. A1 - Tadepalli,P. A1 - Fern,A. A1 - De Raedt,L. A1 - Dietterich,T. A1 - Getoor, Lise A1 - Kersting,K. A1 - Muggleton,S. H AB - Statistical relational models have been successfully used to model static probabilistic relationships between the entities of the domain. In this talk, we illustrate their use in a dynamic decision-theoretic setting where the task is to assist a user by inferring his intentional structure and taking appropriate assistive actions. We show that the statistical relational models can be used to succinctly express the system's prior knowledge about the user's goal-subgoal structure and tune it with experience. 
As the system is better able to predict the user's goals, it improves the effectiveness of its assistance. We show through experiments that both the hierarchical structure of the goals and the parameter sharing facilitated by relational models significantly improve the learning speed. ER - TY - JOUR T1 - Exploiting shared correlations in probabilistic databases JF - Proceedings of the VLDB Endowment Y1 - 2008 A1 - Sen,Prithviraj A1 - Deshpande, Amol A1 - Getoor, Lise AB - There has been a recent surge in work in probabilistic databases, propelled in large part by the huge increase in noisy data sources --- from sensor data, experimental data, data from uncurated sources, and many others. There is a growing need for database management systems that can efficiently represent and query such data. In this work, we show how data characteristics can be leveraged to make the query evaluation process more efficient. In particular, we exploit what we refer to as shared correlations, where the same uncertainties and correlations occur repeatedly in the data. Shared correlations occur mainly due to two reasons: (1) Uncertainty and correlations usually come from general statistics and rarely vary on a tuple-to-tuple basis; (2) The query evaluation procedure itself tends to re-introduce the same correlations. Prior work has shown that the query evaluation problem on probabilistic databases is equivalent to a probabilistic inference problem on an appropriately constructed probabilistic graphical model (PGM). We leverage this by introducing a new data structure, called the random variable elimination graph (rv-elim graph), that can be built from the PGM obtained from query evaluation. We develop techniques based on bisimulation that can be used to compress the rv-elim graph by exploiting the presence of shared correlations in the PGM; the compressed rv-elim graph can then be used to run inference. 
We validate our methods by evaluating them empirically and show that even with a few shared correlations significant speed-ups are possible. VL - 1 SN - 2150-8097 UR - http://dx.doi.org/10.1145/1453856.1453944 CP - 1 M3 - 10.1145/1453856.1453944 ER - TY - JOUR T1 - A Fast Algorithm for Learning a Ranking Function from Large-Scale Data Sets JF - Pattern Analysis and Machine Intelligence, IEEE Transactions on Y1 - 2008 A1 - Raykar,V.C. A1 - Duraiswami, Ramani A1 - Krishnapuram,B. KW - Automated; KW - Factual;Information Storage and Retrieval;Likelihood Functions;Models KW - Statistical;Pattern Recognition KW - Wilcoxon-Mann-Whitney statistics;collaborative filtering;error function;gradient algorithm;large-scale data sets;learning ranking functions;ranking function;training data;computational complexity;error analysis;learning (artificial intelligence);regressio AB - We consider the problem of learning a ranking function that maximizes a generalization of the Wilcoxon-Mann-Whitney statistic on the training data. Relying on an ε-accurate approximation for the error function, we reduce the computational complexity of each iteration of a conjugate gradient algorithm for learning ranking functions from O(m^2) to O(m), where m is the number of training samples. Experiments on public benchmarks for ordinal regression and collaborative filtering indicate that the proposed algorithm is as accurate as the best available methods in terms of ranking accuracy, when the algorithms are trained on the same data. However, since it is several orders of magnitude faster than the current state-of-the-art approaches, it is able to leverage much larger training data sets. 
VL - 30 SN - 0162-8828 CP - 7 M3 - 10.1109/TPAMI.2007.70776 ER - TY - JOUR T1 - Fast concurrent object classification and localization JF - CSAIL Technical Reports (July 1, 2003 - present) Y1 - 2008 A1 - Tom Yeh A1 - Lee,John J A1 - Darrell,Trevor AB - Object localization and classification are important problems in computer vision. However, in many applications, exhaustive search over all class labels and image locations is computationally prohibitive. While several methods have been proposed to make either classification or localization more efficient, few have dealt with both tasks simultaneously. This paper proposes an efficient method for concurrent object localization and classification based on a data-dependent multi-class branch-and-bound formalism. Existing bag-of-features classification schemes, which can be expressed as weighted combinations of feature counts, can be readily adapted to our method. We present experimental results that demonstrate the merit of our algorithm in terms of classification accuracy, localization accuracy, and speed, compared to baseline approaches including exhaustive search, the ISM method, and single-class branch and bound. UR - http://dspace.mit.edu/handle/1721.1/41862 ER - TY - CONF T1 - Fast, easy, and cheap: Construction of statistical machine translation models with MapReduce T2 - Proceedings of the Third Workshop on Statistical Machine Translation Y1 - 2008 A1 - Dyer,C. A1 - Cordova,A. A1 - Mont,A. A1 - Jimmy Lin AB - In recent years, the quantity of parallel training data available for statistical machine translation has increased far more rapidly than the performance of individual computers, resulting in a potentially serious impediment to progress. Parallelization of the model-building algorithms that process this data on computer clusters is fraught with challenges such as synchronization, data exchange, and fault tolerance. 
However, the MapReduce programming paradigm has recently emerged as one solution to these issues: a powerful functional abstraction hides system-level details from the researcher, allowing programs to be transparently distributed across potentially very large clusters of commodity hardware. We describe MapReduce implementations of two algorithms used to estimate the parameters for two word alignment models and one phrase-based translation model, all of which rely on maximum likelihood probability estimates. On a 20-machine cluster, experimental results show that our solutions exhibit good scaling characteristics compared to a hypothetical, optimally-parallelized version of current state-of-the-art single-core tools. JA - Proceedings of the Third Workshop on Statistical Machine Translation PB - Association for Computational Linguistics ER - TY - JOUR T1 - Fast multipole accelerated boundary element method (FMBEM) for solution of 3D scattering problems JF - The Journal of the Acoustical Society of America Y1 - 2008 A1 - Gumerov, Nail A. A1 - Duraiswami, Ramani AB - Wideband FMBEM codes are challenging to implement since there are problems at both very low and high frequencies. Substantially different schemes for function representation and translation are efficient for low and high frequency ranges. We present a method which is suitable for the solution of both high- and low-frequency problems, since it implements a switch between different representations and uses fast translation methods appropriate to each representation. For a high frequency problem the switch between representations may occur at some intermediate levels of hierarchical space subdivision of the FMM. We also present an FMM-based preconditioner used in the flexible GMRES iterative solver for scattering problems and discuss example problems computed in range 0.001