TY - JOUR T1 - Segmenting time-lapse phase contrast images of adjacent NIH 3T3 cells. JF - Journal of microscopy Y1 - 2013 A1 - Chalfoun, J A1 - Kociolek, M A1 - Dima, A A1 - Halter, M A1 - Cardone, Antonio A1 - Peskin, A A1 - Bajcsy, P A1 - Brady, M. KW - Animals KW - Cell Adhesion KW - Cell Count KW - Cell Division KW - Cell Shape KW - Computational Biology KW - Fibroblasts KW - Image Processing, Computer-Assisted KW - Mice KW - Microscopy, Phase-Contrast KW - NIH 3T3 Cells KW - Reproducibility of results KW - Sensitivity and Specificity KW - Time-Lapse Imaging AB - We present a new method for segmenting phase contrast images of NIH 3T3 fibroblast cells that is accurate even when cells are physically in contact with each other. The problem of segmentation, when cells are in contact, poses a challenge to the accurate automation of cell counting, tracking and lineage modelling in cell biology. The segmentation method presented in this paper consists of (1) background reconstruction to obtain noise-free foreground pixels and (2) incorporation of biological insight about dividing and nondividing cells into the segmentation process to achieve reliable separation of foreground pixels defined as pixels associated with individual cells. The segmentation results for a time-lapse image stack were compared against 238 manually segmented images (8219 cells) provided by experts, which we consider as reference data. We chose two metrics to measure the accuracy of segmentation: the 'Adjusted Rand Index' which compares similarities at a pixel level between masks resulting from manual and automated segmentation, and the 'Number of Cells per Field' (NCF) which compares the number of cells identified in the field by manual versus automated analysis. 
Our results show that the automated segmentation, compared to manual segmentation, has an average adjusted Rand index of 0.96 (1 being a perfect match), with a standard deviation of 0.03, and an average difference of the two numbers of cells per field equal to 5.39% with a standard deviation of 4.6%. VL - 249 CP - 1 U1 - http://www.ncbi.nlm.nih.gov/pubmed/23126432?dopt=Abstract M3 - 10.1111/j.1365-2818.2012.03678.x ER - TY - JOUR T1 - Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage. JF - ISME J Y1 - 2012 A1 - Dupont, Chris L A1 - Rusch, Douglas B A1 - Yooseph, Shibu A1 - Lombardo, Mary-Jane A1 - Richter, R Alexander A1 - Valas, Ruben A1 - Novotny, Mark A1 - Yee-Greenbaum, Joyclyn A1 - Jeremy D Selengut A1 - Haft, Dan H A1 - Halpern, Aaron L A1 - Lasken, Roger S A1 - Nealson, Kenneth A1 - Friedman, Robert A1 - Venter, J Craig KW - Computational Biology KW - Gammaproteobacteria KW - Genome, Bacterial KW - Genomic Library KW - metagenomics KW - Oceans and Seas KW - Phylogeny KW - plankton KW - Rhodopsin KW - RNA, Ribosomal, 16S KW - Seawater AB -

Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25-1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition.

VL - 6 CP - 6 M3 - 10.1038/ismej.2011.189 ER - TY - JOUR T1 - A computational statistics approach for estimating the spatial range of morphogen gradients JF - Development Y1 - 2011 A1 - Kanodia, Jitendra S. A1 - Kim, Yoosik A1 - Tomer, Raju A1 - Zia Khan A1 - Chung, Kwanghun A1 - Storey, John D. A1 - Lu, Hang A1 - Keller, Philipp J. A1 - Shvartsman, Stanislav Y. KW - Computational Biology KW - Confidence interval KW - Dorsal gradient KW - Drosophila embryo KW - Morphogen gradient KW - Statistics AB - A crucial issue in studies of morphogen gradients relates to their range: the distance over which they can act as direct regulators of cell signaling, gene expression and cell differentiation. To address this, we present a straightforward statistical framework that can be used in multiple developmental systems. We illustrate the developed approach by providing a point estimate and confidence interval for the spatial range of the graded distribution of nuclear Dorsal, a transcription factor that controls the dorsoventral pattern of the Drosophila embryo. VL - 138 SN - 0950-1991, 1477-9129 UR - http://dev.biologists.org/content/138/22/4867 CP - 22 J1 - Development ER - TY - JOUR T1 - A cost-aggregating integer linear program for motif finding JF - Journal of Discrete Algorithms Y1 - 2011 A1 - Kingsford, Carl A1 - Zaslavsky,Elena A1 - Singh,Mona KW - Computational Biology KW - Integer linear programming KW - Sequence motif finding AB - In the motif finding problem one seeks a set of mutually similar substrings within a collection of biological sequences. This is an important and widely-studied problem, as such shared motifs in DNA often correspond to regulatory elements. We study a combinatorial framework where the goal is to find substrings of a given length such that the sum of their pairwise distances is minimized. 
We describe a novel integer linear program for the problem, which uses the fact that distances between substrings come from a limited set of possibilities allowing for aggregate consideration of sequence position pairs with the same distances. We show how to tighten its linear programming relaxation by adding an exponential set of constraints and give an efficient separation algorithm that can find violated constraints, thereby showing that the tightened linear program can still be solved in polynomial time. We apply our approach to find optimal solutions for the motif finding problem and show that it is effective in practice in uncovering known transcription factor binding sites. VL - 9 SN - 1570-8667 UR - http://www.sciencedirect.com/science/article/pii/S157086671100044X CP - 4 M3 - 10.1016/j.jda.2011.04.001 ER - TY - CONF T1 - Inexact Local Alignment Search over Suffix Arrays T2 - IEEE International Conference on Bioinformatics and Biomedicine, 2009. BIBM '09 Y1 - 2009 A1 - Ghodsi,M. A1 - Pop, Mihai KW - bacteria KW - Bioinformatics KW - biology computing KW - Computational Biology KW - Costs KW - DNA KW - DNA homology searches KW - DNA sequences KW - Educational institutions KW - generalized heuristic KW - genes KW - Genetics KW - genome alignment KW - Genomics KW - human KW - inexact local alignment search KW - inexact seeds KW - local alignment KW - local alignment tools KW - memory efficient suffix array KW - microorganisms KW - molecular biophysics KW - mouse KW - Organisms KW - Sensitivity and Specificity KW - sequences KW - suffix array KW - USA Councils AB - We describe an algorithm for finding approximate seeds for DNA homology searches. In contrast to previous algorithms that use exact or spaced seeds, our approximate seeds may contain insertions and deletions. We present a generalized heuristic for finding such seeds efficiently and prove that the heuristic does not affect sensitivity. 
We show how to adapt this algorithm to work over the memory efficient suffix array with provably minimal overhead in running time. We demonstrate the effectiveness of our algorithm on two tasks: whole genome alignment of bacteria and alignment of the DNA sequences of 177 genes that are orthologous in human and mouse. We show our algorithm achieves better sensitivity and uses less memory than other commonly used local alignment tools. JA - IEEE International Conference on Bioinformatics and Biomedicine, 2009. BIBM '09 PB - IEEE SN - 978-0-7695-3885-3 M3 - 10.1109/BIBM.2009.25 ER - TY - JOUR T1 - Microbial oceanography in a sea of opportunity JF - Nature Y1 - 2009 A1 - Bowler,Chris A1 - Karl,David M. A1 - Rita R Colwell KW - Astronomy KW - astrophysics KW - Biochemistry KW - Bioinformatics KW - Biology KW - biotechnology KW - cancer KW - cell cycle KW - cell signalling KW - climate change KW - Computational Biology KW - development KW - developmental biology KW - DNA KW - drug discovery KW - earth science KW - ecology KW - environmental science KW - Evolution KW - evolutionary biology KW - functional genomics KW - Genetics KW - Genomics KW - geophysics KW - immunology KW - interdisciplinary science KW - life KW - marine biology KW - materials science KW - medical research KW - medicine KW - metabolomics KW - molecular biology KW - molecular interactions KW - nanotechnology KW - Nature KW - neurobiology KW - neuroscience KW - palaeobiology KW - pharmacology KW - Physics KW - proteomics KW - quantum physics KW - RNA KW - Science KW - science news KW - science policy KW - signal transduction KW - structural biology KW - systems biology KW - transcriptomics AB - Plankton use solar energy to drive the nutrient cycles that make the planet habitable for larger organisms. We can now explore the diversity and functions of plankton using genomics, revealing the gene repertoires associated with survival in the oceans. 
Such studies will help us to appreciate the sensitivity of ocean systems and of the ocean's response to climate change, improving the predictive power of climate models. VL - 459 SN - 0028-0836 UR - http://www.nature.com/nature/journal/v459/n7244/abs/nature08056.html CP - 7244 M3 - 10.1038/nature08056 ER - TY - JOUR T1 - Temporal Summaries: Supporting Temporal Categorical Searching, Aggregation and Comparison JF - IEEE Transactions on Visualization and Computer Graphics Y1 - 2009 A1 - Wang,T. D A1 - Plaisant, Catherine A1 - Shneiderman, Ben A1 - Spring, Neil A1 - Roseman,D. A1 - Marchand,G. A1 - Mukherjee,V. A1 - Smith,M. KW - Aggregates KW - Collaborative work KW - Computational Biology KW - Computer Graphics KW - Data analysis KW - data visualisation KW - Data visualization KW - Databases, Factual KW - Displays KW - Event detection KW - Filters KW - Heparin KW - History KW - Human computer interaction KW - Human-computer interaction KW - HUMANS KW - Information Visualization KW - Interaction design KW - interactive visualization technique KW - Medical Records Systems, Computerized KW - Pattern Recognition, Automated KW - Performance analysis KW - Springs KW - temporal categorical data visualization KW - temporal categorical searching KW - temporal ordering KW - temporal summaries KW - Thrombocytopenia KW - Time factors AB - When analyzing thousands of event histories, analysts often want to see the events as an aggregate to detect insights and generate new hypotheses about the data. An analysis tool must emphasize both the prevalence and the temporal ordering of these events. Additionally, the analysis tool must also support flexible comparisons to allow analysts to gather visual evidence. In a previous work, we introduced align, rank, and filter (ARF) to accentuate temporal ordering. In this paper, we present temporal summaries, an interactive visualization technique that highlights the prevalence of event occurrences. 
Temporal summaries dynamically aggregate events in multiple granularities (year, month, week, day, hour, etc.) for the purpose of spotting trends over time and comparing several groups of records. They provide affordances for analysts to perform temporal range filters. We demonstrate the applicability of this approach in two extensive case studies with analysts who applied temporal summaries to search, filter, and look for patterns in electronic health records and academic records. VL - 15 SN - 1077-2626 CP - 6 M3 - 10.1109/TVCG.2009.187 ER - TY - JOUR T1 - Guest Editors' Introduction to the Special Section on Algorithms in Bioinformatics JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics Y1 - 2008 A1 - Giancarlo,Raffaele A1 - Hannenhalli, Sridhar KW - Abstracts KW - Algorithm design and analysis KW - Bioinformatics KW - Biological system modeling KW - biology computing KW - Computational Biology KW - Computational modeling KW - Computer science KW - Genomics KW - sequences VL - 5 SN - 1545-5963 CP - 4 M3 - 10.1109/TCBB.2008.116 ER - TY - JOUR T1 - Structural Biology: Analysis of 'downhill' protein folding; Analysis of protein-folding cooperativity (Reply) JF - Nature Y1 - 2007 A1 - Sadqi,Mourad A1 - Fushman, David A1 - Muñoz,Victor KW - Astronomy KW - astrophysics KW - Biochemistry KW - Bioinformatics KW - Biology KW - biotechnology KW - cancer KW - cell cycle KW - cell signalling
KW - climate change KW - Computational Biology KW - development KW - developmental biology KW - DNA KW - drug discovery KW - earth science KW - ecology KW - environmental science KW - Evolution KW - evolutionary biology KW - functional genomics KW - Genetics KW - Genomics KW - geophysics KW - immunology KW - interdisciplinary science KW - life KW - marine biology KW - materials science KW - medical research KW - medicine KW - metabolomics KW - molecular biology KW - molecular interactions KW - nanotechnology KW - Nature KW - neurobiology KW - neuroscience KW - palaeobiology KW - pharmacology KW - Physics KW - proteomics KW - quantum physics KW - RNA KW - Science KW - science news KW - science policy KW - signal transduction KW - structural biology KW - systems biology KW - transcriptomics AB - Ferguson et al. and Zhou and Bai criticize the quality of our nuclear magnetic resonance (NMR) data and atom-by-atom analysis of global 'downhill' folding, also claiming that the data are compatible with two-state folding. VL - 445 SN - 0028-0836 UR - http://www.nature.com/nature/journal/v445/n7129/full/nature05645.html?lang=en CP - 7129 M3 - 10.1038/nature05645 ER - TY - BOOK T1 - Advances in Computers: Computational Biology and Bioinformatics Y1 - 2006 A1 - Zelkowitz, Marvin V A1 - Tseng,Chau-Wen KW - Bioinformatics KW - Computational Biology KW - Computers / Bioinformatics KW - Computers / Computer Science KW - Computers / Data Processing KW - Computers / Information Theory KW - Computers / Interactive & Multimedia KW - Computers / Programming / General KW - Computers / Reference KW - Computers / Software Development & Engineering / General KW - Mathematics / Applied KW - Science / Life Sciences / Biology AB - The field of bioinformatics and computational biology arose due to the need to apply techniques from computer science, statistics, informatics, and applied mathematics to solve biological problems. 
Scientists have been trying to study biology at a molecular level using techniques derived from biochemistry, biophysics, and genetics. Progress has greatly accelerated with the discovery of fast and inexpensive automated DNA sequencing techniques. As the genomes of more and more organisms are sequenced and assembled, scientists are discovering many useful facts by tracing the evolution of organisms by measuring changes in their DNA, rather than through physical characteristics alone. This has led to rapid growth in the related fields of phylogenetics, the study of evolutionary relatedness among various groups of organisms, and comparative genomics, the study of the correspondence between genes and other genomic features in different organisms. Comparing the genomes of organisms has allowed researchers to better understand the features and functions of DNA in individual organisms, as well as provide insights into how organisms evolve over time. The first four chapters of this book focus on algorithms for comparing the genomes of different organisms. Possible concrete applications include identifying the basis for genetic diseases and tracking the development and spread of different forms of avian flu. As researchers begin to better understand the function of DNA, attention has begun shifting towards the actual proteins produced by DNA. The final two chapters explore proteomic techniques for analyzing proteins directly to identify their presence and understand their physical structure. Written by active PhD researchers in computational biology and bioinformatics. PB - Academic Press SN - 9780120121687 ER - TY - JOUR T1 - Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals JF - J. ACM Y1 - 1999 A1 - Hannenhalli, Sridhar A1 - Pevzner,Pavel A.
KW - Computational Biology KW - Genetics AB - Genomes frequently evolve by reversals ρ(i,j) that transform a gene order π_1 … π_i π_{i+1} … π_{j-1} π_j … π_n into π_1 … π_i π_{j-1} … π_{i+1} π_j … π_n. Reversal distance between permutations π and σ is the minimum number of reversals to transform π into σ. Analysis of genome rearrangements in molecular biology started in the late 1930s, when Dobzhansky and Sturtevant published a milestone paper presenting a rearrangement scenario with 17 inversions between the species of Drosophila. Analysis of genomes evolving by inversions leads to a combinatorial problem of sorting by reversals studied in detail recently. We study sorting of signed permutations by reversals, a problem that adequately models rearrangements in small genomes like chloroplast or mitochondrial DNA. The previously suggested approximation algorithms for sorting signed permutations by reversals compute the reversal distance between permutations with an astonishing accuracy for both simulated and biological data. We prove a duality theorem explaining this intriguing performance and show that there exists a “hidden” parameter that allows one to compute the reversal distance between signed permutations in polynomial time. VL - 46 SN - 0004-5411 UR - http://doi.acm.org/10.1145/300515.300516 CP - 1 M3 - 10.1145/300515.300516 ER -