BioVis 2011 Paper
Visual Analysis of Next-Generation Sequencing Data to Detect Overlapping Genes in Bacterial Genomes
Next-generation sequencing (NGS) technologies are about to revolutionize biological research. The ability to sequence large amounts of DNA or, indirectly, RNA in a short time opens numerous new possibilities. However, analyzing the large amounts of data generated by NGS is a serious challenge that requires novel data analysis and visualization methods to help the biological experimenter understand the results. In this paper, we describe a novel system for dealing with the flood of data generated by transcriptome sequencing (RNA-seq) using NGS. Our system allows the analyst to get a quick overview of the data and to interactively explore interesting regions based on three important parameters: coverage, transcription, and fit. In particular, our system supports NGS analysis in the following respects: (1) it represents the coverage sequence in a way that introduces no artifacts; (2) it makes it easy to determine how well an open reading frame (ORF) fits a transcript by mapping the coverage sequence directly into the ORF representation; (3) it provides automatic support for finding interesting regions, addressing the problems posed by the overwhelming volume of data; and (4) it provides an overview representation that allows parameter tuning and enables quick access to interesting areas of the genome. We demonstrate the usefulness of our system with a case study on the detection of overlapping genes in a bacterial genome.
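To make the notions of a coverage sequence and of mapping it onto an ORF more concrete, the following is a minimal sketch in Python. It is not the system described in the paper. The function names `coverage` and `covered_fraction`, the half-open read intervals, and the use of "fraction of ORF positions covered" as a stand-in for fit are all assumptions made for illustration, since the abstract does not define how fit is computed.

```python
from typing import List, Tuple


def coverage(genome_length: int, reads: List[Tuple[int, int]]) -> List[int]:
    """Per-base coverage from mapped reads.

    Each read is a hypothetical half-open interval [start, end) of
    0-based genome positions. A difference array keeps the cost at
    O(genome_length + number of reads).
    """
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        if not (0 <= start < end <= genome_length):
            raise ValueError(f"read ({start}, {end}) lies outside the genome")
        diff[start] += 1   # read begins contributing here
        diff[end] -= 1     # read stops contributing here

    cov: List[int] = []
    depth = 0
    for delta in diff[:genome_length]:
        depth += delta
        cov.append(depth)
    return cov


def covered_fraction(cov: List[int], orf_start: int, orf_end: int,
                     min_depth: int = 1) -> float:
    """Fraction of ORF positions [orf_start, orf_end) with depth >= min_depth.

    This is an assumed, simplified proxy for how well an ORF is supported
    by a transcript; the paper's actual fit measure may differ.
    """
    region = cov[orf_start:orf_end]
    if not region:
        raise ValueError("empty ORF interval")
    return sum(1 for depth in region if depth >= min_depth) / len(region)


# Two overlapping reads on a genome of length 10.
cov = coverage(10, [(0, 4), (2, 6)])
orf_support = covered_fraction(cov, 2, 8)
```

In this sketch the coverage values are kept per base rather than binned, which is one simple way to avoid aggregation artifacts; a real pipeline would read the intervals from alignment files rather than from a hand-written list.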