Refinery Platform: A Foundation for Integrative Data Visualization Tools

Nils Gehlenborg, Richard Park, Ilya Sytchev, Psalm Haseley, Stefan Luger, Anton Xue, Marc Streit, Shannan Ho Sui, Winston Hide, Peter Park

Datasets with dozens or hundreds of samples are now common in biology. Manually keeping track of the software and data used in analyses is tedious and error-prone. Furthermore, visual exploration tools for studying the results of such analyses are currently not well supported. To address these challenges, we are developing the web-based Refinery Platform. This flexible analysis platform is designed to accommodate diverse data and workflows; our current implementation focuses on epigenomics and cancer genomics. One goal for this system is to serve as a platform for the development of novel visual exploration tools that can directly access large and complex datasets and analysis results, and trigger new analyses on these data. The Refinery Platform enables reproducible analyses by combining two powerful, community-supported tools: (1) a data repository with rich metadata capabilities based on ISA-Tab and (2) a workflow engine based on the popular Galaxy framework. The ISA-Tab-based data model provides extensive provenance information in an "experiment graph", which links all files to the inputs that they were derived from. Workflows are executed in Galaxy. The Galaxy workflow editor is used to create a "workflow template" that is imported by the Refinery Platform, automatically instantiated based on the inputs selected by the user, and exported back into Galaxy through its API. Workflow results are downloaded into Refinery from Galaxy, added to the experiment graph, and made available for visualization and as input for further analyses.
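The idea of an experiment graph that links each file to the inputs it was derived from can be illustrated with a small sketch. The following Python example is a toy model only and is not Refinery's actual data model or API; the class name `ExperimentGraph`, its methods, and the file names are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


class ExperimentGraph:
    """Toy provenance graph (hypothetical, not Refinery's implementation).

    Each file records the files it was directly derived from.
    """

    def __init__(self):
        # Maps a file name to the set of files it was directly derived from.
        self._inputs = defaultdict(set)

    def add_file(self, name, derived_from=()):
        """Register a file and link it to its direct inputs."""
        self._inputs[name].update(derived_from)
        for parent in derived_from:
            # Touch the entry so that every input is also a node in the graph.
            self._inputs[parent]

    def provenance(self, name):
        """Return every file that `name` depends on, directly or indirectly."""
        seen = set()
        stack = list(self._inputs[name])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._inputs[current])
        return seen


# Hypothetical example: a raw file, an alignment derived from it,
# and a workflow result derived from the alignment.
graph = ExperimentGraph()
graph.add_file("sample1.fastq")
graph.add_file("sample1.bam", derived_from=["sample1.fastq"])
graph.add_file("sample1_peaks.bed", derived_from=["sample1.bam"])

lineage = graph.provenance("sample1_peaks.bed")
print(sorted(lineage))
```

In this sketch, adding a workflow result is just another `add_file` call that names the result's inputs, which mirrors how results are described above as being added to the experiment graph so that they can serve as inputs to further analyses.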

BioVis 2014 Information
