The biological domain for the 2012 contest continues the expression Quantitative Trait Locus (eQTL) analysis problem from BioVis 2011. eQTL experiments catalog massive collections of correlated genotype and phenotype data, with the goal of detecting and identifying important genome-sequence variations that affect phenotypic (specifically RNA expression level) outcomes through non-obvious mechanisms. Typically these mechanisms involve networks of interacting polymorphisms that non-linearly affect specific gene expression levels and are conditional on the presence of other polymorphisms and on the underlying tissue type. Two simultaneous challenges are proposed.

  • Metadata Challenge - This introductory challenge (Phase I, to be continued in 2013) asks for effective visualizations to identify medically relevant and salient single nucleotide polymorphisms (SNPs) of given eQTL data by leveraging metadata in existing databases.
  • expression Quantitative Trait Locus (eQTL) Challenge - This advanced challenge (Phase II, continuation of the BioVis 2011 Contest) solicits visualization solutions to identify the association patterns of genome-sequence variations and expression levels that predict the occurrence of Tristram's Syndrome (a hypothetical disease around which the data was synthesized).

The base data set for both challenges this year describes a set of individuals, their genotypes, and their gene expression levels as assayed for several genes. The two visualization challenges examine this data from different perspectives. For contestants who wish to focus purely on a visual representation, rather than develop an interactive approach that requires analysis or a combination of visualization and analysis, we provide partial analyses of the base data set. These include a partial catalog of metadata for the Metadata Challenge, and single- and two-locus PLINK results for the eQTL Challenge.

Metadata Challenge

Using the eQTL data provided, create visualizations that help researchers take advantage of existing metadata databases to identify the SNPs in the eQTL data that are most likely to have a medically relevant, or at least causal, effect, and present this information usefully. In effect, "tell the story" of how and why the patterns of SNP polymorphisms affect phenotypic gene variation. It is possible that several SNPs will have related effects, which reinforces the likelihood that they are important. We will periodically release suggestions and hints, via the Contest Forum, regarding important features that you may want to consider in your visualization.

Expression Quantitative Trait Locus (eQTL) Challenge

Using the data provided, identify the pattern of genome-sequence variations and expression levels that predict the occurrence of Tristram's Syndrome (a hypothetical disease around which the data was synthesized). To as great an extent as possible, elucidate and explain these factors, and the pattern of interaction among them, that influence the incidence of Tristram's Syndrome. The factors that have been built into the data include direct SNP effects on the genes that contain them (cis effects), SNP effects on distant genes (trans effects), networks of cis- and trans-acting SNPs, and gene-gene interactions.
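
As an illustration of the cis/trans distinction described above, the following minimal sketch labels a SNP-gene pair as "cis" when the SNP lies inside the target gene and "trans" otherwise. The `Gene` and `Snp` record types, their field names, and the example coordinates are hypothetical and are not taken from the contest data files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    # Hypothetical record layout; the contest files may be organized differently.
    name: str
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Snp:
    snp_id: str
    chrom: str
    pos: int


def effect_type(snp: Snp, target: Gene) -> str:
    """Return 'cis' if the SNP lies inside the target gene, else 'trans'."""
    inside = snp.chrom == target.chrom and target.start <= snp.pos <= target.end
    return "cis" if inside else "trans"


# Made-up example records for illustration only.
gene_a = Gene("GENE_A", "chr1", 1000, 5000)
gene_b = Gene("GENE_B", "chr2", 200, 900)
snp = Snp("rs_example", "chr1", 2500)

labels = {g.name: effect_type(snp, g) for g in (gene_a, gene_b)}
print(labels)
```

This only encodes the definition given in the text (a SNP acting on the gene that contains it versus on a distant gene); it says nothing about whether an effect actually exists, which is what the challenge asks contestants to uncover.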

Mini Challenges

Contestants may also choose to tackle smaller subsets of the data for their entry. The three classes of gene (cell-cell adhesion, serotonin and dopamine) each provide independent information about the disease (though this ignores the interactions across gene classes). Contestants may choose to focus on one of these gene families.

Both main and mini challenge entries will be judged on how well they address the following:

  1. Identification of the most important genes for predicting Tristram's Syndrome from the brain gene expression values.
  2. Identification of the SNPs within those genes that influence gene expression.
  3. Examination of the blood gene expression values to see what could be detected in a simple blood sample, and determination of the predictive power of a blood sample for predicting disease. This last criterion is the most important; however, to judge the correctness of the process that led to the prediction from blood, the first two criteria must be fully documented.
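
One simple starting point for the first criterion is a per-gene screen that compares expression values between affected and unaffected individuals. The sketch below ranks genes by the absolute value of Welch's t-statistic. The function names, the input layout, and the toy values are assumptions made for illustration; a univariate screen like this ignores the SNP networks and gene-gene interactions built into the data, so it is not a complete solution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(cases, controls):
    """Welch's t-statistic for two samples of expression values."""
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    return (mean(cases) - mean(controls)) / se


def rank_genes(expression, affected):
    """Rank genes by |t|, largest first.

    expression: dict mapping gene name -> list of values, one per individual
    affected:   list of booleans in the same individual order
    """
    scores = {}
    for gene, values in expression.items():
        cases = [v for v, a in zip(values, affected) if a]
        controls = [v for v, a in zip(values, affected) if not a]
        scores[gene] = abs(welch_t(cases, controls))
    return sorted(scores, key=scores.get, reverse=True)


# Toy values for illustration only; not contest data.
affected = [True, True, True, False, False, False]
expression = {
    "GENE_A": [5.1, 4.9, 5.3, 2.0, 2.2, 1.9],
    "GENE_B": [3.0, 3.4, 2.9, 3.1, 3.3, 3.0],
}

ranking = rank_genes(expression, affected)
print(ranking)
```

The same function could be applied separately to brain and blood expression values to compare which genes stand out in each tissue.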

As an optional bonus challenge, contestants may build a predictive statistical model and report the percent accuracy of their model in predicting disease.
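
The announcement does not prescribe how percent accuracy should be measured. One common choice is to estimate it on held-out individuals, so the figure is not inflated by testing on the training data. The sketch below uses leave-one-out cross-validation with a nearest-centroid classifier; both the classifier and the validation scheme are assumptions chosen for brevity, and the feature values are made up.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        # Column-wise mean of the rows belonging to this class.
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_percent_accuracy(xs, ys):
    """Leave-one-out cross-validated accuracy, as a percentage."""
    correct = 0
    for i in range(len(xs)):
        # Train on everyone except individual i, then predict individual i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return 100.0 * correct / len(xs)


# Toy expression vectors (two genes per individual) and disease labels.
xs = [[5.1, 3.0], [4.9, 3.4], [5.3, 2.9], [2.0, 3.1], [2.2, 3.3], [1.9, 3.0]]
ys = [1, 1, 1, 0, 0, 0]

accuracy = loo_percent_accuracy(xs, ys)
print(f"{accuracy:.1f}% leave-one-out accuracy")
```

Any model could be substituted for the nearest-centroid step; the point of the sketch is only that each individual is predicted by a model that never saw that individual.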

Submissions and Judging

Submissions consist of an extended abstract of up to four pages summarizing the contest entry. Supplementary material is also welcome in the form of slide decks (PDF format), virtual machine images and/or binary distributions, technical reports, and supporting manuscripts. A subset of selected contestants will have the opportunity to present their work in a separate session during the symposium.

Contest entries will be judged on their contribution to the state of the art in visualization and on their ability to provide biological insights. The review committee for the contest entries will draw upon the existing BioVis reviewers while recruiting others from various pertinent research communities in biology and bioinformatics.


March 7th: Contest announcement and data release
April 30th: Paper submissions due - Contest entries can be submitted as papers
June 27th: Poster submissions due - Contest entries can be submitted as posters
July 11th: Contest entry submissions due
July 27th: Notification of acceptance of contest entries and poster submissions
August 25th: Camera-ready submission of accepted BioVis papers
October 15th: Contest winners announced at BioVis 2012!


Contest General Address
Contest Chairs: Christopher Bartlett, Raghu Machiraju, William Ray
Metadata Challenge Data Providers and Domain Experts: Larry Hunter, Benjamin Keller
eQTL Challenge Data Provider and Domain Expert: Christopher Bartlett
Contest Forum: Forum, Moderator - William Ray