Next-generation sequencing presents new challenges in large-scale data processing. In collaboration with the University of Liverpool’s Animal Sciences & Physiology Research group, and in particular with Dr Harry Noyes, we combined Taverna scientific workflows with computing capacity from the Amazon cloud to create a next-generation sequencing application for whole-genome Single Nucleotide Polymorphism (SNP) analysis.
Through a web portal, the application allows scientists to upload their input data, launch a number of parallel cloud instances for the analysis, monitor progress and collect the results.
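The idea of spreading one analysis over several parallel instances can be illustrated with a minimal scatter-gather sketch. This is not the application's actual code: the function names, the placeholder analysis step and the use of local threads in place of cloud instances are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous, near-equal parts."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def analyse_chunk(chunk):
    """Hypothetical placeholder for the per-instance SNP analysis."""
    return [snp_id.upper() for snp_id in chunk]


def run_scatter_gather(snp_ids, n_workers):
    """Scatter chunks to workers, then gather results in input order."""
    chunks = split_into_chunks(snp_ids, n_workers)
    # Threads stand in for cloud instances in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(analyse_chunk, chunks))
    return [result for part in partial_results for result in part]
```

In a real deployment each chunk would be sent to a separate cloud instance rather than a thread, and the gather step would collect result files rather than in-memory lists.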
Preliminary work on the genetic variation of African cattle, provided by Dr Harry Noyes, showed that we can process a whole genome of ~22 million SNPs in a matter of hours. The application was demonstrated at the European Conference on Computational Biology (ECCB) 2010 in Ghent, Belgium – see the video from the demonstration.