Howdah - A Flexible Pipeline Framework for Analyzing Genomic Data
- Resource Type
- Conference
- Authors
- Lewis, Steven; Reynolds, Sheila; Rovera, Hector; O'Leary, Mike; Killcoyne, Sarah; Shmulevich, Ilya; Boyle, John
- Source
- 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), pp. 776-779, Nov. 2010
- Subject
- Computing and Processing
Communication, Networking and Broadcast Technologies
Bioinformatics
Genomics
Pipelines
Registers
Testing
Cancer
Data mining
Hadoop
MapReduce
genomics
bioinformatics
cloud computing
parallelization
- Language
- Abstract
The advent of new high-throughput sequencing technologies has led to a flood of genomic data that overwhelms the capabilities of single-processor machines. We present Howdah, a MapReduce pipeline that supports the analysis of genomic sequence data by allowing multiple tests to be plugged into a single MapReduce job. The pipeline is used to detect chromosomal abnormalities such as insertions, deletions, and translocations, as well as single nucleotide polymorphisms (SNPs).
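The idea of plugging several tests into one MapReduce job can be illustrated with a small in-memory sketch. This is not Howdah's actual API or algorithm, which this record does not describe. The `Read` type, the two test functions, the 500-base span threshold, the 1000-base bins, and the `run_job` helper are all hypothetical names and values chosen for illustration, and the map and reduce phases are simulated in plain Python rather than run on Hadoop.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Read:
    """Hypothetical paired-end read: where a read and its mate aligned."""
    chrom: str
    pos: int
    mate_chrom: str
    mate_pos: int


# A "test" inspects one read and emits zero or more (key, value) pairs.
Test = Callable[[Read], Iterator[Tuple[tuple, int]]]


def translocation_test(read: Read) -> Iterator[Tuple[tuple, int]]:
    # Assumed heuristic: mates aligned to different chromosomes.
    if read.chrom != read.mate_chrom:
        yield (read.chrom, read.mate_chrom), 1


def deletion_test(read: Read, max_span: int = 500) -> Iterator[Tuple[tuple, int]]:
    # Assumed heuristic: mates on the same chromosome but too far apart.
    if read.chrom == read.mate_chrom and abs(read.mate_pos - read.pos) > max_span:
        yield (read.chrom, read.pos // 1000), 1  # bin evidence by 1000 bases


TESTS: Dict[str, Test] = {
    "translocation": translocation_test,
    "deletion": deletion_test,
}


def map_phase(reads: Iterable[Read], tests: Dict[str, Test]):
    # One pass over the data: every registered test sees every read.
    # Keys are tagged with the test name so results stay separate.
    for read in reads:
        for name, test in tests.items():
            for key, value in test(read):
                yield (name, key), value


def reduce_phase(pairs) -> Dict[tuple, int]:
    # Sum the evidence emitted for each (test name, key).
    totals: Dict[tuple, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(reads: Iterable[Read], tests: Dict[str, Test] = TESTS) -> Dict[tuple, int]:
    return reduce_phase(map_phase(reads, tests))


if __name__ == "__main__":
    sample = [
        Read("chr1", 100, "chr1", 300),    # normal pair, no evidence
        Read("chr1", 1200, "chr1", 5000),  # wide span
        Read("chr1", 1500, "chr1", 9000),  # wide span, same bin
        Read("chr2", 50, "chr7", 800),     # mates on different chromosomes
    ]
    for key, count in sorted(run_job(sample).items()):
        print(key, count)
```

In this sketch, adding another analysis means registering one more function in `TESTS`; the input is still read only once, which is the benefit the abstract attributes to combining tests in a single job.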