A fast sequence assembly method based on compressed data structures
- Resource Type
- Conference
- Authors
- Liang, Peifeng; Zhang, Yancong; Lin, Kui; Hu, Jinglu
- Source
- 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 326-329, Aug. 2014
- Subject
- Bioengineering
Assembly
Bioinformatics
Genomics
Sequential analysis
Memory management
Data structures
Indexes
- Language
- ISSN
- 1094-687X
1558-4615
Assembling a large genome from next-generation sequencing reads requires a large amount of computer memory and a long execution time. To reduce these requirements, a memory- and time-efficient assembler, called FMJ-Assembler, is presented; it is obtained by applying the FM-index in JR-Assembler. In the name, FM stands for the FMR-index, which is derived from the FM-index and the Burrows-Wheeler transform (BWT), and J stands for jumping extension. The FMJ-Assembler uses an expanded FM-index and the BWT to compress the read data and save memory, while the jumping extension method reduces CPU time. An extensive comparison with current assemblers shows that the FMJ-Assembler achieves better or comparable overall assembly quality while requiring less memory and less CPU time. These advantages indicate that the FMJ-Assembler will be an efficient assembly method for next-generation sequencing data.
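To illustrate the two data structures the abstract names, the following is a minimal sketch of a textbook BWT and FM-index with backward search, which counts how often a pattern occurs in a text without scanning the text itself. It is not the authors' FMR-index or the FMJ-Assembler code, neither of which is described here; the function names (`bwt`, `build_fm_index`, `count_occurrences`), the `$` end marker, and the uncompressed occurrence table are assumptions made for illustration only.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (quadratic, demo only)."""
    s = text + "$"  # assumed end marker; must sort before every text symbol
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm_index(text):
    """Build the C array and occurrence table of a plain FM-index."""
    last = bwt(text)
    alphabet = sorted(set(last))

    # C[c]: number of symbols in the text that sort strictly before c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += last.count(c)

    # occ[c][i]: number of occurrences of c in last[:i].
    # A real index would sample or compress this table to save memory.
    occ = {c: [0] * (len(last) + 1) for c in alphabet}
    for i, ch in enumerate(last):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)

    return C, occ, len(last)


def count_occurrences(pattern, index):
    """Count occurrences of a non-empty pattern using backward search."""
    C, occ, n = index
    lo, hi = 0, n  # half-open range of matching rows in the sorted rotations
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = build_fm_index("ACGTACGA")
    print(count_occurrences("ACG", index))
    print(count_occurrences("TT", index))
```

Each step of the backward search touches only the `C` array and the occurrence table, so the cost of a query depends on the pattern length rather than the text length. This property is what makes FM-index style structures attractive for querying large read sets; practical implementations additionally compress the occurrence table, which this sketch deliberately does not do.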