FPGASW: Accelerating Large-Scale Smith–Waterman Sequence Alignment Application with Backtracking on FPGA Linear Systolic Array

Interdisciplinary Sciences: Computational Life Sciences - Tập 10 Số 1 - Trang 176-188 - 2018
Fei Xia1, Dan Zou2, Lu Liu3, Xin Man1, Chunlei Zhang1
1Electronic Engineering College, Naval University of Engineering, Wuhan, People’s Republic of China
2Academy of Ocean Science and Engineering, National University of Defense Technology, Changsha, People’s Republic of China
3College of Mechatronic Engineering and Automation, National University of Defense Technology, Changsha, People’s Republic of China

Tóm tắt

Từ khóa


Tài liệu tham khảo

Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147(1):195–197

Altschul SF, Gish W, Miller M, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410

Thompson J, Higgins D, Gibson T (1994) Clustalw: improving the sensitivity of progressive multiple sequence alignment through sequence weighting position specific gap penalties and weight matrix choice. Nucleic Acids Res 22(22):4673–4680

Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760

Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25

Hoang DT (1992) A systolic array for the sequence alignment problem, Technical Report, Brown University Providence, RI, USA

Hoang DT (1993) Searching genetic databases on splash 2. Proc IEEE Workshop FPGAs Custom Comput Mach 1993:185–191

Fagin B, Watt JG, Robert G (1993) A special-purpose processor for gene sequence analysis. Comput Appl Biosci 9:221–226

Yamaguchi Y, Maruyama T et al (2002) High speed homology search with FPGAs. Proc Pac Symp Biocomput 2002:271–282

Dydel S, Bala P (2004) Large scale protein sequence alignment using FPGA reprogrammable logic devices. Proc IEEE Int Conf Field Program Logic Appl 2004:23–32

Sahoo B, Padhy S (2009) A reconfigurable accelerator for parallel longest common protein subsequence algorithm. Proc IEEE Int Adv Comput Conf 2009:260–265

VanCourt T, Herbordt MC (2007) Families of FPGA-based algorithms for approximate string matching. J Microprocess Microsyst 31:135–145

Oliver T, Schmidt B, Maskell D (2005) Hyper customized processors for bio-sequence database scanning on FPGAs. In: Proceedings of ACM/SIGDA 13th international symposium on field programmable gate arrays, pp 229–237

Benkrid K, Liu Y, Benkrid A (2009) A highly parameterized and efficient FPGA-based skeleton for pairwise biological sequence alignment. IEEE Trans Very Large Scale Integr 17(4):561–570

Gok M, Yilmaz C (2006) Efficient cell designs for systolic Smith-Waterman implementation. In: Proceedings of 16th international conference on field programmable logic and applications, pp 1–4

Moritz GL, Jory C, Lopes HS, Lima CRE (2006) Implementation of a parallel algorithm for protein pairwise alignment using reconfigurable computing. In: Proceedings of IEEE international conference on reconfigurable computing and FPGAs (ReConfig’06), pp 1–7

Marongiu A, Palazzari P, Rosato V (2003) Designing hardware for protein sequence analysis. Bioinformatics 19(14):1739–1740

Zhang P, Tan G, Gao GR (2007) Implementation of the Smith-Waterman algorithm on a reconfigurable supercomputing platform. In: Proceedings of 1st international workshop on high-performance reconfigurable computing technology and applications, pp 39–48

Storaasli O, Yu W, Strenski D, Maltby J (2007) Performance evaluation of FPGA-based biological applications. In: Proceedings of cray users group, Seattle

Lloyd S, Snell QO (2008) Sequence alignment with traceback on reconfigurable hardware. In: Proceedings of the 2008 international conference on reconfigurable computing and FPGAs (ReConFig’08), pp 259–264

Paracel Corporation (2004) Applied high-performance computing, revision 6.1. http://www.paracel.com . Accessed 20 Mar 2016

Compugen Ltd. (1994) The bioccelerator for GCG users. Mol Biotechnol 2(2):205. doi: 10.1007/BF02824817

Pfeiffer G, Baumgart S, Schroder J, Schimmler M (2009) A massively parallel architecture for bioinformatics. In: Proceedings of the 9th international conference on computer science, pp 994–1003

TimeLogic Ltd. (2010) TimeLogic Biocomputing Engine Introduction. http://www.timelogic.com/decypher_intro.html

Convey (2010) Convey website. http://www.convey.com

Vermij E (2011) Genetic sequence alignment on a supercomputing platform, MS Thesis, TU Delft, Netherlands

Aldinucci M, Meneghin M, Torquati M (2010) Efficient Smith-Waterman on multi-core with fastflow. J Parallel Distrib Netw Based Proc PDP 1(3):195–199

Manavski SA, Valle G (2008) CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment. BMC Bioinf 9(2):10–19

Liu Y, Schmidt B, Maskell DL (2009) Parallel reconstruction of neighbor-joining trees for large multiple sequence alignments using CUDA. In: Proceedings of IEEE international symposium on parallel and distributed processing, pp 1–8

Ligowski L, Rudnicki W (2009) An efficient implementation of Smith-Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases. In: Proceedings of the 23rd IEEE international parallel and distributed processing symposium (IPDPS ’09), pp 1–8

Alachiotis N, Berger SA, Stamatakis A (2011) Accelerating phylogeny-aware short DNA read alignment with FPGAs. In: IEEE international symposium on field-programmable custom computing machines, pp 226–233

Olson CB, Kim M, Clauson C, Kogon B (2012) Hardware acceleration of short read mapping. IEEE Int Symp Field Program Custom Comput Mach 282(1):161–168

Preuber TB, Knodel O, Spallek RG (2012) Short-read mapping by a systolic custom FPGA computation. IEEE Int Symp Field Program Custom Comput Mach 282(1):169–176

Tang W, Wang W, Duan B et al (2012) Accelerating millions of short reads mapping on a heterogeneous architecture with FPGA accelerator. IEEE Int Symp Field Program Custom Comput Mach 282(1):184–187

Chen P, Wang C, Li X, Zhou X (2014) Accelerating the next generation long read mapping with the FPGA-based system. IEEE/ACM Trans Comput Biol Bioinf 11(5):840–852

Sebastiao N, Roma N, Flores P (2012) Integrated hardware architecture for efficient computation of the n-Best bio-sequence local alignments in embedded platforms. IEEE Trans Very Large Scale Integr VLSI Syst 20(7):1262–1275

Liu Y, Schmidt B, Maskell DL (2009) MSA-CUDA: multiple sequence alignment on graphics processing units with CUDA. In: Proceedings of the 20th IEEE international conference on application specific systems, architectures and processors, pp 121–128

Liu Y, Maskell D, Schmidt B (2009) CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units. BMC Res Notes 2(1):73–82

Liu Y, Schmidt B, Maskell DL (2010) CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA enabled GPUs based on SIMT and virtualized SIMD abstractions. BMC Res Notes 3(1):93. doi: 10.1186/1756-0500-3-93

Liu Y, Wirawan A, Schmidt B (2013) CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions. BMC Bioinf 14(1):117

Liu Y, Hong Y, Lin CY, Hung CL (2015) Accelerating Smith-Waterman alignment for protein database search using frequency distance filtration scheme based on CPU-GPU collaborative system. Int J Genom Hindawi Publ Corp 2015:12. doi: 10.1155/2015/761063

Hasan L, Kentie M, Al-Ars Z (2011) GPU-accelerated protein sequence alignment. In: Proceedings of the 33rd annual international conference of the IEEE engineering in medicine and biology society (EMBS ’11), pp 2442–2446

Hains D, Cashero Z, Ottenberg M, Bohm W, Rajopadhye S (2011) Improving CUDASW++, a parallelization of smithwaterman for CUDA enabled devices. In: Proceedings of the 25th IEEE international parallel and distributed processing symposium, workshops and Phd forum (IPDPSW’11), pp 490–501

NCBI (2016) GenBank growth statistics. http://ww.ncbi.nlm.nih.gov/genbank/statistics/ . Accessed 21 Aug 2016

ClustalW 2.0 (2016) Center of bioinformatics. http://www.cbi.pku.edu.cn/chinese/documents/bioinfor/overview/web4/1.html . Accessed 21 Aug 2016

Farrar M Striped Smith–Waterman speeds database searches six times over other SIMD implementations

Rognes T (2011) Faster Smith-Waterman database searches with inter-sequence SIMD parallelization. BMC Bioinf 12:221

Jaleel A, Mattina M, Jacob B (2006) Last level cache (LLC) performance of data mining workloads on a CMP-a case study of parallel bioinformatics workloads. In: Proceedings of the twelfth international symposium on high-performance computer architecture (23):88–98

Wang D, Tang ZM (2004) The implementation and analysis of Smith-Waterman algorithm on systolic array. Chin J Comput 27(1):12–20