Upstream Sequence Finder-Tool to Find Out Upstream Element in Various Database or Genome.
- Resource Type
- Conference
- Authors
- Jha, Vineet; Mazumder, Mohit; Roy, Susanta
- Source
- 2009 IEEE International Advance Computing Conference (IACC 2009), pp. 1335-1340, Mar. 2009
- Subject
- Computing and Processing
Databases
Genomics
Bioinformatics
Sequences
Proteins
DNA
RNA
Polymers
Biochemistry
Filters
- Language
Upstream elements are highly significant in revealing the properties of a sequence: they not only act as signals to which various proteins bind, but also help in locating hidden sequences and their properties, such as the TATA box. The idea behind this algorithm is to find upstream sequences that carry hidden properties, much as road signs alert drivers. In this case, the proteins help the user to predict and analyse the upstream sequences. We downloaded the database file (a nucleotide file) and the query file and ran nBLAST. We then parsed the BLAST output to extract full-length sequences (sequences that are not truncated by more than 11 bases at either the 5' or the 3' end). The time complexity of the algorithm was improved from exponential to linear by using a divide-and-conquer approach, in which the large database file is divided into smaller files. The algorithm gives good hits and extracts the upstream element. One can also choose between gapped and ungapped alignment against the database.
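The two mechanical steps named in the abstract, the full-length filter and the splitting of the database file, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the BLAST hits are available as tab-separated rows with the twelve standard tabular columns plus the query length as a thirteenth column (as BLAST+ produces with `-outfmt "6 std qlen"`), and it reads "truncated" as bases of the query missing from the alignment at either end. The function names and the `records_per_chunk` parameter are hypothetical.

```python
from typing import Iterable, Iterator, List

# Threshold from the abstract: at most 11 bases may be missing at either end.
MAX_TRUNCATION = 11


def is_full_length(qstart: int, qend: int, qlen: int,
                   max_trunc: int = MAX_TRUNCATION) -> bool:
    """True if the aligned query region misses at most max_trunc bases per end."""
    missing_5p = qstart - 1      # query bases before the alignment starts
    missing_3p = qlen - qend     # query bases after the alignment ends
    return missing_5p <= max_trunc and missing_3p <= max_trunc


def filter_full_length(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield tabular BLAST rows (12 standard columns + qlen) that are full length."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue             # skip blank lines and comment lines
        fields = line.split("\t")
        # Assumed column layout: qstart = column 7, qend = column 8, qlen = column 13.
        qstart, qend, qlen = int(fields[6]), int(fields[7]), int(fields[12])
        if is_full_length(qstart, qend, qlen):
            yield fields


def split_fasta(lines: Iterable[str], records_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of FASTA lines, each holding at most records_per_chunk records."""
    chunk: List[str] = []
    count = 0
    for line in lines:
        if line.startswith(">"):
            if count == records_per_chunk:
                yield chunk      # current chunk is full; start a new one
                chunk, count = [], 0
            count += 1
        chunk.append(line)
    if chunk:
        yield chunk              # emit the final, possibly smaller, chunk
```

Each chunk yielded by `split_fasta` could be written to its own file and searched separately, with the hits from every search then passed through `filter_full_length`. How the original tool sized or merged its smaller files is not stated in the abstract.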