A Markov Chain Monte Carlo Sampling Relevance Vector Machine Model for Recognizing Transcription Start Sites
- Resource Type
- Conference
- Authors
- Juncai, Huang; Fengbi, Wang; Huanzhang, Mao; Mingtian, Zhou
- Source
- 2010 International Conference on Artificial Intelligence and Computational Intelligence (AICI). 3:185-188 Oct, 2010
- Subject
- Computing and Processing
Communication, Networking and Broadcast Technologies
Training
Markov processes
Data models
DNA
Genomics
Bioinformatics
Biological system modeling
recognizing TSSs
Relevance vector machines
style
Markov-chain Monte Carlo sampler
candidate feature
- Language
The task of finding transcription start sites (TSSs) can be modeled as a classification problem. Relevance vector machines (RVMs) are a family of machine learning methods that take a Bayesian approach to the training of general linear models (GLMs). Based on a Markov-chain Monte Carlo (MCMC) sampler, we propose a model that uses the RVM to explore very large numbers of candidate features. The model applies the power of the RVM to classifying and detecting interesting points and regions in biological sequence data. The model has been used successfully to predict transcription start sites and other features in genome sequences. Our experimental results on real nucleotide sequence data show that our method greatly improves prediction accuracy and performs significantly better than Promoter Inspector and CpG islands.
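The idea of using an MCMC sampler to explore a large pool of candidate features can be illustrated with a minimal sketch. It is not the authors' model: the k-mer count features, the function names, the toy sequences, and above all the scoring function are assumptions made for illustration. In particular, `subset_log_score` is a simple stand-in (class-mean separation minus a size penalty) for the RVM marginal likelihood that a real implementation would use. The sampler itself is a plain Metropolis scheme that flips one feature in or out of the active set per step.

```python
import math
import random

def kmer_counts(seq, k=2):
    """Count every overlapping k-mer in seq (hypothetical candidate features)."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts

def mean_count(rows, feature):
    """Average count of one feature over a list of per-sequence count dicts."""
    return sum(r.get(feature, 0) for r in rows) / len(rows)

def subset_log_score(subset, pos_rows, neg_rows, penalty=0.5):
    """Stand-in log-score, NOT the RVM evidence: reward features whose mean
    count differs between classes, and penalize each feature included."""
    gain = sum(abs(mean_count(pos_rows, f) - mean_count(neg_rows, f))
               for f in subset)
    return gain - penalty * len(subset)

def sample_feature_subsets(pos_seqs, neg_seqs, k=2, steps=4000, seed=0):
    """Metropolis sampler over feature subsets.

    Returns the fraction of steps in which each feature was in the
    active set (its estimated inclusion frequency)."""
    rng = random.Random(seed)
    pos_rows = [kmer_counts(s, k) for s in pos_seqs]
    neg_rows = [kmer_counts(s, k) for s in neg_seqs]
    features = sorted({f for r in pos_rows + neg_rows for f in r})

    current = set()
    current_score = subset_log_score(current, pos_rows, neg_rows)
    visits = {f: 0 for f in features}

    for _ in range(steps):
        # Symmetric proposal: toggle one randomly chosen feature.
        f = rng.choice(features)
        proposal = current ^ {f}
        proposal_score = subset_log_score(proposal, pos_rows, neg_rows)
        # Accept with probability min(1, exp(score difference)).
        delta = proposal_score - current_score
        if rng.random() < math.exp(min(0.0, delta)):
            current, current_score = proposal, proposal_score
        for g in current:
            visits[g] += 1

    return {f: visits[f] / steps for f in features}

# Toy data (invented): CG-rich "TSS-like" windows vs. AT-rich background.
pos = ["CGCGCGCG", "ACGCGCGT"]
neg = ["ATATATAT", "TTAATTAA"]
freq = sample_feature_subsets(pos, neg)
```

On this toy data, features that separate the two classes well (such as the `CG` dinucleotide) should end up with a higher inclusion frequency than weakly informative ones. In a fuller treatment, the stand-in score would be replaced by the marginal likelihood of an RVM trained on the selected features.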