PCD2Vec: A Poisson Correction Distance Based Approach for Viral Host Classification

Resource Type: Conference
Authors: Ali, Sarwan; Murad, Taslim; Patterson, Murray
Source: 2023 International Joint Conference on Neural Networks (IJCNN), pp. 1-8, Jun 2023
Subject: Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Proteins
Pandemics
RNA
Neural networks
Genomics
Organizations
Coronaviruses
Host Classification
Spike Sequence
Sequence Analysis
Classification
Language
ISSN: 2161-4407

Abstract

Coronaviruses are membrane-enveloped, non-segmented, positive-strand RNA viruses belonging to the Coronaviridae family. They are divided into two subfamilies, Letovirinae and Coronavirinae, with the majority belonging to the latter. Various animal species, mainly mammalian and avian, can be severely infected by coronaviruses; the recent COVID-19 pandemic is one example of the impact of these viruses on human health and the global economy. Building a deeper understanding of these viruses is therefore essential for devising prevention and mitigation mechanisms. Coronaviruses have an invariant genome organization of $\approx 30\text{ kb}$, divided into regions that code for non-structural and structural proteins. Among the structural regions, an essential one is the spike region, whose protein is responsible for attaching the virus to the host cell membrane. Using only the spike protein, instead of the full genome, therefore provides most of the essential information for analyses such as host classification. In this paper, we propose a novel method for predicting the host specificity of coronaviruses by analyzing spike protein sequences from different viral subgenera and species. Our method uses the Poisson correction distance to generate a distance matrix, then applies a radial basis function (RBF) kernel and kernel principal component analysis (kernel PCA) to generate a low-dimensional embedding. Finally, we apply classification algorithms to this embedding to predict the host specificity of coronaviruses. We provide theoretical proofs of the non-negativity, symmetry, and triangle inequality properties of the Poisson correction distance, which are important properties in a machine-learning setting. By encoding the spike protein structure and sequences using this comprehensive approach, we aim to uncover hidden patterns in the biological sequences and make accurate predictions about host specificity. Our classification results show that the method achieves higher predictive accuracy than existing baselines.
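
The pipeline described in the abstract (distance matrix, RBF kernel, kernel PCA, classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It uses the common textbook form of the Poisson correction distance, $d = -\ln(1 - p)$, where $p$ is the fraction of aligned sites at which two sequences differ. The toy sequences, the host labels, the median-based kernel bandwidth, and the 1-nearest-neighbor classifier are all hypothetical choices that do not come from the abstract.

```python
import numpy as np


def poisson_correction_distance(a: str, b: str) -> float:
    """Textbook Poisson correction: d = -ln(1 - p), p = fraction of differing sites."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    p = min(p, 1.0 - 1e-9)  # assumption: cap p so that log(0) never occurs
    return float(-np.log(1.0 - p))


def distance_matrix(seqs):
    """Symmetric matrix of pairwise Poisson correction distances."""
    n = len(seqs)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = poisson_correction_distance(seqs[i], seqs[j])
    return D


def rbf_kernel_from_distances(D, sigma=None):
    """K = exp(-D^2 / (2 sigma^2)); sigma defaults to the median pairwise distance."""
    if sigma is None:
        off_diagonal = D[np.triu_indices_from(D, k=1)]
        sigma = float(np.median(off_diagonal))
        if sigma <= 0.0:
            sigma = 1.0  # fallback when all sequences are identical
    return np.exp(-(D ** 2) / (2.0 * sigma ** 2))


def kernel_pca(K, n_components=2):
    """Kernel PCA on a precomputed kernel: center K, then keep the top eigenvectors."""
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    top = np.argsort(vals)[::-1][:n_components]
    # Clip tiny negative eigenvalues; the kernel is not guaranteed to be PSD.
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


def nearest_neighbor_predict(train_X, train_y, query_X):
    """1-nearest-neighbor classifier in the embedding space (illustrative choice)."""
    diffs = query_X[:, None, :] - train_X[None, :, :]
    nearest = np.argmin(np.linalg.norm(diffs, axis=2), axis=1)
    return [train_y[i] for i in nearest]


# Hypothetical, pre-aligned toy fragments; not real spike sequences.
toy_seqs = [
    "MFVFLVLLPLVS", "MFVFLVLLPLVT",  # labeled, host "human"
    "MKLFLILLSLPA", "MKLFLILLSLPG",  # labeled, host "bat"
    "MFVFLVLLPLAS", "MKLFLILLSLAA",  # unlabeled queries
]
toy_hosts = ["human", "human", "bat", "bat"]

D = distance_matrix(toy_seqs)
Z = kernel_pca(rbf_kernel_from_distances(D), n_components=2)
predicted = nearest_neighbor_predict(Z[:4], toy_hosts, Z[4:])
print(predicted)
```

For simplicity the sketch embeds labeled and query sequences together before classifying, so it is transductive. A held-out evaluation would need an out-of-sample projection instead, for example scikit-learn's `KernelPCA` with `kernel="precomputed"`.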
