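Evaluation of the performance of classification methods combined with different feature selection methods on microarray gene expression data (original Turkish title: Mikrodizi gen ifade verilerinde farklı öznitelik seçim yöntemleri ile sınıflama yöntemlerinin performanslarının değerlendirilmesi)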

Authors: Arik, Özlem
Subject: Biostatistics; Genetics
Language: Turkish

Online Access

Open Access (OpenAIRE)

Abstract

Bioinformatics is an interdisciplinary field that combines statistics, biology, computing, mathematics, and genetics; it makes it possible to show which abnormalities cause which disease. For cancer, the disease of our age, the earlier it is detected, the higher the chance of recovery. Diagnosis and classification based on microarray gene expression data, together with identification of the genes involved in cancer, are therefore important for early diagnosis.

In this thesis, microarray gene expression data for lung, kidney, lymphoma, cervical, prostate, breast, and leukemia cancer types were studied. Because the data have a large number of features, the varFilter, nsFilter, rf, lasso, rfe, and limma feature selection methods were applied in order to work with fewer features. On the filtered datasets, classification models were built with Naive Bayes, Support Vector Machines, k-Nearest Neighbor, and Artificial Neural Networks, as well as with Deep Learning, which has gained popularity in recent years. Accuracy, sensitivity, specificity, and area under the ROC curve (AUC) were computed to show which classification methods each feature selection method works best with and to compare the performance of the resulting models.

In general, models built on features selected by lasso and limma were more successful than those built with the other feature selection methods. Deep Learning mostly outperformed the classical data mining classifiers, while Artificial Neural Networks performed worse than the other classification methods. Deep Learning models were also built without any feature selection, and their performance was compared with that of the Deep Learning models built after feature selection. In addition, the same steps were carried out on four simulated datasets, and the results were similar to those on the real datasets.

142
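
The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's code, and the record does not say which software was used; the function names `binary_metrics` and `auc` and the toy labels and scores are hypothetical and chosen only for illustration. The sketch assumes a two-class problem with both classes present.

```python
from typing import Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict:
    """Accuracy, sensitivity and specificity from 0/1 labels (1 = positive class).

    Assumes at least one positive and one negative sample, otherwise a
    denominator below would be zero.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample has the
    higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = tumor sample, 0 = normal sample.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

print(binary_metrics(y_true, y_pred))
print(auc(y_true, scores))
```

Accuracy, sensitivity, and specificity depend on the chosen cutoff (0.5 here, an assumption), whereas AUC summarizes ranking quality across all cutoffs, which is one reason studies of this kind report them together.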
