Eye movement recognition has emerged as a pivotal research focus, particularly in fields such as human-computer interaction, healthcare diagnostics, and adaptive technologies, owing to its potential to enhance quality of life, especially for people with physical impairments. However, deep learning models that rely on non-intrusive cameras to recognize and classify eye movements are hindered by environmental, physiological, and technical factors, including unpredictable lighting, noise, head movements, and inherent differences among individuals. In response to these challenges, this study presents an in-depth comparison of the ViT vit-base-patch16-224-in21k model against traditional deep learning models, ResNet18 and AlexNet, all adapted and optimized for our collected dataset of diverse eye movements from eight participants, captured under varied environmental and physiological conditions. The evaluation criteria were accuracy, inference time, and memory footprint. The findings indicate that the ViT model delivers a balanced performance, effectively handling the intricacies of the multi-class eye movement dataset while remaining efficient at inference: ViT and ResNet18 were roughly equally accurate, but ViT was faster, while ResNet18 used less memory; AlexNet was less accurate, with speed and memory use between the two. ViT achieved an average inference time of 0.0588 seconds per image, making it promising for latency-sensitive applications. This study underscores the importance of weighing both predictive performance and computational demands when choosing models for eye movement recognition and offers insights to guide future research.
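As a minimal illustration of the kind of measurement referred to above, the sketch below loads the vit-base-patch16-224-in21k checkpoint through the Hugging Face transformers API and times a single forward pass. The class count, image path, and single-image timing are illustrative assumptions, not the paper's actual pipeline.

```python
import time
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTForImageClassification

# Hypothetical number of eye-movement classes; the actual label set is study-specific.
NUM_CLASSES = 4

# Load the pretrained ViT backbone named in the abstract and attach a fresh
# classification head sized for the eye-movement classes.
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=NUM_CLASSES,
)
model.eval()

# "eye_frame.jpg" is a placeholder for one camera frame of a participant's eye.
image = Image.open("eye_frame.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Time a single forward pass, analogous to the per-image inference time reported above.
with torch.no_grad():
    start = time.perf_counter()
    logits = model(**inputs).logits
    elapsed = time.perf_counter() - start

predicted_class = logits.argmax(-1).item()
print(f"predicted class index: {predicted_class}, inference time: {elapsed:.4f} s")
```

In practice, per-image timings are usually averaged over many frames after a few warm-up passes, since the first forward pass tends to be slower than steady-state inference.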