A Principal Component Analysis, Sampling and Classifier strategies for dealing with concerns of class imbalance in datasets with a ratio greater than five
- Resource Type
- Conference
- Authors
- Joshi, Aarchit; Kanwar, Kushal; Vaidya, Pankaj; Sharma, Sachin
- Source
- 2022 Second International Conference on Computer Science, Engineering and Applications (ICCSEA), pp. 1-6, Sep. 2022
- Subject
- Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineering Profession
General Topics for Engineers
Nuclear Engineering
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Performance evaluation
Computer science
Principal component analysis
pca
sampling
classifier
majority class
minority class
class imbalance
- Language
This research examines how the imbalance ratio and the choice of classifier affect the performance of resampling techniques on imbalanced data sets. It studies the effect on learning when imbalanced data are transformed into artificially balanced class distributions using a variety of resampling strategies. For significantly imbalanced data sets, analyses using attribute selection, four classifiers, and six performance evaluation criteria show that over-sampling the minority class (95 percent accuracy) consistently outperforms under-sampling the majority class (77 percent accuracy).
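The two resampling directions compared in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific resampling strategies, so plain random over-sampling and random under-sampling are assumed here. The helper names (`imbalance_ratio`, `random_oversample`, `random_undersample`) and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    match the largest class (assumed strategy, not the paper's)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class
    (assumed strategy, not the paper's)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data set: 60 majority rows and 10 minority rows, so the ratio exceeds five.
rows = [(float(i), float(i % 7)) for i in range(70)]
labels = [0] * 60 + [1] * 10

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

print(imbalance_ratio(labels))
print(Counter(over_labels))
print(Counter(under_labels))
```

Over-sampling keeps every original row and grows the data set, while under-sampling discards majority rows. That loss of information is one plausible reason under-sampling can score lower on highly imbalanced data, although the abstract itself does not state the cause.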