BLM-17m: A Large-Scale Dataset for Black Lives Matter Topic Detection on Twitter
- Resource Type: Conference
- Authors: Kemik, Hasan; Ozates, Nusret; Asgari-Chenaghlou, Meysam; Li, Yang; Cambria, Erik
- Source: 2023 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 736-743, Dec. 2023
- Subject: Computing and Processing; Ethics; Social networking (online); Navigation; Blogs; Machine learning; Media; Monitoring; BlackLivesMatter; BLM; Sentiment Analysis; Natural Language Processing; AI; Social Media
- Language
- ISSN: 2375-9259
The protection of human rights is one of the most important problems of the modern world. In this paper, we construct a Twitter dataset covering one of the most significant human rights issues of recent years, one that affected the whole world: the George Floyd incident. We propose a labeled dataset for topic detection that contains about 17 million tweets. The tweets were collected from 25 May 2020 to 21 August 2020, covering about 90 days from the start of the incident. We labeled the dataset by monitoring the top trending news topics in global and local newspapers, and we used TF-IDF and LDA as baselines. We evaluated the results of these two methods at three different values of k, in terms of precision, recall, and F1-score.
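
The abstract does not spell out the evaluation protocol, so the following is only a minimal sketch of how a TF-IDF baseline might be scored at several values of k. It assumes that each method yields a ranked list of candidate topic terms and that the labels are a set of gold topic terms. The function names (`tfidf_top_k`, `precision_recall_f1_at_k`), the smoothed IDF formula, the toy documents, and the k values are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_top_k(docs, k):
    """Rank terms by summed TF-IDF over tokenized documents; return the top k.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, is used here
    purely for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        tf = Counter(doc)
        for term, count in tf.items():
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            scores[term] += (count / len(doc)) * idf
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


def precision_recall_f1_at_k(predicted, gold, k):
    """Score the first k predicted terms against a set of gold topic terms."""
    hits = len(set(predicted[:k]) & set(gold))
    precision = hits / k if k else 0.0
    recall = hits / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy tokenized "tweets" and gold labels (hypothetical data).
    docs = [
        ["protest", "police", "protest"],
        ["protest", "statue"],
        ["police", "reform"],
    ]
    gold = {"protest", "reform"}
    ranked = tfidf_top_k(docs, k=4)
    for k in (1, 2, 3):  # placeholder k values; the abstract does not name them
        p, r, f1 = precision_recall_f1_at_k(ranked, gold, k)
        print(f"k={k}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

An LDA baseline could be scored with the same `precision_recall_f1_at_k` function by passing in the top-ranked words of each inferred topic instead of the TF-IDF ranking.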