Toward a Personalized Clustered Federated Learning: A Speech Recognition Case Study

Resource Type: Periodical
Authors: Farahani, B.; Tabibian, S.; Ebrahimi, H.
Source: IEEE Internet of Things Journal (IEEE Internet Things J.), 10(21):18553-18562, Nov. 2023
Subject: Computing and Processing
Communication, Networking and Broadcast Technologies
Speech recognition
Training
Hidden Markov models
Data privacy
Federated learning
Data models
Servers
Artificial intelligence (AI)
federated learning (FL)
Internet of Things (IoT)
personalization
privacy-preserving machine learning (PPML)
speech recognition
Language
ISSN: 2327-4662; 2372-2541

Abstract

Most speech recognition systems rely on cloud computing for model training and updates. Speech data is personally identifiable information (PII) and carries personal, privacy-sensitive, and regulated content. Relying on centralized servers or third parties can expose confidential data and lead to privacy breaches. Consequently, privacy concerns and strict regulations (e.g., the EU’s General Data Protection Regulation (GDPR), California’s CCPA, and the Privacy Act in Australia) limit the availability of large data sets. This scarcity is particularly pronounced for less-represented languages such as Persian, and it adversely affects innovation and data-driven product development. To overcome both data scarcity and privacy concerns, we propose, for the first time, a novel federated learning (FL) solution for Persian Spoken Isolated Digit Recognition. The proposed technique bridges the gap between privacy and utility by training a model on decentralized data sets stored on edge devices or servers, without exchanging the data itself. Non-independent and identically distributed (non-IID) data, such as unique speaker accents, poses a challenge in speech recognition, especially in an FL setup, yet existing techniques and methodologies have largely overlooked it. To address this, we present a personalized clustered FL (PCFL) approach that exploits similarities among the private data distributions and captures the distinctive characteristics of each client’s data when training models. The experimental results show that the proposed solution addresses privacy concerns while incurring a negligible performance loss compared with centralized model training.
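
The abstract does not spell out the PCFL algorithm, so the following is only a minimal illustrative sketch of the general idea it describes: group clients whose locally trained weights look similar, average within each group without sharing raw data, and then blend the group model with each client's own weights. The greedy cosine-similarity clustering, the `threshold` and `alpha` parameters, and all function names are assumptions made for illustration, not the authors' method.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fed_avg(weights, counts):
    """Sample-count-weighted average of client weight vectors (FedAvg-style)."""
    total = sum(counts)
    dim = len(weights[0])
    return [sum(w[i] * n for w, n in zip(weights, counts)) / total
            for i in range(dim)]


def cluster_clients(weights, threshold=0.9):
    """Greedy clustering (an assumption): a client joins the first cluster
    whose seed client is at least `threshold` similar, else starts a new one."""
    clusters = []  # lists of client indices; the first index is the seed
    for idx, w in enumerate(weights):
        for members in clusters:
            if cosine(w, weights[members[0]]) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


def personalized_round(weights, counts, threshold=0.9, alpha=0.5):
    """One aggregation round: cluster, average per cluster, then mix each
    client's local weights with its cluster model (alpha = local share)."""
    personalized = [None] * len(weights)
    for members in cluster_clients(weights, threshold):
        cluster_model = fed_avg([weights[i] for i in members],
                                [counts[i] for i in members])
        for i in members:
            personalized[i] = [alpha * local + (1 - alpha) * shared
                               for local, shared in zip(weights[i], cluster_model)]
    return personalized


# Toy example: four clients with two-dimensional "models" and sample counts.
client_weights = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
client_counts = [10, 10, 20, 20]
print(cluster_clients(client_weights))  # [[0, 1], [2, 3]]
print(personalized_round(client_weights, client_counts))
```

Only weight vectors and sample counts reach the aggregation step here, which mirrors the abstract's point that training proceeds without exchanging the underlying speech data. A real system would replace the toy vectors with model parameters produced by local training on each device.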
