Policy distillation is a model compression method for deep reinforcement learning, typically used to deploy policies on mobile devices, where power consumption and inference time must be reduced. However, obtaining faithful and stable distilled policies is challenging, which impedes higher compression ratios. In this work, we develop two policy distillation algorithms to address this problem. Our first algorithm, Ensemble Policy Distillation (EPD), adapts an idea from distillation in supervised learning: an ensemble of teacher networks provides diverse supervision for a compact student policy network. In the Deep Q-Network (DQN) framework, our experiments show that highly compressed student networks distilled with EPD can even outperform their teachers on numerous Atari games. Additionally, we analyze how the data distribution mismatch caused by the teacher ensemble in EPD negatively impacts the teachers' learning, and introduce our second algorithm, Double Policy Distillation (DPD), as a novel method to mitigate this mismatch. Empirical results show that DPD improves both the teachers' learning and the student's distillation on Atari games and continuous control tasks.
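To make the ensemble-supervision idea concrete, the following is a minimal sketch of an EPD-style distillation loss. It is illustrative only: the function names and interfaces are hypothetical, and the specific loss form (KL divergence between the student's softened action distribution and the average of the teachers' softened distributions over Q-values) is one plausible instantiation, not necessarily the exact objective used in the paper.

```python
import numpy as np

def softmax(q, tau=0.01):
    # Temperature-softened softmax over Q-values; tau is a distillation
    # temperature hyperparameter (value here is illustrative).
    z = q / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def epd_distillation_loss(teacher_qs, student_q, tau=0.01):
    """KL(teacher ensemble target || student), averaged over a batch.

    teacher_qs: list of (batch, n_actions) arrays, Q-values from each
                teacher in the ensemble (hypothetical interface).
    student_q:  (batch, n_actions) array of student Q-values.
    """
    # Average the teachers' softened action distributions to form the target.
    target = np.mean([softmax(q, tau) for q in teacher_qs], axis=0)
    pred = softmax(student_q, tau)
    eps = 1e-12  # numerical floor to avoid log(0)
    kl = np.sum(target * (np.log(target + eps) - np.log(pred + eps)), axis=-1)
    return float(np.mean(kl))
```

Minimizing this loss pulls the compact student toward the ensemble's averaged behavior; when a student matches a single teacher exactly, the loss is zero.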