Policy distillation is a model compression method for deep reinforcement learning, typically used to deploy policies on mobile devices, where power consumption and inference time must be reduced. However, obtaining faithful and stable distilled policies is challenging, which impedes higher compression ratios. In this work, we develop two policy distillation algorithms to address this problem. Our first algorithm, Ensemble Policy Distillation (EPD), adapts an idea from distillation in supervised learning: an ensemble of teacher networks provides diverse supervision for a compact student policy network. In the Deep Q-Network (DQN) framework, our experiments show that highly compressed student networks distilled with EPD can even outperform their teachers on numerous Atari games. Additionally, we analyze how the data distribution mismatch caused by the teacher ensemble in EPD negatively impacts the teachers' learning, and introduce our second algorithm, Double Policy Distillation (DPD), as a novel method to mitigate this mismatch. Empirical results show that DPD improves both the teachers' learning and the student's distillation on Atari games and continuous control tasks.
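To make the ensemble-supervision idea concrete, the following is a minimal sketch of an EPD-style distillation loss. It is illustrative only: the function names and interfaces are hypothetical, and the specific loss form (KL divergence between the student's softened action distribution and the average of the teachers' softened distributions over Q-values) is one plausible instantiation, not necessarily the exact objective used in the paper.

```python
import numpy as np

def softmax(q, tau=0.01):
    # Temperature-softened softmax over Q-values; tau is a distillation
    # temperature hyperparameter (value here is illustrative).
    z = q / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def epd_distillation_loss(teacher_qs, student_q, tau=0.01):
    """KL(teacher ensemble target || student), averaged over a batch.

    teacher_qs: list of (batch, n_actions) arrays, Q-values from each
                teacher in the ensemble (hypothetical interface).
    student_q:  (batch, n_actions) array of student Q-values.
    """
    # Average the teachers' softened action distributions to form the target.
    target = np.mean([softmax(q, tau) for q in teacher_qs], axis=0)
    pred = softmax(student_q, tau)
    eps = 1e-12  # numerical floor to avoid log(0)
    kl = np.sum(target * (np.log(target + eps) - np.log(pred + eps)), axis=-1)
    return float(np.mean(kl))
```

Minimizing this loss pulls the compact student toward the ensemble's averaged behavior; when a student matches a single teacher exactly, the loss is zero.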