Sensing data collection from Internet of Things (IoT) devices lays the foundation for massive IoT applications, such as patient monitoring in smart health and intelligent control in smart manufacturing. Unfortunately, the heterogeneity of IoT devices and dynamic environments results not only in life-cycle latency but also in data collection failures, degrading the quality of experience (QoE) for all users. In this paper, we propose a recovery mechanism with a dynamic data contamination method to handle such failures. To further enhance the long-term overall QoE, we allocate spectrum resources and make contamination decisions for each device using a deep reinforcement learning method. In particular, a lightweight decentralized State-sharing Deep-Recurrent Q-Network (SDRQN) is proposed to find optimal collection policies. Our simulation results indicate that the recurrent unit in SDRQN yields a 10% lower waiting time and a 60% lower task drop rate than a fully-connected design. Compared to a centralized DQN scheme, SDRQN achieves a similarly ultra-low drop rate of 0.29% while requiring only 1% of the GPU memory, demonstrating the effectiveness of SDRQN in large-scale heterogeneous IoT networks.