In big data systems with many devices, a key challenge is how to configure the devices under system constraints such as an energy budget. Typical scenarios include large deployments of wearable health sensors or Internet of Things (IoT) devices. To address this challenge, we propose a novel deep reinforcement learning framework that configures the system for optimal operation by learning automatically from actions and feedback. Our contributions are two-fold, and both substantially improve performance. First, we propose a system characterization approach that extracts patterns from the states of many devices to determine whether the system state has changed significantly. Second, we propose a knowledge adaptation approach that determines when to update the learning buffer and how to select a learning batch. We further investigate the effect of hyperparameters and training choices, including learning rate, batch normalization, and regularization methods, on deep reinforcement learning outcomes. Evaluated on a multi-device system setup, the proposed framework demonstrates a significant performance improvement with these designs. This study advances intelligent configuration for systems with many wearables or IoT devices, in support of big data practice.