Programmable switches allow the data plane to be programmed to define how packets are processed, which brings flexibility to network management tasks such as packet scheduling and flow measurement. Existing studies focus on program deployment at a single switch, while deployment across the whole data plane remains a challenging issue. In this paper, we present RED, a Resource-Efficient and Distributed program deployment solution for programmable switches. First, we compile the data plane programs to estimate their resource utilization and divide them into two categories for further processing. Then, the proposed merging and splitting algorithms are selectively applied to merge or split the pending programs. Finally, we consolidate the scarce resources of the whole data plane to deploy the programs. Extensive experimental results show that 1) RED achieves a speedup of two orders of magnitude over P4Visor and merges 58.64% more nodes than SPEED; 2) RED enables overwhelmed programs to run normally at a single switch and reduces the latency of inter-device scheduling by 3%; 3) RED achieves network-wide resource balancing in a distributed way.