Reinforcement learning has recently been applied to a variety of fields and, in some tasks, has outperformed humans. It is attracting particular attention in smart factories and robotics, which require automatic control without human intervention. In this paper, we aim to let multiple reinforcement learning agents each learn an optimal control policy on their own IoT device, where all devices are of the same type. An agent that has learned the optimal control policy on one IoT device is not guaranteed to control other IoT devices optimally. Reinforcement learning must therefore be performed separately for each IoT device, which costs considerable time and money. To solve this problem, we propose a new federated reinforcement learning method. In the proposed method, multiple agents each have an independent IoT device, learn at the same time, and federate with each other to improve learning performance. To this end, we apply a new gradient sharing method and transfer learning to reinforcement learning. As the learning algorithm we use Actor-Critic PPO (Proximal Policy Optimization), which performs well among reinforcement learning algorithms. To support smooth learning in IoT environments where numerous devices exist, we also propose an architecture based on Software-Defined Networking (SDN). Using multiple rotary inverted pendulum devices interconnected via an SDN, we demonstrate that the proposed federated reinforcement learning scheme can effectively facilitate the learning process for multiple IoT devices.