Deep Reinforcement Learning (DRL) combines deep neural networks with reinforcement learning, training an agent to make decisions from environmental feedback. Expected value, as a powerful mathematical tool, is widely used to train DRL networks. However, the expected values often deviate from the actual values obtained from the environment, and these deviations accumulate within the DRL network over the course of training. The accumulated deviations slow training and degrade the network's stability. To address these issues, this paper proposes a new DRL training method called Punisher. The principle behind Punisher is to identify the bad actions taken by the DRL agent and correct only those actions during training. By focusing corrections on the bad actions, Punisher aims to improve the overall performance and stability of the DRL network. The experimental results demonstrate that Punisher achieves excellent performance, faster training, and greater network stability, making it a promising approach for efficiently training DRL agents across a variety of applications. The experiments presented in the main content of this paper are available on GitHub at https://github.com/Jimmyoungyi/Punisher.
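The "correct only the bad actions" idea can be illustrated with a minimal sketch. This is an illustrative interpretation of the abstract, not the paper's actual implementation: the function names (`select_bad_actions`, `corrective_loss`) and the use of a simple expected-vs-actual value comparison with a `threshold` parameter are assumptions introduced here.

```python
# Hypothetical sketch of correcting only "bad actions": steps whose actual
# return fell short of the expected value. Names and the threshold test are
# illustrative assumptions, not taken from the paper.

def select_bad_actions(expected_values, actual_returns, threshold=0.0):
    """Return indices of steps where the expected value exceeds the actual
    return by more than `threshold` -- the 'bad actions' singled out for
    correction."""
    return [i for i, (v, g) in enumerate(zip(expected_values, actual_returns))
            if v - g > threshold]

def corrective_loss(expected_values, actual_returns, bad_idx):
    """Mean squared deviation computed only over the flagged steps, so the
    update targets the bad actions instead of every transition."""
    if not bad_idx:
        return 0.0
    return sum((expected_values[i] - actual_returns[i]) ** 2
               for i in bad_idx) / len(bad_idx)

# Toy trajectory: per-step expected values vs. actual returns.
expected = [1.0, 0.5, 2.0, 1.5]
actual   = [1.2, 0.1, 2.0, 0.5]
bad = select_bad_actions(expected, actual)      # steps 1 and 3 fall short
loss = corrective_loss(expected, actual, bad)   # averaged over bad steps only
```

In a full training loop, `loss` would back-propagate only through the flagged transitions, leaving the agent's good actions untouched.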