Reinforcement learning has been used to solve sequential decision-making problems in intelligent systems. However, current RL approaches suffer from slow convergence and reward sparsity, and their reward mechanisms struggle to encode complex task specifications. As a formalism widely used in software engineering practice, temporal logic can describe non-Markovian task specifications, and the strategies synthesized from it can serve as prior knowledge to train agents to interact with the environment efficiently. This paper considers an intelligent agent that reacts to its environment under a high-level reactive temporal logic specification called Generalized Reactivity of rank 1 (GR(1)). We first use the strategy synthesized from the GR(1) specification to construct a Markov Decision Process with a potential-based reward machine, which integrates the environment with the high-level reactive temporal specification. We then develop a topological-sort-based reward shaping approach to calculate the potential functions of the reward machine, based on which we use Q-learning to train the agents. Experiments on multitask learning show that the proposed approach outperforms state-of-the-art algorithms in learning speed and obtained rewards. Moreover, unlike value-iteration-based reward shaping approaches, our topological-sort-based approach can handle cases where the synthesized strategies take the form of directed cyclic graphs.
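To make the shaping idea concrete, the following is a minimal sketch, not the paper's exact algorithm: it assigns each reward-machine state a potential based on its graph distance to an accepting state (computed by a backward traversal, which also works when the strategy graph contains cycles), and then applies standard potential-based shaping r' = r + γΦ(u') − Φ(u). The potential scheme γ^d and all identifiers here are illustrative assumptions.

```python
from collections import defaultdict, deque

GAMMA = 0.9  # discount factor (illustrative value)

def topological_potentials(edges, accepting, gamma=GAMMA):
    """Assign each reward-machine state a potential gamma**d, where d is the
    shortest number of transitions to an accepting state.
    Hypothetical scheme: the paper's actual potential function may differ."""
    # Build the reverse adjacency list so we can walk backward from
    # accepting states; this traversal is well-defined even on cyclic graphs.
    rev = defaultdict(list)
    states = set()
    for u, v in edges:
        rev[v].append(u)
        states.update((u, v))
    dist = {s: None for s in states}
    queue = deque()
    for a in accepting:
        dist[a] = 0
        queue.append(a)
    # BFS on the reversed graph yields each state's remaining distance;
    # on a DAG this matches processing states in reverse topological order.
    while queue:
        v = queue.popleft()
        for u in rev[v]:
            if dist[u] is None:
                dist[u] = dist[v] + 1
                queue.append(u)
    # States closer to acceptance receive higher potential;
    # unreachable states get potential 0.
    return {s: (gamma ** d if d is not None else 0.0)
            for s, d in dist.items()}

def shaped_reward(r, u, u_next, phi, gamma=GAMMA):
    """Potential-based reward shaping: r' = r + gamma*phi(u') - phi(u)."""
    return r + gamma * phi[u_next] - phi[u]
```

For example, on a three-state chain q0 → q1 → q2 with q2 accepting, the potentials are Φ(q2) = 1, Φ(q1) = 0.9, Φ(q0) = 0.81, so transitions that advance toward acceptance receive a positive shaping bonus.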