Exploration and exploitation are pivotal components of Q-learning, and a balance between the two is crucial for efficient Q-learning procedures. This paper considers Q-learning for the linear quadratic regulation (LQR) task in unknown systems. To avoid overly aggressive exploration or overly conservative exploitation, either of which can cause the LQR task to fail, e.g., through large system overshoot or turn-off effects, we propose a novel approach in which the two components are balanced adaptively. Specifically, we first account for the estimation error of the Q-function in the optimization, which restrains exploration that would otherwise be aggressive under the certainty-equivalence principle adopted in previous studies. Then, to balance exploration and exploitation, we quantify the two components by formulating two objective functions, one representing the interest of exploration and the other that of exploitation. We combine the two functions into a bi-objective optimization problem, which we solve via the bi-criterion method; the solution serves as the regulating signal with balanced exploration and exploitation for the LQR task. Numerical experiments demonstrate that the proposed approach yields robust and stable LQR for systems with significant uncertainty.
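As a minimal sketch of the bi-criterion step (the symbols $J_{\mathrm{exploit}}$, $J_{\mathrm{explore}}$, and the weight $\lambda$ are illustrative placeholders, not the paper's notation), a standard weighted-sum scalarization of the two objectives, both written as costs to be minimized, reads
\[
  u^{\star} \;=\; \arg\min_{u}\; \lambda\, J_{\mathrm{exploit}}(u) \;+\; (1-\lambda)\, J_{\mathrm{explore}}(u),
  \qquad \lambda \in [0,1],
\]
where sweeping $\lambda$ traces the Pareto front between regulation performance and exploration, and an adaptive choice of $\lambda$ would realize the balancing described above.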