Finite-Sample Bounds for Adaptive Inverse Reinforcement Learning Using Passive Langevin Dynamics
- Resource Type
- Conference
- Authors
- Snow, Luke; Krishnamurthy, Vikram
- Source
- 2023 62nd IEEE Conference on Decision and Control (CDC), pp. 3618-3625, Dec. 2023
- Subject
- Computing and Processing
- Power, Energy and Industry Applications
- Robotics and Control Systems
- Heuristic algorithms
- Reinforcement learning
- Markov processes
- Cost function
- Approximation algorithms
- Real-time systems
- Probability distribution
- Language
- ISSN
- 2576-2370
Stochastic gradient Langevin dynamics (SGLD) is a useful methodology for sampling from probability distributions. This paper provides a finite-sample analysis of a passive stochastic gradient Langevin dynamics (PSGLD) algorithm designed to achieve inverse reinforcement learning. By "passive," we mean that the noisy gradients available to the PSGLD algorithm (the inverse learning process) are evaluated at points chosen randomly by an external stochastic gradient algorithm (the forward learner). The PSGLD algorithm acts as a randomized sampler that recovers the cost function being optimized by this external process. Previous work analyzed the asymptotic performance of this passive algorithm using stochastic approximation techniques; this work analyzes its non-asymptotic performance. Specifically, we provide finite-time bounds on the 2-Wasserstein distance between the passive algorithm's iterates and its stationary measure, from which the reconstructed cost function is obtained.
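To make the passive setting concrete, the following is a minimal one-dimensional sketch of a kernel-weighted passive Langevin sampler, not the paper's algorithm or constants. It assumes a Gaussian kernel, a toy quadratic cost, and a uniformly distributed stream of query points standing in for the external forward learner; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cost C(x) = (x - 2)^2 with minimizer x* = 2. The passive learner never
# evaluates C or its gradient itself; it only receives noisy gradients
# computed at query points chosen by the external (forward) process.
def noisy_grad(x):
    return 2.0 * (x - 2.0) + 0.1 * rng.standard_normal()

beta, eps, delta = 10.0, 0.01, 0.5  # inverse temperature, step size, kernel bandwidth
theta = 0.0                         # passive sampler's iterate
samples = []

for _ in range(20000):
    x = rng.uniform(-2.0, 6.0)      # query point picked by the external process
    g = noisy_grad(x)               # the only information the passive learner receives
    # Gaussian kernel weight: the gradient influences theta only when the
    # query point lands near the sampler's current iterate.
    w = np.exp(-0.5 * ((x - theta) / delta) ** 2) / (delta * np.sqrt(2.0 * np.pi))
    # Weighted Langevin step: kernel-modulated drift plus injected Gaussian noise.
    theta += -eps * w * g + np.sqrt(2.0 * eps / beta) * rng.standard_normal()
    samples.append(theta)

burn_in = np.array(samples[5000:])
print(burn_in.mean())  # the empirical law concentrates near the minimizer x* = 2
```

The kernel weight is what makes the scheme "passive": rather than querying the gradient at its own iterate, the sampler reweights whatever gradient evaluations the forward learner happens to produce, and its long-run empirical distribution encodes the cost being optimized.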