The exploration/exploitation trade-off in Reinforcement Learning for dialogue management
- Resource Type
- Conference
- Authors
- Varges, Sebastian; Riccardi, Giuseppe; Quarteroni, Silvia; Ivanov, Alexei V.
- Source
- 2009 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU 2009). :479-484 Dec, 2009
- Subject
- Computing and Processing
- Signal Processing and Analysis
- Machine learning
- Uncertainty
- Humans
- Engineering management
- Computer science
- Delta modulation
- Noise robustness
- Speech recognition
- Noise level
- Supervised learning
- Language
- English
- Abstract
Conversational systems traditionally use deterministic rules that trigger actions such as requests for confirmation or clarification. More recently, Reinforcement Learning (RL) and (Partially Observable) Markov Decision Processes (POMDPs) have been proposed for this task. In this paper, we investigate action selection strategies for dialogue management, in particular the exploration/exploitation trade-off and its impact on final reward (i.e., the session reward after optimization has ended) and lifetime reward (i.e., the overall reward accumulated over the learner's lifetime). We propose to use interleaved exploitation sessions as a learning methodology to assess the reward obtained from the current policy. The experiments show a statistically significant difference in final reward of exploitation-only sessions between a system that optimizes lifetime reward and one that maximizes the reward of the final policy.
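The interleaved-evaluation idea from the abstract can be sketched in miniature. The code below is an illustrative toy, not the paper's system: it replaces the dialogue MDP with a two-armed bandit, uses epsilon-greedy action selection as the exploration strategy, and periodically runs an exploitation-only (greedy, no-learning) session to assess the current policy's reward. The payoff probabilities, session counts, and epsilon value are all invented for the example.

```python
import random

# Toy 2-armed bandit standing in for a dialogue MDP (illustrative only):
# action 0 pays 1.0 with prob 0.3, action 1 pays 1.0 with prob 0.7.
PAYOFF = {0: 0.3, 1: 0.7}

def pull(action, rng):
    """Sample a reward for the chosen action."""
    return 1.0 if rng.random() < PAYOFF[action] else 0.0

def run(n_sessions=2000, epsilon=0.2, eval_every=100, seed=0):
    rng = random.Random(seed)
    q = {0: 0.0, 1: 0.0}        # value estimates per action
    counts = {0: 0, 1: 0}
    lifetime_reward = 0.0       # reward accumulated over the learner's lifetime
    eval_rewards = []           # rewards of interleaved exploitation-only sessions
    for t in range(1, n_sessions + 1):
        # Learning session: epsilon-greedy trades exploration off
        # against exploitation of the current estimates.
        if rng.random() < epsilon:
            a = rng.choice([0, 1])      # explore
        else:
            a = max(q, key=q.get)       # exploit
        r = pull(a, rng)
        lifetime_reward += r
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]  # incremental mean update
        if t % eval_every == 0:
            # Exploitation-only session: act greedily, learn nothing,
            # purely to assess the reward of the current policy.
            eval_rewards.append(pull(max(q, key=q.get), rng))
    return q, lifetime_reward, eval_rewards

q, lifetime, evals = run()
```

Raising `epsilon` tends to increase the quality of the final policy (more exploration) at the cost of lifetime reward, which is the trade-off the paper quantifies in the dialogue setting.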