학술논문

Home

자료검색

학술논문

검색결과 돌아가기

검색화면

내보내기 프린트

A Distributed Algorithm for Multi-Armed Bandit with Homogeneous Rewards over Directed Graphs

Resource Type: Conference
Authors: Zhu, Jingxuan; Liu, Ji
Source: 2021 American Control Conference (ACC) American Control Conference (ACC), 2021. :3038-3043 May, 2021
Subject: Aerospace
Bioengineering
Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Knowledge engineering
Sufficient conditions
Decision making
Directed graphs
Random variables
Distributed algorithms
Language
ISSN: 2378-5861

Online Access

Full Text (IEEE)

초록

This paper studies a distributed multi-armed bandit problem over a network of $N$ agents whose neighbor relationships are described by a directed graph G. Each agent makes a sequence of decisions on selecting an arm from $M$ candidates, yet it only receives a sample of the reward for each decision, which is an unknown random variable. A distributed upper confidence bound (UCB) algorithm is proposed for the agents to cooperatively learn the best decision. It is shown that when all the agents share a homogeneous distribution of each arm reward and G is strongly connected, the algorithm achieves a guaranteed logarithmic regret for all $N$ agents. A sufficient condition under which the proposed distributed algorithm learns faster than the classical single-agent UCB1 is also provided.

공지

DAU Library

학술논문

요약정보

A Distributed Algorithm for Multi-Armed Bandit with Homogeneous Rewards over Directed Graphs

Online Access

초록