Betting & Gaming | Cases & Studies in Betting & Gaming | Gambling & Speculation

Date: 2015-05-20 01:10 | Hits: 2536

[Article] Accelerated Bayesian learning for decentralized two-armed bandit based decision making with applications to the Goore Game

DocNo of ILP: 643

Doc. Type: Article

Title: Accelerated Bayesian learning for decentralized two-armed bandit based decision making with applications to the Goore Game

Authors: Granmo, OC; Glimsdal, S

Full Name of Authors: Granmo, Ole-Christoffer; Glimsdal, Sondre

Keywords by Author: Bandit problems; Goore Game; Bayesian learning; Decentralized decision making; Quality of service control; Wireless sensor networks

Keywords Plus:

Abstract: The two-armed bandit problem is a classical optimization problem where a decision maker sequentially pulls one of two arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms, and obtaining new information. Bandit problems are particularly fascinating because a large class of real world problems, including routing, Quality of Service (QoS) control, game playing, and resource allocation, can be solved in a decentralized manner when modeled as a system of interacting gambling machines. Although computationally intractable in many cases, Bayesian methods provide a standard for optimal decision making. This paper proposes a novel scheme for decentralized decision making based on the Goore Game in which each decision maker is inherently Bayesian in nature, yet avoids computational intractability by relying simply on updating the hyper parameters of sibling conjugate priors, and on random sampling from these posteriors. We further report theoretical results on the variance of the random rewards experienced by each individual decision maker. Based on these theoretical results, each decision maker is able to accelerate its own learning by taking advantage of the increasingly more reliable feedback that is obtained as exploration gradually turns into exploitation in bandit problem based learning. Extensive experiments, involving QoS control in simulated wireless sensor networks, demonstrate that the accelerated learning allows us to combine the benefits of conservative learning, which is high accuracy, with the benefits of hurried learning, which is fast convergence. In this manner, our scheme outperforms recently proposed Goore Game solution schemes, where one has to trade off accuracy with speed. As an additional benefit, performance also becomes more stable. We thus believe that our methodology opens avenues for improved performance in a number of applications of bandit based decentralized decision making.
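
The core idea in the abstract, Bayesian bandit players that only update conjugate-prior hyperparameters and sample from the resulting posteriors, can be illustrated with a minimal Python sketch. This is not the authors' accelerated scheme: it is plain Thompson sampling with Beta priors, assuming Bernoulli (0/1) rewards, and the `performance` function, its `target` and `width` parameters, and all class and function names are hypothetical choices made for illustration only.

```python
import math
import random


class BetaBanditPlayer:
    """One Goore Game player modeled as a two-armed Bernoulli bandit (assumed setup)."""

    def __init__(self, rng):
        self.rng = rng
        # One [alpha, beta] pair per arm; Beta(1, 1) is a uniform prior.
        self.params = [[1.0, 1.0], [1.0, 1.0]]

    def choose(self):
        # Draw one sample from each arm's Beta posterior and pick the larger.
        samples = [self.rng.betavariate(a, b) for a, b in self.params]
        return 0 if samples[0] > samples[1] else 1

    def update(self, arm, reward):
        # Conjugate update: a reward increments alpha, a penalty increments beta.
        if reward:
            self.params[arm][0] += 1.0
        else:
            self.params[arm][1] += 1.0


def performance(fraction_yes, target=0.3, width=0.2):
    """Hypothetical unimodal reward probability, peaking at `target`."""
    return 0.9 * math.exp(-((fraction_yes - target) / width) ** 2)


def play_goore_game(n_players=10, rounds=2000, seed=0):
    """Run the game and return the fraction of 'yes' votes (arm 1) per round."""
    rng = random.Random(seed)
    players = [BetaBanditPlayer(rng) for _ in range(n_players)]
    history = []
    for _ in range(rounds):
        votes = [p.choose() for p in players]
        fraction_yes = sum(votes) / n_players
        p_reward = performance(fraction_yes)
        # The referee rewards each player independently with the same probability.
        for player, vote in zip(players, votes):
            player.update(vote, rng.random() < p_reward)
        history.append(fraction_yes)
    return history
```

Calling `play_goore_game()` returns the per-round fraction of players voting "yes", which can be compared against `target` to see how the decentralized players behave. Each player sees only its own vote and its own reward, which is the decentralized property the abstract emphasizes.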

OECD Category: Computer and information sciences

Year of Publication: 2013

Business Area: game

Detail Business: game

Country: Netherlands

Study Area:

Name of Journal: APPLIED INTELLIGENCE

Language: English

Author Affiliations: [Granmo, Ole-Christoffer; Glimsdal, Sondre] Univ Agder, Dept Informat & Commun Technol, N-4604 Kristiansand, Norway

Press Address: Granmo, OC (reprint author), Univ Agder, Dept Informat & Commun Technol, POB 422, N-4604 Kristiansand, Norway.

Email Address: ole.granmo@uia.no; sondre.glimsdal@uia.no

Citation:

Funding:

Cited References: Bouhmala N, 2010, ENG APPL ARTIF INTEL, V23, P715, DOI 10.1016/j.engappai.2010.01.009; Cao YU, 1997, AUTON ROBOT, V4, P7, DOI 10.1023/A:1008855018923; Chen D, 2004, 2004 INT C WIR NETW; Dimitrakakis C, 2006, LECT NOTES COMPUT SC, V4131, P850; Gelly S, 2006, P NIPS 2006 NIPS; Google, GOOGL WEBS OPT; Graepel T, 2010, P 27 INT C MACH LEAR, P1320; Granmo OC, 2010, IEEE T COMPUT, V59, P545, DOI 10.1109/TC.2009.189; Granmo OC, 2010, LECT NOTES ARTIF INT, V6098, P199, DOI 10.1007/978-3-642-13033-5_21; Granmo Ole-Christoffer, 2010, International Journal of Intelligent Computing & Cybernetics, V3, DOI 10.1108/17563781011049179; Granmo O-C, 2007, INT J COMPUTER SCI A, V4, P15; Granmo OC, 2007, IEEE T SYST MAN CY B, V37, P166, DOI 10.1109/TSMCB.2006.879012; Gupta N, 2011, P 10 INT C MACH LEAR; Gupta N, 2011, P 31 SGAI INT C ART; Iyer R., 2003, IEEE INT C COMM, P517, DOI 10.1109/ICC.2003.1204230; May BC, 2011, TECHNICAL REPORT; Oommen BJ, 2007, IEEE T COMPUT, V56, P959, DOI 10.1109/TC.2007.1045; Narendra K.S., 1989, LEARNING AUTOMATA IN; Oommen BJ, 2008, GAME THEORY STRATEGI; Scott SL, 2010, APPL STOCH MODEL BUS, V26, P639, DOI 10.1002/asmb.874; Thompson WR, 1933, BIOMETRIKA, V25, P285, DOI 10.2307/2332286; Tsetlin M. L., 1973, AUTOMATION THEORY MO; Tung B., 1996, IEEE T PARALL DISTR, V7, P47; Wang T., 2005, P 22 INT C MACH LEAR, P956, DOI 10.1145/1102351.1102472

Number of Cited References: 24

Publication: SPRINGER

City of Publication: DORDRECHT

Address of Publication: VAN GODEWIJCKSTRAAT 30, 3311 GZ DORDRECHT, NETHERLANDS

ISSN: 0924-669X

29-Character Source Abbreviation: APPL INTELL

ISO Source Abbreviation: Appl. Intell.

Volume: 38

Issue: 4

Start Page: 479

End Page: 488

DOI: 10.1007/s10489-012-0346-z

Number of Pages: 10

Web of Science Category: Computer Science, Artificial Intelligence

Subject Category: Computer Science

Document Delivery Number: 140IE

Unique Article Identifier: WOS:000318646900001


http://www.wokinfo.com (1110)
