Our paper was accepted to NIPS'16
Source: Huasen Wu / University of California, Davis
2016-08-13

Our paper "Double Thompson Sampling for Dueling Bandits" was accepted to NIPS'16, one of the top conferences in machine learning.

In this paper, we propose a Double Thompson Sampling (D-TS) algorithm for dueling bandit problems. As its name indicates, D-TS selects both the first and the second candidates according to Thompson Sampling. Specifically, D-TS maintains a posterior distribution for the preference matrix and chooses the pair of arms to compare by sampling twice from that posterior. This simple algorithm applies to general Copeland dueling bandits, which include Condorcet dueling bandits as a special case. For general Copeland dueling bandits, we show that D-TS achieves O(K^2 log T) regret. For Condorcet dueling bandits, we further simplify the D-TS algorithm and show that the simplified version achieves O(K log T + K^2 log log T) regret. Simulation results on both synthetic and real-world data demonstrate the efficiency of the proposed D-TS algorithm.
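
The "sample twice" idea can be illustrated with a rough Python sketch. This is a simplified illustration and not the algorithm as specified in the paper: it assumes independent Beta posteriors with a uniform prior on each pairwise preference, picks the first candidate by its Copeland score under the first sample, and leaves out any further candidate-pruning steps the full algorithm may use. The names `dts_select_pair` and `wins` are hypothetical.

```python
import random


def dts_select_pair(wins, rng):
    """Choose a pair of arms by sampling twice from Beta posteriors.

    wins[i][j] is the number of times arm i has beaten arm j so far.
    Simplified sketch: assumes a uniform Beta(1, 1) prior per pair.
    """
    k = len(wins)

    # First sample: draw a full preference matrix from the posterior.
    theta = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            theta[i][j] = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            theta[j][i] = 1.0 - theta[i][j]

    # Copeland score under the sample: how many arms each arm beats.
    scores = [sum(theta[i][j] > 0.5 for j in range(k)) for i in range(k)]
    top = max(scores)
    first = rng.choice([i for i in range(k) if scores[i] == top])

    # Second sample: redraw only the preferences against the first candidate
    # and pick the arm most likely (under this draw) to beat it.
    theta2 = [
        0.5 if i == first
        else rng.betavariate(wins[i][first] + 1, wins[first][i] + 1)
        for i in range(k)
    ]
    second = max(range(k), key=lambda i: theta2[i])
    return first, second


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical history: no comparisons observed yet among 4 arms.
    wins = [[0] * 4 for _ in range(4)]
    print(dts_select_pair(wins, rng))
```

In this sketch the second candidate may coincide with the first. That happens when no other arm looks likely to beat the first candidate under the second draw, so the sketch ends up comparing that arm against itself.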


A preliminary version can be found at https://arxiv.org/abs/1604.07101.

