This work shows that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.

This chapter discusses prediction with expert advice, efficient forecasters for large classes of experts, and randomized prediction for specific losses.

The focus is on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs.

This work analyzes algorithms that predict a binary value by combining the predictions of several prediction strategies, called "experts", and shows how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context.
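
As an illustration of combining binary expert predictions, here is a minimal sketch of a deterministic weighted-majority vote. It is not necessarily the algorithm analyzed in the work summarized above: the function name `weighted_majority`, the penalty factor `beta`, and the tie-breaking rule (ties go to 1) are all assumptions made for this example.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Weighted-majority vote over binary expert predictions (illustrative sketch).

    expert_predictions: list of rounds, each a list of 0/1 predictions, one per expert.
    outcomes: list of true 0/1 outcomes, one per round.
    beta: assumed penalty factor in (0, 1) applied to the weight of each wrong expert.
    Returns (number of forecaster mistakes, final expert weights).
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts  # every expert starts with the same weight
    mistakes = 0
    for preds, y in zip(expert_predictions, outcomes):
        # Total weight behind each of the two possible predictions.
        weight_for_one = sum(w for w, p in zip(weights, preds) if p == 1)
        weight_for_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if weight_for_one >= weight_for_zero else 0  # ties go to 1
        if guess != y:
            mistakes += 1
        # Shrink the weight of every expert that predicted incorrectly.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return mistakes, weights


# Three experts over four rounds; the first expert is always correct.
rounds = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [1, 1, 0]]
truth = [1, 0, 1, 1]
mistakes, final_weights = weighted_majority(rounds, truth)
```

In this toy run the forecaster errs while the two unreliable experts still outweigh the reliable one, then follows the reliable expert once their weights have decayed.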

This paper shows that essentially the same optimized bounds can be obtained when the algorithms adaptively tune their learning rates as the examples in the sequence are progressively revealed, rather than fixing them in advance using quantities that depend on the whole sequence of examples.

This paper introduces a general technique for turning linear-threshold classification algorithms from the general additive family into randomized selective sampling algorithms, and shows that these semi-supervised algorithms can achieve, on average, the same accuracy as that of their fully supervised counterparts, but using fewer labels.

This work analyzes how the structure of the feedback graph controls the inherent difficulty of the induced $T$-round learning problem and shows how the regret is affected if the graphs are allowed to vary with time.

New and sharper regret bounds are derived for the well-known exponentially weighted average forecaster and for a second forecaster with a different multiplicative update rule, expressed in terms of sums of squared payoffs, replacing larger first-order quantities appearing in previous bounds.
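
For readers unfamiliar with the exponentially weighted average forecaster, the following is a minimal sketch of its standard form with a fixed learning rate. It does not reproduce the refined second-order analysis or the second update rule mentioned above; the function name `exp_weights`, the parameter `eta`, and the toy payoff sequence are assumptions chosen for illustration.

```python
import math


def exp_weights(payoffs, eta):
    """Exponentially weighted average forecaster over expert payoffs (sketch).

    payoffs: list of rounds, each a list of per-expert payoffs.
    eta: assumed fixed positive learning rate.
    Returns (forecaster's cumulative expected payoff, best expert's cumulative payoff).
    """
    n_experts = len(payoffs[0])
    cumulative = [0.0] * n_experts  # cumulative payoff of each expert so far
    forecaster_total = 0.0
    for round_payoffs in payoffs:
        # Weights proportional to exp(eta * cumulative payoff); subtracting the
        # maximum before exponentiating avoids overflow and leaves the
        # normalized probabilities unchanged.
        top = max(cumulative)
        unnormalized = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(unnormalized)
        probs = [u / total for u in unnormalized]
        # The forecaster's payoff is the weighted average of expert payoffs.
        forecaster_total += sum(p * x for p, x in zip(probs, round_payoffs))
        cumulative = [c + x for c, x in zip(cumulative, round_payoffs)]
    return forecaster_total, max(cumulative)


# Two experts over 20 rounds; the first always earns payoff 1, the second 0.
forecaster_payoff, best_payoff = exp_weights([[1.0, 0.0]] * 20, eta=0.5)
regret = best_payoff - forecaster_payoff
```

Here `regret` measures how much cumulative payoff the forecaster gives up relative to the best single expert in hindsight.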

This work proposes a global recommendation strategy that allocates a bandit algorithm to each network node (user) and allows it to "share" signals (contexts and payoffs) with the neighboring nodes, and derives two more scalable variants of this strategy based on different ways of clustering the graph nodes.