Workshop 2: Speakers and Abstracts

Submitted by serrurie on Sat, 07/18/2015 - 20:36

All the videos of the lectures are available here!

You can download the PDF booklet that will be distributed at the workshop.

Convex bandit, part I

Abstract: In this two-parts lecture Ronen and I will discuss our solution to the decade-old problem of characterizing the minimax regret in bandit convex optimization. Part I will focus on the information theoretic aspects of the proof, while part II will be about a new geometric question on convex functions that we had to answer.

Sébastien Bubeck

Sebastien Bubeck is a researcher in the Theory Group at Microsoft Research. Prior to MSR he was an assistant professor in the ORFE department at Princeton University. His research has won a few awards, including the 2010 Jacques Neveu prize and a 2015 Sloan Fellowship.

Nonparametric learning, scale invariance, and sketching techniques for online algorithms

Abstract: We describe recent results and work in progress on algorithmic techniques for online learning. First, we discuss a simple and modular approach to nonparametric learning in data streams. This method covers the input space using base classifiers that are locally trained. A good balance between model complexity and predictive accuracy is achieved by dynamically adapting the cover to the local complexity of the classification problem. Next, we consider the problem of designing online algorithms that are invariant to linear transformation of the data. As invariance relies upon maintaining second-order information, we show how the associated computational cost can be tamed via the use of sketching techniques without giving up regret guarantees.

Nicolò Cesa-Bianchi

Nicolò Cesa-Bianchi is professor of Computer Science at the University of Milano, Italy, where he is currently head of the Computer Science programs. He was President of the Association for Computational Learning and member of the steering committee of the EC-funded Network of Excellence PASCAL2. He held visiting positions with UC Santa Cruz, Graz Technical University, Ecole Normale Superieure in Paris, Google, and Microsoft Research. His main research interest is the design and analysis of machine learning algorithms with special emphasis on sequential learning problems. He is recipient of a Google Research Award and of a Xerox Foundation UAC Award.

Slides of the presentation

Combinatorial bandits revisited

Abstract: We investigate stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting, we first derive problem-specific regret lower bounds, and analyze how these bounds scale with the dimension of the decision space. We then propose COMBUCB, algorithms that efficiently exploit the combinatorial structure of the problem, and derive finite-time upper bound on their regrets. These bounds improve over regret upper bounds of existing algorithms, and we show numerically that COMBUCB significantly outperforms any other algorithm. In the adversarial setting, we propose two simple algorithms, namely COMBEXP-1 and COMBEXP-2 for semi-bandit and bandit feedback, respectively. Their regrets have similar scaling as state-of-the-art algorithms, in spite of the simplicity of their implementation. More information at http://arxiv.org/abs/1502.03475.

Richard Combes

Richard Combes is an assistant professor in Supelec. He received the Engineering Degree from Telecom Paristech (2008), the Master Degree in Mathematics from university of Paris VII (2009) and the Ph.D. degree in Mathematics from university of Paris VI (2013). He was a visiting scientist at INRIA (2012) and a post-doc in KTH (2013). He received the best paper award at CNSM 2011. His current research interests are machine learning, networks and probability.

Slides of the presentation

Placing Dynamic Content in Local Caches

Abstract: In this talk we will propose a means of addressing a fundamental limitation for the adoption of caching for wireless access networks due to small population sizes. This shortcoming is due to two main challenges: (i) making timely estimates of varying content popularity and (ii) inferring popular content from small samples. We propose a framework that alleviates such limitations. To timely estimate varying popularity in a context of a single cache we propose an Age-Based Threshold (ABT) policy which caches all contents requested more times than a certain threshold related to the content age. We show that ABT is asymptotically hit rate optimal in the many contents regime, which allows us to obtain the first characterization of the optimal performance of a caching system in a dynamic context. We then address small sample sizes focusing on L local caches and one global cache. On one hand we show that the global cache learns L times faster by aggregating all requests from local caches, which improves hit rates. On the other hand, assuming correlations, aggregation washes out local characteristics which leads to a hit rate penalty. This motivates coordination mechanisms that combine global learning of popularity scores in clusters and LRU with prefetching.

Moez Draief

Moez Draief is head of the Computational Learning, Algorithms and Networks team in the Mathematical and Algorithmic Sciences Lab of Huawei Technologies France Research Centre. He is also Associate Professor in Intelligent Systems and Networks in the EEE department at Imperial College London. Prior to that, he was a Marie Curie Research Fellow and Lecturer in the Statistical Laboratory, Cambridge University. He held visiting research positions at Microsoft Research Cambridge, Technicolor Research Paris and the Computer and Information Sciences Department at UPenn. He is a fellow of the Royal Statistical Society and member of its Applied Probability Society Steering Committee. He is also the founding co-director of Imperial Probability Centre.

Slides of the presentation

Convex bandit, part II

Abstract: In this two-parts lecture Sébastien and I will discuss our solution to the decade-old problem of characterizing the minimax regret in bandit convex optimization. Part I will focus on the information theoretic aspects of the proof, while part II will be about a new geometric question on convex functions that we had to answer.

Ronen Eldan

Ronen Eldan received his Ph.D. from the Tel Aviv university. After spending two years as a postdoc in MSR and the University of Washington, he is now at the Weizmann Institute of Science. While his background fields of research are probability theory and high-dimensional convex geometry, he has recently also become interested in machine learning and optimization. His favorite conjecture is the Gaussian Correlation Conjecture and his favorite color is purple.

Optimizing ad-auction reserve prices when the outcome is measured on an aggregate basis

Abstract: Selling an ad-space through automated ad-auctions with an optimal reserve price can be a very challenging task when the outcome of a particular reserve price configuration is observed on an aggregate basis at a low frequency.
This problem is similar to a problem of optimization with costly function evaluations. A bayesian update of the unknown function to optimize is run. An exploration/exploitation question is posed before each new trial. An Upper-Confidence-Bound approach is used. This approach is exposed and compared to alternative approaches.

Nicolas Grislain

Nicolas Grislain is co-founder and Chief Scientist of AlephD (leader in programatic media pricing solutions for publishers). He graduated from the École Normale Supérieure de Lyon (major in Mathematics) and from the Ecole Nationale de Statistique et de l'Administration Économique (major in Financial Engineering) in 2006. From 2006 to 2010 he worked as an expert at the French Treasury on economic policy and debt issues. Then he was a senior quantitative risk manager in charge of a risk-modeling team in a leading French bank (Société Générale) for 2 years.

Slides of the presentation

On Bayesian index policies for sequential resource allocation

Abstract: In the context of a parametric bandit model, I will investigate the interplay between frequentist and Bayesian regret and frequentist and Bayesian strategies. The focus will be on index policies inspired by a Bayesian view on the multi-armed bandit problem but that perform well with respect to the frequentist regret. The notion of index policy dates back to the strategy introduced by Gittins (1979) to solve the Bayesian, discounted, multi-armed bandit problem. We will see that the Gittins indices can also be useful in the non-discounted case. Moreover, some existing approximations of these (hard to compute) indices suggest alternative exploration rates that could be used in UCB-like algorithms. I will then talk about Bayes-UCB, a simple algorithm based on quantiles of posterior distributions on the means of the arms, that is (asymptotically) optimal for some classes of rewards distributions.

Emilie Kaufmann

Emilie Kaufmann is a CNRS junior researcher at CRIStAL (Lille), and an associate member of the Inria team SequeL. Her main research interests lie in statistics and machine learning. In 2014, she has received the Jacques Neveu thesis prize for her PhD works on bandit models.

Slides of the presentation

The moment-LP and moment-SOS approach in polynomial optimization

Abstract: We review basic properties of the moment-LP and moment-SDP hierarchies for polynomial optimization and compare them. We also illustrate how to use such a methodology in two applications outside optimization. More details in this abstract.

Jean-Bernard Lasserre

Jean-Bernard Lasserre has been Directeur de Recherche at LAAS-CNRS in Toulouse since 1980. He is also a member of the Institute of Mathematics of Toulouse. He is a SIAM Fellow and the 2009 recipient of the Lagrange prize in Continuous optimization. In 2015 has been awarded an ERC Advanced Grant for the TAMING project. His research areas include control theory, production planning and scheduling, optimization, and probability theory.

Slides of the presentation

The Hidden World of Bandits

Abstract: Although the standard multi-armed bandit framework is by now very well understood and the available algorithms are optimal, in practice the learning performance is not always satisfactory. In order to alleviate this problem, a common approach is to assume the existence of a "structure" characterizing the shape of the bandit problem. Whenever such structure is known, the learning process can be significantly improved. On the other hand, when the structure is "hidden", one of the main challenges is how to learn it at the same time as minimizing the regret. While many forms of structures have been proposed (e.g., regularity of the reward function across arms), in this talk we show how modern tensor decomposition methods for latent variable models can be paired with standard UCB-like strategies to unveil the hidden structure at the same time as learning the best arm. In particular, we will review an application of tensor decomposition in the standard multi-armed bandit and in the more challenging POMDP problem.

Alessandro Lazaric

Alessandro Lazaric is junior researcher at Inria Lille - Nord Europe in the SequeL team. His research is mainly focused in developing techniques for sequential decision-making under uncertainty, ranging from multi-armed bandit to reinforcement learning and transfer learning problems.

Slides of the presentation

Sampling convex bodies with Projected Monte Carlo

Abstract: Given a convex body K in high dimension n, we consider the Markov chain whose transition consists in adding a small Gaussian and projecting back on K. We shall show that this chain approches the uniform measure on K in at most n^7 steps. The proof extends to the case where a convex potential is added, allowing us to sample from a smooth log-concave density restricted to K. The talk is based on a joint work with Sébastien Bubeck and Ronen Eldan.

Joseph Lehec

Joseph Lehec is maître de conférence at Université Paris-Dauphine since 2009. He is mostly working on probability theory and its application to functional inequalities and high dimensional geometry.

Subsampling, Concentration and Multi-armed bandits

Abstract: In the first part of this talk, we will provide novel concentration results for the setting of sampling without replacement from a finite population (sub-sampling), improving on the fundamental result of Serfling (1974), and extending it to obtain Bernstein and empirical Berstein concentration bounds. We will then move to the stochastic multi-armed bandit problem, a popular model of the exploration/exploitation trade-off in sequential decision making. We introduce a novel algorithm that is based on sub-sampling. Despite its simplicity, we show that the algorithm demonstrates excellent empirical performances against state-of-the-art algorithms, including Thompson sampling and KL-UCB. The algorithm is very flexible, and does need to know a set of reward distributions in advance nor the range of the rewards. We provide a detailed experimental study comparing the algorithm to the state of the art, the main intuition that explains the striking results, and conclude with a finite-time regret analysis for this algorithm in the simplified two-arm bandit setting.

Odalric-Ambrym Maillard

Odalric-Ambrym Maillard is Full-time researcher at INRIA Saclay, in the TAO team. He was previously a Postdoctorant at the Technion, then a Senior Researcher at the Technion working with Shie Mannor, on the European SUPREL project. Before that, he was doing a first Postdoctorat at University of Leoben, with Peter Auer, within the 4-year European project COMPLACS. He did his PhD in Computer Science and Mathematics at Université of Lille 1, under the supervision of Rémi Munos (SequeL, INRIA, Lille) and Philippe Berthet (LSP, IMT, Toulouse).

Slides of the presentation

The Baire Metric and Ultrametric for Linear Computational Time Hierarchical Clustering: Applications and Implementations

Abstract: The context of this work is search and discovery through clustering, i.e. unsupervised classification. Case studies of the clustering of large data sets are reviewed in astronomy and, in high dimensions, in chemistry. To have linear computational time hierarchical clustering, the Baire, or longest common prefix, metric, and simultaneously ultrametric, is used. For high dimensions, random projection is used. Since hierarchically clustered data can be scaled in one dimension, seriation or unidimensional scaling is an important objective. This talk will include comparison with other uses of random projection; other low dimensional mapping and clustering approaches that are related; and an overview of the (mainly) R software used.

Fionn Murtagh

Fionn Murtagh is Professor of Data Science at the University of Derby and also at Goldsmiths University of London. He was President (2008-2009) of the Classification Society of North America, and of the British Classification Society (2007-2010). Fionn Murtagh's research is in digital content analytics, computational science, analysis of narrative, multiresolution morphological modelling, and new data science from the geometry and topology of data and information. Authority on clustering, data analysis and data mining.

Slides of the presentation

Two problems in stochastic gradient descent: choosing the right learning rate and dealing with temporal dependencies (non-iid)

Abstract: We will present two ideas for dealing with two problems in online statistical learning: choosing the right step size for stochastic gradient descents and similar algorithms, and dealing with strong temporal dependencies. The performance (cumulated loss) of algorithms based on stochastic gradients is highly dependent on the chosen step size. Fortunately it is possible to approximate the derivative of cumulated loss with respect to the step size in an online fashion; the resulting gradient descent on the step size results in a reasonable practical performance even if the initial step size varies by several orders of magnitude, although no theoretical results is proved yet. Next, we will consider online learning of the parameters of a dynamical system, such as a hidden Markov model or a recurrent neural network. The usual online algorithms, such as the Kalman filter, usually scale like the third power of dimension. All the more fast algorithms are non-online and require to go back in time at some point (Bellman equations, backpropagation through time, forward-backward algorithm). We will present an online algorithm that maintains a random but *unbiased* estimate of the gradient of the state of the system with respect to its parameters, at a low algorithmic cost.

Yann Ollivier

Yann Ollivier started his career with theoretical work on discrete curvature and convergence and concentration inequalities for Markov chains. He now works in the computer science department at Paris Saclay university, focussing on the applications of information theory to statistical learning and neural networks.

Highly-Smooth Zero-th Order Online Optimization

Abstract: In this joint work with Francis Bach, we consider online convex optimization with noisy zero-th order information, that is noisy function evaluations at any desired point. We focus on problems with high degrees of smoothness, such as online logistic regression. We show that as opposed to gradient-based algorithms, high-order smoothness may be used to improve estimation rates, with a precise dependence on the degree of smoothness and the dimension. In particular, we show that for infinitely differentiable functions, we recover the same dependence on sample size as gradient-based algorithms, with an extra dimension-dependent factor. This is done for convex and strongly-convex functions, with finite horizon and anytime algorithms.

Vianney Perchet

Vianney Perchet is assistant professor of statistics at University Paris-Diderot (Paris 7) in the "Laboratoire de Probabilités et de Modèles Aléatoires". He is mostly working on game theory, online learning and stochastic optimization.

Slides of the presentation

Contextual Distributed Models for Sequences

Abstract: Generative probabilistic models have many practical applications for analyzing sequences, in e.g. natural language processing and web log mining. For such models, taking into account the context (e.g. the person who is writing) is important. While a purely probabilistic approach would introduce further random variables to capture the context, in distributed generative models like recurrent or convolutional neural networks, an exciting possibility is to use a distributed representation of the context, that modifies the generative process. This is the topic of this talk.

Benjamin Piwowarski

Benjamin Piwowarski is a CNRS researcher in the Laboratory of Informatics of Paris 6 University (LIP6). His main research topics are about Information Access, and more precisely around the problems of: representing text (neural networks and quantum probability models, structured document), building user models (for activities like search), and evaluating retrieval systems.

Slides of the presentation

Constraint-based Reputation Modelling on microblogs

Abstract: In this talk we review modeling corporate Entities' Online e-Reputation based on PLS-pm approaches. Dimensions to analyze e-Reputation are set up by analyst as latent variables. Machine Learning (ML) Natural Language Processing (NLP) approaches are used to label large sets of text passages. For each Dimension, several estimations of the relationship with each text passage are computed as well as Opinion and Priority. Automatic path modeling algorithm explains Opinion or Priority scores based on selected Dimensions. Even though PLS-pm is based on standard Factor Analysis, it includes an iterative algorithm that makes it incompatible with objective function optimization. However there is an implicit bayesian latent probabilistic model that explains Model Robustness.

Eric San Juan

Eric San Juan is Associate Professor in Business Intelligence at the University of Avignon. Coming from the world of discrete mathematics and algebraic logic, his topics of interest are now on original applications of discrete probabilistic models to Knowledge Representation, Natural Language Processing and related areas in Information Science like Text Mining and Information Retrieval. The goal is to find efficient algorithms to handle discrete aspects of texts like syntax, linguistic rules and underlying conceptual structure. Recently he has been working on automatic summarization, literature based discovery, discourse segmentation and classification.

Slides of the presentation

Robust sequential learning with applications to the forecasting of electricity consumption and of exchange rates

Abstract: Sometimes, you feel you're spoilt for choice: there are so many good predictors that you could use! Why select and focus on just one? I will review the framework of robust online aggregation (also known as prediction of individual sequences or online aggregation of expert advice). This setting explains how to combine base forecasts provided by ensemble methods. No stochastic modeling is needed and the performance achieved is comparable to the one of the best (constant convex combination of) base forecast(s). I will illustrate the technology on various data sets, including electricity consumption and exchange rates. More importantly, I will point out open issues, both on the theoretical and on the practical sides.

$Gilles Stoltz$

Gilles Stoltz

Gilles Stoltz is a CNRS researcher and an affiliate professor at HEC Paris, France. He holds a PhD and an habilitation from Université Paris-Sud, Orsay, France. His research themes are at the intersection of machine learning, game theory and sequential statistics.

Slides of the presentation

Navigation :

You are here

User login

Workshop 2: Speakers and Abstracts

Convex bandit, part I

Nonparametric learning, scale invariance, and sketching techniques for online algorithms

Combinatorial bandits revisited

Placing Dynamic Content in Local Caches

Convex bandit, part II

Optimizing ad-auction reserve prices when the outcome is measured on an aggregate basis

On Bayesian index policies for sequential resource allocation

The moment-LP and moment-SOS approach in polynomial optimization

The Hidden World of Bandits

Sampling convex bodies with Projected Monte Carlo

Subsampling, Concentration and Multi-armed bandits

The Baire Metric and Ultrametric for Linear Computational Time Hierarchical Clustering: Applications and Implementations

Two problems in stochastic gradient descent: choosing the right learning rate and dealing with temporal dependencies (non-iid)

Highly-Smooth Zero-th Order Online Optimization

Contextual Distributed Models for Sequences

Constraint-based Reputation Modelling on microblogs

Robust sequential learning with applications to the forecasting of electricity consumption and of exchange rates