Differentiable Ranking and Sorting using Optimal Transport by Marco Cuturi • Google Brain
We consider in this work proxy operators for ranking and sorting that are differentiable. To do so, we leverage the fact that sorting can be seen as a particular instance of the optimal transport (OT) problem between two univariate uniform probability measures, the first measure being supported on the family of input values of interest, the second supported on a family of increasing target values (e.g. 1, 2, …, n if the input array has n elements). Building upon this link, we propose generalized rank and sort operators by considering arbitrary target measures supported on m values, where m can be smaller than n. We recover differentiable algorithms by adding to this generic OT problem an entropic regularization, and approximate its outputs using Sinkhorn iterations. We call these operators soft-ranks and soft-sorts. We show that they are competitive with the recently proposed NeuralSort (Grover et al., 2019). To illustrate their versatility, we use the soft-rank operator to propose a new classification training loss that is a differentiable proxy of the 0/1 loss. Using the soft-sort operator, we propose a new differentiable loss for trimmed regression.
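The abstract's recipe (entropic OT between the inputs and increasing targets, solved by Sinkhorn iterations) can be sketched in a few lines. This is an illustrative simplification, not the paper's implementation: it uses n evenly spaced targets (so m = n), a squared-distance cost, and a fixed number of Sinkhorn iterations, all of which are assumptions for the sketch.

```python
import numpy as np

def soft_sort(x, eps=0.1, n_iters=200):
    """Sketch of soft sort/rank via entropic OT solved by Sinkhorn iterations.

    x: 1-D array of n values. Targets are n evenly spaced increasing values
    spanning the input range (a simplifying assumption).
    Returns (soft_sorted, soft_ranks).
    """
    n = len(x)
    y = np.linspace(x.min(), x.max(), n)       # increasing target values
    C = (x[:, None] - y[None, :]) ** 2         # squared-distance cost matrix
    K = np.exp(-C / eps)                       # Gibbs kernel of the entropic OT problem
    a = b = np.full(n, 1.0 / n)                # uniform marginals on inputs and targets
    u = np.ones(n)
    for _ in range(n_iters):                   # Sinkhorn fixed-point iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]            # entropic transport plan (sums to 1)
    soft_sorted = n * (P.T @ x)                # barycentric projection onto targets
    soft_ranks = n * (P @ np.arange(n))        # expected (0-based) rank of each input
    return soft_sorted, soft_ranks
```

As eps shrinks, the plan approaches a permutation matrix and the operators recover hard sorting and ranking; larger eps trades fidelity for smoother gradients.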
A framework for online meta-learning by Massimiliano Pontil • IIT
We study the problem in which a series of learning tasks are observed sequentially and the goal is to incrementally adapt a learning algorithm in order to improve its performance on future tasks. We consider both stochastic and adversarial settings.
The algorithm may be parametrized by either a representation matrix applied to the raw inputs or by a bias vector.
We develop a computationally efficient meta-algorithm to incrementally adapt the learning algorithm after each task dataset is observed.
The meta-algorithm performs online convex optimization on a proxy objective of the risk of the learning algorithm. We derive bounds on the performance of the meta-algorithm, measured either by the average risk of the learning algorithm on random tasks from the environment or by an average regret bound. Our analysis leverages ideas from multitask learning and learning-to-learn together with tools from online learning and stochastic optimization. In the last part of the talk, we discuss extensions of the framework to nonlinear models such as deep neural networks and draw links between meta-learning, bilevel optimization and gradient-based hyperparameter optimization.
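The bias-vector variant mentioned above can be sketched as follows. This is a hedged illustration under simplifying assumptions (ridge regression as the within-task algorithm, a closed-form inner solve, and a plain online gradient step on the bias regularizer as the proxy objective), not the speaker's exact meta-algorithm; `run_meta_learning` and its data format are hypothetical.

```python
import numpy as np

def run_meta_learning(tasks, lr_meta=0.1, lam=1.0):
    """Sketch: adapt a bias vector h online, one gradient step per task.

    tasks: list of (X, y) regression datasets, observed sequentially.
    lam:   strength of the bias regularizer lam * ||w - h||^2.
    """
    d = tasks[0][0].shape[1]
    h = np.zeros(d)                      # meta-parameter (bias vector)
    for X, y in tasks:
        n = len(y)
        # within-task algorithm: ridge regression biased toward h (closed form)
        A = X.T @ X / n + lam * np.eye(d)
        w = np.linalg.solve(A, X.T @ y / n + lam * h)
        # online gradient step on the proxy term lam * ||w - h||^2 w.r.t. h
        h -= lr_meta * 2 * lam * (h - w)
    return h
```

When the tasks share a common underlying parameter, the bias vector drifts toward it, so later tasks start from a better-regularized solution.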
Adaptive inference and its relations to sequential decision making by Alexandra Carpentier • OVGU
Adaptive inference – namely adaptive estimation and adaptive confidence statements – is particularly important in high- or infinite-dimensional models in statistics. Indeed, whenever the dimension becomes high or infinite, it is important to adapt to the underlying structure of the problem. While adaptive estimation is often possible, it is often the case that adaptive and honest confidence sets do not exist. This is known as the adaptive inference paradox, and it has consequences for sequential decision making. In this talk, I will present some classical results of adaptive inference and discuss how they impact sequential decision making. This is based on joint works with Andrea Locatelli, Matthias Loeffler, Olga Klopp and Richard Nickl.
Guest Speaker – Diarmuid Gill • Criteo
Lunch break on the Rooftop
Optimal Transport for Machine Learning by Quentin Berthet • Cambridge University
Fundamental Limits on Robustness to Adversarial Examples by Elvis Dohmatob • Criteo AI Lab
Have you heard of the flying pig?
You can add a small perturbation to the image of a pig and fool a deep network into classifying it as an aircraft, with high confidence! The existence of these so-called adversarial examples for machine-learning algorithms is at least one reason why AI systems will never be deployable in a closed-loop fashion. In this presentation, I will present some theoretical results on fundamental limits on the robustness of algorithms to adversarial examples.
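The pig-to-aircraft attack described above can be illustrated with the standard fast gradient sign method (FGSM) on a toy linear classifier. This is a generic textbook construction, not a result from the talk, which is about theoretical limits rather than attack algorithms.

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps=0.1):
    """FGSM on a linear logistic classifier: perturb the input in the direction
    that most increases the loss, within an L-infinity budget eps.

    x: input vector, (w, b): model parameters, y: true label in {0, 1}.
    Returns the adversarial input x + eps * sign(grad_x loss).
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # predicted P(y = 1 | x)
    grad_x = (p - y) * w                     # gradient of cross-entropy w.r.t. x
    return x + eps * np.sign(grad_x)
```

Even a tiny per-pixel budget eps can flip the prediction of a correctly classified input, which is exactly the phenomenon whose fundamental limits the talk analyzes.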
Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise by Julien Mairal • Inria
Nicolas Vayatis • ENS Cachan
Cocktail and Posters on the Rooftop
32 rue Blanche
75009 Paris France