It’s been a year since we launched our AI Lab to drive and shape innovation in the AI field. Above all, we make a point of giving back to the community by sharing our work at top machine learning conferences.
That is why we were particularly proud to have five papers presented at the ICML conference in Long Beach, as well as a workshop that we co-chaired. As a flagship of the research carried out at Criteo, this selection highlights both the quality and the diversity of our academic work. This article sums up the key takeaways from each paper. Enjoy, and feel free to share your comments.
Learning to bid in revenue maximizing auctions
Thomas Nédélec, Noureddine El Karoui, and Vianney Perchet
One of the key components of computational advertising is bidding to win ad opportunities, since most online placements are sold through an auction mechanism. The paper by Thomas Nédélec et al. shows that one can leverage machine learning techniques and a variational approach to learn optimal bidding strategies against any smart data-driven selling mechanism. One important aspect of this work is that it mixes theoretical results and experimental analyses. Indeed, the authors first derive the objective that buyers should optimize when sellers are also optimizing their strategies. Their experiments then rely on deep learning frameworks and Python autograd packages to learn these strategies. The resulting strategies yield very large uplifts for the buyers. What I really like about this paper is that not only was it accepted at one of the best ML conferences, but it will also create business value for the company.
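To give a feel for why a buyer may gain from bidding strategically against a data-driven seller, here is a toy sketch. It is not the paper's variational, gradient-based method: it simply grid-searches a one-parameter linear shading strategy. Everything in it is an assumption made for illustration: values and the competing bid are uniform on [0, 1], the seller runs a second-price auction with a personalized reserve set to the monopoly price of the observed bid distribution, and the name `bidder_utility` is hypothetical.

```python
import numpy as np


def bidder_utility(alpha, n_samples=200_000, seed=0):
    """Monte Carlo utility of a bidder who bids alpha * value.

    Toy assumptions (not from the paper): values and the competing bid are
    uniform on [0, 1]; the seller sees bids uniform on [0, alpha] and sets
    the reserve to the monopoly price of that distribution, i.e. alpha / 2.
    """
    rng = np.random.default_rng(seed)
    value = rng.uniform(0.0, 1.0, n_samples)       # private values
    competitor = rng.uniform(0.0, 1.0, n_samples)  # truthful competing bid
    bid = alpha * value
    reserve = alpha / 2.0                          # argmax of r * (1 - r / alpha)
    wins = (bid >= reserve) & (bid >= competitor)
    price = np.maximum(reserve, competitor)        # second price with reserve
    return float(np.mean(np.where(wins, value - price, 0.0)))


# Grid search over the shading factor; alpha = 1 is truthful bidding.
alphas = np.linspace(0.3, 1.0, 15)
utilities = [bidder_utility(a) for a in alphas]
best = float(alphas[int(np.argmax(utilities))])
print(f"truthful utility: {bidder_utility(1.0):.4f}")
print(f"best shading factor on the grid: {best:.2f}")
```

Under these toy assumptions, a short calculation gives an expected utility of 7α/24 - 5α²/24, which peaks at α = 0.7, so shading beats truthful bidding here. The paper goes much further by optimizing over general strategies rather than a single scalar.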
Exploiting structure of uncertainty for efficient matroid semi-bandits
Pierre Perrault, Vianney Perchet and Michal Valko
Once again, computational advertising makes heavy use of bandit theory, as it makes it possible to model the interaction between personalized recommendations and rewards (clicks). In this work, Vianney Perchet and his co-authors address the issue of making combinatorial bandits efficient. By exploiting submodular optimization, underlying tools such as matroids, and specific knowledge of the structure of the rewards, they propose an algorithm that remains optimal in terms of asymptotic regret while being more efficient than the current state of the art. Algorithmic efficiency is key given the scale and complexity of Criteo’s data, and many research and engineering efforts are dedicated to making our technology fast and responsive.
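One reason matroid structure helps with efficiency is a classical fact: maximizing a sum of item scores over the independent sets of a matroid can be done exactly with a greedy pass. The sketch below shows only that building block, not the paper's algorithm. The function `greedy_independent_set`, the "one ad per category" constraint, and the scores are all hypothetical; in a bandit loop the scores would typically be optimistic estimates of each item's reward.

```python
from typing import Callable, List, Sequence


def greedy_independent_set(
    scores: Sequence[float],
    is_independent: Callable[[List[int]], bool],
) -> List[int]:
    """Greedy maximization of a linear objective over a matroid.

    Items are scanned by decreasing score and kept whenever the chosen set
    stays independent. Items with non-positive scores are never added.
    """
    chosen: List[int] = []
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for item in order:
        if scores[item] <= 0:
            break
        if is_independent(chosen + [item]):
            chosen.append(item)
    return chosen


# Hypothetical partition matroid: at most one ad per product category.
categories = [0, 0, 1, 1, 2]


def one_per_category(items: List[int]) -> bool:
    cats = [categories[i] for i in items]
    return len(cats) == len(set(cats))


# Hypothetical optimistic click-rate estimates for five candidate ads.
optimistic_scores = [0.9, 0.5, 0.2, 0.4, 0.1]
picked = greedy_independent_set(optimistic_scores, one_per_category)
print(picked)  # prints [0, 3, 4]
```

The greedy pass costs one sort plus one independence check per item, which is what keeps each round cheap even when the number of feasible combinations is huge.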
Screening rules for Lasso with non-convex regularizers
Alain Rakotomamonjy, Gilles Gasso and Joseph Salmon
In the same spirit of developing fast and scalable machine learning algorithms, and in collaboration with external colleagues, I have proposed an approach for making non-convex Lasso solvers more efficient. The problem we address stems from the trade-off between the better statistical properties induced by non-convex regularizers in the Lasso and the hardness of solving a non-convex learning problem. We have introduced a test that identifies useless variables in the model. This test can be applied at any time during the learning process, and the flagged variables can then be safely removed without altering the final solution.
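To illustrate what a screening test looks like, here is a minimal sketch for the standard convex Lasso, using the well-known Gap Safe sphere test. This is a swapped-in technique for illustration, not the non-convex rule from our paper. The function name `gap_safe_screen` and the synthetic data are hypothetical.

```python
import numpy as np


def gap_safe_screen(X, y, w, lam):
    """Gap Safe sphere test for min_w 0.5 * ||y - X w||^2 + lam * ||w||_1.

    Returns a boolean mask: True means the feature is guaranteed to have a
    zero coefficient at the optimum, so it can be dropped from the problem.
    `w` can be any current iterate of the solver.
    """
    residual = y - X @ w
    # Rescale the residual to obtain a dual-feasible point.
    theta = residual / max(lam, np.max(np.abs(X.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.abs(w).sum()
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    # The dual optimum lies in a ball of this radius around theta.
    radius = np.sqrt(2.0 * gap) / lam
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0


# Synthetic data: only the first three features carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
y = X[:, :3] @ np.array([1.0, -2.0, 1.5]) + 0.1 * rng.standard_normal(50)

lam_max = np.max(np.abs(X.T @ y))  # above this value the solution is all zeros
w0 = np.zeros(X.shape[1])
mask = gap_safe_screen(X, y, w0, 0.5 * lam_max)
print(f"features screened at the first iterate: {int(mask.sum())} of {mask.size}")
```

The test only needs the current iterate, which is why it can be re-run during optimization: as the duality gap shrinks, the ball shrinks and more variables get eliminated.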
Generalized no free lunch theorem for adversarial robustness
While the works above relate to Criteo’s bidding and recommendation businesses, Criteo researchers also contribute to advancing the state of knowledge through projects and research that create value in the longer term. As such, the theoretical work of Elvis Dohmatob proves that, under mild conditions on the data distribution, almost every classifier can be fooled with high probability by examples that are close (in the flat or manifold space sense) to training samples. This result is generic enough to subsume several results available in the literature, and it exhibits a lack of robustness in some classes of classifiers. This is an important result for safe machine learning, as we are moving towards a future in which machine learning will play a dominant role in society. Yet how this robustness question should be addressed remains open, and at Criteo we are thrilled to participate in this effort.
Fairness-aware learning for continuous attributes and treatments
Jérémie Mary, Clément Calauzènes, Noureddine El Karoui
Safety in machine learning goes hand in hand with the question of fairness in algorithm-based decision making. Computational advertising, in particular, may perpetuate biases if they are not handled properly. The work of Mary et al. introduces a novel measure of the (conditional) independence between two random variables: an algorithm outcome, and a continuous-valued characteristic that we want the algorithm outcome to be independent of. The co-authors connect the measure they consider to theoretical results showing that it can be computed through a singular value decomposition (SVD). However, for the sake of efficiency, they also propose a fully differentiable approximation of their measure, making it computable and optimizable end-to-end in a deep learning framework.
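As an illustration of how a dependence measure can come out of an SVD, here is a sketch for the discrete (or discretized) case. It computes the classical Hirschfeld-Gebelein-Rényi (HGR) maximal correlation as the second singular value of a normalized joint probability table. Treating this as the measure in question is an assumption made here for illustration; the function name `hgr_discrete` and the example tables are hypothetical, and the paper itself targets continuous attributes.

```python
import numpy as np


def hgr_discrete(joint):
    """HGR maximal correlation of two discrete variables.

    `joint` is a table of joint probabilities (or counts). The coefficient is
    the second largest singular value of P(x, y) / sqrt(p(x) * p(y)); the
    largest singular value is always 1.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1)
    py = joint.sum(axis=0)
    keep_x, keep_y = px > 0, py > 0  # drop outcomes that never occur
    q = joint[np.ix_(keep_x, keep_y)] / np.sqrt(np.outer(px[keep_x], py[keep_y]))
    singular_values = np.linalg.svd(q, compute_uv=False)
    return float(singular_values[1]) if singular_values.size > 1 else 0.0


# Independent variables: the table is an outer product of the marginals.
independent = np.outer([0.3, 0.7], [0.2, 0.5, 0.3])
# Partially dependent binary variables.
dependent = np.array([[0.4, 0.1], [0.1, 0.4]])

print(round(hgr_discrete(independent), 6))
print(round(hgr_discrete(dependent), 6))
```

A value near 0 indicates independence and a value of 1 indicates that some function of one variable is perfectly predictable from the other. For two binary variables the coefficient coincides with the absolute Pearson correlation.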
Since Criteo is a computational advertising company, one may think that its research is mostly focused on recommender systems. While there are efforts going in that direction, there is a wide diversity of research topics that can be tackled at Criteo. You can learn about and advance the state of the art on theoretical problems related to bandits, sequential decision-making, and game and auction theory. As machine learning and AI are not only about beating state-of-the-art results on some tasks, we are also concerned with their safety, robustness and fairness, since these properties are essential for machine learning deployed in the real world. The papers presented at ICML this year show only the tip of Criteo’s research iceberg. We are excited to see how our NeurIPS submissions will be received, and we look forward to the big challenges posed by the various research opportunities at Criteo!
Want to be part of the adventure? Join the team! Careers | Criteo (www.criteo.com)
Oh, one more thing: if you are interested in the statistics of papers accepted at ICML, there is a nice Reddit thread below. tl;dr: Criteo ranks 11th among the top industrial contributing institutions (behind Google, Facebook and Microsoft, for instance, but ahead of Amazon!)
Oops, one more thing again: you can catch up with all the ICML talks at https://slideslive.com/icml