Retrospective on WSDM

By: Criteo AI Lab Team / 30 Mar 2017

Our Research Scientists had the privilege to attend the 10th International Conference on Web Search and Data Mining (WSDM) in Cambridge, UK.

The conference lasted 5 days with 3 days dedicated to papers presentations spanning a broad range of subjects like information retrieval, recommender systems, crowdsourcing, natural language processing. From their website: “This year, WSDM was able to accept 80 out of 505 papers, which amounts to an acceptance rate of around 16%”.

Criteo AI Lab Booth at WSDM

We face a variety of very practical, applied research problems at Criteo, and a few papers related to these problems got our attention.

A/B Testing

A/B Testing is an important topic at Criteo, as we continuously evaluate improvements of the different parts of our system, be it the recommendation or the bidding engine, or the new predictive search engine.

In Trustworthy Analysis of Online A/B Tests: Pitfalls, challenges and solutions, Alex Deng and his co-authors from Microsoft present a general variance formula for more accurate analysis of A/B tests when the randomization mechanism is unknown. This is a problem that they have with Skype when their randomization depends on the user-session, but also on a minimum periodical refresh. They underscore the importance of having clean A/B tests set-ups, with i.i.d randomization units, and to be aware of the difficulty of variance estimation when the i.i.d hypothesis is not met.

Our colleague Eugene Kharitonov presented his paper Learning Sensitive Combinations of A/B Test Metrics. In this paper, Eugene and his co-authors study the following problem: given a seed set of historical A/B tests with known outcomes, how can one learn a linear combination of A/B testing metrics, such that this combination is (a) is a sensitive metric itself, and (b) agrees with the preferences in the seed set? Kharitonov et al. experimented with a dataset of over one hundred historical experiments containing millions of user interactions and demonstrated that in some cases considerable improvements in sensitivity can be achieved.

Recommendation / Ranking

This year, there was a Networks and Recommendation session at WSDM. Recommending to users the right product at the right time is a critical component of Criteo’s success.

One interesting paper was Neural Survival Recommender from How Jing (LinkedIn) and Alex Smola (Amazon). In this paper, the authors want to predict both the items to recommend to a user, and when the user will return to the website. They use survival analysis (more common in clinical studies) to frame the problem: here the time of survival of a user is the time before the user returns to the website. Using a RNN with LSTM cells, their model is trained on users’ historical sequences, and learned jointly to predict return time and a set of recommended items.

Graph mining

There were quite a few presentations involving graph mining, mostly with applications to social networks or knowledge graphs. There were some interesting presentations on algorithms to maximize your influence on social networks. Criteo has a different use for graphs, though. One graph we build is called the cross-device graph, which connects the different cookies of a same user. We are interested in extracting statistics to understand its structure. In Counting Graphlets: Spacevs Time, Bressan and his co-authors propose an algorithm to count graphlets up to size 6 for “large” graphs by extending the state-of-the-art algorithm based on color coding. Still, “large” here means about 40 millions nodes, and we have more than a billion, so it’s not obvious that we could use this on the full graph. Another interesting poster was about Motifs in Temporal Networks, in which Paranjape and his co-authors extend network motif counting to temporal graph: the motif must appear during a given time-window δ. They show that even with only motifs of size 3 some interesting insights can be drawn. For instance they compare communication networks between StackOverflow and Email, and show with motifs indicating switch of target that there are more engagement in discussions on StackOverflow than in the email data they use.


There were a few papers specifically related to display advertising and retargeting. As we are already testing reinforcement learning for bidding, a poster of special interest for us was Real-Time Bidding by Reinforcement Learning in Display Advertising by Cai and his co-authors. They formulate RTB bidding as a Markov Decision Process and use dynamic programming to learn the optimal value function, in an environment including budget constraints. They show mostly improvements over a simple logistic regression estimate of the optimal bid: offline, over RTB logs, but more interestingly for an online algorithm, they report positive A/B test results. The setup we have at Criteo is a bit different but the idea is very similar: we need to be able to adapt the bidding strategy to the environment.

Overall, the conference brought us a lot of interesting ideas. Research in applied machine learning is thriving, with many different application areas that are of direct interest to Criteo. We look forward to participating in the next edition!