The Criteo AI Lab is pioneering innovations in computational advertising. As the center of scientific excellence in the company, we deliver both fundamental and applied scientific leadership through published research, product innovations and new technologies powering the company’s products.
The Criteo AI Lab operates across two main areas: ML Engineering and ML Research.
ML Engineering
When an idea or a prototype has reached sufficient maturity and proven its potential value for our customers, it is time to clear the road to production. At Criteo this often means going full throttle: either answering up to hundreds of millions of requests per second in a low-latency setup, or processing up to petabytes of data in a high-throughput setup. In either case, our engineers are here to turn these emerging algorithms into full-scale solutions and to build reliable platforms that make them operate flawlessly. By closing the gap between researchers and clients, we aim to give our partnerships greater velocity and flexibility.
Here are some examples of problems we are solving:
- Boosting recommendation with representation learning
- Building a universal catalog across all our partners
- Blacklisting inappropriate products from our banners
- Model composition and advanced feature engineering
ML Research
ML research spans from applied research to pure, upstream academic research:
Applied research. Our scientists fully leverage the advantage of working in a machine-learning-driven organization, partnering closely with our product and engineering counterparts to deliver cutting-edge solutions to the challenges of online advertising.
Academic research. Our research scientists are encouraged and fully supported to publish their work at international conferences, collaborate with academic partners, file patents, release datasets and help establish the state of the art in computational advertising.
A sampling of the research topics we work on in the above roles:
- Click prediction: how do you accurately predict, in under a millisecond, whether a user will click on an ad? Thankfully, you have billions of data points to help you (see the click-model sketch after this list).
- Recommender systems: standard SVD works well for recommendation, but what happens when you have to choose the top products amongst hundreds of thousands for every user, 2 billion times per day, in less than 50 ms? (See the retrieval sketch after this list.)
- Auction theory: in a second-price auction, the theoretically optimal strategy is to bid your expected value (see the auction simulation after this list). But what happens when you run 15 billion auctions per day against the same competitors?
- Reinforcement learning: how do you find the optimal bidding strategy across multiple auctions? Can this be cast as a reinforcement-learning problem in very high dimensions with unpredictable rewards?
- Offline testing/metrics: you can always compute the classification error of a model predicting the probability of a click (see the metrics snippet after this list). But is this really related to the online performance of a new model? What is the right offline metric that predicts online performance?
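To give a flavour of the click-prediction problem, here is a minimal sketch of a click model trained on hashed sparse features with scikit-learn. The feature names, dimensions and model choice are illustrative assumptions, not a description of Criteo's production systems.

```python
# Minimal click-prediction sketch: hashing trick + logistic regression.
# Purely illustrative -- feature names and sizes are made up.
from sklearn.feature_extraction import FeatureHasher
from sklearn.linear_model import SGDClassifier

# Each impression is a dict of categorical features (hypothetical names).
impressions = [
    {"publisher": "news_site_42", "device": "mobile", "product_cat": "shoes"},
    {"publisher": "blog_17", "device": "desktop", "product_cat": "electronics"},
]
clicks = [1, 0]  # observed labels

# Hash high-cardinality categoricals into a fixed-size sparse vector,
# so the model never has to hold a full vocabulary in memory.
hasher = FeatureHasher(n_features=2**20, input_type="dict")
X = hasher.transform(impressions)

# Logistic regression trained with SGD scales to billions of examples
# when the data is streamed in mini-batches via partial_fit.
model = SGDClassifier(loss="log_loss", alpha=1e-6)
model.partial_fit(X, clicks, classes=[0, 1])

# At serving time, scoring is one sparse dot product, which is what makes
# a sub-millisecond latency budget plausible.
p_click = model.predict_proba(hasher.transform(impressions[:1]))[0, 1]
print(f"predicted click probability: {p_click:.3f}")
```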
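For the recommendation question, the retrieval sketch below picks the top-k items for one user from pre-trained latent factors. The shapes and the brute-force scoring are assumptions for illustration; with catalogs of hundreds of thousands of items and a 50 ms budget, production systems typically rely on approximate nearest-neighbour indexes rather than an exhaustive scan.

```python
# Top-k retrieval from factorized user/item embeddings (e.g. from SVD or ALS).
# The factors here are random placeholders standing in for trained ones.
import numpy as np

n_items, dim, k = 200_000, 64, 10
rng = np.random.default_rng(0)
item_factors = rng.standard_normal((n_items, dim)).astype(np.float32)
user_vector = rng.standard_normal(dim).astype(np.float32)

scores = item_factors @ user_vector          # one matrix-vector product over the catalog
top_k = np.argpartition(-scores, k)[:k]      # O(n) partial selection, no full sort
top_k = top_k[np.argsort(-scores[top_k])]    # order only the k winners
print(top_k, scores[top_k])
```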
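The second-price claim can be checked with a tiny Monte Carlo simulation: expected utility peaks when you bid exactly your expected value. The numbers below are invented, and the simulation covers a single isolated auction rather than the repeated-auction setting the question points at.

```python
# Why truthful bidding is optimal in a (single) second-price auction.
import numpy as np

rng = np.random.default_rng(0)
true_value = 1.0                                     # expected value of winning the impression
competitor_bids = rng.uniform(0, 2, size=100_000)    # highest competing bid in each auction

def expected_utility(bid):
    # You win when your bid beats the competition, and you pay the second price.
    wins = bid > competitor_bids
    return np.mean(np.where(wins, true_value - competitor_bids, 0.0))

for bid in [0.5, 0.8, 1.0, 1.2, 1.5]:
    print(f"bid={bid:.1f}  average utility={expected_utility(bid):.4f}")
# Utility is highest at bid == true_value: shading down forfeits profitable
# wins, while overbidding wins auctions at a loss.
```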
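Finally, on offline testing: the snippet below computes two standard offline metrics for a click model, log loss and ROC AUC, on a hypothetical held-out set. Whether either of them actually tracks online performance is precisely the open question.

```python
# Two common offline metrics for a probability-of-click model.
from sklearn.metrics import log_loss, roc_auc_score

y_true = [0, 1, 0, 0, 1, 1]               # held-out click labels (made up)
y_pred = [0.1, 0.7, 0.2, 0.4, 0.6, 0.9]   # the model's predicted click probabilities

print("log loss:", log_loss(y_true, y_pred))
print("ROC AUC :", roc_auc_score(y_true, y_pred))
```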