You are viewing content from a past/completed QCon -


Predictability In ML Applications

In the context of building predictive models, predictability is usually considered a blessing. After all, that is the goal: build the model with the highest predictive performance. The rise of ‘big data’ has in fact vastly improved our ability to predict human behavior, thanks to the introduction of much more informative features. In practice, however, things are more nuanced. In many applications, the same outcome can be observed for very different reasons. In such mixed scenarios, the model will automatically gravitate to the reason that is easiest to predict, at the expense of the others. This holds even if the predictable scenario is far less common or relevant. We present a number of such scenarios: ad clicks made ‘intentionally’ vs. ‘accidentally’, online forms filled out by people vs. fraudulent bots, and consumers visiting store locations vs. their phones pretending to be there. The combination of different and highly informative features can have a significantly negative overall impact on the usefulness of predictive modeling.
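The effect described above can be illustrated with a small simulation. The following is a minimal sketch and not material from the talk: the feature names (`interested`, `risky_layout`), all click rates, and the choice of a plain logistic regression are assumptions made purely for illustration. Clicks arise from two mechanisms, a weakly predictable intentional one and a rare but highly predictable accidental one, and a single model is trained on the mixed outcome.

```python
import math
import random


def simulate(n=20000, seed=0):
    """Return rows of (interested, risky_layout, click, accidental_click).

    All rates below are invented for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        interested = rng.random() < 0.5      # weak signal of genuine intent
        risky_layout = rng.random() < 0.02   # rare context prone to mis-taps
        intentional = rng.random() < (0.02 if interested else 0.005)
        accidental = risky_layout and rng.random() < 0.30
        click = intentional or accidental
        rows.append((int(interested), int(risky_layout), int(click),
                     bool(accidental and not intentional)))
    return rows


def fit_logistic(rows, steps=20000, lr=1.0):
    """Fit click ~ interested + risky_layout by gradient ascent."""
    # Both features are binary, so aggregate to (count, clicks) per cell.
    cells = {}
    for x1, x2, click, _ in rows:
        n, k = cells.get((x1, x2), (0, 0))
        cells[(x1, x2)] = (n + 1, k + click)
    b = w_interest = w_risky = 0.0
    for _ in range(steps):
        g_b = g_i = g_r = 0.0
        for (x1, x2), (n, k) in cells.items():
            p = 1.0 / (1.0 + math.exp(-(b + w_interest * x1 + w_risky * x2)))
            err = k - n * p  # observed minus expected clicks in this cell
            g_b += err
            g_i += err * x1
            g_r += err * x2
        b += lr * g_b / len(rows)
        w_interest += lr * g_i / len(rows)
        w_risky += lr * g_r / len(rows)
    return b, w_interest, w_risky


def top_scored(rows, weights, k=200):
    """Return the k rows the model scores highest."""
    b, w_interest, w_risky = weights
    score = lambda r: b + w_interest * r[0] + w_risky * r[1]
    return sorted(rows, key=score, reverse=True)[:k]


def accidental_share(rows):
    """Fraction of the clicks in `rows` that were purely accidental."""
    clicks = [r for r in rows if r[2]]
    return sum(r[3] for r in clicks) / len(clicks) if clicks else 0.0


if __name__ == "__main__":
    rows = simulate()
    weights = fit_logistic(rows)
    top = top_scored(rows, weights)
    print("weights (bias, interested, risky_layout):", weights)
    print("accidental share of all clicks:", accidental_share(rows))
    print("accidental share of clicks in top-scored rows:", accidental_share(top))
```

Under these assumed rates, accidental clicks are a minority of all clicks, yet the model puts most of its weight on the accidental context and its highest-scored impressions come from that context. A model selected purely on predictive performance would therefore mostly be finding mis-taps rather than interest.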


Claudia Perlich

Chief Scientist at Dstillery

Claudia Perlich currently acts as chief scientist at Dstillery (previously m6d) and in this role designs, develops, analyzes, and optimizes the machine learning that drives digital advertising. She has published more than 50 scientific articles and holds multiple patents in machine learning. She...


From the same track

SESSION + Live Q&A Ad Serving

Achieving High Load in Advertising Technology

High Load consists of three factors: Latency (the speed of an individual request, business transaction, or event); Throughput (the scale required to process a number of business transactions per time period, a.k.a. TPS); and Availability (the system's 99.XXX% availability, 24/7/365). AdTech is the...

Peter Milne

Technology Architect @Adform

SESSION + Live Q&A Machine Learning

Policing The Stock Market with Machine Learning

Neurensic has built a solution, SCORE, for doing Trade Surveillance using H2O (an open-source pure Java Big Data ML tool), Machine Learning, and a whole lot of domain expertise and data munging. SCORE pulls in private and public market data and in a few minutes will search it for all sorts of bad...

Cliff Click

CTO @Neurensic

SESSION + Live Q&A Ad Serving

Extreme Programming Meets Realtime Data

At Unruly, we're fast and furious both in terms of the volumes we're handling and in our development process. Scaling incrementally has raised some interesting challenges in how to deal with growing data volumes, and by trying to deliver value in small steps, we've ended up with some...

Tom Johnson

Software Engineer @UnrulyCo

Gel Goldsby

Reporting and Data Team Lead @Unruly

SESSION + Live Q&A Ad Serving

The Move to AI: From HFT to Laplace Demon

The race for low latency data continues. 10 years ago, Flashboys were helping HFT make money with low-latency infrastructures. Today, hedge funds build AI brains pumping hundreds of sources of data in real-time, seeking ubiquity to build Laplace Demons.

Eric Horesnyi


Albert Bifet

Associate Professor @Telecom ParisTech

SESSION + Live Q&A Ad Serving

Ad Serving & Finance Open Space
