Track: Data Science & Machine Learning Methods


Data Science & Machine Learning Methods: How to start using machine learning and data science in your environment today. Latest and greatest best practices.

Modern businesses live or die by how well they put their data to work. That means building on new data processing technologies but, more importantly, applying data science methods through those technologies to shape raw data reliably into predictions, alerts, and business insights that give the business an edge.

But effective data science programmes are complicated: creating powerful, sophisticated, high-impact solutions takes a mixture of software engineering, business understanding, and statistical skills, and is far more than a one-time task.

In this track we will talk about data science and machine learning as they are implemented in real, modern businesses. We’ll look at the case studies, best practices, and pitfalls you need to know about as a developer, and hear real stories of methods implemented successfully, from the most commonly used (e.g. anomaly detection) to the most exotic (e.g. neural networks).

Track Host:
Fran Bennett
CEO and cofounder @MastodonC
Francine spent a number of years working for search engines, helping them to turn data into money. She likes coffee, running, sleeping a lot, and large data sets.

10:35am - 11:25am

by Michael Manapat
Head of ML @Stripe

Stripe processes billions of dollars a year for businesses around the world. To protect its users from fraud, Stripe employs machine learning to detect potentially fraudulent transactions. In this talk, I'll describe how we bootstrapped this system and some of the most important aspects of industrial machine learning. We'll talk about how to choose, train, and evaluate models, how to bridge the gap between training and production systems, and how to address common pitfalls using the problem...

11:50am - 12:40pm

by Mathieu Bastian
Builds data products. @LinkedIn data science alumnus and @Gephi co-founder

Applied machine learning data pipelines are being developed at a very fast pace and often exceed traditional web and business application codebases in scale and complexity. The algorithms and processes these data workflows implement serve business-critical applications, which require robust and scalable architectures. But how do we make these data pipelines robust? When the number of developers and data jobs grows while, at the same time, the underlying data changes, how do we test that...

1:40pm - 2:30pm

by Richard Kasperowski
Author of The Core Protocols: A Guide to Greatness

Open Space

2:55pm - 3:45pm

by Michelle Lee
Data Ambassador, DataKind UK

People who repeatedly cycle through the criminal justice system, often for low-level crimes, cost the country millions of dollars in time and effort, but little is known about them. Who are these repeat offenders, and what drives their recidivism? Can we predict who they are to intervene before they enter the vicious cycle of recidivism?

The steps to build a predictive model may vary by project, but the fundamental workflow and principles remain the same. This talk will walk through...

4:10pm - 5:00pm

by Cathy O'Neil
Author of the blog

Algorithms are increasingly being used to automate what used to be human processes. They are potentially more fair and objective, but they are not automatically so. In fact, they can easily codify unfair historical practices. I will discuss some examples of this problem and then pose the question: how would we transparently and comprehensibly audit such algorithms for fairness?

5:25pm - 6:15pm

by Emma Deraze
Lead data scientist @eRevalue

Natural Language Processing (NLP) is an essential part of a growing number of products, covering a vast scope of issues from the seemingly mundane (entity recognition) to the seriously involved (automatic summarization). In all cases, the goal is to reproduce or approximate a process - language - that we as humans rarely think twice about: this makes NLP a field fraught with misunderstandings and incorrect assumptions.

We will talk about the common pitfalls of treating language like '...


Covering innovative topics

Monday, 7 March

Tuesday, 8 March

Wednesday, 9 March