Presentation: Data Inferno: 9 Circles of Data Tests With Apache Airflow

Track: Solutions Track IV

Location: Westminster, 4th flr.

Duration: 10:35am - 11:25am

Day of week: Wednesday

Abstract

Continuous delivery is a given nowadays. This goes hand in hand with a lot of automated testing. For 'normal' applications, such testing is well known and documented in the form of unit tests, integration tests, regression tests, etc. For big data applications, however, another dimension of complexity is added: that of the data itself. The truth is: real data sucks. It always surprises you by how it differs from what you expect. Unreliable data, in turn, can result in unreliable applications, which makes for unhappy users. In this talk, we'll take you on a journey through our Nine Circles of Data Tests, which ensure the data is correct and makes sense. We use Airflow to do this, testing our data and logic at several steps, in order to avoid having to debug such issues over the weekend.

Topics include:

  • CI tests for your data deployments
  • Integrating data tests into your DAG
  • DTAP-ing your data deployments
  • Integrating data science models into this engineering world
  • How we went nuclear with Git
  • How Chuck Norris keeps us honest
  • Local Airflow in Docker

Speaker: John Müller

Data Engineer WB Advanced Analytics @ING_news (ING Bank)

John works as a Data Engineer at WB Advanced Analytics of ING Bank. Working with loads of data from all kinds of different source systems gets you intimately familiar with some good practices in Data Engineering, as you're going to need them all when working with all of it.

