You are viewing content from a past/completed QCon

Presentation: Understanding Deep Learning

Track: AI/Machine Learning without a PhD

Location: Churchill, G flr.

Duration: 4:10pm - 5:00pm

Day of week: Monday


This presentation is now available to view: watch the video with transcript.

What You’ll Learn

  1. Hear about machine learning, deep learning and neural networks.
  2. Find out how to build a model, how to debug it, and how to tune it.
  3. Learn some of the tricks that make building a correct model easier.


No matter what your role is, it is really important to have some understanding of the models you’re working with. In last year's keynote, Rob Harrop talked about the importance of intuition in machine learning. This is a step towards that.

You might already be using neural networks. How can you go beyond just using deep learning and move towards understanding it so you can make your models better?


Deep learning is notoriously opaque, but there are principles behind how neural networks are constructed that can shed a lot of light on how they behave.


The goal of this talk is to help you understand foundational concepts about neural networks that are not often taught in online tutorials (and that even data scientists may not know), so you can better design and deploy neural networks.


We will go from

  1. Dissecting a single layer of a neural network to

  2. How to train (multi-layer) neural networks to

  3. Problems with training very deep networks and how you can tackle them.


At every stage, I will highlight key things to pay attention to, such as learning rates and how to initialise your network. These will all be related to how the networks are constructed and trained, so you can understand why these parameters are so important.
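To see why initialisation is one of those parameters worth paying attention to, here is a minimal NumPy sketch (my own illustration, not material from the talk): pushing an input through a stack of linear layers with naive unit-variance weights makes the activations explode, while scaling the weights by 1/sqrt(n), the idea behind Xavier/He-style schemes, keeps their magnitude roughly stable.

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 256, 20
x = rng.standard_normal(n)

def forward(x, std):
    """Pass x through `depth` linear layers whose weights have the given std."""
    h = x
    for _ in range(depth):
        W = rng.standard_normal((n, n)) * std
        h = W @ h
    return h

# Naive init (std=1): activation magnitudes grow by ~sqrt(n) per layer.
naive = forward(x, 1.0)
# Scaled init (std=1/sqrt(n)): activation magnitudes stay roughly constant.
scaled = forward(x, 1.0 / np.sqrt(n))

print(np.std(naive), np.std(scaled))
```

With badly scaled weights the signal (and, symmetrically, the gradients flowing backwards) explodes or vanishes exponentially with depth, which is exactly why initialisation matters more the deeper the network gets.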


I will end the talk with practical takeaways used by state-of-the-art models to help you kickstart building powerful neural networks.


Tell me a bit about your experience with deep learning.


I frequently use deep learning for a range of different things, very often with time series. Previously I worked in finance, where we tried to predict different kinds of stock prices, bond prices and economic indicators, and I also worked with self-driving cars. The inspiration for this talk was very much based on what I was doing, because when I first started working with deep learning, as opposed to other kinds of machine learning models, I had this spreadsheet and I would try to find the best model, for example, for predicting stock prices. I would try one model architecture, and then hundreds of configurations: different numbers of layers, numbers of units in each layer, different parameters, different kinds of optimizers. I felt that at the time I was just doing trial and error, and I spent a lot of time running a lot of experiments without a clear sense of the direction in which I was going. I didn't know how the different parameters of the different components affected the output, or how good the model was. When I did understand it many months later, I thought: if I'd only known this then, I could have saved so much time running those hundreds of experiments, each of which took hours. So I thought it would be a really good thing to talk about.


Tell me what's the plan for your talk.


We're going to start from the really fundamental stuff, but we're going to talk about it in a way that I hope people who have been using neural networks for months can still find helpful. Neural networks have a lot of layers, so we're going to start off by talking about what happens in a single layer - a linear layer and a non-linearity - and we're going to talk about what kinds of things one or a few layers can model, why you need a non-linearity, and what different non-linearities do. Then we're going to move on to models with more and more layers, because the consensus seems to be that the deeper the models are, the better. But often it's very difficult to train very deep models. I'm going to go through the usual problems, when those problems might arise, and how people have been tackling them in state-of-the-art models. Then I'll finish with practical tips to train really good models.
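To illustrate the point about why you need a non-linearity (a minimal NumPy sketch of my own, not material from the talk): stacked linear layers collapse into a single linear map, and only a non-linearity between them breaks that equivalence.

```python
import numpy as np

# Two stacked linear layers with no non-linearity in between...
W1 = np.array([[1.0, 0.0],
               [0.0, -1.0]])
W2 = np.array([[1.0, 1.0]])
x = np.array([1.0, 1.0])

# ...are exactly equivalent to one linear layer with weights W2 @ W1,
# so extra linear depth adds no modelling power:
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# A non-linearity (here ReLU) between the layers breaks that collapse:
relu = lambda z: np.maximum(z, 0.0)
print(W2 @ relu(W1 @ x))   # [1.]  -- ReLU zeroed the -1 component
print((W2 @ W1) @ x)       # [0.]  -- the collapsed linear map
```

However many purely linear layers you stack, the network can only represent linear functions of its input; the non-linearity is what lets depth buy expressive power.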


Can you give me an example of some of the things that you might talk about for recommendations on training models?


One of the things I'm going to talk about is a really important parameter called the learning rate. If you don't set the learning rate right - if you set it too high - then your model is going to jump from one solution to another; it will be very unstable, and you really don't want that. If it's too low, your model is barely going to learn anything. And some other practical things: in the ML community people have found a trick called batchnorm that improves the performance of a model a lot. On the day I'll talk about more quick tips that the research community has figured out through years of experimentation.
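The learning-rate behaviour described above can be seen on the simplest possible objective, f(w) = w² (a toy illustration of my own, not an example from the talk): too high a rate overshoots further every step, too low a rate barely moves.

```python
# Gradient descent on f(w) = w**2, whose gradient is 2*w, starting at w = 1.0.
def descend(lr, steps=50, w=1.0):
    for _ in range(steps):
        w = w - lr * 2 * w   # update rule: w <- w - lr * f'(w)
    return w

good = descend(lr=0.1)      # shrinks steadily towards the minimum at 0
too_high = descend(lr=1.1)  # overshoots further every step: diverges
too_low = descend(lr=1e-4)  # barely moves away from 1.0 in 50 steps

print(good, too_high, too_low)
```

Real loss surfaces are far less friendly than a parabola, but the same trade-off drives the instability (too high) versus no-learning (too low) behaviour in deep networks.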


Who is the main persona this talk addresses?


I wrote my talk to be informative enough for the person who has been working with deep learning for a while but hasn't really understood it, but I really want this talk to be understandable to someone who has not done any deep learning at all. So it will be simple enough for people who have not had any exposure to DL to understand, but I also hope it'll be insightful enough that even if you've done machine learning for a while without a deep understanding of deep learning, it will still be useful to you.


What do you want someone to leave the talk with?


I want them to leave the talk with an understanding of deep learning, but then the question is what does understanding of deep learning mean, right? In terms of the takeaways, my hope is that firstly they can understand the building blocks of neural networks, what they're made of, how they're trained. And secondly that they can understand why some hacks improve performance.

The overarching theme is the hope that understanding these two things can give them the intuition for how to build a model, how to improve it, and how to design and debug these models. As with a lot of software engineering, the most annoying thing - and the thing that takes a lot of time - is debugging, and it makes a big difference if you can use your intuition to figure out what might be going wrong. So the main takeaways would be having the confidence to do that, and being able to build on this foundation and pick up new architectures quickly.

Speaker: Jessica Yung

Machine Learning blogger and entrepreneur, Self-Driving Car Engineer Scholar @nvidia


Jessica is a research masters student in machine learning at University College London supervised by Prof. John Shawe-Taylor and André Barreto (Google DeepMind). She was previously at the University of Cambridge and was an NVIDIA Self-Driving Car Engineer Scholar. She applied machine learning to finance at Jump Trading and consults on machine learning.


Jessica is keen on sharing knowledge and writes about machine learning and how to learn effectively on her blog at

Find Jessica Yung at


  • Architectures You've Always Wondered About

    Hard-earned lessons from the names you know on scalability, reliability, security, and performance.

  • Machine Learning: The Latest Innovations

    AI and machine learning are more approachable than ever. Discover how ML, deep learning, and other modern approaches are being used in practice.

  • Kubernetes and Cloud Architectures

    Practical approaches and lessons learned for deploying systems into Kubernetes, cloud, and FaaS platforms.

  • Evolving Java

    JVM futures, JIT directions, and improvements to the runtime stack are the theme of this year's JVM track.

  • Next Generation Microservices: Building Distributed Systems the Right Way

    Microservice-based applications are everywhere, but well-built distributed systems are not so common. Early adopters of microservices share their insights on how to design systems the right way.

  • Chaos and Resilience: Architecting for Success

    Making systems resilient involves people and tech. Learn about strategies being used, from cognitive systems engineering to chaos engineering.

  • The Future of the API: REST, gRPC, GraphQL and More

    The humble web-based API is evolving. This track provides the what, how, and why of future APIs.

  • Streaming Data Architectures

    Today's systems move huge volumes of data. Hear how the innovators in this space are designing systems and leveraging modern data stream processing platforms.

  • Modern Compilation Targets

    Learn about the innovation happening in the compilation target space. WebAssembly is only the tip of the iceberg.

  • Leaving the Ivory Tower: Modern CS Research in the Real World

    Thoughts pushing software forward, including consensus, CRDTs, formal methods & probabilistic programming.

  • Bare Knuckle Performance

    Crushing latency and getting the most out of your hardware.

  • Leading Distributed Teams

    Remote and distributed working are increasing in popularity, but many organisations underestimate the leadership challenges. Learn from those who are doing this effectively.

  • Full Cycle Developers: Lead the People, Manage the Process & Systems

    "Full cycle developers" is not just another catch phrase; it's about engineers taking ownership and delivering value, and doing so with the support of their entire organisation. Learn more from the pioneers.

  • JavaScript: Pushing the Client Beyond the Browser

    JavaScript is not just the language of the web. Join this track to learn how the innovators are pushing the boundaries of this classic language and ecosystem.

  • When Things Go Wrong: GDPR, Ethics, & Politics

    Privacy, confidentiality, safety and security: learning from the frontlines, from both good and bad experiences.

  • Growing Unicorns in the EU: Building, Leading and Scaling Financial Tech Start Ups

    Learn how EU FinTech innovators have designed, built, and led both their technologies and organisations.

  • Building High Performing Teams

    To have a high-performing team, everybody on it has to feel and act like an owner. Learn about cultivating culture, creating psychological safety, sharing the vision effectively, and more.

  • Scaling Security, from Device to Cloud

    Implementing effective security is vitally important, regardless of where you are deploying software applications.