You are viewing content from a past/completed QCon

Presentation: Understanding Deep Learning

Track: AI/Machine Learning without a PhD

Location: Churchill, G flr.

Duration: 4:10pm - 5:00pm

Day of week: Monday


This presentation is now available to view: watch the video with transcript.

What You’ll Learn

  1. Hear about machine learning, deep learning and neural networks.
  2. Find out how to build a model, how to debug it, and how to tune it.
  3. Learn some of the tricks that make building a correct model easier.


No matter what your role is, it is really important to have some understanding of the models you’re working with. In last year's keynote, Rob Harrop talked about the importance of intuition in machine learning. This is a step towards that.

You might already be using neural networks. How can you go beyond just using deep learning and move towards understanding it so you can make your models better?


Deep learning is notoriously opaque, but there are principles behind how neural networks are constructed that can shed a lot of light on how they behave.


The goal of this talk is to help you understand foundational concepts about neural networks that are not often taught in online tutorials (and that even data scientists may not know), so you can better design and deploy neural networks.


We will go from

  1. Dissecting a single layer of a neural network to

  2. How to train (multi-layer) neural networks to

  3. Problems with training very deep networks and how you can tackle them.


At every stage, I will highlight key things to pay attention to, such as learning rates and how to initialise your network. These will all be related to how the networks are constructed and trained, so you can understand why these parameters are so important.
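To see why initialisation is one of those parameters worth paying attention to, here is a minimal NumPy sketch (my own illustration, not material from the talk): pushing an input through a stack of linear layers with naive unit-variance weights makes the activations explode, while scaling the weights by 1/sqrt(n), the idea behind Xavier/He-style schemes, keeps their magnitude roughly stable.

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 256, 20
x = rng.standard_normal(n)

def forward(x, std):
    """Pass x through `depth` linear layers whose weights have the given std."""
    h = x
    for _ in range(depth):
        W = rng.standard_normal((n, n)) * std
        h = W @ h
    return h

# Naive init (std=1): activation magnitudes grow by ~sqrt(n) per layer.
naive = forward(x, 1.0)
# Scaled init (std=1/sqrt(n)): activation magnitudes stay roughly constant.
scaled = forward(x, 1.0 / np.sqrt(n))

print(np.std(naive), np.std(scaled))
```

With badly scaled weights the signal (and, symmetrically, the gradients flowing backwards) explodes or vanishes exponentially with depth, which is exactly why initialisation matters more the deeper the network gets.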


I will end the talk with practical takeaways used by state-of-the-art models to help you kickstart building powerful neural networks.


Tell me a bit about your experience with deep learning.


I frequently use deep learning for a range of different things, very often with time series. Previously I worked in finance, where we tried to predict different kinds of stock prices, bond prices and economic indicators, and I also worked with self-driving cars. The inspiration for this talk was very much based on what I was doing, because when I first started working with deep learning, as opposed to other kinds of machine learning models, I had this spreadsheet and I would try to find the best model, for example, for predicting stock prices. I would try one model architecture, and then hundreds of configurations: different numbers of layers, numbers of units in each layer, different parameters, different kinds of optimizers. I felt that at the time I was just doing trial and error, and I spent a lot of time running a lot of experiments without a clear sense of the direction in which I was going. I didn't know how the different parameters of the different components affected the output, or how good the model was. When I did understand it many months later, I thought: if I'd only known this then, I could have saved so much time running those hundreds of experiments, each of which took hours. So I thought it would be a really good thing to talk about.


Tell me what's the plan for your talk.


We're going to start from the really fundamental stuff, but we're going to talk about it in a way that I hope people who have been using neural networks for months can still find helpful. Neural networks have a lot of layers, so we're going to start off by talking about what happens in a single layer - a linear layer and a non-linearity - and we're going to talk about what kinds of things one or a few layers can model, why you need a non-linearity, and what different non-linearities do. Then we're going to move on to models with more and more layers, because the consensus seems to be that the deeper the models are, the better. But often it's very difficult to train very deep models. I'm going to go through the usual problems, when those problems might arise, and how people have been tackling them in state-of-the-art models. Then I'll finish with practical tips to train really good models.
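To illustrate the point about why you need a non-linearity (a minimal NumPy sketch of my own, not material from the talk): stacked linear layers collapse into a single linear map, and only a non-linearity between them breaks that equivalence.

```python
import numpy as np

# Two stacked linear layers with no non-linearity in between...
W1 = np.array([[1.0, 0.0],
               [0.0, -1.0]])
W2 = np.array([[1.0, 1.0]])
x = np.array([1.0, 1.0])

# ...are exactly equivalent to one linear layer with weights W2 @ W1,
# so extra linear depth adds no modelling power:
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

# A non-linearity (here ReLU) between the layers breaks that collapse:
relu = lambda z: np.maximum(z, 0.0)
print(W2 @ relu(W1 @ x))   # [1.]  -- ReLU zeroed the -1 component
print((W2 @ W1) @ x)       # [0.]  -- the collapsed linear map
```

However many purely linear layers you stack, the network can only represent linear functions of its input; the non-linearity is what lets depth buy expressive power.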


Can you give me an example of some of the things that you might talk about for recommendations on training models?


One of the things I'm going to talk about is a really important parameter called the learning rate. If you don't set the learning rate right - if you set it too high - then your model is going to jump from one solution to another; it will be very unstable, and you really don't want that. If it's too low, your model is barely going to learn anything. And some other practical things: in the ML community people have found a trick called batchnorm that improves the performance of a model a lot. On the day I'll talk about more quick tips that the research community has figured out through years of experimentation.
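The learning-rate behaviour described above can be seen on the simplest possible objective, f(w) = w² (a toy illustration of my own, not an example from the talk): too high a rate overshoots further every step, too low a rate barely moves.

```python
# Gradient descent on f(w) = w**2, whose gradient is 2*w, starting at w = 1.0.
def descend(lr, steps=50, w=1.0):
    for _ in range(steps):
        w = w - lr * 2 * w   # update rule: w <- w - lr * f'(w)
    return w

good = descend(lr=0.1)      # shrinks steadily towards the minimum at 0
too_high = descend(lr=1.1)  # overshoots further every step: diverges
too_low = descend(lr=1e-4)  # barely moves away from 1.0 in 50 steps

print(good, too_high, too_low)
```

Real loss surfaces are far less friendly than a parabola, but the same trade-off drives the instability (too high) versus no-learning (too low) behaviour in deep networks.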


Who is the main persona this talk addresses?


I wrote my talk to be informative enough for the person who has been working with deep learning for a while but hasn't really understood it, but I really want this talk to be understandable to someone who has not done any deep learning at all. So it will be simple enough for people who have not had any exposure to DL to understand, but I also hope it'll be insightful enough that even if you've done machine learning for a while without a deep understanding of deep learning, it will still be useful to you.


What do you want someone to leave the talk with?


I want them to leave the talk with an understanding of deep learning, but then the question is what does understanding of deep learning mean, right? In terms of the takeaways, my hope is that firstly they can understand the building blocks of neural networks, what they're made of, how they're trained. And secondly that they can understand why some hacks improve performance.

The overarching theme is the hope that understanding these two things can give them the intuition for how to build a model, how to improve it, and how to design and debug these models. As with a lot of software engineering, the most annoying thing - and the thing that takes a lot of time - is debugging, and it makes a big difference if you can use your intuition to figure out what might be going wrong. So the main takeaways would be having the confidence to do that, and being able to build on this foundation and pick up new architectures quickly.

Speaker: Jessica Yung

Machine Learning blogger and entrepreneur, Self-Driving Car Engineer Scholar @nvidia


Jessica is a research masters student in machine learning at University College London supervised by Prof. John Shawe-Taylor and André Barreto (Google DeepMind). She was previously at the University of Cambridge and was an NVIDIA Self-Driving Car Engineer Scholar. She applied machine learning to finance at Jump Trading and consults on machine learning.


Jessica is keen on sharing knowledge and writes about machine learning and how to learn effectively on her blog at

Find Jessica Yung at


  • Architectures You've Always Wondered About

    Hard-earned lessons from the names you know on scalability, reliability, security, and performance.

  • Machine Learning: The Latest Innovations

    AI and machine learning are more approachable than ever. Discover how ML, deep learning, and other modern approaches are being used in practice.

  • Kubernetes and Cloud Architectures

    Practical approaches and lessons learned for deploying systems into Kubernetes, cloud, and FaaS platforms.

  • Evolving Java

    JVM futures, JIT directions, and improvements to the runtime stack are the theme of this year's JVM track.

  • Next Generation Microservices: Building Distributed Systems the Right Way

    Microservice-based applications are everywhere, but well-built distributed systems are not so common. Early adopters of microservices share their insights on how to design systems the right way.

  • Chaos and Resilience: Architecting for Success

    Making systems resilient involves people and tech. Learn about strategies being used, from cognitive systems engineering to chaos engineering.

  • The Future of the API: REST, gRPC, GraphQL and More

    The humble web-based API is evolving. This track provides the what, how, and why of future APIs.

  • Streaming Data Architectures

    Today's systems move huge volumes of data. Hear how the innovators in this space are designing systems and leveraging modern data stream processing platforms.

  • Modern Compilation Targets

    Learn about the innovation happening in the compilation target space. WebAssembly is only the tip of the iceberg.

  • Leaving the Ivory Tower: Modern CS Research in the Real World

    Thoughts pushing software forward, including consensus, CRDTs, formal methods & probabilistic programming.

  • Bare Knuckle Performance

    Crushing latency and getting the most out of your hardware.

  • Leading Distributed Teams

    Remote and distributed working are increasing in popularity, but many organisations underestimate the leadership challenges. Learn from those who are doing this effectively.

  • Full Cycle Developers: Lead the People, Manage the Process & Systems

    "Full cycle developers" is not just another catch phrase; it's about engineers taking ownership and delivering value, and doing so with the support of their entire organisation. Learn more from the pioneers.

  • JavaScript: Pushing the Client Beyond the Browser

    JavaScript is not just the language of the web. Join this track to learn how the innovators are pushing the boundaries of this classic language and ecosystem.

  • When Things Go Wrong: GDPR, Ethics, & Politics

    Privacy, confidentiality, safety and security: learning from the frontlines, from both good and bad experiences.

  • Growing Unicorns in the EU: Building, Leading and Scaling Financial Tech Start Ups

    Learn how EU FinTech innovators have designed, built, and led both their technologies and organisations.

  • Building High Performing Teams

    To have a high-performing team, everybody on it has to feel and act like an owner. Learn about cultivating culture, creating psychological safety, sharing the vision effectively, and more.

  • Scaling Security, from Device to Cloud

    Implementing effective security is vitally important, regardless of where you are deploying software applications.