You are viewing content from a past/completed QCon


DSSTNE: Deep Learning at Scale

DSSTNE (Deep Scalable Sparse Tensor Network Engine) is a deep learning framework for working with large sparse data sets. It arose out of research into the use of deep learning for product recommendations, after we realized that existing frameworks were limited to a single GPU or to data-parallel scaling and that they handled sparse data sets very inefficiently. DSSTNE provides nearly free sparse input layers for neural networks and stores such data in a CSR-like format, which allowed us to train on data sets that would otherwise have consumed terabytes of memory and/or bandwidth. Further, DSSTNE implements a new approach to model-parallel training that automatically minimizes communication costs, such that on GM204 GPUs one can attain nearly 100% efficient scaling given sufficiently large layer width (~1000 units per GM204 in use). In mid-2016 Amazon open-sourced DSSTNE in exactly the same form as it is used in production, in the hope of advancing the use of deep learning for large sparse data sets wherever they may be.
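To make the idea of a CSR-like sparse input layer concrete, here is a minimal pure-Python sketch. It is not DSSTNE's code or API: the function names `to_csr` and `sparse_input_layer`, the binary (0/1) input assumption, and the toy weights are all hypothetical, chosen only to illustrate why the cost of such a layer can scale with the number of nonzero inputs rather than with the full input width.

```python
from typing import List, Tuple


def to_csr(rows: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Pack per-example lists of active feature indices into CSR-style arrays.

    indptr[r]..indptr[r+1] delimits the slice of `indices` that belongs to
    example r, so only nonzero features are ever stored.
    """
    indptr = [0]
    indices: List[int] = []
    for active in rows:
        indices.extend(sorted(active))
        indptr.append(len(indices))
    return indptr, indices


def sparse_input_layer(
    indptr: List[int],
    indices: List[int],
    weights: List[List[float]],
    bias: List[float],
) -> List[List[float]]:
    """Pre-activations of a dense layer fed by binary sparse inputs.

    weights[j] holds the outgoing weights of input feature j. For 0/1 inputs
    the matrix product reduces to summing the weight rows of active features,
    so the work is proportional to the number of nonzeros.
    """
    out = []
    for r in range(len(indptr) - 1):
        acc = list(bias)
        for j in indices[indptr[r]:indptr[r + 1]]:
            row = weights[j]
            for k in range(len(acc)):
                acc[k] += row[k]
        out.append(acc)
    return out


# Three examples over a hypothetical 6-feature vocabulary; the second is empty.
examples = [[0, 4], [], [5, 2, 1]]
indptr, indices = to_csr(examples)

# Toy weights: feature j contributes [j, 10 * j] to the two hidden units.
weights = [[float(j), float(10 * j)] for j in range(6)]
bias = [0.5, 0.0]

activations = sparse_input_layer(indptr, indices, weights, bias)
print(indptr, indices)
print(activations)
```

In this sketch a batch with a handful of active features per example touches only those weight rows, which is the general reason sparse storage can avoid the memory and bandwidth of a dense input matrix.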


Scott Le Grand

Deep Learning Engineer @Teza (ex-Amazon, ex-NVidia)

Scott is a senior scientist at Teza Technologies. He spent four years at Amazon, where he was the lead author of DSSTNE, the Deep Scalable Sparse Tensor Network Engine. Before that he spent ten years at NVidia, doing work that resulted in 14 GPU-related patents.



Mountbatten, 6th flr.


Modern Learning Systems


Deep Learning, Scalability, Machine Learning


From the same track

SESSION + Live Q&A Machine Learning

Building Robust Machine Learning Systems

Machine learning is powering huge advances in products that we know and love. As a result, ever-growing parts of the systems we build are changing from the deterministic to the probabilistic. The accuracy of machine learning applications can quickly deteriorate in the wild without strategies for...

Stephen Whitworth

Co-founder and Machine Learning Engineer @Ravelin

SESSION + Live Q&A Deep Learning

Deep Learning @Google Scale: Smart Reply in Inbox

Anjuli will describe the algorithmic, scaling and deployment considerations involved in an extremely prominent application of cutting-edge deep learning in a user-facing product: the Smart Reply feature of Google Inbox.

Anjuli Kannan

Software Engineer @GoogleBrain

SESSION + Live Q&A Deep Learning

Products And Prototypes With Keras

In this talk Micha will show how to build a working product with Keras, a high level deep learning framework. He'll start by explaining deep learning at a conceptual level, before describing the product requirements. He'll then show code and discuss design decisions that demonstrate how to train...

Micha Gorelick

Research Engineer @FastForwardLabs, Keras Contributor

SESSION + Live Q&A Machine Learning

Julia: A Modern Language For Modern ML

Julia is a modern, high-performance, dynamic language for technical computing, with many features that make it ideal for machine learning, including just-in-time (JIT) compilation, multiple dispatch, metaprogramming and easy-to-use parallelism. This talk will demonstrate these features, and...

Dr. Viral Shah

Co-Founder and CEO of Julia Computing and a Co-Creator of the Julia language

Dr. Simon Byrne

Quantitative Software Developer @JuliaComputing

SESSION + Live Q&A Machine Learning

Mini Workshop: Hands-on Deep Learning

In this interactive workshop, Micha Gorelick will lead you through modifying an existing deep learning product implemented in Keras. If you plan to run the code, please come with a well-charged laptop battery! And if you get the chance, please also download the Python packages and data we'll...

Micha Gorelick

Research Engineer @FastForwardLabs, Keras Contributor

Mike Lee Williams

Director of Research @FastForwardLabs
