Presentation: DSSTNE: Deep Learning at Scale

Duration: 4:10pm - 5:00pm

Abstract

DSSTNE (Deep Scalable Sparse Tensor Network Engine) is a deep learning framework for working with large sparse data sets. It arose out of research into the use of deep learning for product recommendations, after we realized that existing frameworks were limited to a single GPU or to data-parallel scaling, and that they handled sparse data sets very inefficiently. DSSTNE provides nearly free sparse input layers for neural networks and stores such data in a CSR-like format, which allowed us to train on data sets that would otherwise have consumed terabytes of memory and/or bandwidth. Further, DSSTNE implements a new approach to model-parallel training that automatically minimizes communication costs: on GM204 GPUs, one can attain nearly 100% scaling efficiency given sufficiently large layer widths (roughly 1,000 units per GM204 in use). In mid-2016, Amazon open-sourced DSSTNE in exactly the same form as it is used in production, in the hope of advancing the use of deep learning for large sparse data sets wherever they may be.
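To give a feel for the CSR-like layout the abstract mentions, here is a minimal illustrative sketch (not DSSTNE's actual API or on-disk format): for binary data such as product-purchase vectors, only the non-zero column indices per row are stored, plus one offset per row, instead of a full dense matrix.

```python
# Illustrative sketch of CSR-style storage for sparse binary input rows.
# This mirrors the idea behind DSSTNE's CSR-like format, not its real API.

def to_csr(rows, num_cols):
    """Convert a list of rows (each a list of non-zero column indices)
    into (indptr, indices). Values are implicitly 1.0 for binary data."""
    indptr = [0]          # indptr[i]..indptr[i+1] spans row i's entries
    indices = []          # flattened non-zero column indices
    for row in rows:
        assert all(0 <= c < num_cols for c in row)
        indices.extend(sorted(row))
        indptr.append(len(indices))
    return indptr, indices

# Three users over a hypothetical 10-product catalog; each row lists
# the products that user interacted with.
rows = [[2, 7], [0], [1, 4, 9]]
indptr, indices = to_csr(rows, num_cols=10)

print(indptr)   # row offsets: [0, 2, 3, 6]
print(indices)  # non-zero columns: [2, 7, 0, 1, 4, 9]
```

Dense storage would cost rows × columns entries regardless of sparsity; the CSR form costs only one entry per non-zero plus one offset per row, which is why it scales to catalogs with millions of products.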

Speaker: Scott Le Grand

Deep Learning Engineer @Teza (ex-Amazon, ex-NVIDIA)

Scott is a senior scientist at Teza Technologies. He spent four years at Amazon, where he was the lead author of DSSTNE, the Deep Scalable Sparse Tensor Network Engine. Before that, he spent ten years at NVIDIA, doing work that resulted in 14 GPU-related patents.

