Presentation: Real-Time Decisions Using ML on the Google Cloud Platform
Abstract
Ocado Technology is providing a full solution to put the world’s retailers online using the cloud, robotics, AI and IoT. Processing tens of thousands of orders every day, we generate millions of events every minute, leading to a huge amount of data to be managed. We will present how this Big Data is handled in Google Cloud Platform to build an end-to-end machine learning pipeline: how data is stored and processed in BigQuery, post-processed and copied with Dataflow, then used to train Deep Neural Network models with TensorFlow; how all this is orchestrated using our in-house scheduling software called Query Manager; and how predictions are finally run in real time using Cloud ML Engine and Datastore.
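As a rough illustration of the data flow described in the abstract, the stages can be pictured as functions run in a fixed order by a scheduler. The sketch below is only a toy under stated assumptions: every function name, the sample rows and the threshold "model" are hypothetical stand-ins, no Google Cloud API is called, and it does not reflect how Query Manager actually works, which the abstract does not describe.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the managed services named in the abstract.
# Each stage takes the shared context and returns an updated copy.

def query_warehouse(ctx: Dict) -> Dict:
    # Stand-in for a BigQuery query selecting raw order events (made-up rows).
    rows = [
        {"order_id": 1, "amount": 120.0, "fraud": 0},
        {"order_id": 2, "amount": 980.0, "fraud": 1},
    ]
    return {**ctx, "rows": rows}

def post_process(ctx: Dict) -> Dict:
    # Stand-in for a Dataflow job turning rows into (features, label) pairs.
    examples = [([r["amount"] / 1000.0], r["fraud"]) for r in ctx["rows"]]
    return {**ctx, "examples": examples}

def train_model(ctx: Dict) -> Dict:
    # Stand-in for TensorFlow training: a threshold halfway between class means.
    pos = [x[0] for x, y in ctx["examples"] if y == 1]
    neg = [x[0] for x, y in ctx["examples"] if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return {**ctx, "threshold": threshold}

def serve(ctx: Dict) -> Dict:
    # Stand-in for online prediction: expose a function that scores one order.
    threshold = ctx["threshold"]

    def predict(amount: float) -> int:
        return int(amount / 1000.0 > threshold)

    return {**ctx, "predict": predict}

Stage = Tuple[str, Callable[[Dict], Dict]]

STAGES: List[Stage] = [
    ("query", query_warehouse),
    ("post_process", post_process),
    ("train", train_model),
    ("serve", serve),
]

def run_pipeline(stages: List[Stage]) -> Dict:
    # Toy scheduler: run each stage in order and record what completed.
    ctx: Dict = {}
    completed: List[str] = []
    for name, stage in stages:
        ctx = stage(ctx)
        completed.append(name)
    ctx["completed"] = completed
    return ctx

if __name__ == "__main__":
    result = run_pipeline(STAGES)
    print(result["completed"])
    print(result["predict"](900.0))
```

Keeping each stage behind a plain function boundary is what lets the batch steps (query, post-process, train) run on a schedule while only the final step sits on the real-time path.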
What is the focus of your work today?
Carlos: The team is currently building the new ML-powered fraud application, to be used by fraud agents at Ocado and by other retailers using our Ocado Smart Platform. We are focused on building the pipelines that integrate the real-time production systems with the “big data” stored in Google Cloud, and on adding more features to the ML models to improve their accuracy.
Przemek: My team is building a machine learning platform on top of Google Cloud to help our data scientists be more productive. We would like them to focus on the things they do best - data exploration - instead of doing purely engineering work. We hope that by giving them a set of proper tools they will become self-sufficient to productionise whatever machine learning models they create.
At the same time, we aim to lower the entry barrier to machine learning for all non-data scientists. We firmly believe that this is the way to be successful in the AI space - to have all engineers incorporate simple ML models into their products instead of just having a handful of state-of-the-art solutions created for and by data scientists.
What’s the motivation for this talk?
Carlos: Most talks about ML are given by data scientists who have a research background. This can be intimidating for software engineers, and we also believe such talks overlook important aspects such as how to deploy those models to production, build automated pipelines, monitor their accuracy, etc.
Przemek: We’ve heard people say that all processes and tricks known for years in the software industry no longer apply in the machine learning domain and one needs to fundamentally change the way of thinking to be successful in AI. While there may be some truth to it, we still think that the plain, old software engineering methods established over the last decades can enable the success of machine learning projects.
How would you describe the persona and level of the target audience?
Carlos: I expect to see mainly software engineers who have had some exposure to or interest in machine learning. On the other hand, I also think this is relevant for data scientists who want to learn more about how others have “productionised” ML-based systems.
Przemek: To add to what Carlos has said - we will cover lots of services available in the Google Cloud Platform, so anyone interested in knowing more about GCP will surely benefit from our talk.
What do you want “that” persona to walk away from your talk knowing that they might not have known 50 minutes before?
Carlos: We expect they will have a better understanding of components in Google Cloud Platform and what use cases they work best for. Also, we want to give the attendees some first-hand advice based on the lessons we’ve learnt building production ML systems.
Przemek:
- Make full use of the services available in the cloud, so that your unicorn data scientists can focus on data exploration and building models rather than worrying about underlying infrastructure.
- Machine learning solutions might serve different business purposes (such as recommendations and fraud detection), but they share a lot in terms of architecture.
- You can do everything using a single Google Cloud Platform stack: data exploration, feature engineering, modelling, training and serving. There’s no need to go anywhere else.
- The Google Cloud Machine Learning Engine makes neural networks dead easy to use, so go ahead and try it.
What trend in the next 12 months would you recommend an early adopter/early majority SWE to pay particular attention to?
Carlos: When it comes to Google services, I think this year we’ll hear more about Spanner, since it was made publicly available last year.
Amazon SageMaker, released at the end of 2017, is Amazon’s bet to become a first-class player in the ML ecosystem and will probably attract data scientists and software engineers, as AWS seems to be maturing in the ML space.
Przemek: Currently, a natural step for companies that would like to start using machine learning is to try out the high-level APIs available in the cloud, like Amazon Rekognition or Google Vision API. Those are great tools as long as you don’t need customizations. If you do, then the only possibility is to hire machine learning experts and create neural nets tailored to your problem. This is of course very expensive and not every company is prepared for such an investment.
Recently a new kind of service has been emerging, positioned somewhere between high-level APIs and low-level neural net frameworks. One example is Google Cloud AutoML, which will train a custom vision model on the data you’ve provided. I believe in the next 12 months we will see more of these specialized “model trainers” in different domains and it’ll be a huge game-changer for smaller companies which couldn’t afford to do ML before.