The operation and maintenance of large-scale production machine learning systems have uncovered new challenges that require fundamentally different approaches from those used for traditional software. Security in data & machine learning infrastructure has seen growing attention due to the critical risks being identified as the field expands into more demanding real-world use cases. In this talk we will introduce the motivations for, and the importance of, security in data & machine learning infrastructure through a set of practical examples showcasing "Flawed Machine Learning Security".
These "Flawed ML security" examples are analogous to the annual "OWASP Top 10" report that highlights the top vulnerabilities in the web space, and will highlight common high risk touchpoints. We'll cover a practical example covering how we can mitigate these critical security vulnerabilities. We will cover concepts such as RBAC for ML system artifacts and resources, encryption and access restrictions of data in transit and at rest, best practices for supply chain vulnerability mitigation, tools for vulnerability scans, and templates that practitioners can introduce to ensure best practices.
Interview:
What's the focus of your work these days?
My current focus is on building platforms that enable Data Scientists and ML Engineers to iterate throughout the Model Development Life Cycle. This borrows many lessons from DevOps, such as the “shift left” paradigm, abstracting away as many details of the underlying systems as possible while letting practitioners “own” their own models, training code, etc. at the level of abstraction that is useful for them.
What's the motivation for your talk at QCon London 2024?
In the last few years there has been a huge increase in literature around MLOps and how to build ML systems the right way. However, there has been very little focus on security in the traditional “cybersecurity” / SecOps sense. As a fellow of the Institute for Ethical AI, one of our key concerns is to spread awareness of this gap and kickstart a conversation about the right way to secure our ML systems.
How would you describe your main persona and target audience for this session?
The talk is quite broad in focus, so it should be of interest to AI practitioners (i.e. Data Scientists / ML Engineers), to DevOps / Architects, and to everyone in between. It doesn’t require a deep technical background, although some familiarity with the Model Development Life Cycle may be useful.
Is there anything specific that you'd like people to walk away with after watching your session?
I’ll be 100% honest: you won’t learn any silver bullets from this talk. When it comes to security, every solution will be a combination of processes, tools and humans. Instead, the goal is to make attendees aware of this current gap in ML Systems Design and to introduce them to the field of MLSecOps.
Speaker
Adrian Gonzalez-Martin
Senior MLOps Engineer, Previously Leader of the MLServer Project @Seldon
Adrian is a Senior MLOps Engineer with extensive experience maintaining open source and enterprise MLOps products to solve large-scale problems at leading organisations in the Automotive, Pharmaceutical, Financial and Technology sectors. When he is not doing that, Adrian loves experimenting with new technologies and catching up with the MLOps open source community, where he is a member of the LFAI’s MLSecOps Working Group. Before MLOps, Adrian worked as a Software Engineer across different startups, where he contributed to and led the development of large production codebases. Adrian holds an MSc in Machine Learning from University College London, where he specialised in probabilistic methods applied to healthcare, as well as an MEng in Computer Science from the University of Alicante.