Flawed ML Security: Mitigating Security Vulnerabilities in Data & Machine Learning Infrastructure with MLSecOps

The operation and maintenance of large-scale production machine learning systems has uncovered new challenges that require fundamentally different approaches from those used for traditional software. Security in data & machine learning infrastructure has attracted growing attention as critical risks are identified while these systems expand into more demanding real-world use cases. In this talk we will introduce the motivations for and importance of security in data & machine learning infrastructure through a set of practical examples showcasing "Flawed Machine Learning Security".

These "Flawed ML security" examples are analogous to the annual "OWASP Top 10" report that highlights the top vulnerabilities in the web space, and will highlight common high risk touchpoints. We'll cover a practical example covering how we can mitigate these critical security vulnerabilities. We will cover concepts such as RBAC for ML system artifacts and resources, encryption and access restrictions of data in transit and at rest, best practices for supply chain vulnerability mitigation, tools for vulnerability scans, and templates that practitioners can introduce to ensure best practices.

Interview:

What's the focus of your work these days?

My current focus is on building platforms that enable Data Scientists and ML Engineers to iterate throughout the Model Development Life Cycle. This takes a lot of learnings from DevOps, like the “shift left” paradigm, to abstract away as many details of the underlying systems as possible while letting practitioners “own” their own models, training code, etc. at the abstraction level that is useful for them.

What's the motivation for your talk at QCon London 2024?

In the last few years there has been a huge increase in literature around MLOps and how to build ML systems the right way. However, there has been very little focus on security in the traditional “cybersecurity” / SecOps sense. As a fellow of the Institute for Ethical AI, one of our key concerns is to spread awareness of this gap and kickstart a conversation about the right way to secure our ML systems.

How would you describe your main persona and target audience for this session?

The talk is quite broad in focus, so it should be of interest to AI practitioners (i.e. Data Scientists / ML Engineers), to DevOps engineers / Architects, and to everyone in between. It doesn’t require a deep technical background, although some familiarity with the Model Development Life Cycle may be useful.

Is there anything specific that you'd like people to walk away with after watching your session?

I’ll be 100% honest: you won’t learn any silver bullets from this talk. When it comes to security, every solution will be a combination of processes, tools and humans. Instead, the goal is to make attendees aware of this current gap in ML systems design and to introduce them to the field of MLSecOps.


Speaker

Adrian Gonzalez-Martin

Senior MLOps Engineer, Previously Leader of the MLServer Project @Seldon

Adrian is a Senior MLOps Engineer with extensive experience maintaining open source and enterprise MLOps products to solve large-scale problems at leading organisations in the automotive, pharmaceutical, financial and technology sectors. When he is not doing that, Adrian loves experimenting with new technologies and catching up with the MLOps open source community, where he is a member of the LFAI’s MLSecOps Working Group. Before MLOps, Adrian worked as a Software Engineer at different startups, where he contributed to and led the development of large production codebases. Adrian holds an MSc in Machine Learning from University College London, where he specialised in probabilistic methods applied to healthcare, as well as an MEng in Computer Science from the University of Alicante.


Date

Tuesday Apr 9 / 02:45PM BST ( 50 minutes )

Location

Windsor (5th Fl.)

From the same track

Session AI/ML

Lessons Learned From Building LinkedIn’s AI Data Platform

Tuesday Apr 9 / 05:05PM BST

Taking AI from lab to business is notoriously difficult. It is not just about picking which model flavor of the day to use. More important is making every step of the process reliable and productive.

Felix GV

Principal Staff Engineer @LinkedIn

Session AI/ML

Mind Your Language Models: An Approach to Architecting Intelligent Systems

Tuesday Apr 9 / 11:45AM BST

As large language models (LLMs) emerge from the realm of proof-of-concept (POC) and into mainstream production, the demand for effective architectural strategies intensifies.

Nischal HP

Vice President of Data Science @Scoutbee, Decade of Experience Building Enterprise AI

Session

Large Language Models for Code: Exploring the Landscape, Opportunities, and Challenges

Tuesday Apr 9 / 03:55PM BST

In the rapidly evolving landscape of software development, Large Language Models (LLMs) for code have emerged as a groundbreaking tool for code completion, synthesis and analysis.

Loubna Ben Allal

Machine Learning Engineer @Hugging Face

Session

When AIOps Meets MLOps: What Does It Take To Deploy ML Models at Scale

Tuesday Apr 9 / 10:35AM BST

In this talk, we introduce the concept of AIOps referring to using AI and data-driven tooling to provision, manage and scale distributed IT infra. We particularly focus on how AIOps can be leveraged to help train and deploy machine learning models and pipelines at scale.

Ghida Ibrahim

Chief Architect, Head of Data @Sector Alarm Group, Ex-Facebook/Meta

Session

Connecting the Dots: Applying Generative AI (Limited Space - Registration Required)

Tuesday Apr 9 / 01:35PM BST

Details coming soon.