In this talk, Omar covers trends in the ML ecosystem for Open Science and Open Source. He discusses the power of creating interactive demos using Open Source libraries; BigScience, a one-year-long research workshop involving over 700 researchers; and other community-led efforts that are making the field more accessible than ever.

1. Hear about open source tools that are available for machine learning projects.

2. Learn about the benefits of using such tools and sharing back to the community.

Omar, what is the focus of your work these days?

I am a machine learning engineer at Hugging Face, a fully open source company that is democratizing good machine learning. My work focuses on integrating different open source libraries with the Hub, an open, free platform that allows anyone to share and access machine learning models.

What's the motivation for your talk?

Right now at Hugging Face we have over 30,000 models, and we also have thousands of datasets and demos built by the community. There are many people who could leverage all of this existing work, so the main goal is to help people know what open source tools are out there.

Instead of reinventing the wheel again and again, we should leverage what exists. It's similar to what people do in software development. If you are a software engineer, you will probably go and use open source libraries that are already out there, probably on GitHub. The idea is to do something similar, but with machine learning. You can go to the Hugging Face Hub, find a machine learning model, modify it for your specific use case or project, and share the result with the community.

The talk is also about sharing this collaborative mindset. There is a very big project in which we are participating, called BigScience. It's a collaboration between over 700 researchers from many different organizations: people from universities, from Google, from everywhere. The idea is to train a large machine learning model in a scientifically rigorous, ethical way. Instead of asking the ethical questions about data afterward, the questions are being asked at the beginning of the process, and in collaboration with many people, so it's not just a small group of people deciding for others. This is a very big project we're working on: sharing, collaborating, working with the community.

How would you describe the persona and level of the target audience?

Many people from different backgrounds can benefit from this talk, from software engineers working on ML-related projects to technical product managers.

What do you want these people to walk away with from your presentation?

There are two main things that people can walk away with from this presentation. The first is to learn about the huge open source landscape that is out there. It's not just libraries to train models; there are also many models, datasets, and demos that people can use. The second is learning about this mindset of sharing and collaborating instead of competing. Sharing much of the machine learning work that you're doing as a company will also benefit you. Those are the two main goals.


Omar Sanseviero

Machine Learning Engineer @huggingface

Omar Sanseviero is a Machine Learning Engineer with 7 years of experience. Currently, he works at Hugging Face on the Open Source team, democratizing the usage of Machine Learning. Previously, Omar worked as a Software Engineer at Google on the Assistant and TensorFlow Graphics teams. Omar is...

