Self-hosted language models are going to power the next generation of applications in critical industries like financial services, healthcare, and defence. Self-hosting LLMs, as opposed to using API-based models, comes with its own set of challenges: as well as solving business problems, engineers need to wrestle with the intricacies of model inference, deployment, and infrastructure. In this talk we are going to discuss best practices in model optimisation, serving, and monitoring, with practical tips and real case studies.
Interview:
What's the focus of your work these days?
At TitanML, our focus is on making Generative AI applications easier to develop, deploy, and serve. Recently, much of our work has gone into making it easier to build applications that involve both RAG and JSON-constrained outputs.
What's the motivation for your talk at QCon London 2024?
Almost every business is trying to build and deploy LLM applications at the moment, but very few have successfully got them into production. Our teams are experts in deploying and serving LLM apps, so we have a lot of tips and tricks to help other developers avoid common pitfalls.
How would you describe your main persona and target audience for this session?
This session is for those working with, or thinking of building with, Generative AI, especially self-hosted open source AI. It is not a 'code-along' session, but it may cover some technical concepts.
Is there anything specific that you'd like people to walk away with after watching your session?
I want attendees to realize that deploying LLMs within their own environment is a viable option and is not as scary as it might appear!
Speaker
Meryem Arik
Co-Founder @TitanML
Meryem co-founded TitanML with the vision of creating a seamless and secure infrastructure for enterprise LLM deployments. Meryem's training was in Theoretical Physics and Philosophy at the University of Oxford. Beyond her contributions to TitanML, Meryem is dedicated to sharing her insights on the practical and ethical adoption of AI in enterprise.