The rise of LLMs that use language coherently has created an appetite to ground these models' generations in facts and in private collections of data. The motivation is to reduce hallucinations, and to supply the models with up-to-date, often private information that is not part of their training data. Retrieval-augmented generation (RAG) is a method that adds a search step to ground models in relevant data sources. In this talk, we'll cover the common schematics of RAG systems and tips on how to improve them.
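To make that schematic concrete, here is a minimal sketch of a RAG pipeline in Python. Everything in it is an illustrative assumption rather than an implementation from the talk: the toy corpus, the naive keyword-overlap retriever (real systems would use BM25 or embeddings), and the `generate` stub standing in for a call to any LLM completion API.

```python
# Minimal RAG sketch: retrieve relevant documents for a query, then
# ground the LLM's answer by placing them in the prompt.
# Corpus, scoring, and the `generate` stub are illustrative assumptions,
# not any particular vendor's API.

CORPUS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm GMT.",
    "Premium plans include priority email and phone support.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Naive keyword-overlap retrieval; real systems use BM25 or embeddings."""
    q_terms = set(query.lower().split())
    scored = sorted(
        CORPUS,
        key=lambda doc: len(q_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to an LLM completion API."""
    return f"[LLM completion for prompt of {len(prompt)} chars]"

def rag_answer(query: str) -> str:
    # The search step grounds the model: retrieved text becomes context.
    context = "\n".join(retrieve(query))
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    return generate(prompt)

print(rag_answer("When can I get a refund?"))
```

The key design point the sketch captures is that the generator never answers from its weights alone: the retrieval step decides what evidence reaches the prompt, which is why improving retrieval quality is often the highest-leverage way to improve a RAG system.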
Interview:
What's the focus of your work these days?
I explore and advise enterprises and the developer community on applications of Large Language Models (LLMs).
What's the motivation for your talk at QCon London 2024?
I aim to give builders an intuition for problem-solving with LLMs, going beyond thinking of them as text-in / text-out monoliths.
How would you describe your main persona and target audience for this session?
This talk is accessible to a wide audience. All that's needed is curiosity around large language models.
Is there anything specific that you'd like people to walk away with after watching your session?
Insight into the different kinds of systems that can be built using LLMs as individual components in a pipeline.
Is there anything interesting that you learned from a previous QCon?
How beautiful a melodica sounds over algorithmically generated music. This is from the session Functional Composition by Chris Ford.
Speaker
Jay Alammar
Director & Engineering Fellow @Cohere & Co-Author of "Hands-On Large Language Models"
Jay is the co-author of Hands-On Large Language Models. Through his blog and YouTube channel, Jay has helped millions of people wrap their heads around complex machine learning topics. He harnesses a visual, highly intuitive presentation style to communicate concepts ranging from basic introductions to data analysis and interactive introductions to neural networks, to dissections of state-of-the-art models in Natural Language Processing.
Jay is Director and Engineering Fellow at Cohere, a leading provider of large language models for text generation, search, and retrieval-augmented generation for the enterprise.