AI for Food Image Generation in Production: How & Why

In this talk, we will give a technical overview of a client-facing Food Image Generation solution developed at Delivery Hero. We will walk step by step through the stages of the product development cycle, starting from an initial business hypothesis, continuing with the fast-to-market MVP used for product validation, and ending with the full-scale productization phase that enabled generating 100,000 images per day.

The first focus of our discussion will be on modern approaches in Image Generation that enable the generation of high-quality food-related visual content. We will explore the challenges of experimenting with image generation models, the evaluation techniques, and the ways to fine-tune these models. We'll also cover a set of advanced Computer Vision methods that help maintain high standards of visual content quality by automatically validating the images across dimensions like positioning, colour balance, appropriateness and content relevance.

We will also consider the practical aspects of serving and scaling visual models in production, depending on the maturity level of the product and the infrastructure. Turning to the technical stack, we will outline the most appropriate approaches for each stage, focusing on cost-efficiency, as well as the architectural decisions made at Delivery Hero to host and scale a zoo of visual models.

Interview:

What is the focus of your work?

As a Senior Data Scientist at Delivery Hero, I lead the AI-related projects of the AI Menu Content team. Our main products cover image generation and image enhancement problems. The development process includes several steps: data curation and labeling, modeling, experimentation, and integration into production. Our model zoo includes a set of Computer Vision and Image Generation methods based on the latest advancements in the area, which makes it fascinating to work with.

What’s the motivation for your talk?

The primary goal of my talk is to motivate more teams in the industry to work on the image generation problem. That's why I'm excited to demonstrate the business case and the value such work could bring to a company.

Who is your talk for?

The talk targets AI & ML leaders and practitioners: we'll go through the business foundations and impact, followed by the methods used and the details of how the models are served in production. The focus will be on the applied part of the story and the real-world use cases.

What do you want someone to walk away with from your presentation?

Despite the rapid development in deep learning and AI in general, many companies are still waiting for broader industry adoption of these technologies before investing in them. The most important point I'd like to make clear is that with the modern stack of available frameworks, cloud services, and open-source models, it's possible to achieve a low time-to-market, which makes such investments justified.

What do you think is the next big disruption in software?

AI agents that are capable of effectively using a desktop environment and interacting with web services have the potential to disrupt the industry once their generalization capabilities and cost-efficiency reach a certain level. We can already see the first steps in this direction with the models from Anthropic and OpenAI. It's obvious that the communication interfaces will evolve with time, potentially creating a new industry of AI-tailored operating systems and services.


Speaker

Iaroslav Amerkhanov

Senior Data Scientist @Delivery Hero

Iaroslav pioneered projects in Food Science at Delivery Hero and is now focused on generative AI solutions. He previously founded an EdTech startup and co-founded a sentiment analysis platform.

Date

Tuesday Apr 8 / 01:35PM BST (50 minutes)

Location

Whittle (3rd Fl.)

Topics

AI/ML, Image Generation, Computer Vision, Stable Diffusion

Slides

Slides are not available

From the same track

Session AI/ML

Deploy MultiModal RAG Systems with vLLM

Tuesday Apr 8 / 10:35AM BST

While text-based RAG systems have been everywhere in the last year and a half, there is so much more than text data. Images, audio, and documents often need to be processed together to provide meaningful insights, yet most RAG implementations focus solely on text.

Stephen Batifol

Developer Advocate @Zilliz, Founding Member of the MLOps Community Berlin, Previously Machine Learning Engineer @Wolt, and Data Scientist @Brevo

Session AI/ML

How to Unlock Insights and Enable Discovery Within Petabytes of Autonomous Driving Data

Tuesday Apr 8 / 11:45AM BST

For autonomous vehicle companies, finding valuable insights within millions of hours of video data is essential yet challenging.

Kyra Mozley

Machine Learning Engineer @Wayve

Session

Foundation Models for Recommenders: Challenges, Successes, and Lessons Learned

Tuesday Apr 8 / 02:45PM BST

Recommender systems are an integral part of most products nowadays and are often a key driver of discovery for users of the product.

Moumita Bhattacharya

Senior Research Scientist @Netflix, Previously @Etsy

Session

Building Embedding Models for Large-Scale Real-World Applications

Tuesday Apr 8 / 03:55PM BST

Embedding models are at the core of search, recommendation, and retrieval-augmented generation (RAG) systems, transforming data into meaningful representations.

Sahil Dua

Senior Software Engineer, Machine Learning @Google, Stanford AI, Co-Author of “The Kubernetes Workshop”, Open-Source Enthusiast

Session

Lessons Learned From Building LinkedIn’s First Agent: Hiring Assistant

Tuesday Apr 8 / 05:05PM BST

In October 2024, we announced LinkedIn’s first agent, Hiring Assistant, to a select group of LinkedIn customers.

Karthik Ramgopal

Distinguished Engineer & Tech Lead of the Product Engineering Team @LinkedIn, 15+ Years of Experience in Full-Stack Software Development

Daniel Hewlett

Principal AI Engineer & Technical Lead for AI @LinkedIn, 12+ Years of Experience in ML and AI Engineering, Previously @Google