In this talk, we will give a technical overview of a client-facing Food Image Generation solution developed at Delivery Hero. We will walk step by step through the stages of the product development cycle: the initial business hypothesis, the fast-to-market MVP used to validate the product, and the full-scale productization phase that enabled generating 100,000 images per day.
The first focus of our discussion will be the modern Image Generation approaches that make it possible to produce high-quality food-related visual content. We will explore the challenges of experimenting with image generation models, the techniques for evaluating them, and the ways to fine-tune them. We'll also cover a set of advanced Computer Vision methods that help maintain high standards of visual content quality by automatically validating images across dimensions like positioning, colour balance, appropriateness and content relevance.
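As a rough illustration of what an automatic check along one of these dimensions could look like, here is a minimal sketch of a colour balance heuristic. It is not the method used at Delivery Hero. It assumes a simple "grey-world" style rule: if one colour channel dominates the average of the image, the image probably has a colour cast. The names `colour_cast_score` and `passes_colour_balance` and the `max_cast` threshold are hypothetical and chosen only for this example.

```python
from statistics import fmean
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def colour_cast_score(pixels: Iterable[Pixel]) -> float:
    """Spread of the per-channel means relative to the overall mean.

    0.0 means the three channel averages are identical (no cast);
    larger values mean one channel dominates the image.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("image has no pixels")
    # Average intensity of each channel across the whole image.
    channel_means = [fmean(p[c] for p in pixels) for c in range(3)]
    overall = fmean(channel_means)
    if overall == 0:
        return 0.0  # fully black image: no measurable cast
    return (max(channel_means) - min(channel_means)) / overall


def passes_colour_balance(pixels: Iterable[Pixel], max_cast: float = 0.25) -> bool:
    """Hypothetical acceptance rule; the threshold is an assumption to be tuned."""
    return colour_cast_score(pixels) <= max_cast


if __name__ == "__main__":
    neutral = [(128, 128, 128)] * 16   # flat grey image
    blue_tinted = [(20, 20, 200)] * 16  # strong blue cast
    print(passes_colour_balance(neutral))      # True
    print(passes_colour_balance(blue_tinted))  # False
```

Food photos are rarely colour-neutral on average, so a real validator would need a threshold calibrated on labelled images or a learned model. The sketch only shows the shape of such a check: compute a score per quality dimension, then compare it against a limit.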
We will also consider the practical aspects of serving and scaling visual models in production, which depend on the maturity of both the product and the infrastructure. Turning to the technical stack, we will outline the most appropriate approaches for each stage with a focus on cost-efficiency, as well as the architectural decisions made at Delivery Hero to host and scale a zoo of visual models.
Interview:
What is the focus of your work?
As a Senior Data Scientist at Delivery Hero, I lead the AI-related projects of the AI Menu Content team. Our main products address image generation and image enhancement problems. The development process includes several steps: data curation and labeling, modeling, experimentation, and integration into production. Our model zoo includes a set of Computer Vision and Image Generation methods based on the latest advancements in the area, which makes it fascinating to work with.
What’s the motivation for your talk?
The primary goal of my talk is to motivate more teams in the industry to work on the image generation problem. That's why I'm excited to demonstrate the business case and the value such work can bring to a company.
Who is your talk for?
The talk targets AI & ML leaders and practitioners: we'll go through the business foundations and impact, followed by the methods used and the details of how the models are served in production. The focus will be on the applied part of the story and the real-world use cases.
What do you want someone to walk away with from your presentation?
Despite the rapid development of deep learning and AI in general, many companies are still waiting for broader adoption of these technologies in the industry before investing in them. The most important point I'd like to make is that with the modern stack of available frameworks, cloud services, and open-source models, it's possible to achieve a low time-to-market, which makes such investments justified.
What do you think is the next big disruption in software?
AI agents that can effectively use a desktop environment and interact with web services have the potential to disrupt the industry once their generalization capabilities and cost-efficiency reach a certain level. We can already see the first steps in this direction with the models from Anthropic and OpenAI. It's obvious that communication interfaces will evolve over time, potentially creating a new industry of AI-tailored operating systems and services.
Speaker
Iaroslav Amerkhanov
Senior Data Scientist @Delivery Hero
Iaroslav pioneered projects in Food Science at Delivery Hero and is now focused on generative AI solutions. He previously founded an EdTech startup and co-founded a sentiment analysis platform.