Retrieval-augmented generation, or RAG, has become a foundational approach to building production AI systems. However, deploying RAG in practice can be complex and costly. Developers typically have to manage vector databases, chunking strategies, embedding models, and indexing infrastructure. Designing effective RAG systems is also a moving target, as techniques and best practices evolve in step with rapidly advancing language models.
Google DeepMind recently released the File Search Tool, a fully managed RAG system built directly into the Gemini API. File Search abstracts away the retrieval pipeline, allowing developers to upload documents, code, and other text data, automatically generate embeddings, and query their knowledge base. We wanted to understand how the DeepMind team designed a general-purpose RAG system that maintains high retrieval quality.
Animesh Chatterji is a Software Engineer at Google DeepMind, and Ivan Solovyev is a Product Manager there. Both worked on the File Search Tool. They joined the podcast with Sean Falconer to discuss the evolution of RAG, why simplicity and pricing transparency matter, how embedding models have improved retrieval quality, the tradeoffs between configurability and ease of use, and what’s next for multimodal retrieval across text, images, and beyond.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from AI to quantum computing. Currently, Sean is an AI Entrepreneur in Residence at Confluent where he works on AI strategy and thought leadership. You can connect with Sean on LinkedIn.
Please click here to see the transcript of this episode.
Sponsors
Why is there always a meeting bot in your Zoom call?
Blame Recall.ai.
Recall.ai powers the meeting bots and desktop recording apps behind products like Cluely, HubSpot, and ClickUp. They handle the hard infrastructure work—capturing clean recordings, transcripts, and metadata across Zoom, Google Meet, Microsoft Teams, in-person meetings, and more—so developers don’t have to build it themselves.
If you’re building a meeting notetaker or anything involving conversation data, Recall.ai is the API for meeting recording.
Get started today with $100 in free credits at recall.ai/software
In mobile application security, ‘good enough’ is a risk.
Guardsquare uses advanced, multi-layered code hardening techniques and automated runtime application self-protection and mobile application security testing, combined with real-time threat monitoring, to deliver the highest level of mobile app security.
Discover how Guardsquare brings all these together to provide mobile app security for your Android and iOS apps without compromise at www.guardsquare.com.
You know Fidelity as a financial services leader. But did you know that inside Fidelity is a community of technologists working together to shape the future of finance and tech?
Fidelity is always investing in tomorrow: from emerging tech to cutting-edge tools that will transform what comes next. Their technologists are encouraged to keep learning so they can expand their skillsets, explore new ground, and stay ahead of this rapidly-evolving industry.
And right now Fidelity is hiring technologists to join their team.
Fidelity technologists get the best of both worlds: startup energy that’s grounded in the stability of a financial institution. That means support, resources, and amazing benefits.
Bring your skills to a culture where you’re empowered to dream big and build the tech that drives an organization and makes a real impact on people’s lives.
Find out more at Tech.FidelityCareers.com.
Fidelity is an equal opportunity employer.

