The Google Research team has introduced a new agentic RAG framework. It is built into the Gemini Enterprise Agent Platform and powers a feature called Cross-Corpus Retrieval, now in public preview.
The target is a known failure mode in enterprise search. Standard single-step RAG was not built for multi-source, multi-hop queries. Ask “What are the specs of the server used in Project X?” The system may find a document naming a server ID. It will not know to take that ID and search a second database for specs. The answer comes back partial, or as “not found.”
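The gap between one retrieval pass and a two-hop lookup can be shown with a toy sketch. The two dictionaries, the server ID `SRV-042`, and its specs are hypothetical stand-ins for separate enterprise stores. They are not taken from Google's system.

```python
import re

# Hypothetical toy data: two separate stores, as in the Project X example.
PROJECT_DOCS = {"Project X": "Project X runs on server SRV-042."}
SERVER_SPECS = {"SRV-042": "64 cores, 512 GB RAM"}


def single_step(question: str) -> str:
    """One retrieval pass over the project documents only."""
    for name, text in PROJECT_DOCS.items():
        if name in question:
            return text  # names the server ID but carries no specs
    return "not found"


def two_hop(question: str) -> str:
    """Take the ID found in the first hop and look it up in the second store."""
    first = single_step(question)
    match = re.search(r"SRV-\d+", first)
    if match is None:
        return first
    return SERVER_SPECS.get(match.group(), "not found")


question = "What are the specs of the server used in Project X?"
print(single_step(question))  # partial answer: only the server ID
print(two_hop(question))      # specs from the second store
```

The single-step function returns a sentence that mentions the server but never its specs, which is the partial answer described above.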
What is Google’s New Agentic RAG?
Agentic RAG plans, reasons, and iteratively interacts with data sources, which makes answers to complex queries more dependable and accurate. Google’s version ships as Cross-Corpus Retrieval, hosted on the Gemini Enterprise Agent Platform. Like other multi-agent RAG frameworks, it uses agents that work together. Unlike them, it adds a sufficient-context check before generating a response. Compared to standard RAG, it increases accuracy on factuality datasets by up to 34%. Google’s research team also tested it on proprietary internal datasets and reports better grounding and improved reasoning accuracy on domain-specific tasks.
How the multi-agent architecture works
Think of it as an organized research department, not one search engine. A “Vanilla” RAG system just matches your question to documents. An LLM then generates a response from those matches. The multi-agent framework splits the job into specialized roles.
The Orchestrator decides the request is not a one-step job and delegates. The Planner Agent maps the information pathways across data sources. The Query Rewriter turns a vague request into several precise search queries. The Search Fanout Agent sends those queries to various retrieval sources. Finally, an LLM aggregates the collected context into a response.
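The division of labor can be sketched as a chain of plain functions. This is only an illustration of the roles named above, under loud assumptions: the corpora, their contents, and the keyword logic are hypothetical, and in the real framework each role is an LLM-driven agent rather than a string match.

```python
import re

# Hypothetical corpora; names and contents are illustrative only.
CORPORA = {
    "budget": ["Project X budget: 1.2M USD approved."],
    "timeline": ["Project X timeline: launch planned for Q3."],
    "hr": ["Holiday policy: 25 days per year."],
}


def planner(question: str) -> list[str]:
    """Pick the corpora to visit (keyword match as a stand-in for LLM planning)."""
    words = set(re.findall(r"\w+", question.lower()))
    return [name for name in CORPORA if name in words]


def query_rewriter(question: str, corpora: list[str]) -> list[tuple[str, str]]:
    """Turn one broad request into one focused query per planned corpus."""
    match = re.search(r"Project \w+", question)
    entity = match.group() if match else ""
    return [(corpus, f"{entity} {corpus}".strip()) for corpus in corpora]


def search_fanout(queries: list[tuple[str, str]]) -> list[str]:
    """Send each query to its corpus and collect matching snippets."""
    hits = []
    for corpus, query in queries:
        terms = query.lower().split()
        hits += [d for d in CORPORA[corpus] if all(t in d.lower() for t in terms)]
    return hits


def aggregate(snippets: list[str]) -> str:
    """Stand-in for the LLM that writes the final response."""
    return " ".join(snippets) if snippets else "not found"


def orchestrate(question: str) -> str:
    corpora = planner(question)
    queries = query_rewriter(question, corpora)
    return aggregate(search_fanout(queries))


print(orchestrate("What are the budget and timeline for Project X?"))
```

Keeping each role as its own function mirrors the idea that any one of them can be replaced or improved without touching the others.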
What makes this framework different
The key difference is persistence. The framework knows when it is missing information and keeps searching. This stops the model from guessing when the first search is empty. It also avoids a premature “I don’t have enough information.”
That persistence comes from the Sufficient Context Agent, a new component in Google’s framework. Consider a doctor asking for a patient’s discharge medications, dietary restrictions, and allergic reactions.
In Phase 1, Orchestration, the Root Agent parses the request and delegates. The Planner Agent targets Pharmacy, Nutrition, and Clinical Notes. The Query Rewriter breaks the long request into simple, searchable questions.
In Phase 2, Search, the RAG Agent runs all query fanouts at once. It finds medications and diet, but no allergy mention. A Vanilla RAG system might stop here with an incomplete answer.
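Running the fanouts "at once" can be approximated with a thread pool from the Python standard library. The source names, queries, and snippet text below are hypothetical placeholders, and the `search` function stands in for a real retrieval call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sources. The clinical notes return nothing on the first pass.
SOURCES = {
    "pharmacy": "Discharge medications: medication A, medication B.",
    "nutrition": "Dietary restrictions: low sodium.",
    "clinical_notes": "",
}


def search(source: str, query: str) -> tuple[str, str]:
    """Stand-in for one retrieval call; returns (query, snippet)."""
    return query, SOURCES[source]


def fan_out(queries: dict[str, str]) -> dict[str, str]:
    """Run every (source -> query) lookup concurrently; return query -> snippet."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search, s, q) for s, q in queries.items()]
        return dict(f.result() for f in futures)


results = fan_out({
    "pharmacy": "discharge medications",
    "nutrition": "dietary restrictions",
    "clinical_notes": "allergic reactions",
})
unanswered = [query for query, snippet in results.items() if not snippet]
print(unanswered)
```

The empty result for the allergy query is the gap that a single-pass system would leave unfilled.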
In Phase 3, the Sufficient Context Agent inspects the result. It reads the retrieved snippets pulled from the database. It reviews an intermediate draft against the prompt and snippets. Then it runs a missing pieces analysis. It does not just flag “insufficient context.” It writes a specific Reason and Feedback log naming the gap.
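A minimal sketch of that kind of check is a function that returns a structured verdict instead of a bare flag. The `Verdict` fields, the keyword matching, and the sample snippets are assumptions made for illustration. The actual agent reasons over the draft and snippets with an LLM, and its log format is not published here.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    sufficient: bool
    reason: str    # why the context is or is not enough
    feedback: str  # what the next search should target


def check_context(required: list[str], snippets: list[str]) -> Verdict:
    """Missing-pieces analysis via keyword match (stand-in for LLM review)."""
    text = " ".join(snippets).lower()
    missing = [topic for topic in required if topic.lower() not in text]
    if not missing:
        return Verdict(True, "All requested topics are covered.", "")
    gap = ", ".join(missing)
    return Verdict(
        False,
        f"No retrieved snippet mentions: {gap}.",
        f"Run a new search for: {gap}.",
    )


verdict = check_context(
    ["medication", "diet", "allergy"],
    ["Discharge medications: medication A.", "Dietary restrictions: low sodium."],
)
print(verdict.reason)
print(verdict.feedback)
```

Naming the gap in the feedback field gives the Query Rewriter a concrete target for the next search.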
In Phase 4, Iteration, the Query Rewriter creates a new search for the missing term. The RAG Agent digs into files it skipped and finds the data.
In Phase 5, Synthesis, the agent confirms context is complete. The Synthesis Agent then writes a clean, accurate summary.
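The five phases can be condensed into a short search, check, and re-search loop. Everything in this sketch is hypothetical: the record names and contents, the keyword matching, and especially the `max_rounds` cap, which is an assumption added so the loop always terminates and is not described in the source.

```python
# Hypothetical records. "intake_forms" is a source the first pass skips.
RECORDS = {
    "pharmacy": "Medication list: medication A at discharge.",
    "nutrition": "Diet: low sodium.",
    "intake_forms": "Allergy noted: penicillin.",
}
FIRST_PASS = ["pharmacy", "nutrition"]


def search(sources: list[str], terms: list[str]) -> list[str]:
    """Return records from the given sources that mention any search term."""
    return [RECORDS[s] for s in sources
            if any(t in RECORDS[s].lower() for t in terms)]


def missing_terms(terms: list[str], snippets: list[str]) -> list[str]:
    """Sufficient-context stand-in: which requested terms are still uncovered?"""
    text = " ".join(snippets).lower()
    return [t for t in terms if t not in text]


def answer(terms: list[str], max_rounds: int = 3) -> tuple[str, int]:
    snippets = search(FIRST_PASS, terms)                 # Phase 2: search
    rounds = 1
    while (gaps := missing_terms(terms, snippets)) and rounds < max_rounds:
        skipped = [s for s in RECORDS if s not in FIRST_PASS]
        snippets += search(skipped, gaps)                # Phase 4: iteration
        rounds += 1
    return " ".join(snippets), rounds                    # Phase 5: synthesis


summary, rounds = answer(["medication", "diet", "allergy"])
print(rounds, summary)
```

In this toy run the first pass covers medication and diet, the check reports the allergy gap, and the second round fills it from the skipped source.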
The benchmark case
The Google team evaluated the system on FramesQA, which is based on the FRAMES research paper. FramesQA has 824 queries and a corpus of 2,676 PDF documents. The “Vanilla” baseline used Google’s RAG Engine. That engine includes an advanced retrieval engine, LLM parser, and re-ranker.
Agentic RAG ran in two settings. Single-corpus retrieves from the FramesQA documents only. Cross-corpus adds three distracting datasets, so the Planner Agent must choose where to retrieve. This mimics companies whose databases are managed by separate teams. Accuracy used an LLM-as-a-judge against ground truth answers.
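The scoring step can be sketched as accuracy over judge verdicts. The `toy_judge` below is a simple substring check standing in for the LLM judge, and the sample questions and answers are invented for illustration. The prompt and judge model Google used are not specified here.

```python
from typing import Callable


def toy_judge(prediction: str, truth: str) -> bool:
    """Stand-in for an LLM judge: does the prediction contain the truth?"""
    return truth.strip().lower() in prediction.strip().lower()


def accuracy(predictions: list[str], truths: list[str],
             judge: Callable[[str, str], bool]) -> float:
    """Fraction of predictions the judge marks correct against ground truth."""
    correct = sum(judge(p, t) for p, t in zip(predictions, truths))
    return correct / len(truths)


preds = ["The capital is Paris.", "Berlin", "I don't know"]
truths = ["Paris", "Berlin", "Rome"]
print(f"{accuracy(preds, truths, toy_judge):.3f}")
```

Passing the judge in as a parameter keeps the metric unchanged if the substring check is later replaced by a model call.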
In cross-corpus, the system nearly matched its single-corpus accuracy. It answered 90.1% of questions correctly while selecting the right corpus from four. Latency stayed within 3% on average between the two settings.
| Capability | Vanilla RAG (RAG Engine) | Standard agentic RAG | Google Cross-Corpus Agentic RAG |
|---|---|---|---|
| Retrieval style | Single-step match | Multi-agent, single pass | Multi-agent, iterative |
| Multiple agents | No | Yes | Yes |
| Sufficient Context Agent | No | No | Yes |
| Iterative re-search | No | No | Yes |
| Cross-corpus routing | No | No | Yes (Planner picks from 4) |
| Reported accuracy | Baseline | Not reported here | 90.1% cross-corpus; up to 34% factuality gain vs standard RAG |
| Latency | Not reported here | Not reported here | Within 3% single vs cross |
Use cases
The framework fits multi-hop, multi-source enterprise work. Healthcare teams can compile medications, diet, and allergy data from separate records. Engineering teams can trace a server ID to specs in another database. Finance and project teams can join budget data with timeline logs. The cross-corpus design suits organizations with databases owned by different teams.
Key Takeaways
- Google’s agentic RAG adds a Sufficient Context Agent that re-searches until context is complete.
- It ships as Cross-Corpus Retrieval in Gemini Enterprise Agent Platform, in public preview.
- Reported gain is up to 34% higher factuality accuracy versus standard RAG.
- Cross-corpus routing answered 90.1% of FramesQA questions while picking from four corpora.
- Latency stayed within 3% between single-corpus and cross-corpus runs.

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.

