How to Use Conversational Analytics: A Complete Step-by-Step BigQuery Guide
By Braincuber Team
Published on May 7, 2026
Conversational Analytics shifts data interaction from manual SQL queries to natural language conversations. Instead of writing SELECT statements, you chat with a data agent that understands your business context and returns answers grounded in your actual tables. This beginner guide walks you through setting up and using Conversational Analytics in BigQuery with Gemini-powered AI.
What You'll Learn:
- What Conversational Analytics is and how Gemini powers data agents
- How to set up Conversational Analytics in BigQuery Studio
- Build, configure, and refine your own data agents
- Write effective agent instructions and define glossary terms
- Use verified queries to improve accuracy
- Advanced capabilities: ML, unstructured data, and API sharing
What Is Conversational Analytics?
Conversational analytics is more than a basic text-to-SQL parser; it is a meaningful step toward data democratization. Non-technical users can access real-time insights independently, while data professionals gain a rapid way to explore datasets and automate reporting.
How Gemini Powers Data Agents
At the heart of BigQuery's conversational analytics is a reasoning engine powered by the Gemini family of models. Data agents use a structured, multi-stage pipeline to ensure insights are based on your specific data context:
Intent Interpretation
The agent evaluates the prompt against custom instructions, metadata, and business glossaries. This ensures understanding of terms such as "Q3 performance," which relate to specific fiscal calendars and KPIs.
Schema-Grounded SQL Generation
The AI translates natural language into optimized SQL. The code is specifically mapped to BigQuery schemas and verified against internal logic to ensure accuracy.
Secure Execution
The agent runs the generated query directly within your BigQuery environment. This maintains existing security and IAM protocols. The agent only sees what you are authorized to see.
Insight Synthesis
The agent distills raw rows into a human-readable summary. It provides the underlying data and dynamic visualizations (like charts), and maintains the "thread" of the conversation for follow-up questions.
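To make this pipeline concrete, here is roughly the kind of SQL a data agent might generate for a simple question like "How many trips started in June 2024?" (this uses the public Austin bikeshare table featured later in this guide; the exact query an agent produces may differ):

```sql
-- Illustrative sketch of agent-generated SQL for:
-- "How many trips started in June 2024?"
SELECT COUNT(*) AS trip_count
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE start_time >= TIMESTAMP '2024-06-01'
  AND start_time <  TIMESTAMP '2024-07-01';
```

The agent would then run this query, summarize the resulting count in plain language, and keep the context available for follow-up questions.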
BigQuery vs Looker: Choosing the Right Entry Point
Google Cloud offers conversational analytics across different layers of your data stack. Choosing the right entry point depends on your users and where your business logic lives:
| Feature | BigQuery Conversational Analytics | Looker Conversational Analytics | Looker Studio (via BigQuery Agents) |
|---|---|---|---|
| Best For | Data teams, analysts, and developers building custom applications | Business users who need governed, dashboard-ready insights | Business users who prefer lightweight BI reporting |
| Grounding Method | Direct warehouse schemas, table metadata, and verified queries | LookML (semantic layer) | Connected directly to pre-built BigQuery data agents |
| Data Access | Can analyze structured, predictive (ML), and unstructured data | Strictly structured, modeled data | Structured data |
| Release Status | Preview (as of May 2026) | Generally available | Preview |
Choose BigQuery if you need to build custom AI applications or analyze unstructured data directly. Choose Looker if your organization requires consistent metrics through LookML. Choose Looker Studio if you want to provide non-technical users with a simple way to query existing BigQuery Data Agents. This tutorial focuses on BigQuery as the fastest way for data teams to prototype and productionize agents directly where the data lives.
How Data Agents Work in BigQuery
It is important to understand the architecture of a data agent before setting it up. In the Google Cloud environment, a data agent is the central abstraction layer. It combines BigQuery assets with the reasoning capabilities of the Gemini family of models.
Instead of exposing raw tables directly, a data agent configures everything the model needs to interpret questions, generate secure SQL, and return trustworthy answers. This combination of data sources, instructions, and verified logic makes BigQuery's conversational analytics more reliable than standard text-to-SQL tools.
Knowledge Sources
Tables, Views, and User Defined Functions (UDFs) can connect as knowledge sources. Multiple sources can connect to a single agent for comprehensive analysis.
Agent Instructions
Custom instructions, synonyms, and business glossaries ground the agent in a specific domain. Teach the agent that "Top Customers" refers to users with LTV over $1,000.
Verified Queries
Pre-defined question-and-answer pairs serve as a source of truth. Map specific questions to expert-vetted SQL for critical KPIs and advanced ML requests.
Secure Execution
Agents operate only within authorized data. IAM roles control access, and all queries run within your existing BigQuery security protocols.
Setting Up Conversational Analytics in BigQuery
To follow along with this beginner guide, please make sure you have the following prerequisites:
Google Cloud Project
A Google Cloud project with BigQuery enabled and billing active.
SQL Familiarity
Basic SQL familiarity (you won't write much, but you'll review generated queries).
IAM Role
The Gemini Data Analytics Data Agent Owner IAM role (or equivalent Creator/Editor role).
Before building your first agent, you must configure your Google Cloud project and ensure your user account has the necessary permissions. Data Agents operate as a layer on top of your existing data, so correct IAM (Identity and Access Management) configuration is critical for both security and functionality.
Step 1: Grant IAM Role
Open the Google Cloud console and grant yourself the Gemini Data Analytics Data Agent Owner role:
Navigate to IAM
Go to IAM & Admin → IAM.
Grant Access
Click Grant Access. Add your email and assign the role Gemini Data Analytics Data Agent Owner.
This role grants you permission to create, edit, share, and delete all data agents in the project.
Step 2: Enable Data Analytics API
Use the sidebar navigation menu or search menu at the top of the page to navigate to BigQuery. In the left navigation, click Agents. If the feature is not yet enabled, you'll see a prominent banner or button that says Enable the Data Analytics API with Gemini. Click it, and then enable both the Gemini in BigQuery API and the Gemini for Google Cloud API.
Once enabled, the Agents page becomes fully functional. You should now see the new agent page where you can create and manage data agents.
Navigating the Agent Catalog
The Agent Catalog is used to create, manage, and version data agents within BigQuery Studio. Here's what you'll find in the Agent Catalog:
Predefined Sample Agent
A ready-to-use example agent created automatically for every project. It's view-only and great for exploring how a finished agent looks before you build your own.
My Draft Agents
Agents you've started but not yet published (perfect for experimentation).
My Agents
Agents you have created and published.
Shared by Others
Agents published by teammates in your organization (if they've shared them with you).
The agent lifecycle follows this structure: Draft → Created → Published.
Draft
You can edit instructions, add knowledge sources, test queries in the live preview pane, and iterate without affecting anyone else.
Created
Saved version that's still private.
Published
Live and usable by anyone with the correct IAM roles. Published agents can be used directly in BigQuery Studio, Looker Studio Pro, or via the Conversational Analytics API.
Creating and Configuring a Data Agent in BigQuery
Now that the foundation is set, let's build a Data Agent from scratch. We will use the bigquery-public-data.austin_bikeshare dataset to transform raw trip data into a conversational interface. We'll use two tables:
bikeshare_trips
Detailed trip-level data including trip_id, start_station_name, end_station_name, subscriber_type, start_time, and duration_minutes.
bikeshare_stations
Station metadata including station_id and station names for mapping locations.
Starting the Agent Creation
Navigate to Agent Catalog
Make sure you're on the Agent Catalog tab.
Create Agent
Click the Create agent button. The New agent page opens with an Editor panel on the left and a live Preview pane on the right.
Fill Basics
In the Editor section, fill in the basics first (agent name and description). These two fields help you quickly identify the agent later.
Selecting Knowledge Sources
Knowledge sources define exactly what data the agent can access. The fewer and more focused the sources, the better the accuracy and the lower the cost. In the Knowledge sources section of the editor, click Add source. Search for austin_bikeshare and select bikeshare_trips and bikeshare_stations as the sources.
For each table you add, click Customize. Gemini will automatically generate a description and suggest column metadata. Review everything, accept the accurate suggestions, make any tweaks, and click Update.
Pro Tip
A common mistake is adding 50 tables at once. Start with 2-3 core tables; this makes it much easier to debug the agent's logic. You can always expand the agent's knowledge sources later once the core queries are accurate.
Writing Effective Agent Instructions
Next, you need to ground your agent with instructions. Rather than just writing a generic text prompt (e.g., "Answer questions about sales"), BigQuery's data agent interface allows you to provide highly structured context to guide the AI's query generation. Think of this as onboarding a new analyst with your company's exact data dictionary.
Use the Instructions field to provide structured business context:
Synonyms
Define alternative terms for your columns so the agent understands natural language variations. Example: "Journey", "Ride", and "Commute" all refer to a record in the bikeshare_trips table.
Key Fields
Highlight the most important fields for analysis. Example: Prioritize trip_id, start_station_name, end_station_name, subscriber_type, start_time, and duration_minutes for general reporting.
Excluded Fields
Specify columns that the data agent should strictly avoid. Example: Do not use the bike_id column in the bikeshare_trips table for most analysis, as it is rarely needed for business questions.
Filtering and Grouping
Instruct the agent on standard ways to slice the data. Example: Unless specified otherwise, always filter out trips where duration_minutes < 1 (these are false starts or test rides). Default to grouping data by start_station_name when the user asks for "by station" or "top stations".
Join Relationships
Since our agent pulls from multiple tables, explicitly define how they connect. Example: Join the bikeshare_trips table to the bikeshare_stations table by matching bikeshare_trips.start_station_id to bikeshare_stations.station_id (and similarly for end_station_id).
You can combine all of the above into one clean block in the Instructions field. Begin your instructions with a clear persona statement like "You are a senior transportation analyst..." This sets the tone, depth of analysis, and output style.
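Putting the pieces above together, the Instructions field for this agent might look like the following (a sketch assembled from the examples in this section; adapt thresholds and defaults to your own rules):

```
You are a senior transportation analyst for the Austin bikeshare program.

Synonyms: "journey", "ride", and "commute" all refer to a record in
bikeshare_trips.
Key fields: prioritize trip_id, start_station_name, end_station_name,
subscriber_type, start_time, and duration_minutes for general reporting.
Excluded fields: do not use bike_id for most analysis.
Filtering: unless told otherwise, exclude trips where duration_minutes < 1
(false starts or test rides).
Grouping: default to start_station_name when asked for "by station" or
"top stations".
Joins: join bikeshare_trips to bikeshare_stations by matching
bikeshare_trips.start_station_id to bikeshare_stations.station_id
(and similarly for end_station_id).
```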
Defining Glossary Terms
In addition to instructions, you can (and should) define glossary terms directly in the agent. These help the agent interpret business jargon, abbreviations, and derived concepts consistently.
Click Add term in the Glossary section (usually near Instructions) and create terms with a term, definition, and synonyms (comma-separated).
| Term | Definition | Synonyms |
|---|---|---|
| duration_minutes | Trip duration in minutes. Always use this for user-facing answers and calculations | ride time, trip length, duration, ride duration |
| ridership | The total number (count) of bike trips started | trips, rides, journeys, bike usage, commute count |
| peak_hours | Morning peak (7-9) or evening peak (16-19) hours based on the hour extracted from start_time | rush hour, busy hours, high demand period |
| subscriber_type | Type of rider — Subscriber (monthly or annual pass holder) or Customer (one-time ride) | user type, membership type, pass holder, member, casual rider |
| false_start | A very short trip (usually under 1 minute) that is likely a test ride or accidental unlock | test ride, invalid trip, short trip |
The payoff of a glossary is consistency. If leadership decides next quarter that a false_start should mean any trip under two minutes instead of one, you update the definition once and every conversation with the agent immediately adopts the new logic, keeping answers consistent across the organization.
Testing Your Agent with Natural Language Queries
Once you've configured the knowledge sources, instructions, and glossary terms, it's time to test your agent before publishing. Scroll to the right side of the screen to the Preview pane. This live chat interface lets you interact with your agent in real time as you build it.
Try these four queries of increasing complexity (adjusted to match the dataset's date range up to 2024):
Simple Lookup
"How many trips started in June 2024?"
Filtered Aggregation
"What were the top 5 stations by ridership in the last quarter of 2024?"
Multi-Step Analysis
"Compare average trip duration on weekdays versus weekends in 2024."
Follow-Up Question
"Now show the same comparison, but only for the Zilker Park station." (Tests conversational memory)
What you'll see in the agent's response: Summary (natural-language explanation), Query result (clean table), Insights (bullet-point takeaways), Generated SQL (click Open in Editor to view), and Visualization (auto-generated chart).
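For the weekday-versus-weekend question above, the generated SQL you see in the response might look roughly like this (a sketch; the agent's actual query may differ):

```sql
-- Illustrative sketch: average trip duration on weekdays vs weekends in 2024.
-- In BigQuery, EXTRACT(DAYOFWEEK ...) returns 1 for Sunday and 7 for Saturday.
SELECT
  IF(EXTRACT(DAYOFWEEK FROM start_time) IN (1, 7),
     'Weekend', 'Weekday') AS day_type,
  ROUND(AVG(duration_minutes), 1) AS avg_duration_minutes
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE EXTRACT(YEAR FROM start_time) = 2024
GROUP BY day_type;
```

Reviewing queries like this is how you catch subtle issues, for example whether the agent remembered to exclude false starts per your instructions.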
Improving Accuracy with Verified Queries
Even with good instructions and glossary terms, your data agent may still occasionally misinterpret business rules or generate inconsistent answers. Verified queries solve this by letting you explicitly teach the agent the correct way to handle important or frequently asked questions.
Each verified query consists of a natural language question paired with the exact SQL that should be used. They serve as high-quality examples that anchor the agent's reasoning and are one of the most effective ways to move from a "good enough" agent to a production-ready one.
Writing Your First Verified Query
In the agent editor, scroll to the Verified Queries section. You have two easy ways to add verified queries:
Option 1: Create Manually
Click Add query. Enter a natural language Question, write or paste the correct SQL in the editor, click Run to test it, then click Add to save it.
Option 2: Use Gemini-Generated Suggestions
Click View Gemini-generated suggestions. Review the suggested questions, check the associated tables, select the ones you like, edit if needed, and click Add.
A good verified query for the Austin Bikeshare dataset could be:
Question: What were the top 5 stations by ridership in Q2 2024?
SQL:
```sql
-- "Ridership" counts trips started, matching the agent's glossary definition.
SELECT
  stations.name AS station_name,
  COUNT(trips.trip_id) AS total_ridership
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips
INNER JOIN `bigquery-public-data.austin_bikeshare.bikeshare_stations` AS stations
  ON trips.start_station_id = stations.station_id
WHERE TIMESTAMP_TRUNC(trips.start_time, QUARTER) = TIMESTAMP '2024-04-01 00:00:00'
GROUP BY station_name
ORDER BY total_ridership DESC
LIMIT 5;
```
Two details are worth noting: the fully qualified table names are wrapped in backticks (required because the project ID contains hyphens), and ridership counts trips started, which keeps the verified query consistent with the glossary definition above.
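A second verified query can encode the false-start rule from your instructions, so the agent applies it consistently. For example (the question wording and rounding here are illustrative):

Question: What was the average ride time by subscriber type in 2024?

```sql
-- Excludes false starts (trips under 1 minute), per the agent instructions.
SELECT
  subscriber_type,
  ROUND(AVG(duration_minutes), 1) AS avg_ride_minutes
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE EXTRACT(YEAR FROM start_time) = 2024
  AND duration_minutes >= 1
GROUP BY subscriber_type
ORDER BY avg_ride_minutes DESC;
```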
Iterating on Data Agent Accuracy
Even when the agent gives a reasonable answer on the first try, you can make it significantly more accurate and consistent by reviewing the generated SQL and adding verified queries.
Ask a Question
Ask a question in the Preview pane.
Review SQL
Review the generated SQL (click the SQL box and then Open in Editor to see the full query).
Identify Issues
Identify where the agent went wrong or where the logic can be improved.
Fix with Verified Query
Fix it by adding a verified query (or refining instructions).
Re-Test
Re-test the same question and observe the improvement.
Advanced Capabilities of BigQuery Conversational Analytics
BigQuery's conversational analytics stands out from simple text-to-SQL tools by natively supporting BigQuery ML functions, unstructured data, and easy sharing across the Google Cloud ecosystem.
Machine Learning
One of the biggest differentiators is the agent's ability to call BigQuery ML functions directly from natural language, moving beyond retrospective reporting into forward-looking insights.
Predictive Analytics
Ask a data agent to predict the daily number of trips for the next 30 days based on 2024 trends. It will trigger AI.FORECAST and return a forecast table with an accompanying chart.
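Assuming BigQuery's built-in AI.FORECAST table function, the query behind such a request might look roughly like this (a sketch; the agent's generated query and the function's optional parameters may differ):

```sql
-- Illustrative sketch: forecast daily trip counts 30 days ahead
-- from 2024 history, using the built-in AI.FORECAST function.
SELECT *
FROM AI.FORECAST(
  (
    SELECT
      TIMESTAMP_TRUNC(start_time, DAY) AS trip_day,
      COUNT(*) AS daily_trips
    FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
    WHERE EXTRACT(YEAR FROM start_time) = 2024
    GROUP BY trip_day
  ),
  data_col => 'daily_trips',
  timestamp_col => 'trip_day',
  horizon => 30
);
```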
Anomaly Detection
Ask an agent to detect anomalies in daily ridership during June 2024. It will invoke AI.DETECT_ANOMALIES, comparing June 2024 against prior months, and return a time-series table plus a line chart.
Querying Unstructured Data
Most conversational BI tools fail the moment data isn't neatly organized into rows and columns. BigQuery, however, supports Object Tables, which allow you to analyze unstructured data (like PDFs, images, and raw text logs) stored in Google Cloud Storage. Because the Data Agent is powered by Gemini's multimodal capabilities, it can reason across both your structured metrics and your unstructured files simultaneously. If you have rider survey PDFs or station inspection images in an object table, simply ask, "Summarize the main complaints from the Q2 2024 rider survey PDFs."
Sharing Agents Across Looker Studio Pro and the API
Your data team builds and tests Data Agents in BigQuery Studio, but your end-users likely live in entirely different applications. Google makes it easy to decouple the agent from the GCP Console so you can meet business users where they already work.
Looker Studio Pro
Surface your published BigQuery Data Agents directly inside Looker Studio Pro for non-technical stakeholders.
Conversational Analytics API & ADK
Embed your agent into a custom web application, internal Slack bot, or customer-facing portal using the Conversational Analytics API and Agent Development Kit (ADK).
Frequently Asked Questions
What is Conversational Analytics in BigQuery?
Conversational Analytics is a Gemini-powered feature in BigQuery that allows users to chat with data agents using natural language. Instead of writing SQL, you ask questions and receive data, charts, and generated SQL instantly.
Do I need to know SQL to use Conversational Analytics?
Basic SQL familiarity helps when reviewing generated queries, but non-technical users can access insights independently. Data professionals can use it as a rapid exploration tool.
How does BigQuery Conversational Analytics differ from Looker?
BigQuery Conversational Analytics is for data teams building custom applications with direct warehouse schemas. Looker uses LookML semantic layer for governed, dashboard-ready insights for business users.
Can I use Machine Learning models with Conversational Analytics?
Yes, agents can call BigQuery ML functions directly from natural language for predictive analytics and anomaly detection. You can generate churn predictions, sales forecasts, and time-series forecasts.
How do I improve my data agent's accuracy?
Write clear agent instructions with synonyms, key fields, and join relationships. Define glossary terms and add verified queries for frequently asked questions to anchor the agent's reasoning.
Need Help with Google Cloud & BigQuery?
Our experts can help you set up Conversational Analytics, build custom data agents, and implement Gemini-powered workflows for your organization.
