Weaviate Query Agent: Usage

The Weaviate Query Agent is a pre-built agentic service designed to answer natural language queries based on the data stored in Weaviate Cloud.

The user simply provides a prompt/question in natural language, and the Query Agent takes care of all intervening steps to provide an answer.

Figure: Weaviate Query Agent from a user perspective

This page describes how to use the Query Agent to answer natural language queries, using your data stored in Weaviate Cloud.

Prerequisites

Weaviate instance

This Agent is available exclusively for use with a Weaviate Cloud instance. Refer to the Weaviate Cloud documentation for more information on how to set up a Weaviate Cloud instance.

You can try this Weaviate Agent with a free Sandbox instance on Weaviate Cloud.

Client library

Supported languages

At this time, this Agent is available only for Python and TypeScript/JavaScript. Support for other languages will be added in the future.

For Python, you can install the Weaviate client library with the optional agents extras to use Weaviate Agents. This will install the weaviate-agents package along with the weaviate-client package. For TypeScript/JavaScript, you can install the weaviate-agents package alongside the weaviate-client package.

Install the client library using the following command:

pip install -U "weaviate-client[agents]"

Troubleshooting: Force pip to install the latest version

For existing installations, even pip install -U "weaviate-client[agents]" may not upgrade weaviate-agents to the latest version. If this occurs, additionally try to explicitly upgrade the weaviate-agents package:

pip install -U weaviate-agents

Or install a specific version:

pip install -U weaviate-agents==1.0.0
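
To confirm which versions ended up installed after an upgrade, the Python standard library can report them. The sketch below uses importlib.metadata; installed_version is a hypothetical helper name (not part of the Weaviate client), and it returns None when a package is not installed.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the two packages mentioned above
for name in ("weaviate-client", "weaviate-agents"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If weaviate-agents reports an older version than expected, the explicit upgrade command above is the next thing to try.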

Instantiate the Query Agent

Basic instantiation

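Provide the Weaviate client object for your Weaviate Cloud instance and a default list of collections to query: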

import os
import weaviate
from weaviate.classes.init import Auth
from weaviate.agents.query import QueryAgent


headers = {
    # Provide your required API key(s), e.g. Cohere, OpenAI, etc. for the configured vectorizer(s)
    "X-INFERENCE-PROVIDER-API-KEY": os.environ.get("YOUR_INFERENCE_PROVIDER_KEY", ""),
}

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ.get("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
    headers=headers,
)

# Instantiate a new agent object
qa = QueryAgent(
    client=client, collections=["ECommerce", "FinancialContracts", "Weather"]
)
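
Because os.environ.get returns None for an unset variable, a missing WEAVIATE_URL or WEAVIATE_API_KEY may only surface later, at connection time. A minimal sketch of failing early instead is shown below; require_env is a hypothetical helper, not part of the Weaviate client.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes the variables are exported in your shell):
# cluster_url = require_env("WEAVIATE_URL")
# api_key = require_env("WEAVIATE_API_KEY")
```

The returned values can then be passed to connect_to_weaviate_cloud in place of the os.environ.get calls.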

Configure collections

The list of collections to be queried is further configurable with:

  • Tenant names (required for a multi-tenant collection)
  • Target vector(s) of the collection to query (optional)
  • List of property names for the agent to use (optional)
  • Additional filters to always apply on top of agent-generated ones (optional)

from weaviate.agents.query import QueryAgent
from weaviate.agents.classes import QueryAgentCollectionConfig

qa = QueryAgent(
    client=client,
    collections=[
        QueryAgentCollectionConfig(
            name="ECommerce",  # The name of the collection to query
            target_vector=[
                "name_description_brand_vector"
            ],  # Target vector name(s) for collections with named vectors
            view_properties=[
                "name",
                "description",
                "price",
            ],  # Optional list of property names the agent can view
            # Optional tenant name for collections with multi-tenancy enabled
            # tenant="tenantA"
        ),
        QueryAgentCollectionConfig(name="FinancialContracts"),
        QueryAgentCollectionConfig(name="Weather"),
    ],
)

What does the Query Agent have access to?

The Query Agent derives its access credentials from the Weaviate client object passed to it. This can be further restricted by the collection names provided to the Query Agent.

For example, if the associated Weaviate credentials' user has access to only a subset of collections, the Query Agent will only be able to access those collections.

Additional options

The Query Agent can be instantiated with additional options, such as:

  • system_prompt: A custom system prompt to replace the default system prompt provided by the Weaviate team (systemPrompt for JavaScript).
  • timeout: The maximum time the Query Agent will spend on a single query, in seconds (server-side default: 60).

Custom system prompt

You can provide a custom system prompt to guide the Query Agent's behavior:

from weaviate.agents.query import QueryAgent

# Define a custom system prompt to guide the agent's behavior
system_prompt = """You are a helpful assistant that can answer questions about the products and users in the database.
When you write your response use standard markdown formatting for lists, tables, and other structures.
Emphasize key insights and provide actionable recommendations when relevant."""

qa = QueryAgent(
    client=client,
    collections=["ECommerce", "FinancialContracts", "Weather"],
    system_prompt=system_prompt,
)

response = qa.ask("What are the most expensive items in the store?")
response.display()

User-defined filters

You can apply persistent filters that will always be combined with any agent-generated filters using logical AND operations.

from weaviate.agents.query import QueryAgent
from weaviate.agents.classes import QueryAgentCollectionConfig
from weaviate.classes.query import Filter

# Apply persistent filters that will always be combined with agent-generated filters
qa = QueryAgent(
    client=client,
    collections=[
        QueryAgentCollectionConfig(
            name="ECommerce",
            # This filter ensures only items above $50 are considered
            additional_filters=Filter.by_property("price").greater_than(50),
            target_vector=[
                "name_description_brand_vector"
            ],  # Required target vector name(s) for collections with named vectors
        ),
    ],
)

# The agent will automatically combine these filters with any it generates
response = qa.ask("Find me some affordable clothing items")
response.display()

# You can also apply filters dynamically at runtime
runtime_config = QueryAgentCollectionConfig(
    name="ECommerce",
    additional_filters=Filter.by_property("category").equal("Footwear"),
)

response = qa.ask("What products are available?", collections=[runtime_config])
response.display()

Async Python client

For a usage example with the async Python client, see the Async Python client section.

Querying

The Query Agent supports two query types: Search and Ask.

Search

Search Weaviate with the Query Agent using natural language. The Query Agent will process the question, perform the necessary searches in Weaviate, and return the relevant objects.

# Perform a search using Search Mode (retrieval only, no answer generation)
search_response = qa.search("Find me some vintage shoes under $70", limit=10)

# Access the search results
for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")

Search response structure

# SearchModeResponse structure for Python
search_response = qa.search("winter boots for under $100", limit=5)

# Access different parts of the response
print(f"Original query: {search_response.original_query}")
print(f"Total time: {search_response.total_time}")

# Access usage statistics
print(f"Usage statistics: {search_response.usage}")

# Access the searches performed (if any)
if search_response.searches:
    for search in search_response.searches:
        print(f"Search performed: {search}")

# Access the search results (QueryReturn object)
for obj in search_response.search_results.objects:
    print(f"Properties: {obj.properties}")
    print(f"Metadata: {obj.metadata}")

Example output
Original query: winter boots for under $100
Total time: 4.695224046707153
Usage statistics: requests=2 request_tokens=143 response_tokens=9 total_tokens=152 details=None
Search performed: queries=['winter boots'] filters=[[IntegerPropertyFilter(property_name='price', operator=<ComparisonOperator.LESS_THAN: '<'>, value=100.0)]] filter_operators='AND' collection='ECommerce'
Properties: {'name': 'Bramble Berry Loafers', 'description': 'Embrace your love for the countryside with our soft, hand-stitched loafers, perfect for quiet walks through the garden. Crafted with eco-friendly dyed soft pink leather and adorned with a subtle leaf embossing, these shoes are a testament to the beauty of understated simplicity.', 'price': 75.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.4921875, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Glitter Bootcut Fantasy', 'description': "Step back into the early 2000s with these dazzling silver bootcut jeans. Embracing the era's optimism, these bottoms offer a comfortable fit with a touch of stretch, perfect for dancing the night away.", 'price': 69.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.47265625, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Celestial Step Platform Sneakers', 'description': 'Stride into the past with these baby blue platforms, boasting a dreamy sky hue and cushy soles for day-to-night comfort. Perfect for adding a touch of whimsy to any outfit.', 'price': 90.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.48828125, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Garden Bliss Heels', 'description': 'Embrace the simplicity of countryside elegance with our soft lavender heels, intricately designed with delicate floral embroidery. Perfect for occasions that call for a touch of whimsy and comfort.', 'price': 90.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.45703125, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Garden Stroll Loafers', 'description': 'Embrace the essence of leisurely countryside walks with our soft, leather loafers. Designed for the natural wanderer, these shoes feature delicate, hand-stitched floral motifs set against a soft, cream background, making every step a blend of comfort and timeless elegance.', 'price': 90.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.451171875, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}

Search with pagination

Search supports pagination to handle large result sets efficiently:

# Search with pagination
response_page_1 = qa.search(
    "Find summer shoes and accessories between $50 and $100 that have the tag 'sale'",
    limit=3,
)

# Get the next page of results
response_page_2 = response_page_1.next(limit=3, offset=3)

# Continue paginating
response_page_3 = response_page_2.next(limit=3, offset=3)

# Access results from each page
for page_num, page_response in enumerate(
    [response_page_1, response_page_2, response_page_3], 1
):
    print(f"Page {page_num}:")
    for obj in page_response.search_results.objects:
        # Safely access properties in case they don't exist
        name = obj.properties.get("name", "Unknown Product")
        price = obj.properties.get("price", "Unknown Price")
        print(f"  {name} - ${price}")
    print()

Example output
Page 1:
Glide Platforms - $90.0
Garden Haven Tote - $58.0
Sky Shimmer Sneaks - $69.0

Page 2:
Garden Haven Tote - $58.0
Celestial Step Platform Sneakers - $90.0
Eloquent Satchel - $59.0

Page 3:
Garden Haven Tote - $58.0
Celestial Step Platform Sneakers - $90.0
Eloquent Satchel - $59.0
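
In the example output above, pages 2 and 3 list the same objects, which would be consistent with offset being counted from the start of the result set rather than from the previous page. Under that assumption (worth verifying against the client library reference), the offset for each successive page can be computed as in the sketch below. page_offsets is a hypothetical helper, not part of the library.

```python
from typing import List


def page_offsets(limit: int, pages: int) -> List[int]:
    """Return the offset of each page, assuming offsets count from the first result."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if pages < 0:
        raise ValueError("pages must not be negative")
    return [page * limit for page in range(pages)]


print(page_offsets(limit=3, pages=3))  # [0, 3, 6]
```

With limit=3, the third page would then be requested with offset=6 instead of repeating offset=3.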

Ask

Ask the Query Agent a question using natural language. The Query Agent will process the question, perform the necessary searches in Weaviate, and return the answer.

Consider your query carefully

The Query Agent formulates its strategy based on your query, so make the query as unambiguous, complete, and concise as possible.

# Perform a query using Ask Mode (with answer generation)
response = qa.ask(
    "I like vintage clothes and nice shoes. Recommend some of each below $60."
)

# Print the response
response.display()

Configure collections at runtime

The list of collections to be queried can be overridden at query time, as a list of names, or with further configurations:

Specify collection names only

This example overrides the configured Query Agent collections for this query only.

response = qa.ask(
    "What kinds of contracts are listed? What's the most common type of contract?",
    collections=["FinancialContracts"],
)

response.display()

Configure collections in detail

This example overrides the configured Query Agent collections for this query only, specifying additional options where relevant, such as:

  • Target vector
  • Properties to view
  • Target tenant
  • Additional filters

from weaviate.agents.classes import QueryAgentCollectionConfig

response = qa.ask(
    "I like vintage clothes and nice shoes. Recommend some of each below $60.",
    collections=[
        # Use QueryAgentCollectionConfig class to provide further collection configuration
        QueryAgentCollectionConfig(
            name="ECommerce",  # The name of the collection to query
            target_vector=[
                "name_description_brand_vector"
            ],  # Required target vector name(s) for collections with named vectors
            view_properties=[
                "name",
                "description",
                "category",
                "brand",
            ],  # Optional list of property names the agent can view
        ),
        QueryAgentCollectionConfig(
            name="FinancialContracts",  # The name of the collection to query
            # Optional tenant name for collections with multi-tenancy enabled
            # tenant="tenantA"
        ),
    ],
)

response.display()

Conversational queries

The Query Agent supports multi-turn conversations by passing a list of ChatMessage objects. This works with both Search and Ask query types.

When building conversations with ChatMessage there are two available roles for messages:

  • user - Represents messages from the human user asking questions or providing context
  • assistant - Represents the Query Agent's responses or any AI assistant responses in the conversation history

The conversation history helps the Query Agent understand context from previous exchanges, enabling more coherent multi-turn dialogues. Always alternate between user and assistant roles to maintain a proper conversation flow.

from weaviate.agents.classes import ChatMessage

# Create a conversation with multiple turns
conversation = [
    ChatMessage(role="user", content="Hi!"),
    ChatMessage(role="assistant", content="Hello! How can I assist you today?"),
    ChatMessage(
        role="user",
        content="I have some questions about the weather data. You can assume the temperature is in Fahrenheit and the wind speed is in mph.",
    ),
    ChatMessage(
        role="assistant",
        content="I can help with that. What specific information are you looking for?",
    ),
]

# Add the user's query
conversation.append(
    ChatMessage(
        role="user",
        content="What's the average wind speed, the max wind speed, and the min wind speed",
    )
)

# Get the response
response = qa.ask(conversation)
print(response.final_answer)

# Continue the conversation
conversation.append(ChatMessage(role="assistant", content=response.final_answer))
conversation.append(ChatMessage(role="user", content="and for the temperature?"))

response = qa.ask(conversation)
print(response.final_answer)
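
Since the roles are expected to alternate, it can help to check a conversation before sending it. The sketch below is one way to do that. Message is a stand-in for ChatMessage so that the example runs without the client library, check_alternating is a hypothetical helper, and the rule that a conversation starts and ends with a user message is an assumption drawn from the example above, not a documented requirement.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Message:
    """Stand-in for ChatMessage with the same role and content fields."""

    role: str
    content: str


def check_alternating(conversation: Sequence[Message]) -> None:
    """Raise ValueError unless roles alternate user/assistant and the last message is from the user."""
    if not conversation:
        raise ValueError("Conversation is empty")
    for index, message in enumerate(conversation):
        # Even positions are assumed to be the user, odd positions the assistant
        expected = "user" if index % 2 == 0 else "assistant"
        if message.role != expected:
            raise ValueError(
                f"Message {index} has role {message.role!r}, expected {expected!r}"
            )
    if conversation[-1].role != "user":
        raise ValueError("The last message should be the user's query")


check_alternating(
    [
        Message(role="user", content="Hi!"),
        Message(role="assistant", content="Hello! How can I assist you today?"),
        Message(role="user", content="What's the average wind speed?"),
    ]
)
```

The same check should work on a list of ChatMessage objects, assuming they expose role as an attribute.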

Stream responses

The Query Agent can also stream responses, allowing you to receive the answer as it is being generated.

A streaming response can be requested with the following optional parameters:

  • include_progress: If set to True, the Query Agent will stream a progress update as it processes the query.
  • include_final_state: If set to True, the Query Agent will also yield the final response object (the same object that ask() returns) once the answer is complete.

If both include_progress and include_final_state are set to False, the Query Agent will only include the answer tokens as they are generated, without any progress updates or final state.

from weaviate.agents.classes import ProgressMessage, StreamedTokens

# Example query; replace with your own
query = "I like vintage clothes and nice shoes. Recommend some of each below $60."

for output in qa.ask_stream(
    query,
    # Setting this to false will skip ProgressMessages, and only stream
    # the StreamedTokens / the final QueryAgentResponse
    include_progress=True,  # Default is True
    include_final_state=True,  # Default is True
):
    if isinstance(output, ProgressMessage):
        # The message is a human-readable string, structured info available in output.details
        print(output.message)
    elif isinstance(output, StreamedTokens):
        # The delta is a string containing the next chunk of the final answer
        print(output.delta, end="", flush=True)
    else:
        # This is the final response, as returned by QueryAgent.ask()
        output.display()

Inspect responses

The response from the Query Agent will contain the final answer, as well as additional supporting information.

The supporting information may include searches or aggregations carried out, what information may have been missing, and how many LLM tokens were used by the Agent.

Helper function

Use the provided helper methods, such as .display(), to print the response in a readable format.

# Perform a query using Ask Mode (with answer generation)
response = qa.ask(
    "I like vintage clothes and nice shoes. Recommend some of each below $60."
)

# Print the response
response.display()

This will print the response and a summary of the supporting information found by the Query Agent.

Example output
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────── 🔍 Original Query ────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ I like vintage clothes and nice shoes. Recommend some of each below $60. │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────── 📝 Final Answer ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ For vintage clothing under $60, you might like the Vintage Philosopher Midi Dress by Echo & Stitch. It features deep green velvet fabric with antique gold button details, tailored fit, and pleated skirt, perfect for a classic vintage look. │
│ │
│ For nice shoes under $60, consider the Glide Platforms by Vivid Verse. These are high-shine pink platform sneakers with cushioned soles, inspired by early 2000s playful glamour, offering both style and comfort. │
│ │
│ Both options combine vintage or retro aesthetics with an affordable price point under $60. │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────── 🔭 Searches Executed 1/2 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ QueryResultWithCollection(queries=['vintage clothing'], filters=[[]], filter_operators='AND', collection='ECommerce') │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────── 🔭 Searches Executed 2/2 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ QueryResultWithCollection(queries=['nice shoes'], filters=[[]], filter_operators='AND', collection='ECommerce') │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ 📊 No Aggregations Run │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 📚 Sources ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ - object_id='a7aa8f8a-f02f-4c72-93a3-38bcbd8d5581' collection='ECommerce' │
│ - object_id='ff5ecd6e-8cb9-47a0-bc1c-2793d0172984' collection='ECommerce' │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯


📊 Usage Statistics
┌────────────────┬─────┐
│ LLM Requests: │ 5 │
│ Input Tokens: │ 288 │
│ Output Tokens: │ 17 │
│ Total Tokens: │ 305 │
└────────────────┴─────┘

Total Time Taken: 7.58s

Inspection example

This example outputs:

  • The original user query
  • The answer provided by the Query Agent
  • Searches & aggregations (if any) conducted by the Query Agent
  • Any missing information

print("\n=== Query Agent Response ===")
print(f"Original Query: {response.original_query}\n")

print("🔍 Final Answer Found:")
print(f"{response.final_answer}\n")

print("🔍 Searches Executed:")
for collection_searches in response.searches:
    for result in collection_searches:
        print(f"- {result}\n")

if len(response.aggregations) > 0:
    print("📊 Aggregation Results:")
    for collection_aggs in response.aggregations:
        for agg in collection_aggs:
            print(f"- {agg}\n")

if response.missing_information:
    if response.is_partial_answer:
        print("⚠️ Answer is Partial - Missing Information:")
    else:
        print("⚠️ Missing Information:")
    for missing in response.missing_information:
        print(f"- {missing}")
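
For logging or testing, the same fields can be collected into a plain dictionary instead of printed. The following is a minimal sketch: summarize_response is a hypothetical helper, it assumes searches and aggregations are nested per collection as in the loops above, and the SimpleNamespace object holds made-up values so the example runs without a live instance.

```python
import json
from types import SimpleNamespace
from typing import Any, Dict


def summarize_response(response: Any) -> Dict[str, Any]:
    """Collect the fields used in the inspection example into a plain dict."""
    return {
        "original_query": response.original_query,
        "final_answer": response.final_answer,
        # Count individual searches/aggregations across all collections
        "search_count": sum(len(group) for group in response.searches),
        "aggregation_count": sum(len(group) for group in response.aggregations),
        "missing_information": list(response.missing_information or []),
        "is_partial_answer": bool(response.is_partial_answer),
    }


# Stand-in for a Query Agent response, with made-up values
fake_response = SimpleNamespace(
    original_query="Recommend some vintage clothes below $60.",
    final_answer="(answer text)",
    searches=[["search 1", "search 2"], ["search 3"]],
    aggregations=[],
    missing_information=["price history"],
    is_partial_answer=True,
)

print(json.dumps(summarize_response(fake_response), indent=2))
```

In real use, the object returned by qa.ask() would be passed in place of fake_response.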

Usage - Async Python client

If you are using the async Python Weaviate client, the instantiation pattern remains similar. The difference is the use of the AsyncQueryAgent class instead of the QueryAgent class.

The resulting async pattern works as shown below:

import asyncio
import os
import weaviate
from weaviate.classes.init import Auth
from weaviate.agents.classes import QueryAgentCollectionConfig
from weaviate.agents.query import AsyncQueryAgent

headers = {
    # Provide your required API key(s), e.g. Cohere, OpenAI, etc. for the configured vectorizer(s)
    "X-INFERENCE-PROVIDER-API-KEY": os.environ.get("YOUR_INFERENCE_PROVIDER_KEY", ""),
}

async_client = weaviate.use_async_with_weaviate_cloud(
    cluster_url=os.environ.get("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
    headers=headers,
)


async def query_vintage_clothes(async_query_agent: AsyncQueryAgent):
    response = await async_query_agent.ask(
        "I like vintage clothes and nice shoes. Recommend some of each below $60."
    )
    return ("Vintage Clothes", response)


async def query_financial_data(async_query_agent: AsyncQueryAgent):
    response = await async_query_agent.ask(
        "What kinds of contracts are listed? What's the most common type of contract?",
    )
    return ("Financial Contracts", response)


async def run_concurrent_queries():
    try:
        await async_client.connect()

        async_qa = AsyncQueryAgent(
            async_client,
            collections=[
                QueryAgentCollectionConfig(
                    name="ECommerce",  # The name of the collection to query
                    target_vector=[
                        "name_description_brand_vector"
                    ],  # Optional target vector name(s) for collections with named vectors
                    view_properties=[
                        "name",
                        "description",
                        "category",
                        "brand",
                    ],  # Optional list of property names the agent can view
                ),
                QueryAgentCollectionConfig(
                    name="FinancialContracts",  # The name of the collection to query
                    # Optional tenant name for collections with multi-tenancy enabled
                    # tenant="tenantA"
                ),
            ],
        )

        # Run both queries concurrently and wait for both to complete
        vintage_response, financial_response = await asyncio.gather(
            query_vintage_clothes(async_qa), query_financial_data(async_qa)
        )

        # Display results
        print(f"=== {vintage_response[0]} ===")
        vintage_response[1].display()

        print(f"=== {financial_response[0]} ===")
        financial_response[1].display()

    finally:
        await async_client.close()


asyncio.run(run_concurrent_queries())

Streaming

The async Query Agent can also stream responses, allowing you to receive the answer as it is being generated.

from weaviate.agents.classes import (
    ProgressMessage,
    QueryAgentCollectionConfig,
    StreamedTokens,
)


async def stream_query(async_query_agent: AsyncQueryAgent):
    async for output in async_query_agent.ask_stream(
        "What are the top 5 products sold in the last 30 days?",
        # Setting this to false will skip ProgressMessages, and only stream
        # the StreamedTokens / the final QueryAgentResponse
        include_progress=True,  # Default is True
    ):
        if isinstance(output, ProgressMessage):
            # The message is a human-readable string, structured info available in output.details
            print(output.message)
        elif isinstance(output, StreamedTokens):
            # The delta is a string containing the next chunk of the final answer
            print(output.delta, end="", flush=True)
        else:
            # This is the final response, as returned by QueryAgent.ask()
            output.display()


async def run_streaming_query():
    try:
        await async_client.connect()
        async_qa = AsyncQueryAgent(
            async_client,
            collections=[
                QueryAgentCollectionConfig(
                    name="ECommerce",  # The name of the collection to query
                    target_vector=[
                        "name_description_brand_vector"
                    ],  # Optional target vector name(s) for collections with named vectors
                    view_properties=[
                        "name",
                        "description",
                        "category",
                        "brand",
                    ],  # Optional list of property names the agent can view
                ),
                QueryAgentCollectionConfig(
                    name="FinancialContracts",  # The name of the collection to query
                    # Optional tenant name for collections with multi-tenancy enabled
                    # tenant="tenantA"
                ),
            ],
        )
        await stream_query(async_qa)

    finally:
        await async_client.close()


asyncio.run(run_streaming_query())

Limitations & Troubleshooting

Usage limits

Each Weaviate Cloud organization can make up to 1,000 Query Agent requests per month at no cost.

Requests are consumed based on query type:

  • Ask: 4 requests per query
  • Search: 1 request per query

This limit may change in the future. For questions about usage limits, contact product@weaviate.io.
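
As a worked example of the arithmetic, the sketch below estimates how much of the monthly allowance a given mix of queries would consume, using the figures stated above. The function and constant names are illustrative only.

```python
MONTHLY_FREE_REQUESTS = 1_000  # per Weaviate Cloud organization, as stated above
REQUEST_COST = {"ask": 4, "search": 1}  # requests consumed per query


def requests_consumed(ask_queries: int, search_queries: int) -> int:
    """Return the number of requests a mix of Ask and Search queries consumes."""
    return ask_queries * REQUEST_COST["ask"] + search_queries * REQUEST_COST["search"]


def remaining_requests(ask_queries: int, search_queries: int) -> int:
    """Return how many free requests are left this month (negative if over the limit)."""
    return MONTHLY_FREE_REQUESTS - requests_consumed(ask_queries, search_queries)


print(requests_consumed(100, 200))   # 600
print(remaining_requests(100, 200))  # 400
print(MONTHLY_FREE_REQUESTS // REQUEST_COST["ask"])  # 250
```

So an organization that only uses Ask can run 250 queries per month within the free allowance, while one that only uses Search can run 1,000.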

Custom collection descriptions

The Query Agent uses each collection's description metadata, as well as individual property descriptions, to decide which collection to query.

Both collection descriptions and property descriptions can be updated after the collection has been created. For detailed instructions on updating collection and property descriptions, see the update collection definition documentation.

We are investigating the ability to specify a custom collection description at runtime.

Execution times

The Query Agent performs multiple operations to translate a natural language query into Weaviate queries, and to process the response.

This typically requires multiple calls to generative models (e.g. LLMs) and multiple queries to Weaviate.

As a result, each Query Agent run may take some time to complete. Depending on the query complexity, execution times of around 10 seconds are not uncommon.
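
To see how long your own queries take, you can time the call on the client side. The sketch below wraps any callable with time.perf_counter; timed is a hypothetical helper, and fake_ask is a stand-in for qa.ask so that the example runs without a live instance.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_ask(query: str) -> str:
    """Stand-in for qa.ask that simulates a short delay."""
    time.sleep(0.05)
    return f"answer to: {query}"


answer, seconds = timed(fake_ask, "What products are available?")
print(f"Took {seconds:.2f}s")
```

With a real agent, the call would look like timed(qa.ask, "What products are available?"). Comparing these timings against the timeout option can help decide whether the default is sufficient for your queries.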

Questions and feedback

If you have any questions or feedback, let us know in the user forum.