Weaviate Query Agent: Usage
The Weaviate Query Agent is a pre-built agentic service designed to answer natural language queries based on the data stored in Weaviate Cloud.
The user simply provides a prompt/question in natural language, and the Query Agent takes care of all intervening steps to provide an answer.

This page describes how to use the Query Agent to answer natural language queries, using your data stored in Weaviate Cloud.
Prerequisites
Weaviate instance
This Agent is available exclusively for use with a Weaviate Cloud instance. Refer to the Weaviate Cloud documentation for more information on how to set up a Weaviate Cloud instance.
You can try this Weaviate Agent with a free Sandbox instance on Weaviate Cloud.
Client library
At this time, this Agent is available only for Python and TypeScript/JavaScript. Support for other languages will be added in the future.
For Python, you can install the Weaviate client library with the optional agents extras to use Weaviate Agents. This will install the weaviate-agents package along with the weaviate-client package. For TypeScript/JavaScript, you can install the weaviate-agents package alongside the weaviate-client package.
Install the client library using the following command:
pip install -U "weaviate-client[agents]"
Troubleshooting: Force pip to install the latest version
For existing installations, even pip install -U "weaviate-client[agents]" may not upgrade weaviate-agents to the latest version. If this occurs, also try explicitly upgrading the weaviate-agents package:
pip install -U weaviate-agents
Or install a specific version:
pip install -U weaviate-agents==1.0.1
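To confirm which versions pip actually installed, a quick check with the standard library can help. The sketch below uses importlib.metadata; the helper name installed_version is hypothetical, and the package names are the ones mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("weaviate-client", "weaviate-agents"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If weaviate-agents reports an older version than expected, the explicit upgrade command above is the next thing to try.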
Instantiate the Query Agent
Basic instantiation
Provide:
- Target Weaviate Cloud instance details (e.g. the WeaviateClient object).
- A default list of the collections to be queried.
import os
import weaviate
from weaviate.classes.init import Auth
from weaviate.agents.query import QueryAgent
headers = {
# Provide your required API key(s), e.g. Cohere, OpenAI, etc. for the configured vectorizer(s)
"X-INFERENCE-PROVIDER-API-KEY": os.environ.get("YOUR_INFERENCE_PROVIDER_KEY", ""),
}
client = weaviate.connect_to_weaviate_cloud(
cluster_url=os.environ.get("WEAVIATE_URL"),
auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
headers=headers,
)
# Instantiate a new agent object
qa = QueryAgent(
client=client, collections=["ECommerce", "FinancialContracts", "Weather"]
)
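The connection code above reads WEAVIATE_URL and WEAVIATE_API_KEY from the environment, and os.environ.get returns None without complaint when a variable is unset. A small pre-flight check can surface that earlier. This is only a sketch; missing_env_vars is a hypothetical helper and not part of the Weaviate client.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    names: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names that are unset or empty in the given environment mapping."""
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(["WEAVIATE_URL", "WEAVIATE_API_KEY"])
    if missing:
        print(f"Set these environment variables before connecting: {', '.join(missing)}")
```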
Configure collections
The list of collections to be queried is further configurable with:
- Tenant names (required for a multi-tenant collection)
- Target vector(s) of the collection to query (optional)
- List of property names for the agent to use (optional)
- Additional filters to always apply on top of agent-generated ones (optional)
from weaviate.agents.query import QueryAgent
from weaviate.agents.classes import QueryAgentCollectionConfig
qa = QueryAgent(
client=client,
collections=[
QueryAgentCollectionConfig(
name="ECommerce", # The name of the collection to query
target_vector=[
"name_description_brand_vector"
], # Target vector name(s) for collections with named vectors
view_properties=[
"name",
"description",
"price",
], # Optional list of property names the agent can view
# Optional tenant name for collections with multi-tenancy enabled
# tenant="tenantA"
),
QueryAgentCollectionConfig(name="FinancialContracts"),
QueryAgentCollectionConfig(name="Weather"),
],
)
The Query Agent derives its access credentials from the Weaviate client object passed to it. This can be further restricted by the collection names provided to the Query Agent.
For example, if the associated Weaviate credentials' user has access to only a subset of collections, the Query Agent will only be able to access those collections.
Additional options
The Query Agent can be instantiated with additional options, such as:
- system_prompt: A custom system prompt to replace the default system prompt provided by the Weaviate team (systemPrompt for JavaScript).
- timeout: The maximum time the Query Agent will spend on a single query, in seconds (server-side default: 60).
Custom system prompt
You can provide a custom system prompt to guide the Query Agent's behavior:
from weaviate.agents.query import QueryAgent
from weaviate.agents.classes import QueryAgentCollectionConfig
# Define a custom system prompt to guide the agent's behavior
system_prompt = """You are a helpful assistant that can answer questions about the products and users in the database.
When you write your response use standard markdown formatting for lists, tables, and other structures.
Emphasize key insights and provide actionable recommendations when relevant."""
qa = QueryAgent(
client=client,
collections=[
QueryAgentCollectionConfig(
name="ECommerce", # The name of the collection to query
target_vector=[
"name_description_brand_vector"
], # Target vector name(s) for collections with multiple vectors
),
"FinancialContracts",
"Weather",
],
system_prompt=system_prompt,
)
response = qa.ask("What are the most expensive items in the store?")
response.display()
User-defined filters
You can apply persistent filters that will always be combined with any agent-generated filters using logical AND operations.
from weaviate.agents.query import QueryAgent
from weaviate.agents.classes import QueryAgentCollectionConfig
from weaviate.classes.query import Filter
# Apply persistent filters that will always be combined with agent-generated filters
qa = QueryAgent(
client=client,
collections=[
QueryAgentCollectionConfig(
name="ECommerce",
# This filter ensures only items above $50 are considered
additional_filters=Filter.by_property("price").greater_than(50),
target_vector=[
"name_description_brand_vector"
], # Required target vector name(s) for collections with named vectors
),
],
)
# The agent will automatically combine these filters with any it generates
response = qa.ask("Find me some affordable clothing items")
response.display()
# You can also apply filters dynamically at runtime
runtime_config = QueryAgentCollectionConfig(
name="ECommerce",
additional_filters=Filter.by_property("category").equal("Footwear"),
target_vector=[
"name_description_brand_vector"
], # Required target vector name(s) for collections with named vectors
)
response = qa.ask("What products are available?", collections=[runtime_config])
response.display()
Async Python client
For a usage example with the async Python client, see the Async Python client section.
Querying
The Query Agent supports two query types:
Search
Search Weaviate with the Query Agent using natural language. The Query Agent will process the question, perform the necessary searches in Weaviate, and return the relevant objects.
# Perform a search using Search Mode (retrieval only, no answer generation)
search_response = qa.search("Find me some vintage shoes under $70", limit=10)
# Access the search results
for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")
Search response structure
# SearchModeResponse structure for Python
search_response = qa.search("winter boots for under $100", limit=5)
# Access different parts of the response
print(f"Original query: {search_response.searches[0].query}")
print(f"Total time: {search_response.total_time}")
# Access usage statistics
print(f"Usage statistics: {search_response.usage}")
# Access the searches performed (if any)
if search_response.searches:
    for search in search_response.searches:
        print(f"Search performed: {search}")
# Access the search results (QueryReturn object)
for obj in search_response.search_results.objects:
    print(f"Properties: {obj.properties}")
    print(f"Metadata: {obj.metadata}")
Example output
Original query: winter boots for under $100
Total time: 4.695224046707153
Usage statistics: requests=2 request_tokens=143 response_tokens=9 total_tokens=152 details=None
Search performed: queries=['winter boots'] filters=[[IntegerPropertyFilter(property_name='price', operator=<ComparisonOperator.LESS_THAN: '<'>, value=100.0)]] filter_operators='AND' collection='ECommerce'
Properties: {'name': 'Bramble Berry Loafers', 'description': 'Embrace your love for the countryside with our soft, hand-stitched loafers, perfect for quiet walks through the garden. Crafted with eco-friendly dyed soft pink leather and adorned with a subtle leaf embossing, these shoes are a testament to the beauty of understated simplicity.', 'price': 75.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.4921875, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Glitter Bootcut Fantasy', 'description': "Step back into the early 2000s with these dazzling silver bootcut jeans. Embracing the era's optimism, these bottoms offer a comfortable fit with a touch of stretch, perfect for dancing the night away.", 'price': 69.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.47265625, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Celestial Step Platform Sneakers', 'description': 'Stride into the past with these baby blue platforms, boasting a dreamy sky hue and cushy soles for day-to-night comfort. Perfect for adding a touch of whimsy to any outfit.', 'price': 90.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.48828125, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Garden Bliss Heels', 'description': 'Embrace the simplicity of countryside elegance with our soft lavender heels, intricately designed with delicate floral embroidery. Perfect for occasions that call for a touch of whimsy and comfort.', 'price': 90.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.45703125, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Properties: {'name': 'Garden Stroll Loafers', 'description': 'Embrace the essence of leisurely countryside walks with our soft, leather loafers. Designed for the natural wanderer, these shoes feature delicate, hand-stitched floral motifs set against a soft, cream background, making every step a blend of comfort and timeless elegance.', 'price': 90.0}
Metadata: {'creation_time': None, 'last_update_time': None, 'distance': None, 'certainty': None, 'score': 0.451171875, 'explain_score': None, 'is_consistent': None, 'rerank_score': None}
Search with pagination
Search supports pagination to handle large result sets efficiently:
# Search with pagination
response_page_1 = qa.search(
"Find summer shoes and accessories between $50 and $100 that have the tag 'sale'",
limit=3,
)
# Get the next page of results
response_page_2 = response_page_1.next(limit=3, offset=3)
# Continue paginating
response_page_3 = response_page_2.next(limit=3, offset=3)
# Access results from each page
for page_num, page_response in enumerate(
    [response_page_1, response_page_2, response_page_3], 1
):
    print(f"Page {page_num}:")
    for obj in page_response.search_results.objects:
        # Safely access properties in case they don't exist
        name = obj.properties.get("name", "Unknown Product")
        price = obj.properties.get("price", "Unknown Price")
        print(f" {name} - ${price}")
    print()
Example output
Page 1:
Glide Platforms - $90.0
Garden Haven Tote - $58.0
Sky Shimmer Sneaks - $69.0
Page 2:
Garden Haven Tote - $58.0
Celestial Step Platform Sneakers - $90.0
Eloquent Satchel - $59.0
Page 3:
Garden Haven Tote - $58.0
Celestial Step Platform Sneakers - $90.0
Eloquent Satchel - $59.0
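In the example output above, pages 2 and 3 contain the same items. That is what you would expect if offset is counted from the start of the result set rather than from the previous page; this reading is an assumption, not something this page states. Under that assumption, the offset for each page could be computed as in the sketch below. page_offset is a hypothetical helper, and the list is a stand-in for real search results.

```python
def page_offset(page_number: int, page_size: int) -> int:
    """Absolute offset for a 1-indexed page, assuming offsets count from the first result."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    return (page_number - 1) * page_size


results = [f"item-{i}" for i in range(10)]  # stand-in for a ranked result set
for page in (1, 2, 3):
    start = page_offset(page, page_size=3)
    print(page, results[start : start + 3])
```

With a page size of 3, this gives offsets 0, 3, and 6 for the first three pages.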
Ask
Ask the Query Agent a question using natural language. The Query Agent will process the question, perform the necessary searches in Weaviate, and return the answer.
The Query Agent formulates its strategy based on your query, so make the query as unambiguous, complete, and concise as possible.
# Perform a query using Ask Mode (with answer generation)
response = qa.ask(
"I like vintage clothes and nice shoes. Recommend some of each below $60."
)
# Print the response
response.display()
Configure collections at runtime
The list of collections to be queried can be overridden at query time, as a list of names, or with further configurations:
Specify collection names only
This example overrides the configured Query Agent collections for this query only.
response = qa.ask(
"What kinds of contracts are listed? What's the most common type of contract?",
collections=["FinancialContracts"],
)
response.display()
Configure collections in detail
This example overrides the configured Query Agent collections for this query only, specifying additional options where relevant, such as:
- Target vector
- Properties to view
- Target tenant
- Additional filters
from weaviate.agents.classes import QueryAgentCollectionConfig
response = qa.ask(
"I like vintage clothes and nice shoes. Recommend some of each below $60.",
collections=[
# Use QueryAgentCollectionConfig class to provide further collection configuration
QueryAgentCollectionConfig(
name="ECommerce", # The name of the collection to query
target_vector=[
"name_description_brand_vector"
], # Required target vector name(s) for collections with named vectors
view_properties=[
"name",
"description",
"category",
"brand",
], # Optional list of property names the agent can view
),
QueryAgentCollectionConfig(
name="FinancialContracts", # The name of the collection to query
# Optional tenant name for collections with multi-tenancy enabled
# tenant="tenantA"
),
],
)
response.display()
Conversational queries
The Query Agent supports multi-turn conversations by passing a list of ChatMessage objects. This works with both Search and Ask query types.
When building conversations with ChatMessage there are two available roles for messages:
- user: Represents messages from the human user asking questions or providing context
- assistant: Represents the Query Agent's responses or any AI assistant responses in the conversation history
The conversation history helps the Query Agent understand context from previous exchanges, enabling more coherent multi-turn dialogues. Always alternate between user and assistant roles to maintain a proper conversation flow.
from weaviate.agents.classes import ChatMessage
# Create a conversation with multiple turns
conversation = [
ChatMessage(role="user", content="Hi!"),
ChatMessage(role="assistant", content="Hello! How can I assist you today?"),
ChatMessage(
role="user",
content="I have some questions about the weather data. You can assume the temperature is in Fahrenheit and the wind speed is in mph.",
),
ChatMessage(
role="assistant",
content="I can help with that. What specific information are you looking for?",
),
]
# Add the user's query
conversation.append(
ChatMessage(
role="user",
content="What's the average wind speed, the max wind speed, and the min wind speed",
)
)
# Get the response
response = qa.ask(conversation)
print(response.final_answer)
# Continue the conversation
conversation.append(ChatMessage(role="assistant", content=response.final_answer))
conversation.append(ChatMessage(role="user", content="and for the temperature?"))
response = qa.ask(conversation)
print(response.final_answer)
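Because the conversation history should alternate between the user and assistant roles, it can be worth checking a history before sending it. This is a minimal sketch: Message is a stand-in dataclass with the same role and content fields used above, not the library's ChatMessage, and roles_alternate is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Message:
    """Stand-in for a chat message; only the role and content fields are modeled."""

    role: str
    content: str


def roles_alternate(conversation: Sequence[Message]) -> bool:
    """True if every role is 'user' or 'assistant' and no two adjacent roles match."""
    if any(m.role not in ("user", "assistant") for m in conversation):
        return False
    return all(a.role != b.role for a, b in zip(conversation, conversation[1:]))


history = [
    Message("user", "Hi!"),
    Message("assistant", "Hello! How can I assist you today?"),
    Message("user", "What's the average wind speed?"),
]
print(roles_alternate(history))
```

The same check should work on real ChatMessage objects as long as they expose a role attribute, which the constructor calls above suggest but this sketch does not verify.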
Stream responses
The Query Agent can also stream responses, allowing you to receive the answer as it is being generated.
A streaming response can be requested with the following optional parameters:
- include_progress: If set to True, the Query Agent will stream a progress update as it processes the query.
- include_final_state: If set to True, the Query Agent will stream the final answer as it is generated, rather than waiting for the entire answer to be generated before returning it.
If both include_progress and include_final_state are set to False, the Query Agent will only include the answer tokens as they are generated, without any progress updates or final state.
from weaviate.agents.classes import ProgressMessage, StreamedTokens
# Example question; replace with your own
query = "What are the most expensive items in the store?"
for output in qa.ask_stream(
    query,
    # Setting this to false will skip ProgressMessages, and only stream
    # the StreamedTokens / the final QueryAgentResponse
    include_progress=True,  # Default is True
    include_final_state=True,  # Default is True
):
    if isinstance(output, ProgressMessage):
        # The message is a human-readable string, structured info available in output.details
        print(output.message)
    elif isinstance(output, StreamedTokens):
        # The delta is a string containing the next chunk of the final answer
        print(output.delta, end="", flush=True)
    else:
        # This is the final response, as returned by QueryAgent.ask()
        output.display()
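If you also need the streamed answer as a single string (for example, to log it), the deltas can be accumulated as they arrive. The sketch below uses a stand-in Tokens class with a delta attribute in place of StreamedTokens, and a plain list in place of the stream returned by ask_stream. Both are assumptions for illustration, and collect_answer is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Tokens:
    """Stand-in for a streamed chunk; only the delta attribute is modeled."""

    delta: str


def collect_answer(stream: Iterable[object]) -> str:
    """Join the delta of every token chunk in the stream, ignoring other outputs."""
    parts: List[str] = []
    for output in stream:
        if isinstance(output, Tokens):
            parts.append(output.delta)
    return "".join(parts)


fake_stream = [Tokens("The most "), "progress update", Tokens("expensive item...")]
print(collect_answer(fake_stream))
```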
Inspect responses
The response from the Query Agent will contain the final answer, as well as additional supporting information.
The supporting information may include searches or aggregations carried out, what information may have been missing, and how many LLM tokens were used by the Agent.
Helper function
Use the provided helper methods (e.g. .display()) to show the response in a readable format.
# Perform a query using Ask Mode (with answer generation)
response = qa.ask(
"I like vintage clothes and nice shoes. Recommend some of each below $60."
)
# Print the response
response.display()
This will print the response and a summary of the supporting information found by the Query Agent.
Example output
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────── 🔍 Original Query ────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ I like vintage clothes and nice shoes. Recommend some of each below $60. │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────── 📝 Final Answer ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ For vintage clothing under $60, you might like the Vintage Philosopher Midi Dress by Echo & Stitch. It features deep green velvet fabric with antique gold button details, tailored fit, and pleated skirt, perfect for a classic vintage look. │
│ │
│ For nice shoes under $60, consider the Glide Platforms by Vivid Verse. These are high-shine pink platform sneakers with cushioned soles, inspired by early 2000s playful glamour, offering both style and comfort. │
│ │
│ Both options combine vintage or retro aesthetics with an affordable price point under $60. │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────── 🔭 Searches Executed 1/2 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ QueryResultWithCollection(queries=['vintage clothing'], filters=[[]], filter_operators='AND', collection='ECommerce') │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────── 🔭 Searches Executed 2/2 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ QueryResultWithCollection(queries=['nice shoes'], filters=[[]], filter_operators='AND', collection='ECommerce') │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ 📊 No Aggregations Run │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────── 📚 Sources ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ - object_id='a7aa8f8a-f02f-4c72-93a3-38bcbd8d5581' collection='ECommerce' │
│ - object_id='ff5ecd6e-8cb9-47a0-bc1c-2793d0172984' collection='ECommerce' │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
📊 Usage Statistics
┌────────────────┬─────┐
│ LLM Requests: │ 5 │
│ Input Tokens: │ 288 │
│ Output Tokens: │ 17 │
│ Total Tokens: │ 305 │
└────────────────┴─────┘
Total Time Taken: 7.58s
Inspection example
This example outputs:
- The original user query
- The answer provided by the Query Agent
- Searches & aggregations (if any) conducted by the Query Agent
- Any missing information
print("\n=== Query Agent Response ===")
print(f"Original Query: {response.searches[0].query}\n")
print("🔍 Final Answer Found:")
print(f"{response.final_answer}\n")
print("🔍 Searches Executed:")
for collection_searches in response.searches:
    for result in collection_searches:
        print(f"- {result}\n")
if len(response.aggregations) > 0:
    print("📊 Aggregation Results:")
    for collection_aggs in response.aggregations:
        for agg in collection_aggs:
            print(f"- {agg}\n")
if response.missing_information:
    if response.is_partial_answer:
        print("⚠️ Answer is Partial - Missing Information:")
    else:
        print("⚠️ Missing Information:")
    for missing in response.missing_information:
        print(f"- {missing}")
Example output
=== Query Agent Response ===
Original Query: vintage style clothing
🔍 Final Answer Found:
For vintage-style clothing under $60, I recommend the Vintage Scholar Turtleneck priced at $55. It features a soft, stretchable fabric with timeless pleated details, perfect for a Dark Academia-inspired intellectual and moody look, whether layered or worn solo.
However, based on the available information, no shoes under $60 were found. If you want, I can help search further for nice shoes within your budget. Let me know!
🔍 Searches Executed:
- ('query', 'vintage style clothing')
- ('filters', IntegerPropertyFilter(property_name='price', operator=<ComparisonOperator.LESS_THAN: '<'>, value=60.0))
- ('collection', 'ECommerce')
- ('query', 'nice shoes')
- ('filters', IntegerPropertyFilter(property_name='price', operator=<ComparisonOperator.LESS_THAN: '<'>, value=60.0))
- ('collection', 'ECommerce')
⚠️ Answer is Partial - Missing Information:
- No recommendations were provided for nice shoes under $60, though the user specifically requested shoes as well as vintage clothes.
Usage - Async Python client
If you are using the async Python Weaviate client, the instantiation pattern remains similar. The difference is the use of the AsyncQueryAgent class instead of the QueryAgent class.
The resulting async pattern works as shown below:
import asyncio
import os
import weaviate
from weaviate.classes.init import Auth
from weaviate.agents.query import AsyncQueryAgent
from weaviate.agents.classes import QueryAgentCollectionConfig
# `headers` is the inference-provider header dict from the basic instantiation example
async_client = weaviate.use_async_with_weaviate_cloud(
    cluster_url=os.environ.get("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
    headers=headers,
)
async def query_vintage_clothes(async_query_agent: AsyncQueryAgent):
    response = await async_query_agent.ask(
        "I like vintage clothes and nice shoes. Recommend some of each below $60."
    )
    return ("Vintage Clothes", response)
async def query_financial_data(async_query_agent: AsyncQueryAgent):
    response = await async_query_agent.ask(
        "What kinds of contracts are listed? What's the most common type of contract?",
    )
    return ("Financial Contracts", response)
async def run_concurrent_queries():
    try:
        await async_client.connect()
        async_qa = AsyncQueryAgent(
            async_client,
            collections=[
                QueryAgentCollectionConfig(
                    name="ECommerce",  # The name of the collection to query
                    target_vector=[
                        "name_description_brand_vector"
                    ],  # Optional target vector name(s) for collections with named vectors
                    view_properties=[
                        "name",
                        "description",
                        "category",
                        "brand",
                    ],  # Optional list of property names the agent can view
                ),
                QueryAgentCollectionConfig(
                    name="FinancialContracts",  # The name of the collection to query
                    # Optional tenant name for collections with multi-tenancy enabled
                    # tenant="tenantA"
                ),
            ],
        )
        # Run both queries concurrently and wait for both to complete
        vintage_response, financial_response = await asyncio.gather(
            query_vintage_clothes(async_qa), query_financial_data(async_qa)
        )
        # Display results
        print(f"=== {vintage_response[0]} ===")
        vintage_response[1].display()
        print(f"=== {financial_response[0]} ===")
        financial_response[1].display()
    finally:
        await async_client.close()
asyncio.run(run_concurrent_queries())
Streaming
The async Query Agent can also stream responses, allowing you to receive the answer as it is being generated.
from weaviate.agents.classes import (
    ProgressMessage,
    QueryAgentCollectionConfig,
    StreamedTokens,
)
async def stream_query(async_query_agent: AsyncQueryAgent):
    async for output in async_query_agent.ask_stream(
        "What are the top 5 products sold in the last 30 days?",
        # Setting this to false will skip ProgressMessages, and only stream
        # the StreamedTokens / the final QueryAgentResponse
        include_progress=True,  # Default is True
    ):
        if isinstance(output, ProgressMessage):
            # The message is a human-readable string, structured info available in output.details
            print(output.message)
        elif isinstance(output, StreamedTokens):
            # The delta is a string containing the next chunk of the final answer
            print(output.delta, end="", flush=True)
        else:
            # This is the final response, as returned by QueryAgent.ask()
            output.display()
async def run_streaming_query():
    try:
        await async_client.connect()
        async_qa = AsyncQueryAgent(
            async_client,
            collections=[
                QueryAgentCollectionConfig(
                    name="ECommerce",  # The name of the collection to query
                    target_vector=[
                        "name_description_brand_vector"
                    ],  # Optional target vector name(s) for collections with named vectors
                    view_properties=[
                        "name",
                        "description",
                        "category",
                        "brand",
                    ],  # Optional list of property names the agent can view
                ),
                QueryAgentCollectionConfig(
                    name="FinancialContracts",  # The name of the collection to query
                    # Optional tenant name for collections with multi-tenancy enabled
                    # tenant="tenantA"
                ),
            ],
        )
        await stream_query(async_qa)
    finally:
        await async_client.close()
asyncio.run(run_streaming_query())
Limitations & Troubleshooting
Usage limits
Each Weaviate Cloud organization can make up to 1,000 Query Agent requests per month at no cost.
Requests are consumed based on query type:
- Ask: 4 requests per query
- Search: 1 request per query
This limit may change in the future. For questions about usage limits, contact product@weaviate.io.
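As a worked example of the request accounting above, this sketch estimates how many requests a planned workload consumes against the 1,000-request monthly allowance. The costs and the limit are taken from this section; the function name requests_used and the workload numbers are hypothetical.

```python
MONTHLY_FREE_REQUESTS = 1000  # per Weaviate Cloud organization, as stated above
REQUEST_COST = {"ask": 4, "search": 1}  # requests consumed per query


def requests_used(ask_queries: int, search_queries: int) -> int:
    """Total requests consumed by a mix of Ask and Search queries."""
    return ask_queries * REQUEST_COST["ask"] + search_queries * REQUEST_COST["search"]


used = requests_used(ask_queries=200, search_queries=150)
print(f"Used {used} of {MONTHLY_FREE_REQUESTS}; {MONTHLY_FREE_REQUESTS - used} remaining")
# 200 * 4 + 150 * 1 = 950 requests, leaving 50
```

Put differently, the allowance covers at most 250 Ask queries or 1,000 Search queries in a month if only one query type is used.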
Custom collection descriptions
The Query Agent uses each collection's description metadata, as well as individual property descriptions, when deciding which collection to query.
Both collection descriptions and property descriptions can be updated after the collection has been created. For detailed instructions on updating collection and property descriptions, see the update collection definition documentation.
We are investigating the ability to specify a custom collection description at runtime.
Execution times
The Query Agent performs multiple operations to translate a natural language query into Weaviate queries, and to process the response.
This typically requires multiple calls to generative models (e.g. LLMs) and multiple queries to Weaviate.
As a result, each Query Agent run may take some time to complete. Depending on the query complexity, execution times of around 10 seconds are not uncommon.
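To see how long queries take in your own setup, you can time the call from the client side. This is a generic sketch using time.perf_counter; timed is a hypothetical helper, and slow_query is a stand-in for a call such as qa.ask so that the example runs without a Weaviate instance.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def slow_query(question: str) -> str:
    """Stand-in for a Query Agent call; sleeps briefly instead of contacting a server."""
    time.sleep(0.05)
    return f"answer to: {question}"


answer, seconds = timed(slow_query, "What products are available?")
print(f"{answer!r} took {seconds:.2f}s")
```

The same wrapper could be applied to a real agent call, e.g. timed(qa.ask, "..."), to compare client-side timings with the total time reported in the response.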
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
