Search Mode

Weaviate Cloud only

Search Mode transforms your query into actionable searches and returns the matching Weaviate objects directly — without generating an LLM-authored answer.

For example, you could ask:

"Find me some vintage shoes under $70"

And the agent will perform semantic search for vintage shoes, apply a filter for price < 70, and return the matching objects from your collections, ready for you to render or post-process.

For more details, see the page for the Python client or the TypeScript client.

Usage

Like all features of the Query Agent, it requires instantiation of the QueryAgent class, which is connected to your Weaviate client. See the class instantiation page for more detail.

Note, locally running Weaviate instances do not support the Query Agent.

import os
import weaviate
from weaviate.agents.query import QueryAgent
from weaviate.classes.init import Auth

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ.get("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
)

qa = QueryAgent(
    client=client,
    collections=["ECommerce"],
)
search_response = qa.search(
    query="Find me some vintage shoes under $70",
    limit=10,
)

# Access the matching Weaviate objects
for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")

Make sure WEAVIATE_URL and WEAVIATE_API_KEY are set in your environment, and replace "ECommerce" with the collection(s) you want to search.

Async

In Python, the Query Agent supports both synchronous and asynchronous usage. The Python examples on this page use the synchronous client, but can be easily replaced with the async equivalents — see the async section for details. In JavaScript/TypeScript, all calls are asynchronous by default and use await.

Parameters

The .search() method accepts several arguments:

| Parameter | Type | Description |
| --- | --- | --- |
| query | str \| list[ChatMessage] | The user query you want the agent to search with. This can be a simple string ("Find me some vintage shoes under $70") or a list of chat messages (for conversational context). See the page on multi-turn conversations for more detail. |
| collections | list[str \| QueryAgentCollectionConfig] \| None | The name(s) of the collections to search. You can pass one or many collection names as a list of strings (e.g., ["ECommerce", "BookSales"]), or provide collection configuration objects for more control. If specified in the .search() call, it overrides the collections defined when the QueryAgent was instantiated. See the page on collection configuration for more detail. |
| limit | int | The maximum number of results returned in this page of results. Defaults to 20. Use .next() to fetch additional pages. |
| filtering | Literal["recall", "precision"] | Either "recall" or "precision" to control filter generation. "recall" favors more results across filter interpretations; "precision" favors strict intent match. See Customized filtering below. |
| diversity_weight | float \| None | A value between 0.0 and 1.0 that biases the result ranking towards diversity using Maximal Marginal Relevance (MMR). See Diversity ranking below. |

For more advanced searches, you can also specify additional filters within the collection configuration. See the page on additional filters for more detail.

Customized filtering

Search Mode uses query rewriting to transform your original query into one or multiple Weaviate queries, each with either a search query, metadata filters, or both. The filtering parameter controls how many Weaviate queries are generated.

  • "recall" (default): Generates multiple Weaviate queries spanning different filters and interpretations of the user query. Use this when you would rather get some results, even if they don't match every criterion in your query.

  • "precision": Generates a single Weaviate query targeting the most likely interpretation of the user query. You should use this when you want the results to follow your query intent closely, even if that means potentially receiving no results.

search_response = qa.search(
    "Find me some vintage shoes under $70",
    filtering="precision",
    limit=10,
)

for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")
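As a rough mental model of this trade-off, the sketch below imitates both modes over a small in-memory product list. It is only an illustration: the products list and the helper names (is_vintage, under_70, precision_search, recall_search) are hypothetical, and the agent's actual query rewriting is not shown here.

```python
# Hypothetical in-memory data, standing in for a Weaviate collection.
products = [
    {"name": "Retro runner", "tags": ["vintage"], "price": 65},
    {"name": "Classic loafer", "tags": ["vintage"], "price": 95},
    {"name": "Budget trainer", "tags": ["modern"], "price": 40},
]


def is_vintage(product):
    return "vintage" in product["tags"]


def under_70(product):
    return product["price"] < 70


def precision_search(items):
    # One strict query: every criterion must hold.
    return [p["name"] for p in items if is_vintage(p) and under_70(p)]


def recall_search(items):
    # Several queries, each a looser interpretation; results are merged
    # in order and de-duplicated by name.
    interpretations = [
        lambda p: is_vintage(p) and under_70(p),
        is_vintage,
        under_70,
    ]
    seen, merged = set(), []
    for matches in interpretations:
        for p in items:
            if matches(p) and p["name"] not in seen:
                seen.add(p["name"])
                merged.append(p["name"])
    return merged


precision_names = precision_search(products)
recall_names = recall_search(products)
print(precision_names)
print(recall_names)
```

In this toy setup the strict mode returns only the product that satisfies both criteria, while the looser mode also surfaces partial matches after it.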

Diversity ranking

Search supports adding diversity weighting to result rankings using Maximal Marginal Relevance (MMR). This is enabled by passing a diversity_weight parameter in the range of 0.0 to 1.0 — higher values favor more varied results over the most relevant ones.

To use diversity ranking with target vectors, set the single target vector you want to use in the Query Agent's constructor. Diversity ranking is not yet supported with collections using multi-vector embeddings, and will only work across multiple collections if they share the same embedding model.

from weaviate.agents.classes import QueryAgentCollectionConfig

qa = QueryAgent(
    client=client,
    collections=[
        QueryAgentCollectionConfig(
            name="ECommerce",
            target_vector=["name_description_brand_vector"],
        )
    ]
)

search_response = qa.search(
    "summer shoes",
    limit=10,
    diversity_weight=0.5,
)

for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")
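To make the effect of diversity_weight more concrete, here is a minimal, self-contained sketch of greedy MMR re-ranking over toy vectors. It is not the Query Agent's implementation: the vectors, the cosine and mmr_rank helpers, and the way diversity_weight is plugged into the MMR formula are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def mmr_rank(query_vector, candidates, diversity_weight, limit):
    """Greedy MMR: repeatedly pick the candidate with the best trade-off
    between relevance to the query and similarity to already-picked items.

    Assumption: diversity_weight=0.0 means pure relevance ranking and
    1.0 means pure diversity.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < limit:

        def mmr_score(name):
            relevance = cosine(query_vector, candidates[name])
            redundancy = max(
                (cosine(candidates[name], candidates[chosen]) for chosen in selected),
                default=0.0,
            )
            return (1 - diversity_weight) * relevance - diversity_weight * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical embeddings: two near-duplicate sneakers and one sandal.
query = [1.0, 1.0, 0.0]
items = {
    "sneaker_red": [1.0, 0.05, 0.0],
    "sneaker_crimson": [1.0, 0.0, 0.0],
    "sandal": [0.0, 1.0, 0.3],
}

by_relevance = mmr_rank(query, items, diversity_weight=0.0, limit=3)
diversified = mmr_rank(query, items, diversity_weight=0.5, limit=2)
print(by_relevance)
print(diversified)
```

With a weight of 0.0 the two near-duplicate sneakers rank first and second; with 0.5 the second slot goes to the sandal, because the second sneaker is penalized for being almost identical to the first pick.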

Response

The Search Mode response has the following properties:

| Field | Type | Description |
| --- | --- | --- |
| searches | list[QueryResultWithCollectionNormalized] | A list of searches the agent carried out. Each contains the search query, filters, and the collection the search was run against. |
| usage | ModelUnitUsage | A ModelUnitUsage instance providing detail on the model units used during the run. The model_units are effectively token usage measurements normalized by cost. |
| total_time | float | Total time taken (seconds). |
| search_results | QueryReturn | A QueryReturn object whose .objects field is the list of matching Weaviate objects, each with properties and metadata (including the relevance score). |
| next(limit, offset) | SearchModeResponse | A method that returns the next page of results, reusing the same underlying searches for consistency. See Pagination below. |

See the client documentation for more detail.
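As a small illustration of reading these fields, the helper below builds a one-line summary from searches, total_time, and search_results.objects. The summarize_response function is hypothetical, and the SimpleNamespace object is only a stand-in so the sketch runs without a Weaviate Cloud connection; with a real agent you would pass the value returned by qa.search().

```python
from types import SimpleNamespace


def summarize_response(response):
    """Return a one-line summary built from the documented response fields."""
    object_count = len(response.search_results.objects)
    search_count = len(response.searches)
    return (
        f"{object_count} objects from {search_count} searches "
        f"in {response.total_time:.2f}s"
    )


# Hypothetical stand-in for a Search Mode response (offline use only).
fake_response = SimpleNamespace(
    searches=["search-1", "search-2"],
    total_time=1.234,
    search_results=SimpleNamespace(objects=["obj-1", "obj-2", "obj-3"]),
)

summary = summarize_response(fake_response)
print(summary)
```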

Result scores

The search_results / searchResults field reuses Weaviate's native QueryReturn / WeaviateReturnWithCollection type, so results have the same shape as a standard Weaviate query. However, the score in each object's metadata is replaced with Search Mode's own ranking score rather than the original Weaviate search score.

Pagination

Search returns results one page at a time. To fetch additional pages, call .next() on the previous response — the underlying searches are reused so results stay consistent across pages.

# Search with pagination
response_page_1 = qa.search(
    "Find summer shoes and accessories between $50 and $100 that have the tag 'sale'",
    limit=3,
)

# Get the next page of results
response_page_2 = response_page_1.next(limit=3, offset=3)

# Continue paginating
response_page_3 = response_page_2.next(limit=3, offset=6)

# Access results from each page
for page_num, page_response in enumerate(
    [response_page_1, response_page_2, response_page_3], 1
):
    print(f"Page {page_num}:")
    for obj in page_response.search_results.objects:
        # Safely access properties in case they don't exist
        name = obj.properties.get("name", "Unknown Product")
        price = obj.properties.get("price", "Unknown Price")
        print(f" {name} - ${price}")
    print()
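When the number of pages is not known up front, the offset bookkeeping can be wrapped in a small generator. The sketch below is one possible approach, not part of the client library: iter_pages is a hypothetical helper, it assumes that a page shorter than limit means there are no more results, and FakeResponse is a stand-in that only mimics .next() and .search_results.objects so the example runs offline.

```python
from types import SimpleNamespace


def iter_pages(first_response, limit, max_pages=5):
    """Yield the first response, then follow-up pages fetched via .next().

    Assumption: a page with fewer than `limit` objects is the last one.
    """
    response = first_response
    for page_number in range(1, max_pages + 1):
        yield response
        is_short_page = len(response.search_results.objects) < limit
        if is_short_page or page_number == max_pages:
            return
        # Page N+1 starts at offset N * limit.
        response = response.next(limit=limit, offset=page_number * limit)


class FakeResponse:
    """Hypothetical stand-in for a Search Mode response (offline use only)."""

    def __init__(self, items, limit, offset=0):
        self._items = items
        self.search_results = SimpleNamespace(objects=items[offset:offset + limit])

    def next(self, limit, offset):
        return FakeResponse(self._items, limit, offset)


first_page = FakeResponse(list(range(7)), limit=3)
page_sizes = [len(page.search_results.objects) for page in iter_pages(first_page, limit=3)]
print(page_sizes)
```

With a real agent, the first argument would be the response returned by qa.search(..., limit=3).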

Async

In Python, the above examples use the synchronous client, but Search Mode can also be called asynchronously. This requires the AsyncQueryAgent class (instantiated the same way as its sync counterpart) together with an async Weaviate client.

from weaviate.agents.query import AsyncQueryAgent

async_client = weaviate.use_async_with_weaviate_cloud(
    cluster_url=os.environ.get("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
)
await async_client.connect()

async_qa = AsyncQueryAgent(
    client=async_client,
    collections=[
        QueryAgentCollectionConfig(
            name="ECommerce",
            target_vector=["name_description_brand_vector"],
        )
    ]
)

The .search() method must be awaited:

search_response = await async_qa.search(
    query="Find me some vintage shoes under $70",
    limit=10,
)
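Outside a notebook, await has to run inside a coroutine, and an async agent also makes it possible to issue several searches at once. The sketch below shows that pattern with asyncio.gather. The run_searches helper is hypothetical, and FakeAsyncAgent is a stand-in with an awaitable .search() so the example runs without a Weaviate Cloud connection; whether concurrent calls on one AsyncQueryAgent suit your workload is an assumption to verify.

```python
import asyncio


async def run_searches(agent, queries, limit=10):
    """Await several .search() calls concurrently; results keep query order."""
    return await asyncio.gather(
        *(agent.search(query=query, limit=limit) for query in queries)
    )


class FakeAsyncAgent:
    """Hypothetical stand-in for AsyncQueryAgent (offline use only)."""

    async def search(self, query, limit):
        await asyncio.sleep(0)  # simulate yielding to the event loop
        return f"results for {query!r} (limit={limit})"


responses = asyncio.run(
    run_searches(FakeAsyncAgent(), ["vintage shoes", "summer shoes"], limit=5)
)
print(responses)
```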

Questions and feedback

If you have any questions or feedback, let us know in the user forum.