Search Mode
Search Mode transforms your query into actionable searches and returns the matching Weaviate objects directly — without generating an LLM-authored answer.
For example, you could ask:
"Find me some vintage shoes under $70"
And the agent will perform semantic search for vintage shoes, apply a filter for price < 70, and return the matching objects from your collections, ready for you to render or post-process.
For more details, see the page for the Python client or the TypeScript client.
Usage
Like all Query Agent features, Search Mode requires an instance of the QueryAgent class connected to your Weaviate client. See the class instantiation page for more detail.
Note: locally running Weaviate instances do not support the Query Agent.
import os

import weaviate
from weaviate.agents.query import QueryAgent
from weaviate.classes.init import Auth

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ.get("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
)

qa = QueryAgent(
    client=client,
    collections=["ECommerce"],
)

search_response = qa.search(
    query="Find me some vintage shoes under $70",
    limit=10,
)

# Access the matching Weaviate objects
for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")
Make sure to include your API keys in your environment, and specify whichever collection you want to search over.
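A missing environment variable tends to surface only later, as a less obvious connection or authentication error, so it can help to validate the variables before connecting. The sketch below assumes the WEAVIATE_URL and WEAVIATE_API_KEY variable names used in the example above. The require_env helper is hypothetical and not part of the Weaviate client.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper for illustration; not part of the Weaviate client.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set or is empty")
    return value
```

The returned values could then be passed as cluster_url and to Auth.api_key(...) in place of the os.environ.get(...) calls.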
In Python, the Query Agent supports both synchronous and asynchronous usage. The Python examples on this page use the synchronous client, but can be easily replaced with the async equivalents — see the async section for details. In JavaScript/TypeScript, all calls are asynchronous by default and use await.
Parameters
The .search() method accepts several arguments:
| Parameter | Type | Description |
|---|---|---|
| query | str \| list[ChatMessage] | The user query you want the agent to search with. This can be a simple string ("Find me some vintage shoes under $70") or a list of chat messages (for conversational context). See the page on multi-turn conversations for more detail. |
| collections | list[str \| QueryAgentCollectionConfig] \| None | The name(s) of the collections to search. You can pass one or many collection names as a list of strings (e.g., ["ECommerce", "BookSales"]), or provide collection configuration objects for more control. If specified in the search method, it overrides the collections defined when instantiating QueryAgent. See the page on collection configuration for more detail. |
| limit | int | The maximum number of results returned in this page of results. Defaults to 20. Use .next() to fetch additional pages. |
| filtering | Literal["recall", "precision"] | Either "recall" or "precision" to control filter generation. "recall" favors more results across filter interpretations; "precision" favors strict intent match. See Customized filtering below. |
| diversity_weight | float \| None | A value between 0.0 and 1.0 that biases the result ranking towards diversity using Maximal Marginal Relevance (MMR). See Diversity ranking below. |
For more advanced searches, you can also specify additional filters within the collection configuration. See the page on additional filters for more detail.
Customized filtering
Search Mode uses query rewriting to transform your original query into one or multiple Weaviate queries, each with either a search query, metadata filters, or both. The filtering parameter controls how many Weaviate queries are generated.
- "recall" (default): Generates multiple Weaviate queries spanning different filters and interpretations of the user query. Use this when you prefer to get results even if they don't match every criterion in your query.
- "precision": Generates a single Weaviate query targeting the most likely interpretation of the user query. Use this when you want the results to follow your query intent closely, even if that means receiving no results.
search_response = qa.search(
    "Find me some vintage shoes under $70",
    filtering="precision",
    limit=10,
)

for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")
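Since "precision" may return nothing while "recall" trades strictness for coverage, one possible pattern is to try the strict mode first and fall back to the broader one. The following is a minimal sketch under that assumption. The search_with_fallback helper is hypothetical, and it relies only on the .search() arguments and the search_results.objects field described on this page.

```python
def search_with_fallback(qa, query: str, limit: int = 10):
    """Try a strict "precision" search first; fall back to "recall" if it is empty.

    `qa` is expected to be a QueryAgent (or any object exposing the same
    .search() method). Hypothetical helper for illustration.
    """
    response = qa.search(query, filtering="precision", limit=limit)
    if response.search_results.objects:
        return response, "precision"
    # No strict matches: retry with the broader, multi-query mode.
    response = qa.search(query, filtering="recall", limit=limit)
    return response, "recall"
```

Note that the fallback path runs the agent twice, so it uses more model units and takes longer than a single call.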
Diversity ranking
Search supports adding diversity weighting to result rankings using Maximal Marginal Relevance (MMR). This is enabled by passing a diversity_weight parameter in the range of 0.0 to 1.0 — higher values favor more varied results over the most relevant ones.
To use diversity ranking with target vectors, set the single target vector you want to use in the Query Agent's constructor. Diversity ranking is not yet supported with collections using multi-vector embeddings, and will only work across multiple collections if they share the same embedding model.
from weaviate.agents.classes import QueryAgentCollectionConfig

qa = QueryAgent(
    client=client,
    collections=[
        QueryAgentCollectionConfig(
            name="ECommerce",
            target_vector=["name_description_brand_vector"],
        )
    ],
)

search_response = qa.search(
    "summer shoes",
    limit=10,
    diversity_weight=0.5,
)

for obj in search_response.search_results.objects:
    print(f"Product: {obj.properties['name']} - ${obj.properties['price']}")
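To build intuition for what a diversity weight does, here is a small pure-Python sketch of textbook Maximal Marginal Relevance. It is an illustration only: the Query Agent's internal implementation is not shown on this page and may differ, and treating diversity_weight as the weight on the redundancy penalty is an assumption. The mmr_rank and cosine functions and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rank(query_vec, candidates, diversity_weight, limit):
    """Greedy textbook MMR over a dict of name -> vector.

    Assumption: diversity_weight scales the redundancy penalty, so 0.0 is
    pure relevance ranking and higher values favor varied results.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < limit:

        def mmr_score(name):
            relevance = cosine(query_vec, remaining[name])
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max(
                (cosine(remaining[name], candidates[s]) for s in selected),
                default=0.0,
            )
            return (1 - diversity_weight) * relevance - diversity_weight * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]
    return selected


query = [1.0, 0.0]
items = {
    "sandal_a": [1.0, 0.10],
    "sandal_b": [1.0, 0.12],  # near-duplicate of sandal_a
    "boot": [0.6, 0.8],       # less relevant, but different
}
print(mmr_rank(query, items, diversity_weight=0.0, limit=2))
print(mmr_rank(query, items, diversity_weight=0.8, limit=2))
```

With a weight of 0.0 the two near-duplicate sandals are ranked first. With a high weight, the second slot goes to the less relevant but more distinct item.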
Response
The Search Mode response has the following properties:
| Field | Type | Description |
|---|---|---|
| searches | list[QueryResultWithCollectionNormalized] | A list of searches the agent carried out. Each contains the search query, filters, and the collection the search was run against. |
| usage | ModelUnitUsage | A ModelUnitUsage instance providing detail on the model units used during the run. The model_units are effectively token usage measurements normalized by cost. |
| total_time | float | Total time taken (seconds). |
| search_results | QueryReturn | A QueryReturn object whose .objects field is the list of matching Weaviate objects, each with properties and metadata (including the relevance score). |
| next(limit, offset) | SearchModeResponse | A method that returns the next page of results, reusing the same underlying searches for consistency. See Pagination below. |
The search_results / searchResults field reuses Weaviate's native QueryReturn / WeaviateReturnWithCollection type, so results have the same shape as a standard Weaviate query. However, the score in each object's metadata is replaced with Search Mode's own ranking score rather than the original Weaviate search score.
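Because the results are ordinary Weaviate objects, they can be flattened into plain dictionaries for rendering or export. The sketch below assumes each object exposes .uuid, .properties, and .metadata.score, as described in the table above. The results_to_rows helper is hypothetical.

```python
def results_to_rows(search_response):
    """Flatten Search Mode results into plain dicts for rendering or export.

    Assumes each object exposes .uuid, .properties and .metadata.score.
    Hypothetical helper for illustration.
    """
    rows = []
    for obj in search_response.search_results.objects:
        metadata = getattr(obj, "metadata", None)
        rows.append(
            {
                "uuid": str(obj.uuid),
                # Search Mode's own ranking score, not the raw Weaviate score.
                "score": getattr(metadata, "score", None),
                "properties": dict(obj.properties),
            }
        )
    return rows
```

Keeping the properties nested avoids collisions if a collection happens to define its own uuid or score property.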
Pagination
Search returns results one page at a time. To fetch additional pages, call .next() on the previous response — the underlying searches are reused so results stay consistent across pages.
# Search with pagination
response_page_1 = qa.search(
    "Find summer shoes and accessories between $50 and $100 that have the tag 'sale'",
    limit=3,
)

# Get the next page of results
response_page_2 = response_page_1.next(limit=3, offset=3)

# Continue paginating
response_page_3 = response_page_2.next(limit=3, offset=6)

# Access results from each page
for page_num, page_response in enumerate(
    [response_page_1, response_page_2, response_page_3], 1
):
    print(f"Page {page_num}:")
    for obj in page_response.search_results.objects:
        # Safely access properties in case they don't exist
        name = obj.properties.get("name", "Unknown Product")
        price = obj.properties.get("price", "Unknown Price")
        print(f"  {name} - ${price}")
    print()
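When the number of pages is not known in advance, the manual calls above can be wrapped in a generator. This is a minimal sketch that assumes offsets are absolute (3, 6, ... as in the example above) and that an empty page means there are no more results. The iter_pages helper and its max_pages cap are hypothetical.

```python
def iter_pages(first_response, page_size: int, max_pages: int = 5):
    """Yield each page's list of objects, stopping at the first empty page.

    Hypothetical helper for illustration; `first_response` is the response
    returned by .search() with limit=page_size.
    """
    response = first_response
    for page_number in range(1, max_pages + 1):
        objects = response.search_results.objects
        if not objects:
            return
        yield objects
        if page_number < max_pages:
            # Offsets are absolute: page 2 starts at page_size, page 3 at 2 * page_size.
            response = response.next(limit=page_size, offset=page_number * page_size)
```

The max_pages cap bounds how many agent calls a single loop can make. It could be used as `for objects in iter_pages(qa.search(query, limit=3), page_size=3): ...`.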
Async
In Python, the above examples use the synchronous client, but Search Mode can also be called asynchronously. This requires the AsyncQueryAgent class (instantiated the same way as its sync counterpart) together with an async Weaviate client.
from weaviate.agents.query import AsyncQueryAgent

async_client = weaviate.use_async_with_weaviate_cloud(
    cluster_url=os.environ.get("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
)
await async_client.connect()

async_qa = AsyncQueryAgent(
    client=async_client,
    collections=[
        QueryAgentCollectionConfig(
            name="ECommerce",
            target_vector=["name_description_brand_vector"],
        )
    ],
)
The .search() method must be awaited:
async_search_response = await async_qa.search(
    query="Find me some vintage shoes under $70",
    limit=10,
)
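One reason to use the async agent is to issue several searches at once. The sketch below uses asyncio.gather for that, assuming the service accepts concurrent requests from a single agent instance. The search_many helper is hypothetical and only relies on the awaitable .search() method shown above.

```python
import asyncio


async def search_many(async_qa, queries, limit=10):
    """Run several Search Mode queries concurrently.

    Returns a dict mapping each query string to its response.
    Hypothetical helper for illustration.
    """
    responses = await asyncio.gather(
        *(async_qa.search(query=q, limit=limit) for q in queries)
    )
    # gather() preserves input order, so queries and responses line up.
    return dict(zip(queries, responses))
```

Inside an async function, this could be called as `results = await search_many(async_qa, ["vintage shoes under $70", "summer shoes"])`.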
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
