Vector similarity search

Vector search returns the objects whose vectors are most similar to the query vector.
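As a rough illustration of what "most similar vectors" means, the sketch below ranks a few toy objects by cosine distance to a query vector and keeps the closest ones. It is a brute-force illustration only, not how Weaviate executes a search (Weaviate normally relies on a vector index rather than scanning every object). The helper names and the toy data are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_search(query_vector, objects, limit):
    """Score every object against the query and keep the closest `limit`."""
    scored = [(cosine_distance(query_vector, vec), name) for name, vec in objects.items()]
    scored.sort()  # smallest distance first
    return scored[:limit]


# Hypothetical toy data: three objects with 2-dimensional vectors.
toy_objects = {
    "meerkats": [0.9, 0.1],
    "dogs": [0.8, 0.3],
    "volcanoes": [0.0, 1.0],
}

results = brute_force_search([1.0, 0.0], toy_objects, limit=2)
for distance, name in results:
    print(name, round(distance, 4))
```

A smaller distance means a closer match, which is why the results in the examples on this page are listed in ascending order of distance.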

Prefer natural language queries?

The Query Agent translates plain English questions into optimized Weaviate queries automatically - no manual query construction needed.

Cloud only

Search with text

Use the Near Text operator to find objects with the nearest vector to an input text.

More info: Code snippets in the documentation reflect the latest client library and Weaviate Database version. Check the Release notes for specific versions.

If a snippet doesn't work or you have feedback, please open a GitHub issue.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="animals in movies",
    limit=2,
    return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Example response

The output is like this:

{
  "data": {
    "Get": {
      "JeopardyQuestion": [
        {
          "answer": "meerkats",
          "question": "Group of mammals seen <a href=\"http://www.j-archive.com/media/1998-06-01_J_28.jpg\" target=\"_blank\">here</a>: [like Timon in <i>The Lion King</i>]",
          "_additional": { "distance": 0.17602634 }
        },
        {
          "answer": "dogs",
          "question": "Scooby-Doo, Goofy & Pluto are cartoon versions",
          "_additional": { "distance": 0.17842108 }
        }
      ]
    }
  }
}

Search with image

Use the Near Image operator to find objects with the nearest vector to an image.
This example uses a base64 representation of an image.

base64_string = "SOME_BASE_64_REPRESENTATION"

# Get the collection containing images
dogs = client.collections.use("Dog")

# Perform query
response = dogs.query.near_image(
    near_image=base64_string,
    return_properties=["breed"],
    limit=1,
    # target_vector="vector_name",  # required when using multiple named vectors
)

print(response.objects[0])

client.close()

See Image search for more information.

Search with an existing object

If you have an object ID, use the Near Object operator to find objects similar to that object.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_object(
    near_object=uuid,  # A UUID of an object (e.g. "56b9449e-65db-5df4-887b-0a4773f52aa7")
    limit=2,
    return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Additional information

To get the object ID, see Retrieve the object ID.

Search with a vector

If you have an input vector, use the Near Vector operator to find objects with similar vectors.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_vector(
    near_vector=query_vector,  # your query vector goes here
    limit=2,
    return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Named vectors

To search a collection that has named vectors, use the target vector field to specify which named vector to search.

from weaviate.classes.query import MetadataQuery

reviews = client.collections.use("WineReviewNV")
response = reviews.query.near_text(
    query="a sweet German white wine",
    limit=2,
    target_vector="title_country",  # Specify the target vector for named vector collections
    return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Example response

The output is like this:

{
  "WineReviewNV": [
    {
      "country": "Austria",
      "review_body": "With notions of cherry and cinnamon on the nose and just slight fizz, this is a refreshing, fruit-driven sparkling ros\u00e9 that's full of strawberry and cherry notes\u2014it might just be the very definition of easy summer wine. It ends dry, yet refreshing.",
      "title": "Gebeshuber 2013 Frizzante Ros\u00e9 Pinot Noir (\u00d6sterreichischer Perlwein)"
    },
    {
      "country": "Austria",
      "review_body": "Beautifully perfumed, with acidity, white fruits and a mineral context. The wine is layered with citrus and lime, hints of fresh pineapple acidity. Screw cap.",
      "title": "Stadt Krems 2009 Steinterrassen Riesling (Kremstal)"
    }
  ]
}

Set a similarity threshold

To set a similarity threshold between the search and target vectors, define a maximum distance (or a minimum certainty).

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="animals in movies",
    distance=0.25,  # max accepted distance
    return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Additional information
  • The distance value depends on many factors, including the vectorization model you use. Experiment with your data to find a value that works for you.
  • certainty is only available with cosine distance.
  • To find the least similar objects, use the negative cosine distance with nearVector search.
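To make the threshold concrete, here is a small sketch of how a maximum distance filters results, and how a cosine distance can be mapped to a certainty. The conversion assumes the relationship certainty = 1 - distance / 2 for cosine distance; treat that as an assumption to confirm against the distance metrics documentation. The helper names and the third sample distance are hypothetical.

```python
def distance_to_certainty(cosine_distance):
    """Map a cosine distance in [0, 2] to a certainty in [0, 1].

    Assumes certainty = 1 - distance / 2 (cosine distance only).
    """
    return 1.0 - cosine_distance / 2.0


def within_threshold(distances, max_distance):
    """Keep only the distances that do not exceed the threshold."""
    return [d for d in distances if d <= max_distance]


# The first two distances come from the example response earlier on this page;
# the third is a made-up value that falls outside the threshold.
sample_distances = [0.17602634, 0.17842108, 0.31]

kept = within_threshold(sample_distances, max_distance=0.25)
print(kept)
print(distance_to_certainty(0.25))
```

Under this assumption, a maximum distance of 0.25 corresponds to a minimum certainty of 0.875.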

limit & offset

Use limit to set a fixed maximum number of objects to return.

Optionally, use offset to paginate the results.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="animals in movies",
    limit=2,  # return 2 objects
    offset=1,  # With an offset of 1
    return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)
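The interaction between limit and offset can be pictured as slicing a ranked list: offset skips results from the top, and limit caps how many of the remaining results are returned. The sketch below illustrates that reading with a plain Python list. The paginate helper and the ranked list are hypothetical and stand in for a ranked result set.

```python
def paginate(ranked_results, limit, offset=0):
    """Skip `offset` results, then return at most `limit` of the rest."""
    return ranked_results[offset:offset + limit]


# Hypothetical ranked results, closest match first.
ranked = ["meerkats", "dogs", "fox", "swan", "lion"]

page_size = 2
for page in range(3):
    # Page 0 uses offset 0, page 1 uses offset 2, and so on.
    print(paginate(ranked, limit=page_size, offset=page * page_size))
```

With limit=2 and offset=1, as in the query above, this slicing would skip the top result and return the second and third.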

Limit result groups

To limit results to groups of similar distances to the query, use the autocut filter to set the number of groups to return.

from weaviate.classes.query import MetadataQuery

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="animals in movies",
    auto_limit=1,  # number of close groups
    return_metadata=MetadataQuery(distance=True)
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Example response

The output is like this:

{
  "data": {
    "Get": {
      "JeopardyQuestion": [
        {
          "answer": "meerkats",
          "question": "Group of mammals seen <a href=\"http://www.j-archive.com/media/1998-06-01_J_28.jpg\" target=\"_blank\">here</a>: [like Timon in <i>The Lion King</i>]",
          "_additional": { "distance": 0.17602634 }
        },
        {
          "answer": "dogs",
          "question": "Scooby-Doo, Goofy & Pluto are cartoon versions",
          "_additional": { "distance": 0.17842108 }
        }
      ]
    }
  }
}
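The idea of cutting results at a jump in distance can be sketched in a few lines. The version below is a deliberate simplification and is not Weaviate's autocut algorithm: it treats any gap between neighbouring distances that exceeds a fixed jump_threshold as a jump, and both that parameter and the sample distances are hypothetical.

```python
def autocut(distances, auto_limit, jump_threshold=0.05):
    """Return the distances that come before the `auto_limit`-th jump.

    `distances` must be sorted in ascending order. A "jump" here is a gap
    between neighbours larger than `jump_threshold` (a hypothetical parameter).
    """
    jumps = 0
    for i in range(1, len(distances)):
        if distances[i] - distances[i - 1] > jump_threshold:
            jumps += 1
            if jumps == auto_limit:
                return distances[:i]  # cut just before the jump
    return distances  # fewer jumps than auto_limit: keep everything


# Hypothetical distances forming three groups: a tight cluster, then two more.
sample = [0.176, 0.178, 0.188, 0.31, 0.32, 0.55]

print(autocut(sample, auto_limit=1))
print(autocut(sample, auto_limit=2))
```

In this simplified picture, auto_limit=1 keeps only the first cluster of closely spaced results, and a larger value admits further clusters.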

Group results

Use a property or a cross-reference to group results. To group returned objects, the query must include a Near search operator, such as Near Text or Near Object.

from weaviate.classes.query import MetadataQuery, GroupBy

jeopardy = client.collections.use("JeopardyQuestion")

group_by = GroupBy(
    prop="round",  # group by this property
    objects_per_group=2,  # maximum objects per group
    number_of_groups=2,  # maximum number of groups
)

response = jeopardy.query.near_text(
    query="animals in movies",  # find object based on this query
    limit=10,  # maximum total objects
    return_metadata=MetadataQuery(distance=True),
    group_by=group_by
)

for o in response.objects:
    print(o.uuid)
    print(o.belongs_to_group)
    print(o.metadata.distance)

for grp, grp_items in response.groups.items():
    print("=" * 10 + grp_items.name + "=" * 10)
    print(grp_items.number_of_objects)
    for o in grp_items.objects:
        print(o.properties)
        print(o.metadata)

Example response

The output is like this:

{
  "data": {
    "Get": {
      "JeopardyQuestion": [
        {
          "_additional": {
            "group": {
              "count": 2,
              "groupedBy": {
                "path": [
                  "round"
                ],
                "value": "Jeopardy!"
              },
              "hits": [
                {
                  "answer": "meerkats",
                  "question": "Group of mammals seen <a href=\"http://www.j-archive.com/media/1998-06-01_J_28.jpg\" target=\"_blank\">here</a>: [like Timon in <i>The Lion King</i>]"
                },
                {
                  "answer": "dogs",
                  "question": "Scooby-Doo, Goofy & Pluto are cartoon versions"
                }
              ],
              "id": 0,
              "maxDistance": 0.17842054,
              "minDistance": 0.17602539
            }
          }
        },
        {
          "_additional": {
            "group": {
              "count": 1,
              "groupedBy": {
                "path": [
                  "round"
                ],
                "value": "Double Jeopardy!"
              },
              "hits": [
                {
                  "answer": "fox",
                  "question": "In titles, animal associated with both Volpone and Reynard"
                }
              ],
              "id": 1,
              "maxDistance": 0.18770188,
              "minDistance": 0.18770188
            }
          }
        }
      ]
    }
  }
}
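One way to picture how objects_per_group and number_of_groups shape the output is the sketch below, which walks a ranked result list and buckets objects by a property value. It only illustrates the two caps and is not Weaviate's server-side implementation. The group_results helper is hypothetical, as are the "lions" and "owls" rows added to the sample data.

```python
def group_results(ranked_objects, prop, objects_per_group, number_of_groups):
    """Bucket ranked objects by `prop`, honouring both caps."""
    groups = {}  # dicts preserve insertion order, so groups stay ranked
    for obj in ranked_objects:
        key = obj[prop]
        if key not in groups:
            if len(groups) == number_of_groups:
                continue  # group cap reached: ignore objects from new groups
            groups[key] = []
        if len(groups[key]) < objects_per_group:
            groups[key].append(obj)
    return groups


# Ranked closest-first. The first three rows mirror the example response;
# the last two are made up to exercise the caps.
ranked = [
    {"answer": "meerkats", "round": "Jeopardy!"},
    {"answer": "dogs", "round": "Jeopardy!"},
    {"answer": "fox", "round": "Double Jeopardy!"},
    {"answer": "lions", "round": "Jeopardy!"},
    {"answer": "owls", "round": "Final Jeopardy!"},
]

grouped = group_results(ranked, prop="round", objects_per_group=2, number_of_groups=2)
for name, members in grouped.items():
    print(name, [m["answer"] for m in members])
```

With these caps, the third "Jeopardy!" object and the object from a third group are both dropped, leaving groups of two and one objects, the same shape as the example response.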

Filter results

For more specific results, use a filter to narrow your search.

from weaviate.classes.query import MetadataQuery, Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="animals in movies",
    filters=Filter.by_property("round").equal("Double Jeopardy!"),
    limit=2,
    return_metadata=MetadataQuery(distance=True),
)

for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

Example response

The output is like this:

{
  "data": {
    "Get": {
      "JeopardyQuestion": [
        {
          "_additional": {
            "distance": 0.18759078
          },
          "answer": "fox",
          "question": "In titles, animal associated with both Volpone and Reynard",
          "round": "Double Jeopardy!"
        },
        {
          "_additional": {
            "distance": 0.19532347
          },
          "answer": "Swan",
          "question": "In a Tchaikovsky ballet, Prince Siegfried goes hunting for these animals & falls in love with 1 of them",
          "round": "Double Jeopardy!"
        }
      ]
    }
  }
}

Diversity selection (MMR)

Preview — added in v1.37

This is a preview feature. The API may change in future releases.

  • Python client: Support is not yet in a released weaviate-client. Coming in the next release (tracked in PR #1997).
  • Multi-node clusters: MMR reranking may produce suboptimal results for collections whose shards are distributed across multiple nodes, since each shard returns its own candidate set before the coordinator reranks them. We are actively working on improving this.

Standard vector search returns the closest matches to the query, which often means a cluster of near-duplicate results. Maximum Marginal Relevance (MMR) reranks results to balance relevance with diversity — each selected result must add something new to the result set.

Add the selection parameter to any vector search query:

from weaviate.classes.query import Diversity

collection = client.collections.get("MMRDemo")

# Retrieve 20 candidates, then rerank to select 5 diverse results
response = collection.query.near_vector(
    near_vector=base_vec,
    limit=20,
    selection=Diversity.MMR(
        limit=5,
        balance=0.5,
    ),
)

for o in response.objects:
    print(o.properties["question"])

How it works

  1. Weaviate runs a regular vector search to retrieve a candidate set (its size is controlled by the query's limit).
  2. The most relevant candidate is selected first.
  3. For each remaining candidate, MMR computes a score that balances similarity to the query against the maximum similarity to any already-selected result, weighted by balance.
  4. The candidate with the highest MMR score is selected next.
  5. Steps 3 and 4 repeat until the number of selected results reaches the limit set in Diversity.MMR.
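The selection loop described in these steps can be sketched in plain Python. This is a minimal illustration using the classic MMR score, balance * relevance - (1 - balance) * redundancy, with cosine similarity. The exact scoring Weaviate applies may differ, so treat the formula, the mmr_select helper, and the toy vectors as assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query, candidates, limit, balance):
    """Return indices of up to `limit` candidates chosen greedily by MMR score."""
    relevance = [cosine_similarity(query, c) for c in candidates]
    remaining = list(range(len(candidates)))

    # Step 2: the most relevant candidate is always selected first.
    first = max(remaining, key=lambda i: relevance[i])
    selected = [first]
    remaining.remove(first)

    # Steps 3 to 5: repeatedly pick the candidate with the best trade-off.
    while remaining and len(selected) < limit:
        def score(i):
            redundancy = max(cosine_similarity(candidates[i], candidates[j]) for j in selected)
            return balance * relevance[i] - (1.0 - balance) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidates: 0 and 1 are near-duplicates, 2 points elsewhere.
query = [1.0, 0.0]
candidates = [[0.98, 0.2], [0.97, 0.24], [0.9, -0.44]]

print(mmr_select(query, candidates, limit=2, balance=1.0))  # relevance only
print(mmr_select(query, candidates, limit=2, balance=0.5))  # relevance and diversity
```

With balance=1.0 the near-duplicate is picked second because it is the next most relevant candidate. With balance=0.5 its similarity to the already-selected result outweighs its relevance, so the more distinct candidate is chosen instead.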

Parameters

Parameter | Type  | Description
limit     | int   | Number of results to return after MMR reranking. Must be less than or equal to the query's top-level limit (the candidate set size).
balance   | float | Controls the relevance-diversity trade-off (0.0–1.0). 0.0 = pure diversity, 0.5 = balanced, 1.0 = pure relevance (equivalent to standard search).

from weaviate.classes.query import Diversity

collection = client.collections.get("MMRDemo")

# Pure diversity: maximize difference between results
response_diverse = collection.query.near_vector(
    near_vector=base_vec,
    limit=20,
    selection=Diversity.MMR(limit=5, balance=0.0),
)

# Balanced: equal weight on relevance and diversity
response_balanced = collection.query.near_vector(
    near_vector=base_vec,
    limit=20,
    selection=Diversity.MMR(limit=5, balance=0.5),
)

# Pure relevance: equivalent to standard vector search
response_relevant = collection.query.near_vector(
    near_vector=base_vec,
    limit=20,
    selection=Diversity.MMR(limit=5, balance=1.0),
)

Important notes:

  • Result ordering: Results are ordered by MMR score, not query similarity. The first result is the most relevant, but subsequent results may have lower query similarity because they were chosen for diversity.
  • No reindexing needed: MMR is applied at query time. You can use it on any existing collection without schema changes.
  • Supported queries: near_text, near_vector, near_object, near_image, and near_media.
  • Not supported: hybrid search and multi-vector collections.

Tip

A larger candidate set (higher top-level limit) gives MMR more results to choose from, improving diversity at the cost of slightly more computation. A good starting point is setting the candidate limit to 2–4x the MMR limit.

Questions and feedback

If you have any questions or feedback, let us know in the user forum.