Vector Search

Vector search is a similarity-based search that uses vector embeddings (or simply "embeddings"). It is also called "semantic search" because it finds semantically similar objects. Vector search is not limited to text, however: it also works with other types of data, such as images, video, and audio.

A vector embedding captures the semantic meaning of an object in a vector space. It consists of a set of numbers that represent the object's features. Vector embeddings are generated by a vectorizer model, which is a machine learning model trained for this purpose.

A vector search compares vectors of the stored objects against the query vector(s) to find the closest matches, before returning the top n results.
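The comparison described above can be sketched as a brute-force scan. This is a minimal illustration in plain Python, not how Weaviate implements search: the helper names (`cosine_distance`, `top_n`) and the three-dimensional toy vectors are assumptions made for the example.

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_n(query, stored, n):
    """Compare the query vector to every stored vector, return the n closest."""
    scored = [(cosine_distance(query, vec), obj_id) for obj_id, vec in stored.items()]
    scored.sort()  # smallest distance first
    return [(obj_id, dist) for dist, obj_id in scored[:n]]

# Hypothetical stored objects with made-up embeddings.
stored = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
results = top_n([1.0, 0.0, 0.0], stored, n=2)
for obj_id, dist in results:
    print(obj_id, round(dist, 4))
```

A scan like this touches every stored vector, which is why a vector database relies on a vector index for larger datasets.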

An introduction to vector search

New to vector search? Check out our blog post, "Vector Search Explained", for an introduction to vector search concepts and use cases.

Object vectors

For vector search, each object must have representative vector embeddings.

The model used to generate vectors is called a vectorizer model, or an embedding model.

A user can populate Weaviate with objects and their vectors in one of two ways:

Model provider integration

Weaviate provides first-party integrations with popular vectorizer model providers such as Cohere, Ollama, OpenAI, and more.

In this workflow, the user can configure a vectorizer for a collection and Weaviate will automatically generate vectors as needed, such as when inserting objects or performing searches.

This integration abstracts away vector generation, so the user can focus on building applications and running searches.

Vectorizer configuration is immutable

Once it is set, the vectorizer cannot be changed for a collection. This ensures that the vectors are generated consistently and stay compatible. If you need to change the vectorizer, you must create a new collection with the desired vectorizer, and migrate the data to the new collection.

Manual vectors when vectorizer is configured

Even when a vectorizer model is configured for a collection, a user can still provide vectors directly when inserting objects or performing a query. In this case, Weaviate will use the provided vector instead of generating a new one.

This is useful when the user already has vectors generated by the same model, such as when importing objects from another system. Re-using the same vectors will save time and resources, as Weaviate will not need to generate new vectors.

Bring your own vector

A user can directly upload vectors to Weaviate when inserting objects. This is useful when the user already has vectors generated by a model, or if the user wants to use a specific vectorizer model that does not have an integration with Weaviate.

In this workflow, the user has the flexibility to use any vectorizer model and process independently of Weaviate.

If using your own model, we recommend explicitly setting the vectorizer to none in the vectorizer configuration, so that Weaviate does not accidentally generate incompatible vectors.

Named vectors

A collection can be configured to allow each object to be represented by more than one vector embedding.

Each such vector, referred to as a "named vector", has its own vector space that is independent of the others.

A named vector can be configured with a vectorizer model integration, or it can be provided using the "bring your own vector" workflow.

Query vectors

In Weaviate, you can specify the query vector using:

  • A query vector (called nearVector),
  • A query object (called nearObject),
  • A query text (called nearText), or
  • A query media (called nearImage or nearVideo).

In each of these cases, the search will return the most similar objects to the query, based on the vector embeddings of the query and the stored objects. However, they differ in how the query vector is specified to Weaviate.

nearVector

In a nearVector query, the user provides a vector directly to Weaviate. This vector is compared to the vectors of the stored objects to find the most similar objects.

nearObject

In a nearObject query, the user provides an object ID to Weaviate. Weaviate retrieves the vector of the object and compares it to the vectors of the stored objects to find the most similar objects.

nearText (and nearImage, nearVideo)

In a nearText query, the user provides an input text to Weaviate. Weaviate uses the specified vectorizer model to generate a vector for the text, and compares it to the vectors of the stored objects to find the most similar objects.

As a result, a nearText query is only available for collections where a vectorizer model is configured.

A nearImage or nearVideo query works similarly to a nearText query, but with an image or video input instead of text.
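The query types above differ only in how the query vector is obtained. The sketch below illustrates that idea with an in-memory dictionary. The function `resolve_query_vector`, its parameter names, and `fake_vectorizer` are all hypothetical and are not part of any Weaviate client.

```python
def resolve_query_vector(store, near_vector=None, near_object=None,
                         near_text=None, vectorizer=None):
    """Turn one of the three query inputs into a query vector."""
    if near_vector is not None:
        # nearVector: the caller supplies the vector directly.
        return list(near_vector)
    if near_object is not None:
        # nearObject: look up the stored vector by object ID.
        return store[near_object]
    if near_text is not None:
        # nearText: only possible when a vectorizer is configured.
        if vectorizer is None:
            raise ValueError("nearText needs a configured vectorizer")
        return vectorizer(near_text)
    raise ValueError("no query input given")

def fake_vectorizer(text):
    """Stand-in for a real embedding model; the output is not meaningful."""
    return [float(len(text)), 1.0]

store = {"obj-1": [0.1, 0.9], "obj-2": [0.8, 0.2]}

print(resolve_query_vector(store, near_vector=[0.5, 0.5]))
print(resolve_query_vector(store, near_object="obj-2"))
print(resolve_query_vector(store, near_text="abc", vectorizer=fake_vectorizer))
```

Once the query vector is resolved, the comparison against stored vectors is the same in every case.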

Multi-target vector search

In a multi-target vector search, Weaviate performs multiple, concurrent, single-target vector searches.

These searches will produce multiple sets of results, each with a vector distance score.

Weaviate combines these result sets, using a "join strategy" to produce final scores for each result.

If an object is within the search limit or the distance threshold of any of the target vectors, it will be included in the search results.

If an object does not contain vectors for any selected target vector, Weaviate ignores that object and does not include it in the search results.

Available join strategies:

  • minimum (default): Use the minimum of all vector distances.
  • sum: Use the sum of the vector distances.
  • average: Use the average of the vector distances.
  • manual weights: Use the sum of weighted distances, where the weight is provided for each target vector.
  • relative score: Use the sum of weighted normalized distances, where the weight is provided for each target vector.
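As a rough illustration of the join strategies listed above, the sketch below combines one object's per-target distances into a single score. The function name `join_distances`, the strategy strings, and the target names are assumptions for this example, not Weaviate identifiers. The relative score strategy is left out because its normalization step depends on the whole result set and is not specified here.

```python
def join_distances(distances, strategy="minimum", weights=None):
    """Combine per-target distances for one object into a final score.

    distances: dict mapping target vector name -> distance.
    weights:   dict mapping target vector name -> weight (manual weights only).
    """
    values = list(distances.values())
    if strategy == "minimum":
        return min(values)
    if strategy == "sum":
        return sum(values)
    if strategy == "average":
        return sum(values) / len(values)
    if strategy == "manual_weights":
        return sum(weights[name] * dist for name, dist in distances.items())
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical distances of one object to two named target vectors.
per_target = {"title": 0.25, "body": 0.75}

print(join_distances(per_target, "minimum"))
print(join_distances(per_target, "sum"))
print(join_distances(per_target, "average"))
print(join_distances(per_target, "manual_weights", weights={"title": 2.0, "body": 1.0}))
```

With manual weights, a larger weight makes that target's distance count more toward the final score.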

Vector index

Weaviate uses vector indexes to facilitate efficient vector searches. Like other types of indexes, a vector index organizes vector embeddings in a way that allows for fast retrieval while optimizing for other needs such as search quality (e.g. recall), search throughput, and resource use (e.g. memory).

In Weaviate, several types of vector index are available, such as hnsw, flat, and dynamic.

Each collection or tenant in Weaviate will have its own vector index. Additionally, each collection or tenant can have multiple vector indexes, each with different configurations.

Distance metrics

There are many ways to measure vector distances, such as cosine distance, dot product, and Euclidean distance. Weaviate supports a variety of these distance metrics, as listed on the distance metrics page. Each vectorizer model is trained with a specific distance metric, so it is important to use the same distance metric for search as was used for training the model.

Weaviate uses cosine distance as the default distance metric for vector searches, as this is the typical distance metric for vectorizer models.

Distance vs Similarity

With a "distance", the lower the value, the closer the vectors are to each other. With a "similarity" or "certainty" score, the higher the value, the closer the vectors are to each other. Some metrics, such as cosine distance, can also be expressed as a similarity score. Others, such as Euclidean distance, are only expressible as a distance.
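To make the distinction concrete, the sketch below computes cosine distance, its matching similarity, and Euclidean distance for a pair of toy vectors. The helper names are illustrative, and the conversion shown (similarity = 1 - cosine distance) is the standard textbook relationship rather than a statement about how Weaviate reports scores.

```python
import math

def cosine_distance(a, b):
    """1 - cos(angle): 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def cosine_similarity(a, b):
    """Higher means closer; the mirror image of cosine distance."""
    return 1.0 - cosine_distance(a, b)

def euclidean_distance(a, b):
    """Straight-line distance; unbounded, so no natural similarity form."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

same = ([1.0, 0.0], [2.0, 0.0])        # same direction, different length
orthogonal = ([1.0, 0.0], [0.0, 1.0])  # unrelated directions

for a, b in (same, orthogonal):
    print(cosine_distance(a, b), cosine_similarity(a, b), euclidean_distance(a, b))
```

The first pair has a cosine distance of zero even though its Euclidean distance is not zero, which is one reason to match the search metric to the one the model was trained with.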

Diversity selection (MMR)

Preview — added in v1.37

This is a preview feature. The API may change in future releases.

  • Python client: Support is not yet in a released weaviate-client. Coming in the next release (tracked in PR #1997).
  • Multi-node clusters: MMR reranking may produce suboptimal results for collections whose shards are distributed across multiple nodes, since each shard returns its own candidate set before the coordinator reranks them. We are actively working on improving this.

Standard vector search returns the closest matches to a query, which often means a cluster of near-duplicate results. For example, searching for "Italian food" in a product catalog might return five images of pizza instead of a diverse spread of Italian dishes.

Maximum Marginal Relevance (MMR) solves this by reranking results to balance two objectives:

  • Relevance: how well does the item match the query?
  • Diversity: how different is the item from the results already selected?

The algorithm works iteratively. It selects the most relevant item first, then for each subsequent pick it scores candidates by weighing their query similarity against their maximum similarity to any already-selected result. The balance parameter (λ) controls the trade-off:

  • λ = 0.0: Pure diversity — maximizes difference from already-selected items
  • λ = 0.5: Balanced — each result must be both relevant and distinct
  • λ = 1.0: Pure relevance — equivalent to standard vector search

MMR is applied at query time as a reranking step on top of standard search. No reindexing is needed. The typical pattern is to retrieve a larger candidate set via regular vector search, then rerank a subset using MMR.
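The iterative selection described above can be sketched as follows. This uses the textbook MMR scoring rule, λ · relevance − (1 − λ) · redundancy, with cosine similarity. It is an assumption that this matches Weaviate's internal scoring, and the names `mmr_rerank` and `balance`, as well as the toy vectors, are made up for the example.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mmr_rerank(query, candidates, k, balance=0.5):
    """Pick k candidate IDs, trading query relevance against redundancy.

    candidates: dict mapping ID -> vector (the pre-retrieved candidate set).
    balance:    the lambda parameter; 1.0 is pure relevance.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < k:
        best_id, best_score = None, -math.inf
        for cid, vec in remaining.items():
            relevance = cosine_similarity(query, vec)
            if not selected:
                # The first pick is always the most relevant item.
                score = relevance
            else:
                # Maximum similarity to any already-selected result.
                redundancy = max(
                    cosine_similarity(vec, candidates[s]) for s in selected
                )
                score = balance * relevance - (1.0 - balance) * redundancy
            if score > best_score:
                best_id, best_score = cid, score
        selected.append(best_id)
        del remaining[best_id]
    return selected

# Two near-duplicate items and one that is relevant but different.
candidates = {
    "pizza_a": [1.0, 0.3, 0.0],
    "pizza_b": [1.0, 0.31, 0.0],
    "pasta": [1.0, 0.0, 0.5],
}
query = [1.0, 0.0, 0.0]

print(mmr_rerank(query, candidates, k=2, balance=1.0))
print(mmr_rerank(query, candidates, k=2, balance=0.5))
```

In this toy data, pure relevance returns both near-duplicates, while the balanced setting swaps the second one for the more distinct item.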

Result ordering

Results are ordered by MMR score, not query similarity. The first result is always the most relevant, but subsequent results may have lower query similarity because they were chosen for the diversity they add.

See the how-to guide for configuration details and code examples.

Notes and best practices

All compatible vectors are similar to some degree.

This has two effects:

  1. There will always be some "top" search results regardless of relevance.
  2. Unless the results are limited, the entire dataset is returned.

If you search a vector database containing vectors for colors "Red", "Crimson" and "LightCoral" with a query vector for "SkyBlue", the search will still return a result (e.g. "Red"), even if it is not semantically similar to the query. The search is simply returning the closest match, even if it is not a good match in the absolute sense.

As a result, Weaviate provides multiple ways to limit the search results:

  • Limit: Specify the maximum number of results to return.
  • AutoCut: Limit results based on discontinuities in result metrics such as vector distance or search score.
  • Threshold: Specify a minimum similarity score (e.g. maximum cosine distance) for the results.
  • Apply filters: Use filters to exclude results based on other criteria, such as metadata or properties.

Use a combination of these methods to ensure that the search results are meaningful and relevant to the user.

Generally, start by setting the limit to the maximum number of results you want to show the user, then adjust the threshold so that irrelevant results are unlikely to be returned.

This will cause the search to return up to the specified (limit) number of results, but only if they are above the specified (threshold) similarity score.
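The combined effect of a limit and a threshold can be sketched as a simple post-filter over distance-sorted results. The function name `apply_limit_and_threshold` and all distance values below are invented for illustration; they are not real model outputs.

```python
def apply_limit_and_threshold(results, limit, max_distance):
    """Keep at most `limit` results whose distance is within `max_distance`.

    results: list of (object_id, distance) pairs, sorted by ascending distance.
    """
    within_threshold = [(oid, d) for oid, d in results if d <= max_distance]
    return within_threshold[:limit]

# Made-up distances for a query that resembles the stored colors.
close_query = [("Crimson", 0.05), ("Red", 0.10), ("LightCoral", 0.40)]
# Made-up distances for a "SkyBlue" query: something is closest, nothing is close.
far_query = [("Red", 0.62), ("Crimson", 0.65), ("LightCoral", 0.71)]

print(apply_limit_and_threshold(close_query, limit=2, max_distance=0.3))
print(apply_limit_and_threshold(far_query, limit=2, max_distance=0.3))
```

With only a limit, the second query would still return two colors. The threshold is what allows it to return nothing.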
