Vectorizer and vector index config

Python and JS/TS client - Vectorizer Configuration API Changes

Starting with Weaviate Python client v4.16.0 and Weaviate JS/TS client v3.8.0, the vectorizer configuration API has been updated.

Action required: Update to the latest client version and migrate your code to use the new vectorizer configuration API.

Specify a vectorizer

Specify a vectorizer for a collection.

Additional information

Collection level settings override default values and general configuration parameters such as environment variables.

Code snippets in the documentation reflect the latest client library and Weaviate Database version. Check the Release notes for specific versions.

If a snippet doesn't work or you have feedback, please open a GitHub issue.

from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    "Article",
    vector_config=Configure.Vectors.text2vec_openai(),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="body", data_type=DataType.TEXT),
    ],
)

Specify vectorizer settings

.Vectors.text2vec_xxx with AutoSchema

Defining a collection with Configure.Vectors.text2vec_xxx() in Python client library versions 4.16.0 to 4.16.3 raises an error if no properties are defined and vectorize_collection_name is not set to True.

This is addressed in 4.16.4 of the Weaviate Python client. See this FAQ entry for more details: Invalid properties error in Python client versions 4.16.0 to 4.16.3.
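
The version range above can be checked programmatically before relying on AutoSchema. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `is_affected` are hypothetical and not part of the Weaviate client.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string such as '4.16.2'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def is_affected(version: str) -> bool:
    """Return True for Python client versions 4.16.0 through 4.16.3 (hypothetical helper)."""
    return (4, 16, 0) <= parse_version(version) <= (4, 16, 3)


# The installed version could be read with, for example:
#   from importlib.metadata import version
#   installed = version("weaviate-client")
for candidate in ("4.15.4", "4.16.2", "4.16.4"):
    print(candidate, is_affected(candidate))
```

If the check returns True, either upgrade the client or define at least one property (or set vectorize_collection_name to True) when creating the collection.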

To configure how a vectorizer works with a specific collection (for example, which model to use), set the vectorizer parameters.

from weaviate.classes.config import Configure

client.collections.create(
    "Article",
    vector_config=Configure.Vectors.text2vec_cohere(
        model="embed-multilingual-v2.0", vectorize_collection_name=True
    ),
)

Define named vectors

Added in v1.24

You can define multiple named vectors per collection. This allows each object to be represented by multiple vector embeddings, each with its own vector index.

As such, each named vector configuration can include its own vectorizer and vector index settings.

from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    "ArticleNV",
    vector_config=[
        # Set a named vector with the "text2vec-cohere" vectorizer
        Configure.Vectors.text2vec_cohere(
            name="title",
            source_properties=["title"],  # (Optional) Set the source property(ies)
            vector_index_config=Configure.VectorIndex.hnsw(),  # (Optional) Set vector index options
        ),
        # Set another named vector with the "text2vec-openai" vectorizer
        Configure.Vectors.text2vec_openai(
            name="title_country",
            source_properties=[
                "title",
                "country",
            ],  # (Optional) Set the source property(ies)
            vector_index_config=Configure.VectorIndex.hnsw(),  # (Optional) Set vector index options
        ),
        # Set a named vector for your own uploaded vectors
        Configure.Vectors.self_provided(
            name="custom_vector",
            vector_index_config=Configure.VectorIndex.hnsw(),  # (Optional) Set vector index options
        ),
    ],
    properties=[  # Define properties
        Property(name="title", data_type=DataType.TEXT),
        Property(name="country", data_type=DataType.TEXT),
    ],
)

Add new named vectors

Added in v1.31

You can add named vectors to an existing collection definition, provided the collection already uses named vectors. This is not possible for collections that were created without named vectors.

from weaviate.classes.config import Configure

articles = client.collections.use("Article")

articles.config.add_vector(
    vector_config=Configure.Vectors.text2vec_cohere(
        name="body_vector",
        source_properties=["body"],
    )
)

Objects aren't automatically revectorized

Adding a new vector to the collection definition won't trigger vectorization for existing objects. Only objects created after the vector addition will receive these new vector embeddings.

Define multi-vector embeddings (e.g. ColBERT, ColPali)

Added in v1.29, v1.30

Multi-vector embeddings, also known as multi-vectors, represent a single object with multiple vectors, i.e. a 2-dimensional matrix. Multi-vectors are currently only available for named vectors that use an HNSW index. To use multi-vectors, enable them for the appropriate named vector.

from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    "DemoCollection",
    vector_config=[
        # Example 1 - Use a model integration
        # The factory function will automatically enable multi-vector support for the HNSW index
        Configure.MultiVectors.text2vec_jinaai(
            name="jina_colbert",
            source_properties=["text"],
        ),
        # Example 2 - User-provided multi-vector representations
        # Must explicitly enable multi-vector support for the HNSW index
        Configure.MultiVectors.self_provided(
            name="custom_multi_vector",
        ),
    ],
    properties=[Property(name="text", data_type=DataType.TEXT)],
    # Additional parameters not shown
)

Use quantization and encoding to compress your vectors

Multi-vector embeddings use more memory than single-vector embeddings. You can use vector quantization and encoding to compress them and reduce memory usage.
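
To make the "2-dimensional matrix" idea concrete, the sketch below compares multi-vectors in plain Python. It assumes the MaxSim scoring commonly used by late-interaction models such as ColBERT: for each query vector, take its best dot-product match among the object's vectors, then sum those maxima. The function names and sample data are illustrative only; this is not Weaviate's implementation.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors, object_vectors):
    """MaxSim score: sum over query vectors of the best match in the object."""
    return sum(
        max(dot(q, d) for d in object_vectors)
        for q in query_vectors
    )


# Each embedding is a list of vectors, i.e. a 2-dimensional matrix.
query = [[1.0, 0.0], [0.0, 1.0]]
object_a = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # three vectors
object_b = [[0.5, 0.5]]                          # one vector

print(max_sim(query, object_a))
print(max_sim(query, object_b))
```

Note that the two objects hold different numbers of vectors. Only the length of each individual vector needs to match across the query and the objects.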

Set vector index type

Set the vector index type for each collection at creation time. The available index types are hnsw, flat, and dynamic.

from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    "Article",
    vector_config=Configure.Vectors.text2vec_openai(
        name="default",
        vector_index_config=Configure.VectorIndex.hnsw(),  # Use the HNSW index
        # vector_index_config=Configure.VectorIndex.flat(),  # Use the FLAT index
        # vector_index_config=Configure.VectorIndex.dynamic(),  # Use the DYNAMIC index
    ),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="body", data_type=DataType.TEXT),
    ],
)

Set vector index parameters

Set vector index parameters, such as compression and filter strategy, through the collection configuration. Some parameters can be updated after collection creation.

from weaviate.classes.config import (
    Configure,
    Property,
    DataType,
    VectorDistances,
    VectorFilterStrategy,
)

client.collections.create(
    "Article",
    vector_config=Configure.Vectors.text2vec_openai(
        name="default",
        vector_index_config=Configure.VectorIndex.hnsw(
            ef_construction=300,
            distance_metric=VectorDistances.COSINE,
            filter_strategy=VectorFilterStrategy.ACORN,
        ),
    ),
)

Property-level settings

Configure individual properties in a collection. Each property can have its own configuration. Here are some common settings:

from weaviate.classes.config import Configure, Property, DataType, Tokenization

client.collections.create(
    "Article",
    vector_config=Configure.Vectors.text2vec_cohere(),
    properties=[
        Property(
            name="title",
            data_type=DataType.TEXT,
            vectorize_property_name=True,  # Use "title" as part of the value to vectorize
            tokenization=Tokenization.LOWERCASE,  # Use "lowercase" tokenization
            description="The title of the article.",  # Optional description
        ),
        Property(
            name="body",
            data_type=DataType.TEXT,
            skip_vectorization=True,  # Don't vectorize this property
            tokenization=Tokenization.WHITESPACE,  # Use "whitespace" tokenization
        ),
    ],
)
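
The two tokenization options used above differ only in case handling. The sketch below approximates that difference in plain Python, assuming "whitespace" splits on whitespace and keeps case, while "lowercase" lowercases the text before splitting. The helper functions are hypothetical and only mimic the behavior; they are not part of the client library.

```python
def tokenize_whitespace(text):
    """Approximation of "whitespace" tokenization: split on whitespace, keep case."""
    return text.split()


def tokenize_lowercase(text):
    """Approximation of "lowercase" tokenization: lowercase, then split on whitespace."""
    return text.lower().split()


sample = "Hello, (beautiful) World"

print(tokenize_whitespace(sample))
print(tokenize_lowercase(sample))
```

Under these assumptions, a keyword filter for "hello," would match the "lowercase" tokens but not the "whitespace" tokens, because the latter preserve the capital "H". Punctuation stays attached to the tokens in both cases.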

Specify a distance metric

If you choose to bring your own vectors, you should specify the distance metric.

from weaviate.classes.config import Configure, VectorDistances

client.collections.create(
    "Article",
    vector_config=Configure.Vectors.text2vec_openai(
        vector_index_config=Configure.VectorIndex.hnsw(
            distance_metric=VectorDistances.COSINE
        ),
    ),
)
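
For intuition about what the COSINE setting measures, here is a small plain-Python sketch of cosine distance, computed as 1 minus cosine similarity. The function name and sample vectors are illustrative; the index computes distances internally, so this code is not needed in an application.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|). Ranges from 0 to 2."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [2.0, 0.0]))   # same direction
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))   # orthogonal
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Cosine distance ignores vector length and depends only on direction, so the metric you configure should match the one the embedding model was trained or evaluated with when you bring your own vectors.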

Questions and feedback

If you have any questions or feedback, let us know in the user forum.