Select & configure vector indexes

Each named vector in a Weaviate collection can have its own vector index configuration, which Weaviate uses to build and search the index for that specific vector. In a multi-tenant collection, each tenant inherits the index configuration of the collection.

This tutorial provides a hands-on guide to configuring and tuning these vector indexes. While the defaults are a great starting point, understanding how to adjust index parameters can help you optimize for search speed, accuracy, and memory usage.

We will cover how to configure the HNSW, Flat, and Dynamic index types.

Prerequisites

This tutorial assumes you are familiar with the basics of Weaviate. If not, start with the Quickstart.

We also assume you are familiar with the high-level concepts of vector indexing.

The HNSW Index: For Speed and Scale

The Hierarchical Navigable Small World (HNSW) index is the default in Weaviate. It's designed for large-scale datasets where you need fast and accurate similarity searches. HNSW builds a multi-layered graph of your vectors, which allows it to find approximate nearest neighbors very efficiently.

Default HNSW Configuration

If you create a collection without specifying a vectorIndexConfig, Weaviate will use HNSW with its default settings. You can, however, explicitly define it.

from weaviate.classes.config import Configure

client.collections.create(
    name=collection_name,
    # ... other parameters
    vector_config=Configure.Vectors.text2vec_weaviate(
        vector_index_config=Configure.VectorIndex.hnsw()
    ),
)

Tuning HNSW for Performance

The key to HNSW is balancing the trade-offs between search speed, recall (accuracy), and import/build time. You can tune this balance by adjusting its parameters.

The most important parameters are:

maxConnections: The maximum number of connections each node in the graph can have. More connections lead to higher accuracy but use more memory and can slow down searches.
efConstruction: The size of the dynamic list used during index construction. A higher value creates a more accurate graph, improving recall at search time at the cost of longer import times.
ef: The size of the dynamic list used during a search. This is one of the most critical parameters for tuning. A higher ef value leads to better recall but slower searches. Setting ef to -1 lets Weaviate choose the value dynamically for each query.
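To make the memory side of the maxConnections trade-off concrete, here is a rough back-of-the-envelope sketch. It is not a Weaviate API and not an official sizing formula: the function name, the 8 bytes per connection, and the assumption that every node uses all of its connections are illustrative assumptions only.

```python
def estimate_hnsw_memory_bytes(
    num_vectors: int,
    dimensions: int,
    max_connections: int,
    bytes_per_dimension: int = 4,   # float32 vector components
    bytes_per_connection: int = 8,  # ASSUMPTION: illustrative figure, not a documented constant
) -> int:
    """Rough estimate: raw vectors plus one fully used layer of graph connections."""
    vector_bytes = num_vectors * dimensions * bytes_per_dimension
    # ASSUMPTION: every node uses exactly max_connections links.
    graph_bytes = num_vectors * max_connections * bytes_per_connection
    return vector_bytes + graph_bytes


# Compare two settings for one million 384-dimensional vectors.
small = estimate_hnsw_memory_bytes(1_000_000, 384, max_connections=32)
large = estimate_hnsw_memory_bytes(1_000_000, 384, max_connections=128)
print(f"max_connections=32:  {small / 1e9:.2f} GB")
print(f"max_connections=128: {large / 1e9:.2f} GB")
```

The absolute numbers depend entirely on the assumed constants. The point is only that the graph portion grows linearly with maxConnections while the vector portion stays fixed.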

Let's create a collection with custom HNSW parameters to optimize for high recall.

from weaviate.classes.config import Configure, VectorDistances

client.collections.create(
    name=collection_name,
    # ... other parameters
    vector_config=Configure.Vectors.text2vec_weaviate(
        vector_index_config=Configure.VectorIndex.hnsw(
            # Distance metric
            distance_metric=VectorDistances.COSINE,
            # Parameters for HNSW index construction
            ef_construction=256,    # Dynamic list size during construction
            max_connections=128,    # Maximum number of connections per node
            quantizer=Configure.VectorIndex.Quantizer.bq(), # Quantizer configuration
            # Parameters for HNSW search
            ef=-1,                  # Dynamic list size during search; -1 enables dynamic Ef
            dynamic_ef_factor=15,   # Multiplier for dynamic Ef
            dynamic_ef_min=200,     # Minimum threshold for dynamic Ef
            dynamic_ef_max=1000,    # Maximum threshold for dynamic Ef
        )
    ),
)
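The dynamic ef settings above interact with the query limit. The following is a minimal sketch of that interaction as we understand it, not Weaviate's actual implementation: the function name effective_ef is hypothetical, and the rule that ef never drops below the query limit is an assumption about the server's behavior.

```python
def effective_ef(
    limit: int,
    ef: int = -1,
    dynamic_ef_factor: int = 15,
    dynamic_ef_min: int = 200,
    dynamic_ef_max: int = 1000,
) -> int:
    """Return the ef value a search would use for a given query limit (sketch)."""
    if ef != -1:
        # A fixed ef was configured, so dynamic ef is not used.
        return ef
    # Scale ef with the query limit, then clamp it to the [min, max] range.
    scaled = limit * dynamic_ef_factor
    clamped = max(dynamic_ef_min, min(scaled, dynamic_ef_max))
    # ASSUMPTION: ef is never smaller than the number of requested results.
    return max(clamped, limit)


for limit in (10, 50, 100, 2000):
    print(limit, effective_ef(limit))
```

With the values from the example configuration, small limits are raised to dynamic_ef_min and mid-sized limits scale by the factor until they reach dynamic_ef_max.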

For a deeper dive into all available parameters, see the HNSW configuration reference.

The Flat Index: For Accuracy and Small Datasets

The Flat index performs a brute-force search by comparing a query vector to every single vector in the index. This guarantees perfect recall, but search time grows linearly with the number of objects, so it does not scale well for large datasets.

It's an excellent choice for:

Small indexes (e.g., under 10,000-20,000 objects).
Multi-tenant use cases where each tenant has a small, isolated index.

The main benefit of the Flat index is its extremely low memory overhead, as it doesn't need to store a complex graph structure.
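The brute-force idea can be illustrated in a few lines of plain Python. This is a conceptual sketch only, not how Weaviate implements the Flat index; the function names and the toy vectors are made up for the example.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 0 for identical directions, 2 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def flat_search(query: list[float], vectors: dict[str, list[float]], k: int) -> list[str]:
    """Compare the query against every stored vector and return the k closest keys."""
    scored = [(cosine_distance(query, vec), key) for key, vec in vectors.items()]
    scored.sort()  # exact ranking, no graph structure needed
    return [key for _, key in scored[:k]]


# Toy data: four 2-dimensional vectors (hypothetical).
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1], "d": [-1.0, 0.0]}
results = flat_search([1.0, 0.0], vectors, k=2)
print(results)
```

Every query touches every vector, which is why recall is perfect and why cost grows with the size of the index. Nothing is stored besides the vectors themselves.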

Configuring a Flat Index

Here’s how to configure a collection to use the Flat index. You can also enable quantization to speed up the brute-force search.

from weaviate.classes.config import Configure, VectorDistances

client.collections.create(
    name=collection_name,
    # ... other parameters
    vector_config=Configure.Vectors.text2vec_weaviate(
        vector_index_config=Configure.VectorIndex.flat(
            distance_metric=VectorDistances.COSINE,                     # Distance metric
            quantizer=Configure.VectorIndex.Quantizer.bq(cache=True),   # Quantizer configuration
            vector_cache_max_objects=1000000,                           # Maximum number of objects in the cache
        )
    ),
)

For more details, see the Flat index configuration reference.
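To give an intuition for why binary quantization (the bq quantizer used above) speeds up a brute-force scan, here is a minimal conceptual sketch. It keeps one bit per dimension based on the sign of the value and compares vectors by Hamming distance. The function names are hypothetical, and details of Weaviate's actual implementation, such as caching or rescoring against the original vectors, are not shown.

```python
def binary_quantize(vector: list[float]) -> int:
    """Pack a vector into an integer, one bit per dimension (1 if the value is positive)."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two quantized vectors differ."""
    return bin(a ^ b).count("1")


# Toy 4-dimensional vectors (hypothetical data).
query = binary_quantize([0.3, -0.2, 0.8, -0.5])  # bits 1010
close = binary_quantize([0.1, -0.9, 0.4, 0.2])   # bits 1011
far = binary_quantize([-0.7, 0.6, -0.1, 0.9])    # bits 0101

print(hamming_distance(query, close))
print(hamming_distance(query, far))
```

Each dimension shrinks from a 32-bit float to a single bit, and comparing two vectors becomes an XOR plus a bit count, which is much cheaper than a floating-point distance computation. The trade-off is that the compressed comparison is only an approximation of the original distance.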

The Dynamic Index: The Best of Both Worlds

The Dynamic index is a powerful feature for use cases where the size of a collection (or a tenant's data) can vary unpredictably.

It works by:

Starting with a Flat index, which is efficient for small numbers of objects.
Automatically converting to an HNSW index once the number of objects crosses a specified threshold (default is 10,000).

This is particularly useful in multi-tenant environments, as small tenants can use the memory-efficient Flat index, while large tenants automatically get the performance benefits of HNSW.
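The switching behavior described above can be modeled with a small sketch. This is a conceptual illustration, not Weaviate code: the class name DynamicIndexSketch is hypothetical, and treating "crosses the threshold" as "strictly greater than" is an assumption.

```python
class DynamicIndexSketch:
    """Toy model of one tenant's index: starts flat, upgrades to HNSW past a threshold."""

    def __init__(self, threshold: int = 10_000):
        self.threshold = threshold
        self.object_count = 0
        self.index_type = "flat"

    def add_objects(self, n: int) -> str:
        """Record n newly imported objects and return the index type now in use."""
        self.object_count += n
        # ASSUMPTION: the upgrade happens once the count exceeds the threshold.
        if self.index_type == "flat" and self.object_count > self.threshold:
            self.index_type = "hnsw"
        return self.index_type


# Two hypothetical tenants sharing the same collection-level configuration.
small_tenant = DynamicIndexSketch()
large_tenant = DynamicIndexSketch()

print(small_tenant.add_objects(500))     # stays on the flat index
print(large_tenant.add_objects(9_000))   # still below the threshold
print(large_tenant.add_objects(2_000))   # crosses the threshold
```

Each tenant tracks its own object count, so a collection can hold many small flat-indexed tenants alongside a few large HNSW-indexed ones.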

Asynchronous Indexing Required

The Dynamic index requires asynchronous indexing to be enabled in your Weaviate instance.

Basic Dynamic Index Configuration

Here's how to set up a collection with a Dynamic index using default settings.

from weaviate.classes.config import Configure

client.collections.create(
    name=collection_name,
    # ... other parameters
    vector_config=Configure.Vectors.text2vec_weaviate(
        vector_index_config=Configure.VectorIndex.dynamic()
    ),
    multi_tenancy_config=Configure.multi_tenancy(enabled=True), # Dynamic index works well with multi-tenancy setups
)

Custom Dynamic Index Configuration

You can customize both the HNSW and Flat configurations that the Dynamic index will use, as well as the threshold for switching.

from weaviate.classes.config import Configure, VectorDistances

client.collections.create(
    name=collection_name,
    # ... other parameters
    vector_config=Configure.Vectors.text2vec_weaviate(
        vector_index_config=Configure.VectorIndex.dynamic(
            distance_metric=VectorDistances.COSINE,                     # Distance metric
            threshold=25000,                                            # Object count at which the index switches from flat to HNSW
            hnsw=Configure.VectorIndex.hnsw(
                # Your preferred HNSW configuration
            ),
            flat=Configure.VectorIndex.flat(
                # Your preferred flat configuration
            ),
        )
    ),
    multi_tenancy_config=Configure.multi_tenancy(   # Dynamic index works well with multi-tenancy setups
        enabled=True,
        auto_tenant_creation=True,
        auto_tenant_activation=True,
    ),
)

For more details, see the Dynamic index configuration reference.

Next Steps

You now have a practical understanding of how to configure Weaviate's different vector index types.

For more about the concepts behind these indexes, visit the Vector Index Concepts page.
To learn about reducing memory usage even further, check out the concepts page on Vector Quantization.
