Enable RQ compression
This guide shows you how to enable RQ 8-bit compression on existing collections in your Weaviate Cloud cluster. RQ compression significantly reduces the memory and storage footprint of your vectors while maintaining high recall.
Once compression is enabled on a collection, it cannot be disabled. Always test compression on non-production data first.
Prerequisites
- A Weaviate client library installed
- API key with write access to your Weaviate Cloud cluster
Connect to your cluster
First, establish a connection to your Weaviate Cloud cluster:
import os
import weaviate

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
)
Set the WEAVIATE_URL environment variable to your cluster URL (for example, https://your-cluster.weaviate.network) and WEAVIATE_API_KEY to an API key with write access.
Update a single collection
The update syntax depends on your collection's vector index type (HNSW, flat, or dynamic) and whether it uses named vectors.
HNSW index (default)
Most collections use the HNSW index. To enable RQ 8-bit compression:
from weaviate.classes.config import Reconfigure

collection = client.collections.get("MyUncompressedCollection")

collection.config.update(
    vector_config=Reconfigure.Vectors.update(
        name="default",
        vector_index_config=Reconfigure.VectorIndex.hnsw(
            quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
        ),
    )
)
If your collection uses named vectors (not "default"), update the name parameter to match your vector name.
Flat index
Compression settings on flat indexes are immutable after collection creation. You cannot enable or change compression on an existing flat index. To use RQ compression with a flat index, you must specify the compression settings when creating the collection.
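For context, here is a minimal sketch of configuring RQ on a flat index at creation time. It assumes a recent v4 Python client (the Configure.Vectors.self_provided helper and the vector_config creation argument), self-provided vectors, and a hypothetical collection name MyFlatCollection; adapt the vectorizer and other settings to your own setup.
from weaviate.classes.config import Configure

# Sketch: flat-index compression must be configured when the collection is created
# (RQ on a flat index is assumed here, per the note above)
client.collections.create(
    "MyFlatCollection",  # hypothetical collection name
    vector_config=Configure.Vectors.self_provided(
        name="default",
        vector_index_config=Configure.VectorIndex.flat(
            quantizer=Configure.VectorIndex.Quantizer.rq(bits=8),
        ),
    ),
)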
Dynamic index
For collections using the dynamic index, you can update the HNSW compression settings. Note that the flat index portion of a dynamic index cannot be modified after creation:
from weaviate.classes.config import Reconfigure

# For dynamic indexes, only the HNSW portion can be updated after creation
# The flat index compression settings are immutable
collection = client.collections.get("MyUncompressedCollection")

collection.config.update(
    vector_config=Reconfigure.Vectors.update(
        name="default",
        vector_index_config=Reconfigure.VectorIndex.dynamic(
            hnsw=Reconfigure.VectorIndex.hnsw(
                quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
            ),
        ),
    )
)
Legacy collections (pre-named vectors)
Collections created before Weaviate v1.24 (when named vectors were introduced) use a different schema structure. For these collections, use vector_index_config directly instead of vector_config:
from weaviate.classes.config import Reconfigure

# For collections created before named vectors were introduced (pre-v1.24),
# use vector_index_config directly instead of vector_config
collection = client.collections.get("MyLegacyCollection")

collection.config.update(
    vector_index_config=Reconfigure.VectorIndex.hnsw(
        quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
    )
)
Update multiple collections
Before updating multiple collections, you should understand the index types in your cluster. Different index types require different update syntax.
List collections by index type
The following example categorizes your collections by their vector index types:
from weaviate.collections.classes.config import (
    _VectorIndexConfigHNSW,
    _VectorIndexConfigFlat,
    _VectorIndexConfigDynamic,
)

# Group collections by their vector index type
hnsw_collections = []
flat_collections = []
dynamic_collections = []
legacy_collections = []

collections = client.collections.list_all()

for collection_name in collections:
    collection = client.collections.get(collection_name)
    config = collection.config.get()

    # Check if this is a legacy collection (no named vectors)
    if not config.vector_config:
        # Legacy collection - check the top-level vector_index_config
        legacy_collections.append(collection_name)
        continue

    # For each named vector, determine its index type
    for vector_name, vector_config in config.vector_config.items():
        index_config = vector_config.vector_index_config
        entry = {"collection": collection_name, "vector": vector_name}

        if isinstance(index_config, _VectorIndexConfigHNSW):
            hnsw_collections.append(entry)
        elif isinstance(index_config, _VectorIndexConfigFlat):
            flat_collections.append(entry)
        elif isinstance(index_config, _VectorIndexConfigDynamic):
            dynamic_collections.append(entry)

print(f"HNSW collections: {len(hnsw_collections)}")
print(f"Flat collections: {len(flat_collections)}")
print(f"Dynamic collections: {len(dynamic_collections)}")
print(f"Legacy collections: {len(legacy_collections)}")
Update collections in batches
The following example enables RQ compression on HNSW collections in batches. To avoid cluster instability, limit each batch to approximately 100 collections:
from weaviate.classes.config import Reconfigure

# Process collections in batches to avoid cluster instability
BATCH_SIZE = 100

# Only process the first batch (adjust the slice for subsequent batches)
batch = hnsw_collections[:BATCH_SIZE]

for entry in batch:
    collection_name = entry["collection"]
    vector_name = entry["vector"]

    collection = client.collections.get(collection_name)

    print(f"Enabling RQ-8 compression for {collection_name} (vector: {vector_name})")
    collection.config.update(
        vector_config=Reconfigure.Vectors.update(
            name=vector_name,
            vector_index_config=Reconfigure.VectorIndex.hnsw(
                quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
            ),
        )
    )

print(
    f"Processed {len(batch)} collections. Remaining: {len(hnsw_collections) - len(batch)}"
)
Enabling compression on too many collections at once can cause cluster instability or crashes. Process collections in batches of no more than 100, and for very large collections, update them one at a time.
Consider adding a "dry run" mode that only prints what would change without actually updating collections. Comment out the collection.config.update() call and test the logic first.
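A dry run could look like the following sketch, which reuses the hnsw_collections list and BATCH_SIZE from the script above and only prints the planned changes; the update call stays commented out until you are satisfied with the plan.
# Dry run: print what would change without updating anything
for entry in hnsw_collections[:BATCH_SIZE]:
    print(
        f"[DRY RUN] Would enable RQ-8 compression on "
        f"{entry['collection']} (vector: {entry['vector']})"
    )
    # Once the plan looks right, re-enable the real update:
    # collection = client.collections.get(entry["collection"])
    # collection.config.update(...)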
Performance considerations
Enabling compression requires additional memory during the encoding process. Plan your compression rollout carefully to avoid resource exhaustion.
Memory usage during compression
When compression is enabled on an existing collection, Weaviate re-encodes all vectors. This process temporarily increases memory usage.
Example: A collection with 1 million objects and 1,536 dimensions uses approximately 6.5 GB of memory before compression. During the compression process, memory usage increases by approximately 1.5 GB (~23% overhead) before settling to the compressed size.
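For a rough sense of where these numbers come from, the back-of-the-envelope estimate below covers the raw vector data only; actual memory usage, as in the example above, is somewhat higher because it also includes index overhead. The 1.5 GB temporary overhead figure is taken directly from the example.
# Rough estimate of vector memory for the example above
objects = 1_000_000
dimensions = 1_536
bytes_per_float = 4

raw_vector_gb = objects * dimensions * bytes_per_float / 1e9
print(f"Raw uncompressed vector data: ~{raw_vector_gb:.1f} GB")  # ~6.1 GB

# RQ 8-bit stores roughly 1 byte per dimension, about a 4x reduction in vector size
compressed_gb = objects * dimensions * 1 / 1e9
print(f"Raw compressed vector data: ~{compressed_gb:.1f} GB")

# Temporary overhead while Weaviate re-encodes vectors (from the example above)
peak_overhead_gb = 1.5
print(f"Expect roughly {peak_overhead_gb} GB of extra memory while compression runs")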
Recommended approach for large clusters
Enabling compression on too many collections simultaneously can cause cluster instability or crashes. Follow these guidelines to safely roll out compression:
For clusters with multiple collections or large vector counts:
- Assess your cluster: Use the list collections script to understand what you're working with
- Process in batches: Limit each batch to approximately 100 collections to avoid overwhelming cluster resources
- Handle large collections individually: For very large collections (millions of objects), enable compression one collection at a time
- Build in buffers: Add sufficient delays between batches to allow compression to complete before starting the next batch (see the sketch after this list)
- Start with smaller collections: Test on smaller collections first to understand timing and resource impact before compressing larger ones
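One way to build in those buffers is to pause between batches, as in the sketch below. It reuses client and hnsw_collections from the earlier scripts and uses a fixed time.sleep as a placeholder; the right delay depends on your collection sizes and cluster resources, and you may prefer to verify compression status (next section) before continuing to the next batch.
import time

from weaviate.classes.config import Reconfigure

BATCH_SIZE = 100
DELAY_BETWEEN_BATCHES = 600  # seconds; placeholder value, tune for your cluster

for start in range(0, len(hnsw_collections), BATCH_SIZE):
    batch = hnsw_collections[start : start + BATCH_SIZE]
    for entry in batch:
        collection = client.collections.get(entry["collection"])
        collection.config.update(
            vector_config=Reconfigure.Vectors.update(
                name=entry["vector"],
                vector_index_config=Reconfigure.VectorIndex.hnsw(
                    quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
                ),
            )
        )
    print(f"Batch {start // BATCH_SIZE + 1} submitted; pausing before the next one")
    time.sleep(DELAY_BETWEEN_BATCHES)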
Verify compression status
After enabling compression, verify the configuration:
collection = client.collections.get("MyUncompressedCollection")
config = collection.config.get()

# Check if this is a legacy collection (no named vectors)
if config.vector_config:
    # Named vectors - iterate through vector_config
    for vector_name, vector_config in config.vector_config.items():
        print(f"\nVector: {vector_name}")
        quantizer = vector_config.vector_index_config.quantizer
        if quantizer:
            print(f"  Quantizer type: {type(quantizer).__name__}")
            if hasattr(quantizer, "bits"):
                print(f"  Bits: {quantizer.bits}")
        else:
            print("  No compression enabled")
else:
    # Legacy collection - check vector_index_config directly
    print("\nLegacy collection (no named vectors)")
    quantizer = config.vector_index_config.quantizer
    if quantizer:
        print(f"  Quantizer type: {type(quantizer).__name__}")
        if hasattr(quantizer, "bits"):
            print(f"  Bits: {quantizer.bits}")
    else:
        print("  No compression enabled")
You should see output confirming that an RQ quantizer is configured, with Bits: 8.
What happens after enabling compression?
When you enable RQ compression on an existing collection:
- Existing vectors are re-encoded: Weaviate automatically converts existing vector data to use RQ compression
- Storage and memory usage are reduced: Less disk space and RAM are required for vector data
- Query performance: Query speed remains similar with minimal recall impact (typically 1-2%)
- Irreversible: The change cannot be undone
For large collections, the compression process may take some time. Your collection remains queryable during this process, but you may see temporary performance impacts during the conversion.
Best practices
- Test first: Always test compression on a non-production cluster or collection first
- Avoid bulk updates on large clusters: Do not loop through many large collections at once; process them individually or with sufficient delays
- Allow completion time: Large collections may take significant time to re-encode all vectors
- Monitor performance: Check query recall and latency after enabling compression
- Backup data: Although compression is safe, consider backing up critical data before making changes
Further resources
- How-to configure: Rotational quantization (compression)
- Concepts: Vector quantization
- Starter guides: Managing resources - Compression
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
