Enable RQ compression
This guide shows you how to enable RQ 8-bit compression on existing collections in your Weaviate Cloud cluster. RQ compression significantly reduces the memory and storage footprint of your vectors while maintaining high recall.
Once compression is enabled on a collection, it cannot be disabled. Always test compression on non-production data first.
Prerequisites
- A Weaviate client library installed
- API key with write access to your Weaviate Cloud cluster
Connect to your cluster
First, establish a connection to your Weaviate Cloud cluster:
import os
import weaviate

# Best practice: store your credentials in environment variables
weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=weaviate_api_key,
)
Set the WEAVIATE_URL environment variable to your cluster URL (for example, https://your-cluster.weaviate.network) and WEAVIATE_API_KEY to an API key with write access.
Update a single collection
The update syntax depends on your collection's vector index type (HNSW, flat, or dynamic) and whether it uses named vectors.
HNSW index (default)
Most collections use the HNSW index. To enable RQ 8-bit compression:
from weaviate.classes.config import Reconfigure

collection = client.collections.get("MyUncompressedCollection")

collection.config.update(
    vector_config=Reconfigure.Vectors.update(
        name="default",
        vector_index_config=Reconfigure.VectorIndex.hnsw(
            quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
        ),
    )
)
If your collection uses named vectors (not "default"), update the name parameter to match your vector name.
Flat index
Compression settings on flat indexes are immutable after collection creation. You cannot enable or change compression on an existing flat index. To use RQ compression with a flat index, you must specify the compression settings when creating the collection.
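For context, here is a minimal sketch of configuring RQ on a flat index at creation time. It assumes a recent v4 Python client (the Configure.Vectors.self_provided helper and the vector_config creation argument), self-provided vectors, and a hypothetical collection name MyFlatCollection; adapt the vectorizer and other settings to your own setup.
from weaviate.classes.config import Configure

# Sketch: flat-index compression must be configured when the collection is created
# (RQ on a flat index is assumed here, per the note above)
client.collections.create(
    "MyFlatCollection",  # hypothetical collection name
    vector_config=Configure.Vectors.self_provided(
        name="default",
        vector_index_config=Configure.VectorIndex.flat(
            quantizer=Configure.VectorIndex.Quantizer.rq(bits=8),
        ),
    ),
)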
Dynamic index
For collections using the dynamic index, you can update the HNSW compression settings. Note that the flat index portion of a dynamic index cannot be modified after creation:
from weaviate.classes.config import Reconfigure

# For dynamic indexes, only the HNSW portion can be updated after creation
# The flat index compression settings are immutable
collection = client.collections.get("MyUncompressedCollection")

collection.config.update(
    vector_config=Reconfigure.Vectors.update(
        name="default",
        vector_index_config=Reconfigure.VectorIndex.dynamic(
            hnsw=Reconfigure.VectorIndex.hnsw(
                quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
            ),
        ),
    )
)
Legacy collections (pre-named vectors)
Collections created before Weaviate v1.24 (when named vectors were introduced) use a different schema structure. For these collections, use vector_index_config directly instead of vector_config:
from weaviate.classes.config import Reconfigure

# For collections created before named vectors were introduced (pre-v1.24),
# use vector_index_config directly instead of vector_config
collection = client.collections.get("MyLegacyCollection")

collection.config.update(
    vector_index_config=Reconfigure.VectorIndex.hnsw(
        quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
    )
)
Update multiple collections
Before updating multiple collections, you should understand the index types in your cluster. Different index types require different update syntax.
List collections by index type
The following example categorizes your collections by their vector index types:
from weaviate.collections.classes.config import (
    _VectorIndexConfigHNSW,
    _VectorIndexConfigFlat,
    _VectorIndexConfigDynamic,
)

# Group collections by their vector index type
hnsw_collections = []
flat_collections = []
dynamic_collections = []
legacy_collections = []

collections = client.collections.list_all()

for collection_name in collections:
    collection = client.collections.get(collection_name)
    config = collection.config.get()

    # Check if this is a legacy collection (no named vectors)
    if not config.vector_config:
        # Legacy collection - check the top-level vector_index_config
        legacy_collections.append(collection_name)
        continue

    # For each named vector, determine its index type
    for vector_name, vector_config in config.vector_config.items():
        index_config = vector_config.vector_index_config
        entry = {"collection": collection_name, "vector": vector_name}

        if isinstance(index_config, _VectorIndexConfigHNSW):
            hnsw_collections.append(entry)
        elif isinstance(index_config, _VectorIndexConfigFlat):
            flat_collections.append(entry)
        elif isinstance(index_config, _VectorIndexConfigDynamic):
            dynamic_collections.append(entry)

print(f"HNSW collections: {len(hnsw_collections)}")
print(f"Flat collections: {len(flat_collections)}")
print(f"Dynamic collections: {len(dynamic_collections)}")
print(f"Legacy collections: {len(legacy_collections)}")
Update collections in batches
The following example enables RQ compression on HNSW collections in batches. To avoid cluster instability, limit each batch to approximately 100 collections:
from weaviate.classes.config import Reconfigure

# Process collections in batches to avoid cluster instability
BATCH_SIZE = 100

# Only process the first batch (adjust the slice for subsequent batches)
batch = hnsw_collections[:BATCH_SIZE]

for entry in batch:
    collection_name = entry["collection"]
    vector_name = entry["vector"]

    collection = client.collections.get(collection_name)

    print(f"Enabling RQ-8 compression for {collection_name} (vector: {vector_name})")
    collection.config.update(
        vector_config=Reconfigure.Vectors.update(
            name=vector_name,
            vector_index_config=Reconfigure.VectorIndex.hnsw(
                quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
            ),
        )
    )

print(
    f"Processed {len(batch)} collections. Remaining: {len(hnsw_collections) - len(batch)}"
)
Enabling compression on too many collections at once can cause cluster instability or crashes. Process collections in batches of no more than 100, and for very large collections, update them one at a time.
Consider adding a "dry run" mode that only prints what would change without actually updating collections. Comment out the collection.config.update() call and test the logic first.
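A dry run could look like the following sketch, which reuses the hnsw_collections list and BATCH_SIZE from the script above and only prints the planned changes; the update call stays commented out until you are satisfied with the plan.
# Dry run: print what would change without updating anything
for entry in hnsw_collections[:BATCH_SIZE]:
    print(
        f"[DRY RUN] Would enable RQ-8 compression on "
        f"{entry['collection']} (vector: {entry['vector']})"
    )
    # Once the plan looks right, re-enable the real update:
    # collection = client.collections.get(entry["collection"])
    # collection.config.update(...)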
Performance considerations
Enabling compression requires additional memory during the encoding process. Plan your compression rollout carefully to avoid resource exhaustion.
Memory usage during compression
When compression is enabled on an existing collection, Weaviate re-encodes all vectors. This process temporarily increases memory usage.
Example: A collection with 1 million objects and 1,536 dimensions uses approximately 6.5 GB of memory before compression. During the compression process, memory usage increases by approximately 1.5 GB (~23% overhead) before settling to the compressed size.
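For a rough sense of where these numbers come from, the back-of-the-envelope estimate below covers the raw vector data only; actual memory usage, as in the example above, is somewhat higher because it also includes index overhead. The 1.5 GB temporary overhead figure is taken directly from the example.
# Rough estimate of vector memory for the example above
objects = 1_000_000
dimensions = 1_536
bytes_per_float = 4

raw_vector_gb = objects * dimensions * bytes_per_float / 1e9
print(f"Raw uncompressed vector data: ~{raw_vector_gb:.1f} GB")  # ~6.1 GB

# RQ 8-bit stores roughly 1 byte per dimension, about a 4x reduction in vector size
compressed_gb = objects * dimensions * 1 / 1e9
print(f"Raw compressed vector data: ~{compressed_gb:.1f} GB")

# Temporary overhead while Weaviate re-encodes vectors (from the example above)
peak_overhead_gb = 1.5
print(f"Expect roughly {peak_overhead_gb} GB of extra memory while compression runs")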
Recommended approach for large clusters
Enabling compression on too many collections simultaneously can cause cluster instability or crashes. Follow these guidelines to safely roll out compression:
For clusters with multiple collections or large vector counts:
- Assess your cluster: Use the list collections script to understand what you're working with
- Process in batches: Limit each batch to approximately 100 collections to avoid overwhelming cluster resources
- Handle large collections individually: For very large collections (millions of objects), enable compression one collection at a time
- Build in buffers: Add sufficient delays between batches to allow compression to complete before starting the next batch (see the sketch after this list)
- Start with smaller collections: Test on smaller collections first to understand timing and resource impact before compressing larger ones
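One way to build in those buffers is to pause between batches, as in the sketch below. It reuses client and hnsw_collections from the earlier scripts and uses a fixed time.sleep as a placeholder; the right delay depends on your collection sizes and cluster resources, and you may prefer to verify compression status (next section) before continuing to the next batch.
import time

from weaviate.classes.config import Reconfigure

BATCH_SIZE = 100
DELAY_BETWEEN_BATCHES = 600  # seconds; placeholder value, tune for your cluster

for start in range(0, len(hnsw_collections), BATCH_SIZE):
    batch = hnsw_collections[start : start + BATCH_SIZE]
    for entry in batch:
        collection = client.collections.get(entry["collection"])
        collection.config.update(
            vector_config=Reconfigure.Vectors.update(
                name=entry["vector"],
                vector_index_config=Reconfigure.VectorIndex.hnsw(
                    quantizer=Reconfigure.VectorIndex.Quantizer.rq(bits=8),
                ),
            )
        )
    print(f"Batch {start // BATCH_SIZE + 1} submitted; pausing before the next one")
    time.sleep(DELAY_BETWEEN_BATCHES)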
Verify compression status
After enabling compression, verify the configuration:
collection = client.collections.get("MyUncompressedCollection")
config = collection.config.get()

# Check if this is a legacy collection (no named vectors)
if config.vector_config:
    # Named vectors - iterate through vector_config
    for vector_name, vector_config in config.vector_config.items():
        print(f"\nVector: {vector_name}")
        quantizer = vector_config.vector_index_config.quantizer
        if quantizer:
            print(f"  Quantizer type: {type(quantizer).__name__}")
            if hasattr(quantizer, "bits"):
                print(f"  Bits: {quantizer.bits}")
        else:
            print("  No compression enabled")
else:
    # Legacy collection - check vector_index_config directly
    print("\nLegacy collection (no named vectors)")
    quantizer = config.vector_index_config.quantizer
    if quantizer:
        print(f"  Quantizer type: {type(quantizer).__name__}")
        if hasattr(quantizer, "bits"):
            print(f"  Bits: {quantizer.bits}")
    else:
        print("  No compression enabled")
You should see output confirming that an RQ quantizer is configured, with Bits: 8.
What happens after enabling compression?
When you enable RQ compression on an existing collection:
- Existing vectors are re-encoded: Weaviate automatically converts existing vector data to use RQ compression
- Storage and memory usage are reduced: Less disk space and RAM are required for vector data
- Query performance: Query speed remains similar with minimal recall impact (typically 1-2%)
- Irreversible: The change cannot be undone
For large collections, the compression process may take some time. Your collection remains queryable during this process, but you may see temporary performance impacts during the conversion.
Best practices
- Test first: Always test compression on a non-production cluster or collection first
- Avoid bulk updates on large clusters: Do not loop through many large collections at once; process them individually or with sufficient delays
- Allow completion time: Large collections may take significant time to re-encode all vectors
- Monitor performance: Check query recall and latency after enabling compression
- Backup data: Although compression is safe, consider backing up critical data before making changes
Further resources
- How-to configure: Rotational quantization (compression)
- Concepts: Vector quantization
- Starter guides: Managing resources - Compression
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
