Rotational Quantization (RQ)

Technical preview

Rotational quantization (RQ) was added in v1.32 as a technical preview.

This means that the feature is still under development and may change in future releases, including potential breaking changes. We do not recommend using this feature in production environments at this time.

Rotational quantization (RQ) is a fast, untrained vector compression technique that offers 4x compression while retaining almost perfect recall (98-99% on most datasets). Because it is untrained, RQ needs no training step on your data before compression takes effect.

HNSW only

RQ is currently not supported for the flat index type.

Basic configuration

RQ can be enabled at collection creation time:

from weaviate.classes.config import Configure, Property, DataType

# Assumes `client` is an already-connected Weaviate client
client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        # Enable RQ with its default settings
        quantizer=Configure.VectorIndex.Quantizer.rq()
    ),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
    ],
)

Custom configuration

To tune RQ, use these quantization and vector index parameters:

| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| rq: bits | integer | 8 | The number of bits used to quantize each data point. Currently only 8 bits is supported. |
| rq: rescoreLimit | integer | -1 | The minimum number of candidates to fetch before rescoring. |

from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        quantizer=Configure.VectorIndex.Quantizer.rq(
            bits=8,  # Number of bits, only 8 is supported for now
        ),
        vector_index_config=Configure.VectorIndex.hnsw(
            vector_cache_max_objects=100000,
        ),
    ),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
    ],
)
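
To give a feel for what `rescoreLimit` controls, the following is a minimal, library-free sketch of a two-stage search: candidates are ranked with approximate (compressed) vectors, and a shortlist of at least `rescore_limit` candidates is then re-ranked with the original vectors. This is an illustration only, not Weaviate's implementation. The function names, the squared Euclidean distance, and the use of coarse rounding as a stand-in for quantization error are all assumptions made for the example.

```python
def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_with_rescoring(query, full_vectors, approx_vectors, k, rescore_limit):
    """Rank by approximate distance, then rescore a shortlist with exact distance."""
    # Stage 1: fetch at least `rescore_limit` candidates using the approximate vectors.
    n_candidates = max(k, rescore_limit)
    by_approx = sorted(
        range(len(approx_vectors)), key=lambda i: l2(query, approx_vectors[i])
    )
    candidates = by_approx[:n_candidates]
    # Stage 2: re-rank only the shortlist using the original (uncompressed) vectors.
    rescored = sorted(candidates, key=lambda i: l2(query, full_vectors[i]))
    return rescored[:k]


# Toy data: rounding stands in for quantization error (an assumption, not RQ itself).
full = [[0.9, 0.0], [1.4, 0.0], [3.0, 0.0]]
approx = [[round(x) for x in v] for v in full]
query = [1.3, 0.0]

top_without_extra = search_with_rescoring(query, full, approx, k=1, rescore_limit=1)
top_with_extra = search_with_rescoring(query, full, approx, k=1, rescore_limit=2)
```

In this toy setup, the first two vectors look identical after rounding, so a shortlist of one candidate can miss the true nearest neighbor, while a shortlist of two lets the exact rescoring step recover it. A larger shortlist trades some extra work for better recall.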

Multiple vector embeddings (named vectors)

Added in v1.24

Collections can have multiple named vectors. The vectors in a collection can have their own configurations, and compression must be enabled independently for each vector. Every vector is independent and can use PQ, BQ, RQ, SQ, or no compression.

Multi-vector embeddings (ColBERT, ColPali, etc.)

Added in v1.30

Multi-vector embeddings (implemented through models like ColBERT, ColPali, or ColQwen) represent each object or query using multiple vectors instead of a single vector. Just like with single vectors, multi-vectors support PQ, BQ, RQ, SQ, or no compression.

During the initial search phase, compressed vectors are used for efficiency. The MaxSim operation, however, is computed on uncompressed vectors to give more precise similarity scores. This approach balances the search efficiency of compression with the accuracy of uncompressed vectors during final scoring.
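
As a rough illustration of the MaxSim operation mentioned above, here is a minimal sketch in plain Python. It follows the common late-interaction definition: for each query token vector, take the best similarity against any document token vector, then sum those maxima. The dot product as the similarity measure and the function names are assumptions for this example. It does not show how Weaviate computes the score internally.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors, doc_vectors):
    """Sum, over query token vectors, of the best match among document token vectors."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Two query token vectors and three document token vectors (toy values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.25]]

score = max_sim(query, doc)  # best matches are 1.0 and 0.5
```

Because each query token only contributes its single best match, small errors in individual token vectors can change which document token wins, which is one reason to compute this final score on uncompressed vectors.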

Multi-vector performance

RQ supports multi-vector embeddings. The dimensionality of each token vector is rounded up to a multiple of 64, so very short vectors may achieve less than 4x compression. This is a technical limitation that may be addressed in future versions.
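
The effect of the rounding can be estimated with some simple arithmetic. The sketch below assumes the original vectors are stored as 32-bit floats and counts only the quantized codes, ignoring any per-vector metadata the index may store. The function name and its parameters are hypothetical and chosen for this example.

```python
import math


def rq_compression_ratio(dimensions, bits=8, block=64):
    """Estimated compression ratio versus float32, counting only the quantized codes."""
    # Round the dimensionality up to the next multiple of `block` (64 here).
    padded = math.ceil(dimensions / block) * block
    original_bits = dimensions * 32  # assumes float32 input vectors
    compressed_bits = padded * bits
    return original_bits / compressed_bits


for d in (32, 96, 128, 1536):
    print(d, rq_compression_ratio(d))
```

Under these assumptions, a 96-dimensional token vector is stored as 128 codes and reaches roughly 3x compression, while dimensionalities that are already multiples of 64, such as 128 or 1536, reach the full 4x.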

Questions and feedback

If you have any questions or feedback, let us know in the user forum.