Rotational Quantization (RQ)
Rotational quantization (RQ) was added in v1.32
as a technical preview.
This means that the feature is still under development and may change in future releases, including potential breaking changes.
We do not recommend using this feature in production environments at this time.
Rotational quantization (RQ) is a fast, untrained vector compression technique that offers 4x compression while retaining almost perfect recall (98-99% on most datasets).
RQ is currently not supported for the flat index type.
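As a back-of-the-envelope illustration of the 4x figure, here is a sketch that assumes 32-bit float input vectors and the 8-bit codes described below (index and metadata overhead are ignored):

num_vectors = 1_000_000
dims = 1536
uncompressed_bytes = num_vectors * dims * 4  # float32: 4 bytes per dimension
rq_bytes = num_vectors * dims * 1            # 8-bit RQ codes: 1 byte per dimension
print(uncompressed_bytes / 1024**3)   # ~5.7 GiB
print(rq_bytes / 1024**3)             # ~1.4 GiB
print(uncompressed_bytes / rq_bytes)  # 4.0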
Basic configuration
RQ can be enabled at collection creation time:
- Python Client v4
- JS/TS Client v3
- Go
- Java
from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        quantizer=Configure.VectorIndex.Quantizer.rq()  # Enable RQ with default settings
    ),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
    ],
)
await client.collections.create({
  name: "MyCollection",
  vectorizers: configure.vectors.text2VecOpenAI({
    quantizer: configure.vectorIndex.quantizer.rq(), // Enable RQ with default settings
  }),
  properties: [
    { name: "title", dataType: weaviate.configure.dataType.TEXT },
  ],
});
// Define the configuration for RQ. Setting 'enabled' to true activates RQ compression
rq_config := map[string]interface{}{
    "enabled": true,
}

// Define the class schema
class := &models.Class{
    Class:      className,
    Vectorizer: "text2vec-openai",
    // Assign the RQ configuration to the vector index config
    VectorIndexConfig: map[string]interface{}{
        "rq": rq_config,
    },
}

// Create the collection in Weaviate
err = client.Schema().ClassCreator().
    WithClass(class).
    Do(context.Background())
WeaviateClass myCollection = WeaviateClass.builder()
    .className("MyCollection")
    .vectorizer("text2vec-openai")
    .vectorIndexConfig(VectorIndexConfig.builder()
        .rq(RQConfig.builder()
            .enabled(true)
            .build())
        .build())
    .properties(Arrays.asList(
        Property.builder()
            .name("title")
            .dataType(Arrays.asList(DataType.TEXT))
            .build()))
    .build();

Result<Boolean> createResult = client.schema().classCreator()
    .withClass(myCollection)
    .run();
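To verify that the quantizer is set after creation, you can fetch the collection configuration. A minimal sketch with the Python client, printing the whole configuration object since the exact attribute layout varies by client version:

collection = client.collections.get("MyCollection")
print(collection.config.get())  # inspect the vector index and quantizer settings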
Custom configuration
To tune RQ, use these quantization and vector index parameters:
| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| rq : bits | integer | 8 | The number of bits used to quantize each data point. Currently, only 8 bits are supported. |
| rq : rescoreLimit | integer | -1 | The minimum number of candidates to fetch before rescoring. |
- Python Client v4
- JS/TS Client v3
- Go
- Java
from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        quantizer=Configure.VectorIndex.Quantizer.rq(
            bits=8,  # Number of bits; only 8 is supported for now
        ),
        vector_index_config=Configure.VectorIndex.hnsw(
            vector_cache_max_objects=100000,  # Maximum number of objects in the vector cache
        ),
    ),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
    ],
)
await client.collections.create({
  name: "MyCollection",
  vectorizers: configure.vectors.text2VecOpenAI({
    quantizer: configure.vectorIndex.quantizer.rq({
      bits: 8, // Number of bits; only 8 is supported for now
    }),
    vectorIndexConfig: configure.vectorIndex.hnsw({
      distanceMetric: configure.vectorDistances.COSINE,
      vectorCacheMaxObjects: 100000, // Maximum number of objects in the vector cache
    }),
  }),
  properties: [
    { name: "title", dataType: weaviate.configure.dataType.TEXT },
  ],
});
// Define a custom configuration for RQ
rq_with_options_config := map[string]interface{}{
    "enabled":      true,
    "rescoreLimit": 200, // The minimum number of candidates to fetch before rescoring
}

// Define the class schema with the custom RQ config and other HNSW settings
class_with_options := &models.Class{
    Class:      className,
    Vectorizer: "text2vec-openai",
    VectorIndexConfig: map[string]interface{}{
        "rq":                    rq_with_options_config,
        "distance":              "cosine", // Set the distance metric for HNSW
        "vectorCacheMaxObjects": 100000,   // Configure the vector cache
    },
}

// Create the collection in Weaviate
err = client.Schema().ClassCreator().
    WithClass(class_with_options).
    Do(context.Background())
WeaviateClass myCollection = WeaviateClass.builder()
    .className("MyCollection")
    .vectorizer("text2vec-openai")
    .vectorIndexConfig(VectorIndexConfig.builder()
        .rq(RQConfig.builder()
            .enabled(true)
            .bits(8L) // Number of bits; only 8 is supported for now
            .rescoreLimit(10L) // Minimum number of candidates to fetch before rescoring
            .build())
        .build())
    .properties(Arrays.asList(
        Property.builder()
            .name("title")
            .dataType(Arrays.asList(DataType.TEXT))
            .build()))
    .build();

Result<Boolean> createResult = client.schema().classCreator()
    .withClass(myCollection)
    .run();
Multiple vector embeddings (named vectors)
Added in v1.24
Collections can have multiple named vectors. Each named vector in a collection has its own configuration, and compression must be enabled independently for each vector: every vector can use PQ, BQ, RQ, SQ, or no compression.
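For example, with the Python client used above, each named vector can declare its own quantizer. This sketch assumes the vector_config parameter also accepts a list of named vectors; the vector names and source properties are illustrative:

from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    name="MyCollection",
    vector_config=[
        # RQ on the "title" vector
        Configure.Vectors.text2vec_openai(
            name="title",
            source_properties=["title"],
            quantizer=Configure.VectorIndex.Quantizer.rq(),
        ),
        # The "body" vector is left uncompressed
        Configure.Vectors.text2vec_openai(
            name="body",
            source_properties=["body"],
        ),
    ],
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="body", data_type=DataType.TEXT),
    ],
)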
Multi-vector embeddings (ColBERT, ColPali, etc.)
Added in v1.30
Multi-vector embeddings (implemented through models like ColBERT, ColPali, or ColQwen) represent each object or query using multiple vectors instead of a single vector. Just like with single vectors, multi-vectors support PQ, BQ, RQ, SQ, or no compression.
During the initial search phase, compressed vectors are used for efficiency. However, when computing the MaxSim operation, uncompressed vectors are used for more precise similarity calculations. This approach balances the benefits of compression for search efficiency with the accuracy of uncompressed vectors during final scoring.
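For reference, MaxSim scores a document by taking, for each query token vector, the best similarity against any of the document's token vectors and summing those maxima. A minimal NumPy sketch using dot-product similarity (normalization and batching omitted):

import numpy as np

def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    # query_vecs: (num_query_tokens, dim); doc_vecs: (num_doc_tokens, dim)
    sims = query_vecs @ doc_vecs.T        # pairwise token-to-token similarities
    return float(sims.max(axis=1).sum())  # best doc token per query token, summed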
RQ supports multi-vector embeddings. Each token vector is rounded up to a multiple of 64 dimensions, which may result in less than 4x compression for very short vectors. This is a technical limitation that may be addressed in future versions.
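To illustrate how the 64-dimension rounding affects the compression ratio, an illustrative calculation assuming 32-bit float inputs and the 8-bit codes described above:

import math

def rq_compression_ratio(dim: int) -> float:
    padded_dim = math.ceil(dim / 64) * 64  # each token vector is padded to a multiple of 64
    return (dim * 4) / (padded_dim * 1)    # float32 bytes in vs. 8-bit code bytes out

print(rq_compression_ratio(128))  # 4.0 - already a multiple of 64
print(rq_compression_ratio(96))   # 3.0 - padded to 128 dimensions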
Questions and feedback
If you have any questions or feedback, let us know in the user forum.