Binary Quantization (BQ)
v1.23
BQ is available for the flat index type from v1.23 onwards, and for the hnsw index type from v1.24 onwards.
Binary quantization (BQ) is a vector compression technique that reduces the memory footprint of vectors by representing each vector dimension as a single bit.
To use BQ, enable it as shown below and add data to the collection.
Additional information
- How to set the index type
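To get a sense of the potential savings, the back-of-the-envelope sketch below compares the in-memory size of a 32-bit float vector with its binary-quantized form. The 1536-dimension figure is only an illustrative assumption, not something BQ requires.

# Illustrative arithmetic only: per-vector memory before and after BQ
dimensions = 1536  # assumed embedding size for this example

float32_bytes = dimensions * 4  # 4 bytes (32 bits) per dimension
bq_bytes = dimensions / 8       # 1 bit per dimension after BQ

print(f"float32: {float32_bytes} B, BQ: {bq_bytes:.0f} B, "
      f"~{float32_bytes / bq_bytes:.0f}x smaller")
# float32: 6144 B, BQ: 192 B, ~32x smaller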
Enable compression for a new collection
BQ can be enabled at collection creation time through the collection definition:
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Go
- Java
from weaviate.classes.config import Configure

client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        name="default",
        quantizer=Configure.VectorIndex.Quantizer.bq(),
    ),
)
class_definition = {
    "class": "MyCollection",
    "vectorizer": "text2vec-openai",  # Can be any vectorizer
    "vectorIndexType": "flat",
    "vectorIndexConfig": {
        "bq": {
            "enabled": True,
        },
    },
    # Remainder not shown
}

client.schema.create_class(class_definition)
import { configure } from 'weaviate-client';

const collection = await client.collections.create({
  name: 'MyCollection',
  vectorizers: configure.vectors.selfProvided({
    vectorIndexConfig: configure.vectorIndex.hnsw({
      quantizer: configure.vectorIndex.quantizer.bq(),
    }),
  }),
});
async function enableBQ() {
  const classObj = {
    class: 'MyCollection',
    vectorizer: 'text2vec-openai', // Can be any vectorizer
    vectorIndexType: 'flat',
    vectorIndexConfig: {
      bq: {
        enabled: true,
      },
    },
    // Remainder not shown
  };

  const res = await client.schema.classCreator().withClass(classObj).do();
  console.log(res);
}

await enableBQ();
// Define the BQ configuration; "enabled": true turns on binary quantization
bqConfig := map[string]interface{}{
    "enabled": true,
}

// Define the class schema
class := &models.Class{
    Class:      className,
    Vectorizer: "text2vec-openai",
    // Assign the BQ configuration to the vector index config
    VectorIndexConfig: map[string]interface{}{
        "bq": bqConfig,
    },
}

// Create the collection in Weaviate
err = client.Schema().ClassCreator().
    WithClass(class).
    Do(context.Background())
WeaviateClass myCollection = WeaviateClass.builder()
    .className("MyCollection")
    .vectorizer("text2vec-openai")
    .vectorIndexConfig(VectorIndexConfig.builder()
        .bq(BQConfig.builder()
            .enabled(true)
            .build())
        .build())
    .properties(Arrays.asList(
        Property.builder()
            .name("title")
            .dataType(Arrays.asList(DataType.TEXT))
            .build()))
    .build();

Result<Boolean> createResult = client.schema().classCreator()
    .withClass(myCollection)
    .run();
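After creating the collection, you can read the configuration back to confirm that the quantizer was applied. The snippet below is a quick check using the Python client v4; the exact shape of the returned configuration object may vary between client versions.

# Sketch: read back the collection configuration and inspect the quantizer settings
collection = client.collections.get("MyCollection")
config = collection.config.get()
print(config)  # look for the BQ quantizer under the vector index configuration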
Enable compression for an existing collection
v1.31
The ability to enable BQ compression after collection creation was added in Weaviate v1.31.
BQ can also be enabled for an existing collection by updating the collection definition:
- Python Client v4
- Go
- Java
from weaviate.classes.config import Reconfigure

collection = client.collections.get("MyCollection")
collection.config.update(
    vector_config=Reconfigure.Vectors.update(
        name="default",
        vector_index_config=Reconfigure.VectorIndex.flat(
            quantizer=Reconfigure.VectorIndex.Quantizer.bq(
                rescore_limit=20,
            ),
        ),
    )
)
// Get the existing collection configuration
class, err := client.Schema().ClassGetter().
    WithClassName(className).Do(context.Background())
if err != nil {
    log.Fatalf("get class for vec idx cfg update: %v", err)
}

// Get the current vector index configuration
cfg := class.VectorIndexConfig.(map[string]interface{})

// Add BQ configuration to enable binary quantization
cfg["bq"] = map[string]interface{}{
    "enabled":      true,
    "rescoreLimit": 200,
    "cache":        true,
}

// Update the class configuration
class.VectorIndexConfig = cfg

// Apply the updated configuration to the collection
err = client.Schema().ClassUpdater().
    WithClass(class).Do(context.Background())
if err != nil {
    log.Fatalf("update class to use bq: %v", err)
}
WeaviateClass updatedCollection = WeaviateClass.builder()
    .className("MyCollection")
    .description("Updated collection with BQ compression")
    .properties(Arrays.asList(
        Property.builder()
            .name("title")
            .dataType(Arrays.asList(DataType.TEXT))
            .build()))
    .vectorizer("text2vec-openai")
    .vectorIndexConfig(VectorIndexConfig.builder()
        .bq(BQConfig.builder()
            .enabled(true)
            .rescoreLimit(10L)
            .build())
        .build())
    .build();

Result<Boolean> updateResult = client.schema().classUpdater()
    .withClass(updatedCollection)
    .run();
BQ parameters
The following parameters are available for BQ compression, under vectorIndexConfig:

| Parameter | Type | Default | Details |
| --- | --- | --- | --- |
| bq: enabled | boolean | false | Enable BQ. Weaviate uses binary quantization (BQ) compression when true. The Python client v4 does not use the enabled parameter; to enable BQ with the v4 client, set a quantizer in the collection definition. |
| bq: rescoreLimit | integer | -1 | The minimum number of candidates to fetch before rescoring. |
| bq: cache | boolean | false | Whether to use the vector cache. |
| vectorCacheMaxObjects | integer | 1e12 | Maximum number of objects in the memory cache. By default, this limit is set to one trillion (1e12) objects when a new collection is created. For sizing recommendations, see Vector cache considerations. |
For example:
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Go
- Java
from weaviate.classes.config import Configure

client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        name="default",
        quantizer=Configure.VectorIndex.Quantizer.bq(rescore_limit=200, cache=True),
        vector_index_config=Configure.VectorIndex.flat(
            vector_cache_max_objects=100000,
        ),
    ),
)
class_definition = {
    "class": "MyCollection",
    "vectorizer": "text2vec-openai",  # Can be any vectorizer
    "vectorIndexType": "flat",
    "vectorIndexConfig": {
        "bq": {
            "enabled": True,
            "rescoreLimit": 200,  # The minimum number of candidates to fetch before rescoring
            "cache": True,  # Default: False
        },
        "vectorCacheMaxObjects": 100000,  # Cache size (used if `cache` enabled)
    },
    # Remainder not shown
}

client.schema.create_class(class_definition)
import { configure } from 'weaviate-client';

const collection = await client.collections.create({
  name: 'MyCollection',
  vectorizers: configure.vectors.selfProvided({
    vectorIndexConfig: configure.vectorIndex.hnsw({
      quantizer: configure.vectorIndex.quantizer.bq({
        cache: true, // Enable caching
        rescoreLimit: 200, // The minimum number of candidates to fetch before rescoring
      }),
      vectorCacheMaxObjects: 10000, // Cache size (used if `cache` enabled)
    }),
  }),
});
async function bqWithOptions() {
  const classObj = {
    class: 'MyCollection',
    vectorizer: 'text2vec-openai', // Can be any vectorizer
    vectorIndexType: 'flat',
    vectorIndexConfig: {
      bq: {
        enabled: true,
        rescoreLimit: 200, // The minimum number of candidates to fetch before rescoring
        cache: true, // Default: false
      },
      vectorCacheMaxObjects: 100000, // Cache size (used if `cache` enabled)
    },
    // Remainder not shown
  };

  const res = await client.schema.classCreator().withClass(classObj).do();
  console.log(res);
}

await bqWithOptions();
// Define a custom configuration for BQ
bqWithOptionsConfig := map[string]interface{}{
    "enabled":      true,
    "rescoreLimit": 200, // The minimum number of candidates to fetch before rescoring
    "cache":        true, // Enable caching of binary quantized vectors
}

// Define the class schema with the custom BQ config and other HNSW settings
classWithOptions := &models.Class{
    Class:      className,
    Vectorizer: "text2vec-openai",
    VectorIndexConfig: map[string]interface{}{
        "bq":                    bqWithOptionsConfig,
        "distance":              "cosine", // Set the distance metric for HNSW
        "vectorCacheMaxObjects": 100000,   // Configure the vector cache
    },
}

// Create the collection in Weaviate
err = client.Schema().ClassCreator().
    WithClass(classWithOptions).
    Do(context.Background())
WeaviateClass myCollection = WeaviateClass.builder()
    .className("MyCollection")
    .vectorizer("text2vec-openai")
    .vectorIndexConfig(VectorIndexConfig.builder()
        .bq(BQConfig.builder()
            .enabled(true)
            .rescoreLimit(20L)
            .build())
        .build())
    .properties(Arrays.asList(
        Property.builder()
            .name("title")
            .dataType(Arrays.asList(DataType.TEXT))
            .build()))
    .build();

Result<Boolean> createResult = client.schema().classCreator()
    .withClass(myCollection)
    .run();
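To build intuition for rescoreLimit, the NumPy sketch below mimics the general two-step pattern behind BQ search: a cheap first pass over 1-bit codes selects a candidate set, which is then rescored with the original float vectors. This is a rough illustration under toy assumptions, not Weaviate's internal implementation.

import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 128)).astype(np.float32)  # toy corpus of embeddings
query = rng.normal(size=128).astype(np.float32)

# Step 1: binary quantization -- keep only the sign of each dimension (1 bit each)
codes = vectors > 0
query_code = query > 0

# Cheap first pass: rank all objects by Hamming distance on the binary codes
hamming = (codes != query_code).sum(axis=1)
rescore_limit = 200  # analogous in spirit to bq.rescoreLimit
candidates = np.argsort(hamming)[:rescore_limit]

# Step 2: rescore only the candidates with the original float vectors (cosine)
cand = vectors[candidates]
scores = cand @ query / (np.linalg.norm(cand, axis=1) * np.linalg.norm(query))
top_10 = candidates[np.argsort(-scores)[:10]]
print(top_10)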
Additional considerations
Multiple vector embeddings (named vectors)
v1.24
Collections can have multiple named vectors. The vectors in a collection can have their own configurations, and compression must be enabled independently for each vector. Every vector is independent and can use PQ, BQ, RQ, SQ, or no compression.
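For instance, the Python client v4 sketch below defines two named vectors and enables BQ on only one of them. The vector names and the choice of vectorizer are illustrative assumptions; the pattern simply mirrors the quantizer setting shown earlier, applied per named vector.

from weaviate.classes.config import Configure

client.collections.create(
    name="MyCollection",
    vector_config=[
        # BQ enabled for this named vector
        Configure.Vectors.text2vec_openai(
            name="title",
            quantizer=Configure.VectorIndex.Quantizer.bq(),
        ),
        # No quantizer set: this named vector stays uncompressed
        Configure.Vectors.text2vec_openai(
            name="body",
        ),
    ],
)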
Multi-vector embeddings (ColBERT, ColPali, etc.)
v1.30
Multi-vector embeddings (implemented through models like ColBERT, ColPali, or ColQwen) represent each object or query using multiple vectors instead of a single vector. Just like with single vectors, multi-vectors support PQ, BQ, RQ, SQ, or no compression.
During the initial search phase, compressed vectors are used for efficiency. However, when computing the MaxSim operation, uncompressed vectors are used to ensure more precise similarity calculations. This approach balances the benefits of compression for search efficiency with the accuracy of uncompressed vectors during final scoring.
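For reference, the toy NumPy sketch below computes a MaxSim score for one multi-vector query against one multi-vector document; it only illustrates the scoring formula, not how Weaviate executes it.

import numpy as np

rng = np.random.default_rng(0)
query_vecs = rng.normal(size=(4, 8))   # 4 query token vectors (ColBERT-style)
doc_vecs = rng.normal(size=(20, 8))    # 20 document token vectors

# MaxSim: for each query vector take its best-matching document vector,
# then sum those maxima into a single query-document score.
similarities = query_vecs @ doc_vecs.T        # (4, 20) pairwise dot products
maxsim_score = similarities.max(axis=1).sum()
print(maxsim_score)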
Further resources
- Starter guides: Compression
- Reference: Vector index
- Concepts: Vector quantization
- Concepts: Vector index
Questions and feedback
If you have any questions or feedback, let us know in the user forum.