Binary Quantization (BQ)
Starting with v1.33, you can set a default quantization for new collections using the DEFAULT_QUANTIZATION environment variable. This variable is not set by default, so no quantization is applied unless you explicitly configure it. When it is set (for example, to 8-bit RQ), all newly created collections use that quantization setting. Note that once quantization is set on a collection, it can't be disabled. The default quantization is not applied to a collection whose index type doesn't support it (for example, PQ and SQ aren't supported for the flat index).
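Binary quantization (BQ) is a vector compression technique that represents each vector dimension with a single bit. Compared with 32-bit floating point values, this can reduce the size of a vector by up to 32x, at the cost of some precision.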
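As a rough illustration of the idea behind BQ, the sketch below keeps only the sign of each dimension (one bit per dimension) and compares the resulting codes by Hamming distance. It is a pure-Python toy under those assumptions, not Weaviate's implementation; the helper names `binarize` and `hamming_distance` and the sample vectors are hypothetical.

```python
def binarize(vector):
    """Pack the sign of each dimension into one integer: bit i is 1 if vector[i] > 0."""
    bits = 0
    for i, value in enumerate(vector):
        if value > 0:
            bits |= 1 << i
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two binarized vectors differ."""
    return bin(a ^ b).count("1")


# Hypothetical 4-dimensional embeddings, for illustration only.
query = [0.12, -0.40, 0.33, -0.05]
doc_close = [0.20, -0.10, 0.25, -0.30]   # same sign pattern as the query
doc_far = [-0.15, 0.22, -0.31, 0.09]     # opposite sign in every dimension

q_bits = binarize(query)
close_distance = hamming_distance(q_bits, binarize(doc_close))
far_distance = hamming_distance(q_bits, binarize(doc_far))
print(close_distance, far_distance)
```

Vectors that point in a similar direction share most sign bits and so end up with a small Hamming distance, which is why a one-bit code can still be useful for a first-pass search.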
To use BQ, enable it on a collection as described in the sections below, then add data to the collection.
Additional information
- How to set the index type
Enable compression for new collection
BQ can be enabled at collection creation time through the collection definition:
If a snippet doesn't work or you have feedback, please open a GitHub issue.
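from weaviate.classes.config import Configure

# Assumes `client` is an already connected Weaviate client.
client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        name="default",
        # Enable BQ with its default settings for this vector
        quantizer=Configure.VectorIndex.Quantizer.bq(),
    ),
)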
Enable compression for existing collection
The ability to enable BQ compression after collection creation was added in Weaviate v1.31.
BQ can also be enabled for an existing collection by updating the collection definition:
from weaviate.classes.config import Reconfigure

# Assumes `client` is an already connected Weaviate client.
collection = client.collections.use("MyCollection")

collection.config.update(
    vector_config=Reconfigure.Vectors.update(
        name="default",
        vector_index_config=Reconfigure.VectorIndex.flat(
            # Enable BQ on the existing vector index
            quantizer=Reconfigure.VectorIndex.Quantizer.bq(
                rescore_limit=20,
            ),
        ),
    )
)
BQ parameters
The following parameters are available for BQ compression, under vectorIndexConfig:
| Parameter | Type | Default | Details |
|---|---|---|---|
| `bq` : `enabled` | boolean | `false` | Enable BQ. Weaviate uses binary quantization (BQ) compression when `true`. The Python client does not use the `enabled` parameter. To enable BQ with the v4 client, set a `quantizer` in the collection definition. |
| `bq` : `rescoreLimit` | integer | `-1` | The minimum number of candidates to fetch before rescoring. |
| `bq` : `cache` | boolean | `false` | Whether to cache the vectors in memory. Applies only when using the flat vector index type. |
| `vectorCacheMaxObjects` | integer | `1e12` | Maximum number of objects in the memory cache. By default, this limit is set to one trillion (`1e12`) objects when a new collection is created. For sizing recommendations, see Vector cache considerations. |
For example:
from weaviate.classes.config import Configure

# Assumes `client` is an already connected Weaviate client.
client.collections.create(
    name="MyCollection",
    vector_config=Configure.Vectors.text2vec_openai(
        name="default",
        # Fetch at least 200 candidates before rescoring and cache vectors in memory
        quantizer=Configure.VectorIndex.Quantizer.bq(rescore_limit=200, cache=True),
        vector_index_config=Configure.VectorIndex.flat(
            vector_cache_max_objects=100000,
        ),
    ),
)
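To make the role of the rescore limit more concrete, here is a minimal pure-Python sketch of a two-stage search: a cheap pass over binary codes selects a shortlist, and the shortlist is then re-ranked with the original vectors. This is an illustration, not Weaviate's implementation. The function names, the sample vectors, the use of the dot product for rescoring, and the `max(limit, rescore_limit)` rule (one way to model "minimum number of candidates to fetch") are all assumptions.

```python
def binarize(vector):
    """One bit per dimension: bit i is 1 if vector[i] > 0."""
    return sum(1 << i for i, value in enumerate(vector) if value > 0)


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def dot(a, b):
    """Exact dot product on the original (uncompressed) vectors."""
    return sum(x * y for x, y in zip(a, b))


def search_with_rescoring(query, vectors, limit, rescore_limit):
    """Return indices of the top `limit` vectors after a binary pass plus rescoring."""
    q_bits = binarize(query)
    # Stage 1: rank everything by Hamming distance on the compressed codes
    # and keep at least `rescore_limit` candidates (assumed rule).
    candidate_count = max(limit, rescore_limit)
    shortlist = sorted(
        range(len(vectors)),
        key=lambda i: hamming(q_bits, binarize(vectors[i])),
    )[:candidate_count]
    # Stage 2: re-rank only the shortlist using the uncompressed vectors.
    rescored = sorted(shortlist, key=lambda i: dot(query, vectors[i]), reverse=True)
    return rescored[:limit]


# Hypothetical data: vector 0 matches the query's sign pattern exactly,
# but vector 1 is the better match on the original values.
query = [0.9, 0.1, -0.1]
vectors = [
    [0.1, 0.1, -0.1],
    [1.0, -0.05, -0.1],
    [-0.5, -0.5, 0.5],
]

without_extra_candidates = search_with_rescoring(query, vectors, limit=1, rescore_limit=1)
with_extra_candidates = search_with_rescoring(query, vectors, limit=1, rescore_limit=2)
print(without_extra_candidates, with_extra_candidates)
```

In this toy data set, a shortlist of one candidate returns the vector that only looks best in binary form, while a shortlist of two lets the rescoring step recover the better match. A larger rescore limit generally trades some query speed for recall in the same way.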
Additional considerations
Multiple vector embeddings (named vectors)
Collections can have multiple named vectors. The vectors in a collection can have their own configurations, and compression must be enabled independently for each vector. Every vector is independent and can use PQ, BQ, RQ, SQ, or no compression.
Multi-vector embeddings (ColBERT, ColPali, etc.)
Added in v1.30: Multi-vector embeddings (implemented through models like ColBERT, ColPali, or ColQwen) represent each object or query using multiple vectors instead of a single vector. Just like single vectors, multi-vectors support PQ, BQ, RQ, SQ, or no compression.
During the initial search phase, compressed vectors are used for efficiency. The MaxSim operation, however, is computed on uncompressed vectors so that the final similarity scores are more precise. This approach combines the search efficiency of compression with the accuracy of uncompressed vectors during final scoring.
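For readers unfamiliar with MaxSim, the following pure-Python sketch shows the late-interaction scoring commonly used with ColBERT-style models: for each query vector, take the best similarity against any of the document's vectors, then sum those maxima. The function names and sample vectors are hypothetical, and using the dot product as the similarity measure is an assumption; this is not Weaviate's internal code.

```python
def dot(a, b):
    """Dot product between two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors, doc_vectors):
    """Sum, over query vectors, of the best dot product against any document vector."""
    return sum(
        max(dot(q, d) for d in doc_vectors)
        for q in query_vectors
    )


# Hypothetical multi-vector embeddings: one small vector per token.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has an exact match for each query vector
doc_b = [[0.5, 0.5], [0.5, 0.5]]              # only partial matches

score_a = max_sim(query_tokens, doc_a)
score_b = max_sim(query_tokens, doc_b)
print(score_a, score_b)
```

Because every query vector contributes its own best match, small errors in individual vectors add up across the sum, which is one reason to compute this step on uncompressed vectors.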
Further resources
- Starter guides: Compression
- Reference: Vector index
- Concepts: Vector quantization
- Concepts: Vector index
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
