Horizontal Scaling Deployment Strategies

Weaviate offers two complementary superpowers for scaling your deployment: sharding and replication. Sharding divides your data across multiple nodes, allowing you to handle datasets far larger than a single machine could process. Meanwhile, replication creates redundant copies of your data, ensuring high availability even when individual nodes fail or need maintenance. While each scaling method shines on its own, the true magic happens when they join forces.

Let's explore how you can harness these capabilities to build a deployment that's both massive in scale and rock-solid in reliability!

Scaling Methods

Sharding vs Replication

Replication

Replication creates redundant copies of your data. It is useful when your data needs to be highly available.

Sharding

Sharding divides data across nodes. It is useful when your dataset is too large for a single node.

Choosing your strategy

| Requirement/Goal | Sharding | Replication | Both Combined | Primary Consideration |
|---|---|---|---|---|
| Handle dataset too large for single node | Yes | No | Yes | How much data are you storing? Vector dimensions and count determine memory requirements; sharding divides this across nodes. |
| Improve query throughput | Maybe* | Yes | Yes | Is your workload read-heavy? Replication allows distributing read queries across nodes; sharding may help with certain query patterns. |
| Accelerate data imports | Yes | No | Yes | Is import speed a priority? Sharding enables parallel processing of imports; replication adds overhead during imports. |
| Ensure high availability | No | Yes | Yes | Can you tolerate downtime? Replication provides redundancy if nodes fail; without replication, shard loss = data loss. |
| Enable zero-downtime upgrades | No | Yes | Yes | How critical is continuous operation? Replication allows rolling updates; production systems typically require this capability. |
| Optimize resource utilization | Yes | Maybe* | Maybe* | Are you resource-constrained? Sharding distributes load efficiently; replication adds resource overhead. |
| Geographic distribution | No | Yes | Yes | Do you need multi-region support? Replicas can be deployed across regions; this reduces latency for geographically distributed users. |

*This may serve as a partial solution and will depend on your configuration.
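
To make the "dataset too large for single node" row of the table more concrete, here is a rough back-of-the-envelope sketch for checking whether a dataset fits on one node. It assumes uncompressed float32 vectors (4 bytes per dimension) and a 2x overhead multiplier as a rule of thumb. Both figures, the function names, and the example node size are illustrative assumptions rather than values from this page, so measure your own deployment before relying on them.

```python
import math

BYTES_PER_FLOAT32 = 4  # assumption: uncompressed float32 vectors


def estimate_vector_memory_gib(
    num_vectors: int, dimensions: int, overhead: float = 2.0
) -> float:
    """Rough memory estimate in GiB for holding all vectors in memory.

    `overhead` is an assumed rule-of-thumb multiplier for index structures
    and runtime overhead; tune it against real measurements.
    """
    raw_bytes = num_vectors * dimensions * BYTES_PER_FLOAT32
    return raw_bytes * overhead / 1024**3


def min_shards_needed(total_gib: float, usable_gib_per_node: float) -> int:
    """Smallest number of equally sized shards so each fits on one node."""
    return max(1, math.ceil(total_gib / usable_gib_per_node))


if __name__ == "__main__":
    # Hypothetical workload: 100M vectors with 768 dimensions, 64 GiB nodes.
    total = estimate_vector_memory_gib(num_vectors=100_000_000, dimensions=768)
    print(f"Estimated memory: {total:.1f} GiB")
    print(f"Shards needed at 64 GiB per node: {min_shards_needed(total, 64)}")
```

If the result is greater than 1, sharding is likely needed regardless of whether you also replicate.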

Sharding: Divide and Conquer

You've decided to shard your data. Now let's get it configured:

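from weaviate.classes.config import Configure

# Assumes `client` is an already-connected Weaviate client.
client.collections.create(
    "Article",
    sharding_config=Configure.sharding(
        virtual_per_physical=128,   # virtual shards per physical shard
        desired_count=1,            # target number of physical shards
        desired_virtual_count=128,  # desired_count * virtual_per_physical
    ),
)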

Parameters

These parameters are used to configure your collection shards:

| Parameter | Type | Description |
|---|---|---|
| desiredCount | integer | Immutable, Optional. Controls the target number of physical shards for the collection index. Defaults to the number of nodes in the cluster, but can be explicitly set lower. If set higher than the node count, some nodes will host multiple shards. |
| virtualPerPhysical | integer | Immutable, Optional. Defines how many virtual shards correspond to one physical shard, defaulting to 128. Using virtual shards aids in reducing data movement during resharding. |
| desiredVirtualCount | integer | Read-only. Shows the target total number of virtual shards, calculated as desiredCount * virtualPerPhysical. |
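
The relationship between these parameters, and the reason virtual shards reduce data movement, can be illustrated with a small standalone sketch. This is not Weaviate's actual implementation: the SHA-256 hash, the contiguous assignment, and the rebalancing rule are all assumptions chosen for illustration, and every function name is hypothetical. The point is only that objects hash to a fixed virtual shard, so adding a physical shard means reassigning some whole virtual shards instead of rehashing every object.

```python
import hashlib
from collections import Counter


def desired_virtual_count(desired_count: int, virtual_per_physical: int = 128) -> int:
    """desiredVirtualCount = desiredCount * virtualPerPhysical."""
    return desired_count * virtual_per_physical


def virtual_shard_for(object_id: str, virtual_count: int) -> int:
    """Map an object ID to a virtual shard (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % virtual_count


def assign_virtual_shards(virtual_count: int, physical_count: int) -> list[int]:
    """Assign contiguous blocks of virtual shards to physical shards."""
    return [v * physical_count // virtual_count for v in range(virtual_count)]


def add_physical_shard(assignment: list[int]) -> list[int]:
    """Add one physical shard by taking virtual shards from the busiest owners."""
    new_shard = max(assignment) + 1
    target = len(assignment) // (new_shard + 1)  # fair share for the new shard
    result = list(assignment)
    load = Counter(result)
    for _ in range(target):
        donor = max(load, key=load.get)  # currently most loaded physical shard
        result[result.index(donor)] = new_shard
        load[donor] -= 1
    return result


if __name__ == "__main__":
    total = desired_virtual_count(desired_count=2)  # 2 physical shards
    before = assign_virtual_shards(total, physical_count=2)
    after = add_physical_shard(before)
    moved = sum(1 for b, a in zip(before, after) if b != a)
    print(f"{moved} of {total} virtual shards changed owner")
    # An object's virtual shard never changes; only that shard's owner may.
    print("virtual shard for 'article-1':", virtual_shard_for("article-1", total))
```

In this toy model only the virtual shards handed to the new physical shard move, while everything else stays where it was.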

Replication: An army of clones

Configure your data's replication to ensure it's always available:

Replication factor change

Currently (from v1.25.0 onwards) a replication factor cannot be changed once it is set.

This is due to the schema consensus algorithm change in v1.25. This will be improved in future versions.

Configure replication settings, such as async replication and deletion resolution strategy.

from weaviate.classes.config import Configure, ReplicationDeletionStrategy

# Assumes `client` is an already-connected Weaviate client.
client.collections.create(
    "Article",
    replication_config=Configure.replication(
        factor=3,
        async_enabled=True,  # Enable asynchronous repair
        # Added in v1.28; sets the deletion conflict resolution strategy
        deletion_strategy=ReplicationDeletionStrategy.TIME_BASED_RESOLUTION,
    ),
)

In a highly available environment, combining sharding and replication gives you the capabilities of both methods: sharding spreads your data across nodes, while replication keeps redundant copies available when a node fails. Given the opportunity, the two techniques will be your deployment's dynamic duo. In particular, the ASYNC_REPLICATION environment variables introduced in the 1.29 release let you configure asynchronous replication as you scale horizontally.
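
As a way to reason about the combination, the following standalone sketch places each shard's replicas on distinct nodes and checks which shards become unavailable when nodes fail. The round-robin placement and all function names are hypothetical and chosen for illustration; this is not how Weaviate itself decides replica placement.

```python
def place_replicas(
    node_names: list[str], shard_count: int, replication_factor: int
) -> dict[int, list[str]]:
    """Place each shard's replicas on consecutive nodes, round-robin."""
    if replication_factor > len(node_names):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        shard: [
            node_names[(shard + r) % len(node_names)]
            for r in range(replication_factor)
        ]
        for shard in range(shard_count)
    }


def unavailable_shards(
    placement: dict[int, list[str]], failed_nodes: set[str]
) -> list[int]:
    """Return shards whose replicas all live on failed nodes."""
    return [
        shard
        for shard, nodes in placement.items()
        if all(node in failed_nodes for node in nodes)
    ]


if __name__ == "__main__":
    nodes = ["node-0", "node-1", "node-2"]

    sharded_only = place_replicas(nodes, shard_count=3, replication_factor=1)
    combined = place_replicas(nodes, shard_count=3, replication_factor=2)

    print("sharding only, node-1 down:", unavailable_shards(sharded_only, {"node-1"}))
    print("sharding + replication, node-1 down:", unavailable_shards(combined, {"node-1"}))
```

With a replication factor of 1, losing a node takes its shard with it. With a factor of 2 in this model, every shard still has a live replica after any single-node failure.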

Questions and feedback

If you have any questions or feedback, let us know in the user forum.