Replica movement
v1.32
Beyond setting the initial replication factor, you can actively manage the placement of shard replicas within your Weaviate cluster. This is useful for rebalancing data after scaling, decommissioning nodes, or optimizing data locality. Replica movement is managed through a set of dedicated RESTful API endpoints or through the client library API described below.
A replica movement affects the replication factor of the targeted shard only, not of the entire collection. A collection has a configured replication factor, but each shard (a subset of a collection) can have its own replication factor, which may differ from the collection's. For example, a COPY operation adds a replica, which can increment that shard's replication factor.
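To make the per-shard effect concrete, here is a minimal toy model in plain Python. It does not call Weaviate. The `apply_operation` helper and the dictionary layout are hypothetical, and the model assumes that a MOVE drops the source replica once the target replica exists, so the shard's replica count stays the same.

```python
from copy import deepcopy


def apply_operation(placement, shard, source, target, op_type):
    """Return a new shard -> nodes mapping after a COPY or MOVE (toy model only)."""
    if op_type not in ("COPY", "MOVE"):
        raise ValueError(f"unknown operation type: {op_type}")
    nodes = placement[shard]
    if source not in nodes:
        raise ValueError(f"{source} holds no replica of {shard}")
    if target in nodes:
        raise ValueError(f"{target} already holds a replica of {shard}")
    updated = deepcopy(placement)
    updated[shard].append(target)      # both operations create a replica on the target
    if op_type == "MOVE":
        updated[shard].remove(source)  # assumption: MOVE also drops the source replica
    return updated


# Placement taken from the example output on this page.
placement = {
    "0QK7V2bbAHQ2": ["node3", "node1"],
    "arxzWNklLIU7": ["node1", "node2"],
    "w5OcBGbNvRt4": ["node2", "node3"],
}

after_copy = apply_operation(placement, "0QK7V2bbAHQ2", "node1", "node2", "COPY")
after_move = apply_operation(placement, "0QK7V2bbAHQ2", "node1", "node2", "MOVE")

# Print the replica count of every shard before and after each operation.
for label, state in (("before", placement), ("COPY", after_copy), ("MOVE", after_move)):
    factors = {shard: len(nodes) for shard, nodes in state.items()}
    print(label, factors)
```

In this model only the shard that was operated on changes its replica count; the other shards keep the count they had before.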
Check shard state
Before initiating any movement, you might want to inspect the current distribution of replicas. You can retrieve the sharding state of an entire collection or of a specific shard.
If a snippet doesn't work or you have feedback, please open a GitHub issue.
sharding_state = client.cluster.query_sharding_state(
    collection=collection_name,
    # shard=shard_name,  # Optional: specify a shard to filter results
)
print(f"Shards in '{collection_name}': {[s.name for s in sharding_state.shards]}")
for shard in sharding_state.shards:
    print(f"Nodes for shard '{shard.name}': {shard.replicas}")
Code output
Shards in 'MyReplicatedDocCollection': ['0QK7V2bbAHQ2', 'arxzWNklLIU7', 'w5OcBGbNvRt4']
Nodes for shard '0QK7V2bbAHQ2': ['node3', 'node1']
Nodes for shard 'arxzWNklLIU7': ['node1', 'node2']
Nodes for shard 'w5OcBGbNvRt4': ['node2', 'node3']
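When deciding whether a rebalance is needed, it can help to invert this view and count replicas per node. The sketch below works on a plain dictionary that mirrors the output above; with a live client, a similar dictionary could presumably be built as `{s.name: s.replicas for s in sharding_state.shards}`, using the attributes printed in the snippet.

```python
from collections import Counter

# Shard -> nodes mapping, copied from the example output above.
shard_replicas = {
    "0QK7V2bbAHQ2": ["node3", "node1"],
    "arxzWNklLIU7": ["node1", "node2"],
    "w5OcBGbNvRt4": ["node2", "node3"],
}

# Count how many shard replicas each node currently hosts.
replicas_per_node = Counter(
    node for nodes in shard_replicas.values() for node in nodes
)

for node, count in sorted(replicas_per_node.items()):
    print(f"{node}: {count} replica(s)")
```

In this example every node hosts two replicas, so the placement is already balanced.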
Initiate a replica movement
Copy or move a shard replica by specifying the source node, destination node, collection name, shard ID, and operation type (MOVE or COPY).
from weaviate.cluster.models import ReplicationType
operation_id = client.cluster.replicate(
    collection=collection_name,
    shard=shard_name,
    source_node=source_node_name,
    target_node=target_node_name,
    replication_type=ReplicationType.COPY,  # For copying a shard
    # replication_type=ReplicationType.MOVE,  # For moving a shard
)
print(f"Replication initiated, ID: {operation_id}")
Code output
Replication initiated, ID: 32536c0e-09e1-4ea1-a2c5-e85af10a9d58
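Choosing the source and destination nodes can be derived from the sharding state. The following is a small sketch in plain Python, assuming that the source must already hold a replica of the shard and that the destination must not. The `candidate_nodes` helper and the `all_nodes` list are hypothetical and are not part of the client library.

```python
def candidate_nodes(shard_replicas, shard, all_nodes):
    """Split nodes into possible sources (hold the shard) and targets (do not)."""
    holders = set(shard_replicas[shard])
    sources = sorted(holders)
    targets = sorted(set(all_nodes) - holders)
    return sources, targets


# Shard -> nodes mapping, copied from the sharding state example on this page.
shard_replicas = {
    "0QK7V2bbAHQ2": ["node3", "node1"],
    "arxzWNklLIU7": ["node1", "node2"],
    "w5OcBGbNvRt4": ["node2", "node3"],
}
all_nodes = ["node1", "node2", "node3"]  # hypothetical list of cluster node names

sources, targets = candidate_nodes(shard_replicas, "0QK7V2bbAHQ2", all_nodes)
print(f"Possible sources: {sources}")
print(f"Possible targets: {targets}")
```

Any name from `sources` could then be passed as `source_node` and any name from `targets` as `target_node` in the call shown above.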
Check the status of a replication operation
Shard replication operations are asynchronous. Query an operation's status by its ID, optionally including the full status history.
op_status = client.cluster.replications.get(
    uuid=operation_id,
    include_history=True
)
print(f"Status for {operation_id}: {op_status.status}")
print(f"History for {operation_id}: {op_status.status_history}")
Code output
Status for f771aae1-f3c4-4fac-bae6-90597e8c70bd: ReplicateOperationStatus(state=<ReplicateOperationState.FINALIZING: 'FINALIZING'>, errors=[])
History for f771aae1-f3c4-4fac-bae6-90597e8c70bd: [ReplicateOperationStatus(state=<ReplicateOperationState.REGISTERED: 'REGISTERED'>, errors=[]), ReplicateOperationStatus(state=<ReplicateOperationState.HYDRATING: 'HYDRATING'>, errors=[])]
The movement operation can have one of the following states:
- REGISTERED
- HYDRATING
- FINALIZING
- DEHYDRATING
- READY
- CANCELLED
To learn more about the replication states, check out Concepts: Replication architecture.
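Because operations are asynchronous, a script usually needs to wait until one has finished. The sketch below polls a caller-supplied function until a terminal state is reached. It assumes that READY and CANCELLED are the terminal states. The `wait_for_replication` helper is hypothetical, and the scripted state sequence stands in for a live cluster.

```python
import time

# Assumption: an operation makes no further transitions after these states.
TERMINAL_STATES = {"READY", "CANCELLED"}


def wait_for_replication(get_state, timeout_s=300.0, poll_interval_s=2.0):
    """Poll get_state() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"replication still in state {state!r} after {timeout_s}s"
            )
        time.sleep(poll_interval_s)


# Stand-in for a live cluster: each call returns the next scripted state.
scripted = iter(["REGISTERED", "HYDRATING", "FINALIZING", "READY"])
final_state = wait_for_replication(lambda: next(scripted), poll_interval_s=0.0)
print(f"Final state: {final_state}")
```

With a real client, `get_state` could wrap the status query shown earlier, for example `lambda: client.cluster.replications.get(uuid=operation_id).status.state.value`. That attribute path is inferred from the printed output above and should be checked against the client version in use.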
List replication operations
List all ongoing and completed operations. The list can be filtered by node, collection, and shard.
# List every replication operation known to the cluster
all_ops = client.cluster.replications.list_all()
print(f"Total replication operations: {len(all_ops)}")
# Narrow the list to one collection and one target node
filtered_ops = client.cluster.replications.query(
    collection=collection_name,
    target_node=target_node_name,
)
print(
    f"Filtered operations for collection '{collection_name}' on '{target_node_name}': {len(filtered_ops)}"
)
Code output
Total replication operations: 1
Filtered operations for collection 'MyReplicatedDocCollection' on 'node3': 1
Cancel a replication operation
The operation will be stopped if possible. If successfully cancelled, its state will change to CANCELLED.
client.cluster.replications.cancel(uuid=operation_id)
Delete a replication operation
Remove a replication operation from the logs. If the operation is active, it will be cancelled first.
client.cluster.replications.delete(uuid=operation_id)
Delete all replication operations
Remove all replication operations from the logs. If any operations are active, they will be cancelled first.
client.cluster.replications.delete_all()
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
