Replica movement

Added in v1.32

Beyond setting the initial replication factor, you can actively manage the placement of shard replicas within your Weaviate cluster. This is useful for rebalancing data after scaling, decommissioning nodes, or optimizing data locality. Replica movement is managed through a set of dedicated RESTful API endpoints or through the client library API described below.

A replica movement operation changes the replication factor of the affected shard only, not of the entire collection. A collection has a configured replication factor, but each shard (a subset of the collection) can have its own replication factor, which may differ from the collection's. A replica COPY operation adds a replica of the shard, so it increments that shard's replication factor.
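To make the effect on a shard's replication factor concrete, here is a minimal sketch that models replica placement as a plain dictionary, with no Weaviate connection involved. The helper `apply_operation` and the sample shard and node names are hypothetical. The sketch also assumes that a MOVE removes the source replica once the target holds the data, which leaves the replica count unchanged; that behavior is an assumption of this example rather than something stated above.

```python
from copy import deepcopy


def apply_operation(placement, shard, source, target, op_type):
    """Return a new placement after a COPY or MOVE of one shard replica.

    `placement` maps shard name -> list of node names (hypothetical model).
    """
    if op_type not in ("COPY", "MOVE"):
        raise ValueError(f"unknown operation type: {op_type}")
    replicas = placement[shard]
    if source not in replicas:
        raise ValueError(f"{source} holds no replica of {shard}")
    if target in replicas:
        raise ValueError(f"{target} already holds a replica of {shard}")

    result = deepcopy(placement)
    result[shard].append(target)      # both operations add a replica on the target
    if op_type == "MOVE":
        result[shard].remove(source)  # assumption: MOVE then drops the source replica
    return result


placement = {"shard_a": ["node1", "node2"], "shard_b": ["node2", "node3"]}

after_copy = apply_operation(placement, "shard_a", "node1", "node3", "COPY")
after_move = apply_operation(placement, "shard_a", "node1", "node3", "MOVE")

print(len(after_copy["shard_a"]))  # 3: COPY raised this shard's replica count
print(len(after_move["shard_a"]))  # 2: MOVE kept it the same
print(len(after_copy["shard_b"]))  # 2: other shards are untouched
```

Only `shard_a` changes in either case, which mirrors the point that the operation acts on a single shard and not on the collection as a whole.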

Check shard state

Before initiating any movement, you might want to inspect the current distribution of replicas. You can retrieve the sharding state of an entire collection or of a specific shard.

sharding_state = client.cluster.query_sharding_state(
    collection=collection_name,
    # shard=shard_name,  # Optional: specify a shard to filter results
)

print(f"Shards in '{collection_name}': {[s.name for s in sharding_state.shards]}")
for shard in sharding_state.shards:
    print(f"Nodes for shard '{shard.name}': {shard.replicas}")

Code output

Shards in 'MyReplicatedDocCollection': ['0QK7V2bbAHQ2', 'arxzWNklLIU7', 'w5OcBGbNvRt4']
Nodes for shard '0QK7V2bbAHQ2': ['node3', 'node1']
Nodes for shard 'arxzWNklLIU7': ['node1', 'node2']
Nodes for shard 'w5OcBGbNvRt4': ['node2', 'node3']
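When deciding what to move, it can help to view the same information per node instead of per shard. The sketch below inverts a shard-to-nodes mapping so that uneven load is easier to spot. The helper `shards_per_node` is hypothetical, and the dictionary is typed in by hand from the output above; with a live client you would presumably build it from `shard.name` and `shard.replicas`.

```python
from collections import defaultdict


def shards_per_node(shard_to_nodes):
    """Invert a shard -> nodes mapping into node -> sorted list of shards."""
    node_to_shards = defaultdict(list)
    for shard, nodes in shard_to_nodes.items():
        for node in nodes:
            node_to_shards[node].append(shard)
    # Sort nodes and shards so the result is stable and easy to compare
    return {node: sorted(shards) for node, shards in sorted(node_to_shards.items())}


# Sample data copied from the code output above
shard_to_nodes = {
    "0QK7V2bbAHQ2": ["node3", "node1"],
    "arxzWNklLIU7": ["node1", "node2"],
    "w5OcBGbNvRt4": ["node2", "node3"],
}

distribution = shards_per_node(shard_to_nodes)
for node, shards in distribution.items():
    print(f"{node}: {len(shards)} replicas -> {shards}")
```

In this sample every node holds two replicas, so the cluster is already balanced. A node with a noticeably higher count would be a natural source for a MOVE.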

Initiate a replica movement

Copy or move a shard replica by specifying the source node, destination node, collection name, shard ID, and operation type (MOVE or COPY).

from weaviate.cluster.models import ReplicationType

operation_id = client.cluster.replicate(
    collection=collection_name,
    shard=shard_name,
    source_node=source_node_name,
    target_node=target_node_name,
    replication_type=ReplicationType.COPY,  # For copying a shard
    # replication_type=ReplicationType.MOVE,  # For moving a shard
)
print(f"Replication initiated, ID: {operation_id}")

Code output

Replication initiated, ID: 32536c0e-09e1-4ea1-a2c5-e85af10a9d58

Check the status of a replication operation

Shard replication operations are asynchronous. You can query the status of an operation, optionally including its full status history.

op_status = client.cluster.replications.get(
    uuid=operation_id,
    include_history=True
)
print(f"Status for {operation_id}: {op_status.status}")
print(f"History for {operation_id}: {op_status.status_history}")

Code output

Status for f771aae1-f3c4-4fac-bae6-90597e8c70bd: ReplicateOperationStatus(state=<ReplicateOperationState.FINALIZING: 'FINALIZING'>, errors=[])
History for f771aae1-f3c4-4fac-bae6-90597e8c70bd: [ReplicateOperationStatus(state=<ReplicateOperationState.REGISTERED: 'REGISTERED'>, errors=[]), ReplicateOperationStatus(state=<ReplicateOperationState.HYDRATING: 'HYDRATING'>, errors=[])]

Note

The movement operation can have one of the following states:

REGISTERED
HYDRATING
FINALIZING
DEHYDRATING
READY
CANCELLED

To learn more about the replication states, check out Concepts: Replication architecture.
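Because operations are asynchronous, a script usually needs to wait until one finishes. The following is a minimal polling sketch, not an official client feature. It assumes that READY and CANCELLED are the states after which nothing further happens. The helper `wait_for_terminal_state` and its parameters are hypothetical. It takes any zero-argument callable that returns the state name, so the example can run against a stand-in sequence of states without a cluster.

```python
import time

# Assumption: READY and CANCELLED are the states after which nothing further happens.
TERMINAL_STATES = {"READY", "CANCELLED"}


def wait_for_terminal_state(fetch_state, timeout_s=300.0, interval_s=2.0, sleep=time.sleep):
    """Poll `fetch_state()` until it returns a terminal state name or time runs out.

    `fetch_state` is any zero-argument callable returning the state as a string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation still in state {state} after {timeout_s}s")
        sleep(interval_s)


# Stand-in for a live cluster: replay a fixed sequence of states.
fake_states = iter(["REGISTERED", "HYDRATING", "FINALIZING", "READY"])
final_state = wait_for_terminal_state(lambda: next(fake_states), interval_s=0.0)
print(final_state)  # READY
```

With a live client, `fetch_state` could wrap the status call shown earlier, for example `lambda: client.cluster.replications.get(uuid=operation_id).status.state.value`, assuming the state enum exposes its name through `.value` as the printed output suggests.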

List replication operations

List all ongoing and completed operations. The list can be filtered by node, collection, and shard.

all_ops = client.cluster.replications.list_all()
print(f"Total replication operations: {len(all_ops)}")

filtered_ops = client.cluster.replications.query(
    collection=collection_name,
    target_node=target_node_name,
)
print(
    f"Filtered operations for collection '{collection_name}' on '{target_node_name}': {len(filtered_ops)}"
)

Code output

Total replication operations: 1
Filtered operations for collection 'MyReplicatedDocCollection' on 'node3': 1

Cancel a replication operation

The operation will be stopped if possible. If successfully cancelled, its state will change to CANCELLED.

client.cluster.replications.cancel(uuid=operation_id)

Delete a replication operation

Remove a replication operation from the logs. If the operation is active, it will be cancelled first.

client.cluster.replications.delete(uuid=operation_id)

Delete all replication operations

Remove all replication operations from the logs. If any operations are active, they will be cancelled first.

client.cluster.replications.delete_all()

Further resources

Questions and feedback

If you have any questions or feedback, let us know in the user forum.

Technical questions

If you have questions, feel free to post on our Community forum.

Documentation feedback

Leave feedback by opening a GitHub issue.
