Monitoring

Weaviate can expose Prometheus-compatible metrics for monitoring. A standard Prometheus/Grafana setup can be used to visualize metrics on various dashboards.

Metrics can be used to measure request latencies, import speed, time spent on vector vs object storage, memory usage, application usage, and more.

Configure Monitoring

Enable within Weaviate

To have Weaviate collect metrics and expose them in a Prometheus-compatible format, set the following environment variable:

PROMETHEUS_MONITORING_ENABLED=true

By default, Weaviate exposes the metrics at <hostname>:2112/metrics. To use a different port, set the following environment variable:

PROMETHEUS_MONITORING_PORT=3456
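
As a quick check that the endpoint is reachable, the metrics can be read with a few lines of Python. The following is a minimal sketch, not an official client: the helper names (metrics_url, parse_metrics, fetch_metrics) and the sample payload are hypothetical, and the parser handles only simple sample lines of the Prometheus text format (label values are not unescaped).

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def metrics_url(host, port=2112):
    """Build the scrape URL; 2112 is the default port named above."""
    return f"http://{host}:{port}/metrics"


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(host, port=2112, timeout=5):
    """Fetch and parse metrics from a running instance (needs network access)."""
    with urlopen(metrics_url(host, port), timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Hypothetical payload for illustration only; real output has many more lines.
SAMPLE_TEXT = '''# TYPE object_count gauge
object_count{class_name="Article",shard_name="s1"} 120
object_count{class_name="Article",shard_name="s2"} 80
'''

samples = parse_metrics(SAMPLE_TEXT)
```

Against a live instance, fetch_metrics("localhost", 3456) would read from the custom port configured above.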

Scrape metrics from Weaviate

Metrics are typically scraped into a time-series database, such as Prometheus. How you consume metrics depends on your setup and environment.

The Weaviate examples repo contains a fully pre-configured setup using Prometheus, Grafana, and some example dashboards. You can start a full setup, including monitoring and dashboards, with a single command. This setup uses the following components:

  • Docker Compose is used to provide a fully-configured setup that can be started with a single command.
  • Weaviate is configured to expose Prometheus metrics as outlined in the section above.
  • A Prometheus instance is started with the setup and configured to scrape metrics from Weaviate every 15s.
  • A Grafana instance is started with the setup and configured to use the Prometheus instance as a metrics provider. Additionally, it runs a dashboard provider that contains a few sample dashboards.

Multi-tenancy

When using multi-tenancy, we suggest setting the PROMETHEUS_MONITORING_GROUP environment variable to true so that data across all tenants is grouped together for monitoring.

Obtainable Metrics

Versioning & breaking changes

Be aware that metrics do not follow the semantic versioning guidelines of other Weaviate features. Weaviate's main APIs are stable and breaking changes are extremely rare. Metrics, however, have shorter feature lifecycles. It can sometimes be necessary to introduce an incompatible change or entirely remove a metric, for example, because the cost of observing a specific metric in production has grown too high. As a result, it is possible that a Weaviate minor release contains a breaking change for the Monitoring system. If so, it will be clearly highlighted in the release notes.

The list of metrics that are obtainable through Weaviate's metric system is constantly being expanded. The complete list of metrics can be found in the Weaviate source code.

This page describes metrics and their uses. Typically, metrics are quite granular, as they can always be aggregated later on. For example, if the granularity is "shard", you could aggregate all "shard" metrics of the same "class" (collection) to obtain a class metric, or aggregate all metrics to obtain the metric for the entire Weaviate instance.
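
The roll-up from shard to class to instance can be sketched as follows. This is an illustration under stated assumptions: the aggregate helper and the sample values are hypothetical, and plain summation suits additive gauges such as object_count, not durations or ratios.

```python
from collections import defaultdict


def aggregate(samples, by=()):
    """Sum sample values, keeping only the labels listed in `by`."""
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(labels.get(name, "") for name in by)
        totals[key] += value
    return dict(totals)


# Hypothetical shard-level object_count samples as (labels, value) pairs.
SHARD_SAMPLES = [
    ({"class_name": "Article", "shard_name": "s1"}, 120.0),
    ({"class_name": "Article", "shard_name": "s2"}, 80.0),
    ({"class_name": "Author", "shard_name": "s1"}, 50.0),
]

# Class-level metric: drop shard_name, keep class_name.
per_class = aggregate(SHARD_SAMPLES, by=("class_name",))

# Instance-level metric: drop every label.
instance_total = aggregate(SHARD_SAMPLES)
```

In a Prometheus setup the same roll-up is normally done at query time; the sketch only shows which labels are kept at each level.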

General & build information

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| weaviate_build_info | Provides general information about the build (for example, which version is running and how long it has been running) | version, revision, branch, goVersion | Gauge |
| weaviate_runtime_config_hash | Hash value of the currently active runtime configuration, useful for tracking when new configurations take effect | sha256 | Gauge |
| weaviate_runtime_config_last_load_success | Indicates whether the last loading attempt was successful (1 for success, 0 for failure) | None | Gauge |
| weaviate_schema_collections | Shows the total number of collections at any given point | nodeID | Gauge |
| weaviate_schema_shards | Shows the total number of shards at any given point | nodeID, status (HOT, COLD, WARM, FROZEN) | Gauge |

Object and query operations

Batch operations

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| batch_durations_ms | Duration of a single batch operation in ms. The operation label further defines which operation within the batch (e.g. object, inverted, vector) is being measured. Granularity is a shard of a class. | operation, class_name, shard_name | Histogram |
| batch_delete_durations_ms | Duration of a batch delete in ms. The operation label further defines which operation within the batch delete is being measured. Granularity is a shard of a class. | operation, class_name, shard_name | Summary |
| batch_size_bytes | Size of a raw batch request in bytes | api | Summary |
| batch_size_objects | Number of objects in a batch | None | Summary |
| batch_size_tenants | Number of unique tenants referenced in a batch | None | Summary |
| batch_objects_processed_total | Number of objects processed in a batch | class_name, shard_name | Counter |
| batch_objects_processed_bytes | Number of bytes processed in a batch | class_name, shard_name | Counter |
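
Counters such as batch_objects_processed_total only ever increase, so import speed is derived from the difference between samples over time. The sketch below shows one way to do that by hand. The counter_rate helper and the sample readings are hypothetical, and treating a drop in value as a reset mirrors common Prometheus practice rather than anything Weaviate-specific.

```python
def counter_rate(samples):
    """Average per-second increase over (timestamp_seconds, value) samples.

    A drop in value is treated as a counter reset (e.g. a restart), in which
    case the new value counts as the increase since the reset.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        increase += (current - previous) if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Hypothetical readings taken at a 15 s scrape interval.
steady = [(0, 0.0), (15, 3000.0), (30, 7500.0)]
with_reset = [(0, 1000.0), (15, 1600.0), (30, 200.0)]

objects_per_second = counter_rate(steady)
objects_per_second_after_reset = counter_rate(with_reset)
```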

Object operations

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| object_count | Number of objects present. Granularity is a shard of a class. | class_name, shard_name | Gauge |
| objects_durations_ms | Duration of an individual object operation, such as put, delete, etc. as indicated by the operation label, also as part of a batch. The step label adds additional precision to each operation. Granularity is a shard of a class. | operation, step, class_name, shard_name | Summary |

Query operations

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| concurrent_queries_count | Number of concurrently running query operations | class_name, query_type | Gauge |
| queries_durations_ms | Duration of queries in milliseconds | class_name, query_type | Histogram |
| queries_filtered_vector_durations_ms | Duration of filtered vector queries in milliseconds | class_name, shard_name, operation | Summary |
| concurrent_goroutines | Number of concurrently running goroutines | class_name, query_type | Gauge |
| requests_total | Tracks all user requests and whether each was successful or failed | status, class_name, api, query_type | Gauge |
| query_dimensions_total | The vector dimensions used by any read-query that involves vectors | query_type, operation, class_name | Counter |
| query_dimensions_combined_total | The vector dimensions used by any read-query that involves vectors, aggregated across all classes and shards | None | Counter |
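
Histogram metrics such as queries_durations_ms are exposed as cumulative buckets, so a latency percentile has to be estimated from the bucket counts. The following sketch shows the usual linear-interpolation estimate, similar in spirit to PromQL's histogram_quantile(). The bucket boundaries and counts are hypothetical and do not reflect Weaviate's actual bucket layout.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Interpolates linearly inside the bucket that contains the target rank.
    The lowest bucket is assumed to start at 0.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # cannot interpolate into the +Inf bucket
            if count == lower_count:
                return upper_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative buckets: (upper bound in ms, observations <= bound).
BUCKETS = [(10.0, 50.0), (50.0, 90.0), (100.0, 100.0), (math.inf, 100.0)]

p50 = histogram_quantile(0.5, BUCKETS)
p95 = histogram_quantile(0.95, BUCKETS)
```

The estimate is only as precise as the bucket boundaries allow, which is why percentile panels built on histograms can look coarse.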

Vector index

General vector index

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| vector_index_size | The total capacity of the vector index. Typically larger than the number of vectors imported as it grows proactively. | class_name, shard_name | Gauge |
| vector_index_operations | Total number of mutating operations on the vector index. The operation itself is defined by the operation label. | operation, class_name, shard_name | Gauge |
| vector_index_durations_ms | Duration of a regular vector index operation, such as insert or delete. The operation itself is defined through the operation label. The step label adds more granularity to each operation. | operation, step, class_name, shard_name | Summary |
| vector_index_maintenance_durations_ms | Duration of a sync or async vector index maintenance operation. The operation itself is defined through the operation label. | operation, class_name, shard_name | Summary |
| vector_index_tombstones | Number of currently active tombstones in the vector index. Will go up on each incoming delete and go down after a completed repair operation. | class_name, shard_name | Gauge |
| vector_index_tombstone_cleanup_threads | Number of currently active threads for repairing/cleaning up the vector index after deletes have occurred. | class_name, shard_name | Gauge |
| vector_index_tombstone_cleaned | Total number of deleted and removed vectors after repair operations. | class_name, shard_name | Counter |
| vector_index_tombstone_unexpected_total | Total number of unexpected tombstones that were found, for example because a vector was not found for an existing id in the index | class_name, shard_name, operation | Counter |
| vector_index_tombstone_cycle_start_timestamp_seconds | Unix epoch timestamp of the start of the current tombstone cleanup cycle | class_name, shard_name | Gauge |
| vector_index_tombstone_cycle_end_timestamp_seconds | Unix epoch timestamp of the end of the last tombstone cleanup cycle. A negative value indicates that the cycle is still running. | class_name, shard_name | Gauge |
| vector_index_tombstone_cycle_progress | A ratio (percentage) of the progress of the current tombstone cleanup cycle. 0 indicates the very beginning, 1 is a complete cycle. | class_name, shard_name | Gauge |
| vector_dimensions_sum | Total dimensions in a shard | class_name, shard_name | Gauge |
| vector_segments_sum | Total segments in a shard if quantization is enabled | class_name, shard_name | Gauge |

Vector index (IVF-specific)

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| vector_index_postings | The size of the vector index postings. Typically much lower than number of vectors. | class_name, shard_name | Gauge |
| vector_index_posting_size_vectors | The size of individual vectors in each posting list | class_name, shard_name | Histogram |
| vector_index_pending_background_operations | Number of background operations yet to be processed | operation, class_name, shard_name | Gauge |
| vector_index_background_operations_durations_ms | Duration of typical vector index background operations (split, merge, reassign) | operation, class_name, shard_name | Summary |
| vector_index_store_operations_durations_ms | Duration of store operations (put, append, get) | operation, class_name, shard_name | Summary |

Async index queue

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| queue_size | Number of records in the queue | class_name, shard_name | Gauge |
| queue_disk_usage | Disk usage of the queue | class_name, shard_name | Gauge |
| queue_paused | Whether the queue is paused | class_name, shard_name | Gauge |
| queue_count | Number of queues | class_name, shard_name | Gauge |
| queue_partition_processing_duration_ms | Duration in ms of a single partition processing | class_name, shard_name | Histogram |
| vector_index_queue_insert_count | Number of insert operations added to the vector index queue | class_name, shard_name, target_vector | Counter |
| vector_index_queue_delete_count | Number of delete operations added to the vector index queue | class_name, shard_name, target_vector | Counter |

Tombstone management

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| tombstone_find_local_entrypoint | Total number of tombstone delete local entrypoint calls | class_name, shard_name | Counter |
| tombstone_find_global_entrypoint | Total number of tombstone delete global entrypoint calls | class_name, shard_name | Counter |
| tombstone_reassign_neighbors | Total number of tombstone reassign neighbor calls | class_name, shard_name | Counter |
| tombstone_delete_list_size | Delete list size of tombstones | class_name, shard_name | Gauge |

LSM store

The following sections provide detailed metrics for LSM (Log-Structured Merge-tree) bucket operations and replication functionality.

General LSM store

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_active_segments | Number of currently present segments per shard. Granularity is shard of a class. Grouped by strategy. | strategy, class_name, shard_name, path | Gauge |
| lsm_objects_bucket_segment_count | Number of segments per shard in the objects bucket | strategy, class_name, shard_name, path | Gauge |
| lsm_compressed_vecs_bucket_segment_count | Number of segments per shard in the vectors_compressed bucket | strategy, class_name, shard_name, path | Gauge |
| lsm_segment_count | Number of segments by level | strategy, class_name, shard_name, path, level | Gauge |
| lsm_segment_objects | Number of entries per LSM segment by level. Granularity is shard of a class. Grouped by strategy and level. | strategy, class_name, shard_name, path, level | Gauge |
| lsm_segment_size | Size of LSM segment by level and unit | strategy, class_name, shard_name, path, level, unit | Gauge |
| lsm_segment_unloaded | Number of unloaded segments | strategy, class_name, shard_name, path | Gauge |
| lsm_memtable_size | Size of memtable by path | strategy, class_name, shard_name, path | Gauge |
| lsm_memtable_durations_ms | Time in ms for a bucket operation to complete | strategy, class_name, shard_name, path, operation | Summary |
| lsm_bitmap_buffers_usage | Number of bitmap buffers used by size | size, operation | Counter |

LSM bucket operations

These metrics track read and write operations on LSM buckets, providing detailed visibility into database performance.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_bucket_read_operation_count | Total number of LSM bucket read operations requested | operation (get), component (active_memtable, flushing_memtable, segment_group) | Counter |
| lsm_bucket_read_operation_ongoing | Number of LSM bucket read operations currently in progress | operation (get), component (active_memtable, flushing_memtable, segment_group) | Gauge |
| lsm_bucket_read_operation_failure_count | Number of failed LSM bucket read operations | operation (get), component (active_memtable, flushing_memtable, segment_group) | Counter |
| lsm_bucket_read_operation_duration_seconds | Duration of LSM bucket read operations in seconds | operation (get), component (active_memtable, flushing_memtable, segment_group) | Histogram |
| lsm_bucket_write_operation_count | Total number of LSM bucket write operations requested | operation (put, delete) | Counter |
| lsm_bucket_write_operation_ongoing | Number of LSM bucket write operations currently in progress | operation (put, delete) | Gauge |
| lsm_bucket_write_operation_failure_count | Number of failed LSM bucket write operations | operation (put, delete) | Counter |
| lsm_bucket_write_operation_duration_seconds | Duration of LSM bucket write operations in seconds | operation (put, delete) | Histogram |

LSM bucket lifecycle

These metrics track the initialization and shutdown of LSM buckets.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_bucket_init_count | Total number of LSM bucket initializations requested | strategy | Counter |
| lsm_bucket_init_in_progress | Number of LSM bucket initializations currently in progress | strategy | Gauge |
| lsm_bucket_init_failure_count | Number of failed LSM bucket initializations | strategy | Counter |
| lsm_bucket_init_duration_seconds | Duration of LSM bucket initialization in seconds | strategy | Histogram |
| lsm_bucket_shutdown_count | Total number of LSM bucket shutdowns requested | strategy | Counter |
| lsm_bucket_shutdown_in_progress | Number of LSM bucket shutdowns currently in progress | strategy | Gauge |
| lsm_bucket_shutdown_duration_seconds | Duration of LSM bucket shutdown in seconds | strategy | Histogram |
| lsm_bucket_shutdown_failure_count | Number of failed LSM bucket shutdowns | strategy | Counter |

LSM bucket cursors

These metrics track cursor usage patterns in LSM buckets.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_bucket_opened_cursors | Number of opened LSM bucket cursors | strategy | Counter |
| lsm_bucket_open_cursors | Number of currently open LSM bucket cursors | strategy | Gauge |
| lsm_bucket_cursor_duration_seconds | Duration of LSM bucket cursor operations in seconds | strategy | Histogram |

LSM segment metrics

These metrics provide visibility into LSM segment storage and size distribution.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_bucket_segment_total | Total number of LSM bucket segments | strategy | Gauge |
| lsm_bucket_segment_size_bytes | Size of LSM bucket segments in bytes | strategy | Histogram |

LSM compaction

These metrics track compaction operations that merge and optimize LSM segments.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_bucket_compaction_count | Total number of LSM bucket compactions requested | strategy | Counter |
| lsm_bucket_compaction_in_progress | Number of LSM bucket compactions currently in progress | strategy | Gauge |
| lsm_bucket_compaction_failure_count | Number of failed LSM bucket compactions | strategy | Counter |
| lsm_bucket_compaction_noop_count | Number of times the periodic LSM bucket compaction task ran but found nothing to compact | strategy | Counter |
| lsm_bucket_compaction_duration_seconds | Duration of LSM bucket compaction in seconds | strategy | Histogram |

LSM memtable operations

These metrics track memtable flush operations that persist in-memory data to disk.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_memtable_flush_total | Total number of LSM memtable flushes | strategy | Counter |
| lsm_memtable_flush_in_progress | Number of LSM memtable flushes in progress | strategy | Gauge |
| lsm_memtable_flush_failures_total | Total number of failed LSM memtable flushes | strategy | Counter |
| lsm_memtable_flush_duration_seconds | Duration of LSM memtable flush in seconds | strategy | Histogram |
| lsm_memtable_flush_size_bytes | Size of LSM memtable at flushing time, in bytes | strategy | Histogram |

LSM WAL recovery

These metrics track Write-Ahead Log (WAL) recovery operations during startup.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| lsm_bucket_wal_recovery_count | Total number of LSM bucket WAL recoveries requested | strategy | Counter |
| lsm_bucket_wal_recovery_in_progress | Number of LSM bucket WAL recoveries currently in progress | strategy | Gauge |
| lsm_bucket_wal_recovery_failure_count | Number of failed LSM bucket WAL recoveries | strategy | Counter |
| lsm_bucket_wal_recovery_duration_seconds | Duration of LSM bucket WAL recovery in seconds | strategy | Histogram |

Schema & cluster consensus

Schema & RAFT consensus

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| schema_writes_seconds | Duration of schema writes (which always involve the leader) | type | Summary |
| schema_reads_local_seconds | Duration of local schema reads that do not involve the leader | type | Summary |
| schema_reads_leader_seconds | Duration of schema reads that are passed to the leader | type | Summary |
| schema_wait_for_version_seconds | Duration of waiting for a schema version to be reached | type | Summary |

Schema transactions (deprecated)

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| schema_tx_opened_total | Total number of opened schema transactions | ownership | Counter |
| schema_tx_closed_total | Total number of closed schema transactions. A close must be either successful or failed | ownership, status | Counter |
| schema_tx_duration_seconds | Mean duration of a tx by status | ownership, status | Summary |

RAFT metrics (internal)

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| weaviate_internal_counter_raft_apply | Number of transactions in the configured interval | None | Counter |
| weaviate_internal_counter_raft_state_candidate | Number of times the raft server initiated an election | None | Counter |
| weaviate_internal_counter_raft_state_follower | Number of times in the configured interval that the raft server became a follower | None | Summary |
| weaviate_internal_counter_raft_state_leader | Number of times the raft server became a leader | None | Counter |
| weaviate_internal_counter_raft_transition_heartbeat_timeout | Number of times that the node transitioned to candidate state after not receiving a heartbeat message from the last known leader | None | Counter |
| weaviate_internal_gauge_raft_commitNumLogs | Number of logs processed for application to the finite state machine in a single batch | None | Gauge |
| weaviate_internal_gauge_raft_leader_dispatchNumLogs | Number of logs committed to disk in the most recent batch | None | Gauge |
| weaviate_internal_gauge_raft_leader_oldestLogAge | The number of milliseconds since the oldest log in the leader's log store was written | None | Gauge |
| weaviate_internal_gauge_raft_peers | The number of peers in the raft cluster configuration | None | Gauge |
| weaviate_internal_sample_raft_boltdb_logBatchSize | Measures the total size in bytes of logs being written to the db in a single batch | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_sample_raft_boltdb_logSize | Measures the size of logs being written to the db | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_sample_raft_boltdb_logsPerBatch | Measures the number of logs being written per batch to the db | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_sample_raft_boltdb_writeCapacity | Theoretical write capacity in terms of the number of logs that can be written per second | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_sample_raft_thread_fsm_saturation | An approximate measurement of the proportion of time the Raft FSM goroutine is busy and unavailable to accept new work | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_sample_raft_thread_main_saturation | An approximate measurement of the proportion of time the main Raft goroutine is busy and unavailable to accept new work (percentage) | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_timer_raft_boltdb_getLog | Measures the amount of time spent reading logs from the db (in ms) | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_timer_raft_boltdb_storeLogs | Time required to record any outstanding logs since the last request to append entries for the given node | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_timer_raft_commitTime | Time required to commit a new entry to the raft log on the leader node | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_timer_raft_fsm_apply | Number of logs committed by the finite state machine since the last interval | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_timer_raft_fsm_enqueue | Time required to queue up a batch of logs for the finite state machine to apply | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_timer_raft_leader_dispatchLog | Time required for the leader node to write a log entry to disk | quantile=0.5, 0.9, 0.99 | Summary |

Memberlist (internal)

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| weaviate_internal_sample_memberlist_queue_broadcasts | Shows the number of messages in the broadcast queue of Memberlist | quantile=0.5, 0.9, 0.99 | Summary |
| weaviate_internal_timer_memberlist_gossip | Shows the latency distribution of each gossip made in Memberlist | quantile=0.5, 0.9, 0.99 | Summary |

System resources

File I/O & memory

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| file_io_writes_total_bytes | Total number of bytes written to disk | operation, strategy | Summary |
| file_io_reads_total_bytes | Total number of bytes read from disk | operation | Summary |
| mmap_operations_total | Total number of mmap operations | operation, strategy | Counter |
| mmap_proc_maps | Number of entries in /proc/self/maps | None | Gauge |

Async operations

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| async_operations_running | Number of currently running async operations. The operation itself is defined through the operation label. | operation, class_name, shard_name, path | Gauge |

Checksum

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| checksum_validation_duration_seconds | Duration of checksum validation | None | Summary |
| checksum_bytes_read | Number of bytes read during checksum validation | None | Summary |

Startup

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| startup_progress | A ratio (percentage) of startup progress for a particular component in a shard | operation, class_name, shard_name | Gauge |
| startup_durations_ms | Duration of individual startup operations in ms. The operation itself is defined through the operation label. | operation, class_name, shard_name | Summary |
| startup_diskio_throughput | Disk I/O throughput in bytes/s during startup operations, such as reading back the HNSW index or recovering LSM segments. The operation itself is defined by the operation label. | operation, class_name, shard_name | Summary |

Backup & restore

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| backup_restore_ms | Duration of a backup restore | backend_name, class_name | Summary |
| backup_restore_class_ms | Duration of restoring a class | class_name | Summary |
| backup_restore_init_ms | Startup phase of a backup restore | backend_name, class_name | Summary |
| backup_restore_from_backend_ms | File transfer stage of a backup restore | backend_name, class_name | Summary |
| backup_store_to_backend_ms | File transfer stage of a backup store | backend_name, class_name | Summary |
| bucket_pause_durations_ms | Bucket pause durations | bucket_dir | Summary |
| backup_restore_data_transferred | Total number of bytes transferred during a backup restore | backend_name, class_name | Counter |
| backup_store_data_transferred | Total number of bytes transferred during a backup store | backend_name, class_name | Counter |

Shard management

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| shards_loaded | Number of shards loaded | None | Gauge |
| shards_unloaded | Number of shards not loaded | None | Gauge |
| shards_loading | Number of shards in the process of loading | None | Gauge |
| shards_unloading | Number of shards in the process of unloading | None | Gauge |
| weaviate_index_shards_total | Total number of shards per index status | status (READONLY, INDEXING, LOADING, READY, SHUTDOWN) | Gauge |
| weaviate_index_shard_status_update_duration_seconds | Time taken to update shard status in seconds | status (READONLY, INDEXING, LOADING, READY, SHUTDOWN) | Histogram |

Modules & extensions

Vectorization (Text2Vec)

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| t2v_concurrent_batches | Number of batches currently running | vectorizer | Gauge |
| t2v_batch_queue_duration_seconds | Time a batch spends in specific portions of the queue | vectorizer, operation | Histogram |
| t2v_request_duration_seconds | Duration of an individual request to the vectorizer | vectorizer | Histogram |
| t2v_tokens_in_batch | Number of tokens in a user-defined batch | vectorizer | Histogram |
| t2v_tokens_in_request | Number of tokens in an individual request sent to the vectorizer | vectorizer | Histogram |
| t2v_rate_limit_stats | Rate limit stats for the vectorizer | vectorizer, stat | Gauge |
| t2v_repeat_stats | Why batch scheduling is repeated | vectorizer, stat | Gauge |
| t2v_requests_per_batch | Number of requests required to process an entire (user) batch | vectorizer | Histogram |

Tokenizer

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| tokenizer_duration_seconds | Duration of a tokenizer operation | tokenizer | Histogram |
| tokenizer_requests_total | Number of tokenizer requests | tokenizer | Counter |
| tokenizer_initialize_duration_seconds | Duration of a tokenizer initialization operation | tokenizer | Histogram |
| token_count_total | Number of tokens processed | tokenizer | Counter |
| token_count_per_request | Number of tokens processed per request | tokenizer | Histogram |

Module & external API

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| weaviate_module_requests_total | Number of module requests to external APIs | op, api | Counter |
| weaviate_module_request_duration_seconds | Duration of an individual request to a module external API | op, api | Histogram |
| weaviate_module_requests_per_batch | Number of items in a batch | op, api | Histogram |
| weaviate_module_request_size_bytes | Size (in bytes) of the request sent to an external API | op, api | Histogram |
| weaviate_module_response_size_bytes | Size (in bytes) of the response received from an external API | op, api | Histogram |
| weaviate_vectorizer_request_tokens | Number of tokens in the request sent to an external vectorizer | inout, api | Histogram |
| weaviate_module_request_single_count | Number of single-item external API requests | op, api | Counter |
| weaviate_module_request_batch_count | Number of batched module requests | op, api | Counter |
| weaviate_module_error_total | Number of module errors | op, module, endpoint, status_code | Counter |
| weaviate_module_call_error_total | Number of module errors (related to external calls) | module, endpoint, status_code | Counter |
| weaviate_module_response_status_total | Number of API response statuses | op, endpoint, status | Counter |
| weaviate_module_batch_error_total | Number of batch errors | operation, class_name | Counter |

Usage Tracking

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| weaviate_usage_{gcs\|s3}_operations_total | Total number of operations for module labels | operation (collect/upload), status (success/error) | Counter |
| weaviate_usage_{gcs\|s3}_operation_latency_seconds | Latency of usage operations in seconds | operation (collect/upload) | Histogram |
| weaviate_usage_{gcs\|s3}_resource_count | Number of resources tracked by module | resource_type (collections/shards/backups) | Gauge |
| weaviate_usage_{gcs\|s3}_uploaded_file_size_bytes | Size of the uploaded usage file in bytes | None | Gauge |

Replication

Async replication

These metrics track asynchronous replication operations for maintaining data consistency across replicas.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| async_replication_goroutines_running | Number of currently running async replication goroutines | type (hashbeater, hashbeat_trigger) | Gauge |
| async_replication_hashtree_init_count | Count of async replication hashtree initializations | None | Counter |
| async_replication_hashtree_init_running | Number of currently running hashtree initializations | None | Gauge |
| async_replication_hashtree_init_failure_count | Count of async replication hashtree initialization failures | None | Counter |
| async_replication_hashtree_init_duration_seconds | Duration of hashtree initialization in seconds | None | Histogram |
| async_replication_iteration_count | Count of async replication comparison iterations | None | Counter |
| async_replication_iteration_failure_count | Count of async replication iteration failures | None | Counter |
| async_replication_iteration_duration_seconds | Duration of async replication comparison iterations in seconds | None | Histogram |
| async_replication_hashtree_diff_duration_seconds | Duration of async replication hashtree diff computation in seconds | None | Histogram |
| async_replication_object_digests_diff_duration_seconds | Duration of async replication object digests diff computation in seconds | None | Histogram |
| async_replication_propagation_count | Count of async replication propagation executions | None | Counter |
| async_replication_propagation_failure_count | Count of async replication propagation failures | None | Counter |
| async_replication_propagation_object_count | Count of objects propagated by async replication | None | Counter |
| async_replication_propagation_duration_seconds | Duration of async replication propagation in seconds | None | Histogram |

Replication coordinator

These metrics track the replication coordinator's read and write operations across replicas.

| Metric | Description | Labels | Type |
| --- | --- | --- | --- |
| replication_coordinator_writes_succeed_all | Count of requests succeeding a write to all replicas | None | Counter |
| replication_coordinator_writes_succeed_some | Count of requests succeeding a write to some replicas > CL but less than all | None | Counter |
| replication_coordinator_writes_failed | Count of requests failing due to consistency level | None | Counter |
| replication_coordinator_reads_succeed_all | Count of requests succeeding a read from CL replicas | None | Counter |
| replication_coordinator_reads_succeed_some | Count of requests succeeding a read from some replicas < CL but more than zero | None | Counter |
| replication_coordinator_reads_failed | Count of requests failing due to read from replicas | None | Counter |
| replication_read_repair_count | Count of read repairs started | None | Counter |
| replication_read_repair_failure | Count of read repairs failed | None | Counter |
| replication_coordinator_writes_duration_seconds | Duration in seconds of write operations to replicas | None | Histogram |
| replication_coordinator_reads_duration_seconds | Duration in seconds of read operations from replicas | None | Histogram |
| replication_read_repair_duration_seconds | Duration in seconds of read repair operations | None | Histogram |
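
The coordinator write counters can be combined into a single failure ratio. This sketch rests on an assumption not confirmed by this page: that each write request increments exactly one of the three write counters (succeed_all, succeed_some, failed). The write_failure_ratio helper and the sample counts are hypothetical.

```python
def write_failure_ratio(succeed_all, succeed_some, failed):
    """Share of coordinator writes that failed, as a value in [0, 1].

    Assumes the three counters are mutually exclusive outcomes of a write.
    """
    total = succeed_all + succeed_some + failed
    return failed / total if total else 0.0


# Hypothetical counter increases over the same time window.
ratio = write_failure_ratio(succeed_all=940, succeed_some=50, failed=10)
```

Pass counter increases over a window, not raw totals, so that old failures do not mask recent behavior.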

Sample Dashboards

Weaviate does not ship with any dashboards by default, but here is a list of dashboards being used by the various Weaviate teams, both during development, and when helping users. These do not come with any support, but may still be helpful. Treat them as inspiration to design your own dashboards which fit your uses perfectly:

| Dashboard | Purpose |
| --- | --- |
| Cluster Workload in Kubernetes | Visualize cluster workload, usage and activity in Kubernetes |
| Importing Data Into Weaviate | Visualize speed of import operations (including its components, such as object store, inverted index, and vector index) |
| Object Operations | Visualize speed of whole object operations, such as GET, PUT, etc. |
| Vector Index | Visualize the current state, as well as operations on the HNSW vector index |
| LSM Stores | Get insights into the internals (including segments) of the various LSM stores within Weaviate |
| Startup | Visualize the startup process, including recovery operations |
| Usage | Obtain usage metrics, such as number of objects imported, etc. |
| Async index queue | Observe index queue activity |

nodes API Endpoint

To get collection details programmatically, use the nodes REST endpoint.

The nodes endpoint returns an array of nodes. The nodes have the following fields:

  • name: Name of the node.
  • status: Status of the node (one of: HEALTHY, UNHEALTHY, UNAVAILABLE, INDEXING).
  • version: Version of Weaviate running on the node.
  • gitHash: Short git hash of the latest commit of Weaviate running on the node.
  • stats: Statistics for the node.
    • shardCount: Total number of shards on the node.
    • objectCount: Total number of indexed objects on the node.
  • shards: Array of shard statistics. To see shard details, set the output parameter to verbose.
    • name: Name of the shard.
    • class: Name of the collection stored on the shard.
    • objectCount: Number of indexed objects on the shard.
    • vectorQueueLength: Number of objects waiting to be indexed on the shard. (Available starting in Weaviate 1.22 when ASYNC_INDEXING is enabled.)
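
The fields listed above can be condensed into a per-node summary. The following is a sketch under stated assumptions, not a reference client: the helper names are hypothetical, the /v1/nodes?output=verbose path and the top-level "nodes" key are assumptions about the response shape, and the sample payload is invented for illustration.

```python
import json
from urllib.request import urlopen


def summarize_nodes(payload):
    """Return one summary dict per node from a parsed nodes response."""
    summaries = []
    for node in payload.get("nodes", []):
        shards = node.get("shards") or []  # absent unless verbose output
        summaries.append({
            "name": node.get("name"),
            "status": node.get("status"),
            "objectCount": (node.get("stats") or {}).get("objectCount", 0),
            # Objects still waiting to be indexed across all shards.
            "queued": sum(s.get("vectorQueueLength", 0) for s in shards),
        })
    return summaries


def fetch_nodes(base_url, timeout=5):
    """Fetch verbose node details (assumed path: /v1/nodes?output=verbose)."""
    with urlopen(f"{base_url}/v1/nodes?output=verbose", timeout=timeout) as r:
        return json.load(r)


# Hypothetical response for illustration only.
SAMPLE_RESPONSE = {
    "nodes": [
        {
            "name": "node1",
            "status": "HEALTHY",
            "stats": {"shardCount": 2, "objectCount": 200},
            "shards": [
                {"name": "s1", "class": "Article", "objectCount": 120,
                 "vectorQueueLength": 5},
                {"name": "s2", "class": "Article", "objectCount": 80,
                 "vectorQueueLength": 0},
            ],
        }
    ]
}

summary = summarize_nodes(SAMPLE_RESPONSE)
```

A non-zero "queued" value is only meaningful when ASYNC_INDEXING is enabled, as noted for vectorQueueLength above.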
