Query profiling
Added in `v1.36.9`

Query profiling provides per-shard timing breakdowns for search queries. Enable it on any search request to see how long each phase takes — vector search, keyword scoring, filter evaluation, object retrieval — broken down by shard and cluster node.
Profiling uses the same instrumentation as slow query logging. It adds minimal overhead when enabled and zero overhead when disabled.
Enable profiling
Add `query_profile=True` to `MetadataQuery`, or include `"query_profile"` in the metadata list:
```python
import weaviate
from weaviate.classes.query import MetadataQuery

client = weaviate.connect_to_local()
collection = client.collections.get("Article")

response = collection.query.near_vector(
    near_vector=[0.1, 0.2, 0.3],
    limit=5,
    return_metadata=MetadataQuery(query_profile=True, distance=True),
)

if response.query_profile:
    for shard in response.query_profile.shards:
        print(f"Shard: {shard.name} (node: {shard.node})")
        for search_type, profile in shard.searches.items():
            print(f"  [{search_type}]")
            for key, value in profile.details.items():
                print(f"    {key}: {value}")
```
Profile data is returned on the response object at response.query_profile, not on individual result objects. It represents the entire query across all shards.
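Because the profile lives on the response object, it is easy to flatten into a single dict for logging or comparison. The sketch below assumes only the `shards` / `searches` / `details` shape shown above; the `SimpleNamespace` objects stand in for a real response and carry illustrative values, not real measurements.

```python
from types import SimpleNamespace

def summarize_profile(profile):
    # Flatten each shard's per-search details into one dict per shard,
    # keyed as "<search type>.<metric>", plus the executing node.
    summary = {}
    for shard in profile.shards:
        entry = {"node": shard.node}
        for search_type, search_profile in shard.searches.items():
            for key, value in search_profile.details.items():
                entry[f"{search_type}.{key}"] = value
        summary[shard.name] = entry
    return summary

# Stand-in objects mimicking the profile shape (illustrative values only)
profile = SimpleNamespace(shards=[
    SimpleNamespace(
        name="shard_0",
        node="weaviate-0",
        searches={"vector": SimpleNamespace(details={"total_took": "198.666µs"})},
    ),
])
print(summarize_profile(profile))
```

A flattened summary like this is convenient to attach to structured logs alongside the query itself.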
Supported search types
| Search type | Profile sections | Query methods |
|---|---|---|
| Vector search | vector | near_vector, near_object, near_text, near_image, etc. |
| Keyword search (BM25) | keyword | bm25 |
| Hybrid search | vector + keyword | hybrid |
| Object fetch | object | fetch_objects |
| Any search + filters | Includes filter metrics | Add filters to any search |
| Any search + groupBy | Profile at query level | Add group_by to any search |
BM25 example
```python
from weaviate.classes.query import MetadataQuery

collection = client.collections.get("Article")

response = collection.query.bm25(
    query="machine learning",
    return_metadata=MetadataQuery(query_profile=True, score=True),
)

if response.query_profile:
    for shard in response.query_profile.shards:
        print(f"Shard: {shard.name} (node: {shard.node})")
        for search_type, profile in shard.searches.items():
            print(f"  [{search_type}]")
            for key, value in profile.details.items():
                print(f"    {key}: {value}")
```
Hybrid example
Hybrid search produces both vector and keyword profile sections per shard:
```python
from weaviate.classes.query import MetadataQuery

collection = client.collections.get("Article")

response = collection.query.hybrid(
    query="machine learning",
    return_metadata=MetadataQuery(query_profile=True),
    limit=5,
)

if response.query_profile:
    for shard in response.query_profile.shards:
        print(f"Shard: {shard.name} (node: {shard.node})")
        for search_type, profile in shard.searches.items():
            print(f"  [{search_type}]")
            for key, value in profile.details.items():
                print(f"    {key}: {value}")
```
Response structure
The profile is structured as:
```text
response.query_profile
└── shards[]
    ├── name      # Shard identifier (e.g. "shard_0")
    ├── node      # Cluster node (e.g. "weaviate-0")
    └── searches  # Dict of search type → profile
        ├── "vector"  → details: { key: value, ... }
        ├── "keyword" → details: { key: value, ... }
        └── "object"  → details: { key: value, ... }
```
Each search type contains a details dict with string key-value pairs. The available metrics depend on the query type, index configuration, and filter usage.
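Since every value in the `details` dict is a string, timing metrics need to be parsed before you can compare or aggregate them. In the example output later on this page, durations appear as Go-style duration strings (`242.75µs`, `14µs`); that format is an assumption drawn from those samples, not a documented contract, so the sketch below returns `None` for anything it cannot parse (counts, booleans, method names).

```python
import re

# Multipliers for the duration suffixes seen in sample profile output.
_UNITS = {"ns": 1e-9, "µs": 1e-6, "us": 1e-6, "ms": 1e-3, "s": 1.0}

def parse_duration(value: str):
    """Return the duration in seconds, or None for non-duration values."""
    m = re.fullmatch(r"([0-9.]+)(ns|µs|us|ms|s)", value)
    if m is None:
        return None
    return float(m.group(1)) * _UNITS[m.group(2)]

print(parse_duration("242.75µs"))  # about 0.00024275 seconds
print(parse_duration("847"))       # None: a count, not a duration
```

Filtering out the `None` results lets you sum or rank only the genuine timing metrics.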
Available metrics
General metrics
| Metric | Description | Present when |
|---|---|---|
| total_took | Total time for this shard's search | Always |
| objects_took | Time retrieving objects from storage | Always |
| sort_took | Time sorting results | When sorting is applied |
Vector search metrics
| Metric | Description |
|---|---|
| vector_search_took | Time spent in vector index search |
| knn_search_layer_N_took | Per-layer HNSW graph traversal time (N = layer number) |
| knn_search_rescore_took | Time rescoring compressed vectors (PQ/BQ/SQ) |
| hnsw_flat_search | Whether flat (brute-force) search was used instead of HNSW ("true" or "false") |
Filter metrics
| Metric | Description |
|---|---|
| filters_build_allow_list_took | Time building the filter allow-list |
| filters_ids_matched | Number of object IDs matching the filter |
BM25 keyword metrics
| Metric | Description |
|---|---|
| kwd_method | BM25 scoring method used (e.g., blockmaxwand) |
| kwd_time | Total BM25 scoring time |
| kwd_1_tok_time | Query tokenization time |
| kwd_3_term_time | Term dictionary lookup time |
| kwd_4_bmw_time | BlockMaxWAND scoring time |
| kwd_6_res_count | Number of results from keyword scoring |
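The numbered phase metrics make it easy to see where BM25 time goes on a shard. A quick sketch using the keyword timings from the example output below (values hardcoded here in microseconds for illustration):

```python
# Keyword phase timings for one shard, in microseconds
# (taken from the sample profile output on this page).
kwd = {
    "kwd_time": 242.75,        # total BM25 scoring time
    "kwd_1_tok_time": 18.291,  # tokenization
    "kwd_3_term_time": 52.083, # term dictionary lookup
    "kwd_4_bmw_time": 156.417, # BlockMaxWAND scoring
}

for phase in ("kwd_1_tok_time", "kwd_3_term_time", "kwd_4_bmw_time"):
    share = 100 * kwd[phase] / kwd["kwd_time"]
    print(f"{phase}: {share:.1f}% of kwd_time")
```

Here BlockMaxWAND scoring dominates at roughly 64% of total keyword time, which is the typical place to look first when BM25 queries are slow.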
Example output
A hybrid search on a 3-node cluster with filters produces profiles for both vector and keyword phases on each shard:
```text
Shard: shard_abc (node: weaviate-0)
  [keyword]
    kwd_method: blockmaxwand
    kwd_time: 242.75µs
    kwd_1_tok_time: 18.291µs
    kwd_3_term_time: 52.083µs
    kwd_4_bmw_time: 156.417µs
    total_took: 248.833µs
  [vector]
    filters_build_allow_list_took: 31.125µs
    filters_ids_matched: 847
    knn_search_layer_0_took: 14µs
    objects_took: 153.542µs
    total_took: 198.666µs
    vector_search_took: 40.959µs

Shard: shard_def (node: weaviate-1)
  [keyword]
    kwd_method: blockmaxwand
    kwd_time: 189.333µs
    total_took: 195.25µs
  [vector]
    filters_build_allow_list_took: 27.458µs
    filters_ids_matched: 912
    total_took: 172.417µs
    vector_search_took: 35.75µs
```
Multi-node behavior
In multi-node clusters, the coordinator node aggregates profile data from all shards across all nodes. Each shard profile includes the node field identifying which cluster node executed that shard's search. This makes it straightforward to identify performance imbalances across nodes.
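One way to spot such an imbalance is to sum each shard's `total_took` per node and compare the extremes. The numbers below are the vector-phase totals from the example output above, hardcoded for illustration:

```python
# Per-node sums of shard total_took values, in microseconds
# (illustrative values from the sample profile output).
node_totals = {
    "weaviate-0": 198.666,
    "weaviate-1": 172.417,
}

slowest = max(node_totals, key=node_totals.get)
fastest = min(node_totals, key=node_totals.get)
ratio = node_totals[slowest] / node_totals[fastest]
print(f"{slowest} is {ratio:.2f}x slower than {fastest}")
```

A ratio near 1.0 means the load is balanced; a consistently high ratio for one node suggests a data-distribution or hardware problem worth investigating.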
Performance impact
- When disabled (default): Zero overhead. A single boolean check skips all profiling code paths.
- When enabled: Adds timing instrumentation to each shard search. The overhead is small (microsecond-level timer reads) but measurable under high-throughput workloads. Use for debugging and optimization, not in production hot paths.
Limitations
- Response-level only: Profile data is on response.query_profile, not on individual result objects; it represents the entire query.
- Search phases only: Profiling covers vector search, keyword scoring, and filter evaluation. It does not include time spent in generative modules, rerankers, or post-processing.
- No per-object breakdown: You get per-shard timing, not per-object timing.
- Metrics vary by query: Not all metrics appear in every response. Available metrics depend on the search type, index type (HNSW vs. flat), compression settings, and whether filters are used.
Questions and feedback
If you have any questions or feedback, let us know in the user forum.
