Aggregate data

Aggregate queries process the result set to return calculated results. Use aggregate queries for groups of objects or the entire result set.

Additional information

To run an Aggregate query, specify the following:

  • A target collection to search

  • One or more aggregated properties, such as:

    • A meta property
    • An object property
    • The groupedBy property
  • At least one sub-property for each selected property

For details, see Aggregate.

Retrieve the count meta property

Return the number of objects matched by the query.

Code snippets in the documentation reflect the latest client library and Weaviate Database version. Check the Release notes for specific versions.

If a snippet doesn't work or you have feedback, please open a GitHub issue.

# Assumes `client` is an already-connected Weaviate client instance.
jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.over_all(total_count=True)

print(response.total_count)

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "meta": {
            "count": 10000
          }
        }
      ]
    }
  }
}

Aggregate text properties

This example returns the most frequent values of the answer text property, along with how often each one occurs:

from weaviate.classes.query import Metrics

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.over_all(
    return_metrics=Metrics("answer").text(
        top_occurrences_count=True,
        top_occurrences_value=True,
        min_occurrences=5,  # Threshold minimum count
    )
)

print(response.properties["answer"].top_occurrences)

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "answer": {
            "count": 10000,
            "topOccurrences": [
              {
                "occurs": 19,
                "value": "Australia"
              },
              {
                "occurs": 18,
                "value": "Hawaii"
              },
              {
                "occurs": 16,
                "value": "Boston"
              },
              {
                "occurs": 15,
                "value": "French"
              },
              {
                "occurs": 15,
                "value": "India"
              }
            ],
            "type": "text"
          }
        }
      ]
    }
  }
}
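
To make the idea of "top occurrences" concrete, here is a minimal plain-Python sketch of the same kind of frequency count. It does not call Weaviate. The sample values, the helper name `top_occurrences`, and its `limit` argument are all illustrative assumptions rather than part of the client API.

```python
from collections import Counter

# Hypothetical sample of `answer` values; not taken from the real dataset.
answers = ["Australia", "Hawaii", "Australia", "Boston", "Hawaii", "Australia"]


def top_occurrences(values, limit):
    """Return the `limit` most frequent values as {"value", "occurs"} dicts."""
    return [
        {"value": value, "occurs": occurs}
        for value, occurs in Counter(values).most_common(limit)
    ]


result = top_occurrences(answers, limit=2)
print(result)
```

Each entry pairs a value with its occurrence count, mirroring the `value` and `occurs` fields in the example response.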

Aggregate int properties

This example returns the sum, minimum, and maximum of the points integer property.

from weaviate.classes.query import Metrics

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.over_all(
    # Use `.number` for floats (`NUMBER` datatype in Weaviate)
    return_metrics=Metrics("points").integer(sum_=True, maximum=True, minimum=True),
)

print(response.properties["points"].sum_)
print(response.properties["points"].minimum)
print(response.properties["points"].maximum)

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "points": {
            "count": 10000,
            "sum": 6324100
          }
        }
      ]
    }
  }
}
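
As a rough illustration of what the requested metrics represent, the following plain-Python sketch computes them over a small list. The values and the helper name `integer_metrics` are hypothetical, and the code does not use Weaviate.

```python
# Hypothetical `points` values for a handful of objects.
points = [200, 400, 1000, 600]


def integer_metrics(values):
    """Compute count, sum, minimum, and maximum over a list of integers."""
    return {
        "count": len(values),
        "sum": sum(values),
        "minimum": min(values),
        "maximum": max(values),
    }


metrics = integer_metrics(points)
print(metrics)
```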

Aggregate groupedBy properties

To group your results, use groupBy in the query.

To retrieve aggregate data for each group, use the groupedBy properties.

from weaviate.classes.aggregate import GroupByAggregate

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.over_all(
    group_by=GroupByAggregate(prop="round")
)

# Print each round name and the object count for that round
for group in response.groups:
    print(f"Value: {group.grouped_by.value} Count: {group.total_count}")

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "groupedBy": {
            "value": "Double Jeopardy!"
          },
          "meta": {
            "count": 5193
          }
        },
        {
          "groupedBy": {
            "value": "Jeopardy!"
          },
          "meta": {
            "count": 4522
          }
        },
        {
          "groupedBy": {
            "value": "Final Jeopardy!"
          },
          "meta": {
            "count": 285
          }
        }
      ]
    }
  }
}

groupBy limitations
  • groupBy only works with near<Media> operators.
  • The groupBy path is limited to one property or cross-reference. Nested paths are not supported.
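
The grouped example response for the round property can be turned into a simple mapping from group value to count. This is a minimal sketch that assumes you have the raw JSON in the layout shown in that response; the variable names are illustrative.

```python
import json

# Grouped response in the layout of the example response (values copied from it).
raw = """
{"data": {"Aggregate": {"JeopardyQuestion": [
  {"groupedBy": {"value": "Double Jeopardy!"}, "meta": {"count": 5193}},
  {"groupedBy": {"value": "Jeopardy!"}, "meta": {"count": 4522}},
  {"groupedBy": {"value": "Final Jeopardy!"}, "meta": {"count": 285}}
]}}}
"""

groups = json.loads(raw)["data"]["Aggregate"]["JeopardyQuestion"]

# Map each group value to its object count.
counts = {group["groupedBy"]["value"]: group["meta"]["count"] for group in groups}

print(counts)
print(sum(counts.values()))
```

The three group counts add up to 10000, which matches the total object count of the collection in the count example.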

Aggregate with a similarity search

You can use Aggregate with a similarity search operator (one of the Near operators).

Use objectLimit to specify the maximum number of objects to aggregate.

from weaviate.classes.query import Metrics

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.near_text(
    query="animals in space",
    object_limit=10,
    return_metrics=Metrics("points").number(sum_=True),
)

print(response.properties["points"].sum_)

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "points": {
            "sum": 4600
          }
        }
      ]
    }
  }
}
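
Conceptually, an object limit restricts the aggregation to the closest matches of the search. The plain-Python sketch below illustrates that idea with hypothetical (distance, points) pairs; it does not call Weaviate, and the helper name `sum_of_closest` is an assumption made for this example.

```python
# Hypothetical (distance, points) pairs for objects matched by a vector search.
# A smaller distance means the object is more similar to the query.
matches = [(0.31, 400), (0.12, 1000), (0.25, 200), (0.18, 600)]


def sum_of_closest(pairs, object_limit):
    """Sum `points` over the `object_limit` objects with the smallest distance."""
    closest = sorted(pairs, key=lambda pair: pair[0])[:object_limit]
    return sum(points for _, points in closest)


print(sum_of_closest(matches, object_limit=2))
```

With a limit of 2, only the objects at distances 0.12 and 0.18 contribute to the sum.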

Set a similarity distance

You can use Aggregate with a similarity search operator (one of the Near operators).

Use distance to specify how similar the objects should be.

from weaviate.classes.query import Metrics

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.near_text(
    query="animals in space",
    distance=0.19,
    return_metrics=Metrics("points").number(sum_=True),
)

print(response.properties["points"].sum_)

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "points": {
            "sum": 2500
          }
        }
      ]
    }
  }
}
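
A distance threshold can be pictured as keeping only the objects whose vectors lie close enough to the query vector before aggregating. The sketch below is plain Python and does not use Weaviate. It assumes cosine distance and an inclusive bound, and the vectors, values, and helper names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical query vector and (vector, points) objects.
query = [1.0, 0.0]
objects = [([1.0, 0.0], 500), ([0.0, 1.0], 300), ([1.0, 1.0], 200)]


def sum_within(query_vec, items, max_distance):
    """Sum `points` over objects whose distance to the query is <= max_distance."""
    return sum(
        points
        for vector, points in items
        if cosine_distance(query_vec, vector) <= max_distance
    )


print(sum_within(query, objects, max_distance=0.19))
```

A tighter threshold admits fewer objects, so the aggregated sum covers a smaller, more similar subset.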

Aggregate with a hybrid search

You can use Aggregate with a hybrid search operator.

from weaviate.classes.query import Metrics, BM25Operator

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.hybrid(
    query="animals in space",
    bm25_operator=BM25Operator.and_(),  # Optional; other parameters, such as `filters`, are also available
    object_limit=10,
    return_metrics=Metrics("points").number(sum_=True),
)

print(response.properties["points"].sum_)

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "points": {
            "sum": 6700
          }
        }
      ]
    }
  }
}

Filter results

For more specific results, use a filter to narrow your search.

from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.aggregate.over_all(
    filters=Filter.by_property("round").equal("Final Jeopardy!"),
)

print(response.total_count)

Example response

The output is like this:

{
  "data": {
    "Aggregate": {
      "JeopardyQuestion": [
        {
          "meta": {
            "count": 285
          }
        }
      ]
    }
  }
}

Questions and feedback

If you have any questions or feedback, let us know in the user forum.