Filters
Filters let you include, or exclude, particular objects from your result set based on provided conditions.
For a list of filter operators, see the API reference page.
Filter with one condition
Add a filter to your query to limit the result set.
If a snippet doesn't work or you have feedback, please open a GitHub issue.
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    filters=Filter.by_property("round").equal("Double Jeopardy!"),
    limit=3
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "garage",
"question": "This French word originally meant \"a place where one docks\" a boat, not a car",
"round": "Double Jeopardy!"
},
{
"answer": "Mexico",
"question": "The Colorado River provides much of the border between this country's Baja California Norte & Sonora",
"round": "Double Jeopardy!"
},
{
"answer": "Amy Carter",
"question": "On September 1, 1996 this former first daughter married Jim Wentzel at the Pond House near Plains",
"round": "Double Jeopardy!"
}
]
}
}
}
Filter with multiple conditions
To filter with two or more conditions, use And, Or and Not to define the relationship between the conditions.
The v4 Python client API provides filtering by any_of, or all_of, as well as using & or | operators.
- Use any_of or all_of for filtering by any, or all of a list of provided filters.
- Use & or | for filtering by pairs of provided filters.
Filter with & or |
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    # Use & as AND
    #     | as OR
    filters=(
        Filter.by_property("round").equal("Double Jeopardy!") &
        Filter.by_property("points").less_than(600) &
        Filter.not_(Filter.by_property("answer").equal("Yucatan"))
    ),
    limit=3
)

for o in response.objects:
    print(o.properties)
Filter with any of
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    filters=(
        Filter.any_of([  # Combines the below with `|`
            Filter.by_property("points").greater_or_equal(700),
            Filter.by_property("points").less_than(500),
            Filter.by_property("round").equal("Double Jeopardy!"),
        ])
    ),
    limit=5
)

for o in response.objects:
    print(o.properties)
Filter with all of
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    filters=(
        Filter.all_of([  # Combines the below with `&`
            Filter.by_property("points").greater_than(300),
            Filter.by_property("points").less_than(700),
            Filter.by_property("round").equal("Double Jeopardy!"),
        ])
    ),
    limit=5
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "Mexico",
"points": 200,
"question": "The Colorado River provides much of the border between this country's Baja California Norte & Sonora",
"round": "Double Jeopardy!"
},
{
"answer": "Amy Carter",
"points": 200,
"question": "On September 1, 1996 this former first daughter married Jim Wentzel at the Pond House near Plains",
"round": "Double Jeopardy!"
},
{
"answer": "Greek",
"points": 400,
"question": "Athenians speak the Attic dialect of this language",
"round": "Double Jeopardy!"
}
]
}
}
}
Combine filters with And or Or
Group and nest filter conditions with And and Or operators to express compound logic.
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    filters=Filter.by_property("answer").like("*bird*") &
            (Filter.by_property("points").greater_than(700) | Filter.by_property("points").less_than(300)),
    limit=3
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "The Firebird",
"points": 1000,
"question": "This title character has the face & arms of a woman & a body of feathers that tapers off in flames",
"round": "Double Jeopardy!"
},
{
"answer": "the Firebird",
"points": 800,
"question": "This Stravinsky character first played by Tamara Karsavina has the face & arms of a girl & a body of feathers",
"round": "Double Jeopardy!"
}
]
}
}
}
Additional information
To create a nested filter, follow these steps.
- Set the outer operator equal to And or Or.
- Add operands.
- Inside an operand expression, set operator equal to And or Or to add the nested group.
- Add operands to the nested group as needed.
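The steps above describe the raw operator/operands structure rather than the Python `&` and `|` syntax. As a rough sketch, the same "bird" filter from this section could be written as a plain Python dictionary. The field names `path`, `valueText`, and `valueInt` are assumed from the GraphQL-style `where` filter format and are not listed on this page, and the `depth` helper is purely illustrative.

```python
# Outer group: operator "And" with two operands.
nested_filter = {
    "operator": "And",
    "operands": [
        # First operand: a plain condition (assumed field names).
        {"path": ["answer"], "operator": "Like", "valueText": "*bird*"},
        # Second operand: a nested group with its own operator and operands.
        {
            "operator": "Or",
            "operands": [
                {"path": ["points"], "operator": "GreaterThan", "valueInt": 700},
                {"path": ["points"], "operator": "LessThan", "valueInt": 300},
            ],
        },
    ],
}


def depth(node):
    """Count how many operator groups are nested along the deepest branch."""
    children = node.get("operands", [])
    if not children:
        return 0
    return 1 + max(depth(child) for child in children)


print(depth(nested_filter))  # 2: the outer And group and the inner Or group
```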
Combine filters and search operators
Filters work with search operators like nearXXX, hybrid, and bm25.
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.near_text(
    query="fashion icons",
    filters=Filter.by_property("points").greater_than(200),
    limit=3
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "fashion designers",
"points": 400,
"question": "Ted Lapidus, Guy Laroche, Christian Lacroix",
"round": "Jeopardy!"
},
{
"answer": "Dapper Flapper",
"points": 400,
"question": "A stylish young woman of the 1920s",
"round": "Double Jeopardy!"
},
{
"answer": "Women's Wear Daily",
"points": 800,
"question": "This daily chronicler of the fashion industry launched \"W\", a bi-weekly, in 1972",
"round": "Jeopardy!"
}
]
}
}
}
ContainsAny Filter
The ContainsAny operator works on text properties and takes an array of values as input. It matches objects where the property contains any (i.e. one or more) of the values in the array.
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
token_list = ["australia", "india"]
response = jeopardy.query.fetch_objects(
    # Find objects where the `answer` property contains any of the strings in `token_list`
    filters=Filter.by_property("answer").contains_any(token_list),
    limit=3
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "India",
"points": 100,
"question": "Country that is home to Parsis & Sikhs",
"round": "Jeopardy!"
},
{
"answer": "Australia",
"points": 400,
"question": "The redundant-sounding Townsville, in this country's Queensland state, was named for Robert Towns",
"round": "Double Jeopardy!"
},
{
"answer": "Australia",
"points": 100,
"question": "Broken Hill, this country's largest company, took its name from a small town in New South Wales",
"round": "Jeopardy!"
}
]
}
}
}
ContainsAll Filter
The ContainsAll operator works on text properties and takes an array of values as input. It matches objects where the property contains all of the values in the array.
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
token_list = ["blue", "red"]
response = jeopardy.query.fetch_objects(
    # Find objects where the `question` property contains all of the strings in `token_list`
    filters=Filter.by_property("question").contains_all(token_list),
    limit=3
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "James Patterson",
"points": 1000,
"question": "His Alex Cross thrillers include \"Roses are Red\" & \"Violets are Blue\"",
"round": "Jeopardy!"
},
{
"answer": "a chevron",
"points": 800,
"question": "Chevron's red & blue logo is this heraldic shape, meant to convey rank & service",
"round": "Jeopardy!"
},
{
"answer": "litmus",
"points": 400,
"question": "Vegetable dye that turns red in acid solutions & blue in alkaline solutions",
"round": "Double Jeopardy!"
}
]
}
}
}
ContainsNone Filter
The ContainsNone operator works on text properties and takes an array of values as input. It matches objects where the property contains none of the values in the array.
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
token_list = ["bird", "animal"]
response = jeopardy.query.fetch_objects(
    # Find objects where the `question` property contains none of the strings in `token_list`
    filters=Filter.by_property("question").contains_none(token_list),
    limit=3
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "Frank Lloyd Wright",
"hasCategory": [
{
"title": "PEOPLE"
}
],
"question": "In 1939 this famous architect polished off his Johnson Wax Building in Racine, Wisconsin"
},
{
"answer": "a luffa",
"hasCategory": [
{
"title": "FOOD"
}
],
"question": "When it's young & tender, this gourd used in the bathtub can be eaten like a squash"
},
{
"answer": "a snail",
"hasCategory": [
{
"title": "SCIENCE & NATURE"
}
],
"question": "Like an escargot, the abalone is an edible one of these gastropods"
}
]
}
}
}
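To compare the three Contains operators side by side, the following sketch mimics their matching rules locally in plain Python. It is an approximation only: the `tokens` helper is a hypothetical stand-in for word tokenization, and the sample strings are taken from the example responses on this page.

```python
import re


def tokens(text):
    """Hypothetical stand-in for word tokenization: lowercase alphanumeric runs."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def contains_any(text, values):
    """True if at least one value appears among the text's tokens."""
    found = tokens(text)
    return any(value.lower() in found for value in values)


def contains_all(text, values):
    """True if every value appears among the text's tokens."""
    found = tokens(text)
    return all(value.lower() in found for value in values)


def contains_none(text, values):
    """True if no value appears among the text's tokens."""
    return not contains_any(text, values)


question = "Chevron's red & blue logo is this heraldic shape, meant to convey rank & service"

print(contains_any("India", ["australia", "india"]))  # True
print(contains_all(question, ["blue", "red"]))        # True
print(contains_all(question, ["blue", "green"]))      # False
print(contains_none(question, ["bird", "animal"]))    # True
```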
ContainsAny, ContainsAll and ContainsNone with batch delete
If you want to do a batch delete, see Delete objects.
Filter text on partial matches
If the object property is a text, or text-like data type such as object ID, use Like to filter on partial text matches.
from weaviate.classes.query import Filter

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    # `*inter*` matches the answers shown in the example response below
    filters=Filter.by_property("answer").like("*inter*"),
    limit=3
)

for o in response.objects:
    print(o.properties)
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "interglacial",
"question": "This term refers to the warm periods within ice ages; we're in one of those periods now",
"round": "Jeopardy!"
},
{
"answer": "the Interior",
"question": "In 1849, Thomas Ewing, \"The Logician of the West\", became the USA's first Secy. of this Cabinet Dept.",
"round": "Jeopardy!"
},
{
"answer": "Interlaken, Switzerland",
"question": "You can view the Jungfrau Peak from the main street of this town between the Brienz & Thun Lakes",
"round": "Final Jeopardy!"
}
]
}
}
}
Additional information
The * wildcard operator matches zero or more characters. The ? operator matches exactly one character.
Currently, the Like filter is not able to match wildcard characters (? and *) as literal characters (read more).
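The wildcard rules can be tried out locally with a small sketch that translates a Like pattern into a regular expression. This only illustrates `*` and `?`; it ignores tokenization, and the case-insensitive, whole-value matching used here is an assumption rather than a statement of how the server evaluates Like.

```python
import re


def like_to_regex(pattern):
    """Translate a Like pattern: `*` becomes `.*`, `?` becomes `.`, the rest is literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL | re.IGNORECASE)


def like(value, pattern):
    """Return True if the whole value matches the pattern (assumed semantics)."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("interglacial", "*inter*"))  # True
print(like("the Interior", "*inter*"))  # True
print(like("car", "c?r"))               # True: `?` consumes exactly one character
print(like("cr", "c?r"))                # False: `?` needs one character
print(like("cr", "c*r"))                # True: `*` may match zero characters
```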
Filter using cross-references
Queries involving cross-references can be slower than queries that do not involve cross-references, especially at scale such as for multiple objects or complex queries.
First, we strongly encourage you to consider whether you can avoid using cross-references in your data schema. As a scalable AI database, Weaviate is well-placed to perform complex queries with vector, keyword and hybrid searches involving filters. You may benefit from rethinking your data schema to avoid cross-references where possible.
For example, instead of creating separate "Author" and "Book" collections with cross-references, consider embedding author information directly in Book objects and using searches and filters to find books by author characteristics.
To filter on properties from a cross-referenced object, add the collection name to the filter.
from weaviate.classes.query import Filter, QueryReference

jeopardy = client.collections.use("JeopardyQuestion")
response = jeopardy.query.fetch_objects(
    filters=Filter.by_ref(link_on="hasCategory").by_property("title").like("*Sport*"),
    return_references=QueryReference(link_on="hasCategory", return_properties=["title"]),
    limit=3
)

for o in response.objects:
    print(o.properties)
    print(o.references["hasCategory"].objects[0].properties["title"])
Example response
The output is like this:
{
"data": {
"Get": {
"JeopardyQuestion": [
{
"answer": "Sampan",
"hasCategory": [
{
"title": "TRANSPORTATION"
}
],
"question": "Smaller than a junk, this Oriental boat usually has a cabin with a roof made of mats",
"round": "Jeopardy!"
},
{
"answer": "Emmitt Smith",
"hasCategory": [
{
"title": "SPORTS"
}
],
"question": "In 1994 this Dallas Cowboy scored 22 touchdowns; in 1995 he topped that with 25",
"round": "Jeopardy!"
},
{
"answer": "Lee Iacocca",
"hasCategory": [
{
"title": "TRANSPORTATION"
}
],
"question": "Chrysler executive who developed the Ford Mustang",
"round": "Jeopardy!"
}
]
}
}
}
By geo-coordinates
Currently, geo-coordinate filtering is limited to the nearest 800 results from the source location, which will be further reduced by any other filter conditions and search parameters.
If you plan on a densely populated dataset, consider using another strategy such as geo-hashing into a text datatype, and filtering further, such as with a ContainsAny filter.
from weaviate.classes.query import Filter
from weaviate.classes.query import GeoCoordinate

# Collection name assumed for this example
publications = client.collections.use("Publication")
response = publications.query.fetch_objects(
    filters=(
        Filter
        .by_property("headquartersGeoLocation")
        .within_geo_range(
            coordinate=GeoCoordinate(
                latitude=52.39,
                longitude=4.84
            ),
            distance=1000  # In meters
        )
    )
)

for o in response.objects:
    print(o.properties)  # Inspect returned objects
By DATE datatype
To filter by a DATE datatype property, specify the date/time as an RFC 3339 timestamp, or a client library-compatible type such as a Python datetime object.
from datetime import datetime, timezone
from weaviate.classes.query import Filter, MetadataQuery

# Set the timezone for avoidance of doubt
filter_time = datetime(2022, 6, 10).replace(tzinfo=timezone.utc)
# The filter threshold could also be an RFC 3339 timestamp, e.g.:
# filter_time = "2022-06-10T00:00:00.00Z"

# `collection` is assumed to be a collection handle obtained earlier
response = collection.query.fetch_objects(
    limit=3,
    # This property (`some_date`) is a `DATE` datatype
    filters=Filter.by_property("some_date").greater_than(filter_time),
)

for o in response.objects:
    print(o.properties)  # Inspect returned objects
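If you need the timestamp as a string, a timezone-aware Python datetime can be converted with the standard library alone. This is a small sketch; the variable names are illustrative.

```python
from datetime import datetime, timezone

# A timezone-aware datetime avoids ambiguity about the offset.
filter_time = datetime(2022, 6, 10, tzinfo=timezone.utc)

# isoformat() yields an RFC 3339-compatible string with a numeric offset.
rfc3339 = filter_time.isoformat()
print(rfc3339)  # 2022-06-10T00:00:00+00:00

# The same instant written with the "Z" suffix for UTC.
rfc3339_z = rfc3339.replace("+00:00", "Z")
print(rfc3339_z)  # 2022-06-10T00:00:00Z

# Round trip: parsing the string gives back the same instant.
parsed = datetime.fromisoformat(rfc3339)
print(parsed == filter_time)  # True
```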
Filter by metadata
Filters also work with metadata properties such as object id, property length, and timestamp.
For the full list, see API references: Filters.
By object id
from weaviate.classes.query import Filter

collection = client.collections.use("Article")

target_id = "00037775-1432-35e5-bc59-443baaef7d80"
response = collection.query.fetch_objects(
    filters=Filter.by_id().equal(target_id)
)

for o in response.objects:
    print(o.properties)  # Inspect returned objects
    print(o.uuid)
By object timestamp
This filter requires the property timestamp to be indexed.
from datetime import datetime, timezone
from weaviate.classes.query import Filter, MetadataQuery

collection = client.collections.use("Article")

# Set the timezone for avoidance of doubt (otherwise the client will emit a warning)
filter_time = datetime(2020, 1, 1).replace(tzinfo=timezone.utc)

response = collection.query.fetch_objects(
    limit=3,
    filters=Filter.by_creation_time().greater_than(filter_time),
    return_metadata=MetadataQuery(creation_time=True)
)

for o in response.objects:
    print(o.properties)  # Inspect returned objects
    print(o.metadata.creation_time)  # Inspect object creation time
By object property length
This filter requires the property length to be indexed.
from weaviate.classes.query import Filter

collection = client.collections.use("JeopardyQuestion")

length_threshold = 20  # Example threshold; adjust to your data

response = collection.query.fetch_objects(
    limit=3,
    filters=Filter.by_property("answer", length=True).greater_than(length_threshold),
)

for o in response.objects:
    print(o.properties)  # Inspect returned objects
    print(len(o.properties["answer"]))  # Inspect property length
By object null state
This filter requires the property null state to be indexed.
from weaviate.classes.query import Filter

collection = client.collections.use("WineReview")
response = collection.query.fetch_objects(
    limit=3,
    # This requires the `country` property to be configured with `index_null_state=True`
    filters=Filter.by_property("country").is_none(True)  # Find objects where the `country` property is null
)

for o in response.objects:
    print(o.properties)  # Inspect returned objects
Filter on nested object properties
Available from Weaviate v1.38 as a preview, gated by the WEAVIATE_PREVIEW_NESTED_FILTERING=on environment variable on the server. The path syntax and operator semantics are stable, but the on-disk encoding may change before GA — don't rely on persistent state from preview clusters carrying over to the GA release. The env var is removed at GA and the feature is enabled unconditionally.
object and object[] properties carry their own nested schemas. To filter on a value inside a nested object, use a single dotted path naming the path from the parent property down to the leaf you want to compare.
Given a collection like this:
from weaviate.classes.config import Configure, DataType, Property, Tokenization

client.collections.create(
    name="Document",
    vector_config=Configure.Vectors.self_provided(),
    inverted_index_config=Configure.inverted_index(index_null_state=True),
    properties=[
        Property(name="title", data_type=DataType.TEXT, tokenization=Tokenization.FIELD),
        Property(
            name="cars",
            data_type=DataType.OBJECT_ARRAY,
            nested_properties=[
                Property(name="make", data_type=DataType.TEXT, tokenization=Tokenization.FIELD),
                Property(name="color", data_type=DataType.TEXT, tokenization=Tokenization.FIELD),
                Property(
                    name="tires",
                    data_type=DataType.OBJECT_ARRAY,
                    nested_properties=[
                        Property(name="brand", data_type=DataType.TEXT, tokenization=Tokenization.FIELD),
                        Property(name="width", data_type=DataType.INT),
                    ],
                ),
            ],
        ),
    ],
)
The filter property is a single dotted path. The dot is the only separator. An optional [N] after any segment pins that segment to an array index (0-based).
| Path | Meaning |
|---|---|
| cars.make | Any car's make (matches if any element of the cars array has it) |
| cars[0].make | The first car's make (positional) |
| cars.tires.width | Any tire on any car (recursive across two object[] levels) |
| cars[1].tires[2].brand | The second car's third tire's brand (positional through nesting) |
[N] on a segment requires that segment to be an object[] (array). Every intermediate segment must be object or object[] — you cannot pivot through a scalar. The leaf may be any supported scalar type.
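To make the path syntax concrete, here is a small sketch of a parser that splits a dotted path into segments and optional array indices. It is illustrative only and not how the server parses paths; the allowed characters in a segment name are an assumption.

```python
import re

# One segment: a name, optionally followed by [N]. The name pattern is an assumption.
_SEGMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)(?:\[(\d+)\])?$")


def parse_nested_path(path):
    """Split a dotted path into (name, index) pairs; index is None when no [N] is given."""
    parsed = []
    for segment in path.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"invalid path segment: {segment!r}")
        name, index = match.group(1), match.group(2)
        parsed.append((name, int(index) if index is not None else None))
    return parsed


print(parse_nested_path("cars.make"))
# [('cars', None), ('make', None)]
print(parse_nested_path("cars[1].tires[2].brand"))
# [('cars', 1), ('tires', 2), ('brand', None)]
```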
Match any element (default)
A path without [N] markers matches if any element in the parent array satisfies the condition.
# "any car has make = Toyota" — matches Doc 1 (first car) and Doc 2 (only car)
response = docs.query.fetch_objects(
    filters=Filter.by_property("cars.make").equal("Toyota"),
    return_properties=["title"],
)

for o in response.objects:
    print(o.properties)
Match by position
Use [N] to pin a path segment to a specific array index. Indices are 0-based.
# "the FIRST car has make = Toyota" — Doc 3's first car is Honda, so it's excluded
response = docs.query.fetch_objects(
    filters=Filter.by_property("cars[0].make").equal("Toyota"),
    return_properties=["title"],
)
Same-element correlation across leaves
Combining two leaf filters with And matches when the same element in the parent array satisfies both. A document with one car (Toyota, blue) and another (Honda, red) would not match cars.make = "Toyota" AND cars.color = "red" — both conditions must hold on the same car.
# "the SAME car is both Toyota AND red" — only Doc 1's first car qualifies.
# Without same-element correlation a doc with separate (Toyota, blue) and
# (Honda, red) cars would also match, which is wrong.
response = docs.query.fetch_objects(
    filters=(
        Filter.by_property("cars.make").equal("Toyota")
        & Filter.by_property("cars.color").equal("red")
    ),
    return_properties=["title"],
)
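The difference between same-element correlation and independent matching can be seen in a local sketch. The two sample documents and both helper functions are hypothetical and only model the rule described above.

```python
# Hypothetical sample documents.
doc_a = {"title": "A", "cars": [{"make": "Toyota", "color": "red"},
                                {"make": "Honda", "color": "blue"}]}
doc_b = {"title": "B", "cars": [{"make": "Toyota", "color": "blue"},
                                {"make": "Honda", "color": "red"}]}

conditions = {"make": "Toyota", "color": "red"}


def same_element_match(doc, conditions):
    """True if a single car satisfies every condition (the behavior described above)."""
    return any(
        all(car.get(key) == value for key, value in conditions.items())
        for car in doc["cars"]
    )


def independent_match(doc, conditions):
    """True if each condition is satisfied by some car, not necessarily the same one."""
    return all(
        any(car.get(key) == value for car in doc["cars"])
        for key, value in conditions.items()
    )


print(same_element_match(doc_a, conditions))  # True: one car is Toyota and red
print(same_element_match(doc_b, conditions))  # False: no single car is both
print(independent_match(doc_b, conditions))   # True: the looser rule would match
```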
Deep / recursive paths
object[] can nest inside object[] to any depth. Each segment in the dotted path traverses one level.
# "any tire on any car is wider than 200" — Doc 1 (215) and Doc 3 (250)
response = docs.query.fetch_objects(
    filters=Filter.by_property("cars.tires.width").greater_than(200),
    return_properties=["title"],
)
Check whether a nested object is absent
Pointing a path at an object or object[] segment (rather than a scalar leaf) is only valid with IsNull, which asks whether that whole sub-object is present.
# "the first car has no tires" — only the Toyota in Doc 2
response = docs.query.fetch_objects(
    filters=Filter.by_property("cars[0].tires").is_none(True),
    return_properties=["title"],
)
Limitations
- Allowed leaf data types: text, int, number, boolean, date, uuid, and their array variants. blob, blobHash, geoCoordinates, phoneNumber, and cross-references (cref) are not allowed inside nested objects for nested filtering.
- IndexFilterable is required: nested filtering uses the filterable inverted index on each leaf. The IndexRangeFilters and IndexSearchable flags exist on nested-property definitions but are not yet exercised by the nested searcher; range filters on nested numeric leaves currently use the filterable bucket.
- Tokenization matters: nested text leaves use the same tokenization options as flat properties. For exact-match filters on names, codes, or identifiers, set tokenization: field on the leaf so the value is stored as a single token.
- Reference-path vs nested-path: a reference-path filter is a multi-element Path (["inCity", "City", "name"]) traversing cross-references; a nested-path filter is a single-element path with dots inside it (["cars.make"]).
Filter considerations
Tokenization
Weaviate converts filter terms into tokens. The default tokenization is word. The word tokenizer keeps alphanumeric characters, lowercases them, and splits on whitespace and other non-alphanumeric characters. It converts a string like "Test_domain_weaviate" into "test", "domain", and "weaviate".
For details and additional tokenization methods, see Tokenization.
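As a rough local approximation of word tokenization, the sketch below lowercases the input and keeps runs of alphanumeric characters. It handles ASCII only and is meant to show the effect on filter terms, not to reproduce the server's tokenizer exactly; the function name is hypothetical.

```python
import re


def word_tokenize(text):
    """Approximate word tokenization: lowercase, then keep alphanumeric runs (ASCII only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


print(word_tokenize("Test_domain_weaviate"))  # ['test', 'domain', 'weaviate']
print(word_tokenize("Double Jeopardy!"))      # ['double', 'jeopardy']
```

Under this model, a filter value of "Double Jeopardy!" and a stored value of "double jeopardy" produce the same tokens, which is why exact-match filters on identifiers often call for a different tokenization such as field.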
Improve filter performance
If you encounter slow filter performance, consider adding a limit parameter or additional where operators to restrict the size of your data set.
List of filter operators
For a list of filter operators, see the reference page.
