Basic collection operations
Every object in Weaviate belongs to exactly one collection. Use the examples on this page to manage your collections.
Newer Weaviate documentation discusses "collections." Older Weaviate documentation refers to "classes" instead. Expect to see both terms throughout the documentation.
Create a collection
To create a collection, specify at least the collection name. If you don't specify any properties, auto-schema creates them.
Weaviate follows GraphQL naming conventions.
- Start collection names with an upper case letter.
- Start property names with a lower case letter.
If you use an initial upper case letter to define a property name, Weaviate changes it to a lower case letter internally.
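As a minimal sketch of these conventions, the helpers below adjust only the first character of a name. Both function names are hypothetical and are not part of any Weaviate client; they simply mirror the rules listed above so that names can be checked before a collection is created.

```python
def normalize_collection_name(name: str) -> str:
    """Return the name with its first character upper-cased."""
    return name[:1].upper() + name[1:]


def normalize_property_name(name: str) -> str:
    """Return the name with its first character lower-cased."""
    return name[:1].lower() + name[1:]


print(normalize_collection_name("article"))   # Article
print(normalize_property_name("OnHomepage"))  # onHomepage
```

Only the first character is touched, so the rest of a camel-case name such as onHomepage is preserved.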
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Java
- Go
client.collections.create("Article")
class_name = "Article"
class_obj = {"class": class_name}
client.schema.create_class(class_obj) # returns None on success
const newCollection = await client.collections.create({
name: 'Article'
})
// The returned value is the full collection definition, showing all defaults
console.log(JSON.stringify(newCollection, null, 2));
const className = 'Article';
const emptyClassDefinition = {
class: className,
};
// Add the class to the schema
let result = await client.schema
.classCreator()
.withClass(emptyClassDefinition)
.do();
String collectionName = "Article";
WeaviateClass emptyClass = WeaviateClass.builder()
.className(collectionName)
.build();
// Add the collection to the schema
Result<Boolean> result = client.schema().classCreator()
.withClass(emptyClass)
.run();
className := "Article"
emptyClass := &models.Class{
Class: className,
}
// Create the collection (also called class)
err := client.Schema().ClassCreator().
WithClass(emptyClass).
Do(ctx)
Using too many collections can lead to scalability issues like high memory usage and degraded query performance. Instead, consider using multi-tenancy, where a single collection is subdivided into multiple tenants.
For more details, see Starter Guides: Scaling limits with collections.
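As a rough illustration of the multi-tenancy alternative, the sketch below builds one v3-client-style collection definition with multi-tenancy switched on, rather than one collection per customer. The `multiTenancyConfig` setting is not shown elsewhere on this page, so treat its exact shape as an assumption to verify against the multi-tenancy documentation. The helper name and customer names are hypothetical.

```python
def multi_tenant_definition(collection_name: str) -> dict:
    """Build a v3-client-style definition with multi-tenancy enabled."""
    return {
        "class": collection_name,
        # Assumed setting name; check the multi-tenancy docs for your version.
        "multiTenancyConfig": {"enabled": True},
    }


# Hypothetical customers that would otherwise each get their own collection.
customers = ["acme", "globex", "initech"]

definition = multi_tenant_definition("Article")
print(definition)
print(f"1 collection with {len(customers)} tenants instead of {len(customers)} collections")
```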
Create a collection and define properties
Properties are the data fields in your collection. Each property has a name and a data type.
Additional information
Use properties to configure additional parameters such as data type, index characteristics, or tokenization.
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Java
- Go
from weaviate.classes.config import Property, DataType
# Note that you can use `client.collections.create_from_dict()` to create a collection from a v3-client-style JSON object
client.collections.create(
"Article",
properties=[
Property(name="title", data_type=DataType.TEXT),
Property(name="body", data_type=DataType.TEXT),
]
)
class_obj = {
"class": "Article",
"properties": [
{
"name": "title",
"dataType": ["text"],
},
{
"name": "body",
"dataType": ["text"],
},
],
}
client.schema.create_class(class_obj) # returns None on success
import { dataType } from 'weaviate-client';
await client.collections.create({
name: 'Article',
properties: [
{
name: 'title',
dataType: dataType.TEXT,
},
{
name: 'body',
dataType: dataType.TEXT,
},
],
})
const classWithProps = {
class: 'Article',
properties: [
{
name: 'title',
dataType: ['text'],
},
{
name: 'body',
dataType: ['text'],
},
],
};
// Add the class to the schema
result = await client.schema.classCreator().withClass(classWithProps).do();
// Define collection properties
Property titleProperty = Property.builder()
.name("title")
.description("Title Property Description...")
.dataType(Arrays.asList(DataType.TEXT))
.build();
Property bodyProperty = Property.builder()
.name("body")
.description("Body Property Description...")
.dataType(Arrays.asList(DataType.TEXT))
.build();
// Add the defined properties to the collection
WeaviateClass articleCollection = WeaviateClass.builder()
.className(collectionName)
.description("Article collection Description...")
.properties(Arrays.asList(titleProperty, bodyProperty))
.build();
Result<Boolean> result = client.schema().classCreator()
.withClass(articleCollection)
.run();
articleClass := &models.Class{
Class: "Article",
Description: "Collection of articles",
Properties: []*models.Property{
{
Name: "title",
DataType: schema.DataTypeText.PropString(),
},
{
Name: "body",
DataType: schema.DataTypeText.PropString(),
},
},
}
Disable auto-schema
By default, Weaviate creates missing collections and missing properties. When you configure collections manually, you have more precise control of the collection settings.
To disable auto-schema, set AUTOSCHEMA_ENABLED: 'false' in your system configuration file.
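With auto-schema disabled, Weaviate no longer creates missing properties for you, so it can help to compare objects against the collection definition before importing them. The following is a minimal sketch of such a check. The function name `find_unknown_properties` and the sample object are hypothetical, and the definition uses the v3-client-style dictionary format shown earlier on this page.

```python
def find_unknown_properties(definition: dict, obj: dict) -> list:
    """Return the keys of obj that are not declared as properties."""
    known = {prop["name"] for prop in definition.get("properties", [])}
    return sorted(key for key in obj if key not in known)


article_definition = {
    "class": "Article",
    "properties": [
        {"name": "title", "dataType": ["text"]},
        {"name": "body", "dataType": ["text"]},
    ],
}

# "author" is not declared in the definition above.
candidate = {"title": "Hello", "body": "World", "author": "Ada"}
print(find_unknown_properties(article_definition, candidate))  # ['author']
```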
Read a single collection definition
Retrieve a collection definition from the schema.
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Java
- Go
articles = client.collections.get("Article")
articles_config = articles.config.get()
print(articles_config)
import json
class_name = "Article"
response = client.schema.get(class_name)
print(json.dumps(response, indent=2))
let articles = client.collections.get('Article')
const collectionConfig = await articles.config.get()
console.log(collectionConfig)
const className = 'Article';
let classDefinition = await client.schema
.classGetter()
.withClassName(className)
.do();
console.log(JSON.stringify(classDefinition, null, 2));
String collectionName = "Article";
Result<WeaviateClass> result = client.schema().classGetter()
.withClassName(collectionName)
.run();
String json = new GsonBuilder().setPrettyPrinting().create().toJson(result.getResult());
System.out.println(json);
className := "Article"
class, err := client.Schema().ClassGetter().
WithClassName(className).
Do(ctx)
b, err := json.MarshalIndent(class, "", " ")
fmt.Println(string(b))
Sample configuration: Text objects
This configuration for text objects defines the following:
- The collection name (Article)
- The vectorizer module (text2vec-cohere) and model (embed-multilingual-v2.0)
- A set of properties (title, body) with text data types.
{
"class": "Article",
"vectorizer": "text2vec-cohere",
"moduleConfig": {
"text2vec-cohere": {
"model": "embed-multilingual-v2.0"
}
},
"properties": [
{
"name": "title",
"dataType": ["text"]
},
{
"name": "body",
"dataType": ["text"]
}
]
}
Sample configuration: Nested objects
Added in v1.22
This configuration for nested objects defines the following:
- The collection name (Person)
- The vectorizer module (text2vec-huggingface)
- A set of properties (last_name, address)
  - last_name has text data type
  - address has object data type
- The address property has two nested properties (street and city)
{
"class": "Person",
"vectorizer": "text2vec-huggingface",
"properties": [
{
"dataType": ["text"],
"name": "last_name"
},
{
"dataType": ["object"],
"name": "address",
"nestedProperties": [
{ "dataType": ["text"], "name": "street" },
{ "dataType": ["text"], "name": "city" }
]
}
]
}
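To make the nesting easier to see, the sketch below walks a definition like the one above and lists every property together with its data type. The dotted path notation (for example address.street) is only a display convention chosen for this example, and `property_paths` is a hypothetical helper rather than a client function.

```python
def property_paths(properties: list, prefix: str = "") -> list:
    """Return (path, data type) pairs, descending into nestedProperties."""
    paths = []
    for prop in properties:
        path = prefix + prop["name"]
        paths.append((path, prop["dataType"][0]))
        # Properties without nestedProperties simply contribute nothing here.
        paths.extend(property_paths(prop.get("nestedProperties", []), path + "."))
    return paths


person_definition = {
    "class": "Person",
    "vectorizer": "text2vec-huggingface",
    "properties": [
        {"dataType": ["text"], "name": "last_name"},
        {
            "dataType": ["object"],
            "name": "address",
            "nestedProperties": [
                {"dataType": ["text"], "name": "street"},
                {"dataType": ["text"], "name": "city"},
            ],
        },
    ],
}

for path, data_type in property_paths(person_definition["properties"]):
    print(f"{path}: {data_type}")
```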
Sample configuration: Generative search
This configuration for retrieval augmented generation defines the following:
- The collection name (Article)
- The default vectorizer module (text2vec-openai)
- The generative module (generative-openai)
- A set of properties (title, chunk, chunk_no and url)
- The tokenization option for the url property
- The vectorization option (skip vectorization) for the url property
{
"class": "Article",
"vectorizer": "text2vec-openai",
"vectorIndexConfig": {
"distance": "cosine"
},
"moduleConfig": {
"generative-openai": {}
},
"properties": [
{
"name": "title",
"dataType": ["text"]
},
{
"name": "chunk",
"dataType": ["text"]
},
{
"name": "chunk_no",
"dataType": ["int"]
},
{
"name": "url",
"dataType": ["text"],
"tokenization": "field",
"moduleConfig": {
"text2vec-openai": {
"skip": true
}
}
}
]
}
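One way to double-check a definition like this is to list which properties opt out of vectorization. The sketch below does that for a trimmed copy of the sample above, assuming the skip flag lives under the property's moduleConfig entry for the collection's vectorizer, as it does in the sample. The helper name `skipped_properties` is hypothetical.

```python
def skipped_properties(definition: dict) -> list:
    """Return names of properties whose vectorizer settings include skip=True."""
    vectorizer = definition.get("vectorizer")
    skipped = []
    for prop in definition.get("properties", []):
        module_settings = prop.get("moduleConfig", {}).get(vectorizer, {})
        if module_settings.get("skip", False):
            skipped.append(prop["name"])
    return skipped


article_definition = {
    "class": "Article",
    "vectorizer": "text2vec-openai",
    "properties": [
        {"name": "title", "dataType": ["text"]},
        {"name": "chunk", "dataType": ["text"]},
        {"name": "chunk_no", "dataType": ["int"]},
        {
            "name": "url",
            "dataType": ["text"],
            "tokenization": "field",
            "moduleConfig": {"text2vec-openai": {"skip": True}},
        },
    ],
}

print(skipped_properties(article_definition))  # ['url']
```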
Sample configuration: Images
This configuration for image search defines the following:
- The collection name (Image)
- The vectorizer module (img2vec-neural)
  - The image property configures the collection to store image data.
- The vector index distance metric (cosine)
- A set of properties (image), with the image property set as blob.
For image searches, see Image search.
{
"class": "Image",
"vectorizer": "img2vec-neural",
"vectorIndexConfig": {
"distance": "cosine"
},
"moduleConfig": {
"img2vec-neural": {
"imageFields": ["image"]
}
},
"properties": [
{
"name": "image",
"dataType": ["blob"]
}
]
}
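Values for a blob property are passed as base64-encoded strings. As a minimal sketch, the helpers below encode raw image bytes and wrap them in an object for the image property defined above. The function names are hypothetical, and the bytes used here are a stand-in for the contents of a real image file.

```python
import base64


def to_blob(raw: bytes) -> str:
    """Encode raw bytes as a base64 string suitable for a blob property."""
    return base64.b64encode(raw).decode("utf-8")


def image_object(raw: bytes) -> dict:
    """Build the property dictionary for one object in the Image collection."""
    return {"image": to_blob(raw)}


# Stand-in bytes; in practice, read them from a file opened in binary mode.
sample_bytes = b"\x89PNG\r\n"
print(image_object(sample_bytes))
```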
Read all collection definitions
Fetch the database schema to retrieve all of the collection definitions.
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Java
- Go
response = client.collections.list_all(simple=False)
print(response)
import json
response = client.schema.get()
print(json.dumps(response, indent=2))
const allCollections = await client.collections.listAll()
console.log(JSON.stringify(allCollections, null, 2));
let allCollections = await client.schema.getter().do();
console.log(JSON.stringify(allCollections, null, 2));
Result<Schema> result = client.schema().getter()
.run();
String json = new GsonBuilder().setPrettyPrinting().create().toJson(result.getResult());
System.out.println(json);
schema, err := client.Schema().Getter().
Do(ctx)
b, err := json.MarshalIndent(schema, "", " ")
fmt.Println(string(b))
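The full schema can be long, so a condensed view is often easier to scan. The sketch below reduces a schema dictionary to a mapping of collection names to property names. It assumes the v3-client-style response shape, a dictionary with a top-level "classes" list; the helper name and the sample data are hypothetical.

```python
def summarize_schema(schema: dict) -> dict:
    """Map each collection name to the list of its property names."""
    return {
        cls["class"]: [prop["name"] for prop in cls.get("properties", [])]
        for cls in schema.get("classes", [])
    }


# Hand-written sample in the assumed response shape.
sample_schema = {
    "classes": [
        {
            "class": "Article",
            "properties": [
                {"name": "title", "dataType": ["text"]},
                {"name": "body", "dataType": ["text"]},
            ],
        },
        {
            "class": "Image",
            "properties": [{"name": "image", "dataType": ["blob"]}],
        },
    ]
}

print(summarize_schema(sample_schema))
```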
Update a collection definition
Currently (from v1.25.0 onwards) a replication factor cannot be changed once it is set. This is due to the schema consensus algorithm change in v1.25. This will be improved in future versions.
You can update a collection definition to change the mutable collection settings.
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Java
- Go
from weaviate.classes.config import Reconfigure, VectorFilterStrategy, ReplicationDeletionStrategy
articles = client.collections.get("Article")
# Update the collection definition
articles.config.update(
description="An updated collection description.",
property_descriptions={
"title": "The updated title description for article",
}, # Available from Weaviate v1.31.0
inverted_index_config=Reconfigure.inverted_index(
bm25_k1=1.5
),
vector_index_config=Reconfigure.VectorIndex.hnsw(
filter_strategy=VectorFilterStrategy.ACORN # Available from Weaviate v1.27.0
),
replication_config=Reconfigure.replication(
deletion_strategy=ReplicationDeletionStrategy.TIME_BASED_RESOLUTION # Available from Weaviate v1.28.0
)
)
articles = client.collections.get("Article")
article_shards = articles.config.update_shards(
status="READY",
shard_names=shard_names # A shard name (str) or a list of shard names (List[str]) to update
)
print(article_shards)
class_name = "Article"
# Update the collection definition
collection_def_changes = {
"class": class_name,
"invertedIndexConfig": {
"bm25": {
"k1": 1.5 # Change the k1 parameter from 1.2
}
},
"vectorIndexConfig": {
"filterStrategy": "acorn" # Available from Weaviate v1.27.0
},
"replicationConfig": {
"deletionStrategy": "TimeBasedResolution" # Available from Weaviate v1.28.0
}
}
client.schema.update_config("Article", collection_def_changes)
import { reconfigure } from 'weaviate-client';
let articles = client.collections.get('Article')
await articles.config.update({
invertedIndex: reconfigure.invertedIndex({
bm25k1: 1.5 // Change the k1 parameter from 1.2
}),
vectorizers: reconfigure.vectorizer.update({
vectorIndexConfig: reconfigure.vectorIndex.hnsw({
quantizer: reconfigure.vectorIndex.quantizer.pq(),
ef: 4,
filterStrategy: 'acorn', // Available from Weaviate v1.27.0
}),
})
})
// Collection definition updates are not available in the v2 API.
// Consider upgrading to the v3 API, or deleting and recreating the collection.
// Get existing collection
Result<WeaviateClass> existingResult = client.schema().classGetter()
.withClassName(collectionName)
.run();
assertThat(existingResult).isNotNull()
.returns(false, Result::hasErrors);
WeaviateClass existingClass = existingResult.getResult();
// Create updated configurations
InvertedIndexConfig invertedConfig = InvertedIndexConfig.builder()
.bm25(BM25Config.builder().k1(1.5f).build())
.build();
VectorIndexConfig vectorConfig = VectorIndexConfig.builder()
.filterStrategy(VectorIndexConfig.FilterStrategy.ACORN)
.build();
ReplicationConfig replicationConfig = ReplicationConfig.builder()
.deletionStrategy(ReplicationConfig.DeletionStrategy.NO_AUTOMATED_RESOLUTION)
.build();
// Update collection with new configurations - preserve critical existing configs
WeaviateClass updatedClass = WeaviateClass.builder()
.className(collectionName)
.shardingConfig(existingClass.getShardingConfig()) // Preserve sharding (immutable)
.invertedIndexConfig(invertedConfig) // Update
.vectorIndexConfig(vectorConfig) // Update
.replicationConfig(replicationConfig) // Update
.build();
Result<Boolean> updateResult = client.schema().classUpdater()
.withClass(updatedClass)
.run();
updatedArticleClassConfig := &models.Class{
// Note: The new collection config must be provided in full,
// including the configuration that is not being updated.
// We suggest using the original class config as a starting point.
Class: "Article",
InvertedIndexConfig: &models.InvertedIndexConfig{
Bm25: &models.BM25Config{
K1: 1.5,
},
},
VectorIndexConfig: map[string]interface{}{
"filterStrategy": "acorn",
},
ReplicationConfig: &models.ReplicationConfig{
DeletionStrategy: models.ReplicationConfigDeletionStrategyTimeBasedResolution,
},
}
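The Go example notes that the new configuration must be provided in full, starting from the original definition. One way to follow that advice is to merge only the changed settings into a copy of the existing definition. The sketch below shows a recursive merge over plain dictionaries; `merge_definition` is a hypothetical helper, and the sample values (other than k1) are illustrative.

```python
import copy


def merge_definition(existing: dict, changes: dict) -> dict:
    """Return a copy of existing with changes applied, merging nested dicts."""
    merged = copy.deepcopy(existing)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_definition(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


existing_definition = {
    "class": "Article",
    "invertedIndexConfig": {"bm25": {"b": 0.75, "k1": 1.2}},
}
changes = {"invertedIndexConfig": {"bm25": {"k1": 1.5}}}

updated_definition = merge_definition(existing_definition, changes)
print(updated_definition)
```

The original dictionary is left untouched, so it can still be used for comparison or rollback.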
Delete a collection
You can delete any unwanted collection(s), along with the data that they contain.
When you delete a collection, you delete all associated objects!
Be very careful with deletes on a production database and anywhere else that you have important data.
This code deletes a collection and its objects.
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Go
- Java
- Curl
# collection_name can be a string ("Article") or a list of strings (["Article", "Category"])
client.collections.delete(collection_name) # THIS WILL DELETE THE SPECIFIED COLLECTION(S) AND THEIR OBJECTS
# Note: you can also delete all collections in the Weaviate instance with:
# client.collections.delete_all()
# delete class "Article" - THIS WILL DELETE ALL DATA IN THIS CLASS
client.schema.delete_class("Article") # Replace with your class name
// delete collection "Article" - THIS WILL DELETE THE COLLECTION AND ALL ITS DATA
await client.collections.delete('Article')
// you can also delete all collections of a cluster
// await client.collections.deleteAll()
// delete collection "Article" - THIS WILL DELETE THE COLLECTION AND ALL ITS DATA
await client.schema
.classDeleter()
.withClassName('Article')
.do();
className := "YourClassName"
// delete the class
if err := client.Schema().ClassDeleter().WithClassName(className).Do(context.Background()); err != nil {
// Weaviate will return a 400 if the class does not exist, so this is allowed, only return an error if it's not a 400
if status, ok := err.(*fault.WeaviateClientError); ok && status.StatusCode != http.StatusBadRequest {
panic(err)
}
}
Result<Boolean> result = client.schema().classDeleter()
.withClassName(collectionName)
.run();
curl \
-X DELETE \
https://WEAVIATE_INSTANCE_URL/v1/schema/YourClassName # Replace WEAVIATE_INSTANCE_URL with your instance URL
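For reference, the curl call above maps onto a plain HTTP DELETE request. The sketch below only builds that request with the Python standard library and does not send it. The helper name is hypothetical, and any authentication headers your instance requires are omitted here as an assumption.

```python
import urllib.request


def build_delete_request(base_url: str, collection_name: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for one collection."""
    url = f"{base_url.rstrip('/')}/v1/schema/{collection_name}"
    return urllib.request.Request(url, method="DELETE")


# Replace WEAVIATE_INSTANCE_URL and YourClassName with your own values.
request = build_delete_request("https://WEAVIATE_INSTANCE_URL", "YourClassName")
print(request.get_method(), request.full_url)

# Passing the request to urllib.request.urlopen() would send it and
# DELETE THE COLLECTION AND ALL ITS DATA, so that call is left out here.
```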
Add a property
Indexing limitations after data import
There are no index limitations when you add collection properties before you import data.
If you add a new property after you import data, there is an impact on indexing.
Property indexes are built at import time. If you add a new property after importing some data, the indexes for pre-existing objects aren't automatically updated to include the new property. This means pre-existing objects aren't added to the new property index. Queries may return unexpected results because the index only includes new objects.
To create an index that includes all of the objects in a collection, do one of the following:
- New collections: Add all of the collection's properties before importing objects.
- Existing collections: Export the existing data from the collection. Re-create it with the new property. Import the data into the updated collection.
We are working on a re-indexing API to allow you to re-index the data after adding a property. This will be available in a future release.
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Go
- Java
from weaviate.classes.config import Property, DataType
articles = client.collections.get("Article")
articles.config.add_property(
Property(
name="onHomepage",
data_type=DataType.BOOL
)
)
add_prop = {
"dataType": [
"boolean"
],
"name": "onHomepage"
}
client.schema.property.create("Article", add_prop)
let articles = client.collections.get('Article')
await articles.config.addProperty({
name: 'onHomepage',
dataType: 'boolean'
})
const className = 'Article';
const prop = {
dataType: ['boolean'],
name: 'onHomepage',
};
const response = await client.schema
.propertyCreator()
.withClassName(className)
.withProperty(prop)
.do();
console.log(JSON.stringify(response, null, 2));
package main
import (
"context"
"github.com/weaviate/weaviate-go-client/v5/weaviate"
"github.com/weaviate/weaviate/entities/models"
)
func main() {
cfg := weaviate.Config{
Host: "localhost:8080",
Scheme: "http",
}
client, err := weaviate.NewClient(cfg)
if err != nil {
panic(err)
}
prop := &models.Property{
DataType: []string{"boolean"},
Name: "onHomepage",
}
err = client.Schema().PropertyCreator().
WithClassName("Article").
WithProperty(prop).
Do(context.Background())
if err != nil {
panic(err)
}
}
Property property = Property.builder()
.dataType(Arrays.asList(DataType.BOOLEAN))
.name(propertyName)
.build();
Result<Boolean> result = client.schema().propertyCreator()
.withClassName(collectionName)
.withProperty(property)
.run();
Set inverted index parameters
Various inverted index parameters are configurable for each collection. Some parameters are set at the collection level, while others are set at the property level.
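The b and k1 values set in the examples for this section are the two tuning parameters of BM25 keyword scoring. To give a feel for what they control, the sketch below implements the textbook BM25 score for a single term. This is the standard formula, not Weaviate's implementation, and the function name and sample numbers are hypothetical. The defaults of 1.2 and 0.75 are commonly used values.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term to one document."""
    # Rarer terms (low doc_freq) get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b controls how strongly document length is normalized (0 = not at all).
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    # k1 controls how quickly repeated occurrences of the term saturate.
    return idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)


short_doc = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_doc = bm25_term_score(tf=2, doc_len=500, avg_doc_len=100, n_docs=1000, doc_freq=10)
print(short_doc > long_doc)  # True: with b > 0, longer documents are penalized
```

Raising k1 lets additional occurrences of a term keep adding to the score for longer, while lowering b reduces the penalty applied to long documents.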
- Python Client v4
- Python Client v3
- JS/TS Client v3
- JS/TS Client v2
- Java
- Go
from weaviate.classes.config import Configure, Property, DataType
client.collections.create(
"Article",
# Additional settings not shown
properties=[ # properties configuration is optional
Property(
name="title",
data_type=DataType.TEXT,
index_filterable=True,
index_searchable=True,
),
Property(
name="chunk",
data_type=DataType.TEXT,
index_filterable=True,
index_searchable=True,
),
Property(
name="chunk_no",
data_type=DataType.INT,
index_range_filters=True,
),
],
inverted_index_config=Configure.inverted_index( # Optional
bm25_b=0.7,
bm25_k1=1.25,
index_null_state=True,
index_property_length=True,
index_timestamps=True
)
)
class_obj = {
"class": "Article",
"properties": [
{
"name": "title",
"dataType": ["text"],
"indexFilterable": True,
"indexSearchable": True,
"moduleConfig": {
"text2vec-huggingface": {}
}
},
{
"name": "chunk",
"dataType": ["text"],
"indexFilterable": True,
"indexSearchable": True,
},
{
"name": "chunk_no",
"dataType": ["int"],
"indexRangeFilters": True,
},
],
"invertedIndexConfig": {
"bm25": {
"b": 0.7,
"k1": 1.25
},
"indexTimestamps": True,
"indexNullState": True,
"indexPropertyLength": True
}
}
client.schema.create_class(class_obj)
import { dataType } from 'weaviate-client';
await client.collections.create({
name: 'Article',
properties: [
{
name: 'title',
dataType: dataType.TEXT,
indexFilterable: true,
indexSearchable: true,
},
{
name: 'chunk',
dataType: dataType.TEXT,
indexFilterable: true,
indexSearchable: true,
},
{
name: 'chunk_no',
dataType: dataType.INT,
indexRangeFilters: true,
},
],
invertedIndex: {
bm25: {
b: 0.7,
k1: 1.25
},
indexNullState: true,
indexPropertyLength: true,
indexTimestamps: true
}
})
const classWithInvIndexSettings = {
class: 'Article',
properties: [
{
name: 'title',
dataType: ['text'],
indexFilterable: true,
indexSearchable: true,
},
{
name: 'chunk',
dataType: ['text'],
indexFilterable: true,
indexSearchable: true,
},
{
name: 'chunk_no',
dataType: ['int'],
indexRangeFilters: true,
},
],
invertedIndexConfig: {
bm25: {
b: 0.7,
k1: 1.25
},
indexTimestamps: true,
indexNullState: true,
indexPropertyLength: true
}
};
// Add the class to the schema
result = await client.schema
.classCreator()
.withClass(classWithInvIndexSettings)
.do();
// Create properties with specific indexing configurations
Property titleProperty = Property.builder()
.name("title")
.dataType(Arrays.asList(DataType.TEXT))
.indexFilterable(true)
.indexSearchable(true)
.build();
Property chunkProperty = Property.builder()
.name("chunk_no")
.dataType(Arrays.asList(DataType.INT))
.indexRangeFilters(true)
.build();
// Configure BM25 settings
BM25Config bm25Config = BM25Config.builder()
.b(0.7f)
.k1(1.25f)
.build();
// Configure inverted index with BM25 and other settings
InvertedIndexConfig invertedIndexConfig = InvertedIndexConfig.builder()
.bm25(bm25Config)
.indexNullState(true)
.indexPropertyLength(true)
.indexTimestamps(true)
.build();
// Create the Article collection with properties and inverted index configuration
WeaviateClass articleCollection = WeaviateClass.builder()
.className(collectionName)
.properties(Arrays.asList(titleProperty, chunkProperty))
.invertedIndexConfig(invertedIndexConfig)
.build();
// Add the collection to the schema
Result<Boolean> result = client.schema().classCreator()
.withClass(articleCollection)
.run();
vTrue := true
vFalse := false
articleClass := &models.Class{
Class: "Article",
Description: "Collection of articles",
Properties: []*models.Property{
{
Name: "title",
DataType: schema.DataTypeText.PropString(),
Tokenization: "lowercase",
IndexFilterable: &vTrue,
IndexSearchable: &vFalse,
},
{
Name: "chunk",
DataType: schema.DataTypeText.PropString(),
Tokenization: "word",
IndexFilterable: &vTrue,
IndexSearchable: &vTrue,
},
{
Name: "chunk_no",
DataType: schema.DataTypeInt.PropString(),
IndexRangeFilters: &vTrue,
},
},
InvertedIndexConfig: &models.InvertedIndexConfig{
Bm25: &models.BM25Config{
B: 0.7,
K1: 1.25,
},
IndexNullState: true,
IndexPropertyLength: true,
IndexTimestamps: true,
},
}
Further resources
- Manage collections: Vectorizer and vector index
- References: Configuration: Schema
- Concepts: Data structure
- API References: REST: Schema
Questions and feedback
If you have any questions or feedback, let us know in the user forum.