Quickstart
This quickstart shows you how to combine Weaviate Cloud and the Weaviate Embeddings service to:
- Set up a Weaviate Cloud instance. (10 minutes)
- Add and vectorize your data using Weaviate Embeddings. (10 minutes)
- Perform a semantic (vector) search and hybrid search. (10 minutes)
Notes:
- The code examples here are self-contained. You can copy and paste them into your own environment to try them out.
Requirements
To use Weaviate Embeddings, you will need:
- A Weaviate Cloud Sandbox running at least Weaviate 1.28.5
- A Weaviate client library that supports Weaviate Embeddings:
  - Python client version 4.9.5 or higher
  - JavaScript/TypeScript client version 3.2.5 or higher
  - Go/Java clients are not yet officially supported; you must pass the X-Weaviate-Api-Key and X-Weaviate-Cluster-Url headers manually upon instantiation as shown below.
Step 1: Set up Weaviate
1.1 Create a new cluster
To create a free Sandbox cluster in Weaviate Cloud, follow these instructions.
When possible, try to use the latest Weaviate version. New releases include cutting-edge features, performance enhancements, and critical security updates to keep your application safe and up-to-date.
1.2 Install a client library
We recommend using a client library to work with Weaviate. Follow the instructions below to install one of the official client libraries, available in Python, JavaScript/TypeScript, Go, and Java.
If a snippet doesn't work or you have feedback, please open a GitHub issue.
Install the latest Python client (v4) by adding weaviate-client to your Python environment with pip:
pip install -U weaviate-client
1.3 Connect to Weaviate Cloud
Weaviate Embeddings is integrated with Weaviate Cloud. Your Weaviate Cloud credentials will be used to authorize your Weaviate Cloud instance's access for Weaviate Embeddings.
import os

import weaviate
from weaviate.classes.init import Auth

# Best practice: store your credentials in environment variables
weaviate_url = os.getenv("WEAVIATE_URL")
weaviate_key = os.getenv("WEAVIATE_API_KEY")

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,  # Weaviate URL: "REST Endpoint" in Weaviate Cloud console
    auth_credentials=Auth.api_key(weaviate_key),  # Weaviate API key: "ADMIN" API key in Weaviate Cloud console
)

print(client.is_ready())  # Should print: `True`

# Work with Weaviate: run the code from the following steps here, while the client is still open

client.close()
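Because os.getenv returns None for an unset variable, a missing credential only surfaces later as a less obvious connection error. The sketch below shows one way to fail early with a clear message. Note that load_weaviate_credentials is a hypothetical helper written for this example, not part of the Weaviate client library; only the two environment variable names come from the snippet above.

```python
import os


def load_weaviate_credentials(env=None):
    """Return (url, api_key), or raise if either variable is unset or empty.

    Hypothetical helper for illustration; not part of weaviate-client.
    `env` defaults to os.environ and can be replaced with a dict for testing.
    """
    env = os.environ if env is None else env
    names = ("WEAVIATE_URL", "WEAVIATE_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["WEAVIATE_URL"], env["WEAVIATE_API_KEY"]
```

The returned pair can then be passed as cluster_url and to Auth.api_key(...) in the connection call shown above.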
Step 2: Populate the database
2.1 Define a collection
Now we can define a collection that will store our data. When creating a collection, you need to specify one of the available models for the vectorizer to use. This model will be used to create vector embeddings from your data.
from weaviate.classes.config import Configure, Property, DataType

client.collections.create(
    "DemoCollection",
    properties=[
        Property(name="title", data_type=DataType.TEXT),
    ],
    vector_config=[
        Configure.Vectors.text2vec_weaviate(
            name="title_vector",
            source_properties=["title"],
            model="Snowflake/snowflake-arctic-embed-l-v2.0",
            # Further options
            # dimensions=256
            # base_url="<custom_weaviate_embeddings_url>",
        )
    ],
    # Additional parameters not shown
)
For more information about the available model options visit the Choose a model page.
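If you run the collection definition a second time against the same cluster, the create call will typically fail because DemoCollection already exists. A minimal sketch of a guard follows, assuming the Python client v4 method client.collections.exists(name). The helper create_collection_if_missing and its create argument are hypothetical names introduced for this example.

```python
def create_collection_if_missing(client, name, create):
    """Call `create(client)` only when no collection called `name` exists.

    Hypothetical helper for illustration. `create` is a callable that wraps
    the client.collections.create(...) call shown earlier in this step.
    Returns True if the collection was created, False if it already existed.
    """
    if client.collections.exists(name):
        return False
    create(client)
    return True
```

Wrapping the creation call in a function keeps the guard independent of the collection's configuration, so the same helper works if you later change the model or add properties.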
2.2 Import objects
After configuring the vectorizer, import data into Weaviate. Weaviate generates embeddings for text objects using the specified model.
source_objects = [
    {"title": "The Shawshank Redemption", "description": "A wrongfully imprisoned man forms an inspiring friendship while finding hope and redemption in the darkest of places."},
    {"title": "The Godfather", "description": "A powerful mafia family struggles to balance loyalty, power, and betrayal in this iconic crime saga."},
    {"title": "The Dark Knight", "description": "Batman faces his greatest challenge as he battles the chaos unleashed by the Joker in Gotham City."},
    {"title": "Jingle All the Way", "description": "A desperate father goes to hilarious lengths to secure the season's hottest toy for his son on Christmas Eve."},
    {"title": "A Christmas Carol", "description": "A miserly old man is transformed after being visited by three ghosts on Christmas Eve in this timeless tale of redemption."},
]

collection = client.collections.use("DemoCollection")

with collection.batch.fixed_size(batch_size=200) as batch:
    for src_obj in source_objects:
        # The model provider integration will automatically vectorize the object
        batch.add_object(
            properties={
                "title": src_obj["title"],
                "description": src_obj["description"],
            },
            # vector=vector  # Optionally provide a pre-obtained vector
        )
        if batch.number_errors > 10:
            print("Batch import stopped due to excessive errors.")
            break

failed_objects = collection.batch.failed_objects
if failed_objects:
    print(f"Number of failed imports: {len(failed_objects)}")
    print(f"First failed object: {failed_objects[0]}")
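To make the batch_size setting more concrete, the sketch below groups a list into chunks of at most batch_size items. It is a conceptual illustration only and does not reflect how the client implements batching internally; chunked is a hypothetical helper. It shows that the five example objects fit comfortably into a single batch of 200, while a smaller size would split them across several batches.

```python
from itertools import islice


def chunked(items, batch_size):
    """Yield successive lists containing at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


titles = [
    "The Shawshank Redemption",
    "The Godfather",
    "The Dark Knight",
    "Jingle All the Way",
    "A Christmas Carol",
]
batches = list(chunked(titles, batch_size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```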
Step 3: Query your data
Once the vectorizer is configured, Weaviate will perform vector search operations using the specified model.
Vector (near text) search
When you perform a vector search, Weaviate converts the text query into an embedding using the specified model and returns the most similar objects from the database.
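As a rough illustration of what "most similar" means, the sketch below ranks a few titles by cosine similarity to a query vector. The three-dimensional vectors are made up for this example and were not produced by the embedding model. Cosine similarity is one common choice of metric; the metric a collection actually uses depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_n(query_vector, vectors_by_title, n):
    """Return the n titles whose vectors are most similar to the query."""
    ranked = sorted(
        vectors_by_title,
        key=lambda title: cosine_similarity(query_vector, vectors_by_title[title]),
        reverse=True,
    )
    return ranked[:n]


# Toy vectors, invented for illustration only
toy_vectors = {
    "The Godfather": [0.9, 0.1, 0.0],
    "Jingle All the Way": [0.1, 0.9, 0.2],
    "A Christmas Carol": [0.2, 0.8, 0.4],
}
toy_query = [0.0, 1.0, 0.3]  # stands in for the embedding of a query string

print(top_n(toy_query, toy_vectors, n=2))
```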
The query below returns the n most similar objects from the database, where n is set by the limit parameter.
collection = client.collections.use("DemoCollection")

response = collection.query.near_text(
    query="A holiday film",  # The model provider integration will automatically vectorize the query
    limit=2
)

for obj in response.objects:
    print(obj.properties["title"])
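A hybrid search can be sketched in much the same way, assuming the Python client v4 method collection.query.hybrid with its query, alpha, and limit parameters. Here alpha balances the two signals (0 leans fully on keyword matching, 1 fully on vector search), and the value 0.5 is an arbitrary choice for this example. The helper hybrid_titles is a hypothetical name, not part of the client library.

```python
def hybrid_titles(collection, query, limit=2, alpha=0.5):
    """Run a hybrid query and return the titles of the matching objects.

    Hypothetical helper for illustration. `collection` is expected to be the
    object returned by client.collections.use("DemoCollection").
    """
    response = collection.query.hybrid(query=query, alpha=alpha, limit=limit)
    return [obj.properties["title"] for obj in response.objects]
```

With an open client, calling hybrid_titles(client.collections.use("DemoCollection"), "A holiday film") would run the query against the collection created in Step 2.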
Next steps
Check out which additional models are available through Weaviate Embeddings.
Discover how hybrid search combines keyword matching and semantic search.
Support & feedback
For help with Shared Cloud and Dedicated Cloud, contact Weaviate support directly to open a support ticket. To add a support plan, contact Weaviate sales.
If you have any questions or feedback, let us know in the user forum.
