Creating vector embeddings
Overview
To use Aerospike Vector Search (AVS), you must build an application that generates vector embeddings. This page outlines some general approaches for generating vector embeddings using Python and a machine learning model.
Generate embeddings using a hosted service
Using a hosted service such as OpenAI provides ease of use, quick deployment, and access to cutting-edge models without significant infrastructure investment. It ensures scalability, automatic updates, and maintenance, allowing organizations to focus on application development rather than managing the underlying model infrastructure.
Install the OpenAI Python client library (version 1.0 or later) if you haven't already:
pip install openai
Use the following Python code to generate a vector embedding:
from openai import OpenAI
# Create the client with your OpenAI API key
# (omit the argument to read it from the OPENAI_API_KEY environment variable)
client = OpenAI(api_key="your-api-key-here")
# Define the text chunk for which you want to generate an embedding
text_chunk = "OpenAI's GPT-4 is a powerful language model capable of performing a wide range of natural language processing tasks."
# Generate the embedding
response = client.embeddings.create(
    input=text_chunk,
    model="text-embedding-ada-002",
)
# Extract the embedding vector
embedding_vector = response.data[0].embedding
# Print the embedding vector
print(embedding_vector)
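If you have many chunks to embed, the same endpoint accepts a list of strings and returns one embedding per input, which is typically faster and cheaper than sending one request per chunk. The following is a minimal sketch that reuses the client from the example above; the sample chunks are placeholders:
chunks = [
    "Aerospike Vector Search stores embeddings alongside your records.",
    "Vector embeddings map text to points in a high-dimensional space.",
]
# One request embeds every chunk; results come back in input order
response = client.embeddings.create(
    input=chunks,
    model="text-embedding-ada-002",
)
# Pair each chunk with its embedding vector
embeddings = [item.embedding for item in response.data]
for chunk, vector in zip(chunks, embeddings):
    print(f"{chunk[:40]}... -> {len(vector)} dimensions")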
Self-host an open-source model
Self-hosting a machine learning model offers enhanced data privacy, security, and control over the environment, making it easier to comply with regulatory requirements and optimize performance. It can also be more cost-effective in high-usage scenarios, eliminating dependency on third-party providers and reducing latency.
The following example shows how you can generate a vector embedding from a chunk of text using a LLaMA model. You can use the generated vector for downstream tasks such as similarity search and other vector computations; a similarity sketch follows the example below.
Install the required libraries:
pip install transformers
pip install torch
Use the following Python code to generate a vector embedding:
from transformers import AutoTokenizer, AutoModel
import torch
# Load the LLaMA model and tokenizer
model_name = "facebook/llama-7b" # Replace with the correct model name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
# Define the text chunk for which you want to generate an embedding
text_chunk = "LLaMA is a powerful language model capable of performing a wide range of natural language processing tasks."
# Tokenize the text chunk
inputs = tokenizer(text_chunk, return_tensors="pt")
# Generate the embeddings
with torch.no_grad():
    outputs = model(**inputs)
# The embeddings are typically in the 'last_hidden_state' tensor
embeddings = outputs.last_hidden_state
# Average the token embeddings to get a single vector representation
embedding_vector = torch.mean(embeddings, dim=1).squeeze().numpy()
# Print the embedding vector
print(embedding_vector)
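The averaging above works because the example embeds one chunk at a time. When you tokenize several chunks as a batch, the tokenizer pads shorter sequences, and the padded positions must be excluded from the mean or they skew the result. The following is a minimal sketch, reusing the tokenizer and model loaded above, that applies attention-mask-aware pooling and then compares two chunk embeddings with cosine similarity, a common distance measure for the similarity searches mentioned earlier; the sample chunks are placeholders:
import torch.nn.functional as F
chunks = [
    "LLaMA is a powerful language model.",
    "Large language models can embed text for similarity search.",
]
# LLaMA tokenizers often define no pad token, so fall back to the EOS token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
# Tokenize as a batch; shorter chunks are padded to the longest one
inputs = tokenizer(chunks, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
# Zero out padded positions before averaging so they don't affect the mean
mask = inputs["attention_mask"].unsqueeze(-1)           # (batch, seq_len, 1)
summed = (outputs.last_hidden_state * mask).sum(dim=1)  # (batch, hidden_size)
counts = mask.sum(dim=1)                                # (batch, 1)
embeddings = summed / counts                            # mean over real tokens only
# Cosine similarity between the two chunk embeddings (closer to 1.0 = more similar)
similarity = F.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(f"Cosine similarity: {similarity.item():.4f}")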
Additional resources
Hugging Face Model Hub: A comprehensive repository of pre-trained models for various natural language processing (NLP) tasks, computer vision, and more.
TensorFlow Hub: A library of reusable machine learning modules for TensorFlow, offering models for text, image, and audio processing.
PyTorch Hub: A repository of pre-trained PyTorch models, facilitating easy integration and deployment for various machine learning tasks.
Kaggle: An online community for data scientists and machine learning practitioners, providing datasets, notebooks, and pre-trained models.
GitHub: A vast platform hosting a multitude of open-source projects, including repositories for machine learning models, tools, and frameworks.
OpenVINO Model Zoo: A collection of pre-trained models optimized for Intel hardware, supporting various AI tasks.
Model Zoo for Caffe, TensorFlow, PyTorch, MXNet: Collections of pre-trained models specific to different deep learning frameworks, offering models for diverse applications.
NVIDIA NGC: A platform offering GPU-optimized deep learning frameworks and pre-trained models for various AI tasks.