Embeddings with pgvector
Learn vector embeddings, similarity search, and SQL-based retrieval with PostgreSQL pgvector extension. Master semantic search fundamentals for RAG and AI applications.
Learning Objectives
Master vector embeddings and SQL-based similarity search for building semantic search and retrieval systems.
Goal
Learn vector embeddings, similarity search, and SQL-based retrieval with PostgreSQL pgvector extension.
Scope
- Generate embeddings for text chunks
- Store embeddings in pgvector
- Run similarity search queries
Tech Stack
- FastAPI for REST endpoints
- OpenAI or local embedding models
- PostgreSQL + pgvector extension
Success Criteria
System Architecture
Three-tier architecture for generating, storing, and searching vector embeddings with PostgreSQL pgvector.
FastAPI Service
REST API layer handling embed and search requests with model integration
- POST /embed endpoint
- POST /search/pgvector endpoint
- OpenAI or local model client
Embedding Model
Model layer for generating vector representations of text chunks
- OpenAI text-embedding-3-small
- Local Sentence Transformers
- Configurable dimensions (384-1536)
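The embedding dimension is fixed by the model, and the VECTOR(n) column must match it exactly. A minimal sketch (the `vector_column_ddl` helper is ours; the dimension values are the published defaults for these two models):

```python
# Published output dimensions for the models mentioned above.
# The VECTOR(n) column in chunk_embeddings must match the model in use.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,  # OpenAI default
    "all-MiniLM-L6-v2": 384,         # Sentence Transformers
}

def vector_column_ddl(model_name: str) -> str:
    """Return the pgvector column definition sized for a given model."""
    dims = EMBEDDING_DIMENSIONS[model_name]
    return f"embedding VECTOR({dims})"

print(vector_column_ddl("text-embedding-3-small"))  # embedding VECTOR(1536)
```

Mixing embeddings from different models (or different dimension settings) in one column will either fail at insert time or silently corrupt search results, which is why the table below also records `model_name`.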
PostgreSQL + pgvector
Storage layer with pgvector extension for efficient vector operations
- pgvector extension enabled
- VECTOR column type support
- Distance operators (<->, <#>, <=>)
Data Model
Simple table schema for storing embeddings with pgvector VECTOR column type.
chunk_embeddings
Store vector representations of text chunks
Set Up the pgvector Extension
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE chunk_embeddings (
    chunk_id VARCHAR(255) PRIMARY KEY,
    embedding VECTOR(1536),
    created_at TIMESTAMP DEFAULT NOW(),
    model_name VARCHAR(100)
);

-- Create index for faster similarity search
CREATE INDEX ON chunk_embeddings USING ivfflat (embedding vector_cosine_ops);
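It helps to know how pgvector represents a vector at the SQL level: its text input format is a bracketed, comma-separated list such as `[0.1,0.2,0.3]`. A minimal serialization sketch (the `to_pgvector` helper name is ours), useful when inserting without a client-side adapter:

```python
def to_pgvector(embedding: list) -> str:
    """Serialize a Python list into pgvector's text input format,
    e.g. [0.012,-0.034,0.056], suitable for binding as a parameter."""
    return "[" + ",".join(repr(x) for x in embedding) + "]"

# The string can then be bound and cast in SQL:
#   INSERT INTO chunk_embeddings (chunk_id, embedding)
#   VALUES (%s, %s::vector)
print(to_pgvector([0.012, -0.034, 0.056]))  # [0.012,-0.034,0.056]
```

The client code later in this section uses the pgvector Python adapter instead, which handles this conversion for you.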
Embedding Generation
Generate vector embeddings for text chunks using OpenAI or local models and persist to PostgreSQL.
OpenAI Embeddings
Use OpenAI's embedding API for high-quality vector representations
from openai import OpenAI

client = OpenAI(api_key="...")

def generate_embedding(text: str):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Returns a 1536-dimensional vector
embedding = generate_embedding("...")
# [0.012, -0.034, 0.056, ...]

Local Models
Use Sentence Transformers for offline embedding generation
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

def generate_embedding(text: str):
    embedding = model.encode(text)
    return embedding.tolist()

# Returns a 384-dimensional vector
embedding = generate_embedding("...")
# [0.023, -0.045, 0.067, ...]

Store Embeddings in pgvector
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("postgresql://...")
register_vector(conn)
cursor = conn.cursor()

def save_embedding(chunk_id: str, embedding: list, model: str):
    cursor.execute("""
        INSERT INTO chunk_embeddings (chunk_id, embedding, model_name)
        VALUES (%s, %s, %s)
        ON CONFLICT (chunk_id) DO UPDATE
        SET embedding = EXCLUDED.embedding
    """, (chunk_id, embedding, model))
    conn.commit()

save_embedding("chunk-123", embedding, "text-embedding-3-small")

Similarity Search
Run SQL-based vector similarity searches using pgvector distance operators to find semantically similar content.
SQL Vector Search Query
Find top-k most similar chunks using distance operators
SELECT chunk_id, embedding <-> %s AS distance
FROM chunk_embeddings
ORDER BY embedding <-> %s
LIMIT 5;
-- Note: <-> is L2 distance. To take advantage of the
-- vector_cosine_ops index created above, use <=> (cosine) instead.
Query Embedding
Generate embedding for search query text
query_embedding = generate_embedding(query)
Distance Calculation
Compare query vector with stored embeddings
Returns smallest distances first
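The ORDER BY ... LIMIT pattern is just a nearest-neighbor ranking: compute a distance per row, sort ascending, keep the top k. A minimal in-memory sketch of what the database does (toy 2-D vectors; the helper names are ours):

```python
import math

def l2_distance(a, b):
    """Straight-line distance, what pgvector's <-> operator computes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search_similar_in_memory(query_vec, stored, limit=5):
    """Rank (chunk_id, vector) pairs by distance to the query, smallest first."""
    scored = [(chunk_id, l2_distance(query_vec, vec)) for chunk_id, vec in stored]
    scored.sort(key=lambda pair: pair[1])
    return scored[:limit]

stored = [("1", [1.0, 0.0]), ("5", [0.8, 0.6]), ("3", [0.0, 1.0])]
print(search_similar_in_memory([1.0, 0.1], stored, limit=2))
```

An ivfflat index lets PostgreSQL skip the exhaustive scan this sketch performs, at the cost of approximate (rather than exact) results.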
Python Search Implementation
def search_similar(query: str, limit: int = 5):
    # Generate query embedding
    query_embedding = generate_embedding(query)

    # Execute similarity search (<-> is L2 distance; switch to <=>
    # if you want the vector_cosine_ops index to be used)
    cursor.execute("""
        SELECT chunk_id, embedding <-> %s AS distance
        FROM chunk_embeddings
        ORDER BY embedding <-> %s
        LIMIT %s
    """, (query_embedding, query_embedding, limit))

    results = cursor.fetchall()
    return [
        {"chunkId": row[0], "score": float(row[1])}
        for row in results
    ]

# Search for similar chunks
results = search_similar("explain neural networks")
# Returns: [{"chunkId": "1", "score": 0.12}, ...]

Distance Metrics
Understand the difference between cosine distance and L2 distance for vector similarity calculations.
Cosine Distance (<=>)
Measures the angle between vectors, normalized by magnitude
L2 Distance (<->, Euclidean)
Measures the straight-line distance between vector points
Inner Product Distance (<#>)
Negative dot product, useful for maximum inner product search (MIPS) scenarios
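All three operators reduce to simple formulas: cosine distance is 1 minus cosine similarity, and <#> negates the dot product so that smaller still means "closer". A plain-Python sketch:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):          # pgvector: <->
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):      # pgvector: <=>
    return 1 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def neg_inner_product(a, b):    # pgvector: <#>
    return -dot(a, b)

a, b = [1.0, 0.0], [0.0, 1.0]   # orthogonal unit vectors
print(l2_distance(a, b))        # sqrt(2) ~ 1.414
print(cosine_distance(a, b))    # 1.0
print(neg_inner_product(a, b))  # -0.0
```

For unit-length embeddings (OpenAI's are normalized to length 1), cosine distance and negative inner product produce the same ranking, so the choice there mainly affects which index you build.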
SELECT chunk_id,
       embedding <-> %s AS l2,
       embedding <=> %s AS cosine,
       embedding <#> %s AS inner_product
FROM chunk_embeddings
LIMIT 5;
REST API
Two simple FastAPI endpoints for generating embeddings and performing similarity searches.
POST /embed
Generate and store embedding
Request:
{
  "chunkId": "chunk-123",
  "text": "Machine learning basics"
}

Response:
{
  "chunkId": "chunk-123",
  "dimensions": 1536,
  "model": "text-embedding-3-small"
}

POST /search/pgvector
Find similar chunks
Request:
{
  "query": "explain neural networks",
  "limit": 5,
  "distance": "cosine"
}

Response:
{
  "results": [
    {"chunkId": "1", "score": 0.12},
    {"chunkId": "5", "score": 0.18}
  ]
}

FastAPI Implementation
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class EmbedRequest(BaseModel):
    chunkId: str
    text: str

class SearchRequest(BaseModel):
    query: str
    limit: int = 5
    distance: str = "cosine"

@app.post("/embed")
def embed_text(req: EmbedRequest):
    embedding = generate_embedding(req.text)
    save_embedding(req.chunkId, embedding, "text-embedding-3-small")
    return {
        "chunkId": req.chunkId,
        "dimensions": len(embedding),
        "model": "text-embedding-3-small"
    }

@app.post("/search/pgvector")
def search_pgvector(req: SearchRequest):
    results = search_similar(req.query, req.limit)
    return {"results": results}

Learning Benefits
Master essential skills for building semantic search and RAG applications.
Definition of Done
Ready to build semantic search?
This block provides the foundation for building RAG applications, semantic search engines, and AI-powered recommendation systems with PostgreSQL pgvector.