How I Built My RAG System With Supabase and pgvector
By Neel Vora
This post walks through how I built my RAG system with Supabase and pgvector, and where it fits in the rest of my work.
Years of building content management systems for government agencies gave me a deep appreciation for how organizations structure knowledge - patterns that directly informed this RAG architecture.
This was one of the key AI engineering projects in my portfolio.
It demonstrates real retrieval-augmented generation with:
- Document ingestion
- Chunking
- Embeddings
- Vector storage
- Similarity search
- Streaming model responses
Document ingestion
Users can add documents through a panel. The server:
- Validates the input
- Splits the text into chunks
- Generates embeddings
- Stores them in Supabase
Chunking logic
Each chunk is about 500 characters with 50 characters of overlap. This keeps each chunk focused enough to embed as a coherent unit, while the overlap preserves context across chunk boundaries.
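The chunking above can be sketched as a sliding window. This is an illustrative helper, not the project's actual implementation - the function name and signature are my own:

```typescript
// Split text into fixed-size chunks with overlap (illustrative sketch).
// Defaults mirror the post: ~500-character chunks, 50-character overlap.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
    start += size - overlap; // step forward, keeping `overlap` chars of context
  }
  return chunks;
}
```

With size 4 and overlap 1, `chunkText("abcdefghij", 4, 1)` yields `["abcd", "defg", "ghij"]` - each chunk repeats the last character of the previous one.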
pgvector
I used a Supabase Postgres table with a vector column.
The similarity search uses:
match_rag_chunks(query_embedding)
This returns rows ordered by cosine similarity.
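To make the ranking concrete: pgvector computes cosine distance inside Postgres (via its `<=>` operator), but the ordering it produces is equivalent to this pure TypeScript sketch. Both function names here are illustrative, not part of the project:

```typescript
// Cosine similarity between two equal-length vectors (illustrative;
// in production this math runs inside Postgres via pgvector).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored chunks by similarity to the query embedding, best first -
// the same ordering match_rag_chunks returns.
function rankChunks(
  query: number[],
  rows: { content: string; embedding: number[] }[]
) {
  return [...rows].sort(
    (r1, r2) =>
      cosineSimilarity(query, r2.embedding) - cosineSimilarity(query, r1.embedding)
  );
}
```

Identical vectors score 1, orthogonal vectors score 0, so the closest chunks sort to the front.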
Query flow
- Embed the user query
- Fetch the top matching chunks
- Feed them into GPT-4o mini
- Stream the response to the browser
- Show citation sources
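The five steps above can be sketched as one orchestration function. The external calls are injected as parameters so the flow is testable without OpenAI or Supabase - `embed`, `matchChunks`, and `complete` are hypothetical names, not the project's actual API:

```typescript
type Chunk = { content: string; source: string };

// Query flow sketch: embed -> fetch top chunks -> prompt the model ->
// consume the streamed response -> return answer plus citation sources.
async function answerQuery(
  query: string,
  deps: {
    embed: (text: string) => Promise<number[]>;
    matchChunks: (embedding: number[], limit: number) => Promise<Chunk[]>;
    complete: (prompt: string) => AsyncIterable<string>;
  }
): Promise<{ answer: string; sources: string[] }> {
  const embedding = await deps.embed(query);           // 1. embed the user query
  const chunks = await deps.matchChunks(embedding, 5); // 2. top matching chunks
  const context = chunks.map((c) => c.content).join("\n---\n");
  const prompt = `Answer using only this context:\n${context}\n\nQuestion: ${query}`;
  let answer = "";
  for await (const token of deps.complete(prompt)) {   // 3-4. stream the model output
    answer += token; // a real server would flush each token to the browser here
  }
  return { answer, sources: chunks.map((c) => c.source) }; // 5. citation sources
}
```

Injecting the dependencies like this also makes it easy to swap models or vector stores without touching the orchestration.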
Why this matters
This is production-quality RAG in a real environment, combining retrieval system design, vector search pipelines, and streaming AI interfaces.
Keep exploring
From here you can:
- See how I applied similar patterns on the DSHS Field Guide.
- Try the RAG Knowledge Search demo at /ai/rag.
- Visit /neel-vora for more background about me and my work.
- Browse more posts on the blog.
Thanks for reading! If you found this useful, check out my other posts or explore the live demos in my AI Lab.