Building RAG Knowledge Search for My Portfolio
This is how I built the RAG system that powers the "Ask about my work" feature on this site. It uses OpenAI embeddings, a simple vector search layer, and a citation system that links every answer back to its source material.
The document-centric thinking behind this RAG system draws from my CMS work, where understanding content hierarchies and metadata was essential.
I wanted my portfolio to show real AI engineering work, not just marketing bullets. RAG Knowledge Search is the first project inside my AI Lab: a retrieval-augmented generation system that can answer questions about a small knowledge base and show exactly which sources it used.
This post walks through why I built it, the stack I used, and some of the tradeoffs I made for a portfolio-friendly version one.
Why start with RAG
RAG keeps popping up in real client conversations. People want chat based interfaces that can answer questions about their docs, their product, or their internal knowledge without hallucinating all over the place.
As a web engineer, I wanted a project that sits right at the intersection of what I already do well and where the industry is heading:
- Next.js and TypeScript for the frontend and API routes
- OpenAI for embeddings and generation
- A simple vector search layer that I can later swap out for a more serious database
- An interface that feels like a real product, not a toy demo
RAG Knowledge Search lives at /ai/rag on my site, and it is wired like a real feature, not a separate playground.
High level architecture
Here is the shape of the system in version one:
- Next.js 16 App Router page at /ai/rag with a chat-style UI
- API route at /api/ai/rag/query that handles embedding, search, and generation
- A small knowledge base stored as structured chunks in memory on the server
- Optional user added documents that are embedded and stored in a shared store during the life of the process
- OpenAI embeddings (text-embedding-3-small) for both the base knowledge and user docs
- GPT-4o mini for answering questions using the retrieved context
The important part is that the whole flow is explicit and inspectable. The UI shows which chunks were used, how relevant they were, and what the final answer was.
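To make that flow concrete, here is a minimal sketch of what the query route could look like, assuming a plain in-memory array of pre-embedded chunks and cosine similarity ranking. Names like KNOWLEDGE_BASE, the top-k cutoff, and the exact prompt are illustrative, not the production code.

```ts
// app/api/ai/rag/query/route.ts — illustrative sketch, not the exact implementation
import OpenAI from "openai";

const openai = new OpenAI();

type Chunk = { id: string; source: string; text: string; embedding: number[] };

// In the real app this is filled with pre-embedded knowledge blocks at startup
const KNOWLEDGE_BASE: Chunk[] = [];

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

export async function POST(req: Request) {
  const { question } = await req.json();

  // 1. Embed the question with the same model used for the knowledge base
  const embedded = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });
  const queryEmbedding = embedded.data[0].embedding;

  // 2. Rank chunks by cosine similarity and keep the top few
  const topChunks = KNOWLEDGE_BASE
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 4);

  // 3. Ask the model to answer from the retrieved context only
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "Answer using only the provided context. If the context is not enough, say so.",
      },
      {
        role: "user",
        content: `Context:\n${topChunks.map((t) => t.chunk.text).join("\n---\n")}\n\nQuestion: ${question}`,
      },
    ],
  });

  // 4. Return the answer plus the chunks that were used, so the UI can render citations
  return Response.json({
    answer: completion.choices[0].message.content,
    sources: topChunks.map((t) => ({ id: t.chunk.id, source: t.chunk.source, score: t.score })),
  });
}
```

The design choice that matters here is returning the sources alongside the answer; that is what lets the UI show which chunks were used and how relevant they were.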
Chunking and embeddings
RAG falls apart if your context is messy, so I started with a very simple but explicit chunking approach.
When I add static knowledge to the system, I store it as small text blocks. For user documents, the ingestion route does the following:
- splits the uploaded text into similarly small blocks
- embeds each block with text-embedding-3-small
- adds the resulting vectors to the shared in-memory store, where they live for the lifetime of the process
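Here is a rough sketch of that ingestion step, assuming a naive paragraph-based chunker. The names addDocument, chunkText, and documentStore are hypothetical, and the real route runs as a Next.js API handler rather than a bare function.

```ts
// Illustrative sketch of the ingestion step for user documents
import OpenAI from "openai";

const openai = new OpenAI();

type StoredChunk = { docId: string; text: string; embedding: number[] };

// Shared in-memory store; it only lives for the lifetime of the server process
const documentStore: StoredChunk[] = [];

// Naive chunker: split on blank lines, then pack paragraphs into blocks of roughly maxChars
function chunkText(text: string, maxChars = 800): string[] {
  const paragraphs = text.split(/\n\s*\n/).map((p) => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    if (current && (current.length + paragraph.length) > maxChars) {
      chunks.push(current);
      current = paragraph;
    } else {
      current = current ? `${current}\n\n${paragraph}` : paragraph;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

export async function addDocument(docId: string, text: string) {
  const blocks = chunkText(text);

  // Embed every block with the same model used for the static knowledge base
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: blocks,
  });

  // Store each block alongside its embedding so the query route can search it
  for (const item of response.data) {
    documentStore.push({ docId, text: blocks[item.index], embedding: item.embedding });
  }
}
```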
Thanks for reading! If you found this useful, check out my other posts or explore the live demos in my AI Lab.