This project lets you chat with your documents! It first processes your documents by splitting them into smaller pieces (chunks) and converting each chunk into a numeric vector called an embedding. The chunks and their embeddings are stored in a vector database. When you ask a question, the system embeds the question, retrieves the chunks whose embeddings are most similar to it, and passes those chunks to an AI model, which generates an answer based only on them.
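To make the flow described above concrete, here is a minimal, dependency-free sketch in Python. It is not this project's actual code: the word-count `embed` function stands in for a real embedding model, `InMemoryVectorStore` stands in for Chroma DB, and every name in it (`split_into_chunks`, `build_prompt`, and so on) is hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=8):
    """Toy splitter: group the text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database such as Chroma DB."""

    def __init__(self):
        self.records = []  # list of (chunk, embedding) pairs

    def add(self, chunks):
        for chunk in chunks:
            self.records.append((chunk, embed(chunk)))

    def query(self, question, top_k=2):
        """Return the top_k chunks most similar to the question."""
        question_embedding = embed(question)
        ranked = sorted(
            self.records,
            key=lambda record: cosine_similarity(question_embedding, record[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, chunks):
    """Assemble a prompt that restricts the model to the retrieved chunks."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Ingestion: split the document and store chunks with their embeddings.
document = (
    "Chroma stores embeddings and chunks for fast similarity search. "
    "Bananas are yellow and grow in tropical climates."
)
store = InMemoryVectorStore()
store.add(split_into_chunks(document))

# Query: retrieve the most relevant chunk and build the prompt for the AI model.
question = "What stores embeddings and chunks?"
relevant = store.query(question, top_k=1)
prompt = build_prompt(question, relevant)
print(prompt)
```

In a real setup, the prompt would be sent to a language model, and the word-count vectors would be replaced by dense embeddings from a trained model. The overall shape (split, embed, store, retrieve, prompt) is what the chapters below walk through.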

Visual Overview

[Figure: image.png]

Chapters

Chapter 1: Vector Database (Chroma DB)

Chapter 2: Text Embeddings

Chapter 3: Text Splitting (Chunking)

Chapter 4: Data Ingestion Pipeline

Chapter 5: Query Processing

Chapter 6: Prompt Engineering