RAG App

Building a RAG App

Building a RAG app today is like building a CRUD app. But building a performant and scalable RAG app is a bit more difficult.

In this blog post I'll share everything that I've learnt about RAG.

The most important steps in RAG are:

Ingestion
Retrieval
Generation

Data Preparation

When chunking the data, we can attach some metadata to each chunk, so filtering during retrieval becomes super easy.
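
Here's a minimal sketch of that idea. The `Chunk` class, the `filter_chunks` helper and the metadata keys (`source`, `section`) are all made up for illustration; a real vector database would have its own way of storing and filtering metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of text plus arbitrary metadata attached at chunking time."""
    text: str
    metadata: dict = field(default_factory=dict)


def filter_chunks(chunks, **conditions):
    """Keep only chunks whose metadata matches every key=value condition."""
    return [
        chunk for chunk in chunks
        if all(chunk.metadata.get(key) == value for key, value in conditions.items())
    ]


# Hypothetical chunks; the metadata keys are assumptions, not a standard schema.
chunks = [
    Chunk("Refunds are processed within 5 days.", {"source": "faq.md", "section": "billing"}),
    Chunk("Reset your password from the login page.", {"source": "faq.md", "section": "account"}),
    Chunk("Invoices are emailed monthly.", {"source": "billing.md", "section": "billing"}),
]

# Narrow the search space before (or after) the similarity search.
billing_chunks = filter_chunks(chunks, section="billing")
```

With metadata in place, a query about billing only has to be compared against the billing chunks instead of the whole corpus.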

Eval Driven Development

Eval Driven Development is basically solving problems one at a time by defining clear problem statements.

Evals check whether the pipeline can handle each query scenario before it hits production, and can save you from a lot of embarrassment.
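
To make this concrete, here is a rough sketch of a tiny eval loop. Everything in it is an assumption for illustration: `keyword_retriever` is a toy stand-in for a real retriever, and the eval case format (`query`, `expected_doc`) is just one possible way to write a problem statement down.

```python
def keyword_retriever(query, documents):
    """Toy stand-in for a real retriever: rank docs by number of shared words."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(text.lower().split())), doc_id)
        for doc_id, text in documents.items()
    ]
    # Highest overlap first; ties broken by doc id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]


def run_evals(retriever, documents, eval_cases, k=1):
    """Return the hit rate and the cases where the expected doc is not in the top k."""
    failures = []
    for case in eval_cases:
        top_k = retriever(case["query"], documents)[:k]
        if case["expected_doc"] not in top_k:
            failures.append(case)
    hit_rate = 1 - len(failures) / len(eval_cases)
    return hit_rate, failures


# Hypothetical corpus and eval cases, one case per query scenario we care about.
documents = {
    "refunds": "refunds are processed within five days",
    "passwords": "reset your password from the login page",
}
eval_cases = [
    {"query": "how do I reset my password", "expected_doc": "passwords"},
    {"query": "when are refunds processed", "expected_doc": "refunds"},
]

hit_rate, failures = run_evals(keyword_retriever, documents, eval_cases)
print(f"hit rate: {hit_rate:.0%}, failures: {len(failures)}")
```

Each time a new failure shows up, it becomes another entry in `eval_cases`, so the same problem can't silently come back.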

Data Ingestion

Ingestion is the first step in RAG. It involves breaking down the data and creating a database.

It involves:

chunking
vector embeddings

An embedding model converts text into dense numerical vectors. Cosine similarity then lets us find things that are close together in this high-dimensional space.
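
As a quick illustration, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below uses plain Python and made-up 2-dimensional vectors; real embeddings have hundreds of dimensions and you would normally use a numeric library for this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here: a zero vector is similar to nothing.
    return dot / (norm_a * norm_b)


# Toy "embeddings", invented for the example.
query = [0.9, 0.1]
docs = {"cats": [1.0, 0.0], "stocks": [0.0, 1.0]}

# The closest document is the one with the highest cosine similarity.
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
```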

Chunking Strategies

Fixed Sized Chunking.
Chunking with overlap.
Semantic Chunking with Sentence Transformers.
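
The first two strategies can be sketched with one small function. This version splits on characters, which is an assumption made to keep the example short; splitting on tokens or words follows the same pattern. The name `chunk_fixed` is made up for this post.

```python
def chunk_fixed(text, chunk_size, overlap=0):
    """Split text into chunks of chunk_size characters.

    With overlap > 0, each chunk repeats the last `overlap` characters
    of the previous one, so context is not cut off at chunk boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # This chunk already reached the end of the text.
        start += step
    return chunks


print(chunk_fixed("abcdefghij", 4))             # fixed size, no overlap
print(chunk_fixed("abcdefghij", 4, overlap=2))  # neighbouring chunks share 2 characters
```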

Steps Breakdown

Split into sentences - We can use the spaCy library to split the raw text into sentences, which makes this step much easier.
Generate embeddings - Using sentence transformers, we convert these sentences into numeric embeddings.
Calculate similarities - Find the cosine similarity between the embeddings of neighbouring sentences.
Decide the chunk threshold - Start a new chunk wherever the similarity drops below the threshold.
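
The steps above can be sketched roughly like this. To keep the example runnable without any downloads, it takes sentences and embeddings that are already computed; in a real pipeline the sentences would come from spaCy and the embeddings from a sentence transformers model. The `semantic_chunks` name, the default threshold of 0.5 and the toy vectors are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(sentences, embeddings, threshold=0.5):
    """Group consecutive sentences into chunks.

    A new chunk starts whenever the similarity between a sentence and the
    one before it falls below `threshold`.
    """
    if len(sentences) != len(embeddings):
        raise ValueError("need exactly one embedding per sentence")
    if not sentences:
        return []

    groups = [[sentences[0]]]
    for i in range(1, len(sentences)):
        similarity = cosine_similarity(embeddings[i - 1], embeddings[i])
        if similarity < threshold:
            groups.append([sentences[i]])   # topic changed: start a new chunk
        else:
            groups[-1].append(sentences[i])  # same topic: extend current chunk
    return [" ".join(group) for group in groups]


# Toy sentences with hand-written 2-d "embeddings".
sentences = ["Cats purr.", "Cats nap a lot.", "Stocks fell today."]
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]

chunks = semantic_chunks(sentences, embeddings)
```

The threshold is the knob to tune: a higher value gives smaller, more focused chunks, a lower value gives larger ones.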

Creating Embeddings ( Encoders )

Embeddings are data encoded in a high-dimensional space. The higher the dimension, the more nuanced the information an embedding can capture, but the more compute resources it requires.

There are a lot of embedding models available. In this tutorial I'll use the Hugging Face sentence transformers model all-mpnet-base-v2 (our focus is not on embeddings, but on creating a RAG app).
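
A minimal sketch of how this could look with the `sentence-transformers` package. The helper names `load_model` and `embed_texts` are made up for this post; the import is done lazily inside `load_model` so the rest of the code can be tried out even without the package installed, and the model weights are downloaded on first use.

```python
def load_model(name="sentence-transformers/all-mpnet-base-v2"):
    """Load the embedding model (requires `pip install sentence-transformers`)."""
    # Imported lazily so this module can be imported without the package.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)


def embed_texts(texts, model):
    """Encode texts with any model that exposes an `encode` method.

    normalize_embeddings=True scales each vector to unit length, so a plain
    dot product between two embeddings equals their cosine similarity.
    """
    return model.encode(list(texts), normalize_embeddings=True)


# Example usage (downloads the model the first time it runs):
# model = load_model()
# vectors = embed_texts(["What is RAG?", "How do I chunk documents?"], model)
# print(vectors.shape)
```

For all-mpnet-base-v2 each text should come back as a 768-dimensional vector, so two input texts give an array with two rows.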