RAG App
Building a RAG App
Building a RAG app today is like building a CRUD app. But building a performant and scalable RAG app is a bit more difficult.
In this blog post, I'll share everything I've learned about RAG.
The most important steps in RAG are:
- Ingestion
- Retrieval
- Generation
Data Preparation
- When chunking the data, we can attach some metadata to each chunk, so filtering at retrieval time becomes super easy.
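To make the metadata idea concrete, here is a minimal sketch in plain Python. The `Chunk` class, the metadata field names (`source`, `section`, `position`), and the sample text are all made up for illustration; a real vector database would usually offer its own metadata filter.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    # Free-form metadata attached at chunking time (field names are illustrative).
    metadata: dict = field(default_factory=dict)


def chunk_with_metadata(text: str, source: str, section: str, size: int = 200) -> list[Chunk]:
    """Split text into fixed-size pieces and tag each piece with metadata."""
    return [
        Chunk(
            text=text[start:start + size],
            metadata={"source": source, "section": section, "position": index},
        )
        for index, start in enumerate(range(0, len(text), size))
    ]


def filter_chunks(chunks: list[Chunk], **conditions) -> list[Chunk]:
    """Keep only chunks whose metadata matches every key=value condition."""
    return [
        chunk
        for chunk in chunks
        if all(chunk.metadata.get(key) == value for key, value in conditions.items())
    ]


# Hypothetical sample documents, repeated to get more than one chunk each.
chunks = (
    chunk_with_metadata("Refunds are processed within 5 days. " * 10, "faq.md", "billing")
    + chunk_with_metadata("Reset your password from the login page. " * 10, "faq.md", "account")
)

# At retrieval time, narrow the candidates before doing any similarity search.
billing_only = filter_chunks(chunks, section="billing")
```

Filtering on metadata first keeps the similarity search small and avoids pulling in chunks from unrelated sections.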
Eval Driven Development
Eval Driven Development basically means solving problems one at a time by defining clear problem statements.
Evals check whether the pipeline can handle each query scenario before it hits production, and can save you from a lot of embarrassment.
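As a rough sketch of what an eval could look like, the snippet below runs a few query scenarios through a pipeline and records which ones pass. Everything here is an assumption for illustration: `EvalCase`, `run_evals`, the substring check, and `toy_rag`, which only stands in for a real RAG pipeline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    name: str
    query: str
    must_contain: list[str]  # substrings the answer is expected to include


def run_evals(answer_fn: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case through the pipeline and record pass/fail per case."""
    results = {}
    for case in cases:
        answer = answer_fn(case.query).lower()
        results[case.name] = all(term.lower() in answer for term in case.must_contain)
    return results


# Hypothetical stand-in for a real RAG pipeline, only here to make the sketch runnable.
def toy_rag(query: str) -> str:
    if "refund" in query.lower():
        return "Refunds are processed within 5 days."
    return "I don't know."


# Each case is one problem statement: a query scenario plus what a good answer contains.
cases = [
    EvalCase("refund_window", "How long do refunds take?", ["5 days"]),
    EvalCase("password_reset", "How do I reset my password?", ["login page"]),
]

results = run_evals(toy_rag, cases)
failed = [name for name, passed in results.items() if not passed]
```

Here `failed` holds the scenarios to fix next, one at a time. Substring matching is a crude check; it is just the simplest thing that can fail before production does.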
Data Ingestion
Ingestion is the first step in RAG. It means breaking down the data and storing it in a database.
It involves:
- chunking
- vector embeddings
An embedding model converts text into dense numerical vectors. Cosine similarity then lets us find things that sit close together in this high-dimensional space.
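Cosine similarity is just the dot product of two vectors divided by the product of their lengths. Here is a small sketch in plain Python; the 3-dimensional vectors and document names are made-up toy values, not real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: list[float], docs: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, cosine_similarity(query, vector)) for name, vector in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy "embeddings" with 3 dimensions; real models produce hundreds of dimensions.
docs = {
    "cat care": [1.0, 0.2, 0.0],
    "dog training": [0.8, 0.6, 0.0],
    "tax filing": [0.0, 0.1, 1.0],
}
query = [0.9, 0.1, 0.0]

ranked = rank_by_similarity(query, docs)
```

In this toy setup "cat care" ranks first because its vector points in almost the same direction as the query.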
Chunking Strategies
- Fixed-Size Chunking.
- Chunking with overlap.
- Semantic Chunking with Sentence Transformers.
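The first two strategies are easy to sketch in plain Python. This is a character-based version under my own assumptions (function names and sizes are illustrative); real pipelines often count tokens instead of characters.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Cut the text every `size` characters, with no shared context between chunks."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


def overlapping_chunks(text: str, size: int, overlap: int) -> list[str]:
    """Like fixed-size chunking, but each chunk repeats the last `overlap` characters
    of the previous one, so content cut at a boundary appears whole somewhere."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


sample = "abcdefghij"
plain = fixed_size_chunks(sample, 4)          # ['abcd', 'efgh', 'ij']
overlapped = overlapping_chunks(sample, 4, 2)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Overlap costs extra storage and embedding calls, but it lowers the chance that an answer gets split across two chunks.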
Steps Breakdown
- Split the text into sentences. The spaCy library makes converting raw text into sentences easier.
- Generate embeddings: using Sentence Transformers, we convert these sentences into numeric embeddings.
- Calculate the cosine similarity between the sentence embeddings.
- Decide the chunk threshold.
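The steps above can be sketched roughly like this. It is one possible reading of semantic chunking, not the only one: consecutive sentences stay in the same chunk while their similarity is at or above a threshold. The helper names and the `0.5` threshold are my own assumptions, and the spaCy and Sentence Transformers helpers import lazily so the core logic can be tried with toy vectors first.

```python
import math
from typing import Callable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_sentences(text: str) -> list[str]:
    """Step 1: sentence splitting with spaCy (needs the en_core_web_sm model installed)."""
    import spacy

    nlp = spacy.load("en_core_web_sm")
    return [sent.text.strip() for sent in nlp(text).sents]


def embed_sentences(sentences: list[str]):
    """Step 2: one embedding per sentence with Sentence Transformers."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-mpnet-base-v2")
    return model.encode(sentences)


def semantic_chunks(
    sentences: list[str],
    embed: Callable[[list[str]], Sequence[Sequence[float]]],
    threshold: float = 0.5,  # illustrative value; tune it on your own data
) -> list[str]:
    """Steps 3 and 4: compare neighbours and start a new chunk when similarity drops."""
    if not sentences:
        return []
    vectors = embed(sentences)
    groups = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine_similarity(vectors[i - 1], vectors[i]) >= threshold:
            groups[-1].append(sentences[i])
        else:
            groups.append([sentences[i]])
    return [" ".join(group) for group in groups]


# Toy run with hand-written 2-d vectors instead of a real model.
toy_vectors = {"Cats purr.": [1.0, 0.0], "Cats nap a lot.": [0.9, 0.1], "Taxes are due.": [0.0, 1.0]}
toy_sentences = list(toy_vectors)
toy_chunks = semantic_chunks(toy_sentences, lambda sents: [toy_vectors[s] for s in sents])
```

With the real libraries installed, the same function would be called as `semantic_chunks(split_sentences(text), embed_sentences)`.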
Creating Embeddings ( Encoders )
Embeddings are data encoded in a high-dimensional space. The higher the dimension, the more nuanced the information the encoding can capture, but it also requires more compute resources.
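A quick back-of-the-envelope calculation shows how dimension drives cost. This sketch assumes float32 values (4 bytes each) and ignores any index overhead; the one million vectors and the dimension values are just illustrative.

```python
def index_size_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw storage for dense vectors, ignoring any index or metadata overhead."""
    return num_vectors * dimensions * bytes_per_value


for dims in (384, 768, 1536):
    gigabytes = index_size_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims -> {gigabytes:.2f} GB for 1M vectors")
```

Doubling the dimension doubles the storage, and the work per similarity comparison grows the same way.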
There are a lot of embedding models on the market. In this tutorial I'll use the
Hugging Face Sentence Transformers model all-mpnet-base-v2 (our focus is not
on embeddings, but on creating a RAG app).
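Here is a minimal sketch of creating embeddings with that model, assuming the `sentence-transformers` package is installed. The `load_model` and `embed_texts` helpers are my own wrappers, not part of the library; the model is passed in as a parameter so it can be swapped or faked in tests.

```python
def load_model(model_name: str = "all-mpnet-base-v2"):
    """Load a Sentence Transformers model (downloaded on first use)."""
    # Imported lazily so the rest of the code works without the dependency installed.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_name)


def embed_texts(texts: list[str], model=None):
    """Encode texts into dense vectors with any object exposing .encode(list_of_str)."""
    if model is None:
        model = load_model()
    return model.encode(texts)


# Usage (needs network access the first time to fetch the model):
# vectors = embed_texts(["What is RAG?", "Chunking splits documents into pieces."])
# vectors.shape  # (2, 768): one 768-dimensional vector per input text
```

In a real app, load the model once and reuse it, since loading is far slower than encoding a few sentences.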