A didactic RAG pipeline built with LangChain, featuring HuggingFace embeddings, FAISS vector store, and local FLAN-T5 model with optional OpenAI fallback. Designed for teaching modern RAG concepts step by step with a minimal, reproducible architecture.
Education-focused approach with curated small documents, simple FAISS indexing, and local FLAN-T5 LLM so students can run everything on CPU/GPU without external services.
Designed specifically for learning and teaching RAG concepts with clear, step-by-step documentation and minimal complexity to understand core principles.
Optimized for fast execution with lightweight models (MiniLM embeddings, FLAN-T5) that can run efficiently on limited hardware resources.
Modular design allowing easy experimentation with different components: embeddings, vector stores, and language models.
Runs entirely locally with optional cloud integrations, perfect for students and developers without API dependencies.
The pipeline follows a clean, linear flow from document ingestion to answer generation, with each component clearly separated for educational clarity.
Curated collection of educational texts covering churn analysis, NPS metrics, LangChain framework, and RAG concepts.
Clean and chunk documents into simple string segments optimized for embedding generation and semantic search.
Convert text chunks to high-dimensional vectors using sentence-transformers/all-MiniLM-L6-v2.
Store and index embeddings with Facebook's FAISS library for fast, in-memory similarity search.
Perform top-k similarity search to find most relevant document chunks for any given query.
Process retrieved context and user query through FLAN-T5 local LLM via RetrievalQA chain.
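The retrieval at the heart of steps 4-5 boils down to ranking chunk vectors by similarity to the query vector. A minimal, dependency-free sketch of top-k cosine-similarity retrieval (toy 3-dimensional vectors stand in for real 384-dimensional MiniLM embeddings; FAISS performs the same ranking with optimized index structures):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, chunk_vecs, k=2):
    """Return indices of the k chunks most similar to the query."""
    scores = [(cosine(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    return [i for _, i in sorted(scores, reverse=True)[:k]]

# Toy "embeddings" for three document chunks
chunks = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.9, 0.2, 0.1]]
print(top_k([1.0, 0.0, 0.0], chunks, k=2))  # → [0, 2]
```

The retriever created below with `search_kwargs={"k": 2}` does exactly this: it returns the two chunks whose embeddings score highest against the query embedding.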
Step-by-step breakdown of the core pipeline implementation with actual code examples from the project.
# 1) Sample educational documents
docs = [
"Churn is customer cancellation and represents significant business impact...",
"NPS, or Net Promoter Score, measures customer satisfaction and loyalty...",
"LangChain is a powerful library for building LLM-powered applications...",
"RAG combines information retrieval with text generation for enhanced accuracy...",
]
# 2) Setup embeddings and vector store
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
# Initialize lightweight, fast embeddings
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
# Create FAISS vector store and retriever
vectorstore = FAISS.from_texts(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
# 3) Setup local FLAN-T5 model
from transformers import pipeline
from langchain_community.llms import HuggingFacePipeline
gen_pipeline = pipeline("text2text-generation", model="google/flan-t5-base", max_new_tokens=256)
llm = HuggingFacePipeline(pipeline=gen_pipeline)
# 4) Create complete RAG chain
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
# 5) Process queries and generate answers
response = qa_chain.invoke({"query": "What does churn mean?"})
print(response["result"])
Small, curated collection of educational strings about churn, NPS, LangChain, RAG, and embeddings - perfect for demonstrating retrieval concepts.
Uses MiniLM for lightweight, efficient text-to-vector conversion that runs smoothly on CPU with high-quality representations.
Facebook's FAISS library provides lightning-fast similarity search with configurable top-k retrieval for finding relevant context.
Google's FLAN-T5 model runs locally via HuggingFace Transformers, eliminating API dependencies while providing solid text generation.
RetrievalQA chain orchestrates the entire pipeline, seamlessly combining document retrieval with language model generation.
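Conceptually, the chain just stuffs the retrieved chunks into a prompt before calling the model. A dependency-free sketch with stub components (the function names here are illustrative; the real chain uses LangChain's prompt templates internally):

```python
def rag_answer(query, retrieve, generate, k=2):
    """Minimal RAG loop: retrieve top-k chunks, stuff them into a prompt, generate."""
    context = "\n".join(retrieve(query, k))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

# Stubs standing in for the FAISS retriever and the FLAN-T5 pipeline
def fake_retrieve(query, k):
    return ["Churn is customer cancellation.", "NPS measures loyalty."][:k]

def fake_generate(prompt):
    return "Churn means customers cancelling a service."

print(rag_answer("What does churn mean?", fake_retrieve, fake_generate))
# → Churn means customers cancelling a service.
```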
Optional integration with OpenAI models for enhanced performance when API access is available.
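One simple way to wire the fallback is to branch on whether an API key is configured. A sketch (the helper name and the backend labels are illustrative, not from the project):

```python
import os

def pick_llm_backend() -> str:
    """Choose a generation backend for the RAG chain.

    Prefers OpenAI when an API key is present in the environment;
    otherwise stays fully local with FLAN-T5.
    """
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    return "flan-t5-local"
```

The returned label can then decide which `llm` object gets passed to `RetrievalQA.from_chain_type`, keeping the rest of the pipeline unchanged.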
Carefully selected technologies optimized for educational use, performance, and accessibility.
Core language: Python
Orchestration: LangChain
Local LLM: FLAN-T5 (google/flan-t5-base)
Vector Search: FAISS
Models & Embeddings
Embedding model: sentence-transformers/all-MiniLM-L6-v2
Fast query processing with optimized local model inference.
High precision in retrieving contextually relevant documents.
Complete independence from external APIs.
Lightweight architecture suitable for educational environments.
Comprehensive testing approach to ensure system reliability and educational value.
import time
# Test questions with expected topics
test_questions = [
("What does churn mean?", "churn"),
("How does RAG work?", "RAG"),
("What is NPS?", "NPS"),
("Explain LangChain benefits", "LangChain"),
]
# Evaluation metrics
def evaluate_system(qa_chain, test_questions):
    results = []
    for question, expected_topic in test_questions:
        start_time = time.time()
        result = qa_chain.invoke({"query": question})
        response_time = time.time() - start_time
        response_text = result["result"].lower()
        is_relevant = expected_topic.lower() in response_text
        results.append({
            "question": question,
            "response_time": response_time,
            "relevance": is_relevant,
            "answer": result["result"],
        })
        print(f"❓ {question}")
        print(f"✅ {result['result']}")
        print(f"⏱️ {response_time:.2f}s")
        print(f"🎯 Relevant: {is_relevant}\n")
    return results
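The per-question records returned by `evaluate_system` can be rolled up into two headline numbers. A small aggregation helper (a sketch; this helper is not part of the project code):

```python
def summarize_results(results):
    """Aggregate evaluate_system output into average latency and relevance rate."""
    n = len(results)
    avg_time = sum(r["response_time"] for r in results) / n
    relevance_rate = sum(1 for r in results if r["relevance"]) / n
    return {"avg_response_time": avg_time, "relevance_rate": relevance_rate}

# Example with two hand-written records
demo = [
    {"question": "q1", "response_time": 1.0, "relevance": True, "answer": "..."},
    {"question": "q2", "response_time": 3.0, "relevance": False, "answer": "..."},
]
print(summarize_results(demo))  # → {'avg_response_time': 2.0, 'relevance_rate': 0.5}
```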
Validates that responses accurately reflect information from retrieved documents, ensuring factual consistency.
Measures how well the retrieval system finds contextually appropriate documents for different query types.
Assesses response coherence, grammatical correctness, and overall quality of generated text.
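A crude but automatable proxy for the faithfulness check is lexical overlap: measure how much of the answer's vocabulary appears in the retrieved context. This is a sketch only; serious faithfulness evaluation typically uses an LLM judge or an NLI model:

```python
def context_overlap(answer: str, context: str) -> float:
    """Fraction of answer words that also occur in the retrieved context.

    A rough faithfulness signal: values near 0 suggest the model
    ignored the retrieved documents.
    """
    answer_words = set(answer.lower().split())
    context_words = set(context.lower().split())
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)

print(context_overlap(
    "churn is customer cancellation",
    "churn means customer cancellation and lost revenue",
))  # → 0.75
```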
Well-organized codebase following Python best practices with comprehensive tooling.
Miguel_LLM-educacional/
├── 📁 config/                # Hydra configuration files
│   ├── main.yaml             # Main configuration
│   ├── model/                # Model parameters
│   └── process/              # Processing parameters
├── 📁 data/                  # Project data
│   ├── raw/                  # Raw input data
│   ├── processed/            # Cleaned data
│   └── final/                # Final datasets
├── 📁 notebooks/             # Jupyter notebooks
│   └── miguel_llm.ipynb
├── 📁 src/                   # Source code
│   ├── __init__.py
│   ├── process.py            # Data processing
│   ├── train_model.py        # Model training
│   └── utils.py              # Utility functions
├── 📁 tests/                 # Automated tests
├── pyproject.toml            # Poetry dependencies
└── README.md                 # Project documentation
Simple setup process to get the MIGUEL system running locally in minutes.
# 1. Clone the repository
git clone https://github.com/bcmaymonegalvao/Miguel_LLM-educacional.git
cd Miguel_LLM-educacional

# 2. Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -U pip
pip install faiss-cpu sentence-transformers langchain langchain-community transformers torch

# 4. Run the notebook
jupyter notebook notebooks/miguel_llm.ipynb
One-click execution in Google Colab with pre-configured environment and GPU acceleration.
Complete local setup with minimal dependencies, perfect for offline development and learning.
Comprehensive Jupyter notebooks with step-by-step explanations and learning exercises.
Dive into the code, experiment with the live demo, or reach out to discuss machine learning projects.