AI Handbook Generator

A Python AI tool that generates 20,000-word structured handbooks from uploaded PDFs through a conversational Gradio interface — using a custom RAG engine and the LongWriter technique.

Overview

Project Summary

AI Handbook Generator lets you upload PDF documents, ask questions about them, and generate 20,000+ word structured handbooks through a chat interface. I built it to get around LLM output limits: standard models cap at a few thousand words per response, so I implemented the LongWriter technique, which breaks generation into sections and assembles them into a full document.

Stack

Stack & Architecture

Python — application layer
Gradio — chat UI with streaming output
Groq API — Llama 3.3 70B for generation
sentence-transformers — local text embeddings
numpy — cosine similarity vector search
pdfplumber / pypdf — PDF text extraction
Supabase — optional persistent storage
Hugging Face Spaces — deployment

Features

Key Features

PDF upload & indexing — extract and embed text from any PDF in seconds
RAG-grounded chat — answers pulled from uploaded documents, not hallucinated
20,000-word handbook generation — triggered via chat, streamed live to the browser
LongWriter technique — Plan → Write per section → Assemble avoids output limits
Markdown export — download the finished handbook as a .md file
Local embeddings — no embedding API key required; runs sentence-transformers on-device
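Before text can be embedded, it has to be cut into chunks. The following is a minimal sketch of one common way to do that, using overlapping word windows. The function name `chunk_text` and the sizes (200 words per chunk, 40 words of overlap) are illustrative assumptions, not values taken from the project.

```python
def chunk_text(text: str, chunk_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows (sizes are assumed defaults)."""
    if overlap >= chunk_words:
        raise ValueError("overlap must be smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Toy input standing in for text extracted from a PDF.
sample = " ".join(f"w{i}" for i in range(500))
chunks = chunk_text(sample)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some words twice.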

Challenges

Challenges & Learnings

LLM output limits — solved with LongWriter: split generation into 12–16 sections, each ~1,500 words, then assembled
Slow RAG indexing — original LightRAG approach made LLM calls during indexing (slow); replaced with local numpy cosine similarity search
Dependency conflicts — sentence-transformers 5.x broke PyTorch; pinned to <4
Gradio breaking changes — v6 removed several parameters and changed chat history format; updated all affected code
API pivots — switched from xAI to Groq after hitting credit limits; OpenAI-compatible SDK made this a one-line change
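The provider switch mentioned in the last item can be sketched as a small configuration table. This is an illustration under assumptions: the names `PROVIDERS`, `ACTIVE_PROVIDER`, and `client_kwargs` are hypothetical, and the environment variable names are conventional guesses rather than the project's actual settings. The base URLs are the providers' OpenAI-compatible endpoints.

```python
import os

# OpenAI-compatible endpoints; the env var names are assumptions.
PROVIDERS = {
    "xai": {"base_url": "https://api.x.ai/v1", "key_env": "XAI_API_KEY"},
    "groq": {"base_url": "https://api.groq.com/openai/v1", "key_env": "GROQ_API_KEY"},
}

ACTIVE_PROVIDER = "groq"  # the one-line change: this was "xai" before the switch


def client_kwargs(provider: str = ACTIVE_PROVIDER) -> dict:
    """Return keyword arguments that could be passed to openai.OpenAI(...)."""
    cfg = PROVIDERS[provider]
    return {
        "base_url": cfg["base_url"],
        "api_key": os.environ.get(cfg["key_env"], ""),
    }


# With the openai package installed, the client would be built as:
#   from openai import OpenAI
#   client = OpenAI(**client_kwargs())
```

Because both providers speak the same chat-completions protocol, the rest of the code that calls the client would not need to change (apart from the model name).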

Problem

Standard LLMs can't generate a full 20,000-word document in one API call: they hit output token limits and truncate. I wanted to build something that could take uploaded PDFs and produce a complete, structured handbook from them. The goal was not a summary but a full long-form document grounded in the source material.

Approach

I implemented the LongWriter / AgentWrite technique from the LongWriter paper: first generate a full table of contents with word-count targets per section, then write each section in a separate API call using relevant RAG context, then assemble the sections into a single document. For the RAG layer, I built a custom vector search engine that uses sentence-transformers for local embeddings and numpy cosine similarity for retrieval, so no external vector database is required.
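The three steps can be sketched as a single function that takes the planner, retriever, and writer as callables. This is a simplified illustration rather than the project's code: `build_handbook` and the `fake_*` stand-ins are hypothetical names, and the stand-ins replace real LLM and retrieval calls so the sketch runs without an API key.

```python
from typing import Callable


def build_handbook(
    topic: str,
    plan: Callable[[str], list[tuple[str, int]]],
    retrieve: Callable[[str], list[str]],
    write: Callable[[str, int, list[str]], str],
) -> str:
    """Plan, then write each section separately, then assemble."""
    outline = plan(topic)  # [(section title, target word count), ...]
    sections = []
    for title, target_words in outline:
        context = retrieve(title)  # passages relevant to this section
        body = write(title, target_words, context)  # one call per section
        sections.append(f"## {title}\n\n{body}")
    return f"# {topic}\n\n" + "\n\n".join(sections)


# Stand-ins for the LLM and the retriever (assumptions for illustration).
def fake_plan(topic):
    return [("Introduction", 1500), ("Core Concepts", 1500)]


def fake_retrieve(title):
    return [f"passage about {title}"]


def fake_write(title, target_words, context):
    return f"({target_words} words on {title}, using {len(context)} passage)"


handbook = build_handbook("Sample Handbook", fake_plan, fake_retrieve, fake_write)
```

Since each section is its own call, no single response has to exceed the model's output limit; the total length is set by the number of sections in the outline.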

What I Built

Core

Custom RAG Engine

Built a vector similarity search engine from scratch using sentence-transformers for local embeddings and numpy for cosine similarity. Chunks are embedded and saved to disk on upload, so indexing survives app restarts without any vector database.
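A minimal sketch of this kind of search is shown below, assuming embeddings are stored as rows of a numpy array. The helper names `normalize` and `top_k`, the file name `index.npy`, and the toy 3-dimensional vectors are assumptions for illustration; real embeddings would come from a sentence-transformers model and have several hundred dimensions.

```python
import os
import tempfile

import numpy as np


def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)  # avoid division by zero


def top_k(query: np.ndarray, index: np.ndarray, k: int = 3) -> list[tuple[int, float]]:
    """Return (row, score) pairs for the k rows of `index` closest to `query`."""
    q = normalize(query.reshape(1, -1))[0]
    scores = normalize(index) @ q
    best = np.argsort(-scores)[:k]  # highest similarity first
    return [(int(i), float(scores[i])) for i in best]


# Toy "embeddings", one row per chunk.
index = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])

# Persist the index to disk and load it back, as would happen across a restart.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.npy")
    np.save(path, index)
    restored = np.load(path)

results = top_k(np.array([1.0, 0.1, 0.0]), restored, k=2)
```

For a few thousand chunks, a brute-force matrix product like this is fast enough that an approximate nearest-neighbour index is unnecessary.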

Core

LongWriter Generation

Implemented the Plan → Write → Assemble pipeline. The LLM first creates a 12–16 section outline with word targets, then writes each section individually using retrieved context, then all sections are joined into one document. Each generation step streams live to the UI.
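Live streaming fits naturally with a Python generator, since Gradio can update the output each time a handler yields. The sketch below shows that pattern under assumptions: `stream_handbook` and `fake_writer` are hypothetical names, and the writer is a stand-in for a streaming LLM call.

```python
from typing import Callable, Iterator


def stream_handbook(
    outline: list[tuple[str, int]],
    write_section: Callable[[str, int], Iterator[str]],
) -> Iterator[str]:
    """Yield the document so far after every streamed fragment."""
    document = ""
    for title, target_words in outline:
        document += f"## {title}\n\n"
        yield document  # show the new heading immediately
        for fragment in write_section(title, target_words):
            document += fragment
            yield document
        document += "\n\n"  # blank line before the next section


def fake_writer(title: str, target_words: int) -> Iterator[str]:
    # Stand-in for a streaming LLM response (assumption for illustration).
    for piece in ("Draft", " text", " for ", title, "."):
        yield piece


outline = [("Introduction", 1500), ("Setup", 1500)]
snapshots = list(stream_handbook(outline, fake_writer))
```

Each yielded value is the whole document up to that point, so the UI only ever has to replace its displayed text with the latest snapshot.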

UI

Gradio Chat Interface

Built a tabbed Gradio interface with PDF upload, a streaming chat for both Q&A and handbook generation, and a one-click markdown export. Deployed to Hugging Face Spaces with the Groq API key set as a Space secret.

Result

The app can generate a 20,000+ word structured handbook from uploaded PDFs in 5–15 minutes, streamed live. This project pushed me into Python backend work, RAG architecture, real dependency debugging, and practical LLM engineering — closer to production AI tooling than anything I'd built before.