🧠 NeuraDocs: an AI-powered RAG system that answers questions using internal documentation.
NeuraDocs is an intelligent system that answers technical questions using your company's internal PDF documentation. It leverages Retrieval-Augmented Generation (RAG) to ground responses in your own knowledge base, enabling fast, accurate, and explainable answers.
Key Features
- Semantic search over internal PDFs using vector embeddings
- 🤖 AI-powered answers with citations from your actual docs
- Automatic PDF parsing, chunking, and indexing
- 🧠 Retrieval-Augmented Generation (RAG) pipeline
- 🧾 REST API for internal use or integration with tools/UIs
Example Use Cases
- Internal dev documentation Q&A: Quickly answers technical questions by extracting insights from internal developer docs.
- AI-powered engineering assistant: Provides intelligent, context-aware support for engineering tasks using internal knowledge.
- Knowledge base augmentation: Turns static documentation into an interactive, searchable AI-powered resource.
- Automated onboarding and support tools: Delivers instant answers to onboarding and support queries using internal content.
🧰 Technologies Used
- Python: Core language
- FastAPI: REST API framework
- LangChain: RAG orchestration and text processing
- OpenAI: Embeddings and chat completions (GPT-3.5/GPT-4)
- Qdrant: High-performance vector database
- PyMuPDF (fitz): PDF parsing
- dotenv: Environment configuration
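Since configuration comes from the environment, the settings the stack needs can be gathered in one place. The sketch below is only an illustration: `Settings`, `load_settings`, and the variable names `QDRANT_URL` and `QDRANT_COLLECTION` are assumptions, not names taken from the project. `OPENAI_API_KEY` is the variable the OpenAI Python SDK conventionally reads, and 6333 is Qdrant's default HTTP port.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    qdrant_url: str
    collection: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on a missing key.

    In the real app, python-dotenv's load_dotenv() would typically be called
    first so that values from a local .env file are present in os.environ.
    """
    try:
        api_key = env["OPENAI_API_KEY"]
    except KeyError:
        raise RuntimeError("OPENAI_API_KEY is not set") from None
    return Settings(
        openai_api_key=api_key,
        # Hypothetical variable names with illustrative defaults.
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        collection=env.get("QDRANT_COLLECTION", "neuradocs"),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"OPENAI_API_KEY": "sk-test"})
```

Accepting the environment as a parameter makes the loader easy to test without touching real secrets.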
🧠 Retrieval-Augmented Generation (RAG) Pipeline
- PDF Parsing: Extract text from PDFs using PyMuPDF.
- Chunking: Split text into overlapping semantic chunks using LangChain.
- Embedding: Encode chunks into vector embeddings using OpenAI.
- Storage: Store vectors and metadata in Qdrant.
- Querying:
  - Embed the user query
  - Search the vector DB for similar chunks
  - Prompt the LLM with the retrieved content
- Response: Return generated answer with traceable sources.
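The steps above can be sketched end to end without any external services. This is a minimal illustration, not NeuraDocs' actual implementation: a character-window splitter stands in for LangChain's text splitters, a bag-of-words count vector stands in for OpenAI embeddings, a Python list stands in for Qdrant, and the final LLM call is left out. All names (`Chunk`, `chunk_text`, `embed`, `search`, `build_prompt`) and the sample documents are assumptions made for the example.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # file name, kept as metadata for traceability
    index: int   # position of the chunk within its source


def chunk_text(text: str, source: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(text[start:start + size], source, i))
        if start + size >= len(text):
            break  # the window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: a sparse term-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, store: list[tuple[Counter, Chunk]], top_k: int = 2) -> list[Chunk]:
    """Embed the query and return the top_k most similar chunks."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]


def build_prompt(question: str, hits: list[Chunk]) -> str:
    """Assemble the LLM prompt, tagging each chunk with its source."""
    context = "\n\n".join(f"[{c.source}#{c.index}] {c.text}" for c in hits)
    return (
        "Answer the question using only the context below and cite the sources in brackets.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Made-up documents standing in for text extracted from PDFs.
docs = {
    "deploy.pdf": "Deployments run through the staging cluster before production. "
                  "Rollbacks use the previous image tag.",
    "auth.pdf": "Service tokens expire after 24 hours. Rotate tokens with the auth CLI.",
}
store = [(embed(c.text), c) for name, text in docs.items() for c in chunk_text(text, name)]

question = "How long until service tokens expire?"
hits = search(question, store, top_k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

Carrying `source` and `index` on every chunk is what makes the final answer traceable: whatever the LLM cites can be mapped back to a specific passage.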
📡 API Endpoint
POST /ask
- Accepts a natural language query.
- Returns an AI-generated answer based on internal documentation.
- Includes document metadata for context traceability.
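Here is a rough sketch of how a client might call this endpoint using only the standard library. The README does not specify the request or response schema, so the field names (`query`, `answer`, `sources`), the base URL `http://localhost:8000`, and the sample response are all assumptions; check the service's actual schema before relying on them.

```python
import json
import urllib.request


def build_ask_request(question: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /ask endpoint."""
    body = json.dumps({"query": question}).encode("utf-8")  # "query" is an assumed field name
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_ask_response(raw: bytes) -> tuple[str, list[dict]]:
    """Extract the answer and its source metadata from a JSON response body."""
    payload = json.loads(raw)
    return payload["answer"], payload.get("sources", [])


request = build_ask_request("How do I rotate service tokens?")
# Sending it against a running server would look like:
#   raw = urllib.request.urlopen(request).read()

# Hypothetical response body, shown only to illustrate the assumed shape.
sample = b'{"answer": "Use the auth CLI.", "sources": [{"file": "auth.pdf", "page": 3}]}'
answer, sources = parse_ask_response(sample)
```

Returning the sources next to the answer lets a UI link each response back to the document it came from.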