ALEX8642/LLM-chatbot

LLM-Chatbot

CI · Python 3.11+ · License: MIT · Code style: black · FastAPI · Haystack

A chatbot built on Haystack 2.x and Ollama that helps users interpret manuals and documentation. Place your PDF manuals in the manuals/ directory and run the ingestion script; the system then processes them for retrieval-based question answering.

Features

  • FastAPI backend with Haystack 2.x for document processing and QA
  • Vector search using Qdrant and text search using OpenSearch
  • Local LLM support via Ollama
  • React/TypeScript frontend
  • Document ingestion pipeline for manuals and documentation

Project Structure

llm-chatbot/
├── backend/           # FastAPI server and API
│   ├── .env          # Backend environment variables
│   └── ingest_manuals.py  # Document ingestion script
├── frontend/         # React/TypeScript frontend
├── scripts/          # Utility scripts
│   └── launch.py     # Service launch and health check script
├── manuals/          # Place manuals and docs here for ingestion
├── docker-compose.yaml  # Docker services configuration
└── pyproject.toml    # Python dependencies and project metadata

Requirements

  • Python 3.10 or higher
  • Node.js 18 or higher
  • Docker and Docker Compose
  • Ollama (for local LLM support)

Working with Manuals

Adding New Manuals

  1. Place your PDF manuals in the manuals/ directory
  2. Run the ingestion script:
    python backend/ingest_manuals.py

The system will automatically:

  • Extract metadata from filenames
  • Generate unique IDs and readable labels
  • Split documents into chunks
  • Create embeddings and store them

For example:

Input filename: XYZ-123-456_Product_Manual_v2.1.pdf
↓
Automatic metadata:
{
    "id": "product-manual",
    "label": "Product Manual",
    "product_id": "XYZ"
}
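The mapping from filename to metadata can be sketched as follows. This is a minimal illustration, not the logic in backend/ingest_manuals.py: the parsing rule (product code before the first underscore, trailing version token dropped, remaining words used as the label) is an assumption inferred from the single example above, and the function name metadata_from_filename is hypothetical.

```python
import re
from pathlib import Path

# Hypothetical rule: a trailing token such as "v2.1" is treated as a version.
VERSION_TOKEN = re.compile(r"^v\d+(\.\d+)*$", re.IGNORECASE)


def metadata_from_filename(filename: str) -> dict[str, str]:
    """Derive id, label, and product_id from a manual's filename (assumed rule)."""
    stem = Path(filename).stem  # drops only the final ".pdf"
    code, *words = stem.split("_")  # first token is the product code
    if words and VERSION_TOKEN.match(words[-1]):
        words = words[:-1]  # drop the trailing version token
    label = " ".join(words) if words else code
    return {
        # Lowercase the label and join its alphanumeric runs with hyphens.
        "id": re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-"),
        "label": label,
        # Assumption: the product ID is the part before the first hyphen.
        "product_id": code.split("-")[0],
    }
```

Under these assumptions, metadata_from_filename("XYZ-123-456_Product_Manual_v2.1.pdf") reproduces the metadata shown in the example. Filenames that follow a different naming scheme may need the override file described next.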

Custom Metadata (Optional)

If you need to override the automatic metadata, create a manual_metadata.json file:

{
    "example.pdf": {
        "id": "custom-id",
        "label": "Custom Label",
        "product_id": "CustomProduct"
    }
}
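A rough sketch of how such an override file could be applied on top of the automatic metadata is shown here. The helper names load_overrides and merge_metadata are hypothetical, and the file's location is left as a parameter because this README does not state where the ingestion script looks for it.

```python
import json
from pathlib import Path


def load_overrides(path: Path) -> dict[str, dict[str, str]]:
    """Read the override file; a missing file means no overrides."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def merge_metadata(
    filename: str,
    automatic: dict[str, str],
    overrides: dict[str, dict[str, str]],
) -> dict[str, str]:
    """Return the automatic metadata with per-file override keys taking precedence."""
    return {**automatic, **overrides.get(filename, {})}
```

In this sketch an override entry only needs to list the keys it changes; any key it omits keeps its automatic value. Whether the actual script merges per key or replaces the whole entry is not stated here.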

Setup

  1. Install Python dependencies using uv:

    python -m pip install uv
    uv venv
    source .venv/bin/activate  # on Windows: .venv\Scripts\activate
    uv pip install -e .

    For development, also install the dev extras:

    uv pip install -e ".[dev]"
  2. Configure environment variables:

    cp frontend/.env.example frontend/.env
    cp backend/.env.example backend/.env
    # Edit .env files as needed
  3. Start required services:

    python scripts/launch.py
  4. Set up the frontend:

    cd frontend
    npm install
    npm run dev

Development

Backend Development

The backend uses FastAPI and Haystack 2.x:

  1. Start the development server:

    uvicorn api:app --reload
  2. Access the API docs at: http://localhost:8000/docs

Frontend Development

The frontend is a React application with TypeScript:

  1. Start the development server:

    cd frontend
    npm run dev
  2. Access the UI at: http://localhost:5173

Adding Documents

Place your documents in the manuals/ directory and run the ingestion script (see Working with Manuals for the metadata it generates):

python backend/ingest_manuals.py
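One of the ingestion steps is splitting documents into chunks. The sketch below shows the general idea with a plain word window and overlap. It does not use Haystack and is not the project's actual splitter; the function name and the default chunk_size and overlap values are assumptions chosen for illustration.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word windows of chunk_size words that share overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears in full in at least one neighbouring chunk, which tends to help retrieval; the right sizes depend on the embedding model and the manuals themselves.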

Architecture

  • FastAPI Backend: Handles API requests, document processing, and LLM interactions
  • Qdrant: Vector database for semantic search
  • OpenSearch: Text search and document storage
  • Ollama: Local LLM integration
  • Haystack: Document processing and QA pipeline
  • React Frontend: User interface and chat interactions
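The architecture pairs a vector store with a text search engine. This README does not say how, or whether, results from the two are combined. One common option for merging ranked lists is reciprocal rank fusion, sketched below purely as an illustration; the function name and the constant k = 60 are assumptions, not settings taken from this project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document IDs; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, with vector hits ["a", "b", "c"] and text hits ["b", "d", "a"], documents found by both searches ("a" and "b") rank ahead of those found by only one.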

Docker Services

  • Qdrant: Vector search (port 6333)
  • OpenSearch: Text search and document storage (port 9200)
  • OpenSearch Dashboards: Search visualization (port 5601)

Data is persisted in Docker volumes:

  • opensearch_data: OpenSearch data
  • qdrant_data: Qdrant vectors and metadata
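To confirm the containers are reachable before starting the backend, a simple TCP check against the ports listed above is often enough. This is a minimal sketch and not the contents of scripts/launch.py; the host localhost and the helper names are assumptions, and an open port only shows that something is listening, not that the service is fully ready.

```python
import socket

# Ports come from the Docker Services list; the host is assumed to be local.
SERVICE_PORTS = {
    "Qdrant": 6333,
    "OpenSearch": 9200,
    "OpenSearch Dashboards": 5601,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in SERVICE_PORTS.items()}
```

Calling check_services() returns a dictionary such as service name to True or False, which can be printed or used to decide whether to wait and retry.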

This project is independently developed by Alexander Feht during personal time using personal hardware. It is not affiliated with or commissioned by any employer, and was licensed under MIT prior to any discussions with external parties.
