Skip to content

bradleyd/rustopedia

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🦀 Rust LLM Agent — Local Developer Assistant

This project is a local Rust-powered AI assistant that uses embedded documentation, agent tools, and a local LLM to help answer Rust programming questions — with context-aware code generation and crate discovery.


📦 Features

  • RAG (Retrieval-Augmented Generation) with ChromaDB
  • 🕵️‍♂️ Agents for:
    • Crate docs (docs.rs)
    • GitHub repo search + README parsing
    • Standard library and Rust Book content
  • 🤖 Local LLM (LLaMA3 / Mistral / OpenHermes) integration
  • 🧠 Chat interface with memory + tool calling
  • 📁 Modular source loading from local docs (rustup, crates, scraped pages)

🛠️ Setup

1. Clone the project

git clone https://github.com/bradleyd/rustopedia.git
cd rustopedia

2. Set up Python (for embeddings + ChromaDB)

cd rag 
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

🧠 Load Documentation into Vector DB

You can load any Rust content (books, docs, READMEs) into ChromaDB.

Supported Sources

  • Local docs via rustup (HTML extraction + chunking)
  • Crate docs from docs.rs (automated fetching)
  • Rust book or stdlib
  • GitHub crate READMEs

✅ Quick Setup - Load All Documentation

Option 1: Load Rust core docs (Book + Stdlib)

cd rag
chmod +x rustup_docs.sh
./rustup_docs.sh

Option 2: Load popular crate documentation

cd rag
chmod +x crate_docs.sh
./crate_docs.sh all  # Fetches serde, tokio, clap, etc.

Option 3: Load specific crate docs

./crate_docs.sh fetch tokio     # Fetch specific crate
./crate_docs.sh embed           # Embed all fetched crates

Manual Document Loading

Place .md or .txt files into sample_docs/ under collection folders:

sample_docs/rust-book/
sample_docs/rust-docs/
sample_docs/crates/

Then run:

python embed_docs.py --dir ../sample_docs/your-collection --collection your-name --chunk-size 800 --overlap 100

All files will be embedded with intelligent chunking and stored in ./chroma_db.


🔍 Querying the Knowledge Base

You can manually query docs using:

python query_docs.py "How do I deserialize JSON into an enum?"

This returns the most relevant chunks across all embedded docs.


💬 Running the Assistant (Chat Mode)

Launch the LLM shell with agent orchestration:

cargo run --bin llm_shell

You can now ask things like:

  • "What’s the best way to build a CLI app in Rust?"
  • "How do I initialize a Vec with a capacity?"
  • "Give me an example using serde to serialize structs"

The system will:

  • Search ChromaDB
  • Route queries to agents (GitHub, crate, docs)
  • Return helpful, idiomatic answers using a local LLM

🧱 Agent Overview

GitHub Discovery Agent (Rust)

  • Classifies queries into topics (e.g. cli, web)
  • Uses GitHub API to find top crates
  • Pulls README and returns as context

Docs Agent

  • Fetches documentation from docs.rs or stdlib/book
  • Extracts relevant examples
  • Adds results to Chroma

📦 Rust Project Structure

rustopedia/
├── llm_shell/                    # Main chat interface with agent orchestration
├── rag_engine/                   # Core RAG functionality
├── rag_server/                   # HTTP server for RAG queries
├── agents/
│   ├── github_agent/            # GitHub repository discovery
│   ├── docs_agent/              # Documentation fetching
│   └── crate_agent/             # Crate.io integration
├── /
│   ├── embed_docs.py            # Document embedding with chunking
│   ├── query_docs.py            # Manual query interface
│   ├── rustup_docs.sh           # Automated Rust docs extraction
│   ├── crate_docs.sh            # Crate documentation manager
│   └── sample_docs/             # Document storage (gitignored)
└── chroma_db/                   # Persistent vector database

🧪 Testing Examples

# Load Rust core documentation
cd rag
./rustup_docs.sh

# Load popular crate docs
./crate_docs.sh all

# Query manually
python query_docs.py "How do I create a VecDeque?"

# Chat with LLM
cd ../
cargo run

📎 TODO (next ideas)

  • Allow agent results to include multiple sources
  • Stream LLM responses with token buffering
  • Add RAG-based fallback when LLM confidence is low
  • Use embeddings to determine best agent (vs. keyword match)

🔧 Advanced Usage

Crate Documentation Management

The crate_docs.sh script provides comprehensive crate documentation management:

# View all available commands
./crate_docs.sh

# Fetch specific crates
./crate_docs.sh fetch serde
./crate_docs.sh fetch tokio

# Fetch popular crates automatically
./crate_docs.sh popular

# Re-embed all fetched crates
./crate_docs.sh embed

# View collection statistics
./crate_docs.sh stats

# Clear and rebuild everything
./crate_docs.sh all

Custom Documentation Sources

Install additional Rust docs locally:

rustup component add rust-docs

The rustup_docs.sh script will automatically extract and embed them with optimal chunking.


License

MIT

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •