libscribe is a tool for ingesting and processing code repositories, primarily from GitHub, and storing them in a vector database.
- Ingests repositories from GitHub.
- Processes the repository content.
- Stores the processed data in a vector database.
graph TD
%% Core Components
A[FastAPI Application] --> |Ingest| B[Document Processing]
A --> |Query| Q[Query Processing]
%% Processing Flow
B --> C[GitHub Loader]
C --> D[GitHub API]
C --> E[Vector Store]
%% Storage & Query
E --> |Embeddings| K[Vector Database]
Q --> |Search| K
K --> |Results| Q
%% External Services
subgraph External Services
D[GitHub API]
K[Vector Database]
end
%% Configuration
M[Environment Config] --> A
%% Styling
classDef external fill:#f96,stroke:#333
class D,K external
The diagram above illustrates the system's architecture and data flow:
- The FastAPI application handles both ingestion and query requests
- A background task is created to handle the ingestion process
- LangChain's GitHub Loader fetches and filters repository content
- Documents are enriched with metadata (owner, repo, branch, etc.)
- The vector store pipeline:
  - Generates embeddings using VoyageAI
  - Stores vectors in Qdrant DB
- The query pipeline:
  - Processes search queries
  - Performs similarity search in Qdrant
  - Returns relevant documents
- External services (GitHub, VoyageAI, Qdrant) are integrated via API keys
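The ingest and query flow described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the project's implementation: `toy_embed` is a normalized bag-of-words stand-in for VoyageAI embeddings, `InMemoryVectorStore` stands in for Qdrant, and the names `Document`, `enrich`, `toy_embed`, and `InMemoryVectorStore` are all hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str
    metadata: dict = field(default_factory=dict)


def enrich(docs: list, owner: str, repo: str, branch: str) -> list:
    """Attach repository metadata to each document (hypothetical helper)."""
    extra = {"owner": owner, "repo": repo, "branch": branch}
    return [Document(d.text, {**d.metadata, **extra}) for d in docs]


def toy_embed(text: str) -> dict:
    """Stand-in for a real embedding model: L2-normalized bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


class InMemoryVectorStore:
    """Stand-in for a vector database: stores (vector, document) pairs."""

    def __init__(self) -> None:
        self._rows = []

    def add(self, docs: list) -> None:
        for doc in docs:
            self._rows.append((toy_embed(doc.text), doc))

    def search(self, query: str, k: int = 3) -> list:
        q = toy_embed(query)
        scored = sorted(
            self._rows, key=lambda row: cosine(q, row[0]), reverse=True
        )
        return [doc for _, doc in scored[:k]]


# Ingest: enrich documents with metadata, then embed and store them.
store = InMemoryVectorStore()
store.add(
    enrich(
        [
            Document("def parse_url(url): ...", {"path": "utils/url.py"}),
            Document("installation instructions for the project", {"path": "README.md"}),
        ],
        owner="example-owner",
        repo="example-repo",
        branch="main",
    )
)

# Query: embed the search text and return the closest documents.
hits = store.search("installation instructions", k=1)
```

In a real deployment the embedding call and the store would be network services, which is why the ingestion step runs as a background task rather than inside the request handler.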
To ingest a repository, use the `ingest_repository` function in `src/app/main.py`. Provide the repository URL and branch as input.
The `src` directory contains the following subdirectories:
- `app`: Contains the main application logic, including the API endpoints.
- `ingestion`: Contains the logic for ingesting data from GitHub and processing it.
- `storage`: Contains the logic for interacting with the vector database.
- `utils`: Contains utility functions, such as parsing repository URLs.
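As an illustration of the kind of helper that lives in `utils`, here is a minimal sketch of parsing a GitHub repository URL into its owner and repository name. The function name `parse_github_url` and the `RepoRef` type are hypothetical and may not match the project's actual code.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class RepoRef(NamedTuple):
    owner: str
    repo: str


def parse_github_url(url: str) -> RepoRef:
    """Extract (owner, repo) from an https GitHub URL (hypothetical helper)."""
    parsed = urlparse(url.strip())
    if parsed.netloc.lower() not in {"github.com", "www.github.com"}:
        raise ValueError(f"not a GitHub URL: {url!r}")

    # Drop empty segments so leading/trailing slashes do not matter.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError(f"URL does not name a repository: {url!r}")

    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return RepoRef(owner, repo)
```

For example, `parse_github_url("https://github.com/astral-sh/uv.git")` yields `RepoRef(owner="astral-sh", repo="uv")`, and a URL on any other host raises `ValueError`.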
- Install UV: `curl -LsSf https://astral.sh/uv/install.sh | sh`
- Install dependencies using `uv sync`.
- Create a `.env` file based on `.env.example`.
- Run the FastAPI application using `uvicorn src.app.main:app --reload`.
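Before starting the server, it can help to confirm that the environment is fully configured. The sketch below reports which settings are missing; the variable names in `REQUIRED_VARS` are assumptions for illustration only, so check `.env.example` for the names the project actually expects.

```python
import os

# Hypothetical variable names; the real ones are listed in .env.example.
REQUIRED_VARS = ("GITHUB_TOKEN", "VOYAGE_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")


def missing_settings(env=None, required=REQUIRED_VARS) -> list:
    """Return the required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real process environment.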
For more details about future plans, please refer to the `ROADMAP.md` file.