This project bootstraps a Neo4j-backed GraphRAG pipeline using Astral's `uv` package manager.
```bash
# Install dependencies
uv sync

# Launch Neo4j with APOC on the internal rag-net network
docker compose up -d neo4j

# Populate the vector index (cosine, 768 dim)
make index

# Create or verify the full-text index used for lexical recall
make fulltext-index

# Run the ingestion pipeline over a text file
printf 'Alice founded Acme Corp in 2012. Bob joined in 2015.' > sample.txt
make ingest f=sample.txt

# Inspect graph counts
make counts
```
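What `make index` provisions is likely equivalent to a statement like the following. This is a sketch only: the `:Chunk` label and `embedding` property are assumptions, while the index name, dimensions, and similarity function mirror the defaults documented below.

```cypher
// Sketch of a Neo4j 5 vector index matching the documented defaults.
// The :Chunk label and `embedding` property are assumptions; substitute
// your actual node label and vector property.
CREATE VECTOR INDEX text_embeddings IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS {
  indexConfig: {
    `vector.dimensions`: 768,
    `vector.similarity_function`: 'cosine'
  }
}
```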
Configure secrets in `.env.local` before running ingestion. When targeting a local OpenAI-compatible embedding server (e.g., `localhost:20010`), set:

```env
EMBEDDING_API_BASE_URL=http://localhost:20010/v1
EMBEDDING_API_KEY=<token or dummy if not required>
```
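Before running ingestion, you can smoke-test the endpoint with a direct call to the standard OpenAI-compatible embeddings route. The model name here is a placeholder; use whatever model your server actually exposes.

```bash
# Quick check that the embedding server answers; the model name is a
# placeholder for whatever your local server serves.
curl -s http://localhost:20010/v1/embeddings \
  -H "Authorization: Bearer $EMBEDDING_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "hello world"}'
```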
If you are using OpenAI project-scoped keys (`sk-proj-...`), also capture the project identifier issued by OpenAI:

```env
OPENAI_API_KEY=sk-proj-...
OPENAI_PROJECT=proj_...
```

Project keys require the `project` field so the Python SDK can route requests to the correct workspace.
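As a sketch of how those two variables are consumed, the official `openai` Python SDK accepts a `project` argument (and also reads `OPENAI_PROJECT` from the environment by default):

```python
import os

from openai import OpenAI

# The SDK would pick these up from the environment automatically; passing
# them explicitly makes the project-scoped routing visible.
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    project=os.environ["OPENAI_PROJECT"],
)
```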
For hybrid search, set the index configuration variables (sensible defaults shown):

```env
INDEX_NAME=text_embeddings
FULLTEXT_INDEX_NAME=chunk_text_fulltext
FULLTEXT_LABEL=Chunk
FULLTEXT_PROPERTY=text
FULLTEXT_READY_ATTEMPTS=10
FULLTEXT_READY_DELAY=3
```
The full-text index script is idempotent; rerun `make fulltext-index` after ingestion jobs or schema changes to keep lexical search synchronized with vector metadata.
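That idempotency presumably boils down to an `IF NOT EXISTS` guard. A sketch of the equivalent Cypher, using the defaults above:

```cypher
// Create the full-text index only if it is missing; rerunning is a no-op.
CREATE FULLTEXT INDEX chunk_text_fulltext IF NOT EXISTS
FOR (c:Chunk) ON EACH [c.text]
```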
`FULLTEXT_READY_ATTEMPTS` and `FULLTEXT_READY_DELAY` tune how long the provisioning script waits for Neo4j to accept connections before failing. Defaults cover local Docker startup, but tighten them for pre-warmed environments or extend them for slower clusters.
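A minimal sketch of that readiness loop, assuming the official `neo4j` Python driver; the connection variables (`NEO4J_URI`, `NEO4J_USERNAME`, `NEO4J_PASSWORD`) are assumptions about how the script is configured:

```python
import os
import time

from neo4j import GraphDatabase
from neo4j.exceptions import ServiceUnavailable

attempts = int(os.environ.get("FULLTEXT_READY_ATTEMPTS", "10"))
delay = float(os.environ.get("FULLTEXT_READY_DELAY", "3"))

# Connection settings are assumptions; match them to your .env.local.
driver = GraphDatabase.driver(
    os.environ.get("NEO4J_URI", "bolt://localhost:7687"),
    auth=(os.environ["NEO4J_USERNAME"], os.environ["NEO4J_PASSWORD"]),
)

# Poll until Neo4j accepts connections, or give up after `attempts` tries.
for attempt in range(1, attempts + 1):
    try:
        driver.verify_connectivity()
        break
    except ServiceUnavailable:
        if attempt == attempts:
            raise
        time.sleep(delay)
```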
Story 1.2 introduces a Google OAuth-protected FastMCP server that fronts the `HybridCypherRetriever`. Configure the additional environment variables (see `.env.example` for defaults):

- `MCP_BASE_URL`: Public base URL used during OAuth callbacks (`https://...`).
- `MCP_SERVER_HOST`, `MCP_SERVER_PORT`, `MCP_SERVER_PATH`: Local binding for the HTTP transport; defaults to `0.0.0.0:8080` and `/mcp`.
- `GOOGLE_OAUTH_CLIENT_ID`, `GOOGLE_OAUTH_CLIENT_SECRET`: Credentials issued via Google Cloud Console.
- `GOOGLE_OAUTH_REQUIRED_SCOPES`: Comma-separated scopes. The baseline requests `openid` and `userinfo.email`.
- `HYBRID_RETRIEVAL_QUERY_PATH`: Path to the Cypher projection appended after the hybrid search prelude. The default file `queries/hybrid_retrieval.cypher` returns the node, its text, and the combined score (see the sketch after this list).
- `EMBEDDING_MODEL`, `EMBEDDING_TIMEOUT_SECONDS`, `EMBEDDING_MAX_RETRIES`: Tuning knobs for the OpenAI-compatible embedding client that powers query embeddings.
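As a sketch of what such a projection might look like, given that the hybrid search prelude yields `node` and `score` (the shipped `queries/hybrid_retrieval.cypher` is authoritative):

```cypher
// Projection appended after the hybrid search prelude, which yields
// `node` and `score`. Returns the element id, text, and combined score.
RETURN elementId(node) AS id, node.text AS text, score AS score
```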
After populating `.env.local`, start the server:

```bash
uv run python servers/mcp_hybrid_google.py
```
Structured JSON logs announce startup, incoming tool invocations, embedding latencies, and retries. The `/mcp/search` tool returns both normalized vector and full-text scores so downstream clients (e.g., ChatGPT) can reason about result ranking. Use the `/mcp/fetch` tool to retrieve a specific node by `elementId` and see its metadata.
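A hypothetical client-side check using the FastMCP Python client; the tool name and argument shape are assumptions based on the description above, and OAuth is omitted for brevity:

```python
import asyncio

from fastmcp import Client


async def main() -> None:
    # Connect over HTTP to the locally bound server (defaults above).
    async with Client("http://localhost:8080/mcp") as client:
        # Hypothetical arguments; the server's tool schema is authoritative.
        results = await client.call_tool("search", {"query": "Who founded Acme Corp?"})
        print(results)


asyncio.run(main())
```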
The `docker-compose.yml` file provisions a single `neo4j` service on an isolated bridge network named `rag-net`. The container mounts persistent volumes for `/data`, `/logs`, and `/plugins`, pulls the official `neo4j:5.18` image, and enables APOC via the `NEO4J_PLUGINS` env setting. Credentials come from `.env.local`, allowing `make up` and `make down` to manage the lifecycle with `docker compose`. Re-run `make index` whenever the embedding dimension changes (default: 768 for the local model).
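A sketch of what that compose file plausibly looks like; the volume names and port mappings are assumptions, and the repository's actual `docker-compose.yml` is authoritative:

```yaml
services:
  neo4j:
    image: neo4j:5.18
    networks:
      - rag-net
    ports:
      - "7474:7474"   # HTTP browser (assumed mapping)
      - "7687:7687"   # Bolt (assumed mapping)
    environment:
      # Credentials flow in from .env.local; NEO4J_AUTH uses the official
      # image's "user/password" format, and NEO4J_PLUGINS enables APOC.
      NEO4J_AUTH: "${NEO4J_USERNAME}/${NEO4J_PASSWORD}"
      NEO4J_PLUGINS: '["apoc"]'
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs
      - neo4j_plugins:/plugins

networks:
  rag-net:
    driver: bridge

volumes:
  neo4j_data:
  neo4j_logs:
  neo4j_plugins:
```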