
Second Brain is a desktop application that acts as a personal knowledge base, using retrieval-augmented generation (RAG), multimodal AI models, and a hybrid lexical/semantic search algorithm to interact with local text files and images.


Second Brain

Local RAG Agent by Henry Daum

If you find any errors, please let me know.

NEW: You can download the Installer for Second Brain at this Google Drive link (if you trust it, which I'm not assuming everybody will; if you don't, you can install it manually with the instructions below).


Screenshots & GIFs

(Screenshot and two demo videos: SecondBrainDemoVideo.mp4, SecondBrainDemoVideo2.mp4.)

Features

  • Multi-directory support: sync and search across multiple directories simultaneously.
  • Semantic search: finds results based on deep semantic meaning.
  • Lexical search: finds a needle in a haystack using keywords.
  • Supported files: .txt, .pdf, .docx, .gdoc, .png, .jpg, .jpeg, .gif, .webp.
  • Multimodal in three ways:
    • Can embed and search both text and images.
    • Can attach text documents and images, and use them to search.
    • Can use AI language models with vision.
  • Optional AI Mode:
    • LLM integration: supports local models via LM Studio and cloud models, like GPT-5, via OpenAI.
    • Retrieval-augmented generation (RAG): uses search results to ground high-quality AI responses in the knowledge base.
    • AI Mode is entirely optional; a search can be run without AI.
  • Sync 100,000+ files.
  • High-performance sync: multi-threaded processing for fast synchronization of large directories.
  • Security and privacy: local search ensures no data is sent over the internet.
  • Works with Google Drive.
  • Open source: all the code is right here.

Performance Optimization

Second Brain is optimized for handling large directories with thousands of files and images:

  • Parallel Processing: Uses multi-threading to process text files concurrently (up to 8 workers by default)
  • Optimized Image Handling: Larger batch sizes and efficient image loading for faster sync
  • Smart Resource Management: Auto-detects optimal worker count based on your CPU
  • Expected Performance: 4-8x faster text processing, 3-5x faster image processing compared to sequential processing

For large directories (1000+ files), the improved sync speed means you can keep your knowledge base up-to-date more frequently without long waiting times.
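As an illustration of what this multi-threading can look like in Python (the function names and the .txt filter are assumptions for the sketch, not the app's actual code):

import os
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_file(path):
    # Illustrative worker: read one text file so it can be chunked and embedded.
    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        return path, f.read()

def sync_directory(root, max_workers=None):
    # Auto-detect a reasonable worker count from the CPU, capped at 8 by default.
    max_workers = max_workers or min(8, os.cpu_count() or 1)
    paths = [os.path.join(dirpath, name)
             for dirpath, _, names in os.walk(root)
             for name in names if name.endswith(".txt")]
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process_file, p) for p in paths]
        for future in as_completed(futures):
            results.append(future.result())
    return results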


Infographic of How it Works

Second Brain Search Flowchart Infographic

How It Works

Backend Code

SecondBrainBackend.py

This file handles two things: syncing the target directories and performing searches.

Syncing: Scans the target directories and detects new, updated, or deleted files. Extracts text from files and splits it into manageable chunks, while batching images together for speed. Converts text chunks and images into vector embeddings using sentence-transformers models, and stores the embeddings in a persistent ChromaDB vector database. To enable lexical search over images, captions are generated and stored alongside the image embeddings; a BM25 lexical index is then built over both text documents and image captions.
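A minimal sketch of the text half of that pipeline, using the same libraries listed in requirements.txt (the function and collection names are illustrative, and the real app also batches images, captions them, and tracks file changes):

import chromadb
from langchain_text_splitters import RecursiveCharacterTextSplitter
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

def index_document(text, doc_id):
    # Split the document into small chunks. The app chunks by tokens
    # (chunk_size=200, chunk_overlap=0); this splitter counts characters,
    # so the size here is only illustrative.
    splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=0)
    chunks = splitter.split_text(text)

    # Embed the chunks and store them in a persistent ChromaDB collection.
    model = SentenceTransformer("BAAI/bge-large-en-v1.5")
    embeddings = model.encode(chunks, batch_size=32)
    client = chromadb.PersistentClient(path="chroma_db")
    collection = client.get_or_create_collection("texts")
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        embeddings=embeddings.tolist(),
        documents=chunks,
    )

    # Build a BM25 index over the same chunks for lexical search.
    bm25 = BM25Okapi([chunk.lower().split() for chunk in chunks])
    return collection, bm25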

Retrieval: Performs both a semantic and a lexical search, combines the outputs, reranks all results using Maximal Marginal Relevance (MMR), takes the top N, and finally filters for relevance using AI (this last step is optional).
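The MMR rerank can be sketched as follows. This is a minimal semantic-only version; the app's reranker also mixes in a lexical diversity term (weighted by mmr_alpha in the config), and the function names here are illustrative:

import numpy as np

def mmr_rerank(query_vec, doc_vecs, lam=0.5, top_n=6):
    # lam=1.0 keeps only the most relevant results; lam=0.0 keeps only the
    # most mutually diverse ones (this mirrors mmr_lambda in config.json).
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    relevance = [cos(query_vec, d) for d in doc_vecs]
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < top_n:
        def mmr_score(i):
            # Penalize results too similar to anything already selected.
            redundancy = max((cos(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected  # indices of doc_vecs, in rerank order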

Frontend Code

SecondBrainFrontend.py

The frontend code controls the user interface for the application: threading for non-blocking backend operations, prompting patterns for the AI, and logging. It manages text and image attachments to construct complex prompts for the LLM, and lets the user send messages in a chat-like format. It performs many different functions simultaneously, blocking and re-enabling buttons at the right times to prevent crashes. Looks good, too, if I may say so myself.

Uniquely, image and text results can be attached in order to perform continuous searches, like web-surfing but through your own filebase. There is no other experience exactly like it. Simply click on an image result or the four squares next to a text result, then click "attach result." Other similar amenities are available in the app.

Settings

config.json

This file can be edited by the user to change the cool features of Second Brain (see below).

Google Drive Authentication

credentials.json

This file must be added (see below) if using Google Drive (optional), as it allows the syncing of Google Doc (.gdoc) files.

Image Labels

image_labels.csv

This CSV is used as a pool for possible image labels. The image labels are chosen based on how close the image embedding is to each label embedding (a categorization task). It was constructed based on Google's Open Images dataset for object identification.
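In sketch form, that categorization step might look like this, assuming the same CLIP model used for image embeddings (the function name is hypothetical):

from PIL import Image
from sentence_transformers import SentenceTransformer, util

def closest_labels(image_path, labels, top_k=3):
    # Embed the candidate labels and the image with one CLIP model, then
    # keep the labels whose embeddings are nearest to the image's.
    model = SentenceTransformer("clip-ViT-B-16")
    label_vecs = model.encode(labels, convert_to_tensor=True)
    image_vec = model.encode(Image.open(image_path), convert_to_tensor=True)
    scores = util.cos_sim(image_vec, label_vecs)[0]
    return [labels[int(i)] for i in scores.argsort(descending=True)[:top_k]]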

Hotkey to Run Second Brain

SecondBrainHotkey.ahk: If you want, you can make a hotkey (I use Ctrl+J) with AutoHotkey that runs SecondBrainFrontend.py from anywhere on your computer, which is very convenient. To do this, download AutoHotkey itself, then edit the script so that the path points to your SecondBrain folder. To make the hotkey active when you start your computer, place the script in your startup folder, which you can find (on Windows) by pressing Win+R and typing shell:startup.


Getting Started

1. Prerequisites

Before running the application, ensure you have the following:

  • Python 3.9 or higher - Download Python (Make sure to check "Add Python to PATH" during installation)
  • LM Studio (optional, for local AI): Download and install LM Studio with a .gguf model, ideally with vision capabilities (unsloth/gemma-3-4b-it works well, but any model may be used).
  • OpenAI API (optional, for cloud AI): Make sure you have an API key from platform.openai.com with sufficient funds.
  • System requirements: The search engine can be used with a GPU or CPU. It will automatically detect whether a GPU exists and use it if available (see the sketch below). Different text and image embedding models use different amounts of memory and can be configured; for example, the default models use 2GB of VRAM/RAM.
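The GPU auto-detection mentioned above typically reduces to a one-line torch check, roughly:

import torch
from sentence_transformers import SentenceTransformer

# Prefer CUDA when a GPU is present; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("BAAI/bge-large-en-v1.5", device=device)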

2. Quick Installation & Launch 🚀

NEW: Automatic Setup Scripts!

After downloading the repository, simply run the appropriate script for your system:

Windows (PowerShell - recommended):

.\setup_and_run.ps1

If you get an ExecutionPolicy error, run:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Windows (Command Prompt):

setup_and_run.bat

Linux / macOS:

chmod +x setup_and_run.sh  # Make executable (first time only)
./setup_and_run.sh

The script will automatically:

  • ✅ Check for Python 3.9+
  • ✅ Verify all required files are present
  • ✅ Create a virtual environment
  • ✅ Install all dependencies
  • ✅ Launch the application

First run may take 5-15 minutes to download dependencies (~2-3 GB). Subsequent runs will be much faster - just run the same script again!

3. Manual Installation (if automatic setup fails)

Download SecondBrainBackend.py, SecondBrainFrontend.py, config.json, image_labels.csv and requirements.txt from this repository. Place them in a folder.

Then install dependencies:

pip install -r requirements.txt

And run:

python SecondBrainFrontend.py

4. Configuration

Open config.json and update the target_directories array to point to the folder(s) you want to use as your knowledge base.

You can specify one or multiple directories:

Single directory:

"target_directories": [
    "C:\\Users\\user\\Documents\\MyNotes"
]

Multiple directories:

"target_directories": [
    "C:\\Users\\user\\Documents\\MyNotes",
    "C:\\Users\\user\\My Drive",
    "D:\\Projects\\Documentation"
]

All specified directories will be scanned and synced together into a unified searchable knowledge base.

Note: The old target_directory (singular) format is still supported for backward compatibility.
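A config loader handling that compatibility might look like this (hypothetical helper, not the app's actual code):

import json

def load_target_directories(config_path="config.json"):
    # Prefer the new target_directories array; fall back to the old
    # singular target_directory key for backward compatibility.
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    dirs = config.get("target_directories")
    if dirs is None and "target_directory" in config:
        dirs = [config["target_directory"]]
    return dirs or []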

(Optional - skip if irrelevant) To enable Google Doc syncing, follow the Google Cloud API instructions to get a credentials.json file using OAuth. Either place the file in the project directory, or place it somewhere else and update credentials_path in config.json to point to it. The first time you sync, a browser window will open for you to authorize the application. Authentication is very finicky, and it might be necessary to delete the authorization token (token.json) and then reauthorize to get Drive syncing to work; you can do this with the "Reauthorize Drive" button.
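For reference, the standard google-auth-oauthlib pattern behind this flow looks roughly like the sketch below. The read-only Drive scope is an assumption here; the app's actual auth code and scopes may differ:

import os
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]  # assumed scope

def get_drive_credentials(credentials_path="credentials.json",
                          token_path="token.json"):
    # Reuse a cached token.json if it is still valid, refresh it if expired,
    # and otherwise open a browser window for the user to authorize.
    creds = None
    if os.path.exists(token_path):
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(credentials_path, SCOPES)
            creds = flow.run_local_server(port=0)
        with open(token_path, "w") as f:
            f.write(creds.to_json())
    return creds

Deleting token.json forces the browser flow to run again, which is exactly what the "Reauthorize Drive" button relies on.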


Usage Guide

Syncing your files:

Click the Sync Directory button to start embedding the files from your target directories. Depending on the size of the directories, a sync can take a while (hours), but it is necessary to enable search. A sync can be cancelled midway by clicking the Cancel Sync button with no consequences; if restarted, it will continue where it left off.

Searching:

Text results are based on relevant chunks of text from the embedding database. You can click on image results and file paths to open them directly, attach them, see their parent folder, or copy their path.

Using AI mode:

Toggle the AI Mode checkbox to enable or disable LLM-augmented search. When enabled, Second Brain loads the selected LLM from LM Studio or the OpenAI API; searches are then enhanced with AI-generated queries, results are filtered by the AI based on relevance, and a final "AI Insights" summary is always streamed into the results container. Vision-enabled models are optional but recommended, since vision is needed to filter images and give insights on them. When disabled, the LLM is unloaded to save system resources, and the app performs a direct vector/lexical search with no query expansion, filtering, or summary. Query expansion and filtering can be toggled individually (see config: query_multiplier and llm_filter_results).
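For example, to keep AI Mode on while skipping those slower extras, the relevant config.json entries (documented under Configuration Details below) would be:

"ai_mode": true,
"query_multiplier": 0,
"llm_filter_results": false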

Attaching files:

If you attach a text document (.pdf, .docx, etc.), the entire text is added if it is below a certain size (see config: max_attachment_size). If it is too large, several relevant chunks are extracted to provide focused context for the search. If you attach an image, you can send it to find visually similar images in your database, along with related documents. You can send an attachment without a message.
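In sketch form, that size check might look like this (the whitespace token count and the retrieve_chunks helper are stand-ins for the app's real tokenizer and search):

def attachment_context(text, retrieve_chunks,
                       max_attachment_size=1024, n_attachment_chunks=3):
    # Small enough: attach the whole document as context.
    if len(text.split()) <= max_attachment_size:
        return text
    # Too large: keep only the most relevant chunks, which also double as
    # extra queries for the lexical and semantic search.
    return "\n\n".join(retrieve_chunks(text, n=n_attachment_chunks))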

Saving Insights:

If you find that Second Brain has produced a good insight, you can save it for future use. When you click the "Save Insight" button, found after the AI Insights section of a response, the query and response get saved to a .txt file. These text files will be embedded, which gives Second Brain a long-term memory of its own contributions. (You can re-read or delete entries by going to the saved_insights folder, in the Second Brain directory.)
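A minimal sketch of what saving an insight could look like (the file naming is illustrative):

from datetime import datetime
from pathlib import Path

def save_insight(query, response, folder="saved_insights"):
    # Persist the query/response pair as a .txt file; the next sync embeds
    # it, giving Second Brain a memory of its own contributions.
    Path(folder).mkdir(exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    path = Path(folder) / f"insight-{stamp}.txt"
    path.write_text(f"Query: {query}\n\nResponse: {response}\n", encoding="utf-8")
    return path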


Configuration Details

config.json

Each parameter is listed with its function, accepted range, and default value:

  • credentials_path: Path to credentials.json, which is used to authenticate Google Drive downloads. By default it looks in the same folder as SecondBrainFrontend.py. Range: any path. Default: "credentials.json"
  • target_directories: Array of directories to sync. Embeds all valid files from all specified directories into a unified searchable knowledge base. Supports one or many directories; the old singular target_directory format is still supported for backward compatibility. Range: array of directory paths. Default: ["C:\\Users\\user\\My Drive"]
  • text_model_name: SentenceTransformers text embedding model name. "BAAI/bge-m3", "BAAI/bge-large-en-v1.5", and "BAAI/bge-small-en-v1.5" work well; the small version uses the fewest system resources and m3 the most, and BAAI/bge-m3 is multilingual. If you change the text model, you must remake all of your embeddings with that model: delete the chroma_db folder or use the helper function at the bottom of SecondBrainBackend. Range: any valid name. Default: "BAAI/bge-large-en-v1.5"
  • image_model_name: SentenceTransformers image embedding model name. "clip-ViT-L-14", "clip-ViT-B-16", and "clip-ViT-B-32" work well; the 32 version uses the fewest system resources and L-14 the most. If you change the image model, you must remake all of your embeddings with that model. Range: any valid name. Default: "clip-ViT-B-16"
  • embed_use_cuda: Setting this to false forces the embedding models to use the CPU; if true, CUDA is used when possible. I recommend CUDA for syncing, but the CPU is more than enough for individual searches, which lets you allocate more VRAM to a local model if desired. Range: true or false. Default: true
  • batch_size: How many text chunks/images are processed in parallel by the embedders. Embedding is faster with higher values, given enough compute; the default is tuned for performance. Range: 1-100. Default: 32
  • max_workers: Number of parallel workers for file processing during sync. Higher values speed up synchronization of large directories. Auto-detected from CPU cores if not specified. Range: 1-16. Default: 8
  • chunk_size: Maximum size of the text chunks created by the text splitter, in tokens; research shows that 200 with a chunk_overlap of 0 works very well. Range: 50-2000. Default: 200
  • chunk_overlap: Used to preserve continuity when splitting text. Range: 0-200. Default: 0
  • max_seq_length: Maximum input size for the text embedding model, in tokens. Lower values are faster. Must be larger than chunk_size. Range: depends on the model, 250-8192. Default: 512
  • mmr_lambda: Prioritizes diversity or relevance in search results; 0 = diversity only, 1 = relevance only. Range: 0.0-1.0. Default: 0.5
  • mmr_alpha: The MMR rerank uses a hybrid semantic-lexical diversity metric; 0.0 prioritizes lexical diversity, while 1.0 uses only semantic diversity when reranking. Range: 0.0-1.0. Default: 0.5
  • search_multiplier: How many results to find for each query. Range: at least 1. Default: 5
  • ai_mode: Whether to use AI to aid in searches. Range: true or false. Default: true
  • show_buttons: Whether to show the button row at the top. Range: true or false. Default: true
  • llm_filter_results: Filtering results is somewhat slow; this option turns that part of ai_mode on or off. Range: true or false. Default: false
  • llm_backend: Which AI backend to use. OpenAI is slower and not local, but has smarter models. Range: "LM Studio" or "OpenAI". Default: "LM Studio"
  • lms_model_name: Any language model from LM Studio, but it must already be downloaded. Vision is preferred. Range: any valid name. Default: "unsloth/gemma-3-4b-it"
  • openai_model_name: Any OpenAI text model. (Some OpenAI models, like gpt-5, require additional verification to use.) Range: any OpenAI model. Default: "gpt-5-mini"
  • openai_api_key: When using OpenAI as a backend, an API key from platform.openai.com is required. Using it costs money, so the account must have sufficient funds. If this field is left blank (""), the OpenAI client will look for an environment variable called OPENAI_API_KEY and use that; otherwise the client uses the string found here. Range: any API key. Default: ""
  • max_results: Sets the maximum for both text and image results. Range: 1-30. Default: 6
  • search_prefix: Phrase prefixed to the start of text search queries; recommended with the text embedding models BAAI/bge-large-en-v1.5 and BAAI/bge-small-en-v1.5, and not needed for BAAI/bge-m3. Set to "" to disable. Range: any string. Default: "Represent this sentence for searching relevant passages: "
  • query_multiplier: How many queries the AI is asked to generate to augment the search, based on the user's attachment and prompt. Set to 0 to turn the feature off. Range: 0 or more. Default: 5
  • max_attachment_size: Tries to add an entire attachment to the AI's context if it is below this size, measured in tokens; decrease this if the context window is small. If an attachment is larger, only its most relevant chunks are added. Range: 256-8192. Default: 1024
  • n_attachment_chunks: How many relevant text chunks to extract from attachments. In addition to serving as LLM context, these are used as additional queries for lexical and semantic search. Range: 1-10. Default: 3
  • system_prompt: Special instructions for how the AI should do its job. Feel free to change these to anything. Range: any string. Default: "You are a personal search assistant, made to turn user prompts into accurate and relevant search results, using information from the user's database. Special instruction: Your persona is that of an analytical and definitive guide. You explain all topics with a formal, structured, and declarative tone. You frequently use simple, structured analogies to illustrate relationships and often frame your responses with short, philosophical aphorisms."
  • generate_queries_prompt: Special instructions for how the AI should do its job. It's best not to change this, but you can. Range: any string. Default: "{system_prompt}\n\n{user_request}\n{attachment_context}\nBased on the user's prompt, generate {n} creative search queries that could retrieve relevant {content} to answer the user. These queries will go into a semantic search algorithm to retreive relevant {content} from the user's database. The queries should be broad enough to find a variety of related items. These queries will search a somewhat small and personal database (that is, the user's hard drive). Respond with a plain list with no supporting text or markdown.{reminder_suffix}"
  • evaluate_text_relevance_prompt: Special instructions for how the AI should do its job. It's best not to change this, but you can. Range: any string. Default: "{system_prompt}\n\n{attachment_context}\n{user_request}\nDocument excerpt to evaluate:\n"{chunk}"\n\nIs this excerpt worth keeping? Respond only with YES or NO.\n\nRelevance is the most important thing. Does the snippet connect to the user's request?\n\nIf the excerpt is gibberish, respond with NO.\n\n(Again: respond only with YES or NO.)"
  • evaluate_image_relevance_prompt: Special instructions for how the AI should do its job. It's best not to change this, but you can. Range: any string. Default: "{system_prompt}\n\n{attachment_context}\n{user_request}\nIs the provided image worth keeping? Respond only with YES or NO.\n\nRelevance is the most important thing. Does the photo connect to the user's request?\n\nIf the image is blank, corrupted, or unreadable, respond with NO.\n\nImage file path: {image_path}\n\nIf the user's query has an exact match within the file path, respond with YES.\n\n(Again: respond only with YES or NO.)"
  • synthesize_results_prompt: Special instructions for how the AI should do its job. It's best not to change this, but you can. Range: any string. Default: "{system_prompt}\n\nIt is {date_time}.\n\n{user_request}\n{attachment_context}\n{database_results}\nYour Task:\nBased exclusively on the information provided above, write a concise and helpful response. Your primary goal is to synthesize the information to guide the user towards what they want.\n\nInstructions:\n- The text search results are snippets from larger documents and may be incomplete.\n- Do not assume or guess the author of a document unless the source text makes it absolutely clear.\n- The documents don't have timestamps; don't assume the age of a document unless the source text makes it absolutely clear.\n- Cite every piece of information you use from the search results with its source, like so: (source_name).\n- If the provided search results are not relevant to the user's request, state that you could not find any relevant information.\n- Use markdown formatting (e.g., bolding, bullet points) to make the response easy to read.\n- If there are images, make sure to consider them for your response."

Dependencies

All Python dependencies are listed in requirements.txt:

pip install -r requirements.txt

OR simply run setup_and_run.bat to install everything automatically.

requests
PyMuPDF
chromadb
python-docx
rank_bm25
flet
numpy
Pillow
langchain_core
transformers
sentence-transformers
scipy
langchain-text-splitters
openai
lmstudio
google-auth-oauthlib
google-api-python-client
torch

Hidden Variables

These variables can be found in the code and change minor features: IMG_THUMBNAIL, stop_words, jpeg_quality, min_len, low_compression_threshold, high_compression_threshold, temperature, openai_vision_keywords, image_preview_size.


Further Reading / Sources

https://lmstudio.ai/docs/python

https://research.trychroma.com/evaluating-chunking

https://github.com/ALucek/chunking-strategies/blob/main/chunking.ipynb

https://flet.dev/docs/

https://huggingface.co/spaces/mteb/leaderboard
