Ally is an AI-powered CLI tool designed to assist with anything from everyday tasks to complex projects efficiently, without leaving the terminal.
Ally was built as a fully local agentic system using Ollama, but it also works seamlessly with:
- OpenAI
- Anthropic
- Google GenAI
- Cerebras
- (more integrations on the way!)
This tool is best suited for scenarios where privacy is paramount and agentic capabilities are needed in the workflow.
A general-purpose agent that can:
- Read, write, modify, and delete files and directories.
- Access the internet.
- Execute commands and code.
Note: Tools always ask for your permission before executing.
Ally can take your files, embed them into its knowledge base, and use them to respond to your prompts with a high level of accuracy.
Currently, Ally's embedding functions can use:
- Hugging Face models (locally)
- Ollama Embedding models (locally)
- NLP Cloud (hosted)
- OpenAI (hosted)
- More on the way...
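For reference, the embedding-related fields in `config.json` (shown in full in the configuration section below) could look like this for a fully local Hugging Face setup; the model name is just the example used later in this README:

```json
{
  "embedding_provider": "hf",
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
}
```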
RAG Tutorial:
- Set up the `config.json` as shown below with the appropriate embedding settings.
- Provide the path to the file or folder whose contents should be embedded. As an alternative, you can launch Ally from that directory.
- Use `/embed <path> <collection_name>`, or `/embed . <collection_name>` if already at the correct path.
- Start the RAG session with `/start_rag`.
- End the RAG session with `/stop_rag`.
Note that Ally will not use any external data to answer your prompts during RAG sessions unless explicitly given permission to.
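Putting the steps above together, a typical session inside the Ally CLI could look like this (the path and collection name are illustrative; the `#` annotations are explanatory comments, not part of the commands):

```
/embed ./docs my_docs    # embed everything under ./docs into the "my_docs" collection
/start_rag               # answers are now grounded in your indexed collections
/stop_rag                # end the RAG session
```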
Additional commands:
- Edit indexed collections with `/index <collection_name>` and `/unindex <collection_name>`. Note: newly created collections are already indexed.
- View all collections with `/list`.
- Reset the database with `/purge`, or delete a specific collection with `/delete <collection_name>`.
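For example, assuming indexed collections are the ones consulted during RAG sessions, managing your knowledge base could look like this (the collection name is illustrative):

```
/list                    # view all collections
/unindex my_docs         # keep the collection but exclude it from RAG answers
/index my_docs           # make it available again
/delete my_docs          # remove this collection entirely
/purge                   # reset the whole database
```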
- Use the `--create-project` flag or the `/project` command in the default chat interface.
Complete workflow:
- Asks for your project idea.
- Runs the Brainstormer Agent to create the context space and full project specification for the Codegen Agent (in `.md` format).
- Optionally lets you provide more context by chatting interactively with the Brainstormer Agent.
- Runs the Codegen Agent using the generated `.md` files.
- Opens an interactive chat with the Codegen Agent to refine or extend the project.
This workflow is still in its early stages.
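A minimal sketch of launching it, assuming the flag is passed directly to the `ally` command:

```
ally --create-project    # from your terminal
/project                 # or from inside the default chat interface
```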
You have two options: run Ally via Docker or install it locally on your machine.
Create a `.env` file (or copy `.env.example`) in any location:

```
# Inference providers (only include those you need)
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GOOGLE_GEN_AI_API_KEY=...
CEREBRAS_API_KEY=...
# Embedding provider APIs (only include those you need)
NLP_CLOUD_API_KEY=...
# Google Search API (if omitted, online search tools will be limited)
GOOGLE_SEARCH_API_KEY=...
SEARCH_ENGINE_ID=...
```
See the steps in the configuration section below for the Google Search API part.
Open a terminal in that directory and type:

```bash
# Pull the Ally image:
docker pull yassw0rks/ally:latest
# Start the container for the first time
docker run -it --env-file .env --name ally yassw0rks/ally:latest
# You could also assign a volume:
# Replace <YOUR_LOCAL_DIR> with a path on your machine.
# Note that Docker must have permission to access <YOUR_LOCAL_DIR>. This can be configured in Docker's settings.
docker run -it \
--env-file .env \
-v <YOUR_LOCAL_DIR>:/data \
--name ally \
yassw0rks/ally:latest
```

Next time you want to jump back in:

```bash
# Check if container already running
docker ps
# If it is running
docker exec -it ally /bin/bash
# If it's stopped
docker start -ai ally
```

Edit the `config.json` file (see below) inside `/app` as needed. Nano is included in the image for your convenience.
Note: this image does not contain Ollama, but it can easily be set up once inside the container.
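One possible way to set it up inside the running container, assuming a Linux-based image with `curl` available (the model name is just an example taken from the config below):

```bash
# Inside the container
curl -fsSL https://ollama.com/install.sh | sh   # official Ollama install script
ollama serve &                                  # start the Ollama server if it isn't already running
ollama pull gpt-oss:20b                         # pull the model referenced in config.json
```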
In your chosen installation folder, open a terminal window and run:

```bash
git clone https://github.com/YassWorks/Ally.git
```

The `config.json` file (located at `Ally/`) controls Ally's main settings and integrations.
Example configuration:

```json
{
"provider": "openai",
"provider_per_model": {
"general": "ollama",
"code_gen": "anthropic",
"brainstormer": null, // autofilled with 'openai'
"web_searcher": null // autofilled with 'openai'
},
"model": "gpt-4o",
"models": {
"general": "gpt-oss:20b",
"code_gen": "claude-sonnet-3.5",
"brainstormer": null, // autofilled with 'gpt-4o'
"web_searcher": null // autofilled with 'gpt-4o'
},
"temperatures": {
"general": 0.7,
"code_gen": 0,
"brainstormer": 1,
"web_searcher": 0
},
"system_prompts": {
// (recommended) leave as-is to use Ally's defaults
"general": null,
"code_gen": null,
"brainstormer": null,
"web_searcher": null
},
"embedding_provider": null, // example: "hf" or "ollama"
"embedding_model": null, // example: "sentence-transformers/all-MiniLM-L6-v2" or "all-minilm"
"scraping_method": "simple" // or "docling"
}
```

Note: Docling is heavy and requires many dependencies. If you wish to use Docling, the local install is recommended.
Alternatively, you could set up a volume (for the parsing and embedding models) between your machine and the container so that models persist across sessions. See below for where the models are stored inside the container by default.
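For example, assuming the container stores models under `/root/.local/share/Ally` (the Linux default listed below, applied inside the container), a run command that persists them could look like this:

```bash
docker run -it \
  --env-file .env \
  -v <YOUR_LOCAL_MODELS_DIR>:/root/.local/share/Ally \
  --name ally \
  yassw0rks/ally:latest
```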
This file stores your API keys.
```
# Inference providers (only include those you need)
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GOOGLE_GEN_AI_API_KEY=...
CEREBRAS_API_KEY=...
# Embedding provider APIs (only include those you need)
NLP_CLOUD_API_KEY=...
# Google Search API (if omitted, online search tools will be limited)
GOOGLE_SEARCH_API_KEY=...
SEARCH_ENGINE_ID=...
```
- Set up a Google Programmable Search Engine
- Copy the contents above (or from `.env.example`) into `.env`.
- Fill in your API keys and IDs.
Depending on your OS, run either `setup.cmd` (Windows) or `setup.sh` (Linux/macOS).
Note: Ally creates its own virtual environment to keep dependencies isolated and automatically adds itself to your PATH.
Now you're ready to run Ally from anywhere in the terminal using `ally`.
Use `ally -h` for more help.
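For instance, from any directory:

```bash
ally       # start the default chat interface
ally -h    # list the available flags and options
```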
- Edit the following environment variables if needed:
| Environment Variable | Purpose |
|---|---|
| `ALLY_HISTORY_DIR` | Controls where Ally stores its history. |
| `ALLY_DATABASE_DIR` | Controls where Ally stores its database. |
| `ALLY_EMBEDDING_MODELS_DIR` | Controls where Ally stores its embedding models (Hugging Face). |
| `ALLY_PARSING_MODELS_DIR` | Controls where Ally stores its parsing models used by Docling. |
Defaults are:
- Windows: `%LOCALAPPDATA%\Ally\...`
- Linux & macOS: `~/.local/share/Ally/...`
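If you want to override these locations, you can export the variables before launching Ally; a minimal sketch for Linux/macOS, with illustrative paths:

```bash
export ALLY_HISTORY_DIR="$HOME/ally/history"
export ALLY_DATABASE_DIR="$HOME/ally/db"
export ALLY_EMBEDDING_MODELS_DIR="$HOME/ally/embedding_models"
export ALLY_PARSING_MODELS_DIR="$HOME/ally/parsing_models"
ally
```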
- RAG-related tools used by Ally are large, so they are downloaded only after RAG settings are enabled in the `config.json` file. As a result, Ally will perform additional downloads the next time it is launched after these configuration changes.
- To save a chat, use `/id` to view the conversation ID. The next time you open Ally, continue the conversation by using the `-i` flag followed by the ID. You can do the same inside the CLI with `/id <your_id>` (see the example after these notes).
- Embedding and scraping files that require OCR (such as PDFs and DOCX) currently use a CPU-only PyTorch installation. You can modify the configuration to use a GPU if desired, though this is typically only necessary for processing very large files.
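Resuming a saved conversation, for example, could look like this after noting the ID printed by `/id` (the ID is a placeholder):

```bash
ally -i <conversation_id>
```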
Apache-2.0
Issues and PRs are always welcome 💌
If you liked my work, contact me via email to discuss contributions or collaborations on other projects!
