🚀 MassGen: Multi-Agent Scaling System for GenAI

MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks.

Multi-agent scaling through intelligent collaboration in Grok Heavy style

MassGen is a cutting-edge multi-agent system that leverages the power of collaborative AI to solve complex tasks. It assigns a task to multiple AI agents who work in parallel, observe each other's progress, and refine their approaches to converge on the best solution to deliver a comprehensive and high-quality result. The power of this "parallel study group" approach is exemplified by advanced systems like xAI's Grok Heavy and Google DeepMind's Gemini Deep Think. This project started with the "threads of thought" and "iterative refinement" ideas presented in The Myth of Reasoning, and extends the classic "multi-agent conversation" idea in AG2. Here is a video recording of the background context introduction presented at the Berkeley Agentic AI Summit 2025.

📋 Table of Contents

🗺️ Roadmap

Key Future Enhancements
- Advanced Agent Collaboration
- Expanded Model, Tool & Agent Integration
- Improved Performance & Scalability
- Enhanced Developer Experience
- Web Interface
v0.0.7 Roadmap

✨ Key Features

Feature	Description
🤝 Cross-Model/Agent Synergy	Harness strengths from diverse frontier model-powered agents
⚡ Parallel Processing	Multiple agents tackle problems simultaneously
👥 Intelligence Sharing	Agents share and learn from each other's work
🔄 Consensus Building	Natural convergence through collaborative refinement
📊 Live Visualization	See agents' working processes in real-time

🏗️ System Design

MassGen operates through an architecture designed for seamless multi-agent collaboration:

graph TB
    O[🚀 MassGen Orchestrator<br/>📋 Task Distribution & Coordination]

    subgraph Collaborative Agents
        A1[Agent 1<br/>🏗️ Anthropic/Claude + Tools]
        A2[Agent 2<br/>🌟 Google/Gemini + Tools]
        A3[Agent 3<br/>🤖 OpenAI/GPT/O + Tools]
        A4[Agent 4<br/>⚡ xAI/Grok + Tools]
    end

    H[🔄 Shared Collaboration Hub<br/>📡 Real-time Notification & Consensus]

    O --> A1 & A2 & A3 & A4
    A1 & A2 & A3 & A4 <--> H

    classDef orchestrator fill:#e1f5fe,stroke:#0288d1,stroke-width:3px
    classDef agent fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    classDef hub fill:#e8f5e8,stroke:#388e3c,stroke-width:2px

    class O orchestrator
    class A1,A2,A3,A4 agent
    class H hub

The system's workflow is defined by the following key principles:

Parallel Processing - Multiple agents tackle the same task simultaneously, each leveraging their unique capabilities (different models, tools, and specialized approaches).

Real-time Collaboration - Agents continuously share their working summaries and insights through a notification system, allowing them to learn from each other's approaches and build upon collective knowledge.

Convergence Detection - The system intelligently monitors when agents have reached stability in their solutions and achieved consensus through natural collaboration rather than forced agreement.

Adaptive Coordination - Agents can restart and refine their work when they receive new insights from others, creating a dynamic and responsive problem-solving environment.

This collaborative approach ensures that the final output leverages collective intelligence from multiple AI systems, leading to more robust and well-rounded results than any single agent could achieve alone.

🚀 Quick Start

1. 📥 Installation

Core Installation:

git clone https://github.com/Leezekun/MassGen.git
cd MassGen
pip install uv
uv venv

Optional CLI Tools (for enhanced capabilities):

# Claude Code CLI - Advanced coding assistant
npm install -g @anthropic-ai/claude-code

2. 🔐 API Configuration

Create a .env file in the massgen directory with your API keys:

# Copy example configuration
cp .env.example .env

# Edit with your API keys
ANTHROPIC_API_KEY=your-anthropic-key-here
GEMINI_API_KEY=your-gemini-key-here
OPENAI_API_KEY=your-openai-key-here
XAI_API_KEY=your-xai-key-here
ZAI_API_KEY=your-zai-key-here

Make sure you set up the API key for the model you want to use.

Useful links to get API keys:

3. 🧩 Supported Models and Tools

Models

The system currently supports multiple model providers with advanced capabilities: Anthropic Claude, Claude Code, Google Gemini, OpenAI, xAI Grok, Z AI. More providers and local inference of open-weight models (using vllm or sglang) are welcome to be added.

Tools

MassGen agents can leverage various tools to enhance their problem-solving capabilities. Both API-based and CLI-based backends support different tool capabilities.

Supported Built-in Tools by Backend:

Backend	Live Search	Code Execution	File Operations	Advanced Features
Claude API	✅	✅	❌	Web search, code interpreter
Claude Code	✅	✅	✅	Native Claude Code SDK, comprehensive dev tools
Gemini API	✅	✅	❌	Web search, code execution
Grok API	✅	❌	❌	Web search only
OpenAI API	✅	✅	❌	Web search, code interpreter
ZAI API	❌	❌	❌	-

4. 🏃 Run MassGen

Quick Test with A Single Model

API-based backends:

uv run python -m massgen.cli --model gemini-2.5-flash "Which AI won IMO in 2025?"
uv run python -m massgen.cli --model gpt-5-mini "Which AI won IMO in 2025?"
uv run python -m massgen.cli --model grok-3-mini "Which AI won IMO in 2025?"
uv run python -m massgen.cli --model glm-4.5 "Which AI won IMO in 2025?"

All supported models can be found here.

CLI-based backends:

# Claude Code - Native Claude Code SDK with comprehensive dev tools
uv run python -m massgen.cli --backend claude_code "Can I use claude-3-5-haiku for claude code?"
uv run python -m massgen.cli --backend claude-code "Debug this Python script"

--backend is required for this type of backends.

Multiple Agents from Config

# Use configuration file
uv run python -m massgen.cli --config three_agents_default.yaml "Compare different approaches to renewable energy"

# Mixed API and CLI backends
uv run python -m massgen.cli --config claude_code_flash2.5.yaml "Complex coding task requiring multiple perspectives"

All available quick configuration files can be found here.

CLI Configuration Parameters

Parameter	Description
`--config`	Path to YAML configuration file with agent definitions, model parameters, backend parameters and UI settings
`--backend`	Backend type for quick setup without a config file (`claude`, `claude_code`, `gemini`, `grok`, `openai`, `zai`). Optional because we can infer backend type through model.
`--model`	Model name for quick setup (e.g., `gemini-2.5-flash`, `gpt-5-nano`, ...). See all supported models. `--config` and `--model` are mutually exclusive - use one or the other.
`--system-message`	System prompt for the agent in quick setup mode. If `--config` is provided, `--system-message` is omitted.
`--no-display`	Disable real-time streaming UI coordination display (fallback to simple text output).
`--no-logs`	Disable real-time logging.
`"<your question>"`	Optional single-question input; if omitted, MassGen enters interactive chat mode.

Configuration File Format

MassGen supports YAML configuration files with the following structure (All available quick configuration files can be found here): MassGen supports YAML/JSON configuration files with the following structure (All available quick configuration files can be found here):

Single Agent Configuration:

Use the agent field to define a single agent with its backend and settings:

agent: 
  id: "<agent_name>"
  backend:
    type: "chatcompletion" | "claude" | "claude_code" | "gemini" | "grok" | "openai" | "zai" #Type of backend 
    model: "<model_name>" # Model name
    api_key: "<optional_key>"  # API key for backend. Uses env vars by default.
  system_message: "..."    # System Message for Single Agent

Multi-Agent Configuration:

Use the agents field to define multiple agents, each with its own backend and config:

agents:  # Multiple agents (alternative to 'agent')
  - id: "<agent1 name>"
    backend: 
      type: "chatcompletion" | "claude" | "claude_code" | "gemini" | "grok" | "openai" | "zai" #Type of backend
      model: "<model_name>" # Model name
      api_key: "<optional_key>"  # API key for backend. Uses env vars by default.
    system_message: "..."    # System Message for Single Agent
  - id: "..."
    backend:
      type: "..."
      model: "..."
      ...
    system_message: "..."

Backend Configuration:

Detailed parameters for each agent's backend can be specified using the following configuration formats:

Chatcompletion

backend:
  type: "chatcompletion"
  model: "gpt-oss-120b"  # Model name
  base_url: "https://api.cerebras.ai/v1" # Base URL for API endpoint
  api_key: "<optional_key>"          # API key for backend. Uses env vars by default.
  temperature: 0.7                   # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500                   # Maximum response length

Claude

backend:
  type: "claude"
  model: "claude-sonnet-4-20250514"  # Model name
  api_key: "<optional_key>"          # API key for backend. Uses env vars by default.
  temperature: 0.7                   # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500                   # Maximum response length
  enable_web_search: true            # Web search capability
  enable_code_execution: true        # Code execution capability

Gemini

backend:
  type: "gemini"
  model: "gemini-2.5-flash"          # Model name
  api_key: "<optional_key>"          # API key for backend. Uses env vars by default.
  temperature: 0.7                   # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500                   # Maximum response length
  enable_web_search: true            # Web search capability
  enable_code_execution: true        # Code execution capability

Grok

backend:
  type: "grok"
  model: "grok-3-mini"               # Model name
  api_key: "<optional_key>"          # API key for backend. Uses env vars by default.
  temperature: 0.7                   # Creativity vs consistency (0.0-1.0)
  max_tokens: 2500                   # Maximum response length
  enable_web_search: true            # Web search capability (uses default: mode="auto", return_citations=true)
  # OR manually specify search parameters via extra_body (conflicts with enable_web_search):
  # extra_body:
  #   search_parameters:
  #     mode: "auto"                 # Search strategy (see Grok API docs for valid values)
  #     return_citations: true       # Include search result citations

OpenAI

backend:
  type: "openai"
  model: "gpt-5"                     # Model name
  api_key: "<optional_key>"          # API key for backend. Uses env vars by default.
  temperature: 0.7                   # Creativity vs consistency (0.0-1.0, GPT-5 series models and GPT o-series models don't support this)
  max_tokens: 2500                   # Maximum response length (GPT-5 series models and GPT o-series models don't support this)
  text: 
    verbosity: "medium"              # Response detail level (low/medium/high, only supported in GPT-5 series models)
  reasoning:                         
    effort: "medium"                 # Reasoning depth (low/medium/high, only supported in GPT-5 series models and GPT o-series models)
    summary: "auto"                  # Automatic reasoning summaries (optional)
  enable_web_search: true            # Web search capability - can be used with reasoning
  enable_code_interpreter: true      # Code interpreter capability - can be used with reasoning

Claude Code

backend:
  type: "claude_code"
  cwd: "claude_code_workspace"  # Working directory for file operations
  api_key: "<optional_key>"          # API key for backend. Uses env vars by default.
  
  # Claude Code specific options
  append_system_prompt: ""  # Custom system prompt to append
  max_thinking_tokens: 4096                   # Maximum thinking tokens
  
  # Tool configuration (Claude Code's native tools)
  allowed_tools:
    - "Read"           # Read files from filesystem
    - "Write"          # Write files to filesystem  
    - "Edit"           # Edit existing files
    - "MultiEdit"      # Multiple edits in one operation
    - "Bash"           # Execute shell commands
    - "Grep"           # Search within files
    - "Glob"           # Find files by pattern
    - "LS"             # List directory contents
    - "WebSearch"      # Search the web
    - "WebFetch"       # Fetch web content
    - "TodoWrite"      # Task management
    - "NotebookEdit"   # Jupyter notebook editing
    # MCP tools (if available)
    - "mcp__ide__getDiagnostics"
    - "mcp__ide__executeCode"

ZAI

backend:
  type: "zai"
  model: "glm-4.5"  # Model name
  base_url: "https://api.z.ai/api/paas/v4/" # Base URL for API endpoint
  api_key: "<optional_key>"          # API key for backend. Uses env vars by default.
  temperature: 0.7                   # Creativity vs consistency (0.0-1.0)
  top_p: 0.7                    # Nucleus sampling cutoff; keeps smallest set of tokens with cumulative probability ≥ top_p

UI Configuration:

Configure how MassGen displays information and handles logging during execution:

ui:
  display_type: "rich_terminal" | "terminal" | "simple"  # Display format for agent interactions
  logging_enabled: true | false                          # Enable/disable real-time logging

display_type: Controls the visual presentation of agent interactions
- "rich_terminal": Full-featured display with multi-region layout, live status updates, and colored output
- "terminal": Standard terminal display with basic formatting and sequential output
- "simple": Plain text output without any formatting or special display features
logging_enabled: When true, saves detailed timestamp, agent outputs and system status

Interactive Multi-Turn Mode

MassGen supports an interactive mode where you can have ongoing conversations with the system:

# Start interactive mode with a single agent
uv run python -m massgen.cli --model gpt-5-mini

# Start interactive mode with configuration file
uv run python -m massgen.cli --config three_agents_default.yaml

Interactive Mode Features:

Multi-turn conversations: Multiple agents collaborate to chat with you in an ongoing conversation
Real-time feedback: Displays real-time agent and system status
Clear conversation history: Type /clear to reset the conversation and start fresh
Easy exit: Type /quit, /exit, /q, or press Ctrl+C to stop

Watch the recorded demo:

5. 📊 View Results

The system provides multiple ways to view and analyze results:

Real-time Display

Live Collaboration View: See agents working in parallel through a multi-region terminal display
Status Updates: Real-time phase transitions, voting progress, and consensus building
Streaming Output: Watch agents' reasoning and responses as they develop

Watch an example here:

Comprehensive Logging

All sessions are automatically logged with detailed information. The file can be viewed throught the interaction with UI.

agent_outputs/
  ├── agent_1.txt       # The full logs by agent 1
  ├── agent_2.txt       # The full logs by agent 2
  ├── agent_3.txt       # The full logs by agent 3
  ├── system_status.txt # The full logs of system status

💡 Examples

Here are a few examples of how you can use MassGen for different tasks:

Case Studies

To see how MassGen works in practice, check out these detailed case studies based on real session logs:

MassGen Case Studies

1. ❓ Question Answering

# Ask a question about a complex topic
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "what's best to do in Stockholm in October 2025"

uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "give me all the talks on agent frameworks in Berkeley Agentic AI Summit 2025, note, the sources must include the word Berkeley, don't include talks from any other agentic AI summits"

2. 🧠 Creative Writing

# Generate a short story
uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "Write a short story about a robot who discovers music."

3. 🧠 Research

uv run python -m massgen.cli --config massgen/configs/gemini_4o_claude.yaml "How much does it cost to run HLE benchmark with Grok-4"

4. 💻 Development & Coding Tasks

# Single agent with comprehensive development tools
uv run python -m massgen.cli --config massgen/configs/claude_code_single.yaml "Create a Flask web app with user authentication and database integration"

# Multi-agent development team collaboration  
uv run python -m massgen.cli --config massgen/configs/claude_code_flash2.5_gptoss.yaml "Debug and optimize this React application, then write comprehensive tests"

# Quick coding task with claude_code backend
uv run python -m massgen.cli --backend claude_code "Refactor this Python code to use async/await and add error handling"

🗺️ Roadmap

MassGen is currently in its foundational stage, with a focus on parallel, asynchronous multi-agent collaboration and orchestration. Our roadmap is centered on transforming this foundation into a highly robust, intelligent, and user-friendly system, while enabling frontier research and exploration. An earlier version of MassGen can be found here.

⚠️ Early Stage Notice: As MassGen is in active development, please expect upcoming breaking architecture changes as we continue to refine and improve the system.

Key Future Enhancements:

Advanced Agent Collaboration: Exploring improved communication patterns and consensus-building protocols to improve agent synergy.
Expanded Model, Tool & Agent Integration: Adding support for more models/tools/agents, including a wider range of tools like MCP Servers, and coding agents.
Improved Performance & Scalability: Optimizing the streaming and logging mechanisms for better performance and resource management.
Enhanced Developer Experience: Introducing a more modular agent design and a comprehensive benchmarking framework for easier extension and evaluation.
Web Interface: Developing a web-based UI for better visualization and interaction with the agent ecosystem.

We welcome community contributions to help us achieve these goals.

v0.0.7 Roadmap

Version 0.0.7 focuses primarily on Local Model Support, enabling integration with local inference engines for open-weight models. Key enhancements include:

Local Model Integration (Required): 🚀 Support for backends like LM Studio/vllm/sglang to run open-weight models locally
Enhanced Backend Features (Optional): 🔄 Improved error handling, health monitoring, and backend stability enhancements
Advanced CLI Features (Optional): Conversation save/load functionality, templates, export formats, and better multi-turn display

For detailed milestones and technical specifications, see the full v0.0.7 roadmap.

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

⭐ Star this repo if you find it useful! ⭐

Made with ❤️ by the MassGen team

Name		Name	Last commit message	Last commit date
Latest commit History 541 Commits
.devcontainer		.devcontainer
assets		assets
docs		docs
massgen		massgen
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
ROADMAP_v0.0.7.md		ROADMAP_v0.0.7.md
function_calls.txt		function_calls.txt
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

License

balakreshnan/MassGen

Folders and files

Latest commit

History

Repository files navigation

🚀 MassGen: Multi-Agent Scaling System for GenAI

📋 Table of Contents

✨ Key Features

🏗️ System Design

🚀 Quick Start

💡 Examples

🗺️ Roadmap

📚 Additional Resources

✨ Key Features

🏗️ System Design

🚀 Quick Start

1. 📥 Installation

2. 🔐 API Configuration

3. 🧩 Supported Models and Tools

Models

Tools

4. 🏃 Run MassGen

Quick Test with A Single Model

Multiple Agents from Config

CLI Configuration Parameters

Configuration File Format

Chatcompletion

Claude

Gemini

Grok

OpenAI

Claude Code

ZAI

Interactive Multi-Turn Mode

5. 📊 View Results

Real-time Display

Comprehensive Logging

💡 Examples

Case Studies

1. ❓ Question Answering

2. 🧠 Creative Writing

3. 🧠 Research

4. 💻 Development & Coding Tasks

🗺️ Roadmap

Key Future Enhancements:

v0.0.7 Roadmap

🤝 Contributing

📄 License

⭐ Star History

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages