
Sifaka

Sifaka is an open-source framework that adds reflection and reliability to large language model (LLM) applications.

What is Sifaka?

Sifaka implements cutting-edge AI research papers as production-ready code. Instead of just generating text and hoping it's good, Sifaka creates a feedback loop where AI systems validate, critique, and iteratively improve their own outputs until they meet your quality standards.

The Problem: Most AI text generation is a black box - you send a prompt, get back text, and hope it's good enough. No validation, no improvement, no guarantees.

Sifaka's Solution: A transparent, observable pipeline where every generation goes through validation and improvement cycles, implementing techniques from papers like Reflexion, Constitutional AI, and Self-RAG.

Key Innovations:

  • Research-to-Production: Direct implementations of 7+ major AI research papers
  • Guaranteed Quality: Validation-first design ensures outputs meet your criteria
  • Complete Observability: Full audit trails of every decision, iteration, and improvement
  • Iterative Improvement: AI systems that get better at their tasks through self-critique
  • Production-Ready: Built on PydanticAI with type safety and async support

Installation

# Core installation
pip install sifaka

# With model providers
pip install sifaka[models]

# With everything
pip install sifaka[all]

Quick Start

import asyncio
from pydantic_ai import Agent
from sifaka.agents import create_pydantic_chain
from sifaka.validators import LengthValidator
from sifaka.critics import ReflexionCritic
from sifaka.models import create_model

async def main():
    # Create PydanticAI agent
    agent = Agent("openai:gpt-4", system_prompt="You are a helpful assistant.")

    # Create Sifaka components
    validator = LengthValidator(min_length=50, max_length=500)
    critic = ReflexionCritic(model=create_model("openai:gpt-3.5-turbo"))

    # Create chain
    chain = create_pydantic_chain(
        agent=agent,
        validators=[validator],
        critics=[critic]
    )

    # Run the chain
    result = await chain.run("Write about renewable energy trends.")
    print(f"Generated: {result.text}")
    print(f"Iterations: {result.iteration}")

# Run the async function
asyncio.run(main())

How It Works

  1. Create Thought: A central state container tracks the entire process
  2. Generate: PydanticAI agent generates text with optional tool calling
  3. Validate: Built-in validators check the output quality
  4. Critique: If validation passes, critics evaluate the output and suggest improvements; if it fails, the validation feedback is used instead
  5. Iterate: The process repeats with feedback until validation passes or max iterations reached

The Thought container maintains complete audit trails including prompts, generations, validation results, critic feedback, and tool calls.
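
As a small illustration, the object returned by chain.run in the Quick Start above exposes at least the final text and the iteration count. The helper below is a minimal sketch of inspecting a run; any fields beyond text and iteration would depend on the Thought API and are deliberately not assumed here.

import asyncio

async def inspect_run(chain, prompt: str):
    """Run a Sifaka chain and report the fields shown in the Quick Start.

    `chain` is the object returned by create_pydantic_chain above. Only
    `result.text` and `result.iteration` are used; deeper audit-trail fields
    depend on the Thought API and are intentionally not assumed here.
    """
    result = await chain.run(prompt)
    print(f"Final text ({len(result.text)} characters):\n{result.text}")
    print(f"Iterations used: {result.iteration}")
    return result

# Example usage with a chain from the Quick Start:
# asyncio.run(inspect_run(chain, "Write about renewable energy trends."))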

What Makes Sifaka Special?

🔬 Research-Backed: Direct implementations of breakthrough papers:

  • Reflexion (Shinn et al. 2023) - Self-reflection for iterative improvement
  • Constitutional AI (Anthropic) - Principle-based text evaluation
  • Self-Refine (Madaan et al. 2023) - Iterative self-improvement through critique
  • Self-RAG (Asai et al. 2023) - Retrieval-augmented self-critique
  • Meta-Rewarding (Wu et al. 2024) - Two-stage judgment with meta-evaluation

🎯 Production-Ready: Built on PydanticAI with full type safety, async support, and tool calling

📊 Complete Observability: Every decision is logged - see exactly why your AI made each choice

🔄 Systematic Improvement: Outputs are refined through validation and critique cycles until they meet your criteria or reach the iteration limit

How Sifaka Works

Sifaka transforms unreliable AI text generation into a systematic quality assurance process through iterative validation and improvement:

graph TD
    A[Input Prompt] --> B[PydanticAI Agent]
    B --> C[Generate Initial Text]
    C --> D[Run Validators]
    D --> E{Validation Passed?}

    E -->|Yes| F[Run Critics]
    F --> G{Critics Suggest Improvements?}

    G -->|No| H[Final Output]
    G -->|Yes| I[Apply Critic Feedback]

    E -->|No| J[Validation Failed]
    J --> K[Apply Validation Context]
    K --> I

    I --> L[Generate Improved Text]
    L --> M{Max Iterations Reached?}

    M -->|No| D
    M -->|Yes| H

    style A fill:#e1f5fe
    style H fill:#c8e6c9
    style J fill:#ffcdd2
    style D fill:#fff3e0
    style F fill:#f3e5f5

The Sifaka Process

  1. Generation: PydanticAI agent generates initial text from your prompt
  2. Validation: Multiple validators check the text against your requirements (length, format, content, etc.); see the conceptual sketch after this list
  3. Criticism: If validation passes, critics evaluate quality and suggest improvements
  4. Improvement: Failed validation or critic feedback triggers text regeneration with specific guidance
  5. Iteration: Process repeats until validation passes and critics are satisfied (or max iterations reached)
  6. Observability: Every step is logged in a Thought object for complete audit trails
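
To make the validation step concrete, the sketch below shows what a validator conceptually does: take generated text and return a pass/fail result plus feedback the chain can use on the next attempt. The KeywordValidator class and its interface are illustrative assumptions, not Sifaka's documented validator API; in real chains, use the built-in validators such as LengthValidator shown above.

from dataclasses import dataclass

@dataclass
class ValidationResult:
    passed: bool
    feedback: str

class KeywordValidator:
    """Illustrative sketch only: this interface is an assumption, not Sifaka's API."""

    def __init__(self, required_keywords: list[str]):
        self.required_keywords = required_keywords

    def validate(self, text: str) -> ValidationResult:
        # Flag any required keyword that does not appear in the generated text.
        missing = [kw for kw in self.required_keywords if kw.lower() not in text.lower()]
        if missing:
            return ValidationResult(False, f"Missing keywords: {', '.join(missing)}")
        return ValidationResult(True, "All required keywords present.")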

Key Benefits

  • Reliability: Systematic validation ensures outputs meet your criteria
  • Quality: Research-backed critics improve text beyond basic requirements
  • Transparency: Complete audit trail shows why each decision was made
  • Flexibility: Mix and match validators and critics for your specific needs
  • Production-Ready: Built on PydanticAI with full async support and type safety

Environment Setup

Set API keys in your environment:

export OPENAI_API_KEY=your_openai_api_key
export ANTHROPIC_API_KEY=your_anthropic_api_key
export GOOGLE_API_KEY=your_google_api_key
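
Optionally, you can confirm from Python which keys are visible before running the examples. This stdlib-only check uses the same variable names as the exports above and nothing Sifaka-specific:

import os

# Report which provider keys from the exports above are visible to Python.
for key in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"):
    print(f"{key}: {'set' if os.environ.get(key) else 'missing'}")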

Advanced Examples

Adding Tool Calling

import asyncio
from pydantic_ai import Agent
from sifaka.agents import create_pydantic_chain

async def main():
    # Create agent with tools
    agent = Agent("openai:gpt-4", system_prompt="You are a helpful assistant.")

    @agent.tool_plain
    def search_web(query: str) -> str:
        """Search the web for information."""
        return f"Search results for: {query}"

    @agent.tool_plain
    def calculate(expression: str) -> str:
        """Calculate mathematical expressions."""
        try:
            result = eval(expression)  # Caution: eval is unsafe on untrusted input; use a safe parser in production
            return f"Result: {result}"
        except Exception:
            return "Invalid expression"

    # Create chain with tools
    chain = create_pydantic_chain(agent=agent)
    result = await chain.run("Search for Python tutorials and calculate 15 * 24")
    print(f"Result: {result.text}")

asyncio.run(main())

Using Multiple Critics

import asyncio
from pydantic_ai import Agent
from sifaka.agents import create_pydantic_chain
from sifaka.critics import ConstitutionalCritic, SelfRefineCritic
from sifaka.validators import LengthValidator
from sifaka.models import create_model

async def main():
    # Create agent and critics
    agent = Agent("openai:gpt-4", system_prompt="You are a helpful assistant.")
    critic_model = create_model("openai:gpt-3.5-turbo")

    constitutional_critic = ConstitutionalCritic(
        model=critic_model,
        principles=["Be helpful", "Avoid harmful content", "Be accurate"]
    )
    self_refine_critic = SelfRefineCritic(model=critic_model)

    # Create chain with multiple critics
    chain = create_pydantic_chain(
        agent=agent,
        validators=[LengthValidator(min_length=100, max_length=1000)],
        critics=[constitutional_critic, self_refine_critic],
        max_improvement_iterations=3
    )

    result = await chain.run("Write about the benefits of renewable energy.")
    print(f"Final text: {result.text}")
    print(f"Iterations: {result.iteration}")

asyncio.run(main())

Development

# Install development dependencies
uv pip install -e ".[dev]"

# Run tests
make test

# Format code
make format

Contributing

We welcome contributions! See the examples/ directory for working examples of different critics and validators.

License

MIT License - see LICENSE file for details.
