Feature: Sidecar Lifecycle Management Plugin #3062

@louis030195

Summary

Add a lifecycle management plugin for Tauri sidecars that handles spawning, monitoring, health checks, auto-restart, and graceful shutdown of external binary processes.

Motivation

Currently, Tauri's sidecar support requires significant boilerplate for production-ready lifecycle management. Developers must manually implement:

  • Process spawning and monitoring
  • Crash detection and auto-restart with backoff
  • Port conflict resolution
  • Health checking before usage
  • Graceful shutdown on app exit
  • Process cleanup to avoid orphans
  • Cross-platform signal handling

This matters most for apps that depend on sidecars as core infrastructure (e.g., local MCP servers, database engines, API servers), particularly in regulated industries that require high reliability.

Current Implementation Pain Points

From our production implementation at Mediar (desktop automation platform):

Manual Lifecycle Management Required

In mediar-app (src-tauri/src/mcp_server.rs):

pub struct McpServerManager {
    port: u16,
    is_running: Arc<AtomicBool>,
    start_time: Option<Instant>,
    auto_restart_enabled: bool,
}

impl McpServerManager {
    // Manual implementation of:
    // - ensure_running() with health checks
    // - kill_existing_processes() with platform-specific cleanup
    // - find_available_port() for conflict resolution
    // - is_healthy() with HTTP polling
    // ~200+ lines of lifecycle management code
}

In terminator-mcp-agent (index.js wrapper):

function shutdown() {
    if (shuttingDown) return;
    shuttingDown = true;
    if (child && !child.killed) {
        if (child.stdin) child.stdin.end();
        const termTimeout = setTimeout(() => {
            if (!child.killed) {
                if (process.platform === "win32") {
                    killProcess(child);
                } else {
                    try {
                        process.kill(child.pid, "SIGTERM");
                    } catch (e) { }
                    setTimeout(() => {
                        if (!child.killed) killProcess(child);
                    }, 2000);
                }
            }
        }, 2000);
        child.on("exit", () => clearTimeout(termTimeout));
    }
    process.exit();
}

process.on("SIGINT", shutdown);
process.on("SIGTERM", shutdown);
process.on("exit", shutdown);

// Auto-restart logic
let restartAttempts = 0;
const MAX_RESTART_ATTEMPTS = 3;
child.on("exit", (code, signal) => {
    if (code === 0xC00000FD) { // Stack overflow on Windows
        console.error(`[Stack overflow detected, restarting...]`);
        // Restart logic...
    }
});

Problems This Creates

  1. Duplicate Code: Every Tauri app using sidecars reimplements the same patterns
  2. Platform Inconsistency: Windows vs Unix signal handling differs significantly
  3. Race Conditions: Manual process tracking is prone to timing bugs
  4. Orphan Processes: Easy to leak processes if shutdown handlers fail
  5. No Standard Health Checks: Each app implements custom health monitoring
  6. Port Conflicts: Manual port management required
  7. Resource Leaks: stdout/stderr pipes need careful handling

Proposed Solution

Create a new plugin: tauri-plugin-sidecar-lifecycle

API Design

use std::time::Duration;
use tauri::AppHandle;
use tauri_plugin_sidecar_lifecycle::{HealthCheck, SidecarConfig, SidecarInfo};

#[tauri::command]
async fn ensure_mcp_server(app: AppHandle) -> Result<SidecarInfo, String> {
    let config = SidecarConfig::builder()
        .name("terminator-mcp-agent")
        .args(["--transport", "http", "--port", "auto", "--cors"])
        .port_range(8080..8200)
        .health_check(HealthCheck::Http {
            path: "/health".into(),
            timeout: Duration::from_secs(5),
            interval: Duration::from_secs(2),
        })
        .auto_restart(true)
        .max_restart_attempts(3)
        .restart_backoff(Duration::from_secs(1))
        .graceful_shutdown_timeout(Duration::from_secs(5))
        .on_crash(|info| {
            // `info` is the CrashInfo value described under Key Features
            eprintln!("Sidecar crashed with exit code {}", info.exit_code);
        })
        .build();

    let manager = app.sidecar_lifecycle();
    manager.ensure_running("mcp-server", config).await
}
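
To make the shape of the builder concrete, here is a minimal sketch of how a `SidecarConfig::builder()` chain like the one above could be backed by plain Rust. Everything in it is hypothetical: the field set is a subset of the proposal, and the default values (3 restart attempts, 1 second backoff) are assumptions for illustration, not decisions made in this issue.

```rust
use std::ops::Range;
use std::time::Duration;

/// Hypothetical subset of the proposed configuration.
#[derive(Debug, Clone, PartialEq)]
struct SidecarConfig {
    name: String,
    args: Vec<String>,
    port_range: Option<Range<u16>>,
    auto_restart: bool,
    max_restart_attempts: u32,
    restart_backoff: Duration,
}

/// Consuming builder: each setter takes `self` and returns it for chaining.
struct SidecarConfigBuilder {
    config: SidecarConfig,
}

impl SidecarConfig {
    fn builder() -> SidecarConfigBuilder {
        SidecarConfigBuilder {
            // Assumed defaults, chosen only for this sketch.
            config: SidecarConfig {
                name: String::new(),
                args: Vec::new(),
                port_range: None,
                auto_restart: false,
                max_restart_attempts: 3,
                restart_backoff: Duration::from_secs(1),
            },
        }
    }
}

impl SidecarConfigBuilder {
    fn name(mut self, name: &str) -> Self {
        self.config.name = name.to_string();
        self
    }

    fn args<I, S>(mut self, args: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        self.config.args = args.into_iter().map(Into::into).collect();
        self
    }

    fn port_range(mut self, range: Range<u16>) -> Self {
        self.config.port_range = Some(range);
        self
    }

    fn auto_restart(mut self, enabled: bool) -> Self {
        self.config.auto_restart = enabled;
        self
    }

    fn build(self) -> SidecarConfig {
        self.config
    }
}
```

A consuming builder keeps `build()` infallible, which matches the `.build()` call without `?` in the example above; validating required fields such as `name` would push toward `build() -> Result<...>` instead.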

Key Features

1. Auto-Start & Health Monitoring

// Plugin handles:
// - Finding available port
// - Spawning process
// - Waiting for health check to pass
// - Retrying on failure
manager.ensure_running("my-server", config).await?;
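
The "wait for health check to pass, retry on failure" step can be sketched as a small polling loop. This is an illustration using only the standard library; `wait_until_healthy` and its parameters are hypothetical names, and a real plugin would presumably run this on an async runtime rather than blocking a thread.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `probe` every `interval` until it returns true or `deadline` has elapsed.
/// Returns whether the probe ever succeeded.
fn wait_until_healthy(
    mut probe: impl FnMut() -> bool,
    interval: Duration,
    deadline: Duration,
) -> bool {
    let start = Instant::now();
    loop {
        if probe() {
            return true;
        }
        // Give up once the overall deadline is reached; the probe always runs at least once.
        if start.elapsed() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}
```

Taking the probe as a closure keeps the loop independent of the check type, so the same loop could drive an HTTP, TCP, or command-based check.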

2. Crash Detection & Auto-Restart

SidecarConfig::builder()
    .auto_restart(true)
    .max_restart_attempts(3)
    .restart_backoff(Duration::from_secs(1))
    .on_crash(|info: CrashInfo| {
        // Log to analytics, notify user, etc.
        println!("Crashed: {} (attempt {}/{})",
            info.exit_code, info.restart_attempt, info.max_attempts);
    })
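
One way `max_restart_attempts` and `restart_backoff` could interact is sketched below. It assumes exponential backoff with an upper cap, which this proposal does not specify; `restart_delay` and the `cap` parameter are hypothetical.

```rust
use std::time::Duration;

/// Delay to wait before restart number `attempt` (0-based),
/// or `None` once `max_attempts` restarts have been used up.
/// Assumption: the delay doubles on every attempt and is clamped to `cap`.
fn restart_delay(
    attempt: u32,
    max_attempts: u32,
    base: Duration,
    cap: Duration,
) -> Option<Duration> {
    if attempt >= max_attempts {
        return None;
    }
    // 2^attempt, saturating so very large attempt numbers cannot overflow.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    Some(base.saturating_mul(factor).min(cap))
}
```

With a base of 1 second and 3 attempts, this yields waits of 1 s, 2 s, and 4 s before giving up. A fixed or linear delay would fit the same signature if that is preferred.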

3. Graceful Shutdown

// Plugin automatically:
// 1. Sends SIGTERM on app exit
// 2. Waits for graceful_shutdown_timeout
// 3. Sends SIGKILL if still running
// 4. Cleans up resources
app.on_exit(|app| {
    // Plugin handles cleanup automatically
});
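
The terminate, wait, then kill sequence listed above can be sketched with `std::process` alone. This is an illustration under stated assumptions, not the plugin's implementation: `shutdown_gracefully` is a hypothetical name, and because the standard library has no portable "terminate" call, the sketch shells out to the `kill` utility on Unix and goes straight to the wait on other platforms.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Asks `child` to exit, waits up to `timeout`, then force-kills it.
fn shutdown_gracefully(child: &mut Child, timeout: Duration) -> io::Result<ExitStatus> {
    // Step 1 (Unix only, assumption): send SIGTERM via the `kill` utility.
    #[cfg(unix)]
    {
        let _ = Command::new("kill")
            .args(["-TERM", &child.id().to_string()])
            .status();
    }

    // Step 2: poll until the process exits or the grace period runs out.
    let start = Instant::now();
    while start.elapsed() < timeout {
        // try_wait returns Ok(Some(status)) once the process has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(status);
        }
        thread::sleep(Duration::from_millis(50));
    }

    // Step 3: still running, so force termination
    // (SIGKILL on Unix, TerminateProcess on Windows).
    child.kill()?;

    // Step 4: reap the process so no zombie is left behind.
    child.wait()
}
```

On Windows a real implementation would need a different first step (for example closing stdin or sending a console control event), which is one of the cross-platform differences the plugin is meant to hide.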

4. Health Checks

enum HealthCheck {
    Http { path: String, timeout: Duration, interval: Duration },
    Tcp { port: u16, timeout: Duration },
    Command { cmd: String, args: Vec<String>, expected_exit: i32 },
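    // A bare `Future` is a trait, not a type, so the closure returns a boxed future
    Custom(Box<dyn Fn() -> std::pin::Pin<Box<dyn std::future::Future<Output = bool> + Send>> + Send + Sync>),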
}
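
As an illustration of what the `Http` variant might do for a single probe, the sketch below issues a bare `GET` over a `TcpStream` and treats any 2xx status as healthy. `http_probe` is a hypothetical helper; it is written against the standard library only to stay self-contained, and a real plugin would more likely use an HTTP client crate. Treating every 2xx code as healthy is also an assumption.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Sends `GET <path>` to 127.0.0.1:<port> and reports whether the status is 2xx.
fn http_probe(port: u16, path: &str, timeout: Duration) -> bool {
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    let Ok(mut stream) = TcpStream::connect_timeout(&addr, timeout) else {
        return false; // nothing is listening, or the connect timed out
    };
    // Bound reads and writes too, so a hung sidecar cannot stall the probe.
    if stream.set_read_timeout(Some(timeout)).is_err()
        || stream.set_write_timeout(Some(timeout)).is_err()
    {
        return false;
    }
    let request =
        format!("GET {path} HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\n\r\n");
    if stream.write_all(request.as_bytes()).is_err() {
        return false;
    }
    // The status line looks like "HTTP/1.1 200 OK"; the second token is the code.
    let mut status_line = String::new();
    if BufReader::new(stream).read_line(&mut status_line).is_err() {
        return false;
    }
    status_line
        .split_whitespace()
        .nth(1)
        .map_or(false, |code| code.len() == 3 && code.starts_with('2'))
}
```

The `Tcp` variant would reduce to just the `connect_timeout` call at the top of this function.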

5. Port Management

SidecarConfig::builder()
    .port_range(8080..8200) // Auto-find available port
    .port_variable("PORT")   // Pass as env var or arg replacement
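
A simple way to "auto-find" a port in a range is to try binding each candidate on localhost and take the first that succeeds. The helper below is a hypothetical sketch of that idea using only the standard library.

```rust
use std::net::{Ipv4Addr, SocketAddrV4, TcpListener};
use std::ops::Range;

/// Returns the first port in `range` that can currently be bound on 127.0.0.1.
fn find_available_port(range: Range<u16>) -> Option<u16> {
    range.into_iter().find(|&port| {
        // Binding succeeds only if nothing else is listening on the port.
        // The listener is dropped immediately, which frees the port again.
        TcpListener::bind(SocketAddrV4::new(Ipv4Addr::LOCALHOST, port)).is_ok()
    })
}
```

Note that this probe is inherently racy: another process can claim the port between the check and the sidecar's own bind, so the spawn step should still treat "address in use" as a retryable failure.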

6. Process Cleanup

// Before starting, optionally kill existing processes
SidecarConfig::builder()
    .cleanup_on_start(true)
    .process_name_pattern("terminator-mcp-agent*")

Tauri Command Integration

#[tauri::command]
async fn get_sidecar_info(app: AppHandle) -> Result<SidecarInfo, String> {
    app.sidecar_lifecycle()
        .get_info("my-server")
        .ok_or_else(|| "Server not running".to_string()) // error type is String, not &str
}

#[derive(Serialize)]
struct SidecarInfo {
    port: u16,
    is_running: bool,
    is_healthy: bool,
    url: String,
    uptime_seconds: u64,
    restart_count: u32,
    last_health_check: SystemTime,
}

JavaScript/TypeScript Usage

import { SidecarLifecycle } from '@tauri-apps/plugin-sidecar-lifecycle';

// Ensure running (idempotent)
const info = await SidecarLifecycle.ensureRunning('mcp-server');
console.log(`Server running on port ${info.port}`);

// Listen to events
await SidecarLifecycle.onCrash('mcp-server', (event) => {
    console.error('Server crashed:', event);
});

await SidecarLifecycle.onHealthy('mcp-server', () => {
    console.log('Server is healthy');
});

// Manual control
await SidecarLifecycle.restart('mcp-server');
await SidecarLifecycle.stop('mcp-server');
const status = await SidecarLifecycle.getStatus('mcp-server');

Benefits

For Developers

  • Less boilerplate: replaces the ~200+ lines of hand-written lifecycle code that each app carries today
  • Production-ready: Built-in crash recovery, health checks, cleanup
  • Consistent cross-platform: Plugin handles Windows/Mac/Linux differences
  • Type-safe: Full Rust and TypeScript types

For End Users

  • Reliability: Auto-restart prevents dead sidecars
  • Clean shutdown: No orphan processes
  • Better errors: Standardized error reporting with context

For Tauri Ecosystem

  • Lower barrier: Easier to build apps with sidecars
  • Best practices: Canonical way to manage sidecar lifecycle
  • Plugin ecosystem: Enables higher-level plugins (databases, servers, etc.)

Prior Art

  • systemd (Linux service manager): Restart policies, health checks
  • Docker/Kubernetes: Container lifecycle management
  • PM2 (Node.js): Process management with auto-restart
  • Windows Services: Automatic recovery actions

Implementation Notes

Phase 1: Core Features

  • Process spawning with platform-specific handling
  • Health check system (HTTP, TCP, custom)
  • Auto-restart with backoff
  • Graceful shutdown on app exit
  • Port management

Phase 2: Advanced Features

  • Process cleanup on start
  • Custom crash handlers
  • Metrics/telemetry hooks
  • Multiple sidecar coordination
  • Rolling restarts

Phase 3: Developer Experience

  • TypeScript bindings
  • Comprehensive examples
  • Migration guide from manual management
  • Performance benchmarks

Real-World Use Cases

  1. Local MCP Servers (our use case): AI agents need reliable local servers
  2. Embedded Databases: SQLite, DuckDB, Redis sidecars
  3. Backend Services: GraphQL, REST APIs running locally
  4. Development Tools: Language servers, linters, formatters
  5. Media Processing: FFmpeg, ImageMagick wrappers

Migration Path

Existing apps can migrate incrementally:

// Before (manual)
let mcp_server = MCP_SERVER.lock().await;
mcp_server.ensure_running(app_handle).await?;

// After (plugin)
app.sidecar_lifecycle()
    .ensure_running("mcp-server", config)
    .await?;

Open Questions

  1. Should this be part of tauri-plugin-shell or a separate plugin?
  2. How to handle sidecar dependencies (one sidecar needs another)?
  3. Support for sidecar-to-sidecar communication patterns?
  4. Integration with existing process monitoring tools?

Context: Built by Mediar team (AI desktop automation) after managing production sidecars for MCP servers in regulated industries. Happy to contribute implementation or collaborate on design.
