web-eval-agent is a Model Context Protocol (MCP) server that launches a browser-use–capable debugging agent to run and evaluate web apps autonomously from within your editor. It is positioned as a “let the coding agent debug itself” companion: the agent launches the app, navigates user flows, captures evidence, and iterates on failures, so you do not have to copy and paste logs by hand. The repository focuses on developer ergonomics, exposing typed MCP tools so that clients such as Claude Desktop can start sessions, gather traces, and reason over failures using structured artifacts. The README and marketing material emphasize faster local debugging loops that combine live browser execution with LLM-driven hypotheses and fixes. Repository activity shows steady iteration, with issues and PRs centered on reliability and developer experience. In short, it puts autonomous, in-editor web testing and diagnosis behind a predictable MCP interface.
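To make the "predictable MCP interface" concrete: MCP clients invoke server tools with a JSON-RPC 2.0 `tools/call` request. The Python sketch below builds such a request. The envelope follows the MCP convention, but the tool name `web_eval_agent` and its `url` and `task` arguments are assumptions chosen for illustration and have not been checked against the repository.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to calls.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments: the real server may expose
# different names or a different argument schema.
request = build_tool_call(
    "web_eval_agent",
    {
        "url": "http://localhost:3000",
        "task": "Sign up and check that the dashboard loads",
    },
)

print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop constructs and sends this message for you; the sketch only shows the shape of what travels between client and server.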
Features
- One-command launch of an autonomous, browser-use debugging agent
- Typed MCP tools to start runs, collect traces, and summarize failures
- Structured artifacts from console, network, and DOM for LLM reasoning
- Tight editor loop for reproduce → diagnose → try-fix cycles
- Session summaries that highlight failing steps and probable causes
- Extensible hooks to customize app startup and environment setup
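
As a rough illustration of how structured artifacts could feed a session summary, the Python sketch below models per-step console and network evidence and reports the failing steps with a probable cause. Every name here (`StepArtifacts`, `probable_cause`, `summarize`) and the cause heuristic are hypothetical; the project's actual artifact schema and summary logic may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class StepArtifacts:
    """Evidence captured for one step of a browser session (hypothetical schema)."""
    name: str
    console_errors: list[str] = field(default_factory=list)
    failed_requests: list[tuple[str, int]] = field(default_factory=list)  # (url, status)
    dom_snapshot: str = ""

    @property
    def failed(self) -> bool:
        # A step counts as failed if any error evidence was captured.
        return bool(self.console_errors or self.failed_requests)


def probable_cause(step: StepArtifacts) -> str:
    """Pick the most likely cause, preferring network failures over console errors."""
    if step.failed_requests:
        url, status = step.failed_requests[0]
        return f"request to {url} returned HTTP {status}"
    if step.console_errors:
        return f"console error: {step.console_errors[0]}"
    return "no error evidence captured"


def summarize(steps: list[StepArtifacts]) -> str:
    """Return a short text summary that highlights failing steps."""
    failing = [s for s in steps if s.failed]
    lines = [f"{len(steps) - len(failing)}/{len(steps)} steps passed"]
    for s in failing:
        lines.append(f"FAILED {s.name}: {probable_cause(s)}")
    return "\n".join(lines)


session = [
    StepArtifacts("open login page"),
    StepArtifacts("submit credentials", failed_requests=[("/api/login", 500)]),
    StepArtifacts("load dashboard", console_errors=["TypeError: user is undefined"]),
]
summary = summarize(session)
print(summary)
```

Keeping the evidence typed per step, rather than as one flat log, is what lets a summary point at a specific failing step; an LLM consuming the same structure can then reason about the step and its evidence together.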