A collection of practical examples demonstrating different approaches to browser automation and validation using modern web technologies and AI.
What it showcases: AI-powered validation of browser automation using Vision Language Models (VLM)
Demonstrates how to use Google's Gemini Vision API to analyze screenshots and determine whether Puppeteer automation tasks completed successfully. Instead of traditional assertions, this example uses AI to validate automation results visually, which is useful for complex UI interactions that are difficult to test programmatically.
Key concept: VLM-as-a-judge pattern where AI acts as a visual validator for automation workflows.
Technologies: Puppeteer, Google Gemini Vision API, Node.js
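The judge step described above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in the example directory: the helper names (`buildJudgePrompt`, `buildJudgeRequest`, `parseVerdict`) and the prompt wording are hypothetical, and the request body is assumed to follow the `contents` / `parts` / `inline_data` shape of Gemini's `generateContent` REST API. The screenshot is assumed to be a base64 PNG, for example from Puppeteer's `page.screenshot({ encoding: 'base64' })`.

```javascript
// Hypothetical helpers sketching the VLM-as-a-judge flow (names are illustrative).

// Build a prompt that asks the model for a strict JSON verdict.
function buildJudgePrompt(task) {
  return [
    'You are validating a browser automation run.',
    `Task: ${task}`,
    'Look at the screenshot and reply with JSON only:',
    '{"success": true or false, "reason": "<one sentence>"}',
  ].join('\n');
}

// Build a request body (assumed Gemini generateContent shape):
// one text part with the prompt, one inline image part with the screenshot.
function buildJudgeRequest(task, screenshotBase64) {
  return {
    contents: [
      {
        parts: [
          { text: buildJudgePrompt(task) },
          { inline_data: { mime_type: 'image/png', data: screenshotBase64 } },
        ],
      },
    ],
  };
}

// Extract the verdict from the model's text reply.
// Anything that is not a clear JSON "success": true counts as a failure.
function parseVerdict(replyText) {
  const match = replyText.match(/\{[\s\S]*\}/);
  if (!match) {
    return { success: false, reason: 'No JSON verdict in model reply' };
  }
  try {
    const parsed = JSON.parse(match[0]);
    return {
      success: parsed.success === true,
      reason: String(parsed.reason ?? ''),
    };
  } catch {
    return { success: false, reason: 'Model reply was not valid JSON' };
  }
}
```

Treating an unparseable reply as a failure keeps the judge conservative: an automation run only passes when the model explicitly says so.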
Each directory contains its own README with specific setup instructions. Generally:
- Navigate to the example you want to explore
- Install dependencies: `npm install`
- Configure any required API keys (see individual README files)
- Run the example: `npm start`
When adding new examples:
- Create a new directory with a descriptive name
- Include a comprehensive README.md with setup instructions
- Add dependencies to package.json
- Include example outputs or screenshots where applicable
- Update this main README to document the new example
| Example | Technology Stack | Complexity | Use Case |
|---|---|---|---|
| vlm-as-a-judge | Puppeteer + Gemini Vision | Intermediate | AI-powered automation validation |
More examples coming soon...