Web Agent is an automation tool driven by AI. Designed for seamless navigation and task execution on the web, it intelligently interacts with dynamic web elements, performs searches, downloads files, and adapts to page changes.

CursorTouch/Web-Agent

🌐 Web-Agent

Web Agent is your intelligent browsing companion. It navigates websites, interacts with dynamic content, performs searches, downloads files, and adapts to changing pages, all with minimal effort from you. Powered by LLMs and the Playwright framework, it turns complex web tasks into automated workflows that save time.

🛠️Installation Guide

Prerequisites

  • Python 3.11 or higher
  • Langgraph
  • Playwright
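Before installing, it can help to confirm that the interpreter meets the Python 3.11 requirement listed above. The following is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical and not part of Web-Agent.

```python
import sys

# Minimum version taken from the prerequisites list above.
MIN_PYTHON = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} satisfies the 3.11+ requirement.")
    else:
        print(f"Python {current} is too old; 3.11 or higher is required.")
```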

Installation Steps

Clone the repository:

git clone https://github.com/Jeomon/Web-Agent.git
cd Web-Agent

Install dependencies:

pip install -r requirements.txt

Setup Playwright:

playwright install
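After the install steps, a quick check can confirm that the Python packages are importable. This sketch assumes the import names `playwright`, `langgraph`, and `dotenv`; the helper `missing_modules` is hypothetical. It only checks that the packages can be found, not that the Playwright browser binaries were downloaded.

```python
from importlib.util import find_spec

# Import names assumed from the prerequisites and the example script.
REQUIRED_MODULES = ("playwright", "langgraph", "dotenv")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```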

Setting up the .env file:

GOOGLE_API_KEY=""
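An empty or missing key tends to surface later as an opaque API error, so it may be worth failing early. The sketch below is one way to do that; `require_env` is a hypothetical helper, not part of Web-Agent, and it assumes the variables from `.env` have already been loaded into the environment (for example with `load_dotenv()`).

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of `name`, raising a clear error if it is unset or blank."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file.")
    return value
```

In the setup script, `google_api_key = require_env('GOOGLE_API_KEY')` could then stand in for the plain `os.getenv` call.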

Basic setup of the agent:

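import os

from dotenv import load_dotenv

from src.agent.web import WebAgent
from src.inference.gemini import ChatGemini

# Load GOOGLE_API_KEY from the .env file into the environment
load_dotenv()
google_api_key = os.getenv('GOOGLE_API_KEY')

# Create the Gemini client (temperature=0 for more deterministic output) and the agent
llm = ChatGemini(model='gemini-2.0-flash', api_key=google_api_key, temperature=0)
agent = WebAgent(llm=llm, verbose=True, use_vision=False)

# Read a task from the user, run the agent, and print its final output
user_query = input('Enter your query: ')
agent_response = agent.invoke(user_query)
print(agent_response.get('output'))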
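The invoke-and-print pattern above can be wrapped in a small helper that handles blank queries and empty responses. This is a sketch, not part of Web-Agent: `run_query` and `EchoAgent` are hypothetical names, and the only assumption about the real agent is what the setup code shows, namely that `invoke` returns an object with a `.get('output')` method.

```python
def run_query(agent, query):
    """Invoke the agent and return its 'output' text, or a fallback message."""
    query = query.strip()
    if not query:
        return "No query provided."
    response = agent.invoke(query)
    return response.get("output") or "The agent returned no output."


class EchoAgent:
    """Hypothetical stand-in for WebAgent, for trying the helper offline."""

    def invoke(self, query):
        return {"output": f"echo: {query}"}


if __name__ == "__main__":
    print(run_query(EchoAgent(), "  hello  "))  # prints "echo: hello"
```

With the real agent, `print(run_query(agent, user_query))` would take the place of the last two lines of the setup script.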

Execute the following command to start the agent:

python app.py

🎥Demos

Prompt: I want to know the price details of the RTX 4060 laptop GPU from various sellers on amazon.in

Amazon.mov

Prompt: Make a twitter post about AI on X

Twitter.mov

Prompt: Can you play the trailer of GTA 6 on youtube

Youtube.mov

Prompt: Can you go to my github account and visit the Windows MCP

Github.mov

🪪License

This project is licensed under the MIT License; see the LICENSE file for details.

🤝Contributing

Contributions are welcome! Please see CONTRIBUTING for setup instructions and development guidelines.


🤙🏾Contact

For queries or support, please reach out via GitHub Issues.

