Web Agent is a browsing agent that navigates websites, interacts with dynamic content, runs searches, downloads files, and adapts to pages that change between visits, all from a natural-language prompt. It is powered by LLMs and the Playwright framework, and turns multi-step web tasks into automated workflows.
- Python 3.11 or higher
- Langgraph
- Playwright
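Before installing, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `missing_modules` are hypothetical and not part of this repository, and the check assumes the packages are importable as `langgraph` and `playwright`.

```python
import sys
from importlib.util import find_spec

MIN_PYTHON = (3, 11)  # minimum version listed in the requirements above
REQUIRED_MODULES = ("langgraph", "playwright")  # assumed import names of the listed packages


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor, ...) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Report only; nothing is installed or modified.
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules() or "none")
```

If a module is reported missing, installing the dependencies as described in the next steps should resolve it.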
Clone the repository:
git clone https://github.com/Jeomon/Web-Agent.git
cd Web-Agent
Install dependencies:
pip install -r requirements.txt
Setup Playwright:
playwright install
Set up the .env file:
GOOGLE_API_KEY=""
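An empty or missing key tends to surface later as an opaque API error, so it may be worth failing early with a clear message. The sketch below is one way to do that; `require_env` is a hypothetical helper, not something this repository provides, and it only reads the environment (loading the .env file itself is left to `load_dotenv`).

```python
import os


def require_env(name, environ=None):
    """Return an environment variable's value, or raise if it is unset or blank."""
    # Allow a plain dict to be passed in for testing; default to the real environment.
    environ = os.environ if environ is None else environ
    value = (environ.get(name) or "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to the .env file or export it in the shell"
        )
    return value
```

After calling `load_dotenv()`, `require_env('GOOGLE_API_KEY')` could be used in place of `os.getenv('GOOGLE_API_KEY')` so that a blank key is reported immediately.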
Basic setup of the agent.
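# Import the Gemini chat wrapper and the agent from this repository
from src.inference.gemini import ChatGemini
from src.agent.web import WebAgent
from dotenv import load_dotenv
import os

# Load GOOGLE_API_KEY from the .env file
load_dotenv()
google_api_key = os.getenv('GOOGLE_API_KEY')

# Create the LLM and the agent
llm = ChatGemini(model='gemini-2.0-flash', api_key=google_api_key, temperature=0)
agent = WebAgent(llm=llm, verbose=True, use_vision=False)

# Read a task from the user, run the agent, and print its final output
user_query = input('Enter your query: ')
agent_response = agent.invoke(user_query)
print(agent_response.get('output'))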
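The setup above prints `None` if the response has no `output` entry and passes empty input straight to the agent. A small wrapper can guard against both. This is a sketch under the assumption that `agent.invoke` returns a dict-like object with an `output` key, as the setup suggests; `run_query` and `EchoAgent` are hypothetical names used only for illustration, and `EchoAgent` stands in for `WebAgent` so the example runs without a browser or API key.

```python
def run_query(agent, query, default="(no output returned)"):
    """Invoke the agent and return its 'output' field, or a default if absent."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    response = agent.invoke(query)
    # Assumption: the response behaves like a dict with an 'output' key.
    output = response.get("output") if response else None
    return output if output is not None else default


class EchoAgent:
    """Hypothetical stand-in for WebAgent, used only to exercise run_query."""

    def invoke(self, query):
        return {"output": f"echo: {query}"}


if __name__ == "__main__":
    print(run_query(EchoAgent(), "open example.com"))
```

With the real agent, the call would be `run_query(agent, user_query)` in place of the last two lines of the setup.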
Execute the following command to start the agent:
python app.py
Prompt: I want to know the price details of the RTX 4060 laptop GPU from various sellers on amazon.in
Amazon.mov
Prompt: Make a twitter post about AI on X
Twitter.mov
Prompt: Can you play the trailer of GTA 6 on youtube
Youtube.mov
Prompt: Can you go to my github account and visit the Windows MCP
Github.mov
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please see CONTRIBUTING for setup instructions and development guidelines.
For queries or support, please reach out via GitHub Issues.