Automate browser actions using natural language commands. Powered by Playwright, with support for multiple LLM providers.
A FastMCP server that enables browser automation through natural language commands. This server allows Language Models to browse the web, fill out forms, click buttons, and perform other web-based tasks via a simple API.
Install with a specific provider (e.g., OpenAI):
pip install -e "git+https://github.com/yourusername/browser-use-mcp.git#egg=browser-use-mcp[openai]"
Or install all providers:
pip install -e "git+https://github.com/yourusername/browser-use-mcp.git#egg=browser-use-mcp[all-providers]"
Then install the Playwright browser:
playwright install chromium
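A quick way to confirm the install step worked is to check that the server command is on your PATH. The sketch below assumes the package installs a console script named `browser-use-mcp` (the name used in the client configuration); the helper name `find_server_command` is only illustrative.

```python
import shutil
from typing import Optional


def find_server_command(name: str = "browser-use-mcp") -> Optional[str]:
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_server_command()
    if path is None:
        print("browser-use-mcp not found on PATH; re-check the pip install step")
    else:
        print(f"browser-use-mcp found at {path}")
```

If the command is missing, the MCP client will fail to launch the server regardless of how the rest of the configuration looks.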
Add the browser-use-mcp server to your MCP client configuration:
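{
  "mcpServers": {
    "browser-use-mcp": {
      "command": "browser-use-mcp",
      "args": ["--model", "gpt-4o"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key",
        "DISPLAY": ":0"
      }
    }
  }
}
Replace "your-openai-api-key" with your actual API key, or swap the entry for another provider's API key variable. The DISPLAY entry is only needed in GUI environments. Strict JSON does not allow // comments, so keep notes like these out of the file itself. If your client supports it, you can use an environment variable reference such as process.env.OPENAI_API_KEY instead of a literal key.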
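To avoid pasting a key into the file by hand, the configuration can be generated from the current environment. This is a minimal sketch: the function name `build_server_config` and its `key_var` parameter are hypothetical, and the structure simply mirrors the JSON shown above.

```python
import json
import os


def build_server_config(model: str, key_var: str = "OPENAI_API_KEY") -> dict:
    """Build the mcpServers entry, reading the API key from the environment."""
    api_key = os.environ.get(key_var)
    if not api_key:
        raise RuntimeError(f"{key_var} is not set")
    return {
        "mcpServers": {
            "browser-use-mcp": {
                "command": "browser-use-mcp",
                "args": ["--model", model],
                # DISPLAY is only needed in GUI environments.
                "env": {key_var: api_key, "DISPLAY": ":0"},
            }
        }
    }


if __name__ == "__main__":
    try:
        print(json.dumps(build_server_config("gpt-4o"), indent=2))
    except RuntimeError as exc:
        print(f"Cannot build config: {exc}")
```

Failing early when the variable is unset avoids writing an empty key into the client configuration.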
The server can also be driven programmatically, for example with mcp-use and LangChain:
import asyncio
import os

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from mcp_use import MCPAgent, MCPClient


async def main():
    # Load environment variables from .env
    load_dotenv()

    # Create MCPClient from an inline config dictionary
    client = MCPClient(
        config={
            "mcpServers": {
                "browser-use-mcp": {
                    "command": "browser-use-mcp",
                    "args": ["--model", "gpt-4o"],
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY"),
                        "DISPLAY": ":0",
                    },
                }
            }
        }
    )

    # Create LLM
    llm = ChatOpenAI(model="gpt-4o")

    # Create agent with the client
    agent = MCPAgent(llm=llm, client=client, max_steps=30)

    # Run the query
    result = await agent.run(
        """
        Navigate to https://github.com, search for "browser-use-mcp", and summarize the project.
        """,
        max_steps=30,
    )
    print(f"\nResult: {result}")


if __name__ == "__main__":
    asyncio.run(main())
To use the server with Claude Desktop, add it to the configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %AppData%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "browser-use": {
      "command": "browser-use-mcp",
      "args": ["--model", "claude-3-opus-20240229"]
    }
  }
}
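If the Claude Desktop file already lists other servers, the new entry should be merged in rather than overwriting the file. The sketch below shows one way to do that; `claude_config_path` and `add_server` are hypothetical helper names, and only the two documented locations (macOS and Windows) are handled.

```python
import json
import os
import sys
from pathlib import Path


def claude_config_path() -> Path:
    """Return the Claude Desktop config path on macOS or Windows."""
    if sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    elif sys.platform == "win32":
        base = Path(os.environ["APPDATA"])
    else:
        raise RuntimeError("only the macOS and Windows locations are documented")
    return base / "Claude" / "claude_desktop_config.json"


def add_server(config_file: Path, model: str) -> dict:
    """Merge a browser-use entry into config_file, keeping existing servers."""
    config = json.loads(config_file.read_text()) if config_file.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["browser-use"] = {
        "command": "browser-use-mcp",
        "args": ["--model", model],
    }
    config_file.write_text(json.dumps(config, indent=2))
    return config
```

A typical call would be `add_server(claude_config_path(), "claude-3-opus-20240229")`, followed by restarting Claude Desktop so it re-reads the file.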
The following LLM providers are supported for browser automation:
Provider | API Key Environment Variable |
---|---|
OpenAI | OPENAI_API_KEY |
Anthropic | ANTHROPIC_API_KEY |
Google | GOOGLE_API_KEY |
Cohere | COHERE_API_KEY |
Mistral AI | MISTRAL_API_KEY |
Groq | GROQ_API_KEY |
Together AI | TOGETHER_API_KEY |
AWS Bedrock | AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY |
Fireworks | FIREWORKS_API_KEY |
Azure OpenAI | AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT |
Vertex AI | GOOGLE_APPLICATION_CREDENTIALS |
NVIDIA | NVIDIA_API_KEY |
AI21 | AI21_API_KEY |
Databricks | DATABRICKS_HOST and DATABRICKS_TOKEN |
IBM watsonx.ai | WATSONX_API_KEY |
xAI | XAI_API_KEY |
Upstage | UPSTAGE_API_KEY |
Hugging Face | HUGGINGFACE_API_KEY |
Ollama | OLLAMA_BASE_URL |
Llama.cpp | LLAMA_CPP_SERVER_URL |
For more information check out: https://python.langchain.com/docs/integrations/chat/
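Because several providers need more than one variable, it can help to check the environment before starting the server. The following is a small sketch covering a subset of the table; the `REQUIRED_ENV` mapping and `missing_env` helper are illustrative names, not part of browser-use-mcp.

```python
import os

# Subset of the provider table above; extend with other rows as needed.
REQUIRED_ENV = {
    "OpenAI": ["OPENAI_API_KEY"],
    "Anthropic": ["ANTHROPIC_API_KEY"],
    "AWS Bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
    "Azure OpenAI": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
    "Ollama": ["OLLAMA_BASE_URL"],
}


def missing_env(provider: str, env=None) -> list:
    """Return the variables for provider that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]


if __name__ == "__main__":
    for provider in REQUIRED_ENV:
        missing = missing_env(provider)
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.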
You can create a .env file in the project directory with your API keys:
OPENAI_API_KEY=your_openai_key_here
# Or any other provider key
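To make the file format concrete, here is a simplified sketch of how KEY=VALUE lines in a .env file can be read. It is not a replacement for python-dotenv (which the usage example imports as `load_dotenv`) and ignores features such as variable expansion and multi-line values; `parse_env` is a hypothetical helper.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    sample = "OPENAI_API_KEY=your_openai_key_here\n# Or any other provider key\n"
    print(parse_env(sample))
```

Comment lines are skipped, so notes like the one in the sample above do not end up in the environment.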
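Troubleshooting tips:
- Make sure your API key is set in your environment variables or in the .env file.
- Make sure the browser is installed by running playwright install chromium.
- Use the --model flag to specify a valid model for your provider.
- Use --debug to enable more detailed logging that can help identify issues.

License: MIT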