A Java-based MCP server for interacting with the Crawl4ai web scraping API.
jcrawl4ai-mcp-server is a Spring Boot-based MCP server that interacts with the Crawl4ai API to perform web crawling. The main functionalities include:
Configure the following properties in the src/main/resources/application.properties file:
cawl4ai.base-url: Base URL of the Crawl4ai server.
cawl4ai.api-token: API token for the Crawl4ai server.
Example configuration:
cawl4ai.base-url=http://your-cral4ai-server-url:11235
cawl4ai.api-token=your-api-token
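As a rough illustration of how these two properties can be read and checked before the server starts, here is a minimal sketch using only java.util.Properties. The class name Crawl4aiConfigExample and the readSettings helper are hypothetical and not part of the project; the server itself presumably binds these values through Spring Boot.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the project: reads the two documented
// properties from properties-formatted text and rejects missing values.
class Crawl4aiConfigExample {

    // Returns {baseUrl, apiToken}; throws if either property is absent or blank.
    static String[] readSettings(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String baseUrl = props.getProperty("cawl4ai.base-url");
        String apiToken = props.getProperty("cawl4ai.api-token");
        if (baseUrl == null || baseUrl.isBlank()) {
            throw new IllegalArgumentException("cawl4ai.base-url is not set");
        }
        if (apiToken == null || apiToken.isBlank()) {
            throw new IllegalArgumentException("cawl4ai.api-token is not set");
        }
        return new String[] { baseUrl.trim(), apiToken.trim() };
    }

    public static void main(String[] args) {
        // Same placeholder values as the example configuration above.
        String example = "cawl4ai.base-url=http://your-cral4ai-server-url:11235\n"
                + "cawl4ai.api-token=your-api-token\n";
        String[] settings = readSettings(example);
        System.out.println("Base URL: " + settings[0]);
        System.out.println("API token present: " + !settings[1].isEmpty());
    }
}
```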
The project depends on the following libraries:
Build and run the project using Maven:
mvn clean install
java -jar target/jcawl4ai-mcp-server-1.0.0.jar
You can download the jar file from this link directly.
crawl method
urls: Array of target website URLs.
strategy: Crawl strategy.
max_depth: Maximum depth.
output_format: Output format.
task method
taskId: Task ID.
Log file path: ./target/mcp-stdio-server.log
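To make the parameter lists above concrete, the following sketch builds the JSON-RPC "tools/call" messages that an MCP client would send for the crawl and task tools. The class CrawlToolCallExample is hypothetical. The parameter types (strings for strategy, output_format and taskId, an integer for max_depth) are assumptions, and the sample values "bfs" and "markdown" are placeholders, since the accepted values are not listed here.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: builds MCP "tools/call" requests for the two tools.
// Parameter names come from the README; their types and values are assumed.
class CrawlToolCallExample {

    // Minimal JSON string quoting (backslashes and double quotes only).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Request for the crawl tool.
    static String crawlRequest(int id, List<String> urls, String strategy,
                               int maxDepth, String outputFormat) {
        String urlArray = urls.stream()
                .map(CrawlToolCallExample::quote)
                .collect(Collectors.joining(",", "[", "]"));
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"tools/call\",\"params\":{\"name\":\"crawl\",\"arguments\":{"
                + "\"urls\":" + urlArray
                + ",\"strategy\":" + quote(strategy)
                + ",\"max_depth\":" + maxDepth
                + ",\"output_format\":" + quote(outputFormat)
                + "}}}";
    }

    // Request for the task tool, which takes a task ID.
    static String taskRequest(int id, String taskId) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"tools/call\",\"params\":{\"name\":\"task\",\"arguments\":{"
                + "\"taskId\":" + quote(taskId)
                + "}}}";
    }

    public static void main(String[] args) {
        // "bfs" and "markdown" are placeholder values, not documented options.
        System.out.println(crawlRequest(1, List.of("https://example.com"), "bfs", 2, "markdown"));
        System.out.println(taskRequest(2, "example-task-id"));
    }
}
```

In practice an MCP client library would construct these messages; the sketch only shows how the documented parameters map onto the arguments object.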
{
  "mcpServers": {
    "jcawl4ai-mcp-server": {
      "autoApprove": [
        "crawl",
        "task"
      ],
      "disabled": false,
      "timeout": 60,
      "command": "java",
      "args": [
        "-jar",
        "/path/to/your/jar/file/jcawl4ai-mcp-server-1.0.0.jar"
      ],
      "transportType": "stdio"
    }
  }
}
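The "command", "args" and "transportType" entries describe a child process that the MCP client starts and talks to over standard input and output. The sketch below shows roughly what that launch amounts to in Java. The class McpServerLaunchExample is hypothetical, and the process is only prepared here, not started, because the jar path is a placeholder.

```java
import java.util.List;

// Hypothetical sketch of what an MCP client does with the configuration above:
// it runs "java -jar <jar>" and exchanges messages over stdin/stdout.
class McpServerLaunchExample {

    // Mirrors the "command" and "args" entries of the configuration.
    static List<String> serverCommand(String jarPath) {
        return List.of("java", "-jar", jarPath);
    }

    // Prepares the process without starting it. stdin and stdout stay as
    // pipes (the ProcessBuilder default) so the client can use them as the
    // stdio transport; stderr is passed through to the parent.
    static ProcessBuilder newServerProcess(String jarPath) {
        ProcessBuilder builder = new ProcessBuilder(serverCommand(jarPath));
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        return builder;
    }

    public static void main(String[] args) {
        ProcessBuilder builder =
                newServerProcess("/path/to/your/jar/file/jcawl4ai-mcp-server-1.0.0.jar");
        System.out.println("Would run: " + String.join(" ", builder.command()));
    }
}
```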
If you have any questions or suggestions, please contact Ken Ye.