Hugging Face
Access the Hugging Face Dataset Viewer API to query, explore, search, and analyze machine learning datasets from the Hugging Face Hub.
Hugging Face MCP Server
MCP server providing access to the Hugging Face Dataset Viewer API. Query datasets, explore data, search content, and analyze statistics from the Hugging Face Hub's extensive collection of machine learning datasets.
Features
- 12 MCP Tools covering all API endpoints:
  - get_dataset_splits - Dataset splits information
  - get_dataset_info - Dataset metadata and features
  - get_dataset_first_rows - Preview first 100 rows
  - get_dataset_rows - Paginated data access
  - search_dataset - Full-text search within datasets
  - get_dataset_size - Dataset size information
  - get_dataset_statistics - Statistical analysis
  - filter_dataset - SQL-like data filtering
  - check_dataset_validity - Dataset validation
  - get_dataset_parquet - Parquet file information
  - get_dataset_opt_in_out_urls - Opt-in/out URLs
  - get_dataset_presidio_entities - PII entity detection
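To illustrate what "paginated data access" means for get_dataset_rows, here is a minimal sketch of how a client might split a dataset into (offset, length) windows. It assumes the underlying /rows endpoint caps each request at 100 rows, which matches the 100-row preview mentioned above but should be checked against the Dataset Viewer API documentation. The helper name page_windows and the constant MAX_PAGE_LENGTH are hypothetical and not part of this server.

```python
from collections.abc import Iterator

# Assumption: the /rows endpoint returns at most 100 rows per request.
MAX_PAGE_LENGTH = 100


def page_windows(total_rows: int, page_length: int = MAX_PAGE_LENGTH) -> Iterator[tuple[int, int]]:
    """Yield (offset, length) pairs that cover total_rows rows in order."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if not 1 <= page_length <= MAX_PAGE_LENGTH:
        raise ValueError(f"page_length must be between 1 and {MAX_PAGE_LENGTH}")
    for offset in range(0, total_rows, page_length):
        # The last window is shortened so it does not run past the end.
        yield offset, min(page_length, total_rows - offset)


if __name__ == "__main__":
    # A 250-row split needs three requests: two full pages and one partial page.
    for offset, length in page_windows(250):
        print(f"offset={offset} length={length}")
```

Each pair can then be passed as the offset and length of one get_dataset_rows call, assuming the tool exposes those two parameters.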
Quick Start
Claude Desktop Integration
Add the following to your Claude Desktop claude_desktop_config.json:
{
  "mcpServers": {
    "huggingface-mcp": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "--name", "huggingface-mcp-claude",
        "huggingface-mcp:latest"
      ]
    }
  }
}
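If claude_desktop_config.json already lists other MCP servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge using only the standard library. The function name add_server and the "other-server" entry are hypothetical, and reading and writing the actual file is left out because its location depends on your platform.

```python
import json

SERVER_NAME = "huggingface-mcp"
SERVER_ENTRY = {
    "command": "docker",
    "args": ["run", "--rm", "-i", "--name", "huggingface-mcp-claude", "huggingface-mcp:latest"],
}


def add_server(config: dict) -> dict:
    """Return a copy of config with the huggingface-mcp entry added or replaced."""
    merged = dict(config)
    # Copy the existing server map so the caller's dict is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers[SERVER_NAME] = SERVER_ENTRY
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical existing configuration with one unrelated server.
    existing = {"mcpServers": {"other-server": {"command": "other"}}}
    print(json.dumps(add_server(existing), indent=2))
```

Existing servers are kept as they are, and only the huggingface-mcp key is added or overwritten.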
Build Docker Image
make docker-build
Development
Prerequisites
- Python 3.12+
- uv
Setup
make install # Install dependencies
make test # Run tests (38 tests)
make example # Test all tools
make run # Start server directly
Docker Commands
make docker-build # Build image
make docker-run # Run container
make docker-stop # Stop container
API Coverage
Implements all GET endpoints from the Hugging Face Dataset Viewer API:
| Endpoint | Tool | Description |
|---|---|---|
| /splits | get_dataset_splits | Dataset splits information |
| /info | get_dataset_info | Dataset metadata and features |
| /first-rows | get_dataset_first_rows | Preview first 100 rows |
| /rows | get_dataset_rows | Paginated data access |
| /search | search_dataset | Full-text search within datasets |
| /size | get_dataset_size | Dataset size information |
| /statistics | get_dataset_statistics | Statistical analysis |
| /filter | filter_dataset | SQL-like data filtering |
| /is-valid | check_dataset_validity | Dataset validation |
| /parquet | get_dataset_parquet | Parquet file information |
| /opt-in-out-urls | get_dataset_opt_in_out_urls | Opt-in/out URLs |
| /presidio-entities | get_dataset_presidio_entities | PII entity detection |
Built with FastMCP.
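Each tool in the table corresponds to one GET request against the Dataset Viewer API. As a rough sketch of what such a request looks like, the helper below builds the request URL for a given endpoint. It assumes the public base URL https://datasets-server.huggingface.co and the query parameter names dataset, config, split, offset and length used by that API. The helper build_url is hypothetical, the example dataset and config names are only illustrative, and the parameter names of the MCP tools themselves may differ.

```python
from urllib.parse import urlencode

# Assumption: public base URL of the Hugging Face Dataset Viewer API.
BASE_URL = "https://datasets-server.huggingface.co"


def build_url(endpoint: str, dataset: str, **params: object) -> str:
    """Build a GET URL for a Dataset Viewer endpoint such as /splits or /rows."""
    # Drop parameters that were not provided so they do not appear in the query.
    query = {"dataset": dataset, **{k: v for k, v in params.items() if v is not None}}
    return f"{BASE_URL}/{endpoint.lstrip('/')}?{urlencode(query)}"


if __name__ == "__main__":
    # /splits only needs the dataset name.
    print(build_url("/splits", "ibm/duorc"))
    # /rows additionally takes a config, a split and a pagination window.
    print(build_url("/rows", "ibm/duorc", config="SelfRC", split="train", offset=0, length=100))
```

The URL is only constructed here, not fetched, so the sketch runs without network access. Sending it with any HTTP client should return the JSON payload that the corresponding tool presumably wraps.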
Configuration
Copy the example environment file and configure as needed:
cp .env.example .env
# Edit .env with your configuration
Usage
Running the Server
make run
Running Tests
make test
Running Examples
make example
Docker
Build and Run
make docker-build
make docker-run
With Docker Compose
docker-compose up --build
Development
TODO: Add development guidelines
API Documentation
TODO: Add API documentation
Contributing
TODO: Add contributing guidelines
License
TODO: Add license information