Hugging Face

Access the Hugging Face Dataset Viewer API to query, explore, search, and analyze machine learning datasets from the Hugging Face Hub.

Hugging Face MCP Server

MCP server providing access to the Hugging Face Dataset Viewer API. Query datasets, explore data, search content, and analyze statistics from the Hugging Face Hub's extensive collection of machine learning datasets.

Features

  • 12 MCP Tools covering all API endpoints:
    • get_dataset_splits - Dataset splits information
    • get_dataset_info - Dataset metadata and features
    • get_dataset_first_rows - Preview first 100 rows
    • get_dataset_rows - Paginated data access
    • search_dataset - Full-text search within datasets
    • get_dataset_size - Dataset size information
    • get_dataset_statistics - Statistical analysis
    • filter_dataset - SQL-like data filtering
    • check_dataset_validity - Dataset validation
    • get_dataset_parquet - Parquet file information
    • get_dataset_opt_in_out_urls - Opt-in/out URLs
    • get_dataset_presidio_entities - PII entity detection
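As an illustration of the paginated access that get_dataset_rows offers, here is a minimal sketch of how a row range might be split into page requests against the upstream Dataset Viewer API. It assumes the public base URL https://datasets-server.huggingface.co and the /rows query parameters dataset, config, split, offset, and length (with length capped at 100). The helper name rows_page_urls is hypothetical and not part of this server; the tool's own parameter names may differ.

```python
from urllib.parse import urlencode

# Assumed public base URL of the Dataset Viewer API.
BASE_URL = "https://datasets-server.huggingface.co"
# Assumed upstream cap on rows returned per /rows request.
MAX_PAGE = 100


def rows_page_urls(dataset: str, config: str, split: str, total: int) -> list[str]:
    """Return one /rows URL per page needed to cover the first `total` rows.

    Hypothetical helper for illustration; it only builds URLs and
    performs no network requests.
    """
    urls = []
    for offset in range(0, total, MAX_PAGE):
        # The last page may be shorter than MAX_PAGE.
        length = min(MAX_PAGE, total - offset)
        query = urlencode(
            {
                "dataset": dataset,
                "config": config,
                "split": split,
                "offset": offset,
                "length": length,
            }
        )
        urls.append(f"{BASE_URL}/rows?{query}")
    return urls
```

Calling rows_page_urls("ibm/duorc", "SelfRC", "train", 250) yields three URLs with offsets 0, 100, and 200; the last one requests only the remaining 50 rows.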

Quick Start

Claude Desktop Integration

Add the following entry to your Claude Desktop configuration file, claude_desktop_config.json:

{
  "mcpServers": {
    "huggingface-mcp": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "--name", "huggingface-mcp-claude",
        "huggingface-mcp:latest"
      ]
    }
  }
}
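If claude_desktop_config.json already lists other servers, the entry has to be merged rather than pasted over the file. The following is a rough sketch of such a merge using only the standard library. The function name add_server is hypothetical, and the location of the config file depends on your operating system, so the sketch works on the file's text rather than on a fixed path.

```python
import json

# The server entry from the configuration shown above.
SERVER_ENTRY = {
    "command": "docker",
    "args": [
        "run",
        "--rm",
        "-i",
        "--name", "huggingface-mcp-claude",
        "huggingface-mcp:latest",
    ],
}


def add_server(config_text: str, name: str = "huggingface-mcp") -> str:
    """Merge SERVER_ENTRY into existing config JSON and return the new text.

    Hypothetical helper: other entries under "mcpServers" are preserved,
    and an empty file is treated as an empty configuration.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    config.setdefault("mcpServers", {})[name] = SERVER_ENTRY
    return json.dumps(config, indent=2)
```

Read the existing file, pass its contents to add_server, and write the result back; back up the original file first.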

Build Docker Image

make docker-build

Development

Prerequisites

  • Python 3.12+
  • uv

Setup

make install    # Install dependencies
make test       # Run tests (38 tests)
make example    # Test all tools
make run        # Start server directly

Docker Commands

make docker-build    # Build image
make docker-run      # Run container
make docker-stop     # Stop container

API Coverage

Implements all GET endpoints from the Hugging Face Dataset Viewer API:

| Endpoint | Tool | Description |
| --- | --- | --- |
| /splits | get_dataset_splits | Dataset splits information |
| /info | get_dataset_info | Dataset metadata and features |
| /first-rows | get_dataset_first_rows | Preview first 100 rows |
| /rows | get_dataset_rows | Paginated data access |
| /search | search_dataset | Full-text search within datasets |
| /size | get_dataset_size | Dataset size information |
| /statistics | get_dataset_statistics | Statistical analysis |
| /filter | filter_dataset | SQL-like data filtering |
| /is-valid | check_dataset_validity | Dataset validation |
| /parquet | get_dataset_parquet | Parquet file information |
| /opt-in-out-urls | get_dataset_opt_in_out_urls | Opt-in/out URLs |
| /presidio-entities | get_dataset_presidio_entities | PII entity detection |
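The endpoint-to-tool mapping can be expressed as a lookup table. The sketch below builds the upstream request URL for a given tool name. It assumes the public Dataset Viewer base URL https://datasets-server.huggingface.co and that query parameters are passed through unchanged. The names TOOL_ENDPOINTS and endpoint_url are hypothetical and do not reflect how this server is actually implemented.

```python
from urllib.parse import urlencode

# Assumed public base URL of the Dataset Viewer API.
BASE_URL = "https://datasets-server.huggingface.co"

# Tool name -> endpoint path, copied from the API coverage listing.
TOOL_ENDPOINTS = {
    "get_dataset_splits": "/splits",
    "get_dataset_info": "/info",
    "get_dataset_first_rows": "/first-rows",
    "get_dataset_rows": "/rows",
    "search_dataset": "/search",
    "get_dataset_size": "/size",
    "get_dataset_statistics": "/statistics",
    "filter_dataset": "/filter",
    "check_dataset_validity": "/is-valid",
    "get_dataset_parquet": "/parquet",
    "get_dataset_opt_in_out_urls": "/opt-in-out-urls",
    "get_dataset_presidio_entities": "/presidio-entities",
}


def endpoint_url(tool: str, **params: str) -> str:
    """Build the GET URL a tool would correspond to (illustrative only).

    Raises KeyError for an unknown tool name. No request is sent.
    """
    path = TOOL_ENDPOINTS[tool]
    return f"{BASE_URL}{path}?{urlencode(params)}"
```

For example, endpoint_url("get_dataset_splits", dataset="squad") produces the /splits URL for the squad dataset, which you can open in a browser or fetch with any HTTP client to compare against the tool's output.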

Built with FastMCP.

Configuration

Copy the example environment file and configure as needed:

cp .env.example .env
# Edit .env with your configuration

Usage

Running the Server

make run

Running Tests

make test

Running Examples

make example

Docker

Build and Run

make docker-build
make docker-run

With Docker Compose

docker-compose up --build

Development

TODO: Add development guidelines

API Documentation

TODO: Add API documentation

Contributing

TODO: Add contributing guidelines

License

TODO: Add license information
