# Academia MCP

MCP server with tools to search, fetch, analyze, and report on scientific papers and datasets across arXiv, the ACL Anthology, Hugging Face Datasets, and Semantic Scholar.
## Features
- ArXiv search and download
- ACL Anthology search
- Hugging Face datasets search
- Semantic Scholar citations and references
- Web search via Exa, Brave, or Tavily
- Web page crawler, LaTeX compilation, PDF reading
- Optional LLM-powered tools for document QA and research proposal workflows
## Requirements
- Python 3.12+
## Install

Using pip (end users):

```bash
pip3 install academia-mcp
```

For development (uv + Makefile):

```bash
uv venv .venv
make install
```
## Quickstart

Run over HTTP (default transport):

```bash
uv run -m academia_mcp --transport streamable-http
```

Run over stdio (for local MCP clients like Claude Desktop):

```bash
python -m academia_mcp --transport stdio
```

Notes:

- Supported transports: `stdio`, `sse`, `streamable-http`.
- `host`/`port` are used for the HTTP transports and ignored for `stdio`. The default port is `5056` (or `PORT`).
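Once the HTTP server is up, you can talk to it from Python with the official `mcp` SDK. A minimal sketch, assuming the default port `5056` and the SDK's default `/mcp` streamable-HTTP endpoint path:

```python
import asyncio


async def list_academia_tools(url: str = "http://localhost:5056/mcp") -> list[str]:
    """Connect to a running Academia MCP server and return its tool names."""
    # Imported inside the function so the sketch parses even without the SDK installed.
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(url) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]
```

Run it with `asyncio.run(list_academia_tools())` while the server is running; it should print names such as `arxiv_search` from the tool list below.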
## Claude Desktop config

```json
{
  "mcpServers": {
    "academia": {
      "command": "python3",
      "args": ["-m", "academia_mcp", "--transport", "stdio"]
    }
  }
}
```
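If your `claude_desktop_config.json` (on macOS, typically `~/Library/Application Support/Claude/claude_desktop_config.json`) already lists other servers, merge this entry into `mcpServers` instead of overwriting the file. A small, hypothetical helper sketching that merge:

```python
def merged_config(existing: dict) -> dict:
    """Return a copy of an MCP client config with the academia server added.

    Existing entries under "mcpServers" are preserved.
    """
    academia_entry = {
        "command": "python3",
        "args": ["-m", "academia_mcp", "--transport", "stdio"],
    }
    config = dict(existing)
    servers = dict(config.get("mcpServers", {}))
    servers["academia"] = academia_entry
    config["mcpServers"] = servers
    return config
```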
## Available tools (one-liners)

- `arxiv_search`: Query arXiv with field-specific queries and filters.
- `arxiv_download`: Fetch a paper by ID and convert it to structured text (HTML/PDF modes).
- `anthology_search`: Search the ACL Anthology with fielded queries and optional date filtering.
- `hf_datasets_search`: Find Hugging Face datasets with filters and sorting.
- `s2_get_citations`: List papers citing a given arXiv paper (Semantic Scholar Graph).
- `s2_get_references`: List papers referenced by a given arXiv paper.
- `visit_webpage`: Fetch and normalize a web page.
- `web_search`: Unified search wrapper; available when at least one of the Exa/Brave/Tavily keys is set.
- `exa_web_search`, `brave_web_search`, `tavily_web_search`: Provider-specific search.
- `get_latex_templates_list`, `get_latex_template`: Enumerate and fetch built-in LaTeX templates.
- `compile_latex`: Compile LaTeX to PDF in `WORKSPACE_DIR`.
- `read_pdf`: Extract text per page from a PDF.
- `download_pdf_paper`, `review_pdf_paper`: Download and optionally review PDFs (requires LLM + workspace).
- `document_qa`: Answer questions over provided document chunks (requires LLM).
- `extract_bitflip_info`, `generate_research_proposals`, `score_research_proposals`: Research proposal helpers (requires LLM).
Availability notes:

- Set `WORKSPACE_DIR` to enable `compile_latex`, `read_pdf`, `download_pdf_paper`, and `review_pdf_paper`.
- Set `OPENROUTER_API_KEY` to enable the LLM tools (`document_qa`, `review_pdf_paper`, and the bitflip tools).
- Set one or more of `EXA_API_KEY`, `BRAVE_API_KEY`, `TAVILY_API_KEY` to enable `web_search` and the provider tools.
## Environment variables

Set as needed depending on which tools you use:

- `OPENROUTER_API_KEY`: required for the LLM-related tools.
- `BASE_URL`: override the OpenRouter base URL.
- `DOCUMENT_QA_MODEL_NAME`: override the default model for `document_qa`.
- `BITFLIP_MODEL_NAME`: override the default model for the bitflip tools.
- `TAVILY_API_KEY`: enables Tavily in `web_search`.
- `EXA_API_KEY`: enables Exa in `web_search` and `visit_webpage`.
- `BRAVE_API_KEY`: enables Brave in `web_search`.
- `WORKSPACE_DIR`: directory for generated files (PDFs, temp artifacts).
- `PORT`: HTTP port (default `5056`).

You can put these in a `.env` file in the project root.
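For example, a `.env` sketch enabling the LLM, web-search, and workspace tools (key values are placeholders):

```
OPENROUTER_API_KEY=your_key_here
TAVILY_API_KEY=your_key_here
WORKSPACE_DIR=./workdir
PORT=5056
```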
## Docker

Build the image:

```bash
docker build -t academia_mcp .
```

Run the server (HTTP):

```bash
docker run --rm -p 5056:5056 \
  -e PORT=5056 \
  -e OPENROUTER_API_KEY=your_key_here \
  -e WORKSPACE_DIR=/workspace \
  -v "$PWD/workdir:/workspace" \
  academia_mcp
```

Or use the existing image: `phoenix120/academia_mcp`.
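The same `docker run` invocation can be captured in a minimal `docker-compose.yml` sketch (the service name and host paths are arbitrary choices, not part of the project):

```yaml
services:
  academia-mcp:
    image: phoenix120/academia_mcp
    ports:
      - "5056:5056"
    environment:
      PORT: "5056"
      OPENROUTER_API_KEY: your_key_here
      WORKSPACE_DIR: /workspace
    volumes:
      - ./workdir:/workspace
```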
## Makefile targets

- `make install`: install the package in editable mode with uv
- `make validate`: run black, flake8, and mypy (strict)
- `make test`: run the test suite with pytest
- `make publish`: build and publish using uv
## LaTeX/PDF requirements

Only needed for the LaTeX/PDF tools. Ensure a LaTeX distribution is installed and that both `pdflatex` and `latexmk` are on `PATH`. On Debian/Ubuntu:

```bash
sudo apt install texlive-latex-base texlive-fonts-recommended texlive-latex-extra texlive-science latexmk
```