MinerU MCP: Document Parsing

MinerU MCP turns difficult documents into agent-readable text while keeping parsing steps inspectable.

Complex documents only become research material when they become readable text.

MinerU MCP integrates MinerU's document parsing API directly into Claude, so a research session can parse difficult documents without switching tools.

Info

Created by: Xule Lin | Version: 1.1.4 | GitHub

Key stats: 90%+ accuracy (VLM mode) | 109 languages | Up to 200 documents per batch | 73% token reduction

Useful for: systematic literature reviews, batch PDF processing, and research corpus preparation

Tip

Part of the Toolkit: MinerU is available as an optional MCP in the Interpretive Orchestration Plugin, where it supports PDF parsing for qualitative research workflows.

Host support

MinerU runs as a standard stdio MCP server, available via npx, Smithery, or a local clone.

The public package still installs through npx -y mineru-mcp; recent source updates moved contributor tooling and release automation to Bun/OIDC, but users do not need Bun for normal MCP usage.

Host	Status	How to add
Claude Code (CLI)	Full	`claude mcp add mineru-mcp -e MINERU_API_KEY=… -- npx -y mineru-mcp`
Claude Code (Desktop "Code" tab)	Full	Same `.mcp.json` as the CLI
Codex CLI / Codex Desktop	Full	`codex mcp add mineru --env MINERU_API_KEY=… -- npx -y mineru-mcp`
Antigravity CLI	Adjacent	Configure through documented Antigravity settings/plugin paths; no verified MCP one-liner
Claude Desktop (chat)	Full	`claude_desktop_config.json`
Cursor / VS Code / Windsurf	Full	Standard MCP config
Cherry Studio, Witsy, Cline	Full	Smithery install or manual config

Per-host install commands are in the Installation & Setup section below.

What is MinerU MCP?

MinerU MCP is an MCP server that wraps MinerU's document parsing API for Claude Code research workflows. Instead of switching between tools or running scripts, you parse documents directly within the conversation.

Why use MinerU MCP instead of manual conversion?

Integrated workflow: parse documents without leaving Claude
Multi-format support: PDF, DOC, DOCX, PPT, PPTX, PNG, JPG, JPEG
Batch processing: submit up to 200 documents per batch
Local file workflow: upload files from your machine, poll for completion, download results
Quality options: choose speed (Pipeline) or accuracy (VLM)
73% token reduction: shorter tool descriptions for lower context usage

The six tools

MinerU MCP provides six tools covering two workflows: URL-based parsing (tools 1-4) and a local file pipeline (tools 5-6).

1. `mineru_parse`

Process a single document with customizable options.

Parameter	Description	Default
`url`	Document URL (required)	-
`model`	`pipeline` (fast) or `vlm` (accurate)	pipeline
`pages`	Page ranges to parse (e.g. `"1-10,15"`)	all pages
`formats`	Extra export formats beyond markdown	-
`ocr`	Enable OCR for scanned documents	false
`formula`	Recognize mathematical/chemical formulas	false
`table`	Detect and extract tables	true
`language`	OCR language (109 supported)	`en`

Example prompt:

Parse pages 1-25 of this paper with VLM mode for maximum accuracy:
https://arxiv.org/pdf/2401.12345.pdf
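
To make the `pages` syntax concrete, here is a minimal sketch that expands a range string such as `"1-10,15"` into page numbers and builds the arguments for the example prompt above. The `expand_pages` helper is hypothetical and only illustrates the notation; it assumes 1-based page numbers, and the server's own validation may differ.

```python
def expand_pages(spec: str) -> list[int]:
    """Expand a page-range string such as "1-10,15" into sorted page numbers.

    Hypothetical helper for illustration; assumes pages are numbered from 1.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers start at 1")
    return sorted(pages)


# Arguments matching the example prompt (parameter names from the table above).
arguments = {
    "url": "https://arxiv.org/pdf/2401.12345.pdf",
    "model": "vlm",
    "pages": "1-25",
}

print(len(expand_pages(arguments["pages"])))  # 25
print(expand_pages("1-10,15")[-2:])           # [10, 15]
```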

2. `mineru_status`

Check task completion and get download URLs.

Parameter	Description	Default
`task_id`	Task ID from a parse request (required)	-
`format`	`concise` or `detailed` response	concise

Example prompt:

Check the status of my parsing job and download the markdown when ready
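
Checking status usually means polling until the task finishes. The sketch below shows one way to do that with exponential backoff. The `check` callable stands in for a `mineru_status` call, and the `"state"` key and its `"done"` / `"failed"` values are assumptions for illustration, not the tool's documented response shape.

```python
import time
from typing import Callable


def wait_for_task(
    check: Callable[[str], dict],
    task_id: str,
    interval: float = 5.0,
    max_interval: float = 60.0,
    timeout: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `check(task_id)` until it reports a terminal state.

    `check` is a hypothetical stand-in for mineru_status; the state names
    below are assumed, so adapt them to the real response.
    """
    waited = 0.0
    while True:
        result = check(task_id)
        if result.get("state") in ("done", "failed"):
            return result
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still running after {waited:.0f}s")
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)  # back off between polls


# Demo with a fake status source: two "running" replies, then "done".
replies = iter([{"state": "running"}, {"state": "running"}, {"state": "done"}])
final = wait_for_task(lambda _task_id: next(replies), "demo-task", sleep=lambda _s: None)
print(final["state"])  # done
```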

3. `mineru_batch`

Process multiple document URLs in one request, for example when preparing a systematic literature review (SLR) corpus.

Limits:

Maximum 200 documents per batch
200MB per file, 600 pages per document
2000 pages/day at high priority
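
A corpus larger than the batch limit has to be split before submission. As a minimal sketch, the hypothetical `chunk_urls` helper below removes duplicate URLs and splits the rest into batches of at most 200; only the limit itself comes from the list above.

```python
MAX_BATCH = 200  # documents per batch, from the limits above


def chunk_urls(urls: list[str], size: int = MAX_BATCH) -> list[list[str]]:
    """Split URLs into batches no larger than the documented limit.

    Hypothetical helper: duplicates are dropped while preserving order.
    """
    if not 1 <= size <= MAX_BATCH:
        raise ValueError(f"size must be between 1 and {MAX_BATCH}")
    unique = list(dict.fromkeys(urls))
    return [unique[i:i + size] for i in range(0, len(unique), size)]


urls = [f"https://example.org/paper-{n}.pdf" for n in range(450)]
print([len(batch) for batch in chunk_urls(urls)])  # [200, 200, 50]
```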

Example prompt:

Batch process these 50 papers using VLM mode for my literature review:
[list of URLs]

4. `mineru_batch_status`

Retrieve paginated results from batch jobs.

Parameter	Description	Default
`batch_id`	Batch ID from a batch request (required)	-
`limit`	Number of results to return	-
`offset`	Pagination offset	0
`format`	`concise` or `detailed` response	concise
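
The `limit` and `offset` parameters follow the usual pagination pattern. The sketch below walks through every page of a batch's results; `fetch_page` is a hypothetical stand-in for a `mineru_batch_status` call, and treating a short page as the last page is an assumption about the response.

```python
from typing import Callable, Iterator


def iter_batch_results(
    fetch_page: Callable[[int, int], list[dict]],
    limit: int = 50,
) -> Iterator[dict]:
    """Yield every result by advancing `offset` in steps of `limit`.

    `fetch_page(limit, offset)` is a hypothetical wrapper around
    mineru_batch_status that returns one page of results.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # assumed: a short page means no more results
            return
        offset += limit


# Demo against an in-memory list of 120 fake results.
fake_results = [{"file": f"paper-{n}.pdf"} for n in range(120)]
seen_offsets = []


def fake_fetch(limit: int, offset: int) -> list[dict]:
    seen_offsets.append(offset)
    return fake_results[offset:offset + limit]


print(len(list(iter_batch_results(fake_fetch))))  # 120
print(seen_offsets)                               # [0, 50, 100]
```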

5. `mineru_upload_batch`

Upload local files from your machine for batch processing — no need to host files at a URL.

Parameter	Description	Default
`directory`	Path to a folder of documents	-
`files`	Array of specific file paths	-
`model`	`pipeline` (fast) or `vlm` (accurate)	pipeline
`formula`	Recognize formulas	false
`table`	Detect and extract tables	true
`language`	OCR language	`en`
`formats`	Extra export formats	-

Provide either directory or files (not both).
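
The either-or rule can be checked before anything is uploaded. The following sketch validates the two parameters and keeps only the file types listed earlier in this guide. The `resolve_upload_files` helper is hypothetical; whether the server itself filters by extension or scans subfolders is not documented here, so this version looks only at the top level of the directory.

```python
from pathlib import Path
from typing import Optional

# Formats listed in this guide's multi-format support line.
SUPPORTED = {".pdf", ".doc", ".docx", ".ppt", ".pptx", ".png", ".jpg", ".jpeg"}


def resolve_upload_files(
    directory: Optional[str] = None,
    files: Optional[list[str]] = None,
) -> list[Path]:
    """Return the supported files selected by exactly one of the two parameters."""
    if (directory is None) == (files is None):
        raise ValueError("provide exactly one of 'directory' or 'files'")
    if directory is not None:
        # Assumption: only the top level of the folder is scanned.
        candidates = sorted(p for p in Path(directory).iterdir() if p.is_file())
    else:
        candidates = [Path(f) for f in files]
    return [p for p in candidates if p.suffix.lower() in SUPPORTED]


selected = resolve_upload_files(files=["review.docx", "notes.md", "scan.JPG"])
print([p.name for p in selected])  # ['review.docx', 'scan.JPG']
```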

6. `mineru_download_results`

Download processed results as named markdown files to a local directory.

Parameter	Description	Default
`batch_id`	Batch ID to download results for (required)	-
`output_dir`	Local directory for output files (required)	-
`overwrite`	Overwrite existing files	false
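
To illustrate what an `overwrite` flag typically controls, here is a small sketch that writes markdown results to a directory and leaves existing files alone unless asked otherwise. The `save_results` helper, the `<source stem>.md` naming, and the choice to skip rather than fail on existing files are all assumptions, not the tool's documented behavior.

```python
import tempfile
from pathlib import Path


def save_results(results: dict[str, str], output_dir: str, overwrite: bool = False) -> list[Path]:
    """Write each result as `<source stem>.md`; skip existing files unless overwrite is set.

    Hypothetical helper: naming and skip-on-exist are assumed for illustration.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for source_name, markdown in results.items():
        target = out / (Path(source_name).stem + ".md")
        if target.exists() and not overwrite:
            continue  # keep the earlier file
        target.write_text(markdown, encoding="utf-8")
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as folder:
    first = save_results({"smith2020.pdf": "# Smith 2020"}, folder)
    second = save_results({"smith2020.pdf": "# changed"}, folder)
    print(len(first), len(second))  # 1 0
```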

Tip

Local file workflow: tools 5 and 6 enable a complete local pipeline. Upload files from your machine with mineru_upload_batch, poll with mineru_batch_status, then save results with mineru_download_results. No URLs or manual downloads needed.

VLM mode vs pipeline mode

VLM mode

Use for: academic papers, complex layouts, tables, formulas

90%+ accuracy using Vision Language Models
Slower processing (worth the wait for important documents)
Higher API cost
Better for SLR corpora where accuracy matters

Example prompt:

Parse with model='vlm' for maximum accuracy

Pipeline mode

Use for: simple documents, speed priority, exploratory screening

Faster processing
Lower cost
Good for initial screening passes
Default mode for quick tasks

Example prompt:

Quick parse this document for initial review

Use cases for research

1. SLR corpus preparation

Converting 50+ papers for systematic review:

I have 47 papers from my Scopus search that need to be converted
to markdown for analysis. Here are the URLs:
[paste URLs]

Use VLM mode for accurate table extraction. This is for my
systematic literature review on organizational learning.

2. Local file processing

When your papers are already downloaded (e.g., from Zotero):

Upload all PDFs in ~/Documents/slr-papers/ using VLM mode,
then download the results to ~/Documents/slr-markdown/

3. Batch processing for literature analysis

Screen a large set before detailed analysis:

Quick parse these 100 papers using pipeline mode to extract
abstracts and key sections. I'll do detailed VLM parsing
on the 20 most relevant ones later.

4. Multilingual research

MinerU supports 109 OCR languages:

Parse this German-language paper with OCR enabled and
language set to 'de'. Extract the methodology section.

Install paths

Two ways to set MinerU up:

Manually — get a MINERU_API_KEY from mineru.net, then register the MCP with your client. See the steps below.
Via Carrel — run /carrel-setup and say yes when the interview asks about complex / scanned PDFs. Carrel adds MinerU at project level and prompts for the API key.

The manual path works in any MCP client; the Carrel path is Claude Code-only but skips the config steps.

Installation & setup

Step 1: get API key

Visit mineru.net
Create an account and generate an API key
Save the key securely (you'll need it for configuration)

Step 2: install MCP

Claude Code:

claude mcp add mineru-mcp -e MINERU_API_KEY=your-api-key -- npx -y mineru-mcp

Verify with claude mcp list; you should see mineru-mcp listed.

Codex CLI / Codex Desktop:

codex mcp add mineru --env MINERU_API_KEY=your-api-key -- npx -y mineru-mcp

Antigravity CLI:

Antigravity CLI has a plugin/settings surface, but the currently verified agy command set does not include a top-level MCP add subcommand. Use Antigravity's documented MCP or plugin configuration path once Google publishes the exact host syntax; do not translate the old Gemini CLI MCP command literally.

Claude Desktop (chat):

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (the macOS location):

{
  "mcpServers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}

Cursor / VS Code / Windsurf:

Add to your MCP settings JSON. The example uses the top-level key mcpServers, as Cursor does; VS Code uses servers instead.

{
  "mcpServers": {
    "mineru": {
      "command": "npx",
      "args": ["-y", "mineru-mcp"],
      "env": {
        "MINERU_API_KEY": "your-api-key"
      }
    }
  }
}
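
Because the two editors differ only in the top-level key, the same entry can be generated for either one. This is a minimal sketch: the `mcp_config` helper is hypothetical, and your editor may expect additional fields, so check its MCP documentation before relying on the output.

```python
import json


def mcp_config(api_key: str, top_level_key: str = "mcpServers") -> str:
    """Render the MinerU server entry under the key a given editor expects.

    Hypothetical helper; only the two key names mentioned above are accepted.
    """
    if top_level_key not in ("mcpServers", "servers"):
        raise ValueError("top_level_key must be 'mcpServers' or 'servers'")
    config = {
        top_level_key: {
            "mineru": {
                "command": "npx",
                "args": ["-y", "mineru-mcp"],
                "env": {"MINERU_API_KEY": api_key},
            }
        }
    }
    return json.dumps(config, indent=2)


# VS Code variant: same entry, different top-level key.
print(mcp_config("your-api-key", top_level_key="servers"))
```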

Smithery:

npx -y @smithery/cli install @linxule/mineru-mcp --client claude

Works with Claude Desktop, Cherry Studio, and other MCP clients. Set MINERU_API_KEY in your environment.

Info

MinerU MCP supports 11+ client configurations including Windsurf, Cline, Cherry Studio, and Witsy. See the full setup guide on GitHub for all options.

Configuration options

Variable	Default	Purpose
`MINERU_API_KEY`	Required	Bearer token from mineru.net
`MINERU_BASE_URL`	`https://mineru.net/api/v4`	API endpoint
`MINERU_DEFAULT_MODEL`	`pipeline`	Default parsing mode
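
For readers scripting around the server, the sketch below shows how these three variables and their defaults might be resolved. The `load_settings` function is hypothetical and is not the server's actual startup code; only the variable names and default values come from the table above.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://mineru.net/api/v4"
DEFAULT_MODEL = "pipeline"


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the three documented variables, applying the table's defaults."""
    env = os.environ if env is None else env
    api_key = env.get("MINERU_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("MINERU_API_KEY is required")
    model = env.get("MINERU_DEFAULT_MODEL", DEFAULT_MODEL)
    if model not in ("pipeline", "vlm"):
        raise ValueError(f"unknown model: {model!r}")
    base_url = env.get("MINERU_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return {"api_key": api_key, "base_url": base_url, "model": model}


settings = load_settings({"MINERU_API_KEY": "your-api-key", "MINERU_DEFAULT_MODEL": "vlm"})
print(settings["model"], settings["base_url"])  # vlm https://mineru.net/api/v4
```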

MinerU vs Mistral OCR

Feature	MinerU MCP	Mistral OCR (Script)
Integration	MCP (inline in Claude)	Python script
Use case	Claude workflows, real-time	Bulk offline processing
Formats	PDF, DOC, DOCX, PPT, PPTX, images	PDF only
Batch limit	200 docs	Unlimited
VLM mode	Yes (90%+)	No
Local files	Yes (upload_batch)	Yes
Languages	109	Variable
Setup	API key + MCP	API key + Python
Cost	Per-page API	Per-page API

Recommendation: Use MinerU MCP for integrated Claude workflows and multi-format documents. Use the Mistral OCR script for very large offline batch jobs.

Limitations & considerations

API key required — Get from mineru.net
File size: 200MB max per file
Page limit: 600 pages per document
Daily quota: 2000 pages at high priority
VLM mode: More accurate but slower and costlier
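
These limits can be checked before a corpus is submitted. The sketch below flags documents that exceed the per-file limits and estimates how many days a corpus needs to stay within the high-priority quota. Both helpers are hypothetical; what happens to pages beyond the daily quota is not stated here, so the estimate only covers the high-priority allowance.

```python
import math

MAX_FILE_MB = 200            # per file
MAX_PAGES_PER_DOC = 600      # per document
DAILY_PRIORITY_PAGES = 2000  # pages per day at high priority


def check_document(pages: int, size_mb: float) -> list[str]:
    """Return the documented limits that a single document would exceed."""
    problems = []
    if size_mb > MAX_FILE_MB:
        problems.append(f"file is {size_mb} MB (limit {MAX_FILE_MB} MB)")
    if pages > MAX_PAGES_PER_DOC:
        problems.append(f"document has {pages} pages (limit {MAX_PAGES_PER_DOC})")
    return problems


def days_at_high_priority(page_counts: list[int]) -> int:
    """Days needed to parse a corpus without exceeding the daily priority quota."""
    return math.ceil(sum(page_counts) / DAILY_PRIORITY_PAGES)


print(check_document(pages=650, size_mb=35))  # ['document has 650 pages (limit 600)']
print(days_at_high_priority([30] * 100))      # 2
```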

Resources

GitHub: linxule/mineru-mcp
npm: mineru-mcp
Smithery: Install for any AI client
MinerU Platform: mineru.net
MinerU Open Source: opendatalab/MinerU
Related: OCR Guide | SLR Workflow
