Created by: Xule Lin
Key Stats: 90%+ accuracy (VLM mode) | 109 languages | Up to 200 documents per batch | 73% token reduction
Perfect for: Systematic literature reviews, batch PDF processing, research corpus preparation
What is MinerU MCP?
An MCP server that wraps MinerU’s document parsing API, optimized for Claude Code research workflows. Instead of switching between tools or running scripts, you can parse documents directly within your AI conversation.
Why use MinerU MCP instead of manual conversion?
- Integrated workflow: Parse PDFs without leaving Claude
- Batch processing: Handle up to 200 documents per batch
- Quality options: Choose speed (Pipeline) or accuracy (VLM)
- 73% token reduction: Optimized tool descriptions for efficient context usage
The Four Tools
1. mineru_parse
Process a single document with customizable options.
| Parameter | Description | Default |
|---|---|---|
| url | Document URL (required) | - |
| model | pipeline (fast) or vlm (accurate) | pipeline |
| ocr | Enable OCR for scanned documents | false |
| formula | Recognize mathematical/chemical formulas | false |
| table | Detect and extract tables | true |
| language | OCR language (109 supported) | en |
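To make the defaults in the table concrete, here is a minimal Python sketch of how the arguments for a mineru_parse call might be assembled. The helper name `build_parse_args` and the example URL are hypothetical and exist only for illustration; the parameter names and defaults are taken from the table above, and the real tool may validate its input differently.

```python
from typing import Any

# Defaults copied from the parameter table above.
PARSE_DEFAULTS: dict[str, Any] = {
    "model": "pipeline",
    "ocr": False,
    "formula": False,
    "table": True,
    "language": "en",
}


def build_parse_args(url: str, **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: merge caller overrides onto the documented defaults."""
    if not url:
        raise ValueError("url is required")
    unknown = set(overrides) - set(PARSE_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    model = overrides.get("model", PARSE_DEFAULTS["model"])
    if model not in ("pipeline", "vlm"):
        raise ValueError("model must be 'pipeline' or 'vlm'")
    # Later keys win, so overrides replace the defaults.
    return {"url": url, **PARSE_DEFAULTS, **overrides}


# A formula-heavy paper: switch to VLM mode and enable formula recognition.
args = build_parse_args("https://example.org/paper.pdf", model="vlm", formula=True)
```

Anything not overridden keeps its documented default, so `args["table"]` stays true and `args["ocr"]` stays false in this example.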
2. mineru_status
Check task completion and get download URLs.
Example prompt:
3. mineru_batch
Process multiple documents simultaneously - perfect for SLR corpus preparation.
Limits:
- Maximum 200 documents per batch
- 200MB per file, 600 pages per document
- 2000 pages/day at high priority
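For a corpus larger than one batch, the limits above suggest splitting the URL list before submitting it. The following is a rough sketch of that idea, not part of the MCP itself: `chunk_urls` and `within_limits` are hypothetical helper names, and the example URLs are placeholders.

```python
from typing import Iterator, Sequence

# Limits as listed above.
MAX_DOCS_PER_BATCH = 200
MAX_FILE_MB = 200
MAX_PAGES_PER_DOC = 600


def within_limits(size_mb: float, pages: int) -> bool:
    """Return True if a single document fits the per-file limits."""
    return size_mb <= MAX_FILE_MB and pages <= MAX_PAGES_PER_DOC


def chunk_urls(
    urls: Sequence[str], size: int = MAX_DOCS_PER_BATCH
) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` URLs, one per batch."""
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


# 450 placeholder URLs split into batches of 200, 200, and 50.
urls = [f"https://example.org/paper_{i}.pdf" for i in range(450)]
batches = list(chunk_urls(urls))
```

Each resulting list could then be passed to a separate mineru_batch request.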
4. mineru_batch_status
Retrieve paginated results from batch jobs.
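Paginated results are usually consumed with a loop that keeps requesting pages until none remain. The sketch below illustrates that pattern only. The response shape (`items`, `has_more`) and the `fetch_page` callable are assumptions made for this example, since this page does not document the actual fields returned by mineru_batch_status.

```python
from typing import Callable, Iterator

# Hypothetical page shape: {"items": [...], "has_more": bool}
Page = dict


def iter_batch_results(fetch_page: Callable[[int], Page]) -> Iterator[dict]:
    """Yield every item across pages, starting at page 1."""
    page = 1
    while True:
        data = fetch_page(page)
        yield from data["items"]
        if not data["has_more"]:
            break
        page += 1


# Stub standing in for the real status call, so the loop can be run locally.
_FAKE_PAGES = [
    [{"file": "a.pdf", "state": "done"}, {"file": "b.pdf", "state": "done"}],
    [{"file": "c.pdf", "state": "running"}],
]


def fake_fetch(page: int) -> Page:
    return {"items": _FAKE_PAGES[page - 1], "has_more": page < len(_FAKE_PAGES)}


results = list(iter_batch_results(fake_fetch))
```

Passing the fetch function as a parameter keeps the loop independent of however the status tool is actually invoked.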
VLM Mode vs Pipeline Mode
VLM Mode
Best for: Academic papers, complex layouts, tables, formulas
- 90%+ accuracy using Vision Language Models
- Slower processing (worth the wait for important documents)
- Higher API cost
- Recommended for SLR corpora where accuracy matters
Pipeline Mode
- Faster processing; the default model for mineru_parse
Use Cases for Research
1. SLR Corpus Preparation
Convert 50+ papers for a systematic review.
2. Batch Processing for Literature Analysis
Screen a large set before detailed analysis.
3. Multilingual Research
MinerU supports 109 OCR languages.
Installation & Setup
Step 1: Get API Key
- Visit mineru.net
- Create account and generate API key
- Save securely (you’ll need it for configuration)
Step 2: Install MCP
- Claude Code
- Smithery
- Claude Desktop
To verify the installation, run claude mcp list; you should see mineru-mcp available.
Configuration Options
| Variable | Default | Purpose |
|---|---|---|
| MINERU_API_KEY | Required | Bearer token from mineru.net |
| MINERU_BASE_URL | https://mineru.net/api/v4 | API endpoint |
| MINERU_DEFAULT_MODEL | pipeline | Default parsing mode |
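As an illustration of how these variables resolve, the following Python sketch reads them with the defaults from the table. It is not the server's own configuration code: `MineruConfig` and `load_config` are hypothetical names, and the key shown is a placeholder.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults as listed in the configuration table above.
DEFAULT_BASE_URL = "https://mineru.net/api/v4"
DEFAULT_MODEL = "pipeline"


@dataclass(frozen=True)
class MineruConfig:
    api_key: str
    base_url: str = DEFAULT_BASE_URL
    default_model: str = DEFAULT_MODEL


def load_config(env: Mapping[str, str] = os.environ) -> MineruConfig:
    """Hypothetical loader: the API key is required, the rest fall back to defaults."""
    api_key = env.get("MINERU_API_KEY", "")
    if not api_key:
        raise RuntimeError("MINERU_API_KEY is required")
    return MineruConfig(
        api_key=api_key,
        base_url=env.get("MINERU_BASE_URL", DEFAULT_BASE_URL),
        default_model=env.get("MINERU_DEFAULT_MODEL", DEFAULT_MODEL),
    )


# Only the key is set here, so the endpoint and model use their defaults.
cfg = load_config({"MINERU_API_KEY": "placeholder-key"})
```

Setting MINERU_DEFAULT_MODEL to vlm would make the accurate mode the default for a review corpus without repeating the option in every request.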
Integration with Research Memex
With OCR Guide
MinerU MCP is the recommended approach for PDF conversion in Research Memex workflows. See the PDF to Markdown Conversion Guide for comparison with other methods.
With SLR Workflow
Use MinerU for batch PDF processing in your Systematic Literature Review workflow. Perfect for converting your Zotero exports to AI-ready markdown.
With Interpretive Orchestration
MinerU is bundled as an optional MCP in the Interpretive Orchestration Plugin for qualitative research. It powers document ingestion alongside Markdownify for a complete document processing pipeline.
MinerU vs Mistral OCR
| Feature | MinerU MCP | Mistral OCR (Script) |
|---|---|---|
| Integration | MCP (inline in Claude) | Python script |
| Best for | Claude workflows, real-time | Bulk offline processing |
| Batch limit | 200 docs | Unlimited |
| VLM mode | Yes (90%+ accuracy) | No |
| Languages | 109 | Variable |
| Setup | API key + MCP | API key + Python |
| Cost | Per-page API | Per-page API |
Limitations & Considerations
- API key required - Get from mineru.net
- File size: 200MB max per file
- Page limit: 600 pages per document
- Daily quota: 2000 pages at high priority
- VLM mode: More accurate but slower and costlier
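The daily quota is worth checking before converting a large corpus. Here is a small sketch of the arithmetic, assuming you want every page processed at high priority and that the quota is counted per calendar day; `days_at_high_priority` is a hypothetical helper and the corpus is made up for the example.

```python
import math

# Daily high-priority quota as listed above.
DAILY_HIGH_PRIORITY_PAGES = 2000


def days_at_high_priority(page_counts: list[int]) -> int:
    """Days needed to process all pages without exceeding the daily quota."""
    total_pages = sum(page_counts)
    return math.ceil(total_pages / DAILY_HIGH_PRIORITY_PAGES)


# Example corpus: 120 papers of 25 pages each, 3000 pages in total.
corpus = [25] * 120
days = days_at_high_priority(corpus)
```

For this corpus the 3000 pages exceed one day's quota, so the work would be spread over two days to stay at high priority.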
Resources
- GitHub: linxule/mineru-mcp
- Smithery: Install for any AI client
- MinerU Platform: mineru.net
- MinerU Open Source: opendatalab/MinerU
- Related: OCR Guide | SLR Workflow