MCP Servers
DeepThonk: OpenDeepThink for Agents
A TypeScript CLI and MCP server that wraps hard tasks in OpenDeepThink-style candidate generation, pairwise judging, mutation, and ranking.
When one answer is too brittle, make the model argue with alternatives.
DeepThonk implements the OpenDeepThink algorithm as a provider-neutral CLI and MCP server. It runs a population of candidate answers through pairwise judging, Bradley-Terry ranking, critique-guided mutation, elite preservation, and a final dense ranking pass. The goal is not speed. It is to spend a controlled amount of test-time compute on tasks where breadth plus judgment can beat a single expensive attempt.
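The ranking step can be illustrated with a generic Bradley-Terry fit over pairwise judgments. This is a minimal sketch under stated assumptions, not DeepThonk's implementation: the `Comparison` type, `fitBradleyTerry`, `rankCandidates`, the smoothing `prior`, and the sample judgments are all hypothetical names and values chosen for illustration.

```typescript
// Sketch only: a generic Bradley-Terry fit, not DeepThonk's actual code.
// One pairwise judgment: candidate `winner` was preferred over candidate `loser`.
interface Comparison {
  winner: number;
  loser: number;
}

// Fit strengths with the classic minorization-maximization update:
//   p_i <- W_i / sum_{j != i} n_ij / (p_i + p_j)
// W_i is the win count of i, n_ij the number of i-vs-j comparisons.
// Assumption: `prior` adds that many virtual wins in each direction for every
// pair, so a candidate that never wins still gets a positive strength.
function fitBradleyTerry(
  numCandidates: number,
  comparisons: Comparison[],
  iterations = 200,
  prior = 0.5,
): number[] {
  const n = numCandidates;
  const wins: number[] = new Array<number>(n).fill(prior * (n - 1));
  const pairCounts: number[][] = Array.from({ length: n }, (_, i) =>
    Array.from({ length: n }, (_, j) => (i === j ? 0 : 2 * prior)),
  );
  for (const { winner, loser } of comparisons) {
    wins[winner] += 1;
    pairCounts[winner][loser] += 1;
    pairCounts[loser][winner] += 1;
  }

  let strengths: number[] = new Array<number>(n).fill(1 / n);
  for (let iter = 0; iter < iterations; iter++) {
    const next = strengths.map((pI, i) => {
      let denom = 0;
      for (let j = 0; j < n; j++) {
        if (j !== i) denom += pairCounts[i][j] / (pI + strengths[j]);
      }
      return denom > 0 ? wins[i] / denom : pI;
    });
    // Normalize so the strengths sum to 1 after every pass.
    const total = next.reduce((sum, p) => sum + p, 0);
    strengths = next.map((p) => p / total);
  }
  return strengths;
}

// Candidate indices ordered from strongest to weakest.
function rankCandidates(strengths: number[]): number[] {
  return strengths.map((_, i) => i).sort((a, b) => strengths[b] - strengths[a]);
}

// Hypothetical judgments for three candidates.
const judgments: Comparison[] = [
  { winner: 0, loser: 1 },
  { winner: 0, loser: 1 },
  { winner: 0, loser: 1 },
  { winner: 0, loser: 2 },
  { winner: 0, loser: 2 },
  { winner: 1, loser: 2 },
  { winner: 1, loser: 2 },
  { winner: 2, loser: 1 },
];
const strengths = fitBradleyTerry(3, judgments);
const ranking = rankCandidates(strengths);
console.log(ranking, strengths);
```

Fitting a model to the whole comparison graph, rather than counting raw wins, lets a candidate that faced stronger opponents be credited for it. That is one plausible reason to pair sparse pairwise judging with this kind of ranking.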
Info
Created by: Xule Lin
GitHub: linxule/deepthonk
Runtime: TypeScript / Node.js
Package: deepthonk on npm, with dt as the short CLI alias
Algorithm source: OpenDeepThink, Zhou et al. (arXiv:2605.15177)
Host Support
DeepThonk exposes the same engine through a shell CLI and an MCP server. Use the CLI when you want a visible run directory. Use MCP when you want an agent to plan, start, poll, inspect, and export runs from inside its own workspace.
| Host | Status | How to add |
|---|---|---|
| Claude Code (CLI / Desktop) | Full | claude mcp add deepthonk -- npx -y deepthonk serve-mcp --transport stdio |
| Codex CLI / Codex Desktop | Full | codex mcp add deepthonk -- npx -y deepthonk serve-mcp --transport stdio |
| Claude Desktop (chat) | Full | Add an npx -y deepthonk serve-mcp --transport stdio server to claude_desktop_config.json |
| Cursor / VS Code / Windsurf | Full | Standard MCP stdio config |
| Any shell | Full CLI | npx -y deepthonk ... or npm install -g deepthonk |
Provider API keys come from the host process environment. DeepSeek is first-class; OpenAI-compatible endpoints and OpenRouter are configurable.
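For hosts that take a JSON config file, the stdio entry can be sketched as follows. The command and flags are copied from the table above. The `mcpServers` key is the layout Claude Desktop reads from `claude_desktop_config.json`; other hosts may expect a different top-level key, so treat the wrapper as an assumption. `StdioServerEntry`, `deepthonkServer`, and `configJson` are illustrative names.

```typescript
// Sketch: build the JSON entry for a stdio MCP server and print it.
// The interface name is hypothetical; the fields mirror the common
// command/args layout of MCP host config files.
interface StdioServerEntry {
  command: string;
  args: string[];
}

// Command and flags are taken verbatim from the host table.
const deepthonkServer: StdioServerEntry = {
  command: "npx",
  args: ["-y", "deepthonk", "serve-mcp", "--transport", "stdio"],
};

// Assumption: the host reads servers from a top-level "mcpServers" map,
// as Claude Desktop does. Check your host's documentation for its key.
const hostConfig = { mcpServers: { deepthonk: deepthonkServer } };

// Print JSON that can be merged into the host's config file.
const configJson = JSON.stringify(hostConfig, null, 2);
console.log(configJson);
```

Because keys come from the host process environment, the entry itself carries no secrets. Export the provider key in the environment that launches the host.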
When to Use DeepThonk
- Hard reasoning tasks where a single response collapses too early.
- Coding and debugging plans that benefit from multiple candidate approaches before implementation.
- Literature synthesis where you want alternative framings ranked against an explicit rubric.
- Methodology design where critique-guided mutation can surface better versions of an initial design.
Avoid it for tiny questions, subjective taste calls, or tasks where judge noise is likely to dominate. DeepThonk spends many model calls by design; plan the budget before running.
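To see why budgeting matters, here is a rough back-of-the-envelope call count. It is only a sketch: the accounting model (one call per generated candidate, per judged pair, and per mutation, plus an all-pairs final pass) and every field name are assumptions made for illustration, not the formula DeepThonk uses. Use the plan command for real estimates.

```typescript
// Sketch only: a hypothetical accounting of model calls for a
// generate / judge / mutate / final-rank loop. Not DeepThonk's formula.
interface BudgetAssumptions {
  populationSize: number;      // candidates generated up front
  rounds: number;              // judge-and-mutate rounds
  comparisonsPerRound: number; // pairwise judgments per round
  mutationsPerRound: number;   // critique-guided rewrites per round
  finalPoolSize: number;       // candidates in the final dense ranking
}

interface CallEstimate {
  generation: number;
  judging: number;
  mutation: number;
  finalRanking: number;
  total: number;
}

function estimateModelCalls(b: BudgetAssumptions): CallEstimate {
  const generation = b.populationSize;
  const judging = b.rounds * b.comparisonsPerRound;
  const mutation = b.rounds * b.mutationsPerRound;
  // Assumption: "dense" means every pair in the final pool is judged once.
  const finalRanking = (b.finalPoolSize * (b.finalPoolSize - 1)) / 2;
  return {
    generation,
    judging,
    mutation,
    finalRanking,
    total: generation + judging + mutation + finalRanking,
  };
}

const estimate = estimateModelCalls({
  populationSize: 8,
  rounds: 3,
  comparisonsPerRound: 12,
  mutationsPerRound: 4,
  finalPoolSize: 8,
});
console.log(estimate.total); // 84 under these assumed settings
```

Even with these modest, made-up settings the run costs dozens of calls, and the all-pairs term grows quadratically with the final pool size.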
Agent-Readable Surface
The MCP server is intentionally inspectable. An agent can call these tools and read these resources:
| Surface | Purpose |
|---|---|
| deepthonk.plan | Estimate calls and rounds before spending money |
| deepthonk.start / deepthonk.status / deepthonk.result | Run long jobs asynchronously |
| deepthonk.run | Blocking convenience for shorter jobs |
| deepthonk.rank | Rank your own candidate set without generation |
| deepthonk.mutate | Improve one candidate with critique |
| deepthonk.export | Export run summaries or full traces |
| deepthonk://runs/... resources | Inspect config, candidates, comparisons, scores, usage, and traces |
That traceability matters for research work. If a synthesis improves, you can inspect the candidates and judgments that got it there instead of treating the final answer as magic.
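The asynchronous start / status / result flow in the table can be sketched as a polling loop. Only the three tool names come from the table. The `CallTool` interface, the `task` and `runId` argument names, the `state` values, and the fake server are assumptions made so the example runs without a provider or network. Check the server's tool schemas for the real shapes.

```typescript
// Hypothetical minimal interface for "call an MCP tool by name".
// A real MCP client library would supply something equivalent.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<Record<string, unknown>>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Start a run, poll until it finishes, then fetch the result.
// Assumed shapes: start returns { runId }, status returns { state }.
async function runAndWait(
  callTool: CallTool,
  task: string,
  pollMs = 2000,
  maxPolls = 300,
): Promise<Record<string, unknown>> {
  const started = await callTool("deepthonk.start", { task });
  const runId = String(started["runId"]);
  for (let poll = 0; poll < maxPolls; poll++) {
    const status = await callTool("deepthonk.status", { runId });
    if (status["state"] === "completed") {
      return callTool("deepthonk.result", { runId });
    }
    if (status["state"] === "failed") {
      throw new Error(`run ${runId} failed`);
    }
    await sleep(pollMs);
  }
  throw new Error(`run ${runId} did not finish after ${maxPolls} polls`);
}

// In-memory stand-in for the server: reports "running" until the
// given number of status polls, then "completed".
function makeFakeServer(pollsUntilDone: number): CallTool {
  let polls = 0;
  return async (name) => {
    switch (name) {
      case "deepthonk.start":
        return { runId: "fake-run-1" };
      case "deepthonk.status":
        polls += 1;
        return { state: polls >= pollsUntilDone ? "completed" : "running" };
      case "deepthonk.result":
        return { runId: "fake-run-1", answer: "60" };
      default:
        throw new Error(`unknown tool: ${name}`);
    }
  };
}

const pending = runAndWait(
  makeFakeServer(3),
  "Find the smallest positive integer divisible by 3, 4, and 5.",
  1,
);
pending.then((result) => console.log(result));
```

Bounding the loop with `maxPolls` keeps an agent from waiting forever on a stalled run. The blocking deepthonk.run tool avoids the loop entirely for shorter jobs.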
Quick Start
Run without installing:
npx -y deepthonk plan --profile paper
npx -y deepthonk run --provider fake --profile quick \
--task "Find the smallest positive integer divisible by 3, 4, and 5." \
--out runs/test-quick
npx -y deepthonk inspect runs/test-quick
For paid providers, configure a reusable profile first:
deepthonk setup \
--provider deepseek \
--api-key-env DEEPSEEK_API_KEY \
--fast-model deepseek-v4-flash \
--judge-model deepseek-v4-pro
Then plan before you run:
deepthonk plan --config ~/.config/deepthonk/config.yaml
deepthonk run --task task.md --config ~/.config/deepthonk/config.yaml --profile quick --dry-run
Relationship to the Stack
DeepThonk sits next to Sequential Thinking, not underneath it. Sequential Thinking gives a single model a structured scratchpad. DeepThonk creates and ranks many attempts. Use Sequential Thinking when you want one transparent reasoning chain; use DeepThonk when you want search, mutation, judging, and a trace of alternatives.
It also pairs naturally with Vox: Vox gives you access to many models; DeepThonk gives you a repeatable protocol for spending more compute on a hard task.
Carrel status: DeepThonk is not installed by Carrel today. Add it manually when a project needs OpenDeepThink-style test-time compute.