Created by: Xule Lin
GitHub: linxule/vox-mcp
Runtime: Python / uv
Design philosophy: Minimal intervention. The only value Vox adds is routing and conversation memory — everything else is pure passthrough.
Why Vox?
When you’re working in Claude Code and want a second opinion from Gemini, GPT, or DeepSeek, you’d normally have to switch applications. Vox lets you query any model without leaving your current workflow.

Key difference from alternatives: most multi-model tools inject their own system prompts or modify your messages. Vox doesn’t. What you send is what the model receives.

Supported Providers
| Provider | Env Variable | Example Models |
|---|---|---|
| Google Gemini | GEMINI_API_KEY | gemini-2.5-pro |
| OpenAI | OPENAI_API_KEY | gpt-5.1, gpt-5, o3, o4-mini |
| Anthropic | ANTHROPIC_API_KEY | claude-4-opus, claude-4-sonnet |
| xAI | XAI_API_KEY | grok-3, grok-3-fast |
| DeepSeek | DEEPSEEK_API_KEY | deepseek-chat, deepseek-reasoner |
| Moonshot (Kimi) | MOONSHOT_API_KEY | kimi-k2-thinking-turbo, kimi-k2.5 |
| OpenRouter | OPENROUTER_API_KEY | Any OpenRouter model |
| Custom/Local | CUSTOM_API_URL | Ollama, vLLM, LM Studio |
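As a sketch, a minimal `.env` might enable two hosted providers plus a local endpoint. The key values are placeholders, and the `CUSTOM_API_URL` port assumes Ollama's default; set only the variables for providers you actually use, per the table above:

```shell
# .env — only configure the providers you need
GEMINI_API_KEY=your-gemini-key
DEEPSEEK_API_KEY=your-deepseek-key

# Local models via an OpenAI-compatible server (Ollama default port assumed)
CUSTOM_API_URL=http://localhost:11434
```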
Core Tools
Vox provides three tools through the MCP protocol:

chat
Send prompts to any supported model with optional file or image attachments.
listmodels
Show all available models, aliases, and capabilities across your configured providers.
dump_threads
Export conversation threads as JSON or Markdown — useful for documenting multi-model analysis.
Multi-Turn Conversations
Vox supports persistent threads via continuation_id. This means you can:
- Start a conversation with Gemini about a theoretical framework
- Continue the same thread with follow-up questions
- Switch to DeepSeek mid-conversation to get a different perspective
- Export the entire multi-model dialogue
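Switching providers mid-thread (the third step above) can be sketched as an MCP tool call. The request shape follows MCP's generic `tools/call` method; the `prompt` and `model` argument names are assumptions about Vox's chat tool, while `continuation_id` is the thread identifier named above, carried over from the earlier Gemini turns:

```json
{
  "method": "tools/call",
  "params": {
    "name": "chat",
    "arguments": {
      "prompt": "Where does this framework break down?",
      "model": "deepseek-chat",
      "continuation_id": "<id returned by the earlier chat call>"
    }
  }
}
```

Because the thread is keyed by continuation_id rather than by provider, the new model receives the full prior conversation.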
Research Workflows
- Model Comparison
- Critical Challenge
- Triangulation
Compare perspectives on the same research question: ask the same analytical question to 3-4 models and compare their responses. Each model brings different strengths — Claude for nuanced interpretation, Gemini for large-context synthesis, DeepSeek for cost-effective exploration.

This is particularly valuable for:
- Theory development (different models foreground different tensions)
- Literature gap identification
- Methodological critique
Setup
MCP Client Configuration
Vox runs as a stdio MCP server. Replace /path/to/vox-mcp with the absolute path to your cloned repo.
- Claude Code
- Claude Desktop
- Cursor
- Windsurf
Via CLI:

Or add to .mcp.json in your project root:

Configuration Options
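As a sketch of both registration routes: the CLI form uses Claude Code's `claude mcp add` syntax, and the JSON form is a standard `mcpServers` entry. The server name `vox` and the `uv run vox-mcp` invocation are assumptions; check the repo's README for the canonical command.

```shell
# Hypothetical CLI registration — server name and uv entry point assumed
claude mcp add vox -- uv --directory /path/to/vox-mcp run vox-mcp
```

```json
{
  "mcpServers": {
    "vox": {
      "command": "uv",
      "args": ["--directory", "/path/to/vox-mcp", "run", "vox-mcp"]
    }
  }
}
```

Both forms launch the same stdio process; the JSON file is simply the project-scoped equivalent of the CLI command.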
Beyond API keys, Vox supports several configuration options in .env:
DEFAULT_MODEL
Set to auto (default) to let the agent pick the best model, or specify a model name like gemini-2.5-pro to always route to that model.
CONVERSATION_TIMEOUT_HOURS
How long conversation threads stay alive. Default: 24 hours. Threads expire after this period of inactivity.
MAX_CONVERSATION_TURNS
Maximum number of turns per conversation thread. Default: 100. Prevents runaway threads from consuming memory.
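Taken together, these options can be sketched in `.env` as follows. The variable names come from this section; the values shown are the stated defaults made explicit:

```shell
# .env — conversation behavior (values shown are the documented defaults)
DEFAULT_MODEL=auto
CONVERSATION_TIMEOUT_HOURS=24
MAX_CONVERSATION_TURNS=100
```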
Model Restrictions
Per-provider allowlists like GOOGLE_ALLOWED_MODELS, OPENAI_ALLOWED_MODELS, etc., restrict which models are available to prevent accidental use of expensive models.
Part of the Research Memex Ecosystem
Vox integrates naturally with other Research Memex tools:
- Interpretive Orchestration Plugin — Multi-model triangulation during qualitative analysis
- Claude Code Setup Guide — Your primary research environment
- AI Model Reference Guide — Understanding which models to query for what