Vox MCP: Multi-Model AI Gateway
Access 8+ AI providers from any MCP client — pure passthrough with no system prompt injection. Created by Xule Lin.
Vox MCP is a multi-model AI gateway that lets you access any AI provider directly from Claude Code, Claude Desktop, Cursor, or any MCP client. Unlike other multi-model tools, Vox uses a pure passthrough design — prompts go to providers unmodified, responses come back unmodified. No system prompt injection, no response formatting, no behavioral directives.
Info
Created by: Xule Lin
GitHub: linxule/vox-mcp
Runtime: Python / uv
Design philosophy: Minimal intervention. The only value Vox adds is routing and conversation memory — everything else is pure passthrough.
Why Vox?
When you're working in Claude Code and want a second opinion from Gemini, GPT, or DeepSeek, you'd normally have to switch applications. Vox lets you query any model without leaving your current workflow.
Key difference from alternatives: Most multi-model tools inject their own system prompts or modify your messages. Vox doesn't. What you send is what the model receives.
Supported Providers
| Provider | Env Variable | Example Models |
|---|---|---|
| Google Gemini | GEMINI_API_KEY | gemini-2.5-pro |
| OpenAI | OPENAI_API_KEY | gpt-5.1, gpt-5, o3, o4-mini |
| Anthropic | ANTHROPIC_API_KEY | claude-4-opus, claude-4-sonnet |
| xAI | XAI_API_KEY | grok-3, grok-3-fast |
| DeepSeek | DEEPSEEK_API_KEY | deepseek-chat, deepseek-reasoner |
| Moonshot (Kimi) | MOONSHOT_API_KEY | kimi-k2-thinking-turbo, kimi-k2.5 |
| OpenRouter | OPENROUTER_API_KEY | Any OpenRouter model |
| Custom/Local | CUSTOM_API_URL | Ollama, vLLM, LM Studio |
You only need API keys for providers you want to use. Vox works with any subset.
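For example, a minimal .env that enables two hosted providers plus a local endpoint might look like this (keys are placeholders, and the Ollama URL assumes its OpenAI-compatible endpoint):

```bash
# Enable only the providers you plan to use
GEMINI_API_KEY=your-gemini-key
DEEPSEEK_API_KEY=your-deepseek-key
# Local models served through an OpenAI-compatible API (e.g., Ollama)
CUSTOM_API_URL=http://localhost:11434/v1
```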
Core Tools
Vox provides three tools through the MCP protocol:
chat
Send prompts to any supported model with optional file or image attachments.
"Use vox chat with gemini-2.5-pro:
Compare these two theoretical frameworks and identify tensions..."listmodels
Show all available models, aliases, and capabilities across your configured providers.
dump_threads
Export conversation threads as JSON or Markdown — useful for documenting multi-model analysis.
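As with chat, the other tools are invoked in natural language from your MCP client; the phrasing below is illustrative, not a fixed syntax:

"Use vox listmodels to show which providers and models are available."

"Use vox dump_threads to export this conversation as Markdown."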
Multi-Turn Conversations
Vox supports persistent threads via continuation_id. This means you can:
- Start a conversation with Gemini about a theoretical framework
- Continue the same thread with follow-up questions
- Switch to DeepSeek mid-conversation to get a different perspective
- Export the entire multi-model dialogue
Threads are shadow-persisted to disk as JSONL for durability and can be exported as Markdown.
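As a rough sketch of what one persisted turn could look like (the field names here are hypothetical, not Vox's actual schema):

```json
{"thread_id": "abc123", "turn": 2, "model": "gemini-2.5-pro", "role": "assistant", "content": "..."}
```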
Research Workflows
Compare perspectives on the same research question:
Ask the same analytical question to 3-4 models and compare their responses. Each model brings different strengths — Claude for nuanced interpretation, Gemini for large-context synthesis, DeepSeek for cost-effective exploration.
This is particularly valuable for:
- Theory development (different models foreground different tensions)
- Literature gap identification
- Methodological critique
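A comparison session might look like this (the prompt phrasing is illustrative):

"Use vox chat to ask gemini-2.5-pro, deepseek-reasoner, and grok-3 the same question:
What are the main unresolved tensions in this framework?
Then summarize where their answers diverge."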
Setup
Clone and install
```bash
git clone https://github.com/linxule/vox-mcp.git
cd vox-mcp
uv sync
```
Configure API keys
```bash
cp .env.example .env
# Edit .env — add at least one provider API key
```
Test the server
```bash
uv run python server.py
```
Add to your MCP client
See the configuration examples below for your specific client.
MCP Client Configuration
Vox runs as a stdio MCP server. Replace /path/to/vox-mcp with the absolute path to your cloned repo.
Via CLI:
```bash
claude mcp add vox-mcp \
  -e GEMINI_API_KEY=your-key-here \
  -- uv run --directory /path/to/vox-mcp python server.py
```
Or add to .mcp.json in your project root:
```json
{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}
```
Tip
API keys can live in either the MCP client config or the .env file inside the vox-mcp directory (loaded automatically). If both are set and conflict, add VOX_FORCE_ENV_OVERRIDE=true to .env to prefer your local values.
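For Claude Desktop, the same mcpServers entry goes in claude_desktop_config.json (its location varies by OS). A minimal sketch, assuming the schema shown above carries over unchanged:

```json
{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": { "GEMINI_API_KEY": "your-key-here" }
    }
  }
}
```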
Configuration Options
Beyond API keys, Vox supports several configuration options in .env:
DEFAULT_MODEL
Set to auto (default) to let the agent pick the best model, or specify a model name like gemini-2.5-pro to always route to that model.
CONVERSATION_TIMEOUT_HOURS
How long conversation threads stay alive. Default: 24 hours. Threads expire after this period of inactivity.
MAX_CONVERSATION_TURNS
Maximum number of turns per conversation thread. Default: 100. Prevents runaway threads from consuming memory.
Model Restrictions
Per-provider allowlists like GOOGLE_ALLOWED_MODELS, OPENAI_ALLOWED_MODELS, etc. Restrict which models are available to prevent accidental use of expensive models.
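Put together, a tuned .env might look like this (values are illustrative):

```bash
DEFAULT_MODEL=auto                    # or pin a model, e.g. gemini-2.5-pro
CONVERSATION_TIMEOUT_HOURS=24         # expire idle threads after a day
MAX_CONVERSATION_TURNS=100            # cap thread length
GOOGLE_ALLOWED_MODELS=gemini-2.5-pro  # per-provider allowlists
OPENAI_ALLOWED_MODELS=o4-mini
```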
Note
See .env.example in the repository for the full reference of all configuration options.
Part of the Research Memex Ecosystem
Vox integrates naturally with other Research Memex tools:
- Interpretive Orchestration Plugin — Multi-model triangulation during qualitative analysis
- Claude Code Setup Guide — Your primary research environment
- AI Model Reference Guide — Understanding which models to query for what