AI Model Reference Guide
Compare current AI model families, understand reasoning and sampling controls, and choose models for research reasoning, writing, and analysis tasks
Reference page for: AI model selection and configuration. Other pages link here for current model-family characteristics and provider-specific settings.
Overview: Current, but Not Permanent
Warning
AI model names, aliases, pricing, rate limits, and control parameters change quickly. This page was spot-checked against provider documentation on June 23, 2026. Before building a workflow around a specific model ID, check the provider's current model list.
The practical question is not "which model is best?" It is which model, access path, and control surface fit the task:
- Reasoning depth: Can it handle theoretical tension and multi-step analysis?
- Context capacity: Can it read the relevant corpus without destructive chunking?
- Writing behavior: Does it preserve academic nuance and voice?
- Cost and latency: Can you afford to iterate?
- Control surface: Is it steered by reasoning effort, thinking level, prompt constraints, or sampling parameters?
Which AI Model Should I Use?
Best starting points:
- Claude Opus 4.8 or Claude Sonnet 4.6 for theory, critique, and nuanced writing.
- GPT-5.5 for reliable complex reasoning, coding-adjacent work, and structured fixes.
- Gemini 3.5 Flash or Gemini 3.1 Pro preview for large-context synthesis.
- GLM-5.2 for long-context coding-agent reasoning and Chinese-English synthesis.
Use cases: theory building, critical analysis, synthesis, methodology design.
Pro tip: Start with cheaper or free access for exploration, then move the strongest prompt and evidence set to the model you trust for final reasoning.
Current Model Families
OpenAI GPT Family
- Current anchor: GPT-5.5 for complex reasoning and coding; GPT-5.4, GPT-5.4 mini, and GPT-5.4 nano for lower cost/latency.
- Context: up to 1M tokens on GPT-5.5 / GPT-5.4; smaller variants vary.
- Control surface: reasoning.effort where supported; sampling defaults are usually enough unless a specific API task needs tuning.
- Best for: reliable editing, coding-adjacent research infrastructure, broad general reasoning.
- Source: OpenAI model docs
Anthropic Claude Family
- Current anchor: Claude Opus 4.8, Claude Sonnet 4.6, Claude Haiku 4.5.
- Special status: Claude Mythos Preview is invitation-only and focused on defensive cybersecurity workflows. Claude Fable 5 launched June 9, 2026 as Anthropic's first public Mythos-class model, then was suspended globally on June 12, 2026 following a US export-control directive; do not rely on it until access is restored.
- Control surface: effort for Opus 4.8; do not set non-default temperature, top_p, or top_k on Opus 4.8 Messages API requests.
- Best for: theoretical writing, qualitative interpretation, long-form critique, careful synthesis.
- Source: Claude model docs
Google Gemini Family
- Current anchor: Gemini 3.5 Flash (stable), Gemini 3.1 Pro (preview), Gemini 3 Flash (preview), Gemini 3.1 Flash-Lite (stable).
- Context: Gemini remains the large-context workhorse; exact context limits and rate limits vary by tier and model.
- Control surface: for Gemini 3.x, remove temperature, top_p, and top_k; use thinking_level where relevant.
- Best for: large-corpus synthesis, volume processing, multimodal tasks, Google ecosystem workflows.
- Source: Gemini API model docs
Access Paths
| Access path | Best for | Notes |
|---|---|---|
| Provider web app | Fast exploration | Good for trying a model before wiring API access. |
| Google AI Studio | Gemini API keys and free-tier experiments | Use for direct Gemini model access, not Antigravity CLI auth. |
| Cherry Studio | GUI model comparison and knowledge bases | Best first interface for many researchers. |
| OpenCode | CLI-side provider comparison | Useful when model choice is part of the method. |
| Vox MCP | Multi-model access inside an MCP client | Pure passthrough; good for challenge and triangulation. |
| Claude Code / Antigravity CLI | Agentic project work | Product sign-in flows, not generic provider-key clients. |
Configuration: Defaults First
Sampling is no longer the default lever
The old advice was to tune temperature for every model. That is now misleading. Many current reasoning models either reject sampling changes, ignore them, or explicitly recommend defaults.
Default rule: omit temperature, top_p, and top_k unless the provider currently documents that the model supports and benefits from changing them.
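One way to apply the default rule in code is to strip sampling keys from every request unless you have deliberately opted in. The sketch below is a minimal illustration using plain dictionaries, not any provider's SDK. The function name build_params, the flag provider_documents_sampling, and the model ID "example-model" are all hypothetical.

```python
# Hypothetical helper: builds a parameter dict only; it makes no API calls.
SAMPLING_KEYS = ("temperature", "top_p", "top_k")


def build_params(base, sampling=None, provider_documents_sampling=False):
    """Return request parameters with sampling keys dropped by default.

    Sampling values are added back only when the caller confirms that the
    provider currently documents them for the target model.
    """
    params = {k: v for k, v in base.items() if k not in SAMPLING_KEYS}
    if provider_documents_sampling and sampling:
        params.update({k: v for k, v in sampling.items() if k in SAMPLING_KEYS})
    return params


# "example-model" is a placeholder, not a real model ID.
base = {"model": "example-model", "max_tokens": 1024, "temperature": 0.2}

default_params = build_params(base)  # temperature is removed
tuned_params = build_params(
    base, sampling={"temperature": 0.7}, provider_documents_sampling=True
)
```

Making the opt-in an explicit argument means a leftover temperature value from an older workflow cannot slip into a request unnoticed.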
Provider-specific notes:
| Provider/model | Recommendation |
|---|---|
| Claude Opus 4.8 | Omit non-default temperature, top_p, top_k; non-default values can error. |
| Gemini 3.x | Remove temperature, top_p, top_k; use thinking_level and prompt constraints. |
| Kimi K2.7 / K2.6 | Do not set temperature; it is not modifiable on these models. |
| GLM-5.2 | Default temperature is 1.0; prefer reasoning_effort for thinking control. |
| DeepSeek | Temperature remains documented by use case; default is 1.0. |
| xAI Grok 4.3 | Prefer reasoning_effort; watch incompatible parameters on reasoning models. |
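The provider notes above can also be kept as data and used to check a request before it is sent. The following is a rough sketch under stated assumptions: the keys are the human-readable labels from the table, not API model IDs; the control names are treated as flat dictionary keys, although real APIs may nest them; and lint_request is a hypothetical helper. The "effort" entry for Claude Opus 4.8 comes from the Anthropic family notes earlier on this page. Re-check every entry against provider documentation before relying on it.

```python
SAMPLING_KEYS = {"temperature", "top_p", "top_k"}

# Transcribed from the notes on this page; labels are not API model IDs.
POLICY = {
    "Claude Opus 4.8": {"avoid": SAMPLING_KEYS, "prefer": "effort"},
    "Gemini 3.x": {"avoid": SAMPLING_KEYS, "prefer": "thinking_level"},
    "Kimi K2.7 / K2.6": {"avoid": {"temperature"}, "prefer": None},
    "GLM-5.2": {"avoid": set(), "prefer": "reasoning_effort"},
    "DeepSeek": {"avoid": set(), "prefer": None},
    "xAI Grok 4.3": {"avoid": set(), "prefer": "reasoning_effort"},
}


def lint_request(label, params):
    """Return a list of warnings for one request's parameter dict."""
    rule = POLICY.get(label)
    if rule is None:
        return [f"No rule recorded for {label!r}; check the provider docs."]
    # Flag any sampling key that the notes say to omit for this model.
    warnings = [
        f"{label}: remove {key}"
        for key in sorted(rule["avoid"].intersection(params))
    ]
    # Suggest the preferred control if the request does not set it.
    if rule["prefer"] and rule["prefer"] not in params:
        warnings.append(f"{label}: consider {rule['prefer']}")
    return warnings


gemini_warnings = lint_request("Gemini 3.x", {"temperature": 0.3, "top_p": 0.9})
deepseek_warnings = lint_request("DeepSeek", {"temperature": 1.0})
```

Keeping the rules in one dictionary makes the spot-check date easy to enforce: when the provider documentation changes, only POLICY needs updating.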
Tip
If a compatible model forces you to choose a temperature and the provider does not give task-specific guidance, start at 1.0. Go lower only for mechanical extraction or classification, where repeatability matters.
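The tip can be written as a small decision function. This is only a sketch: the function name, the task labels, and especially the lower value of 0.2 are illustrative assumptions. The page says to go lower for mechanical work but does not specify a number.

```python
# Task labels treated as "mechanical" for this sketch (an assumption).
MECHANICAL_TASKS = {"extraction", "classification"}


def starting_temperature(task, provider_guidance=None, mechanical_value=0.2):
    """Pick a starting temperature for models that require one.

    provider_guidance: a task-specific value from provider docs, if any.
    mechanical_value: illustrative placeholder, not a documented setting.
    """
    if provider_guidance is not None:
        return provider_guidance  # documented guidance always wins
    if task in MECHANICAL_TASKS:
        return mechanical_value  # repeatability matters more than variety
    return 1.0  # the default starting point from the tip


synthesis_temp = starting_temperature("synthesis")
classify_temp = starting_temperature("classification")
guided_temp = starting_temperature("synthesis", provider_guidance=0.6)
```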
Strategic Model Usage
Use cheap or free access paths to learn model behavior against your own corpus.
- Start with Google AI Studio, Cherry Studio, OpenCode, or OpenRouter.
- Run the same prompt across at least three model families.
- Record failure modes, not just best answers.
- Promote a model only after it handles your real material.
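The steps above can be sketched as a small comparison harness. Everything here is hypothetical scaffolding: the callers are stub functions standing in for whatever access path you use, the family names are placeholders, and the two failure checks are deliberately simple examples rather than a recommended rubric.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    family: str
    prompt: str
    output: str
    failure_modes: list = field(default_factory=list)


def run_comparison(prompt, callers, checks):
    """Run one prompt across model families and record failure modes.

    callers: {family name: function(prompt) -> output text}
    checks:  {failure name: function(output) -> True if the failure occurred}
    """
    if len(callers) < 3:
        raise ValueError("Compare at least three model families.")
    trials = []
    for family, call in callers.items():
        output = call(prompt)
        failures = [name for name, failed in checks.items() if failed(output)]
        trials.append(Trial(family, prompt, output, failures))
    return trials


def promotable(trials):
    """Families that showed no recorded failure on this material."""
    return [t.family for t in trials if not t.failure_modes]


# Stub callers; replace each with a real call through your access path.
callers = {
    "family_a": lambda p: "Summary with citation [1].",
    "family_b": lambda p: "Summary without sources.",
    "family_c": lambda p: "",
}
checks = {
    "empty_output": lambda out: not out.strip(),
    "no_citation": lambda out: "[" not in out,
}
trials = run_comparison("Summarize the attached excerpt.", callers, checks)
candidates = promotable(trials)
```

Storing the failure names alongside each output keeps the record of what went wrong, which is usually more reusable than the single best answer.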
Free and Low-Cost Options
Google AI Studio remains the easiest free/low-cost backup for Gemini API work, but limits change often. Check Gemini rate limits before relying on a quota for teaching or batch work.
Other cost-management paths:
- Use OpenCode or Cherry Studio to compare providers before committing to a long run.
- Use DeepSeek or Qwen for high-volume screening when quality is sufficient.
- Use local models when privacy matters more than frontier quality.
- Reserve expensive models for the final synthesis, argument, or validation pass.
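As a rough illustration, the cost-management paths above can be expressed as a stage-to-tier lookup. The stage names, tier labels, and the choose_tier function are assumptions made for this sketch; map the tiers to whichever concrete models you have validated.

```python
# Hypothetical stage and tier labels; adjust to your own workflow.
STAGE_TIER = {
    "exploration": "free_or_cheap",
    "screening": "high_volume",
    "final_synthesis": "frontier",
    "validation": "frontier",
}


def choose_tier(stage, sensitive=False):
    """Pick a cost tier for a workflow stage.

    sensitive=True routes to local models, reflecting the rule that
    privacy can outweigh frontier quality.
    """
    if sensitive:
        return "local"
    # Unknown stages fall back to the cheapest tier.
    return STAGE_TIER.get(stage, "free_or_cheap")


screening_tier = choose_tier("screening")
final_tier = choose_tier("final_synthesis")
private_tier = choose_tier("final_synthesis", sensitive=True)
```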
Next Step
The reference above tells you what is available. The companion page tells you how to test those choices against your own materials: AI Model Discovery Protocol.