AI Model Reference Guide

Compare current AI model families, understand reasoning and sampling controls, and choose models for research reasoning, writing, and analysis tasks.

Reference page for: AI model selection and configuration. Other pages link here for current model-family characteristics and provider-specific settings.

Overview: Current, but Not Permanent

Warning

AI model names, aliases, pricing, rate limits, and control parameters change quickly. This page was spot-checked against provider documentation on June 23, 2026. Before building a workflow around a specific model ID, check the provider's current model list.

The practical question is not "which model is best?" It is which model, access path, and control surface fit the task:

  • Reasoning depth: Can it handle theoretical tension and multi-step analysis?
  • Context capacity: Can it read the relevant corpus without destructive chunking?
  • Writing behavior: Does it preserve academic nuance and voice?
  • Cost and latency: Can you afford to iterate?
  • Control surface: Is the model steered through reasoning effort, thinking level, prompt constraints, or sampling parameters?

Which AI Model Should I Use?

Best starting points:

  • Claude Opus 4.8 or Claude Sonnet 4.6 for theory, critique, and nuanced writing.
  • GPT-5.5 for reliable complex reasoning, coding-adjacent work, and structured fixes.
  • Gemini 3.5 Flash or Gemini 3.1 Pro preview for large-context synthesis.
  • GLM-5.2 for long-context coding-agent reasoning and Chinese-English synthesis.

Use cases: theory building, critical analysis, synthesis, methodology design.

Pro tip: Start with cheaper or free access for exploration, then move the strongest prompt and evidence set to the model you trust for final reasoning.

Current Model Families

OpenAI GPT Family

  • Current anchor: GPT-5.5 for complex reasoning and coding; GPT-5.4, GPT-5.4 mini, and GPT-5.4 nano for lower cost/latency.
  • Context: up to 1M tokens on GPT-5.5 / GPT-5.4; smaller variants vary.
  • Control surface: reasoning.effort where supported; sampling defaults are usually enough unless a specific API task needs tuning.
  • Best for: reliable editing, coding-adjacent research infrastructure, broad general reasoning.
  • Source: OpenAI model docs

Anthropic Claude Family

  • Current anchor: Claude Opus 4.8, Claude Sonnet 4.6, Claude Haiku 4.5.
  • Special status: Claude Mythos Preview is invitation-only and focused on defensive cybersecurity workflows. Claude Fable 5 launched June 9, 2026 as Anthropic's first public Mythos-class model, then was suspended globally on June 12, 2026 following a US export-control directive; do not rely on it until access is restored.
  • Control surface: effort for Opus 4.8; do not set non-default temperature, top_p, or top_k on Opus 4.8 Messages API requests.
  • Best for: theoretical writing, qualitative interpretation, long-form critique, careful synthesis.
  • Source: Claude model docs

Google Gemini Family

  • Current anchor: Gemini 3.5 Flash stable; Gemini 3.1 Pro preview; Gemini 3 Flash preview; Gemini 3.1 Flash-Lite stable.
  • Context: Gemini remains the large-context workhorse; exact limits and rate limits vary by tier and model.
  • Control surface: for Gemini 3.x, remove temperature, top_p, and top_k; use thinking_level where relevant.
  • Best for: large-corpus synthesis, volume processing, multimodal tasks, Google ecosystem workflows.
  • Source: Gemini API model docs

Access Paths

| Access path | Best for | Notes |
| --- | --- | --- |
| Provider web app | Fast exploration | Good for trying a model before wiring API access. |
| Google AI Studio | Gemini API keys and free-tier experiments | Use for direct Gemini model access, not Antigravity CLI auth. |
| Cherry Studio | GUI model comparison and knowledge bases | Best first interface for many researchers. |
| OpenCode | CLI-side provider comparison | Useful when model choice is part of the method. |
| Vox MCP | Multi-model access inside an MCP client | Pure passthrough; good for challenge and triangulation. |
| Claude Code / Antigravity CLI | Agentic project work | Product sign-in flows, not generic provider-key clients. |

Configuration: Defaults First

Sampling is no longer the default lever

The old advice was to tune temperature for every model. That is now misleading. Many current reasoning models either reject sampling changes, ignore them, or explicitly recommend defaults.

Default rule: omit temperature, top_p, and top_k unless the provider currently documents that the model supports and benefits from changing them.

Provider-specific notes:

| Provider/model | Recommendation |
| --- | --- |
| Claude Opus 4.8 | Omit non-default temperature, top_p, top_k; non-default values can error. |
| Gemini 3.x | Remove temperature, top_p, top_k; use thinking_level and prompt constraints. |
| Kimi K2.7 / K2.6 | Do not set temperature; it is not modifiable on these models. |
| GLM-5.2 | Default temperature is 1.0; prefer reasoning_effort for thinking control. |
| DeepSeek | Temperature remains documented by use case; default is 1.0. |
| xAI Grok 4.3 | Prefer reasoning_effort; watch incompatible parameters on reasoning models. |
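
The default rule and the provider notes above can be expressed as a small lookup that strips sampling parameters before a request is built. This is a minimal sketch, not any provider's SDK: `SAMPLING_KEYS`, `ALLOWED_SAMPLING`, and `sanitize_params` are hypothetical names, the dictionary keys are the row labels from the table rather than real API model IDs, and `max_tokens` is only a placeholder for a non-sampling parameter. The entries mirror the table as written, so they go stale whenever provider documentation changes.

```python
from typing import Any

SAMPLING_KEYS = frozenset({"temperature", "top_p", "top_k"})

# Sampling parameters each family may keep (hypothetical table).
# Keys are the table's row labels, not real API model IDs.
ALLOWED_SAMPLING: dict[str, frozenset[str]] = {
    "Claude Opus 4.8": frozenset(),            # non-default values can error
    "Gemini 3.x": frozenset(),                 # use thinking_level instead
    "Kimi K2.7 / K2.6": frozenset(),           # temperature is not modifiable
    "DeepSeek": frozenset({"temperature"}),    # documented by use case
}


def sanitize_params(family: str, params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params without disallowed sampling keys.

    Families missing from the table fall back to the default rule:
    omit every sampling key.
    """
    allowed = ALLOWED_SAMPLING.get(family, frozenset())
    return {
        key: value
        for key, value in params.items()
        if key not in SAMPLING_KEYS or key in allowed
    }


request = {"temperature": 0.2, "top_p": 0.9, "max_tokens": 1024}
print(sanitize_params("Gemini 3.x", request))  # {'max_tokens': 1024}
print(sanitize_params("DeepSeek", request))    # {'temperature': 0.2, 'max_tokens': 1024}
```

Returning a filtered copy instead of mutating the input keeps the original request available for logging, which helps when checking why a parameter never reached the provider.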

Tip

If a compatible model forces you to choose a temperature and the provider does not give task-specific guidance, start at 1.0. Drop lower only for mechanical extraction or classification where repeatability matters.
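
The tip reduces to a short decision function. This is a sketch under assumptions: `pick_temperature`, the task labels, and the 0.2 value for mechanical tasks are illustrative choices, not provider guidance. The text above only says to start at 1.0 and go lower when repeatability matters.

```python
from typing import Optional

# Task labels are hypothetical; adapt them to your own workflow.
MECHANICAL_TASKS = {"extraction", "classification"}


def pick_temperature(task: str, provider_guidance: Optional[float] = None) -> float:
    """Choose a temperature only when the model requires one."""
    if provider_guidance is not None:
        return provider_guidance   # task-specific provider guidance wins
    if task in MECHANICAL_TASKS:
        return 0.2                 # illustrative low value for repeatability
    return 1.0                     # fallback starting point from the tip


print(pick_temperature("critique"))        # 1.0
print(pick_temperature("classification"))  # 0.2
```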


Strategic Model Usage

Use cheap or free access paths to learn model behavior against your own corpus.

  • Start with Google AI Studio, Cherry Studio, OpenCode, or OpenRouter.
  • Run the same prompt across at least three model families.
  • Record failure modes, not just best answers.
  • Promote a model only after it handles your real material.
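
The steps above can be sketched as a small comparison harness. Everything here is illustrative: `TrialRecord`, `run_comparison`, and `find_failures` are hypothetical names, the stub callables stand in for whatever client you use for each provider, and the toy checks should be replaced with checks that matter for your own material.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TrialRecord:
    family: str
    prompt: str
    output: str
    failure_modes: list[str]


def run_comparison(
    prompt: str,
    models: dict[str, Callable[[str], str]],
    find_failures: Callable[[str], list[str]],
    min_families: int = 3,
) -> list[TrialRecord]:
    """Run one prompt through every model and record what went wrong."""
    if len(models) < min_families:
        raise ValueError(f"need at least {min_families} model families")
    records = []
    for family, call_model in models.items():
        try:
            output = call_model(prompt)
            failures = find_failures(output)
        except Exception as exc:  # a failed request is itself a failure mode
            output, failures = "", [f"request failed: {exc}"]
        records.append(TrialRecord(family, prompt, output, failures))
    return records


def find_failures(output: str) -> list[str]:
    """Toy checks; replace with checks suited to your corpus."""
    failures = []
    if not output.strip():
        failures.append("empty output")
    if "[source]" not in output:
        failures.append("no source marker")
    return failures


def broken_model(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


# Stub callables stand in for real provider clients (assumption).
models = {
    "family_a": lambda prompt: "Claim with support [source]",
    "family_b": lambda prompt: "Claim without support",
    "family_c": broken_model,
}

records = run_comparison("Summarize the debate.", models, find_failures)
for record in records:
    print(record.family, record.failure_modes)
```

Keeping the failure list on every record, including empty lists, makes it possible to compare families on how they fail rather than only on their best output.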

Free and Low-Cost Options

Google AI Studio remains the easiest free/low-cost backup for Gemini API work, but limits change often. Check Gemini rate limits before relying on a quota for teaching or batch work.

Other cost-management paths:

  • Use OpenCode or Cherry Studio to compare providers before committing to a long run.
  • Use DeepSeek or Qwen for high-volume screening when quality is sufficient.
  • Use local models when privacy matters more than frontier quality.
  • Reserve expensive models for the final synthesis, argument, or validation pass.
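
The screening-then-synthesis pattern in this list can be sketched as a two-stage pipeline. This is an assumption-laden illustration: `screen_then_synthesize` and both callables are hypothetical, and the lambdas below replace real model calls so the example runs offline.

```python
from typing import Callable


def screen_then_synthesize(
    documents: list[str],
    cheap_screen: Callable[[str], bool],
    expensive_synthesize: Callable[[list[str]], str],
) -> tuple[str, dict[str, int]]:
    """Screen every document cheaply, then make at most one expensive call."""
    kept = [doc for doc in documents if cheap_screen(doc)]
    calls = {"cheap": len(documents), "expensive": 0}
    if not kept:
        return "", calls  # nothing relevant, so skip the expensive pass
    calls["expensive"] = 1
    return expensive_synthesize(kept), calls


docs = ["methods: survey design", "unrelated press release", "methods: interviews"]
summary, calls = screen_then_synthesize(
    docs,
    cheap_screen=lambda doc: doc.startswith("methods"),
    expensive_synthesize=lambda kept: f"synthesis of {len(kept)} documents",
)
print(summary)  # synthesis of 2 documents
print(calls)    # {'cheap': 3, 'expensive': 1}
```

Returning the call counts alongside the result makes it easy to see how much of a run actually reached the expensive model.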

Next Step

The reference above tells you what is available. The companion page tells you how to test those choices against your own materials: AI Model Discovery Protocol.