EP08 intermediate

Running Claude Code with Alternative LLM Backends

Swap the model behind Claude Code using three environment variables. Compatible providers, cost math, and when cheap is actually expensive.

Claude Code ships wired to Anthropic’s API. Opus, Sonnet, Haiku — that’s your menu. But the wiring isn’t soldered. It’s pluggable. Three environment variables let you point Claude Code at any API that speaks the Anthropic protocol.

I’ve been running alternative backends for prototyping and bulk tasks since March. Here’s what works, what breaks, and when to bother.

The Three Environment Variables

export ANTHROPIC_BASE_URL="https://your-provider.com/v1"
export ANTHROPIC_API_KEY="your-api-key-here"
export ANTHROPIC_MODEL="provider-model-name"

That’s it. Claude Code reads these at startup and routes all requests to whatever endpoint you specify. No config files to edit, no plugins to install. Set them in your shell and launch.

ANTHROPIC_BASE_URL — The API endpoint. Must implement the Anthropic Messages API format (/v1/messages). Replace the default https://api.anthropic.com with your provider’s URL.

ANTHROPIC_API_KEY — The authentication key for your alternative provider. This replaces your Anthropic API key for the session.

ANTHROPIC_MODEL — The model identifier your provider expects. Each provider has its own naming scheme.

You can also set these permanently in ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-provider.com/v1",
    "ANTHROPIC_API_KEY": "your-api-key-here",
    "ANTHROPIC_MODEL": "provider-model-name"
  }
}

I don’t recommend the permanent approach for most people. You’ll want to switch between Anthropic and alternatives depending on the task. Shell exports give you that flexibility. I keep a few aliases in my .zshrc:

alias claude-cheap='ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1" \
  ANTHROPIC_API_KEY="sk-or-xxx" \
  ANTHROPIC_MODEL="anthropic/claude-sonnet-4" \
  claude'

alias claude-default='claude'  # Uses native Anthropic API

Compatible Providers

Any provider that implements the Anthropic Messages API format works. In practice, I’ve tested three:

OpenRouter

The most straightforward option. OpenRouter aggregates 200+ models behind a unified API, and they support the Anthropic message format.

export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"
export ANTHROPIC_API_KEY="sk-or-v1-your-key"
export ANTHROPIC_MODEL="anthropic/claude-sonnet-4"

OpenRouter passes your request to Anthropic’s API but applies their own pricing. Sometimes cheaper (they negotiate volume discounts), sometimes not. The real value: access to non-Anthropic models through the same interface. You could run Claude Code’s orchestration layer on top of GPT-4o or Gemini, though compatibility gets shaky with non-Anthropic models.

Amazon Bedrock

If your company already uses AWS, Bedrock lets you run Claude models without a separate Anthropic contract. The pricing is slightly different (per-token billing through AWS), and you get AWS’s compliance and data residency guarantees.

export ANTHROPIC_BASE_URL="https://bedrock-runtime.us-east-1.amazonaws.com"
export ANTHROPIC_API_KEY="your-aws-credentials"
export ANTHROPIC_MODEL="anthropic.claude-sonnet-4-20250514-v1:0"

Bedrock has one limitation I’ve hit: extended thinking (the internal reasoning Claude does before responding) sometimes behaves differently. Most tasks work fine. Complex multi-step tool use occasionally stutters.

Google Vertex AI

Similar to Bedrock but on GCP. Same Claude models, different billing wrapper.

export ANTHROPIC_BASE_URL="https://us-east5-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT/locations/us-east5/publishers/anthropic/models"
export ANTHROPIC_API_KEY="your-gcp-token"
export ANTHROPIC_MODEL="claude-sonnet-4@20250514"

The Cost Math

Anthropic’s direct pricing (as of early 2026):

ModelInput (per MTok)Output (per MTok)
Opus 4$15$75
Sonnet 4$3$15
Haiku 4$0.80$4

A typical coding session runs 50K-200K input tokens and 10K-50K output tokens. With Sonnet, that’s roughly $0.15 to $0.90 per session. Opus pushes that to $0.75 to $3.75.

OpenRouter sometimes marks up 5-10% on top of Anthropic’s base prices. Sometimes they match. Check their pricing page — it fluctuates.

The real savings come from model substitution. If you can tolerate Haiku-level quality for a task, you save 80% versus Sonnet. The question is whether the quality drop matters for what you’re doing.

What Breaks

I want to be honest about the trade-offs. Switching backends isn’t free.

Extended thinking. Anthropic’s extended thinking feature (where the model reasons internally before responding) is tightly coupled to their API. Some providers pass it through correctly. Others silently drop it. When extended thinking fails, you’ll notice: complex multi-step tasks that normally work start producing shallow or confused output.

Tool use edge cases. Claude Code’s tool system (file editing, bash execution, search) works through structured tool calls. Most providers handle simple tool calls fine. Parallel tool calls and deeply nested tool chains sometimes break. I’ve seen OpenRouter occasionally mangle the JSON schema for complex Edit operations.

Streaming behavior. Claude Code streams responses token-by-token. Alternative providers sometimes buffer chunks differently. The user experience might feel choppier — longer pauses between updates, then a burst of text. Functionally identical, but noticeably different to use.

Rate limits. Every provider has different rate limits. Anthropic’s limits are calibrated for Claude Code’s usage patterns. Third-party providers might throttle you in the middle of a complex task. Nothing more frustrating than a 30-minute refactoring session interrupted by a 429 error.

When to Switch

Good reasons to use an alternative backend:

  • Prototyping and exploration. You’re trying ideas, not shipping code. Quality dips don’t matter because you’re iterating fast and throwing most output away.
  • Bulk processing. Renaming 200 files, adding license headers, formatting markdown across a repo. Mechanical tasks where Haiku performs as well as Opus.
  • Cost-sensitive projects. Personal projects, side projects, learning exercises. A 50% cost reduction matters when it’s your credit card.
  • Corporate compliance. Your company mandates that all API calls go through AWS or GCP. Bedrock and Vertex solve this without giving up Claude Code.

Bad reasons to switch:

  • Production code. The 5% quality drop on an alternative backend compounds across a large codebase. A subtle bug that Opus would have caught but Haiku missed costs more to debug than you saved on tokens.
  • Multi-agent orchestration. When you’re running subagents (EP05 covers this), each agent’s tool calls must work flawlessly. One broken JSON schema in a 4-agent pipeline wastes the entire run.
  • Anything involving extended thinking. If a task benefits from deep reasoning (architectural decisions, complex debugging, security analysis), stick with Anthropic’s native API. The thinking quality on pass-through providers is unpredictable.

My Setup

I keep Anthropic as my default for real work. For bulk tasks and experimentation, I switch to OpenRouter with a shell alias. It takes two seconds and saves 20-40% on those sessions.

The switching cost is zero. The mental model is simple: serious work gets the real thing, casual work gets the budget option. I haven’t found a scenario where the alternative backend was better. But I’ve found plenty where it was good enough.

# Quick switch in any terminal
claude-cheap   # Launches with OpenRouter backend
claude         # Launches with Anthropic (default)

If you’re spending more than $50/month on Claude Code, setting up an alternative backend for your low-stakes work pays for itself in the first week. If you’re under $50/month, the savings probably aren’t worth the mental overhead of remembering which backend you’re on.

Start with OpenRouter. It’s the easiest to set up and the most broadly compatible. Graduate to Bedrock or Vertex if your infrastructure demands it.