When does deepseek-chat get deprecated?

deepseek-chat and deepseek-reasoner are deprecated on 2026-07-24 15:59 UTC (about 19 days from today, 2026-07-05). After this date, any code using these names will fail. Switch to deepseek-v4-flash (non-thinking mode replacement) or deepseek-v4-pro (for higher capability) before the deadline.

Does the DeepSeek V4 base URL change?

No. The base URL stays the same. OpenAI format: https://api.deepseek.com. Anthropic format: https://api.deepseek.com/anthropic. You only need to change the model parameter from deepseek-chat or deepseek-reasoner to deepseek-v4-flash or deepseek-v4-pro.

How do I enable thinking mode in V4?

OpenAI SDK: pass extra_body={"thinking": {"type": "enabled"}} together with reasoning_effort set to "high" or "max". Anthropic SDK: pass thinking={"type": "enabled"} together with output_config={"effort": "high"} (or "max" in place of "high"). Thinking mode is enabled by default; reasoning_effort controls intensity.
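As a minimal sketch of the parameters described above, the two helpers below build the extra keyword arguments for each interface format. The helper names are hypothetical and are not part of either SDK; only the parameter names and the "high" / "max" values come from this guide.

```python
from typing import Any

ALLOWED_EFFORTS = ("high", "max")  # the two values named in this guide


def _check_effort(effort: str) -> None:
    """Reject values this guide does not mention."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")


def openai_thinking_kwargs(effort: str = "high") -> dict[str, Any]:
    """Extra keyword arguments for the OpenAI-format request (hypothetical helper)."""
    _check_effort(effort)
    return {
        "reasoning_effort": effort,
        "extra_body": {"thinking": {"type": "enabled"}},
    }


def anthropic_thinking_kwargs(effort: str = "high") -> dict[str, Any]:
    """Extra keyword arguments for the Anthropic-format request (hypothetical helper)."""
    _check_effort(effort)
    return {
        "thinking": {"type": "enabled"},
        "output_config": {"effort": effort},
    }
```

With the OpenAI SDK, the result could be unpacked into the call, for example client.chat.completions.create(model="deepseek-v4-flash", messages=messages, **openai_thinking_kwargs("max")).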

DeepSeek V3 to V4 Migration Guide: deepseek-chat Deprecated July 24, 2026

DeepSeek V3 → V4 Migration Guide

deepseek-chat and deepseek-reasoner will be deprecated on 2026-07-24 15:59 UTC. This guide walks through migrating to deepseek-v4-flash and deepseek-v4-pro.
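To keep track of the remaining time, the deadline can be turned into a small check. This is only a sketch: the function name is hypothetical, and the timestamp is the one stated in this guide.

```python
from datetime import datetime, timezone

# Deprecation instant as stated in this guide.
DEPRECATION_UTC = datetime(2026, 7, 24, 15, 59, tzinfo=timezone.utc)


def days_until_deprecation(now: datetime) -> int:
    """Whole days left before the legacy names stop working (0 once passed)."""
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    remaining = DEPRECATION_UTC - now
    return max(remaining.days, 0)


if __name__ == "__main__":
    print(days_until_deprecation(datetime.now(timezone.utc)))
```

A check like this could run in CI and fail the build while legacy model names are still present in the codebase.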

Old name (deprecated)	New name	Capability mapping
`deepseek-chat`	`deepseek-v4-flash`	Non-thinking mode (default)
`deepseek-reasoner`	`deepseek-v4-flash` (with thinking enabled)	Thinking mode (add extra_body)
`deepseek-chat` (high-load)	`deepseek-v4-pro`	Upgrade to Pro for flagship performance
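The mapping in the table can be applied mechanically to existing request arguments. The following is a minimal sketch, assuming requests are built as keyword-argument dictionaries; the helper name is hypothetical, and the upgrade to deepseek-v4-pro is left as a manual choice because it depends on workload.

```python
from typing import Any

# Mapping taken from the table above.
LEGACY_TO_V4: dict[str, dict[str, Any]] = {
    "deepseek-chat": {"model": "deepseek-v4-flash"},
    "deepseek-reasoner": {
        "model": "deepseek-v4-flash",
        "extra_body": {"thinking": {"type": "enabled"}},
    },
}


def migrate_request_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the request kwargs with a legacy model name replaced."""
    updated = dict(kwargs)
    replacement = LEGACY_TO_V4.get(kwargs.get("model"))
    if replacement is None:
        return updated  # already a V4 name, or unknown: leave untouched
    updated["model"] = replacement["model"]
    if "extra_body" in replacement:
        # Merge so that any extra_body keys the caller already set survive.
        merged = dict(updated.get("extra_body") or {})
        merged.update(replacement["extra_body"])
        updated["extra_body"] = merged
    return updated
```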

Interface format	Base URL	Notes
OpenAI-compatible	`https://api.deepseek.com`	SDK auto-appends /chat/completions
Anthropic-compatible	`https://api.deepseek.com/anthropic`	SDK auto-appends /v1/messages

```python
# Before migration
from openai import OpenAI

client = OpenAI(
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com",
)
response = client.chat.completions.create(
    model="deepseek-chat",  # ❌ old name
    messages=[{"role": "user", "content": "Hello"}],
)
```

```python
# After migration
from openai import OpenAI

client = OpenAI(
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com",  # unchanged
)
response = client.chat.completions.create(
    model="deepseek-v4-flash",  # ✅ new name
    messages=[{"role": "user", "content": "Hello"}],
)
```

Before migration (if you previously connected to DeepSeek through a third-party relay), Claude Code settings.json:

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_API_KEY": "<your key>"
  }
}
```

After migration, Claude Code uses V4 directly. Set ANTHROPIC_MODEL to deepseek-v4-flash or deepseek-v4-pro:

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_API_KEY": "<your key>",
    "ANTHROPIC_MODEL": "deepseek-v4-flash"
  }
}
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="<DeepSeek API Key>",
    base_url="https://api.deepseek.com",
)
response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
    reasoning_effort="high",  # high / max
    extra_body={"thinking": {"type": "enabled"}},
)

# The reasoning trace is in reasoning_content; the final answer is in content.
reasoning = response.choices[0].message.reasoning_content
answer = response.choices[0].message.content
```

```python
# Recommended: append the whole message object, so reasoning_content is included
# automatically, instead of copying content / reasoning_content / tool_calls by hand.
messages.append(response.choices[0].message)
```
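The difference between the two approaches can be illustrated without calling the API. In this sketch a plain dictionary stands in for response.choices[0].message; the field values are made up for illustration, and the variable names are hypothetical.

```python
from typing import Any

# Stand-in for response.choices[0].message; the field values are made up.
assistant_message: dict[str, Any] = {
    "role": "assistant",
    "content": "9.8 is greater.",
    "reasoning_content": "9.8 equals 9.80, and 9.80 > 9.11.",
}

history: list[dict[str, Any]] = [
    {"role": "user", "content": "9.11 and 9.8, which is greater?"},
]

# Recommended: append the message as a whole, so every field travels with it.
history.append(assistant_message)

# Hand-copied alternative: easy to forget a field such as reasoning_content.
hand_copied = {
    "role": assistant_message["role"],
    "content": assistant_message["content"],
}

# Next user turn in the same conversation.
history.append({"role": "user", "content": "Explain why in one sentence."})
```

Here history[1] still carries reasoning_content, while hand_copied has silently lost it.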

Metric	DeepSeek V3	DeepSeek V4-Flash	DeepSeek V4-Pro
Context window	128K	1M	1M
Max output	8K	384K	384K
MRCR retrieval accuracy	~50%	83.5%	83.5%
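The limits in the table can be used for a rough pre-flight check. This sketch rests on two assumptions that the guide does not confirm: that "K" and "M" mean 1,000 and 1,000,000 tokens, and that output tokens count against the context window. The function name and the "deepseek-v3" key are hypothetical labels.

```python
# Limits copied from the table above (K = 1,000 and M = 1,000,000 are assumptions).
LIMITS = {
    "deepseek-v3": {"context": 128_000, "max_output": 8_000},
    "deepseek-v4-flash": {"context": 1_000_000, "max_output": 384_000},
    "deepseek-v4-pro": {"context": 1_000_000, "max_output": 384_000},
}


def fits(model: str, prompt_tokens: int, output_tokens: int) -> bool:
    """Rough check that a request stays within the listed limits."""
    limits = LIMITS[model]
    within_output = output_tokens <= limits["max_output"]
    # Assumption: prompt and output share the same context window.
    within_context = prompt_tokens + output_tokens <= limits["context"]
    return within_output and within_context
```

For example, a 200,000-token prompt with a 4,000-token answer does not fit the V3 limits but fits either V4 model under these assumptions.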

Model	Input (cache hit)	Input (cache miss)	Output
deepseek-v4-flash	$0.0028 / MTok	$0.14 / MTok	$0.28 / MTok
deepseek-v4-pro	$0.003625 / MTok	$0.435 / MTok	$0.87 / MTok
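As a rough illustration of how these rates combine, the sketch below estimates the cost of a workload from token counts. The prices are copied from the table; the class and function names are hypothetical, and real invoices may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per million tokens (MTok), as listed in the table above."""
    cache_hit: float
    cache_miss: float
    output: float


PRICES = {
    "deepseek-v4-flash": Price(cache_hit=0.0028, cache_miss=0.14, output=0.28),
    "deepseek-v4-pro": Price(cache_hit=0.003625, cache_miss=0.435, output=0.87),
}


def estimate_cost_usd(
    model: str,
    cache_hit_tokens: int,
    cache_miss_tokens: int,
    output_tokens: int,
) -> float:
    """Estimated cost in USD for the given token counts."""
    price = PRICES[model]
    total = (
        cache_hit_tokens * price.cache_hit
        + cache_miss_tokens * price.cache_miss
        + output_tokens * price.output
    )
    return total / 1_000_000
```

For instance, one million cache-miss input tokens plus one million output tokens on deepseek-v4-flash comes to about $0.42 at these rates (0.14 + 0.28).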

Do I need a new API key?

No. V4 uses your existing DeepSeek API key. If you already use deepseek-chat, your balance, key, and access carry over directly.

Will migration cause service downtime?

It should not. Migration only requires changing the model parameter; no base_url, SDK, or network changes are needed. The switch takes effect immediately, so no deployment window is required. If issues arise, you can temporarily roll back to deepseek-chat, which keeps working until 2026-07-24.
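The temporary rollback described above could be wrapped in a small helper. This is a sketch under stated assumptions: send is a hypothetical callable that performs the request for a given model name and raises on failure, and the cutoff is the deprecation time given in this guide.

```python
from datetime import datetime, timezone
from typing import Callable

# Deprecation instant as stated in this guide.
DEPRECATION_UTC = datetime(2026, 7, 24, 15, 59, tzinfo=timezone.utc)


def call_with_rollback(
    send: Callable[[str], str],
    now: datetime,
    primary: str = "deepseek-v4-flash",
    legacy: str = "deepseek-chat",
) -> str:
    """Try the V4 model; before the deadline, retry once with the legacy name."""
    try:
        return send(primary)
    except Exception:
        if now >= DEPRECATION_UTC:
            raise  # the legacy name no longer works, so surface the original error
        return send(legacy)
```

Catching every exception is deliberately broad here; production code would likely narrow it to the SDK's error types and log each fallback so it does not go unnoticed.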

I use vLLM / Ollama for local deployment. Will I be affected?

No. Local deployments are unaffected by the deprecation of model names on DeepSeek's official API; the model name you use depends entirely on your inference server. This guide targets the official DeepSeek API only.

What should V3.2 users watch out for?

V3.2 (Speciale) is a mid-cycle 2025 release and is not on this deprecation list. However, DeepSeek stopped accepting new V3.2 users earlier, so migrating to V4-Flash or V4-Pro soon is recommended in order to get the 1M context window and thinking mode.

Why split Flash and Pro? Which should I use?

Flash (284B / 13B) is the lightweight option: concurrency of 2500, low price, suited to daily Q&A and batch tasks. Pro (1.6T / 49B) is the flagship: concurrency of 500, stronger coding and reasoning, suited to agentic coding and long-chain reasoning. When in doubt, start with Flash, which is enough for most scenarios.

⏰ Key Dates

🔴 2026-07-24 15:59 UTC (legacy API deprecated)

📋 Model Name Mapping

🔗 Base URL Unchanged

💻 Code Migration Examples

Python + OpenAI SDK (most common)

Anthropic SDK (Claude Code / Cursor / etc.)

🧠 Thinking Mode Parameters

⚠️ Temperature params ignored in thinking mode

OpenAI SDK: Enable thinking mode

Anthropic SDK: Thinking intensity

Multi-turn: reasoning_content handling

📏 Context Window: 128K → 1M

💰 New Pricing (July 2026)

✅ Pre-7-24 Migration Checklist

❓ Migration FAQ
