Prompt Caching with Claude

What is Prompt Caching?

Prompt caching lets developers cache frequently used context between API calls, so Claude can draw on more background knowledge and example outputs without re-sending the full prompt at full price on every request.

💾 Now available on the Anthropic API
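In practice, caching works by marking a prefix of the prompt with a `cache_control` block; identical prefixes on later calls are then read from the cache instead of reprocessed. A minimal Python sketch, assuming the `anthropic` SDK and the `prompt-caching-2024-07-31` beta header that gated the feature at launch (the file name and question are placeholders):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical long-form context to reuse across calls; prefixes shorter than
# the model's minimum cacheable length (e.g. 1024 tokens) are not cached.
book = open("pride_and_prejudice.txt").read()

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    # Beta header required while prompt caching was in public beta.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
    system=[
        {"type": "text", "text": "Answer questions about the attached book."},
        {
            "type": "text",
            "text": book,
            # Everything up to and including this block becomes the cached
            # prefix; later calls with the same prefix hit the cache.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Who marries Mr. Darcy?"}],
)

print(response.content[0].text)
# response.usage reports cache_creation_input_tokens on the first call
# and cache_read_input_tokens on subsequent cache hits.
```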

Key Benefits

  • 💰 Reduces costs by up to 90%
  • ⚡ Reduces latency by up to 85% for long prompts
  • 🧠 Provides more context and examples to Claude

When to Use Prompt Caching

  • 🗨️ Conversational agents: Reduce cost and latency for extended conversations (sketched below)
  • 💻 Coding assistants: Improve autocomplete and codebase Q&A
  • 📄 Large document processing: Incorporate complete long-form material, including images
  • 📚 Talk to books and papers: Bring any knowledge base alive for Q&A
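For the conversational-agent case, the same mechanism amortizes a long, static system prompt across every turn of a chat loop. A sketch under the same assumptions as above (the instructions file and sample questions are hypothetical); note that the ephemeral cache expires after a short idle period (five minutes at launch), so it pays off for active sessions:

```python
import anthropic

client = anthropic.Anthropic()

# Hypothetical long system prompt (instructions plus few-shot examples).
# It must exceed the model's minimum cacheable length to be cached at all.
SYSTEM = [
    {
        "type": "text",
        "text": open("agent_instructions.txt").read(),
        "cache_control": {"type": "ephemeral"},  # cache this static prefix
    }
]

history: list[dict] = []

def ask(user_text: str) -> str:
    """One chat turn; every turn after the first reuses the cached prefix."""
    history.append({"role": "user", "content": user_text})
    reply = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
        system=SYSTEM,  # identical prefix each call -> cache read, not write
        messages=history,
    )
    history.append({"role": "assistant", "content": reply.content[0].text})
    return reply.content[0].text

print(ask("Summarize our refund policy."))
print(ask("Draft a reply to a customer asking for one."))
```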

Pricing for Cached Prompts

  • Writing to the cache: 25% more than the base input token price
  • Using cached content: only 10% of the base input token price
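To see where the up-to-90% figure comes from, plug these multipliers into a concrete rate. A quick worked example, assuming the $3 per million input tokens that Claude 3.5 Sonnet listed when caching launched (an assumption; substitute your model's price):

```python
BASE = 3.00                # assumed $ per million input tokens
cache_write = BASE * 1.25  # first request pays a 25% premium -> $3.75
cache_read = BASE * 0.10   # later requests pay 10% of base   -> $0.30

# A prefix cached once and reused n times costs write + n * read,
# versus (n + 1) * BASE without caching.
n = 10
with_cache = cache_write + n * cache_read      # $6.75
without_cache = (n + 1) * BASE                 # $33.00
print(f"savings: {1 - with_cache / without_cache:.0%}")  # ~80%
```

The savings climb toward 90% as the same prefix is reused more times, since each reuse costs only 10% of the base rate.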

Customer Spotlight: Notion


Notion is adding prompt caching to the Claude-powered features behind Notion AI.

"We're excited to use prompt caching to make Notion AI faster and cheaper, all while maintaining state-of-the-art quality."

— Simon Last, Co-founder at Notion

