Prompt caching, now available in public beta on the Anthropic API, enables developers to cache frequently used context between API calls, providing Claude with more background knowledge and example outputs while reducing costs and latency:
- Up to 90% cost reduction
- Up to 85% latency reduction
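In practice, you mark the large, reusable prefix of a prompt with a cache breakpoint, and subsequent requests that share that prefix reuse it instead of reprocessing it. Below is a minimal sketch using the Anthropic Python SDK; `book_text` is a placeholder for your own long context, and the `anthropic-beta` header shown is the one used while the feature is in public beta:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

book_text = "...full text of the book..."  # placeholder: any large, frequently reused context

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=[
        {"type": "text", "text": "Answer questions about the book below."},
        {
            "type": "text",
            "text": book_text,
            # Everything up to this breakpoint is written to the cache and
            # reused by later calls that share the same prompt prefix.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Summarize chapter 3."}],
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)
print(response.content[0].text)
```

The table below shows the impact for several representative workloads.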
| Use case | Latency w/o caching (time to first token) | Latency w/ caching (time to first token) | Cost reduction |
|---|---|---|---|
| Chat with a book | 11.5s | 2.4s (-79%) | -90% |
| Many-shot prompting | 1.6s | 1.1s (-31%) | -86% |
| Multi-turn conversation | ~10s | ~2.5s (-75%) | -53% |
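These savings come from pricing: writing a prefix to the cache is billed at a premium over the base input token price (25% more), while reading cached content costs only a small fraction of it (10% of the base price). You can confirm whether a given call wrote to or read from the cache by inspecting the usage block on the response. A sketch, assuming the `response` object from the call above:

```python
# The usage block separates cached from uncached input tokens: cache writes
# appear on the first call with a given prefix, cache reads on repeat calls.
usage = response.usage
print("tokens written to cache:", getattr(usage, "cache_creation_input_tokens", 0))
print("tokens read from cache: ", getattr(usage, "cache_read_input_tokens", 0))
print("uncached input tokens:  ", usage.input_tokens)
print("output tokens:          ", usage.output_tokens)
```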
Prompt caching pricing is available for Claude 3.5 Sonnet (our most intelligent model), Claude 3 Opus (powerful model for complex tasks), and Claude 3 Haiku (fastest, most cost-effective model); see the pricing page for per-model rates.
Notion is adding prompt caching to Claude-powered features for its AI assistant, Notion AI. This optimization allows for faster and cheaper operations while maintaining high-quality results.
"We're excited to use prompt caching to make Notion AI faster and cheaper, all while maintaining state-of-the-art quality."
— Simon Last, Co-founder at Notion
Explore our documentation and pricing page to start using the prompt caching public beta on the Anthropic API.