
Claude Sonnet 4 has been upgraded and now supports a context window of up to 1 million tokens, but only when it is used through the API. This may change in the future.
This is 5x the previous limit. It also means Claude can now hold more than 75,000 lines of code, or even hundreds of documents, in a single session.
Previously, you had to feed details to Claude in small chunks, and Claude would still lose that context once it hit the limit. With the 1 million token context limit, you can build better apps, and Claude can take in far more of your code than before.
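To get a feel for whether a project would fit under the 1 million token limit, a rough estimate can be sketched as follows. This is a minimal sketch, not an official tool: the 4-characters-per-token ratio is an assumed rule of thumb rather than a real tokenizer, and the helper names `estimate_tokens` and `fits_in_context` are hypothetical.

```python
from pathlib import Path

CONTEXT_LIMIT_TOKENS = 1_000_000  # limit reported in this article
CHARS_PER_TOKEN = 4               # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(root, suffixes=(".py", ".md"), limit=CONTEXT_LIMIT_TOKENS):
    """Return (estimated_tokens, fits) for all matching files under root."""
    total = 0
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # Ignore undecodable bytes so binary-ish files do not abort the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += estimate_tokens(text)
    return total, total <= limit
```

Calling `fits_in_context("path/to/project")` returns the estimated token count and whether it stays within the limit. The real count depends on the model's tokenizer, so treat the result as a ballpark figure only.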
It is worth noting that the 1 million token context limit applies only to Sonnet 4. Opus 4.1 still has the old limit because it is a more expensive model.
Only the API gets the 1 million token context limit
The new context limit is rolling out through the Anthropic API for customers with Tier 4 and custom rate limits, with broader availability rolling out in the coming weeks.
“Long context is also available in Amazon Bedrock and is coming soon to Google Cloud’s Vertex AI,” Anthropic noted.
“With 1M tokens you can: load entire codebases with all dependencies, analyze hundreds of documents at once, and build agents that maintain context across hundreds of tool calls. Pricing adjusts for prompts over 200K tokens, but prompt caching can reduce costs and latency.”
Claude’s mobile and web apps will get the 1 million token context limit at some point in the future.


