Anthropic Quietly Gutted Caching. OpenAI Pulled Study Mode. Nobody Told You.

BluntAI Hot Take: silent nerf season at Anthropic and OpenAI — Silent nerf season: Anthropic cut cache TTL 12×, OpenAI yanked Study Mode, zero public changelog entries.

Pop quiz: when did the AI company you pay $200 a month last update you about a product change that directly cost you money?

UPDATE (2026-04-13): OpenAI’s help center now lists Study Mode as available on all ChatGPT plans globally. Looks like the pull was a staged rollout hiccup, not a permanent cut. Anthropic’s 5-minute cache TTL, meanwhile, is still in effect.

If you can’t answer, you’re not alone. You’re also not paying attention. In the last six weeks, two of the biggest AI companies on Earth quietly gutted features their customers rely on — one by flipping a default that ballooned bills by up to 12× per cache miss, the other by yanking a mode that students and autodidacts genuinely loved — and neither one bothered to write a changelog entry, a blog post, a tweet, or an in-product notification.

They just shipped the degradation and waited to see who’d notice.

Spoiler: people noticed. They’re angry. And the way Anthropic and OpenAI responded tells you everything about where AI pricing is heading in 2026.

Exhibit A: Anthropic cut prompt cache TTL from 60 minutes to 5 in the dark

On March 6, 2026, Anthropic changed the default time-to-live for Claude’s prompt cache from 1 hour to 5 minutes. They did this without a changelog entry. Without a status update. Without a developer forum post. The change landed, the metrics shifted, and developers started getting bills that looked like a math error.

GitHub issue #46829 showing Anthropic cache TTL regression closed as not planned — GitHub issue #46829 — filed April 12, 2026, closed as **NOT_PLANNED**.

A developer named seanGSISG filed GitHub issue #46829 on April 12 after analyzing 119,866 API calls across two machines from January 11 to April 11. The data is surgical:

February 1 – March 5: 1h TTL is consistently used. Zero tokens written to the 5-minute tier.
March 8 – April 11: 5m TTL becomes dominant. By late March, 93% of tokens are on the 5-minute tier.

The regression, he concluded, happened on or around March 6.

Here’s why that matters in dollars: a 5-minute cache write costs 12.5× more per token than a cache read — $3.75–$6.25 per million tokens versus $0.30–$0.50. If your coding session pauses for more than 5 minutes — to go make coffee, take a phone call, or, you know, think — the cache expires. On the next turn Claude has to re-upload the context as a fresh cache_creation at the expensive write rate instead of pulling from cheap reads.

seanGSISG’s own receipts, for just his two machines:

Claude Sonnet 4.6: $949.08 in overpayments, 17.1% waste
Claude Opus 4.6: $1,581.80 in overpayments, 17.1% waste

That is one guy. Extrapolate across every company running Claude Code at scale and you’re looking at a silent eight-figure transfer from customers to Anthropic, executed in the month after “ongoing cache optimization” became too boring to mention on the status page.

“Intentional” is not “communicated”

Here’s the part that really gets me.

Jarred Sumner from Anthropic responding that the cache change was intentional — Anthropic’s response, courtesy of Jarred Sumner: intentional, not a bug, closing.

When Jarred Sumner from Anthropic finally responded in that GitHub issue, his answer was essentially: the change was intentional, not a bug. It was part of ongoing cache optimization work. He then closed the issue as NOT_PLANNED.

Sumner’s technical defense — that 1-hour cache writes cost roughly 2× more than 5-minute writes, so forcing 1h everywhere isn’t universally cheaper — is actually correct for certain workloads. One-shot queries with no cache reuse would be more expensive under a 1h default. Fine. Fair point.

But that is not the story. The story is:

Anthropic changed a production default in a way that measurably moved customer bills.
They did it with zero customer-facing communication.
When a customer showed up with surgically detailed evidence, the reply was “intentional, closing this.”

You can make an argument that the new default is better for the average workload. I’m not even contesting that. I’m contesting that a company charging enterprise prices for production AI infrastructure does not get to A/B test your wallet in the dark.

Exhibit B: OpenAI yanked Study Mode from ChatGPT Plus

Hacker News thread: OpenAI silently removed Study Mode from ChatGPT — Hacker News: 170 points, 72 comments in 21 hours. Users noticed.

Cue the other shoe.

On April 12 a user named smokel posted on Hacker News: “Tell HN: OpenAI silently removed Study Mode from ChatGPT.” Short post. Bigger signal. In the thread:

Plus and Pro subscribers report the Study Mode toggle is gone from their UI.
The feature still works for Edu-plan users.
OpenAI’s own release notes page says nothing.
The Study Mode FAQ still claims the feature is available globally on all plans.

Users liked Study Mode for a specific reason. As one commenter put it:

Teaching the user how to solve problems instead of solving them outright.

Which is the opposite of how ChatGPT is usually optimized. Study Mode was a genuinely different product, aimed at the subset of users who wanted to learn rather than be handed an answer. And, per multiple people in the thread, the whole thing was “just a system prompt” under the hood. Which means its removal wasn’t a technical decision. It was a strategic one.

The most likely reason? Retention. Study Mode generates longer, slower sessions where the model explicitly refuses to just spit out the answer. Great for learning, terrible for the kind of engagement metrics that make boards happy. So OpenAI quietly clipped it off the consumer tiers and left it on the one plan where they can bill institutions regardless of session quality.

Fine. It’s their product. You know what’s not fine?

Lying about it in their own documentation. As of the time I’m writing this, the OpenAI Help Center still tells Plus users they have access to Study Mode. They don’t. If they click the toggle, the toggle isn’t there.

The new silent-nerf playbook

Stack these two stories next to each other and the playbook is obvious:

Ship the change.
Don’t tell anyone.
Wait to see who notices.
If a small community complains, shrug.
If it blows up on Hacker News, reluctantly engage.
If it really blows up, refund the loudest complainers and call it handled.

This worked in the Web 2.0 era because nobody was paying meaningful money for anything. You were the product. If Twitter removed a feature, fine — it was free, you took it.

The AI era is different. Developers are paying $20, $200, $2,000 a month for subscriptions they use for production work. When Anthropic silently 12×’s the cost of the same workload, that hits a P&L. When OpenAI quietly removes a mode a teacher has been using with her students, that hits a syllabus.

Pretending the old playbook still applies is either laziness or contempt. Pick one.

What “transparent” actually looks like

I’ll show you who does it right. Last quarter, AWS shipped a 1-hour caching option for Bedrock and announced it with a date, pricing, and a changelog entry. Google Cloud did the same for Vertex AI’s prompt caching. Stripe has an entire API changelog you can subscribe to. GitHub publishes every breaking change with weeks of advance notice.

Anthropic has a changelog page. Go look at it. The March 6 cache TTL change is not there.

OpenAI has a release notes page too. The Study Mode removal is not there either.

This is not a technical capability gap. These companies know how to publish changelogs. They chose not to publish these changes, because publishing them would have triggered the complaints they hoped to avoid.

What to do about it

If you’re running production workloads on Claude or ChatGPT, do three things this week:

1. Monitor your own bills. Not Anthropic’s dashboard — it’s reporting the thing that just got quietly changed. Pull the raw API invoices into a spreadsheet and watch the cache_write versus cache_read ratio yourself.

2. Pin your cache TTL explicitly. Claude’s API supports specifying TTL per request. Stop relying on the default. If you want 1h, ask for 1h. Anthropic documents this, it just chose to swap the default under you.

3. Assume the default is worse next week. That’s the lesson.

And to Anthropic and OpenAI specifically: if you want developers to treat your product as production infrastructure, you have to treat it as production infrastructure yourself. That means a changelog. That means advance notice. It means the word “intentional” does not double as the word “communicated.”

Until that happens, we’re all pricing in a “silent nerf tax” every time we forecast a quarter. Hope that’s worth it to you.