Anthropic Announces Claude Subscribers Must Now Pay Extra to Use OpenClaw (venturebeat.com)
Anthropic's making a big and sudden change — and connecting its Claude AI to third-party agentic tools "is about to get a lot more expensive," writes the Verge:
Beginning April 4th at 3PM ET, users will "no longer be able to use your Claude subscription limits for third-party harnesses including OpenClaw," according to an email sent to users on Friday evening. Instead, if users want to use OpenClaw with Claude, they'll have to use a "pay-as-you-go option" that will be billed separately from their Claude subscription.
Anthropic's announcement added that these extra usage bundles are "now available at a discount." Users can also try Anthropic's API, notes VentureBeat, "which charges for every token of usage rather than allowing for open-ended usage up to certain limits, as the Pro and Max plans have allowed so far." The technical reality, according to Anthropic, is that its first-party tools like Claude Code, its AI vibe coding harness, and Claude Cowork, its business app interfacing and control tool, are built to maximize "prompt cache hit rates" — reusing previously processed text to save on compute. Third-party harnesses like OpenClaw often bypass these efficiencies... [Claude Code creator Boris Cherny explained on X that "I did put up a few PRs to improve prompt cache hit rate for OpenClaw in particular, which should help for folks using it with Claude via API/overages."] Growth marketer Aakash Gupta observed on X that the "all-you-can-eat buffet just closed," noting that a single OpenClaw agent running for one day could burn $1,000 to $5,000 in API costs. "Anthropic was eating that difference on every user who routed through a third-party harness," Gupta wrote. "That's the pace of a company watching its margin evaporate in real time."
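The caching effect Anthropic describes can be illustrated with a toy model (a sketch only — real providers cache at the token level with their own minimum lengths and eviction rules; the "timestamp prefix" harness here is a hypothetical, not a claim about how OpenClaw actually builds prompts). A cache can only reuse the portion of a prompt that exactly matches the prefix of an earlier one, so a harness that keeps its system prompt stable and appends new content at the end gets large hits, while one that injects varying content up front gets almost none:

```python
# Toy model of prefix-based prompt caching (illustrative only).
def cached_prefix_len(prompt: str, cache: list[str]) -> int:
    """Longest prefix of `prompt` shared with any previously seen prompt."""
    best = 0
    for seen in cache:
        n = 0
        for a, b in zip(prompt, seen):
            if a != b:
                break
            n += 1
        best = max(best, n)
    return best

SYSTEM = "You are a helpful coding agent. Tools: read, write, run.\n"

# Cache-friendly harness: stable system prefix, new request appended at the end.
stable = [SYSTEM + "task 1", SYSTEM + "task 2"]
# Hypothetical cache-hostile harness: a varying timestamp precedes everything,
# so consecutive requests share almost no prefix and little is reusable.
unstable = ["[t=1001] " + SYSTEM + "task 1", "[t=1002] " + SYSTEM + "task 2"]

hit_stable = cached_prefix_len(stable[1], stable[:1])
hit_unstable = cached_prefix_len(unstable[1], unstable[:1])
print(hit_stable, hit_unstable)  # large hit vs. a handful of characters
```

Under this toy model, moving the varying content to the end of the prompt is the whole optimization — which is plausibly the kind of change Cherny's PRs made.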
However, Peter Steinberger, the creator of OpenClaw who was recently hired by OpenAI, took a more skeptical view of the "capacity" argument. "Funny how timings match up," Steinberger posted on X. "First they copy some popular features into their closed harness, then they lock out open source." Indeed, Anthropic recently added some of the same capabilities that helped OpenClaw catch on — such as the ability to message agents through external services like Discord and Telegram — to Claude Code...
User @ashen_one, founder of Telaga Charity, voiced a concern likely shared by other small-scale builders: "If I switch both [OpenClaw instances] to an API key or the extra usage you're recommending here, it's going to be far too expensive to make it worth using. I'll probably have to switch over to a different model at this point."
"I know it sucks," Cherny replied. "Fundamentally engineering is about tradeoffs, and one of the things we do to serve a lot of customers is optimize the way subscriptions work to serve as many people as possible with the best mode..." OpenAI appears to be positioning itself as a more "harness-friendly" alternative, potentially using this moment as a customer acquisition channel for disgruntled Claude power users.
By restricting subscription limits to their own "closed harness," Anthropic is asserting control over the UI/UX layer. This allows them to collect telemetry and manage rate limits more granularly, but it risks alienating the power-user community that built the "agentic" ecosystem in the first place. Anthropic's decision is a cold calculation of margins versus growth. As Cherny noted, "Capacity is a resource we manage thoughtfully." In the 2026 AI landscape, the era of subsidized, unlimited compute for third-party automation is over. For the average user on Claude.ai, the experience remains unchanged; for the power users running autonomous offices, the bell has tolled.
I already cancelled my subscription (Score:5, Insightful)
I cancelled my subscription overnight, and I'm using the free credits they gave me to wrap up some things and transition away. I am not going to be locked into someone's walled garden again.
Re:I already cancelled my subscription (Score:5, Insightful)
Right now AI is in the equivalent of a drug dealer's "hook 'em for cheap, then run up the cost once they're addicted" phase. Or the bubble bursts, and any remaining survivors will have survived by billing at a rate that sustains the business, instead of piling debt on debt on debt while claiming a "free tier."
Re: (Score:2)
I ordered 64GB of RAM about an hour ago, and I'm planning on running either Qwen 35B-A3B at 8-bit or 122B-A10B at 3-bit in fully offline mode.
>the actual cost of 'running the AI.'
is a fixed $200 cost (RAM upgrade) + electricity
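The electricity side of that claim is easy to put numbers on. The figures below are assumptions for illustration (a 150 W average draw and $0.15/kWh are not from the thread), but the shape of the result holds across reasonable values:

```python
# Rough monthly electricity cost for a local inference box left on 24/7.
# Assumed numbers (not from the thread): 150 W average draw, $0.15/kWh.
WATTS = 150
PRICE_PER_KWH = 0.15
HOURS_PER_MONTH = 24 * 30

kwh = WATTS / 1000 * HOURS_PER_MONTH      # kWh consumed per month
monthly_power = kwh * PRICE_PER_KWH       # dollars per month
print(f"electricity: ${monthly_power:.2f}/month")
```

Under these assumptions the running cost lands in the same ballpark as a basic subscription tier, with the $200 RAM upgrade as the only one-time outlay.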
Re: (Score:2)
Re: I already cancelled my subscription (Score:3)
It's about 5 tokens/second which is totally fine for an async assistant. 20 tokens/second is about the lower limit for usable in realtime. You can also set it up to use a smaller model for quick questions (what are the next 6 items on my calendar/to-do list?) and drop through to the bigger slower model for harder questions (can you add this feature to my internal ticketing system and redeploy?)
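The small-model/large-model split described above is straightforward to wire up. This is a sketch under stated assumptions: `route` uses a naive keyword-and-length heuristic invented for illustration, and `small_model`/`large_model` are hypothetical stubs standing in for whatever local backends you actually run:

```python
# Sketch of a two-tier local setup: quick lookups go to a fast small model,
# everything else falls through to a slower, more capable one.
QUICK_HINTS = ("calendar", "to-do", "remind", "what are", "list")

def route(query: str) -> str:
    """Naive heuristic: short queries matching a 'quick' hint go small."""
    q = query.lower()
    is_quick = len(q) < 120 and any(h in q for h in QUICK_HINTS)
    return "small" if is_quick else "large"

def ask(query: str) -> str:
    if route(query) == "small":
        return small_model(query)   # e.g. a small model at ~60 tok/s
    return large_model(query)       # e.g. a 35B MoE at ~5 tok/s

# Stub backends so the sketch runs without any model installed.
def small_model(q: str) -> str: return f"[small] {q}"
def large_model(q: str) -> str: return f"[large] {q}"

print(ask("What are the next 6 items on my calendar?"))
print(ask("Add rate limiting to the ticketing service and redeploy."))
```

In practice you'd replace the heuristic with a cheap classifier or let the small model itself decide when to escalate.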
Re: (Score:2)
Re: (Score:2)
Output is about 6 tokens/s with a 16k context window. I'm not having any issues since it went live this afternoon. It's not sparkling like Opus 4.5/6, but it gets the job done.
Re: (Score:2)
Re: (Score:2)
I'm using a $200 used, ~5-year-old (per the eBay listing) HP EliteDesk 805 G6 DM desktop (Ryzen 5 PRO 4650GE 3.3GHz, 32GB RAM, 512GB SSD, WiFi) in CPU mode... you don't need a GPU to run a single-user local LLM, just a bunch of RAM. This isn't 2022 anymore.
Re: (Score:2)
- AI is thinking, please wait...
Re:I already cancelled my subscription (Score:4, Interesting)
I generally send it a voice note via Telegram while driving and then check back in like 1-2 minutes, or it sends me a reminder about something on our shared calendar. It's still faster than texting my buddy about making plans for this weekend or whatever.
Re: (Score:2)
Google Assistant has supported this for like a decade via "hey Google, note to self"
Re: I already cancelled my subscription (Score:2)
Yeah I'm kind of done feeding Google every scrap of information about me for advertising purposes. I switched to private search and email last year for about $100/year and don't miss them at all.
Re: (Score:2)
Re: I already cancelled my subscription (Score:2)
$200 for the memory but how much for the whole system? I have two development systems, both cost me less than 150 Canadian.
Re: (Score:3)
Sorry to rain on your parade, but Qwen is no match for Anthropic's premium models. I've used both for coding relatively easy stuff, and Qwen 3.5 puts lots of bugs even into three-page shell scripts, while Claude's code can often be taken as-is.
Why does this matter so much? The biggest threat against lobsters is "prompt injection", and only top-of-the-line LLMs are moderately resistant to it. Running an OpenClaw install based on an entry-level LLM can be very risky once you give it access to passwords or perso
Re: (Score:2)
Qwen 3.5 is light years ahead of Llama 3 and DeepSeek, but no comparison to Claude Opus 4.6. Sorry. Plus: the full 35B model requires either a massive GPU (in the multi-thousand-dollar range) or at least a lot of RAM (which is currently a bit pricey). Either way, I have the strong impression that OpenClaw will lose quite a few users over this.
Re: I already cancelled my subscription (Score:2)
You are talking about these local models like someone in the 90s saying that you'd always need a mainframe because workstation clients will never be fast enough. In reality the inference you use is readily quantifiable, and less than you think. In ten years the premium model is gonna be doing 500 prompts in the background for every one prompt you type, but the local one will be able to easily do what the premium ones do now. The way I think of a 486 these days... quaint, cute even... is how you'll rem
Re: (Score:2)
I am fully aware that very few years from now we'll be laughing at the models we use today, just as we laugh at the hallucinating mess we admired so much two years ago. GPUs will improve, CPU memory bandwidth will go way up, and we'll have Raspberry Pi-like systems that can do quality inference. I look forward to using each and every one of them.
However: some people want to run lobsters today, and they are mostly left out to dry for now. These folks paid a few dozen dollars per month to perform mundane tasks
Re: (Score:2)
The point goes both ways. They now give you (in their products, not in the API) subsidized prices to build a user base. But the point is not to make you pay ten times more later, because then you'd just switch to a competitor. What they (and investors) are speculating on is that newer models can be smaller with the same (or better) results, and newer hardware is more efficient. If they subsidize their product for, let's say, three years, they can later make a profit at the same price for you, just by not giving the s
Re: I already cancelled my subscription (Score:2)
Well I'm happy paying a lot more than I do now. My productivity is rocketing, so as a freelancer it pays for itself ten times over.
Who did not see this coming (Score:4, Insightful)
Re:Who did not see this coming (Score:4, Interesting)
All the AI co's need to monetize.
This, or it's actually, as the party winds down and you see what your apartment looks like, the AI market seeing just how many *real* customers there might be. Re-think build-out, modify budgets, lock down your IP as if it had real value. Potemkin market. It's just a herd of cattle raging along, and at some point a single crow will dip in and whisper, "You know this trail goes to the packing plant, don't you?"
Let the enshittification begin (Score:5, Insightful)
Re: (Score:2)
So, how much for heated seats [slashdot.org]?
Re: (Score:3)
While I generally share the cynicism and doubt about AI seen here in many Slashdot comments, your comment made me wonder if this is enshittification versus shitting their pants.
When you talk about milking their customers for what they can get, to me that usually connotes that the company is doing well and has little competition, so they can get greedy and get away with it.
However, as I read the article, this sounds more like reality is setting in. After burning money and trying to establish some buzz and
Makes sense (Score:2)
I'm using Claude (and other tools) to help train my specialised local agents...so by the time Claude Code becomes a lot more expensive I'll only need it a fraction of the amount (or transition to another service).
Re: (Score:1)
If by "punish", you mean "stop subsidizing" then yes; I suppose so.
Re: (Score:2)
Re: Makes sense (Score:2)
Literally could not care less.
Why is this surprising? (Score:3)
It should have been clear for a while (it was to me) that "AI" as SaaS at current level and in its current form is drastically far away from sustainable levels. When allegedly $1 spent on Claude costs Anthropic $13 ... no degree in rocket science is required to imagine what will happen. I have told people for some time "enjoy it while it lasts, since it *will* *not* *last*!"
What can you do?
First: learn how to use tokens efficiently. That is an absolute must. Spend money on commercial "AI" only when necessary.
Second: run your own AI. Sure, you will never even come close to commercial models... but do you need to? Plus, as things like TurboQuant or better are developed, running locally keeps improving. Your time learning how to do this locally will not be wasted.
In the end, run hybrid. Get the best rig you can afford locally to accommodate the models you need most of the time, and pay the big boss (only) when you have to.
The more you can isolate yourself from the serfdom the large players like to impose on us, the better.
And yes, I put my money where my mouth is and bought a dedicated LLM machine for my personal setup. You should too, if you have the resources, time, funds, and knowledge ... And if not, consider it a challenge to overcome. IMHO it is well worth it.
Re: (Score:3)
It is devious and evil, but shows a lot of forward thinking.
Anyone who comes out with a user-expandable unified-memory architecture using mostly common components could be a
Cloud computing anybody? (Score:2)
Just like cloud computing.
Hey it's cheaper than hosting it yourself!
???????
Profit!
Inevitable (Score:2)
AI has been running at a big loss to get the users hooked. It was inevitable that prices would start climbing. That process is nowhere near done, running AI is expensive as hell.
Once the market starts reflecting the actual costs, you can bet the cost/benefit will not be nearly as rosy as it looks now. But some customers will already have gotten themselves between a rock and a hard place and will be sucked dry, then discarded. Those "expensive" people that are getting dumped will start looking like a bargain
Another one that cannot get profitable (Score:2)
I guess we are seeing the start of the end of the hype. "Investors", dumb and clueless as they may be, are not freely pouring money into the bottomless pit that LLMs are anymore.
Re: (Score:2)
So you predict an even more immediate and catastrophic crash? Well, maybe.
Re: (Score:2)
Do you think investors are going to just sink money into AI forever without a clear path to profitability? None of the big AI players are profitable - not a single one of them. Your free ChatGPT queries are costing OpenAI money. Even your ChatGPT Plus subscription still does not cover their costs.
Tools like Claude Code and Codex are also priced below what it costs to run them.
These investments are mostly based on the promises of AGI - the path to which remains elusive. How long will investors keep throwing
Re: Another one that cannot get profitable (Score:2)
Local LMs worth it? (Score:2)
For about $3000 USD you can buy an AMD Ryzen AI 395 with 128 GB of integrated RAM, which I'm tempted to do to run coding models. Although it seems to me that 256 GB is more of the sweet spot for local LLMs that can do things at a decent speed. For that size of RAM, the only real game in town is the Mac Studio, which will cost about $10k (and rising). Of course, even $10k is cheaper than a personal assistant. Now, with the true cost of agentic AI starting to fall on the customer, $10k doesn't seem so ridiculous.
Re: (Score:2)
The smallest (and only) open weight model that gets Opus or Sonnet level coding performance is MiniMax M2.5, and you need about 512GB of VRAM for that model (with enough room for input tokens). At 128GB you are looking at Opus 4.0 / Haiku 4.5 level models like Qwen 3.5 122B-A1 at Q4 or Qwen3-Coder-Next 80B-A3B at Q8.
I think it's likely we will have small language models specifically targeting coding at Opus 4.6 quality on 128-256GB of VRAM in the next couple of years, but I don't think we are there yet.
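The memory figures being traded here follow from simple arithmetic: weight memory is roughly parameter count times bits per weight. The sketch below is a back-of-envelope estimator under that assumption; real loaders need extra headroom for the KV cache, activations, and per-layer quantization scales, so true footprints run higher:

```python
# Back-of-envelope weight-memory estimate for quantized local models.
# Ignores KV cache, activations, and quant-scale overhead (all add more).
def weights_gb(params_b: float, bits: int) -> float:
    """GB needed just for the weights of a params_b-billion-param model."""
    return params_b * 1e9 * bits / 8 / 1e9

for name, params, bits in [
    ("122B model @ 4-bit", 122, 4),   # ~61 GB: fits a 128GB machine
    ("80B model @ 8-bit", 80, 8),     # ~80 GB: fits, with little headroom
    ("35B model @ 8-bit", 35, 8),     # ~35 GB
]:
    print(f"{name}: ~{weights_gb(params, bits):.0f} GB")
```

That's why the parent puts a ~500GB-class model out of reach of a 128GB box while the quantized 80-122B models just fit.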
Re: (Score:2)
But yeah, if you want to run large models for a reasonable price, it's the only game in town right now.
Nobody knows what they pay for anyways (Score:2)
Anthropic (and other LLM providers) have an utterly appalling subscription model where they basically let you pay without knowing at all what you get for the payment. There is no information whatsoever anywhere on their web pages, in the pages where they describe plans, nor will they provide it when asked on the support channel: no number of requests, no number of input/output tokens, nothing measurable per unit of time that they would promise to give you. All they tell you is that you get k times more for