Sure it is. They're well aware their product is a money furnace and they'd have to charge users a few orders of magnitude more just to break even, which is obviously not an option. So all that's left is.. convince users to burn tokens harder, so graphs go up, so they can bamboozle more investors into keeping the ship afloat for a bit longer.
If this claim is true (inference is priced below cost), it makes little sense that there are tens of small inference providers on OpenRouter. Where are they getting their investor money? Is the bubble that big?
Incidentally, the hardware they run is known as well. The claim should be easy to check.
It's an option and they are going to do it. Chinese models will be banned and the labs will happily go dollar for dollar in plan price increases. $20 plans won't go away, but usage limits and model access will drive people to $40-$60-$80 plans.
At cell phone plan adoption levels, and cell phone plan costs, the labs are looking at 5-10yr ROI.
That doesn’t mean they also can’t be wasteful. Fact is, Claude and gpt have way too much internal thinking about their system prompts than is needed. Every step they mention something around making sure they do xyz and not doing whatever. Why does it need to say things to itself like “great I have a plan now!” - that’s pure waste.
No, the argument is they want to sell more product to more people, not just more product (to the same people.) Given that a lot of their income is from flat-rate subscriptions, they make money with more people burning tokens rather than just burning more tokens.
After all, "the first hit's free" model doesn't apply to repeat customers ;-)