I am confident that Anthropic makes more from that $20 than the electricity and server costs needed to serve that customer.
Claude Code has rate limits for a reason: I expect they are carefully designed to ensure that the average user doesn't end up losing Anthropic money, and that even extreme heavy users don't cause big enough losses for it to be a problem.
Everything I've heard makes me believe the margins on inference are quite high. The AI labs lose money because of the R&D and training costs, not because they're giving electricity and server operational costs away for free.
How is caching implemented in this scenario? I find it unlikely that two developers are going to ask the same exact question, so at a minimum some work has to be done to figure out “someone’s asked this before, fetch the response out of the cache.” But then the problem is that most questions are peppered with specific context that has to be represented in the response, so there’s really no way to cache that.
From my understanding (which is shaky at best), the cache is about the separate parts of the input context. Once the LLM has read a file, the content of that file is cached (i.e. some representation that the LLM creates for that specific file, though I really have no idea how that works). So the next time you bring that file into the context, directly or indirectly, the LLM doesn't have to do a full pass; it pulls its understanding/representation from the cache and uses that to answer your question or perform the task.
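That matches my mental model too. Here's a minimal sketch of the idea, with all names hypothetical and the cached value standing in for whatever internal state the provider actually stores (typically the attention KV state for the prefix tokens): the cache is keyed on an exact prefix of the prompt, so two developers never need to ask the same question; they only need to share the same opening context (system prompt, file contents, etc.), and only the new tokens after the cached prefix cost a full pass.

```python
import hashlib

# Hypothetical sketch of prompt prefix caching. The cache key is a hash of
# the exact token prefix; the cached value is a placeholder for the model's
# internal state after reading that prefix. Names are illustrative only.
kv_cache = {}

def prefix_key(tokens):
    return hashlib.sha256(" ".join(tokens).encode()).hexdigest()

def encode_prompt(tokens, cached_prefix_len=0):
    # Stand-in for the expensive forward pass: "work" is the number of
    # tokens that actually have to be processed.
    return len(tokens) - cached_prefix_len

def run(prefix_tokens, question_tokens):
    key = prefix_key(prefix_tokens)
    if key in kv_cache:
        # Prefix was processed before: only the new question costs compute.
        work = encode_prompt(prefix_tokens + question_tokens,
                             cached_prefix_len=len(prefix_tokens))
    else:
        work = encode_prompt(prefix_tokens + question_tokens)
        kv_cache[key] = True  # in reality: the stored attention state
    return work

file_ctx = ["contents", "of", "a", "large", "file"] * 200  # 1000 tokens
print(run(file_ctx, ["what", "does", "this", "do?"]))  # full pass: 1004
print(run(file_ctx, ["refactor", "this"]))             # cache hit: 2
```

Note this also explains why the cache works within one developer's session even though no two users ask the same question: repeated turns over the same files share an ever-growing identical prefix.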
I wrote a while ago on here that he should stick to his domain.
I was downvoted big time. Ah, I love it when people provide an example so it can finally be exposed without me having to say anything.
Unfortunately this is a huge problem on here: many people step outside of their domains and, even when the topic seems simple on the surface, post gibberish and completely mangled stuff. How does this benefit the people who get exposed to it?
If you don't know you're wrong but have an itch to polish your ego a bit, then what's stopping you (or them), right?
People form very strong opinions on topics they barely understand. I'd say that since they know little, the opinions come mostly from emotion, which is hardly a good path to objective, deeper knowledge.
Anthropic and OpenAI are both well documented as losing billions of dollars a year because their revenue doesn't cover their R&D and training costs, but that doesn't mean their revenue doesn't cover their inference costs.
Does it matter if they can't ever stop training, though? Like, this argument usually seems to imply that training is a one-off, not an ongoing process. I could save a lot of money if I stopped eating, but it'd be a short-lived experiment.
I'll be convinced they're actually making money when they stop asking for $30 billion funding rounds. None of that money is free! Whoever is giving them that money wants a return on their investment, somehow.
At some point the players will need to reach profitability. Even if they're subsidising it with other revenue, they'll only be willing to do that as long as it drives rising inference revenue.
Once that happens, whoever is left standing can dial back the training investment to whatever their share of inference can bear.
> Once that happens, whoever is left standing can dial back the training investment to whatever their share of inference can bear.
Or, if there's two people left standing, they may compete with each other on price rather than performance and each end up with cloud compute's margins.
Sure, but they will still need to dial it back to a point where they can fund it out of inference at some point. The point is that the fact they can't do that now is irrelevant - it's a game of chicken at the moment, and that might kill some of them, but the game won't last forever.
It matters because as long as they are selling inference for less than it costs to serve they have a potential path to profitability.
Training costs are fixed at whatever billions of dollars per year.
If inference is profitable they might conceivably make a profit if they can build a model that's good enough to sign up vast numbers of paying customers.
If they lose even more money on each new customer they don't have any path to profitability at all.
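The arithmetic behind that argument can be made concrete. These numbers are entirely made up for illustration; the point is only the shape of the function: with a positive per-customer inference margin, a big enough customer base eventually covers a fixed training bill, while with a negative margin every new customer deepens the loss.

```python
# Illustrative only: made-up numbers showing why the sign of the
# per-customer inference margin decides whether a path to profitability
# exists at all, given a large fixed annual training cost.
def annual_profit(customers, revenue_per_customer,
                  inference_cost_per_customer, fixed_training_cost):
    margin = revenue_per_customer - inference_cost_per_customer
    return customers * margin - fixed_training_cost

# Positive margin: a loss at 10M customers, profitable at 50M.
print(annual_profit(10_000_000, 240, 100, 3_000_000_000))  # -1.6B
print(annual_profit(50_000_000, 240, 100, 3_000_000_000))  # +4.0B

# Negative margin: scaling up only makes the loss bigger.
print(annual_profit(50_000_000, 240, 300, 3_000_000_000))  # -6.0B
```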
But only if you ignore all the other market participants, right? How can we ever reach a point where the smaller competitors, e.g. Chinese labs perpetually trailing ~9 months behind SOTA at a tiny fraction of the cost, stop existing?
I mean, we just have to look at old discussions about Uber for the exact same arguments. Uber, after all these years, is still at a negative 10% lifetime ROI, and that company doesn't even have to meaningfully invest in hardware.
IMO this will probably develop like the railroad boom in the first half of the 19th century: All the AI-only first movers like OpenAI and Anthropic will go bust, just like most railroad companies who laid the tracks, because they can't escape the training treadmill. But the tech itself will stay, and even become a meaningful productivity booster over the next decades.
I am also thinking long term: where is the moat if this inevitably leads to price competition? It's not like a Microsoft product suite that your whole company is tied into in multiple ways. One LLM can be swapped for another quite easily.
I'm curious just because you're well known in this space -- have you read Ed Zitron's work on the bubble, and if so what did you think of it? I'm somewhat in agreement with him that the financials of this just can't be reconciled, at least for OpenAI and Anthropic. But I also know that's not my field. I find his arguments a lot more convincing than the people just saying "ahh it'll work itself out" though.
My problem with Ed is that he's established a very firm position that LLMs are mostly useless and the business is a big scam, which makes it difficult to evaluate his reporting.
He often gathers good information but his analysis of that information appears to be heavily influenced by the conclusions he's already trying to reach.
I do pay attention to him but I'd like to see similar conclusions from other analysts against the same data before I treat them as robust.
I don't personally have the knowledge or experience of company finance to be able to confidently evaluate his findings myself!
At least until they are running out of customers. And/or societies with mass-unemployment destabilize to a degree that is not conducive for capitalists' operations.
Models are fixed. They do not learn post-training.
Which means that training needs to be ongoing. So the revenue covers the inference; so what? All that means is that it doesn't cover your costs and you're operating at a loss, because it doesn't cover the training that you can't stop doing either.
No, they are not. They are increasing exponentially, because of the exponential scaling needed for linear gains; otherwise they'd fall behind their competition.
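One way to make the "exponential scaling for linear gain" intuition concrete is a power-law scaling assumption, where loss falls as a small power of compute. The constants below are made up, not fitted to any real model; the shape is what matters.

```python
# Illustrative sketch of "exponential scaling for linear gain", assuming
# a power-law scaling law L(C) = a * C**(-alpha). Constants are made up.
a, alpha = 10.0, 0.05

def compute_needed(loss):
    # Invert L = a * C**(-alpha)  =>  C = (a / L) ** (1 / alpha)
    return (a / loss) ** (1 / alpha)

# Each fixed *percentage* reduction in loss multiplies the required
# compute by the same constant factor, so compute grows exponentially
# while the quality metric improves only steadily.
for loss in [2.0, 1.8, 1.62, 1.458]:  # successive 10% reductions
    print(f"loss {loss:.3f} -> relative compute {compute_needed(loss):.3g}")
```

Under this (assumed) law, the compute multiplier between successive 10% loss reductions is constant, which is exactly the treadmill being described: staying ahead on quality means compute budgets that compound.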
Fixed cost here means that the training costs stay the same no matter how many customers you have - unlike serving costs which have to increase to serve more people.
I used the word "fixed" there to indicate that the cost of training is unaffected by how many users you have, unlike the cost of serving the model which increases as your usage increases.
>make revenue from that $20 than the electricity and server costs needed to serve that customer
Seems like a pretty dumb take. It’s like saying it only takes $X in electricity and raw materials to produce a widget that I sell for $Y. Since $Y is bigger than $X, I’m making money! Just ignore that I have to pay people to work the lines. Ignore that I had to pay huge amounts to build the factory. Ignore every other cost.
They can’t just fire everyone and stop training new models.