At $3/$15, Sonnet is roughly an order of magnitude more expensive than DeepSeek at $0.435/$0.87 (about 7x on input and 17x on output). DeepSeek is also very good at caching, and its cached input is priced at $0.003625, so it's very cheap to use. So, if they're equal in performance, DeepSeek is about ten times better value.
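For what it's worth, here is a rough sketch of the arithmetic behind that comparison. It assumes the quoted prices are USD per million tokens, ignores cache hits entirely, and uses a made-up 5:1 input-to-output token mix, so treat the blended number as illustrative only.

```python
# Prices as quoted in the comment; the "per million tokens" unit is an assumption.
SONNET = {"input": 3.00, "output": 15.00}
DEEPSEEK = {"input": 0.435, "output": 0.87}


def cost(prices, input_tokens, output_tokens):
    """USD cost of a workload, assuming no cache hits at all."""
    total = prices["input"] * input_tokens + prices["output"] * output_tokens
    return total / 1_000_000


def ratio(input_tokens, output_tokens):
    """How many times more expensive Sonnet is than DeepSeek for this mix."""
    return cost(SONNET, input_tokens, output_tokens) / cost(
        DEEPSEEK, input_tokens, output_tokens
    )


input_ratio = SONNET["input"] / DEEPSEEK["input"]     # about 6.9
output_ratio = SONNET["output"] / DEEPSEEK["output"]  # about 17.2

# Hypothetical workload: 1M input tokens, 200k output tokens.
blended_ratio = ratio(1_000_000, 200_000)             # about 9.9

print(f"input: {input_ratio:.1f}x, output: {output_ratio:.1f}x, "
      f"blended: {blended_ratio:.1f}x")
```

The ratio depends on the token mix: output-heavy workloads push it toward 17x, input-heavy ones toward 7x, and cache hits on the DeepSeek side would widen the gap further.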
But from what I can tell, DeepSeek is better than Sonnet, though I agree it is not at the level of current Opus or GPT 5.5 (I think it probably beats Gemini Pro 3.1, though). I use the best model I can for code, because the cost of weaker performance is more than the $100/month I pay for Claude Opus, but it's worth knowing there are very cheap, very good models for the stuff I want to do that isn't Claude Code.
There are so many variables, from harnesses to tasks, that it's very hard to put the models in a pecking order unless one beats another in virtually every task (as with Opus vs DeepSeek).
I always thought that stuffing too much into an LLM context window was a lot like overloading a burrito. Keep cramming stuff in and eventually the tortilla gives out, and everything you added since quietly spills out the bottom.
Anyway, this agent probably has the structural integrity of a fat burrito held from one corner :)
Comments on GitHub were usually horrible, but the AI stuff brought extra divisiveness. yt-dlp stops supporting Bun because they call the Rust rewrite a risk -> hate comments. rsync fixes security issues and gets some help from AI -> someone finds a bug and... hate comments. Poor maintainers.
I've been tasking LLMs to write a traditional AI for a full vibe-coded RTS. I remove the human players and let them battle. I don't know why but I enjoy watching AI players battle so much :)
In the repo, I even have a tournament script that calculates Elo ratings. So far, Codex has been unmatched. I'll try with Opus 4.8 too.
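For anyone curious what such a script boils down to, here is a minimal sketch of a standard Elo update. This is not the script from the repo; the starting rating of 1500, the K-factor of 32, and the bot names are all assumptions for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a, rating_b, score_a, k=32):
    """Return new (A, B) ratings; score_a is 1 for a win, 0.5 draw, 0 loss."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def run_tournament(results, start=1500.0, k=32):
    """Apply a list of (player_a, player_b, score_a) results in order."""
    ratings = {}
    for a, b, score_a in results:
        ra = ratings.setdefault(a, start)
        rb = ratings.setdefault(b, start)
        ratings[a], ratings[b] = update_elo(ra, rb, score_a, k)
    return ratings


# Hypothetical match log: bot_a beats bot_b, then they draw.
ratings = run_tournament([("bot_a", "bot_b", 1.0), ("bot_a", "bot_b", 0.5)])
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Because every update adds to one player exactly what it takes from the other, the total rating pool stays constant, which is a handy sanity check for a tournament script.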
I think writing a static site generator was the first moment I felt like I may be serious about this programming thing.
Those losers who still need Perl on their servers better be ready for a mind explosion
...thought me, back in (too lazy to look up which year it was). I probably published like two things with it and spent (what felt like) a million person-hours on it, just to abandon it and use Textpattern.