When I go on vacation on cruise ships I never pay for internet and my phone is only used for tracking time and photos. Why be on vacation just to doom scroll?
They're not the only ones in the AI sphere winding back, but they're the weirdest case in my eyes. Microsoft invests in engineers building open models, and then doesn't use a single one of them. I really don't get it.
But what really surprised me most about Copilot is that it billed per question, nothing about tokens. So if I managed to write a prompt that gave me back an insane number of tokens, which any Claude model would easily accomplish, they were giving me more than my money's worth, at their own expense. The math is not gonna math out forever.
Funny when you consider the world owes a lot of AI advancements to both Meta and Google; their open releases really did shift things (feel free to correct me if I'm wrong), especially for China, which as far as I know wasn't releasing as much in AI beforehand. I remember when Meta originally released Llama, people were speculating about it, but it wound up spawning a lot of projects, I'm sure some in China. I know Perplexity's default model is a custom model built on top of Llama, and it's pretty darn good.
If I'm remembering right, it was weirder than that; Llama's original release strategy was sort of bizarre.
You did have to apply for access, but if you met their criteria (basically if you were the right profile of researcher or in government), you got direct access to the model weights, not just an API for a hosted model. So access was restricted, but the full weights were shared.
I believe the model was leaked by multiple people, some of whom didn't work at Meta but had been granted access to the weights.
Not sure, but open weights have had their effects. For example, look at Wan 2.2, the last open-weights Wan release: a year later it's still the most capable open video model out there in terms of quality, and nothing open comes close. Unfortunately it went closed source, but before it did, the community had built all sorts of tooling and LoRAs on top of it. Back to Llama though: look at all the open models people run offline on their Macs. It definitely had a net positive effect.
> Extrapolating that trend, we would be at about 87 GB worth of data today.
Throw in YouTube Shorts / TikTok etc and it makes me wonder if that estimate is drastically too low. We went from the information age, to the brainrot overload age, to let's both have brainrot and let computers think for us.
If that trend is also meant to measure the quality of video etc., the figure would definitely be way higher. But that assumption seems very flawed to me: watching a full 4K movie amounts to way more data than scrolling through memes, even though the latter is way more of an attention-stealing activity.
I'm not a subject matter expert, but I'm wondering if all the context switching of short form video counts as way more drain on the brain, would be an interesting study. I have to think the brain eventually gets tired of all the short dopamine hits.
Why an AI agent has the keys to the kingdom is beyond me. Loads of companies don't even give developers this level of access to key infrastructure for a reason.
GoDaddy always struck me as a company run by a "jock" (think Revenge of the Nerds), where all the technical people are just there to collect a paycheck and don't care about the customers or going above and beyond, and it shows.
I prefer ticketing systems for AI. I don't care that it forgets what I did last week; I just need it to compact its own memory and grab the next task once it's done.
I'm ambivalent about that. I've seen people use beads, and they're just making busy work for the agents, splitting stuff up into tiny tasks that could have been one-shotted as part of the larger plan. They seem to just enjoy making thinky machine go brrr, even when it makes the work take longer and burn a lot more tokens.
I tend to think developing with agents should look a lot like managing a human (like, I use feature-branch development with PRs and review them, even on my own projects that have no other devs and don't need a paper trail for security audit purposes), so I theoretically can get down with an issue-based process, but thus far I haven't seen it done in a way that isn't just making busy work for agents.
Key things: I added a concept called "gates", which are tied to all tasks. They force the agent to satisfy arbitrary requirements, such as: ensure the project still runs/compiles, run all tests and ensure they pass, review existing tests critically and point out if they're not comprehensive enough, and finally, get human confirmation on the task. Until the human confirms, the agent just works on another task, and so on.
I didn't like that Beads was built on top of Git; I don't always work on Git-friendly projects, and Beads kept getting messed up when I switched branches. So I made mine SQLite-based. I also made it so you can sync to GitHub Issues, and pull pre-existing (and new) GitHub issues in as guardrail tasks to be worked on; the agent will even leave a comment for you on GitHub when it grabs an issue, to let others know the work is being picked up.
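To make the idea concrete, here's a minimal sketch of what a SQLite-backed task store with gates could look like. This is purely illustrative: the table names, columns, and functions are my own invention, not the actual tool's schema.

```python
import sqlite3

# Hypothetical schema: tasks, each with zero or more gates that must
# all pass before the task counts as done.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'open',   -- open | in_progress | done
    github_issue INTEGER                   -- optional link to a synced GitHub issue
);
CREATE TABLE gates (
    id INTEGER PRIMARY KEY,
    task_id INTEGER NOT NULL REFERENCES tasks(id),
    requirement TEXT NOT NULL,             -- e.g. 'build passes', 'human confirmation'
    passed INTEGER NOT NULL DEFAULT 0
);
""")

def next_open_task(conn):
    """Grab the next open task for the agent to work on."""
    return conn.execute(
        "SELECT id, title FROM tasks WHERE status = 'open' ORDER BY id LIMIT 1"
    ).fetchone()

def task_cleared(conn, task_id):
    """A task is cleared only when every gate tied to it has passed."""
    (unpassed,) = conn.execute(
        "SELECT COUNT(*) FROM gates WHERE task_id = ? AND passed = 0", (task_id,)
    ).fetchone()
    return unpassed == 0

# Example: one task with three gates. The human-confirmation gate stays
# unpassed, so the agent would park this task and grab another one.
conn.execute("INSERT INTO tasks (id, title) VALUES (1, 'add retry logic')")
conn.executemany(
    "INSERT INTO gates (task_id, requirement, passed) VALUES (1, ?, ?)",
    [("build passes", 1), ("tests pass", 1), ("human confirmation", 0)],
)
print(task_cleared(conn, 1))  # False until the human signs off
```

One nice property of keeping this in SQLite rather than Git is that the store survives branch switches untouched, which matches the complaint above about Beads getting messed up across branches.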
It works hand in hand, to be honest: Claude reads tickets that match the criteria of what I'm looking to work on and tacks them onto its todo list, so it just becomes an overview of my tasks.