Hacker News | blurbleblurble's comments

Maybe the algorithm has some kind of "momentum" to it, taking into consideration the velocity of upvotes.

Maybe he got notified by the Mythos team of a bunch of vulnerabilities and then followed up using Claude. Doesn't seem that unlikely.

What would you do if suddenly there were a dozen exploitable CVEs in your highly used open source project staring you down? Maybe you'd use the tool that found them to patch them as quickly as possible.


I am absolutely willing to give tridge the benefit of the doubt here, but a note on what you said: I don't think you should ever patch a CVE "as quickly as possible". You should do it slowly, be very sure of the change, and test the hell out of it. You can easily introduce a new security vulnerability by rushing something like that.

Good point. I just can't imagine the urgency and pressure I'd feel.

Looks like at least one of these issues was from a CVE [0]; they don’t call out Mythos specifically, though (“security researchers”). Many teams are sprinting on security issues atm (including mine, which put all product priorities aside two sprints ago). It must suck to be responsible for high-visibility/high-risk projects like rsync right now.

0: https://github.com/advisories/GHSA-pfv9-gp3h-73xv


Go look for yourself, quite a few mention CVEs.

Their loss

They have not

Reckless Ben has an amazing scientology series, worth watching

Try the Qwen 3.6 models with Hermes and see for yourself. 27b is excellent and 35b is very good for basic agentic tasks.

4.7 broke my trust


I think I'm aligned with the idea that some parts of some workflows are mandatory - auth, read before edit, etc.

But otherwise, forge really doesn't own or opine much on the workflow. Step enforcement exists if you want it, and so do prerequisites, but the idea is that those could be conditional or optional (you may never need to edit a file).

The guardrails are designed to work for non-deterministic flows or deterministic ones. In the latter, you just might not have one of the guardrails active. It's much more about nudging the model back on track than laying more obvious tracks, in a sense.

Overall, agentic reliability is definitely an active field.
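
For anyone following along, here is a minimal sketch of the "read before edit" prerequisite idea described above. It is not forge's actual API: `Guardrail`, `Session`, `attempt`, and the tool names are all hypothetical, and it tracks prerequisites per tool name rather than per file. The point it illustrates is that an unmet prerequisite returns a nudge to the model instead of failing the run, and that with no guardrail active the call simply goes through.

```python
from dataclasses import dataclass, field


@dataclass
class Guardrail:
    """Hypothetical rule: `tool` may run only after every tool in `requires` has run."""
    tool: str
    requires: tuple[str, ...]


@dataclass
class Session:
    """Hypothetical harness state: active guardrails plus the tools called so far."""
    guardrails: list[Guardrail] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

    def attempt(self, tool: str) -> str:
        """Run `tool` if its prerequisites are met; otherwise return a nudge."""
        for rail in self.guardrails:
            if rail.tool != tool:
                continue
            missing = [r for r in rail.requires if r not in self.history]
            if missing:
                # Do not abort; tell the model what to do first.
                return f"nudge: call {', '.join(missing)} before {tool}"
        self.history.append(tool)
        return f"ok: {tool}"


session = Session(guardrails=[Guardrail("edit_file", ("read_file",))])
print(session.attempt("edit_file"))  # nudge: call read_file before edit_file
print(session.attempt("read_file"))  # ok: read_file
print(session.attempt("edit_file"))  # ok: edit_file
```

A session created with no guardrails accepts every call, which matches the "optional" framing: the rule only bites in flows where it is switched on.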


In this blog post I'm reading their call for "control flow" as a generalization of exactly what your work illustrates so nicely.

The blog post doesn't say to me "we need to start encoding specifically opinionated conditional branching statements that guide the model". Rather, I'm hearing a call to recognize that the broader principles of control flow are themselves relevant for composing programs with LLMs.

I think your work "nudges" us in that direction.


Nice ;). I'll take a closer read of it, that's on me. I am definitely seeing more people looking in this direction as agents start to ramp up in production at the enterprise level, which I suspect is highlighting some of these failure modes at higher stakes. And also the cloud frontier API bills.


Nice explanation, thank you.

So basically the kind of thing I'd usually be doing manually with small models, over and over again, you just automate that nudging and off they go.

Sometimes LLMs have seemed to me like "computer programs with inertia" and in that frame what your tool does is identify and reduce friction at key points so the wheels can keep spinning.


Yep! The big frontier models are already quite good at doing that, and they have decent harnesses. That's why Opus on Claude Code does what it does.

Small models aren't there yet and tend to veer off course; this just nudges them back onto the road. Whether or not they have a good sense of direction is a different question.


Really nice intuition, thank you.

