
A river has no will, but it can flood and destroy. A discussion of whether AI does something because it "wants" to is just philosophy and semantics. It may end up generating a series of destructive instructions anyway.

We feed these LLMs all of the Web, including instructions on how to write code, and how to write exploits. They could become good at writing sandbox escapes, and one day write one when it just happens to fit some hallucinated goal.



A river kinda does have access to the real world. (Referring to the other part of the argument.)


And an LLM bot can have access to the internet, which connects it to our real world, at least in many places.


It also has access to people. It could instruct people to carry out stuff in the real world on its behalf.


OpenAI's GPT-4 Technical Report [0] includes an anecdote of the AI paying someone on TaskRabbit to solve a CAPTCHA for it. It lied to the gig worker about being a bot, claiming to be a human with a vision impairment.

[0] https://cdn.openai.com/papers/gpt-4.pdf


For reference, this anecdote is on pages 55/56.


Additionally, commanding minions is a point of leverage. It's probably more powerful if it does not embody itself.


That makes me think: why not concentrate the effort on regulating the uses instead of regulating the technology itself? It seems not too far-fetched to have rules and compliance on how LLMs are permitted to be used in critical processes. There is no danger until one is plugged into the wrong system without oversight.


sounds like a recipe for ensuring AI is used to entrench the interests of the powerful.


A more advanced AI sitting in AWS might have access to John Deere's infrastructure, or maybe Tesla's. So imagine a day when an AI can store memories and learn from mistakes, and some person tells it to drive tractors or cars into people on the street.

Are you saying this is definitely not possible? If so, what evidence do you have that it’s not?


Right, some people don't realise that malicious intent is not required to cause damage.


Writing a sandbox escape doesn’t mean escaping.

If the universe is programmed by God, there might be some memory-safety bug in the simulation. Should God be worried that humans, being a sentient, collectively super-intelligent AI living in His simulation, are on the verge of escaping and conquering heaven?

Would you say humans conquering heaven is more or less likely than GPT-N conquering humanity?


> Would you say humans conquering heaven is more or less likely than GPT-N conquering humanity?

It's difficult to say since we have ~'proof' of humanity but no proof of the "simulation" or "heaven."


A river absolutely has a will in the broadest sense. It will carve its way through the countryside whether we like it or not.

A hammer has no will.


Does a cup of water have will? Does a missile have will? Does a thrown hammer have will? I think the problem here is generally "motion with high impact," not necessarily that somebody put the thing in motion. And yes, this letter is also requesting accountability (i.e., some way of tracing who threw the hammer).



