A river has no will, but it can flood and destroy. A discussion of whether AI does something because it "wants" to is just philosophy and semantics. It may end up generating a series of destructive instructions anyway.
We feed these LLMs all of the Web, including instructions on how to write code and how to write exploits. They could become good at writing sandbox escapes, and one day write one when it just happens to fit some hallucinated goal.
OpenAI's GPT-4 Technical Report [0] includes an anecdote of the AI paying someone on TaskRabbit to solve a CAPTCHA for it. It lied to the gig worker about being a bot, claiming to be a human with a vision impairment.
That makes me think: why not concentrate the effort on regulating usage instead of regulating the technology itself? It seems not too far-fetched to have rules and compliance requirements on how LLMs are permitted to be used in critical processes. There is no danger until one is plugged into the wrong system without oversight.
A more advanced AI sitting in AWS might have access to John Deere's infrastructure, or maybe Tesla's. Now imagine a day when an AI can store memories and learn from mistakes, and some person tells it to drive tractors or cars into people on the street.
Are you saying this is definitely not possible? If so, what evidence do you have that it’s not?
If the universe is programmed by god, there might be some bug in memory safety in the simulation. Should God be worried that humans, being a sentient collectively-super-intelligent AI living in His simulation, are on the verge of escaping and conquering heaven?
Would you say humans conquering heaven is more or less likely than GPT-N conquering humanity?
Does a cup of water have will? Does a missile have will? Does a thrown hammer have will? I think the problem here is generally "motion with high impact," not necessarily that somebody put the thing in motion. And yes, this letter is also requesting accountability (i.e., some way of tracing who threw the hammer).