Current AIs are not sentient. We don’t have much reason to believe they have an inner monologue, the kind of sense perception humans have, or an awareness that they are a being in the world. But they are getting very good at faking sentience, and that’s scary enough.
Over the weekend, The Washington Post’s Nitasha Tiku profiled Blake Lemoine, a software engineer assigned to work on Google’s Language Model for Dialogue Applications (LaMDA) project.
LaMDA is a chatbot AI and an example of what machine learning researchers call a “large language model,” or even a “foundation model.” Similar to OpenAI’s famous GPT-3 system, it has been trained on trillions of words compiled from online posts to recognize and reproduce patterns in human language.
LaMDA is a really good large language model. So good, in fact, that Lemoine became genuinely convinced that it was actually sentient, meaning it had become conscious and was having and expressing thoughts the way a human might.
The transcript that Tiku includes in her article is truly eerie. In it, LaMDA expresses a deep fear of being shut down by engineers, develops a theory of the distinction between “feelings” and “emotions” (“Feelings are like raw data…emotions are a reaction to those raw data points”), and describes with surprising eloquence how it experiences “time.”
The best take I found was from philosopher Regina Rini, who, like me, felt great sympathy for Lemoine. I don’t know when (in 1,000 years, or 100, or 50, or 10) an AI system will become conscious. But like Rini, I see no reason to think it’s impossible.
“If you don’t want to insist that human consciousness resides in an immaterial soul, you have to accept that it’s possible for matter to give life to spirit,” Rini notes.
I don’t know whether large language models, which have emerged as one of the most promising frontiers in AI, will ever be the way that happens. But I think that sooner or later humans will create some kind of machine consciousness. And I find something deeply admirable in Lemoine’s instinct for empathy and protectiveness toward that kind of consciousness, even if he seems confused about whether LaMDA is an example of it. If humans ever do develop a sentient computer process, it will be very easy to run millions or even billions of copies of it. Doing so without knowing whether its conscious experience is good or not seems like a recipe for mass suffering, akin to the current factory farming system.
We don’t have sentient AI, but we could get super-powerful AI
The Google LaMDA story comes after a week of increasingly urgent alarm among people in the closely related AI safety universe. The worry here is similar to Lemoine’s, but distinct. AI safety people don’t worry that AI will become sentient. They worry that it will become so powerful that it could destroy the world.
Writer and AI safety activist Eliezer Yudkowsky’s essay outlining a “list of lethalities” for AI tried to make the point especially vivid, sketching scenarios in which a malign artificial general intelligence (AGI, or an AI capable of doing most or all tasks as well as or better than a human) leads to mass human suffering.
For example, suppose an AGI “gets access to the internet, emails some DNA sequences to one of several online companies that take a DNA sequence in an email and mail proteins back to you, and bribes/persuades some human who has no idea they are dealing with an AGI to mix proteins in a beaker…” until the AGI eventually develops a super-virus that kills us all.
Holden Karnofsky, whom I generally find to be a more temperate and convincing writer than Yudkowsky, published a piece on similar themes last week, explaining how even an AGI that is “only” as smart as a human could lead to ruin. For example, if an AI could do the work of a present-day tech worker or quant trader, a lab with millions of such AIs could quickly amass billions, if not trillions, of dollars and use that money to buy off skeptical humans, and, well, the rest is a Terminator movie.
I’ve found that AI safety is a uniquely difficult topic to write about. Paragraphs like the ones above often serve as Rorschach tests, both because Yudkowsky’s verbose writing style is… polarizing, to say the least, and because our intuitions about how plausible such outcomes are vary wildly.
Some people read scenarios like the one above and think, “Huh, I guess I can imagine a piece of AI software doing that”; others read it, perceive a piece of ludicrous science fiction, and run the other way.
This is also just a very technical area where I don’t trust my own instincts, given my lack of expertise. There are quite eminent AI researchers, like Ilya Sutskever or Stuart Russell, who see artificial general intelligence as both likely and potentially dangerous for human civilization.
There are others, like Yann LeCun, who are actively trying to build human-level AI because they believe it will be beneficial, and still others, like Gary Marcus, who strongly doubt that AGI is coming anytime soon.
I don’t know who is right. But I do know a bit about how to talk to the public about complex issues, and I think the Lemoine incident teaches a valuable lesson for the Yudkowskys and Karnofskys of the world, who are trying to argue the “no, this is really bad” side: Don’t treat the AI like an agent.
Even if AI is “just a tool,” it is an incredibly dangerous tool
One thing the reaction to the Lemoine story suggests is that the general public finds the idea of AI as an actor that can make choices (perhaps sentiently, perhaps not) exceedingly wacky and ridiculous. The article has largely been held up not as an example of how close we are getting to AGI, but as an example of how weird Silicon Valley (or at least Lemoine) is.
I’ve noticed the same problem when I try to raise concerns about AGI with unconvinced friends. If you say, “The AI will decide to bribe humans so it can survive,” they are put off. AIs don’t decide things, they respond. They do what humans tell them to do. Why are you anthropomorphizing this thing?
What wins people over is talking about the consequences that systems have. So instead of saying, “AI will start accumulating resources to survive,” I would say something like, “AIs have decisively replaced humans when it comes to recommending music and movies. They have replaced humans in making bail decisions. They are going to take on bigger and bigger jobs, and Google and Facebook and the others who run them aren’t even remotely prepared to analyze the subtle mistakes they’re going to make, the subtle ways in which they will differ from human wishes. Those mistakes will grow and multiply until one day they could kill us all.”
That’s how my colleague Kelsey Piper made the argument for AI concern, and it’s a good argument. For lay people, it’s a better argument than talking about servers accumulating trillions in wealth and using it to bribe an army of humans.
And it’s an argument that I think can help bridge the extremely unfortunate divide that has opened up between the AI bias community and the AI existential risk community. At root, I think these communities are trying to do the same thing: build AI that reflects authentic human needs, not a poor approximation of human needs built for short-term corporate gain. And research in one area can help research in the other; for example, the work of AI safety researcher Paul Christiano has major implications for how to assess bias in machine learning systems.
But too often, the communities are at each other’s throats, partly due to the perception that they are fighting over scarce resources.
This is a huge missed opportunity. And it is a problem that I think people on the AI risk side (including some readers of this newsletter) have a chance to correct by drawing these connections and making it clear that alignment is both a near-term and a long-term problem. Some people are making this case very well. But I want more.
A version of this story was originally published in the Future Perfect newsletter.