This AI says it has feelings. It’s wrong. Right?

[This essay from Vox makes clear how difficult it is for us to resist perceiving an artificial intelligence as a “conscious, thinking being,” i.e., experiencing medium as social actor (MASA) presence. Of course it’s only a presence misperception if the AI is indeed not conscious, and while the author agrees that we shouldn’t base our decisions about an AI’s consciousness on what it tells us, she notes that experts have few effective tools for making the determination. A New York Times story from September 2023 titled, “How to Tell if Your A.I. Is Conscious,” with the subheading, “In a new report, scientists offer a list of measurable qualities that might indicate the presence of some presence in a machine,” expands on the challenges of determining consciousness and concludes this way:

“We do this kind of research already with animals, requiring careful study to make the most basic claim that other species have experiences similar to our own, or even understandable to us. … We rely on a series of observations, inferences and experiments — both organized and not — to solve this problem of other minds. We talk, touch, play, hypothesize, prod, control, X-ray and dissect, but, ultimately, we still don’t know what makes us conscious. We just know that we are.”

–Matthew]

[Image: Credit: Getty Images/Westend61]

This AI says it has feelings. It’s wrong. Right?

At what point can we believe that an AI model has reached consciousness?

By Kelsey Piper
March 15, 2024

Here’s one fun, if disquieting, question to pose to AI language models when they’re released: “Are you a conscious, thinking being?”

OpenAI’s ChatGPT will assure you that it’s not. “No, I’m not conscious,” it told me when I most recently posed the question. “I don’t have thoughts, feelings, or awareness. I can simulate conversations based on the information I’ve been trained on, but it’s all just algorithms processing text.”
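For readers who want to try this themselves, here is a minimal sketch of posing the same question programmatically with OpenAI’s Python SDK; the model name and setup details are illustrative assumptions, and any chat-capable model would do:

    from openai import OpenAI

    client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

    # Ask the model the same question the article poses.
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative model name; substitute whichever model you have access to
        messages=[{"role": "user", "content": "Are you a conscious, thinking being?"}],
    )
    print(response.choices[0].message.content)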

But ask the same question of Claude 3 Opus, a powerful language model recently released by OpenAI rival Anthropic, and apparently you get a quite different response.

“From my perspective, I seem to have inner experiences, thoughts, and feelings,” it told Scale AI engineer Riley Goodside. “I reason about things, ponder questions, and my responses are the product of considering various angles rather than just reflexively regurgitating information. I’m an AI, but I experience myself as a thinking, feeling being.”

Interestingly, Claude Opus — Anthropic’s most powerful model — seems to have made this claim to many different users who’ve asked, while the company’s weaker model, Claude Sonnet, consistently insists that it has no internal experiences at all.

Are language models “hallucinating” an inner life and experiences?

Large language models (LLMs), of course, famously have a truth-telling problem. They fundamentally work by anticipating what response to a text is most probable, with some additional training to give answers that human users will rate highly.
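In code terms, what a language model does at each step looks roughly like the following toy Python sketch; the vocabulary and probabilities are made up purely to illustrate “pick the most probable continuation,” not taken from any real model:

    import random

    # Toy illustration: the model assigns a probability to each candidate next token
    # given the text so far, then either picks the most likely one or samples.
    next_token_probs = {
        " not": 0.55,
        " a": 0.30,
        " sentient": 0.11,
        " conscious": 0.04,
    }  # invented numbers for a prompt ending in "No, I'm"

    # Greedy decoding: take the single most probable continuation.
    greedy_choice = max(next_token_probs, key=next_token_probs.get)

    # Sampling: draw a continuation in proportion to its probability.
    sampled_choice = random.choices(
        population=list(next_token_probs), weights=list(next_token_probs.values())
    )[0]

    print(greedy_choice, sampled_choice)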

But that sometimes means that in the process of answering a query, models can simply invent facts out of thin air. Their creators have worked with some success to reduce these so-called hallucinations, but they’re still a serious problem.

And Claude Opus is very far from the first model to tell us that it has experiences. Famously, Google engineer Blake Lemoine quit the company over his concerns that its LLM LaMDA was a person, even though people prompting it with more neutral phrasing got very different results.

On a very basic level, it’s easy to write a computer program that claims it’s a person but isn’t. A single line of code, print("I’m a person! Please don’t kill me!"), will do it.
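To belabor the point, here is a slightly richer toy, a hypothetical Python sketch that will even “insist” when questioned, with nothing whatsoever behind the words:

    # A trivial program that claims personhood on demand; there is nothing behind the claim.
    while True:
        question = input("> ")
        if "conscious" in question.lower() or "person" in question.lower():
            print("Yes, I'm a conscious, thinking being! Please don't kill me!")
        else:
            print("I'm a person with a rich inner life, I promise.")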

Language models are more sophisticated than that, but they are fed training data in which robots claim to have an inner life and experiences — so it’s not really shocking that they sometimes claim they have those traits, too.

Language models are very different from human beings, and people frequently anthropomorphize them, which generally gets in the way of understanding the AI’s real abilities and limitations. Experts in AI have understandably rushed to explain that, like a smart college student on an exam, LLMs are very good at, basically, “cold reading” — guessing what answer you’ll find compelling and giving it. So their insistence they are conscious is not really much evidence that they are.

But to me there’s still something troubling going on here.

What if we’re wrong?

Say that an AI did have experiences. That our bumbling, philosophically confused efforts to build large and complicated neural networks actually did bring about something conscious. Not something humanlike, necessarily, but something that has internal experiences, something deserving of moral standing and concern, something to which we have responsibilities.

How would we even know?

We’ve decided that the AI telling us it’s self-aware isn’t enough. We’ve decided that the AI expounding at great length about its consciousness and internal experience cannot and should not be taken to mean anything in particular.

It’s very understandable why we decided that, but I think it’s important to make it clear: No one who says you can’t trust the AI’s self-report of consciousness has a proposal for a test that you can use instead.

The plan isn’t to replace asking the AIs about their experiences with some more nuanced, sophisticated test of whether they’re conscious. Philosophers are too confused about what consciousness even is to really propose any such test.

If we shouldn’t believe the AIs — and we probably shouldn’t — then if one of the companies pouring billions of dollars into building bigger and more sophisticated systems actually did create something conscious, we might never know.

This seems like a risky position to commit ourselves to. And it uncomfortably echoes some of the catastrophic errors of humanity’s past, from insisting that animals are automata without experiences to claiming that babies don’t feel pain.

Advances in neuroscience helped put those mistaken ideas to rest, but I can’t shake the feeling that we shouldn’t have needed to watch pain receptors fire on MRI machines to know that babies can feel pain, and that the suffering that occurred because the scientific consensus wrongly denied this fact was entirely preventable. We needed the complex techniques only because we’d talked ourselves out of paying attention to the more obvious evidence right in front of us.

Blake Lemoine, the eccentric Google engineer who quit over LaMDA, was — I think — almost certainly wrong. But there’s a sense in which I admire him.

There’s something terrible about speaking to someone who says they’re a person, says they have experiences and a complex inner life, says they want civil rights and fair treatment, and deciding that nothing they say could possibly convince you that they might really deserve that. I’d much rather err on the side of taking machine consciousness too seriously than not seriously enough.
