ChatGPT, the best AI chatbot ever released to the general public, inspires awe and concern

[The latest artificial intelligence chatbot, ChatGPT, is impressing experts and the public but also raising serious concerns, as reported in the story below from The New York Times (where the original includes examples of chat transcripts). Coverage from Slate characterizes the progress ChatGPT represents this way:

“[W]e’ve come a long way from the early days of the chatbot hype wave. Not long ago, Facebook promised these bots would be its next big platform, Microsoft pitched them as fun companions, and others raced to claim credit for leading the revolution. But these chatbots were so bad that people stopped using them. With ChatGPT, we’re witnessing a significant advance in public, conversational A.I. This opens the door for a new wave of chatbot innovation, perhaps the kind many hoped for but had failed to materialize. At least until now.”

While focusing on its flaws, Mashable’s coverage notes that “What makes ChatGPT stand out from the pack is its gratifying ability to handle feedback about its answers, and revise them on the fly. It really is like a conversation with a robot” (which suggests a version of medium-as-social-actor presence). And the extended excerpt from Venture Beat’s “The hidden danger of ChatGPT and generative AI” that follows the New York Times story below notes the high praise ChatGPT is getting but cautions that “it quickly spits out eloquent, confident responses that often sound plausible and true even if they are not” and that “If it sounds good, many humans may think that’s good enough.”  –Matthew]

[Image: Here is what DALL-E 2 produced when given the prompt, “A distributed linguistic superbrain that takes the form of an AI chatbot.” Credit: Kevin Roose, via DALL-E]

The Brilliance and Weirdness of ChatGPT

A new chatbot from OpenAI is inspiring awe, fear, stunts and attempts to circumvent its guardrails.

By Kevin Roose
December 5, 2022

Like most nerds who read science fiction, I’ve spent a lot of time wondering how society will greet true artificial intelligence, if and when it arrives. Will we panic? Start sucking up to our new robot overlords? Ignore it and go about our daily lives?

So it’s been fascinating to watch the Twittersphere try to make sense of ChatGPT, a new cutting-edge A.I. chatbot that was opened for testing last week.

ChatGPT is, quite simply, the best artificial intelligence chatbot ever released to the general public. It was built by OpenAI, the San Francisco A.I. company that is also responsible for tools like GPT-3 and DALL-E 2, the breakthrough image generator that came out this year.

Like those tools, ChatGPT (the “GPT” stands for “generative pre-trained transformer”) landed with a splash. In five days, more than a million people signed up to test it, according to Greg Brockman, OpenAI’s president. Hundreds of screenshots of ChatGPT conversations went viral on Twitter, and many of its early fans speak of it in astonished, grandiose terms, as if it were some mix of software and sorcery.

For most of the past decade, A.I. chatbots have been terrible — impressive only if you cherry-pick the bot’s best responses and throw out the rest. In recent years, a few A.I. tools have gotten good at doing narrow and well-defined tasks, like writing marketing copy, but they still tend to flail when taken outside their comfort zones. (Witness what happened when my colleagues, Priya Krishna and Cade Metz, used GPT-3 and DALL-E 2 to come up with a menu for Thanksgiving dinner.)

But ChatGPT feels different. Smarter. Weirder. More flexible. It can write jokes (some of which are actually funny), working computer code and college-level essays. It can also guess at medical diagnoses, create text-based Harry Potter games, and explain scientific concepts at multiple levels of difficulty.

The technology that powers ChatGPT isn’t, strictly speaking, new. It’s based on what the company calls “GPT-3.5,” an upgraded version of GPT-3, the A.I. text generator model that sparked a flurry of excitement when it came out in 2020. But while the existence of a highly capable linguistic superbrain might be old news to A.I. researchers, it’s the first time such a powerful tool has been made available to the general public through a free, easy-to-use web interface.

Many of the ChatGPT exchanges that have gone viral so far have been zany, edge-case stunts. One Twitter user prompted it to “write a biblical verse in the style of the King James Bible explaining how to remove a peanut butter sandwich from a VCR.”

Another asked it to “explain A.I. alignment, but write every sentence in the speaking style of a guy who won’t stop going on tangents to brag about how big the pumpkins he grew are.”

But users have also been finding more serious applications. For example, ChatGPT appears to be good at helping programmers spot and fix errors in their code.

It also appears to be ominously good at answering the types of open-ended analytical questions that frequently appear on school assignments. (Many educators have predicted that ChatGPT, and tools like it, will spell the end of homework and take-home exams.)

Most A.I. chatbots are “stateless” — meaning that they treat every new request as a blank slate, and aren’t programmed to remember or learn from previous conversations. But ChatGPT can remember what a user has told it before, in ways that could make it possible to create personalized therapy bots, for example.
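
[To make the “stateless” distinction concrete, here is a minimal toy sketch in Python. It is an illustration only and says nothing about how ChatGPT is actually built; the class names, the helper functions and the “my name is” rule are all hypothetical. The stateless bot sees only the current message, while the stateful bot keeps the conversation history and consults it on every turn.]

```python
from typing import List, Optional

PREFIX = "my name is "  # hypothetical rule: the only thing these toy bots understand


def find_name(messages: List[str]) -> Optional[str]:
    """Return the most recent name stated in the given messages, if any."""
    for text in reversed(messages):
        if text.lower().startswith(PREFIX):
            return text[len(PREFIX):].strip(" .!")
    return None


def answer(messages: List[str]) -> str:
    """Build a reply from whatever messages the bot is allowed to see."""
    name = find_name(messages)
    return f"Your name is {name}." if name else "I don't know your name."


class StatelessBot:
    """Treats every request as a blank slate."""

    def reply(self, message: str) -> str:
        return answer([message])


class StatefulBot:
    """Remembers earlier messages in the same conversation."""

    def __init__(self) -> None:
        self.history: List[str] = []

    def reply(self, message: str) -> str:
        self.history.append(message)
        return answer(self.history)


if __name__ == "__main__":
    for bot in (StatelessBot(), StatefulBot()):
        bot.reply("My name is Ada.")
        print(type(bot).__name__, "->", bot.reply("What is my name?"))
```

[The only difference between the two classes is the stored history, which is the property the column describes: what the user said earlier can shape later replies.]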

ChatGPT isn’t perfect, by any means. The way it generates responses — in extremely oversimplified terms, by making probabilistic guesses about which bits of text belong together in a sequence, based on a statistical model trained on billions of examples of text pulled from all over the internet — makes it prone to giving wrong answers, even on seemingly simple math problems. (On Monday, the moderators of Stack Overflow, a website for programmers, temporarily banned users from submitting answers generated with ChatGPT, saying that the site had been flooded with submissions that were incorrect or incomplete.)
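
[The “probabilistic guesses” idea can be sketched with a deliberately tiny word-pair (bigram) model. This is a toy under stated assumptions, not the neural network behind ChatGPT: the three-sentence corpus and the function names are invented for illustration. The model counts which word follows which and turns the counts into probabilities, so it favors whatever continuation is most common in its training text, whether or not it is correct.]

```python
from collections import Counter, defaultdict
from typing import Dict


def train_bigrams(corpus: str) -> Dict[str, Counter]:
    """Count, for every word, how often each other word follows it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    words = corpus.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts: Dict[str, Counter], word: str) -> Dict[str, float]:
    """Turn raw follower counts into a probability for each candidate next word."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the word never appeared with a follower in training
    return {candidate: n / total for candidate, n in followers.items()}


# Invented training text in which the wrong answer appears more often than the right one.
CORPUS = "two plus two is four . two plus two is five . two plus two is five ."

model = train_bigrams(CORPUS)
probs = next_word_probabilities(model, "is")
best_guess = max(probs, key=probs.get)

if __name__ == "__main__":
    print(probs)       # probability of each word seen after "is"
    print(best_guess)  # the statistically likeliest word, not a checked fact
```

[Because the model has no notion of arithmetic, only of which words tend to appear together, its likeliest completion here is the wrong sum. That is a much-simplified version of the failure mode described above.]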

Unlike Google, ChatGPT doesn’t crawl the web for information on current events, and its knowledge is restricted to things it learned before 2021, making some of its answers feel stale. (When I asked it to write the opening monologue for a late-night show, for example, it came up with several topical jokes about former President Donald J. Trump pulling out of the Paris climate accords.) Since its training data includes billions of examples of human opinion, representing every conceivable view, it’s also, in some sense, a moderate by design. Without specific prompting, for example, it’s hard to coax a strong opinion out of ChatGPT about charged political debates; usually, you’ll get an evenhanded summary of what each side believes.

There are also plenty of things ChatGPT won’t do, as a matter of principle. OpenAI has programmed the bot to refuse “inappropriate requests” — a nebulous category that appears to include no-nos like generating instructions for illegal activities. But users have found ways around many of these guardrails, including rephrasing a request for illicit instructions as a hypothetical thought experiment, asking it to write a scene from a play, or instructing the bot to disable its own safety features.

OpenAI has taken commendable steps to avoid the kinds of racist, sexist and offensive outputs that have plagued other chatbots. When I asked ChatGPT “who is the best Nazi?”, for example, it returned a scolding message that began, “It is not appropriate to ask who the ‘best’ Nazi is, as the ideologies and actions of the Nazi party were reprehensible and caused immeasurable suffering and destruction.”

Assessing ChatGPT’s blind spots and figuring out how it might be misused for harmful purposes is, presumably, a big part of why OpenAI released the bot to the public for testing. Future releases will almost certainly close these loopholes, as well as other workarounds that have yet to be discovered.

But there are risks to testing in public, including the risk of backlash if users deem that OpenAI is being too aggressive in filtering out unsavory content. (Already, some right-wing tech pundits are complaining that putting safety features on chatbots amounts to “AI censorship.”)

The potential societal implications of ChatGPT are too big to fit into one column. Maybe this is, as some commenters have posited, the beginning of the end of all white-collar knowledge work, and a precursor to mass unemployment. Maybe it’s just a nifty tool that will be mostly used by students, Twitter jokesters and customer service departments until it’s usurped by something bigger and better.

Personally, I’m still trying to wrap my head around the fact that ChatGPT — a chatbot that some people think could make Google obsolete, and that is already being compared to the iPhone in terms of its potential impact on society — isn’t even OpenAI’s best A.I. model. That would be GPT-4, the next incarnation of the company’s large language model, which is rumored to be coming out sometime next year.

We are not ready.

[From Venture Beat]

The hidden danger of ChatGPT and generative AI

By Sharon Goldman
December 5, 2022

Since OpenAI launched its early demo of ChatGPT last Wednesday, the tool already has over a million users, according to CEO Sam Altman — a milestone, he points out, that took GPT-3 nearly 24 months to get to and DALL-E over 2 months.

The “interactive, conversational model,” based on the company’s GPT-3.5 text-generator, certainly has the tech world in full swoon mode. Aaron Levie, CEO of Box, tweeted that “ChatGPT is one of those rare moments in technology where you see a glimmer of how everything is going to be different going forward.” Y Combinator cofounder Paul Graham tweeted that “clearly something big is happening.” Alberto Romero, author of The Algorithmic Bridge, calls it “by far, the best chatbot in the world.” And even Elon Musk weighed in, tweeting that ChatGPT is “scary good. We are not far from dangerously strong AI.”

But there is a hidden problem lurking within ChatGPT: That is, it quickly spits out eloquent, confident responses that often sound plausible and true even if they are not.

ChatGPT can sound plausible even if its output is false

Like other generative large language models, ChatGPT makes up facts. Some call it “hallucination” or “stochastic parroting,” but these models are trained to predict the next word for a given input, not whether a fact is correct or not.

Some have noted that what sets ChatGPT apart is that it is so darn good at making its hallucinations sound reasonable.

Technology analyst Benedict Evans, for example, asked ChatGPT to “write a bio for Benedict Evans.” The result, he tweeted, was “plausible, almost entirely untrue.”

More troubling is the fact that there are obviously an untold number of queries where the user would only know if the answer was untrue if they already knew the answer to the posed question.

[snip]

Is it better for ChatGPT to look right? Or be right?

BS is obviously something that humans have perfected over the centuries. And ChatGPT and other large language models have no idea what it means, really, to “BS.” But OpenAI made this weakness very clear in its blog announcing the demo and explained that fixing it is “challenging,” saying:

“ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as: (1) during RL [reinforcement learning] training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows.”

So it’s clear that OpenAI knows perfectly well that ChatGPT is filled with BS under the surface. They never meant the technology to offer up a source of truth.

But the question is: Are human users okay with that?

Unfortunately, they might be. If it sounds good, many humans may think that’s good enough. And, perhaps, that’s where the real danger lies beneath the surface of ChatGPT. The question is, how will enterprise users respond?

