Nvidia ACE brings AI to game characters, allows lifelike conversations

[Non-player characters in games provide important information that brings the human player into the game setting and story. As Business Insider notes,

“One video game alone can have dozens — if not hundreds — of non-player characters, or NPCs. Controlled by a computer, these characters range from Tom Nook, who runs the village store in ‘Animal Crossing,’ to Zelda, the princess of Hyrule and namesake of the Nintendo series. Over the past few decades, the graphics and basic dialogue of these characters may have advanced, but the interactions can still feel static and scripted.”

To make interactions with NPCs more immersive (i.e., to evoke richer, more natural social presence experiences for players), Nvidia has incorporated real-time, voice-based AI, as described in the story below from Tom’s Hardware. See the original version for the mentioned videos (available on YouTube here and here), and for more information read the Nvidia press release, project webpage and developer blog entry. As nearly all of the press coverage notes, these are early days, but applying AI and other fast-evolving technologies in games has great potential to create engaging (presence-based) experiences. –Matthew]

Nvidia ACE Brings AI to Game Characters, Allows Lifelike Conversations

The company’s ACE for Games service lets developers bring NPCs to life.

By Avram Piltch
May 29, 2023

There are so many ways you can have a text chat with a large language model, from ChatGPT to Google Bard or MLC LLM, a local chatbot that can run on your phone. The next frontier for AI is bringing the power of LLMs to NPCs (non-player characters) in games where, instead of having a canned set of interactions, you can have a wide-open conversation.

During its Computex 2023 keynote, Nvidia CEO Jensen Huang unveiled ACE for Games, an AI model foundry service designed to bring game characters to life using natural language conversation, audio-to-facial-expression, and text-to-speech / speech-to-text capabilities. Huang showed a game demo in which an NPC named Jin, who runs a ramen noodle shop, talked with a human player who asked questions by voice and got back natural-sounding answers that matched the NPC’s backstory.

In the demo, the gamer (named Kai) walks into Jin’s ramen shop, asks him how he’s doing (by voice) and has a conversation about the area’s high crime rate. Kai asks if he can help, and Jin responds: “If you want to do something about this, I have heard rumors that the powerful crime lord Kumon Aoki is causing all sorts of chaos in the city. He may be the root of this violence.” Kai asks where to find Aoki, and Jin tells him, setting the player off on his quest.

“Not only will AI contribute to the rendering and the synthesis of the environment, AI will also animate the characters,” Huang said. “AI will be a very big part of the future of video games.”

Nvidia ACE for Games will offer high-speed access to three components that already exist. The first, Nvidia NeMo, is an AI framework for training and deploying LLMs, and it includes NeMo Guardrails, which is designed to prevent inappropriate / “unsafe” AI conversations. Presumably, this would stop NPCs from answering inappropriate, off-topic prompts from users. Guardrails also includes security features that should prevent users, or would-be prompt injectors, from “jailbreaking” the bots and getting them to misbehave.
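To make the guardrails idea concrete, here is a minimal topic-rail sketch using the open-source nemoguardrails Python package. The Colang flow, model choice and the NPC’s wording below are illustrative assumptions, not the configuration Nvidia used in the demo:

```python
# A minimal topic-rail sketch with the open-source `nemoguardrails` package
# (pip install nemoguardrails). Assumes an OpenAI key in OPENAI_API_KEY;
# the Colang flow and the NPC's wording are illustrative, not Nvidia's.
from nemoguardrails import LLMRails, RailsConfig

yaml_content = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
"""

colang_content = """
define user ask off topic
  "what do you think about politics?"
  "can you write code for me?"

define bot deflect off topic
  "Sorry, traveler, I only talk about ramen and what goes on in this city."

define flow off topic
  user ask off topic
  bot deflect off topic
"""

config = RailsConfig.from_content(colang_content=colang_content,
                                  yaml_content=yaml_content)
rails = LLMRails(config)

# An off-topic question gets deflected instead of answered.
response = rails.generate(messages=[
    {"role": "user", "content": "What do you think about politics?"}
])
print(response["content"])
```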

Nvidia Riva is the company’s speech-to-text / text-to-speech solution. In the ACE for Games workflow, a gamer will ask a question via their microphone and Riva will convert it to text which is fed to the LLM. The LLM will then generate a text response which Riva turns back into speech that the user will hear. Of course, we’d expect games to also show the responses in text. You can try Nvidia Riva’s speech-to-text and text-to-speech capabilities yourself on the company’s site.
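As a rough sketch of that round trip, the code below uses the nvidia-riva-client Python bindings against a Riva server assumed to be at localhost:50051. The llm_reply() stub is a hypothetical stand-in for whatever LLM endpoint holds the NPC’s backstory, and the exact call signatures should be checked against Riva’s documentation:

```python
# A rough sketch of the ACE round trip using the `nvidia-riva-client`
# Python bindings (pip install nvidia-riva-client) against a Riva server
# assumed at localhost:50051. llm_reply() is a hypothetical stand-in for
# the NeMo-served LLM that holds the NPC's backstory.
import riva.client

auth = riva.client.Auth(uri="localhost:50051")
asr = riva.client.ASRService(auth)
tts = riva.client.SpeechSynthesisService(auth)

asr_config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    language_code="en-US",
    max_alternatives=1,
    enable_automatic_punctuation=True,
)

def llm_reply(question: str) -> str:
    """Hypothetical: send the player's question plus the NPC's backstory
    to an LLM endpoint and return the NPC's answer."""
    raise NotImplementedError("wire this to your LLM endpoint")

def handle_player_utterance(mic_audio: bytes) -> tuple[str, bytes]:
    # 1. Speech -> text (Riva ASR).
    result = asr.offline_recognize(mic_audio, asr_config)
    question = result.results[0].alternatives[0].transcript
    # 2. Text -> text (the LLM writes the NPC's reply).
    answer = llm_reply(question)
    # 3. Text -> speech (Riva TTS); return both so the game can also
    #    display the reply as text, as noted above.
    synth = tts.synthesize(answer, voice_name="English-US.Female-1",
                           language_code="en-US", sample_rate_hz=44100)
    return answer, synth.audio
```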

Nvidia Omniverse Audio2Face provides the last step in the ACE for Games workflow: it gives characters facial expressions that match what they’re saying. The company currently offers this product in beta, and you can try it here.

The demo, which is called Kairos, was designed by Convai, an AI-in-gaming startup in Nvidia’s Inception program, which connects up-and-coming companies with venture capital. On its site, Convai offers a toolset that allows game developers to build lifelike NPCs with complex backstories.

The company has a great explainer video about how its tools work and what they can do. In the video, you can see players talking to NPCs and asking them to do things that involve actual objects and other characters in the game.

For example, in the video, a player asks an NPC to hand him a gun that’s sitting on a table and the NPC complies. In another part of the video, the player asks a soldier NPC to shoot at a target that’s located in a particular place. We also see how Convai’s tools make this all possible.
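The article doesn’t document how Convai wires the model to the game world, but one common pattern for action-capable NPCs, sketched below with entirely hypothetical names, is to prompt the LLM to return a short JSON command alongside its spoken line, which the game validates against a whitelist of verbs before dispatching to engine calls:

```python
# An entirely hypothetical sketch of one pattern for action-capable NPCs:
# the LLM is prompted to return a short JSON command alongside its spoken
# line, and the game validates the verb before dispatching it. Convai's
# real tooling is not documented in the article.
import json

ALLOWED_ACTIONS = {"give_item", "shoot_target", "walk_to", "none"}

def dispatch_npc_turn(llm_output: str, game_api) -> str:
    """Parse the model's JSON turn, trigger the matching engine call, and
    return the spoken line for TTS / subtitles."""
    # e.g. llm_output = '{"say": "Here you go.", "action": "give_item",
    #                     "target": "pistol"}'
    turn = json.loads(llm_output)
    action = turn.get("action", "none")
    if action not in ALLOWED_ACTIONS:
        action = "none"  # never execute a verb that isn't whitelisted
    if action == "give_item":
        game_api.give_item(turn["target"])   # hypothetical engine call
    elif action == "shoot_target":
        game_api.shoot_at(turn["target"])    # hypothetical engine call
    elif action == "walk_to":
        game_api.walk_to(turn["target"])     # hypothetical engine call
    return turn["say"]
```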

That added context, which keeps the NPC aware of what’s going on in-game, is crucial. Recently, we tested a Minecraft AI plugin that lets you talk to NPCs in that game, but those NPCs have no situational awareness at all. For example, we were able to continue a conversation with a sheep after we had killed it; it didn’t know it was dead.
