Oculus wants to help VR avatars look normal when they talk

[This hasn’t gotten a lot of press coverage, but it could be an important step toward more effective presence illusions. The story is from Engadget, which features a 13-second demo video using the avatar pictured below; a second video using an animated robot is available on YouTube. More details from Oculus via Geeky Gadgets are included below. –Matthew]

[Image: OVRLipSync Oculus Unity plugin avatar. Source: Geeky Gadgets]

Oculus wants to help VR avatars look normal when they talk

It’s all thanks to a clever Unity plugin

Chris Velazco
02.14.16

Remember all those Hong Kong kung-fu movies with dubbing so poor that the actors’ mouths would keep flapping after the words had stopped? That was charming. What’s less charming is the possibility of stone-faced avatars poorly mouthing dialogue, detracting ever so slightly from the immersive power of virtual reality worlds. That’s why we’re all slightly excited that Oculus has released a beta Unity plugin called OVRLipSync.

The plugin lets developers sync an avatar’s mouth movements to either existing audio or input from a microphone without too much hassle. Granted, the results aren’t wholly lifelike, but it’s not a bad showing for some brand-new software. More importantly, we’re left wondering how many new VR titles will end up taking advantage of this thing. Our guess? Lots. Its potential importance stretches beyond just making NPCs look more natural, too. Oculus is working on shared VR experiences with Oculus Social, so maybe we’ll get those ornate virtual chatrooms with fully animated avatars that were promised in cyberpunk novels after all.

[Geeky Gadgets includes more details from the Oculus documentation:]

OVRLipSync is an add-on plugin and set of scripts used to sync avatar lip movements to speech sounds from a canned source or microphone input. OVRLipSync requires Unity 5.x Professional or Personal or later, targeting Android or Windows platforms, running on Windows 7, 8, or 10. OS X 10.9 and later is also currently supported.

Our system currently maps to 15 separate viseme targets: sil, PP, FF, TH, DD, kk, CH, SS, nn, RR, aa, E, ih, oh, and ou. These visemes correspond to expressions typically made by people producing the speech sound by which they’re referred, e.g., the viseme sil corresponds to a silent/neutral expression, PP appears to be pronouncing the first syllable in “popcorn,” FF the first syllable of “fish,” and so forth.
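
[For readers curious about the mechanics, here is a minimal Unity C# sketch of how per-viseme weights like these might drive an avatar’s face. It is illustrative only: the class name, the visemeWeights array, and the blend-shape index mapping are assumptions rather than the plugin’s actual API; only the standard Unity SkinnedMeshRenderer calls are real.]

using UnityEngine;

// Illustrative sketch (not the OVRLipSync API): maps 15 viseme weights onto
// blend shapes of an avatar's face mesh each frame.
public class VisemeBlendShapeDriver : MonoBehaviour
{
    // Order follows the documented viseme set:
    // sil, PP, FF, TH, DD, kk, CH, SS, nn, RR, aa, E, ih, oh, ou
    private const int VisemeCount = 15;

    // Weights in [0, 1], assumed to be filled each frame by the lip-sync
    // system or any other audio-analysis step (placeholder here).
    public float[] visemeWeights = new float[VisemeCount];

    // Index of the avatar's blend shape for each viseme, assigned in the Inspector.
    public int[] blendShapeIndices = new int[VisemeCount];

    private SkinnedMeshRenderer faceRenderer;

    void Start()
    {
        faceRenderer = GetComponent<SkinnedMeshRenderer>();
    }

    void Update()
    {
        if (faceRenderer == null) return;

        // Scale each viseme weight (0..1) to Unity's blend-shape range (0..100).
        for (int i = 0; i < VisemeCount; i++)
        {
            faceRenderer.SetBlendShapeWeight(blendShapeIndices[i], visemeWeights[i] * 100f);
        }
    }
}

[In practice, the lip-sync system would fill the weight array from the analyzed audio each frame, and the avatar’s face mesh would need one blend shape authored per viseme.]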

