There’s a new way forward for AI and 3D. Instead of just feeding us a more automated version of what we now have, it starts to make immersive graphics production more like music production – complete with live jamming with others. NVIDIA’s offerings still lean heavily toward specialized use, but it’s fascinating to watch that playful thread woven into their announcements.
CES’ framing of what is or is not “consumer”-oriented is more scrambled than ever. And NVIDIA itself remains a company serving different markets – industrial, creative, and consumer. Omniverse reflects a lot of those diverging platforms, veering from digital twin factories for carmakers to “here’s a neat thing for artists on your laptop.”
And a lot of this still relies on RTX hardware, or even the latest RTX hardware, at least until NVIDIA offers more of its functionality over cloud-based services. That’s on the roadmap, and would mean your MacBook Air can use it without a hitch – though you might want to update your Internet service. (With a little more user input, you can also do this now on cloud services, basically leasing RTX hardware if you don’t own it.)
But there are a lot of compelling creative ideas here, even if they’re offered in grab-bag format. They quickly get lost in jargon, so let’s break it down:
Omniverse is expanding as a creative platform. A Blender connector is available in alpha now (with UV generation and geometry features), and a Unity connector is in early access. That’s on top of an Unreal connector announced in spring of last year, alongside Adobe Substance, Maya, 3DS Max, and others. Early access and alpha with NVIDIA almost always mean far from primetime-ready, but it’s still nice to see this making headway.
Audio2Face, Audio2Emotion, and Audio2Gesture also get updates.
Use AI to generate mockups quickly – in 3D. This was already a wild application for AI. You paint – almost MS Paint style – and your brushstrokes are transformed into full-fledged landscapes. We’d seen that for some years in AI research, but it’s now fully usable in an application. (Again, that requires RTX GPUs, but keep an eye out for cloud options. And we know how creative devs like subscription services, but hey, if it makes this accessible…)
Canvas already exported to Photoshop. The breakthrough is being able to use this in 3D environments. Paint your 360-degree landscape, and use it right away in Omniverse. Here’s how last year’s release worked – now add 3D and Omniverse.
Make your own AI-accelerated avatars. This is delivered as an API/microservice – it’s the Omniverse Avatar Cloud Engine. You’ll see the ability to bake AI-assisted avatar generation into other apps. (Fancy Miis for everyone, everywhere, with machine learning!) The tooling is powered by the popular Audio2Face.
- 2D portraits and animations (for your face, in live video, based on NVIDIA Live Portrait). Think those generated faces you saw on selfie apps, only animated, in live streams, which could… certainly get creepy. Get ready.
- Text to speech (based on NVIDIA Riva TTS)
AI helps populate your 3D scenes. Another huge development – we already looked at NVIDIA’s GET3D, which is trained on 2D images and generates 3D models. Now you can also bring that into Omniverse. Add to that a bunch of free 3D assets for Omniverse, and artists can really jam quickly with this environment.
All the AI goodies are in the AI ToyBox:
AI ToyBox Extensions documentation (ooh, animals, AI, inference, 3D models…)
You can also build your own extensions there, using PyTorch, TensorRT, etc.
Making RTX remixes of games is easier. RTX Remix was described last fall – the idea is a platform that makes it easy to mod games with ray tracing and fancy lighting and surfaces. That means Farming Simulator may never be the same. There are obviously also tons of implications here for machinima and artists using game engines.
We’re still waiting on a usable RTX Remix release, but what is available now is a cute remix of Portal, free to Portal owners on Steam.
And there’s even a mod of Portal: Prelude, a singleplayer mod with new content. (That sounds more fun to me than just replaying Portal 1 and Portal 2 with the RTX stuff turned on, or worse, on Switch with worse graphics!)
I’m not sure all the results of this are always an improvement, aesthetically speaking – there’s a blunt review of that. But some of it does look gorgeous, and it’s interesting to see the tools made available.
3D jam sessions
The concept is NVIDIA’s, not mine – but frankly, I’m gratified to see graphics people talk about visuals in musical and live performative terms, as VJs and visualists have done for decades. This really delivers on a lot of what live visual artists and sci-fi creators had dreamt of.
Omniverse is a free download, and because it can pull in content from all these sources, it’s possible for artists from different backgrounds to really play together. Pre-visualization also works completely differently, because instead of describing a landscape and models you want, you can effectively improvise them with AI.
This also posits a very different vision from the (legitimately dystopian) one we mostly saw for AI in 2022. Instead of AI being trained on artists’ unpaid, uncredited, non-consensual labor, here machine learning could allow more artists to get involved in the 3D space. It’s the optimistic take, but in an ideal world, that would expand 3D applications and even potentially generate work.
For their part, NVIDIA also says its own datasets are built from licensed materials. And you’d figure companies like NVIDIA and Adobe and Apple are strongly incentivized to do so – they rely on partner relationships and all of us artists buying their hardware and subscriptions. Putting us out of business would not be a great business move. You’d have all the server expenditures of machine learning, but you’d cut off your revenue streams. Basically – machine learning is shaped by the fact that individual humans, not big data owners, pay the bills. Take that with a grain of salt, though – transparency in training sets is likely to be as big an AI story in 2023 as the AI tools themselves. (You can quote me on that.)
I’ll be curious to see if musical and sonic tools might eventually meet the visual, too – though imagine also building live instruments in a sound engine for Unity or Unreal and playing with that.
In concrete terms, what we get here:
- NVIDIA Broadcast with eye contact
- RTX Video upscaling to 4K (powered by machine learning)
- NVIDIA Canvas
- Multi-tool collaboration with USD (Universal Scene Description) and Omniverse Nucleus
- Multi-artist collaboration with sharing
- Synced iterations between artists
Where a lot of this is coming from: NVIDIA Research
It’s a lot. 2023 is here.