Google Genie 3, for generating explorable worlds

Fubarberry@sopuli.xyz · 10 days ago

Here are some first impressions from someone who got to try it out.

Special thanks to @GoogleDeepMind for inviting me to try out Genie 3. I’m excited to share my thoughts on this early research prototype and also some of my live recordings below:

I spent the whole day playing with the system and when it works, it is truly mind-blowing 🤯. It is the first neural game engine / world model I have tried that generalizes so well and has long-term world consistency. Here are a couple of examples from my live recording and some thoughts on what it means for the future of gaming, robotics, digital experiences and ASI.

Where it shines:

Truly general-purpose and quick startup time. Works exceptionally well for gaming environments but also generalizes to other industrial and real-world scenarios.
It learns physics. Although there are systematic failures even for rigid-body physics, it was clear to me that it can learn game-engine and non-rigid physics without an underlying engine (and, in the limit, learn from game engines via training data).
It works exceptionally well for stylized environments with characters walking around. This will have implications for concept artists, level designers and game devs.
It is way more fun than video models, indicating that there are high-retention consumer experiences waiting to be built with this in the future.
Photorealistic walkthroughs and drone shots work exceptionally well.
Global illumination and lighting work surprisingly well.
Visual memory is quite powerful: objects remain approximately coherent under occlusion and over longer time horizons.

Open Problems:

Physics is still hard and there are obvious failure cases when I tried the classical intuitive physics experiments from psychology (tower of blocks).
Social and multi-agent interactions are tricky to handle. 1vs1 combat games do not work.
Long instruction following and simple combinatorial game logic fail (e.g. collect some points / keys etc., go to the door, unlock and so on).
Action space is limited.
It is far from being a real game engine and has a long way to go, but this is a clear glimpse into the future.

The Future:

It is impressive enough for me to have strong conviction that this is going to disrupt the gaming industry. It is super early days and there are a lot of failures but the writing is on the wall. Lots of challenging scientific, engineering and scaling problems to be solved but it is going to happen in the next 5 years.
This is the final piece before we get full AGI, and now I think we are well on our way to truly solving it once something like this is scaled up. In many ways it is more ASI than AGI but this is a matter of definitions. The fidelity and generalizability will reach human level and quickly surpass humans.
People are going to combine this with 3D AI and LLMs to build AAA games.

Genie 3: A New Frontier for World Models