What is an AI 'world model'? Why are Fei-Fei Li, Google, and OpenAI all investing in it?

AI News · 1 year ago · Published by AI Sharing Circle

AI models are becoming increasingly diverse. Alongside large and small language models, "world models", also called world simulators, are now regarded as one of the next key directions in AI development.

In 2024, World Labs, the spatial intelligence startup founded by AI pioneer and computer scientist Fei-Fei Li, completed two rounds of funding with the goal of building "large world models" and is currently valued at $1 billion. Google DeepMind has hired one of the people responsible for OpenAI's video generation model Sora to build world simulators, and OpenAI itself describes Sora as a world model.

What exactly is a world model? Why is it getting more attention?

Giving AI an understanding of the real world

AI world models are inspired by the mental models that humans form: the brain takes in information from the senses and builds it into a more concrete understanding of the surrounding world.

In a paper, AI researchers David Ha and Jürgen Schmidhuber cite the example of a baseball batter who can hit a 100 mph fastball because they "instinctively" predict where the ball will go. This reasoning happens subconsciously: the batter's muscles swing the bat at the right time and place based on the predictions of the brain's internal model. Some argue that this kind of mental modeling is a prerequisite for human intelligence.

An AI world model follows the same path. According to AI startup Runway, a world model builds an internal representation of the external environment and uses that representation to simulate future events in the environment. The goal is to simulate situations exactly as they would unfold in the real world.
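The idea of building an internal representation and then simulating forward can be illustrated with a deliberately tiny sketch. This is not how Runway, Sora, or any production system works; it assumes a one-dimensional falling ball with constant acceleration, and the names `ToyWorldModel`, `observe`, and `imagine` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyWorldModel:
    """Hypothetical toy: the internal state is (position, velocity, acceleration)."""
    dt: float                  # time between observations, in seconds
    position: float = 0.0
    velocity: float = 0.0
    acceleration: float = 0.0

    def observe(self, positions: List[float]) -> None:
        """Build the internal state from the last three observed positions."""
        if len(positions) < 3:
            raise ValueError("need at least three observations")
        p0, p1, p2 = positions[-3:]
        # Finite differences estimate acceleration and velocity at the latest sample.
        self.acceleration = (p2 - 2 * p1 + p0) / self.dt ** 2
        self.velocity = (p2 - p1) / self.dt + 0.5 * self.acceleration * self.dt
        self.position = p2

    def imagine(self, steps: int) -> List[float]:
        """Roll the internal state forward without touching the real environment."""
        pos, vel = self.position, self.velocity
        future = []
        for _ in range(steps):
            pos += vel * self.dt + 0.5 * self.acceleration * self.dt ** 2
            vel += self.acceleration * self.dt
            future.append(pos)
        return future


# Assumed "real world": a ball dropped from 20 m under gravity of 9.8 m/s^2.
def true_height(t: float) -> float:
    return 20.0 - 0.5 * 9.8 * t ** 2


model = ToyWorldModel(dt=0.1)
model.observe([true_height(i * 0.1) for i in range(3)])  # what the model "sees"
print(model.imagine(3))  # the model's predicted heights for the next three steps
```

The toy never receives the law of gravity; it infers an acceleration from what it observes and uses it to predict what it has not yet seen. Real world models learn far richer representations from video, audio, and text, but the observe-then-simulate loop is the same basic shape.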

Why are world models in the spotlight?

In fact, the concept of the world model has been around for more than a decade. One reason for the growing interest is the rise of AI-generated video.

TechCrunch observes that most AI-generated video today still falls into the uncanny valley, for example by showing limbs that are twisted or fused together. And although a generative model trained on years of footage may accurately predict physical phenomena such as the direction in which a basketball bounces, it does not actually know why the basketball bounces.

In contrast, a world model with a sense of the 3D world can render a basketball bounce more convincingly. To gain this kind of understanding, a world model needs to be trained on a range of data, including photos, audio, video, and text.

The potential of world models is not limited to generating video. Researchers such as Meta chief AI scientist Yann LeCun say that world models could in the future be used for complex forecasting and planning in both the digital and physical domains. For his part, Justin Johnson, co-founder of World Labs, says that world models could eventually generate virtual 3D worlds for gaming, virtual photography, and more.

For developers, a powerful world model removes the need to define how each object moves one by one, which is often a tedious and time-consuming task. Alex Mashrabov, former head of AI at Snap and CEO of Higgsfield, told the press that with an advanced world model, an AI can form its own understanding of whatever scenario it finds itself in and start reasoning about possible solutions.

3 Hurdles World Models Must Overcome

While the concept of a world model is tantalizing, many technical challenges remain. In a 2024 talk, Yann LeCun admitted that it would take at least 10 years to realize the world models he envisions.

According to press analysis, the obstacles facing world models are also a microcosm of the current state of AI model development. First, training and running world models requires a great deal of computing power: thousands of GPUs are needed just for Sora, which is considered an early world model.

Second, world models also hallucinate, and they may internalize the biases in their training data. For example, a vision model trained mostly on video of sunny European cities may have difficulty understanding or representing a snowy Korean city, or may generate outright incorrect content.

Third, to address this issue, the training data for a world model must be broad enough to cover a wide variety of scenarios, yet specific enough for the AI to understand the nuances of each one. However, AI development currently faces a data scarcity crisis, with Epoch AI predicting that developers will run out of data to train generative AI models sometime between 2026 and 2032.

Nonetheless, world models remain very attractive. Mashrabov says that if these hurdles are overcome, world models could form a "much stronger" connection between AI and the real world, bringing breakthroughs not only in generating virtual worlds but also in robotics and AI decision-making.

Article copyright AI Sharing Circle. Please do not reproduce without permission.
