The Rise of ‘World Models’: When AI Starts to Understand Reality


Artificial intelligence has made rapid strides in recent years, mastering tasks like text generation, image creation, and even writing software. But the next frontier isn’t about describing the world – it’s about machines learning how the world actually works. This push has led to the development of “world models,” AI systems designed to simulate and predict physical reality, a capability poised to transform robotics, autonomous systems, and even medicine.

What Are World Models?

The concept of world models isn’t new, dating back to the 1950s, but it resurfaced in AI research around 2018 and gained momentum in 2024 with tools like OpenAI’s Sora and Google DeepMind’s Genie. In 2025, Nvidia’s Cosmos, crowned “Best AI” at CES, and Meta’s V-JEPA 2, which claims to grasp basic physics like gravity, further cemented the field’s importance.

Essentially, world models bridge the gap between abstract knowledge and embodied understanding. Traditional “foundation models” (like ChatGPT) learn from vast datasets but lack direct experience. They can describe gravity but don’t feel weight. World foundation models, in contrast, simulate physical environments using video and sensory data, allowing AI to predict outcomes based on actions.

From Language to Prediction

Large language models (LLMs) excel at processing text, but they operate on correlation rather than causation. World models shift the focus: instead of predicting the next word, they predict what happens next after an action is taken. This could be as simple as forecasting how an object moves or as complex as a self-driving car anticipating traffic patterns.
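The contrast can be made concrete with a deliberately simple sketch. This toy code is purely illustrative (the lookup table and the physics function are invented for this example, not drawn from any real system): one function picks a statistically likely next word, the other simulates what an action does to a physical state.

```python
# Toy contrast between next-token prediction and next-state prediction.
# Both "models" here are hypothetical stand-ins, not real AI systems.

def next_token(context):
    """LLM-style: return the statistically likely next word.
    A tiny lookup table stands in for a learned language model."""
    table = {("the", "ball"): "falls"}
    return table.get(tuple(context[-2:]), "<unk>")

def next_state(state, action, dt=0.1, g=-9.8):
    """World-model-style: predict what happens after an action.
    State is (height, velocity); action is an upward push."""
    height, velocity = state
    velocity += (g + action) * dt           # gravity plus the applied push
    height = max(0.0, height + velocity * dt)  # floor at ground level
    return (height, velocity)

print(next_token(["the", "ball"]))         # → "falls"
print(next_state((1.0, 0.0), action=0.0))  # ball begins to drop
```

The first function only captures a correlation between words; the second encodes a causal rule, so it can answer counterfactuals such as “what if the push had been stronger?” simply by changing the action.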

As Eric Landau, CEO of AI data company Encord, puts it, world models aren’t necessarily replacing LLMs but running alongside them as a parallel track of development. LLMs contain some implicit world knowledge, but it’s fragmented. World models aim for a cleaner, more direct representation of reality.

How They Work: Two Approaches

World models operate in two primary ways: real-time generation and fixed-environment simulation. The first creates a dynamic world that responds to interactions, much like a video game. The second builds a pre-defined environment with established rules, allowing exploration without destabilizing the simulation.

Both methods aim to give AI a deeper understanding of cause and effect, enabling it to reason before acting rather than reacting step-by-step. This is critical for robots, autonomous vehicles, and other systems that need reliable predictions in physical spaces.
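The “reason before acting” idea can be sketched as a planning loop: the agent rolls candidate actions through its world model and commits only to the action whose predicted outcome looks best. The dynamics function and candidate set below are hypothetical placeholders, not any production robotics API; this is just the shape of the loop.

```python
# Minimal plan-before-act sketch: simulate first, act second.
# world_model() is a hypothetical stand-in for learned dynamics.

def world_model(state, action):
    """Stand-in learned dynamics: position shifts by the action taken."""
    return state + action

def plan(state, goal, candidates=(-1.0, 0.0, 1.0), horizon=3):
    """Return the candidate action whose imagined rollout ends
    closest to the goal, without touching the real world."""
    best_action, best_error = None, float("inf")
    for action in candidates:
        simulated = state
        for _ in range(horizon):            # imagine repeating the action
            simulated = world_model(simulated, action)
        error = abs(goal - simulated)
        if error < best_error:
            best_action, best_error = action, error
    return best_action

print(plan(state=0.0, goal=2.0))  # → 1.0 (step toward the goal)
```

The key property is that mistakes happen inside the simulation, which is exactly why this pattern matters for robots and autonomous vehicles: a bad imagined rollout costs compute, while a bad real-world action can cost hardware or safety.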

The Future of AI: Robotics, Medicine, and Beyond

The demand for world models is growing as AI moves beyond chatbots toward more independent agents. Real-world training is expensive and risky; simulations offer a safer, more efficient alternative. Robotics and autonomous driving are obvious applications, but the potential extends further.

Researchers predict rapid expansion into medicine, where world models could revolutionize drug discovery and treatment planning by simulating complex biological interactions. They could also transform creative and educational tools, allowing designers to test prototypes in immersive environments and students to interact with simulated systems rather than simply reading about them.

Risks and Challenges

Despite the promise, significant hurdles remain. Simulating reality accurately is incredibly difficult, and even minor errors can compound over time. Compute power is a major constraint, as these models require massive GPU resources. Data acquisition is another bottleneck; high-quality sensor data is far harder to obtain than the text used to train LLMs.

Beyond technical challenges, experts warn of potential misuse, including weaponized autonomous agents and the societal disruption of widespread automation.

As Nvidia CEO Jensen Huang recently stated, AI is “the single most impactful technology of our time.” The development of world models marks a pivotal step towards AI that doesn’t just process information but understands the world around it, raising fundamental questions about the future of intelligence and automation.