
    NVIDIA Releases DreamDojo: An Open-Source Robot World Model Trained on 44,711 Hours of Real-World Human Video Data

    By Naveed Ahmad, February 21, 2026


    Building simulators for robots is a long-standing challenge. Traditional engines require hand-coded physics and precise 3D models. NVIDIA takes a different approach with DreamDojo, a fully open-source, generalizable robot world model. Instead of relying on a physics engine, DreamDojo ‘dreams’ the results of robot actions directly in pixels.

    https://arxiv.org/pdf/2602.06949

    Scaling Robotics with 44k+ Hours of Human Experience

    The biggest hurdle for AI in robotics is data: collecting robot-specific data is expensive and slow. DreamDojo addresses this by learning from 44k+ hours of egocentric human video. The dataset, called DreamDojo-HV, is described as the largest of its kind for world model pretraining.

    • It features 6,015 unique tasks across 1M+ trajectories.
    • The data covers 9,869 unique scenes and 43,237 unique objects.
    • Pretraining used 100,000 NVIDIA H100 GPU hours to build 2B and 14B model variants.

    Humans have already mastered complex physics, such as pouring liquids or folding clothes. DreamDojo uses this human data to give robots a ‘common sense’ understanding of how the world works.


    Bridging the Gap with Latent Actions

    Human videos do not have robot motor commands. To make these videos ‘robot-readable,’ NVIDIA’s research team introduced continuous latent actions. This system uses a spatiotemporal Transformer VAE to extract actions directly from pixels.

    • The VAE encoder takes 2 consecutive frames and outputs a 32-dimensional latent vector.
    • This vector represents the most critical motion between frames.
    • The design creates an information bottleneck that disentangles action from visual context.
    • This allows the model to learn physics from human video and apply it to different robot bodies.
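
    The bottleneck described above can be illustrated with a toy sketch. This is not NVIDIA's implementation: a single random linear layer stands in for the spatiotemporal Transformer encoder, the frame size is hypothetical, and the KL term toward a standard normal prior is assumed because it is the usual VAE regularizer. Only the 2-frame input and the 32-dimensional latent come from the article.

```python
import numpy as np

LATENT_DIM = 32  # latent action size stated in the article


def encode_latent_action(frame_t, frame_t1, weights, rng):
    """Map two consecutive frames to a sampled 32-d latent action.

    A random linear layer stands in for the real encoder; only the
    VAE-style bottleneck is illustrated here.
    """
    pair = np.concatenate([frame_t.ravel(), frame_t1.ravel()])
    stats = weights @ pair                        # shape (2 * LATENT_DIM,)
    mean, log_var = stats[:LATENT_DIM], stats[LATENT_DIM:]
    # Reparameterization trick: z = mean + sigma * eps
    eps = rng.standard_normal(LATENT_DIM)
    z = mean + np.exp(0.5 * log_var) * eps
    # KL divergence to a standard normal prior (assumed regularizer)
    kl = 0.5 * np.sum(np.exp(log_var) + mean**2 - 1.0 - log_var)
    return z, float(kl)


rng = np.random.default_rng(0)
height, width, channels = 16, 16, 3               # hypothetical tiny frames
frame_a = rng.random((height, width, channels))
frame_b = rng.random((height, width, channels))
weights = 0.01 * rng.standard_normal(
    (2 * LATENT_DIM, 2 * height * width * channels)
)
latent_action, kl_term = encode_latent_action(frame_a, frame_b, weights, rng)
```

    Because the latent is far smaller than the frame pair it summarizes, it can carry little more than the motion between the two frames, which is the intuition behind the information bottleneck.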

    Better Physics through Architecture

    DreamDojo is based on the Cosmos-Predict2.5 latent video diffusion model. It uses the WAN2.2 tokenizer, which has a temporal compression ratio of 4. The team improved the architecture with 3 key features:

    1. Relative Actions: The model uses joint deltas instead of absolute poses. This makes it easier for the model to generalize across different trajectories.
    2. Chunked Action Injection: It injects 4 consecutive actions into each latent frame. This aligns the actions with the tokenizer’s compression ratio and fixes causality confusion.
    3. Temporal Consistency Loss: A new loss function matches predicted frame velocities to ground-truth transitions. This reduces visual artifacts and keeps objects physically consistent.
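
    The three ideas above can be sketched in a few lines of NumPy. The function names, the 7-joint arm, and the choice of consecutive-step deltas and a mean-squared error are assumptions made for illustration; the article does not say how the deltas are referenced or whether the loss is computed on pixels or latents.

```python
import numpy as np

CHUNK = 4  # matches the tokenizer's temporal compression ratio in the article


def to_relative(joint_positions):
    """Absolute joint positions (T + 1, D) -> per-step joint deltas (T, D)."""
    return np.diff(joint_positions, axis=0)


def chunk_actions(actions, chunk=CHUNK):
    """Group `chunk` consecutive actions per latent frame.

    (T, D) -> (T // chunk, chunk * D); a trailing partial chunk is dropped.
    """
    steps, dim = actions.shape
    usable = steps - steps % chunk
    return actions[:usable].reshape(usable // chunk, chunk * dim)


def temporal_consistency_loss(pred_frames, true_frames):
    """MSE between predicted and ground-truth frame-to-frame differences."""
    pred_velocity = np.diff(pred_frames, axis=0)
    true_velocity = np.diff(true_frames, axis=0)
    return float(np.mean((pred_velocity - true_velocity) ** 2))


# Hypothetical 7-joint arm observed at 9 time steps, moving 0.1 per step.
joints = np.cumsum(np.full((9, 7), 0.1), axis=0)
deltas = to_relative(joints)               # shape (8, 7)
per_latent_frame = chunk_actions(deltas)   # shape (2, 28)
```

    In this sketch each latent frame is conditioned on exactly the 4 actions that produced the 4 raw frames it compresses, so no action is paired with a frame it could not have influenced.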

    Distillation for 10.81 FPS Real-Time Interaction

    A simulator is only useful if it is fast. Standard diffusion models require too many denoising steps for real-time use. The NVIDIA team used a Self Forcing distillation pipeline to address this.

    • The distillation training was conducted on 64 NVIDIA H100 GPUs.
    • The ‘student’ model reduces denoising from 35 steps down to 4 steps.
    • The final model achieves a real-time speed of 10.81 FPS.
    • It is stable for continuous rollouts of 60 seconds (600 frames).
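
    A quick arithmetic check on the figures above, using only numbers quoted in the article. The variable names are illustrative. Note that cutting denoising steps by some factor does not by itself imply the same wall-clock speedup, since other costs (tokenizer decoding, for example) remain.

```python
TEACHER_STEPS = 35      # denoising steps before distillation
STUDENT_STEPS = 4       # denoising steps after distillation
REPORTED_FPS = 10.81    # reported speed of the distilled model
ROLLOUT_SECONDS = 60    # reported stable rollout duration
ROLLOUT_FRAMES = 600    # reported stable rollout length in frames

# The student runs 8.75x fewer denoising steps per generated chunk.
step_reduction = TEACHER_STEPS / STUDENT_STEPS

# At the reported speed, 60 seconds corresponds to roughly 649 frames.
frames_at_reported_fps = REPORTED_FPS * ROLLOUT_SECONDS

# 600 frames in 60 seconds corresponds to a nominal 10 frames per second.
implied_fps = ROLLOUT_FRAMES / ROLLOUT_SECONDS
```

    The 600-frame figure therefore matches a nominal 10 FPS rollout, slightly below the 10.81 FPS peak speed.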

    Unlocking Downstream Applications

    DreamDojo’s speed and accuracy enable several advanced applications for AI engineers.

    1. Reliable Policy Evaluation

    Testing robots in the real world is risky. DreamDojo acts as a high-fidelity simulator for benchmarking.

    • Its simulated success rates show a Pearson correlation of r = 0.995 with real-world results.
    • The Mean Maximum Rank Violation (MMRV) is only 0.003.
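
    Both metrics can be computed from paired success rates, one pair per policy. The sketch below assumes the MMRV definition commonly used in simulated-evaluation work: for each policy, take the largest real-world performance gap among the policies whose ordering the simulator gets wrong, then average over policies. The article does not spell out the formula, and the success rates here are made-up placeholders, not DreamDojo results.

```python
import numpy as np


def pearson_r(sim, real):
    """Pearson correlation between simulated and real success rates."""
    sim = np.asarray(sim, dtype=float)
    real = np.asarray(real, dtype=float)
    return float(np.corrcoef(sim, real)[0, 1])


def mmrv(sim, real):
    """Mean Maximum Rank Violation (assumed definition, see lead-in).

    A pair (i, j) is violated when the simulator orders the two policies
    differently from the real world; its cost is the real-world gap.
    """
    sim = np.asarray(sim, dtype=float)
    real = np.asarray(real, dtype=float)
    real_gap = np.abs(real[:, None] - real[None, :])
    sim_less = sim[:, None] < sim[None, :]
    real_less = real[:, None] < real[None, :]
    violation = real_gap * (sim_less != real_less)
    return float(violation.max(axis=1).mean())


# Hypothetical success rates for three policies (illustrative only).
sim_success = [0.20, 0.50, 0.80]
real_success = [0.25, 0.50, 0.75]
r_value = pearson_r(sim_success, real_success)
violation_score = mmrv(sim_success, real_success)
```

    A high correlation with a near-zero MMRV means the simulator not only tracks real success rates but also ranks policies in the same order, which is what matters when it is used to choose between them.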

    2. Model-Based Planning

    Robots can use DreamDojo to ‘look ahead.’ A robot can simulate multiple action sequences and pick the best one.

    • In a fruit-packing task, this improved real-world success rates by 17%.
    • Compared to random sampling, it provided a 2x increase in success.
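
    The ‘look ahead’ loop can be sketched as sample, simulate, score, select. Everything below is a stand-in: `toy_world_model` integrates 2-D position deltas instead of generating video, candidates are drawn uniformly at random rather than proposed by a policy, and the score is distance to a hypothetical goal. It shows the control flow only, not the paper's planner.

```python
import numpy as np


def toy_world_model(state, actions):
    """Stand-in for a learned world model: integrates 2-D position deltas."""
    return state + np.cumsum(actions, axis=0)    # shape (horizon, 2)


def score(trajectory, goal):
    """Higher is better: negative distance from the final state to the goal."""
    return -float(np.linalg.norm(trajectory[-1] - goal))


def plan(state, goal, num_candidates=64, horizon=8, seed=0):
    """Simulate every candidate action sequence and return the best one."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-0.1, 0.1, size=(num_candidates, horizon, 2))
    scores = [score(toy_world_model(state, seq), goal) for seq in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best], scores


start = np.zeros(2)
goal = np.array([0.4, -0.2])                     # hypothetical target position
best_actions, best_score, all_scores = plan(start, goal)
```

    Executing `best_actions` instead of an arbitrary candidate is the step that turns a world model from an evaluator into a planner.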

    3. Live Teleoperation

    Developers can teleoperate virtual robots in real time. The NVIDIA team demonstrated this using a PICO VR controller and a local desktop with an NVIDIA RTX 5090. This allows for safe and rapid data collection.

    Summary of Model Performance

    Metric               DreamDojo-2B   DreamDojo-14B
    Physics Correctness  62.50%         73.50%
    Action Following     63.45%         72.55%
    FPS (Distilled)      10.81          N/A

    NVIDIA has released all weights, training code, and evaluation benchmarks. This open-source release allows you to post-train DreamDojo on your own robot data today.

    Key Takeaways

    • Massive Scale and Diversity: DreamDojo is pretrained on DreamDojo-HV, the largest egocentric human video dataset to date, featuring 44,711 hours of footage across 6,015 unique tasks and 9,869 scenes.
    • Unified Latent Action Proxy: To overcome the lack of action labels in human videos, the model uses continuous latent actions extracted via a spatiotemporal Transformer VAE, which serves as a hardware-agnostic control interface.
    • Optimized Training and Architecture: The model achieves high-fidelity physics and precise controllability by utilizing relative action transformations, chunked action injection, and a specialized temporal consistency loss.
    • Real-Time Performance via Distillation: Through a Self Forcing distillation pipeline, the model is accelerated to 10.81 FPS, enabling interactive applications like live teleoperation and stable, long-horizon simulations for over 1 minute.
    • Reliable for Downstream Tasks: DreamDojo functions as an accurate simulator for policy evaluation, showing a 0.995 Pearson correlation with real-world success rates, and can improve real-world performance by 17% when used for model-based planning.
