MolmoAct 2

Think in 3D, act with precision: no training required for new tasks

Open-source robotics model that reasons in 3D before acting. Handles bimanual tasks with zero fine-tuning. Up to 37x faster inference than the previous version.

More About MolmoAct 2

MolmoAct 2 is an open-source robotics foundation model that brings capable, reasoning-driven robot control into real-world environments. Built by Ai2, it outperforms proprietary alternatives on industry benchmarks while remaining fully transparent for researchers to study, extend, and deploy.

Product Highlights

  • Adaptive 3D Reasoning: MolmoAct 2-Think uses depth perception tokens with intelligent routing to reason deeply about spatial structure only when needed, improving performance without sacrificing speed.
  • 37x Faster Inference: Action-call latency drops from 6,700 ms to 180 ms for the base model (about 37x faster) or 790 ms with adaptive reasoning (about 8.5x faster), enabling near real-time robot responsiveness.
  • Bimanual Manipulation Ready: Unlike its predecessor, MolmoAct 2 includes dual-arm coordination capabilities directly in the base model—no per-task fine-tuning required.
  • Fully Open Ecosystem: Model weights, training datasets (including the 720-hour MolmoAct 2-Bimanual YAM dataset), code, and the open MolmoAct 2-FAST Tokenizer are all publicly available.
  • Embodied Reasoning Backbone: Built on Molmo 2-ER, which achieves an average score of 63.8 across 13 embodied-reasoning benchmarks, surpassing GPT-5, Gemini 2.5 Pro, and other leading systems.
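
The latency figures in the highlights can be checked with a few lines of arithmetic. The sketch below is only an illustration: the three latency values come from the list above, while the helper names and the `think_fraction` parameter (the share of action calls that take the slower adaptive-reasoning path) are hypothetical and not part of any MolmoAct 2 API.

```python
# Latency figures quoted in the highlights, in milliseconds.
PREVIOUS_MS = 6700   # previous version
BASE_MS = 180        # MolmoAct 2 base model
THINK_MS = 790       # MolmoAct 2 with adaptive reasoning


def speedup(old_ms: float, new_ms: float) -> float:
    """Return how many times faster new_ms is compared with old_ms."""
    return old_ms / new_ms


def calls_per_second(latency_ms: float) -> float:
    """Upper bound on back-to-back action calls per second at this latency."""
    return 1000.0 / latency_ms


def blended_latency_ms(think_fraction: float) -> float:
    """Average latency if think_fraction of calls use the reasoning path.

    think_fraction is a hypothetical knob for illustration; the actual
    routing behaviour of the model is not described in this listing.
    """
    if not 0.0 <= think_fraction <= 1.0:
        raise ValueError("think_fraction must be between 0 and 1")
    return think_fraction * THINK_MS + (1.0 - think_fraction) * BASE_MS


if __name__ == "__main__":
    print(f"base speedup:      {speedup(PREVIOUS_MS, BASE_MS):.1f}x")
    print(f"reasoning speedup: {speedup(PREVIOUS_MS, THINK_MS):.1f}x")
    print(f"base call rate:    {calls_per_second(BASE_MS):.2f} calls/s")
    print(f"blended (25% reasoning): {blended_latency_ms(0.25):.1f} ms")
```

The headline 37x figure corresponds to the base model. With adaptive reasoning on every call, the same calculation gives roughly 8.5x, and a mix of the two paths falls in between.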

Use Cases

  • Laboratory Automation: Deploy in wetlab environments for precise, repetitive tasks like CRISPR gene-editing workflows, sample handling, and equipment operation—tested with Stanford School of Medicine researchers.
  • Household & Service Robotics: Handle kitchen organization, table bussing, towel folding, and object manipulation in unstructured home environments without environment-specific training.
  • Research & Development: Study and extend a complete open VLA (Vision-Language-Action) pipeline, including novel adapter architectures and adaptive reasoning mechanisms.
  • Low-Cost Robot Deployment: Leverage compatibility with affordable open-source hardware like SO-100/SO-101 arms to build accessible robotics solutions.

Target Audience

Robotics researchers, AI engineers, and academic institutions seeking a transparent, high-performance foundation model for embodied AI. It also suits automation engineers in laboratories and service industries who need reliable manipulation capabilities without proprietary lock-in.