Robotics and physical AI are quickly becoming strategic necessities for manufacturing, reshoring, and even national sovereignty. They can help address labor shortages, reduce workplace accidents, and significantly boost productivity and GDP growth.
But there’s a hard bottleneck holding progress back: data. Robots don’t learn from text like chatbots do. They need real-world signals, like video, actions, sensor streams, edge cases, and failures. Historically, that data stayed locked inside labs.
This is now changing. Platforms like Hugging Face have radically lowered the friction to share robotics data, and there is a revolution hiding in plain sight. Robotics has become the fastest-growing dataset category on the platform, jumping from roughly 1,000 datasets in 2024 to about 27,000 in 2025. For comparison, text generation, the next-largest category, sits at around 5,000 for 2025. Robots are becoming a reality.
Who provides the data? While NVIDIA is usually perceived through the lens of chips and compute, its role in robotics data is becoming just as decisive. NVIDIA’s open robotics datasets are the most widely adopted globally this year, with over 9 million downloads for the datasets produced in 2025. The datasets for post-training Isaac GR00T N1 are by far the most downloaded robotics datasets on the platform, with 835k downloads last month alone (and 7.9 million over the past year), and they support post-training across multiple robot types and tasks. What once lived inside internal research pipelines is now reusable by the entire ecosystem. This is not trivial.
Other players are contributing at scale too. Shanghai AI Lab follows closely with around 7.6 million downloads, while Hugging Face itself reaches about 1.4 million. Stanford Vision and Learning Lab (SVL) reaches about 710k downloads, and AgiBot about 450k. Other key contributors include Yaak AI, AllenAI (Ai2), Physical Intelligence, and Unitree Robotics, whose datasets have been downloaded between 90k and 230k times each (all figures count only datasets that have received at least one like).
By opening high-quality, real-world robotics data, NVIDIA, Shanghai AI Lab, Hugging Face, and others are accelerating learning, benchmarking, and iteration far beyond their own platforms. In robotics, data isn’t a nice-to-have; it’s essential infrastructure.