Ant Lingbo Open-Sources 2.71TB LingBot-Depth-Dataset with 2M Real Samples Across 6 Cameras
On March 31st, Ant Lingbo Technology officially open-sourced LingBot-Depth-Dataset, a large-scale RGB-D dataset. The collection contains 3 million high-quality samples: 2 million captured in real-world environments and 1 million synthetically rendered. At 2.71TB, it spans data from six mainstream depth cameras, making it the largest open-source RGB-D dataset primarily based on real scenes. The release provides richer, more realistic data for research in embodied intelligence, spatial perception, and 3D vision.

(Image: A sample from the LingBot-Depth-Dataset. From top to bottom: the RGB image, the sensor's raw depth map, and the ground truth depth map. The dataset provides both raw and ground truth depth information, offering robust support for training and evaluating models in real-world conditions.)
Publicly available depth datasets have long been limited in scale, thin in real-world scene coverage, and tied to a single type of hardware. Many are largely synthetic and differ significantly from real sensor data in noise patterns, depth holes, and material representation, which hinders the practical deployment of models trained on them.
The LingBot-Depth-Dataset targets this data gap in spatial perception with large-scale, real-scene captures. Each sample includes an RGB image, a raw sensor depth map, and a corresponding ground truth depth map, so it can be used directly to train and evaluate depth estimation and depth completion models. It covers six popular depth cameras (Orbbec 335 and 335L, and Intel RealSense D405, D415, D435, and D455), which helps models generalize and be evaluated across diverse devices and scenarios.
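To make the role of the three components concrete, here is a minimal sketch of how a sample might be scored: it measures how much of a raw depth map is missing and compares a completed depth map against ground truth. The article does not describe the dataset's file layout, units, or invalid-pixel convention, so the nested-list representation, depths in metres, the use of 0.0 for missing pixels, and the function names `depth_metrics` and `hole_ratio` are all assumptions made for illustration. RMSE and absolute relative error are common depth metrics, not necessarily the ones used by the LingBot-Depth authors.

```python
import math


def depth_metrics(pred, gt, min_depth=1e-3):
    """RMSE and mean absolute relative error over pixels with valid ground truth.

    pred and gt are same-shaped 2D lists of depths (assumed to be in metres).
    Treating values at or below min_depth as missing is an assumption of this
    sketch, not a documented convention of the dataset.
    """
    sq_sum = 0.0
    rel_sum = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g <= min_depth:
                continue  # skip pixels without ground truth
            err = p - g
            sq_sum += err * err
            rel_sum += abs(err) / g
            count += 1
    if count == 0:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sq_sum / count), rel_sum / count


def hole_ratio(depth, min_depth=1e-3):
    """Fraction of pixels with no usable depth reading."""
    values = [d for row in depth for d in row]
    return sum(1 for d in values if d <= min_depth) / len(values)


# Made-up 2x2 example, not real dataset values.
raw = [[1.0, 0.0], [2.2, 0.0]]    # raw sensor depth; 0.0 marks a hole
pred = [[1.0, 1.4], [2.2, 0.7]]   # hypothetical completed depth from a model
gt = [[1.0, 1.5], [2.0, 0.0]]     # ground truth; 0.0 marks a missing pixel

rmse, abs_rel = depth_metrics(pred, gt)
holes = hole_ratio(raw)
```

In practice the maps would be loaded as arrays from image files, and the same masking logic would apply: only pixels with valid ground truth contribute to the error.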
Ant Lingbo's previously open-sourced high-precision spatial perception model, LingBot-Depth, was trained with this dataset as its core data. Compared to mainstream industry methods such as PromptDA and PriorDA, LingBot-Depth reduces depth prediction error by over 70% in indoor scenes and by approximately 47% in sparse depth completion tasks. Once deployed, the model lets commercial depth cameras produce more complete, smoother, and sharper depth maps in challenging conditions such as transparent glass, reflective surfaces, and backlighting, all without hardware modifications. In certain scenarios, its performance rivals that of premium industrial-grade depth cameras.
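A statement like "reduces error by over 70%" is usually read as a relative reduction against a baseline's error. The short sketch below shows that arithmetic. The article does not say which error metric or baseline values underlie the reported figures, so the numbers here are purely hypothetical and the function name `relative_reduction` is an illustrative choice.

```python
def relative_reduction(baseline_error, new_error):
    """Fractional error reduction of a new method relative to a baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical errors for illustration only; not figures from the article.
baseline = 0.10   # e.g. a baseline method's error on some metric
improved = 0.03   # e.g. a new method's error on the same metric
reduction = relative_reduction(baseline, improved)
print(f"{reduction:.0%}")
```

With these made-up inputs the new error is 30% of the baseline, which corresponds to a 70% reduction.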
For academic and research institutions, this open-source initiative lowers the barriers to data collection and annotation, accelerating the transition of spatial perception technologies from research to real-world application. As robotics and embodied intelligence rapidly integrate into physical environments, large-scale, high-quality datasets grounded in real-world data will become essential infrastructure for driving continued industry progress.






