Xinyue Zhao, Quanzhi Li, Yue Chao, Quanyou Wang, Zaixing He*
- A Multi-Scene Image Dataset for Pose Estimation
- Large, unique public dataset: RTL contains 258K real and synthetic images of reflective texture-less metal parts.
- Multiple scenes: The scene setup varies many factors to simulate real scenes and provide varying levels of difficulty.
- Industrial multi-view acquisition: Camera placement uses the eye-in-hand method to simulate a real industrial view.
- Ground truth pose & bounding-boxes: Accurate annotations for each object are provided.
- Various CAD models: CAD models are provided in three formats to assist training.
38 reflective texture-less machined parts with typical industrial features.
32 scenes simulate real scenes in terms of part placement, part type and shape, lighting, background, etc.
Training modes
- Use CAD models only without real images for training.
- Use real images for training (CAD models are optional).
Testing modes
- Use the object detection simulation module to test the pose estimation method alone.
- Use the built-in real object detection module to test the object detection and pose estimation methods as a whole.
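The first testing mode isolates pose estimation from detection errors. As a rough illustration only, one plausible way to simulate a detector is to jitter the ground-truth bounding box before passing it to the pose estimator. The function name `simulate_detection`, the `(x_min, y_min, x_max, y_max)` box layout, and the `max_shift` parameter are assumptions made for this sketch; the actual toolbox module may work differently.

```python
import random


def simulate_detection(gt_box, image_size, max_shift=0.0, rng=None):
    """Return a jittered copy of a ground-truth box (hypothetical helper).

    gt_box:     (x_min, y_min, x_max, y_max) in pixels (assumed layout).
    image_size: (width, height) in pixels.
    max_shift:  maximum edge displacement as a fraction of the box size;
                keep it below 0.5 so opposite edges cannot cross.
    rng:        optional random.Random instance for reproducibility.
    """
    if rng is None:
        rng = random.Random()
    x0, y0, x1, y1 = gt_box
    box_w, box_h = x1 - x0, y1 - y0
    width, height = image_size

    def jitter(value, scale):
        # Move one edge by a random fraction of the box width or height.
        return value + rng.uniform(-max_shift, max_shift) * scale

    def clip(value, upper):
        # Keep the coordinate inside the image.
        return min(max(value, 0.0), upper)

    return (
        clip(jitter(x0, box_w), width),
        clip(jitter(y0, box_h), height),
        clip(jitter(x1, box_w), width),
        clip(jitter(y1, box_h), height),
    )
```

With `max_shift=0.0` the function returns the ground-truth box unchanged, which corresponds to evaluating pose estimation under perfect detection; larger values emulate an increasingly imprecise detector.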
Scene groups
- Scenes with few occlusions and little clutter: 1, 3, 7, 11, 13, 15, 16, 20, 24, 28, 29, 31, 32
- Scenes with slight occlusion and little clutter: 6, 5, 12, 15, 17, 22, 24, 25, 26, 27
- Scenes with severe occlusion and clutter: 2, 4, 8, 9, 10, 14, 18, 19, 21, 23, 27, 30
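When evaluating per difficulty level, it can help to keep the scene groups in a lookup table. The sketch below copies the scene numbers verbatim from the list above; the dictionary keys and function names are hypothetical and not part of the dataset toolbox.

```python
# Scene numbers copied verbatim from the scene group list; key names are illustrative.
SCENE_GROUPS = {
    "few_occlusion_few_clutter": [1, 3, 7, 11, 13, 15, 16, 20, 24, 28, 29, 31, 32],
    "slight_occlusion_few_clutter": [6, 5, 12, 15, 17, 22, 24, 25, 26, 27],
    "severe_occlusion_clutter": [2, 4, 8, 9, 10, 14, 18, 19, 21, 23, 27, 30],
}


def groups_for_scene(scene_id):
    """Return the names of all groups that list the given scene."""
    return [name for name, scenes in SCENE_GROUPS.items() if scene_id in scenes]


def scenes_in_multiple_groups(num_scenes=32):
    """Return the scenes that are listed under more than one group."""
    return sorted(
        scene for scene in range(1, num_scenes + 1)
        if len(groups_for_scene(scene)) > 1
    )
```

As listed, every scene from 1 to 32 belongs to at least one group, and scenes 15, 24, and 27 each appear in two groups, so per-group results should not be summed without accounting for that overlap.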
Toolbox

