This project, jointly developed by Xi'an Jiaotong University and Huawei, provides a complete pipeline for generating editable 3D Gaussian reconstructions of entire manipulation scenes. The reconstructed scenes are rendered from fixed camera perspectives and used to train Vision-Language-Action (VLA) models. By leveraging the editability of Gaussian splats, the pipeline can augment a single collected demonstration into a virtually unlimited number of variations by modifying object pose, lighting, object type, and scene background.
Clone the repository and set up the environment:

```bash
git clone https://github.com/XJTU-RoboLab/Scene_Reconstruction.git --recursive
cd Scene_Reconstruction
conda env create --file environment.yml
conda activate gaussian_splatting
```

Extract frames from the recorded video at 10 fps:

```bash
ffmpeg -i your_video_dir -vf fps=10 Scene_Reconstruction/data/input/image%d.png
cd Scene_Reconstruction
```
Run COLMAP conversion on the extracted frames:

```bash
python convert.py -s Scene_Reconstruction/data --resize
```

Compress the `align/` directory to `align.zip` and upload it to the working directory before running the following commands.
Install the latest `transformers` from source and generate monocular depth maps for the input frames:

```bash
pip install git+https://github.com/huggingface/transformers.git
python Scene_Reconstruction/depth_gen.py --input_path /dev/shm/GS_data/data/input
```
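The README does not show `depth_gen.py` itself; the sketch below illustrates what per-frame monocular depth generation can look like with a source install of `transformers` (the model checkpoint and output directory are assumptions, not necessarily what the script uses):

```python
from pathlib import Path

from PIL import Image
from transformers import pipeline

# Checkpoint name is illustrative; depth_gen.py may use a different model.
depth = pipeline("depth-estimation", model="depth-anything/Depth-Anything-V2-Small-hf")

input_dir = Path("/dev/shm/GS_data/data/input")
out_dir = input_dir.parent / "depth"   # assumed output location
out_dir.mkdir(exist_ok=True)

for img_path in sorted(input_dir.glob("*.png")):
    result = depth(Image.open(img_path))      # {"depth": PIL.Image, "predicted_depth": tensor}
    result["depth"].save(out_dir / img_path.name)
```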
Compute per-image depth scaling parameters, then train:

```bash
python utils/make_depth_scale.py \
    --base_dir data/ \
    --depths_dir data/depth
```

```bash
python train.py -s data \
    -m output/data \
    -d data/depth \
    -r 1
```

Replace `data` with your scene folder if needed. The `-r` option controls the downsampling factor for the training resolution.
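Conceptually, `make_depth_scale.py` fits a per-image scale and offset that aligns each monocular depth map with the sparse COLMAP geometry. The snippet below is a minimal least-squares sketch of that idea; the function and variable names are ours, and the actual script's details may differ:

```python
import numpy as np

def fit_depth_scale(mono_inv_depth, sparse_depth, pix_xy):
    """Fit scale/offset mapping monocular inverse depth to COLMAP inverse depth.

    mono_inv_depth: (H, W) inverse depth predicted by the monocular network.
    sparse_depth:   (N,) metric depths of COLMAP sparse points in this view.
    pix_xy:         (N, 2) integer pixel coordinates of those points.
    """
    pred = mono_inv_depth[pix_xy[:, 1], pix_xy[:, 0]]   # predicted inverse depth at sparse points
    target = 1.0 / sparse_depth                         # COLMAP inverse depth
    A = np.stack([pred, np.ones_like(pred)], axis=1)    # design matrix [pred, 1]
    (scale, offset), *_ = np.linalg.lstsq(A, target, rcond=None)
    return scale, offset

# scale * mono_inv_depth + offset then approximates the scene's inverse depth.
```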
Visualize the trained scene with the SIBR viewer:

```bash
cd Scene_Reconstruction/SIBR_viewers/install/bin
./SIBR_gaussianViewer_app -m Scene_Reconstruction/output/scene
```

Use the online SuperSplat editor to segment the Gaussian point cloud into meaningful components:
- Robot arm
- Table surface
- Manipulated object
Editor URL: https://superspl.at/editor/
Save the segmented outputs to `Scene_Reconstruction/point_cloud/`.
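Each segmented component is an ordinary 3DGS `.ply` file. As a quick sanity check, you can confirm that the expected Gaussian attributes survived the export; this is a minimal sketch using the `plyfile` package, with the file name following the `fr3.ply` example used in the alignment step below:

```python
from plyfile import PlyData

# Illustrative path; use whichever segmented component you exported.
ply = PlyData.read("Scene_Reconstruction/point_cloud/fr3.ply")
vertex = ply["vertex"]

print(f"{vertex.count} Gaussians")
# Standard 3DGS attributes: positions, opacity, scales, rotations, SH coefficients.
for name in ("x", "y", "z", "opacity", "scale_0", "rot_0", "f_dc_0"):
    assert name in vertex.data.dtype.names, f"missing attribute: {name}"
print("all expected 3DGS attributes present")
```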
Compute a coarse alignment matrix, then refine it with ICP:

```bash
cd align
python initialize_matrix.py --gs_path Scene_Reconstruction/point_cloud/fr3.ply
python icp_alignment.py
```

- Inspect the visual output to verify alignment.
- Use the matrix generated by `initialize_matrix.py` as the initial value for `trans_init` in `icp_alignment.py`.
- Also update the initial pose `initpose` in `scene_target.py` accordingly.
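For reference, a point-to-point ICP step seeded with `trans_init` takes only a few lines in Open3D. This is a generic sketch of the technique, not the contents of `icp_alignment.py`; the paths and the correspondence threshold are illustrative:

```python
import numpy as np
import open3d as o3d

# Illustrative inputs: Gaussian centers exported as a point cloud, and the
# reference geometry to align against.
source = o3d.io.read_point_cloud("point_cloud/fr3_points.ply")
target = o3d.io.read_point_cloud("point_cloud/robot_reference_points.ply")

# The coarse matrix from initialize_matrix.py would be pasted in here.
trans_init = np.eye(4)

result = o3d.pipelines.registration.registration_icp(
    source, target,
    max_correspondence_distance=0.02,   # illustrative threshold, in meters
    init=trans_init,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(),
)
print("fitness:", result.fitness)
print("refined transform:\n", result.transformation)
```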
Use a small neural network to refine the coarse alignment.
```bash
python align/network_alignment.py --gs_path gaussian-splatting/point_cloud/fr3.ply
```

The alignment network is defined in `network_alignment.py`, with the optimized result stored in:

```python
self.transformed_matrix = nn.Parameter([xyz + xyzw])
```

The optimized pose is loaded from the saved `.pth` checkpoint and assigned to `self.transformed_matrix`.
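As a rough illustration of this style of refinement (a generic sketch with invented names, not the actual contents of `network_alignment.py`), a 7-DoF pose parameter of translation `xyz` plus quaternion `xyzw` can be optimized by minimizing the distance between transformed Gaussian centers and matched target points:

```python
import torch
import torch.nn as nn

class PoseRefiner(nn.Module):
    """Learnable rigid transform: translation xyz + unit quaternion xyzw."""

    def __init__(self):
        super().__init__()
        # [tx, ty, tz, qx, qy, qz, qw], initialized to the identity pose.
        self.pose = nn.Parameter(torch.tensor([0., 0., 0., 0., 0., 0., 1.]))

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        t, q = self.pose[:3], self.pose[3:]
        q = q / q.norm()                       # keep the quaternion normalized
        x, y, z, w = q
        # Quaternion -> rotation matrix.
        R = torch.stack([
            torch.stack([1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)]),
            torch.stack([2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)]),
            torch.stack([2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)]),
        ])
        return points @ R.T + t

refiner = PoseRefiner()
optim = torch.optim.Adam(refiner.parameters(), lr=1e-3)
# gaussians, targets: (N, 3) matched source/target points (random stand-ins here).
gaussians, targets = torch.randn(100, 3), torch.randn(100, 3)
for _ in range(200):
    loss = (refiner(gaussians) - targets).pow(2).mean()
    optim.zero_grad()
    loss.backward()
    optim.step()
```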
Transform the separated Gaussian point clouds (robot, table, object) using the optimized alignment matrix:
```bash
python align/utils_2.py
python align/obj_posed.py
```

Split the robot into individual links using URDF information:

```bash
python get_link_mesh.py
```

Each link mesh will be saved with its local origin set to `[0, 0, 0]`.
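Applying the optimized 4×4 matrix to a component reduces to a homogeneous transform of the Gaussian centers. The numpy sketch below (with placeholder inputs) shows the position part only; a full splat transform must also rotate each Gaussian's orientation quaternion and covariance, which this sketch omits:

```python
import numpy as np

def transform_points(points: np.ndarray, T: np.ndarray) -> np.ndarray:
    """Apply a 4x4 homogeneous transform T to (N, 3) Gaussian centers."""
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)  # (N, 4)
    return (homo @ T.T)[:, :3]

# Illustrative usage: T comes from the alignment stage above.
T = np.eye(4)                        # placeholder for the optimized matrix
centers = np.random.rand(1000, 3)    # stand-in for segmented splat centers
aligned = transform_points(centers, T)
```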
Capture one full motion trajectory including:
- `ee_pose`: end-effector pose
- `q_pos`: joint angles
- `gripper`: gripper open/close status
Trajectory data should be stored in `.h5` format. Key frame indices are set in `generate_demo.py`:

```bash
python generate_demo.py
```
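A demonstration file with this layout can be inspected with `h5py`; the dataset keys below mirror the field names listed above but are otherwise an assumption about the file structure:

```python
import h5py

with h5py.File("data/source_demo/real_000000.h5", "r") as f:
    f.visit(print)                      # list every group/dataset in the file
    # Assumed layout based on the fields described above:
    ee_pose = f["ee_pose"][:]           # (T, 7) end-effector poses, e.g. xyz + quaternion
    q_pos = f["q_pos"][:]               # (T, num_joints) joint angles
    gripper = f["gripper"][:]           # (T,) gripper open/close status
    print(ee_pose.shape, q_pos.shape, gripper.shape)
```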
Ensure all required calibration inputs are available in the target folder, then set up RoboSplat:

```bash
git clone https://github.com/OpenRobotLab/RoboSplat.git
conda create -n robosplat python=3.10 -y
conda activate robosplat
cd RoboSplat
pip install -r requirements.txt
```

Run the augmentation:

```bash
python data_aug/generate_demo.py \
--image_size 224 \
--save True \
--save_video True \
--ref_demo_path data/source_demo/real_000000.h5 \
--xy_step_str '[10, 10]' \
--augment_lighting False \
--augment_appearance False \
--augment_camera_pose False \
    --output_path data/generated_demo/pick_100
```
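The `--xy_step_str '[10, 10]'` flag together with the `pick_100` output name suggests a 10×10 grid of object XY placements, one generated demo per grid cell; this is our reading of the flag rather than documented behavior. Generating such a grid is straightforward:

```python
import numpy as np

# Assumed semantics of --xy_step_str '[10, 10]': a 10x10 grid of object
# XY placements inside the workspace bounds (bounds below are illustrative).
xy_steps = [10, 10]
x_range, y_range = (-0.10, 0.10), (-0.10, 0.10)   # meters

xs = np.linspace(*x_range, xy_steps[0])
ys = np.linspace(*y_range, xy_steps[1])
offsets = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
print(offsets.shape)   # (100, 2) -> one generated demo per offset
```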
- The `align` folder will be open-sourced after the project's completion, scheduled for February 22, 2026.
- Clean and modularize the RoboSplat codebase.
- Improve rendering quality.
- Support object category replacement and reconstruction.
This work builds upon the open-source contributions of:
- 3D Gaussian Splatting: https://github.com/graphdeco-inria/gaussian-splatting
- RoboSplat: https://github.com/OpenRobotLab/RoboSplat
- SuperSplat: https://superspl.at/editor/