VGGSfM
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
VGGSfM is an advanced structure-from-motion (SfM) framework jointly developed by Meta AI Research (GenAI) and the University of Oxford’s Visual Geometry Group (VGG). It reconstructs 3D geometry, dense depth, and camera poses directly from unordered or sequential images and videos. The system combines learned feature matching and geometric optimization to generate high-quality camera calibrations, sparse/dense point clouds, and depth maps in standard COLMAP format. Version 2.0 adds support for dynamic scene handling, dense point cloud export, video-based reconstruction (1000+ frames), and integration with Gaussian Splatting pipelines. It leverages tools like PyCOLMAP, poselib, LightGlue, and PyTorch3D for feature matching, pose estimation, and visualization. With minimal configuration, users can process single scenes or full video sequences, apply motion masks to exclude moving objects, and train neural radiance or splatting models directly from reconstructed outputs.