MV-SAM3D: Adaptive Multi-View 3D Reconstruction

An enhanced multi-view extension for SAM 3D Objects, featuring adaptive fusion strategies for improved 3D reconstruction quality from multiple viewpoints.

🔗 Basic Version: For a simpler averaging-based approach, check out our basic multi-view fork.

🔥 Highlights

- Adaptive Multi-View Fusion: Instead of simple averaging, a confidence-aware fusion mechanism automatically weights the contribution of each view by its estimated reliability.
- Multiple Weighting Strategies:
  - Entropy-based: Uses attention entropy as an uncertainty measure
  - Visibility-based: Detects self-occlusion via DDA ray tracing
  - Mixed: Combines both strategies for more robust weighting
- Per-Latent Weighting: Weights are computed at the latent level, giving fine-grained control over how information from different views is combined at each spatial location.
- Improved Reconstruction Quality: Per-view weighting handles occluded regions and view-dependent artifacts better than uniform averaging.
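
As a rough illustration of per-latent weighting, the sketch below fuses latents from several views using one weight per view and per latent location, rather than a single global average. The function name `fuse_latents` and the nested-list layout are assumptions made for this example; they are not taken from the repository, whose implementation may differ.

```python
from typing import List

Vector = List[float]


def fuse_latents(latents: List[List[Vector]], weights: List[List[float]]) -> List[Vector]:
    """Fuse per-view latents with per-view, per-location weights.

    latents[v][i] is the feature vector of view v at latent location i.
    weights[v][i] is the non-negative confidence of view v at location i.
    Weights are normalized over views separately for every location.
    (Hypothetical helper, for illustration only.)
    """
    num_views = len(latents)
    num_locations = len(latents[0])
    fused = []
    for i in range(num_locations):
        total = sum(weights[v][i] for v in range(num_views))
        if total <= 0.0:
            # No view is trusted at this location: fall back to a plain average.
            norm = [1.0 / num_views] * num_views
        else:
            norm = [weights[v][i] / total for v in range(num_views)]
        dim = len(latents[0][i])
        fused.append([
            sum(norm[v] * latents[v][i][d] for v in range(num_views))
            for d in range(dim)
        ])
    return fused


latents = [
    [[1.0, 0.0], [2.0, 2.0]],  # view 0: two latent locations, two channels each
    [[3.0, 4.0], [0.0, 0.0]],  # view 1
]
weights = [
    [1.0, 3.0],  # view 0 is trusted more at location 1
    [1.0, 1.0],
]
fused = fuse_latents(latents, weights)
```

At location 0 both views have equal weight, so the result is their average; at location 1 view 0 contributes three quarters of the fused value.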

📢 Research in Progress

This is an active research project. We are continuously exploring new fusion strategies and will release updates as our research progresses.

Results Comparison

[Figure: side-by-side comparison of Single-View (View 3), Single-View (View 6), and Multi-View (Adaptive Fusion). Each column shows the input image(s) and the resulting 3D reconstruction.]

Installation

Please follow the installation instructions in the basic multi-view version or in SAM 3D Objects.

Quick Start

Basic Usage (Both Stages Weighted by Default)

python run_inference_weighted.py \
    --input_path ./data/example \
    --mask_prompt stuffed_toy \
    --image_names 0,1,2,3,4,5,6,7

Disable All Weighting (Simple Average)

python run_inference_weighted.py \
    --input_path ./data/example \
    --mask_prompt stuffed_toy \
    --image_names 0,1,2,3,4,5,6,7 \
    --no_stage1_weighting --no_stage2_weighting

Using Visibility Weighting (Requires DA3)

To use visibility-based weighting for Stage 2, you need to first run Depth Anything 3 (DA3) to obtain camera poses:

Step 1: Install Depth Anything 3

Please follow the installation instructions at Depth Anything 3.

Step 2: Run DA3 to get camera poses

python scripts/run_da3.py \
    --image_dir ./data/example/images \
    --output_dir ./da3_outputs/example

Step 3: Run weighted inference with visibility

python run_inference_weighted.py \
    --input_path ./data/example \
    --mask_prompt stuffed_toy \
    --image_names 0,1,2,3,4,5,6,7 \
    --da3_output ./da3_outputs/example/da3_output.npz \
    --stage2_weight_source visibility
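
For batch runs, Steps 2 and 3 can be chained from Python. The sketch below only assembles the two commands shown above and runs them in order; the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes that `scripts/run_da3.py` writes `da3_output.npz` into the directory given by `--output_dir`, as the paths in Step 3 suggest.

```python
import subprocess
from pathlib import Path
from typing import List


def build_commands(input_path: str, mask_prompt: str, image_names: List[str],
                   da3_dir: str) -> List[List[str]]:
    """Build the DA3 and weighted-inference commands from Steps 2 and 3."""
    da3_cmd = [
        "python", "scripts/run_da3.py",
        "--image_dir", str(Path(input_path) / "images"),
        "--output_dir", da3_dir,
    ]
    infer_cmd = [
        "python", "run_inference_weighted.py",
        "--input_path", input_path,
        "--mask_prompt", mask_prompt,
        "--image_names", ",".join(image_names),
        # Assumed output file name, taken from the Step 3 example.
        "--da3_output", str(Path(da3_dir) / "da3_output.npz"),
        "--stage2_weight_source", "visibility",
    ]
    return [da3_cmd, infer_cmd]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_commands(
    "./data/example", "stuffed_toy", [str(i) for i in range(8)], "./da3_outputs/example"
)
# run_pipeline(commands)  # uncomment to execute from the repository root
```

Passing the arguments as lists avoids shell quoting issues, and `check=True` prevents inference from starting if the DA3 step fails.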

Key Parameters

Basic

Parameter	Description	Default
`--input_path`	Path to input directory	Required
`--mask_prompt`	Mask folder name	None
`--image_names`	Image names (comma-separated)	All images
`--da3_output`	Path to DA3 output (for visibility weighting)	None

Stage 1 (Shape) Weighting

Parameter	Description	Default
`--no_stage1_weighting`	Disable Stage 1 weighting	False (weighting enabled)
`--stage1_entropy_layer`	Attention layer for weight computation	9
`--stage1_entropy_alpha`	Entropy weighting sharpness	30.0
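
One plausible reading of the entropy parameters: a view whose attention is spread out (high entropy) is treated as less certain and receives a smaller weight, while the alpha value controls how sharply the weights favor low-entropy views. The sketch below assumes a softmax over negative entropy. Both the formula and the helper names are assumptions for illustration, not the repository's confirmed implementation.

```python
import math
from typing import List


def attention_entropy(probs: List[float]) -> float:
    """Shannon entropy (in nats) of one attention distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_weights(entropies: List[float], alpha: float = 30.0) -> List[float]:
    """Map per-view entropies to weights that sum to 1.

    Lower entropy gives a larger weight, and a larger alpha makes the
    distribution sharper. alpha = 0 reduces to a simple average.
    (Assumed formula: softmax of -alpha * entropy over views.)
    """
    scores = [-alpha * h for h in entropies]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


weights = entropy_weights([0.1, 0.2])
```

Under this assumed formula and the default alpha of 30.0, an entropy gap of 0.1 between two views already gives the more confident view about 95% of the weight.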

Stage 2 (Texture) Weighting

Parameter	Description	Default
`--no_stage2_weighting`	Disable Stage 2 weighting	False (weighting enabled)
`--stage2_weight_source`	`entropy`, `visibility`, or `mixed`	`entropy`
`--stage2_entropy_alpha`	Entropy weighting sharpness	30.0
`--stage2_visibility_alpha`	Visibility weighting sharpness	30.0
`--stage2_attention_layer`	Attention layer for weight computation	6
`--self_occlusion_tolerance`	Tolerance for visibility detection	4.0
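
To make the visibility option more concrete, here is a minimal sketch of self-occlusion detection with a DDA-style voxel traversal (the classic grid-stepping scheme). It assumes a unit voxel grid, a camera position expressed in grid coordinates, and that the tolerance is a distance in voxels within which occupied cells near the target are ignored. These are all assumptions for illustration; the repository's actual visibility test and the exact meaning of `--self_occlusion_tolerance` may differ.

```python
import math
from typing import Iterator, Set, Tuple

Voxel = Tuple[int, int, int]
Point = Tuple[float, float, float]


def dda_voxels(origin: Point, target: Point) -> Iterator[Voxel]:
    """Yield the unit voxels crossed by the segment origin -> target."""
    direction = [t - o for o, t in zip(origin, target)]
    voxel = [math.floor(c) for c in origin]
    end = [math.floor(c) for c in target]
    step, t_max, t_delta = [], [], []
    for axis in range(3):
        d = direction[axis]
        if d > 0:
            step.append(1)
            t_max.append((voxel[axis] + 1 - origin[axis]) / d)
            t_delta.append(1.0 / d)
        elif d < 0:
            step.append(-1)
            t_max.append((voxel[axis] - origin[axis]) / d)
            t_delta.append(-1.0 / d)
        else:
            step.append(0)
            t_max.append(math.inf)
            t_delta.append(math.inf)
    yield tuple(voxel)
    while voxel != end:
        axis = t_max.index(min(t_max))  # axis whose boundary is crossed next
        if t_max[axis] > 1.0:
            break  # the next boundary lies beyond the end of the segment
        voxel[axis] += step[axis]
        t_max[axis] += t_delta[axis]
        yield tuple(voxel)


def is_visible(camera: Point, target: Voxel, occupied: Set[Voxel],
               tolerance: float = 4.0) -> bool:
    """Return True if no occupied voxel blocks the ray from camera to target.

    Occupied voxels within `tolerance` voxels of the target are ignored so
    that the surface the target belongs to does not occlude itself.
    """
    center = tuple(c + 0.5 for c in target)
    for voxel in dda_voxels(camera, center):
        if voxel == target:
            return True
        if voxel in occupied and math.dist(voxel, target) > tolerance:
            return False
    return True


camera = (10.5, 0.5, 0.5)
blocked = is_visible(camera, (0, 0, 0), {(0, 0, 0), (6, 0, 0)})
nearby = is_visible(camera, (0, 0, 0), {(0, 0, 0), (3, 0, 0)})
```

In this toy setup, an occupied voxel six cells in front of the target blocks the view, while one three cells away falls inside the default tolerance of 4.0 and is ignored.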

Visualization

Parameter	Description	Default
`--visualize_weights`	Visualize fusion weights	False
`--compute_latent_visibility`	Visualize latent visibility per view	False
`--overlay_pointmap`	Overlay result on View 0 pointmap	False
`--merge_da3_glb`	Merge result with DA3 scene	False

📖 Full Parameters: See README_PARAMETERS.md for detailed parameter documentation.

Data Structure

input_path/
├── images/
│   ├── 0.png
│   ├── 1.png
│   └── ...
└── object_name/  # mask folder
    ├── 0.png
    ├── 1.png
    └── ...

Mask Format: RGBA images in which the alpha channel stores the mask (alpha=255 for object, alpha=0 for background).
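
The layout above can be checked before running inference. The sketch below assumes that each mask shares its file name with the corresponding image, as in the tree, and that the mask folder is the one passed via `--mask_prompt`. The helper `missing_masks` is hypothetical and does not inspect the alpha channel.

```python
from pathlib import Path
from typing import List


def missing_masks(input_path: str, mask_prompt: str) -> List[str]:
    """Return image file names under images/ that have no mask of the same name."""
    root = Path(input_path)
    image_dir = root / "images"
    mask_dir = root / mask_prompt
    if not image_dir.is_dir():
        raise FileNotFoundError(f"missing image folder: {image_dir}")
    if not mask_dir.is_dir():
        raise FileNotFoundError(f"missing mask folder: {mask_dir}")
    mask_names = {p.name for p in mask_dir.glob("*.png")}
    return sorted(p.name for p in image_dir.glob("*.png") if p.name not in mask_names)
```

For example, `missing_masks("./data/example", "stuffed_toy")` returns an empty list when every image has a matching mask.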

Acknowledgments

This project builds upon SAM 3D Objects by Meta. We thank the original authors for their excellent work.

License

This project inherits the SAM License from the original SAM 3D Objects project.
