
Commit b297189

Bump version, add changelog
Also updated some parts of the README. Other parts still need updating.
1 parent 164152c

5 files changed (+37, -34)

CHANGELOG.md (5 additions, 0 deletions)

@@ -1,4 +1,9 @@
 ## Changelog
+### 0.4.3 video processing tab
+* Added an option to process videos directly from a video file. This leads to better results than batch-processing the individual frames of a video, and allows generating depthmap videos that can then be used as custom depthmap videos in further generations.
+* UI improvements.
+* Extra stereoimage generation modes - enable them in the extension settings if you want to use them.
+* New stereoimage generation parameter - offset exponent. Setting it to 1 may produce more realistic outputs.
 ### 0.4.2
 * Added UI options for 2 additional rembg models.
 * Heatmap generation UI option is hidden - if you want to use it, please activate it in the extension settings.

README.md (25 additions, 27 deletions)

@@ -1,13 +1,13 @@
 # High Resolution Depth Maps for Stable Diffusion WebUI
-This script is an addon for [AUTOMATIC1111's Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) that creates `depth maps`, and now also `3D stereo image pairs` as side-by-side or anaglyph from a single image. The result can be viewed on 3D or holographic devices like VR headsets or [Looking Glass](https://lookingglassfactory.com/) displays, used in Render- or Game- Engines on a plane with a displacement modifier, and maybe even 3D printed.
+This program is an addon for [AUTOMATIC1111's Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) that creates depth maps. Using either generated or custom depth maps, it can also create 3D stereo image pairs (as side-by-side or anaglyph), normalmaps and 3D meshes. The outputs can be viewed directly or used as assets for a 3D engine; see the [wiki](https://github.com/thygate/stable-diffusion-webui-depthmap-script/wiki/Viewing-Results) to learn more. The program integrates with [Rembg](https://github.com/danielgatis/rembg), supports batch processing and video processing, and can also run in standalone mode, without Stable Diffusion WebUI.

-To generate realistic depth maps `from a single image`, this script uses code and models from the [MiDaS](https://github.com/isl-org/MiDaS) and [ZoeDepth](https://github.com/isl-org/ZoeDepth) repositories by Intel ISL, or LeReS from the [AdelaiDepth](https://github.com/aim-uofa/AdelaiDepth) repository by Advanced Intelligent Machines. Multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) is used to generate high resolution depth maps.
+To generate realistic depth maps from individual images, this script uses code and models from the [MiDaS](https://github.com/isl-org/MiDaS) and [ZoeDepth](https://github.com/isl-org/ZoeDepth) repositories by Intel ISL, or LeReS from the [AdelaiDepth](https://github.com/aim-uofa/AdelaiDepth) repository by Advanced Intelligent Machines. Multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) is used to generate high-resolution depth maps.

-3D stereo, and red/cyan anaglyph images are generated using code from the [stereo-image-generation](https://github.com/m5823779/stereo-image-generation) repository. Thanks to [@sina-masoud-ansari](https://github.com/sina-masoud-ansari) for the tip! Discussion [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/discussions/45). Improved techniques for generating stereo images and balancing distortion between eyes by [@semjon00](https://github.com/semjon00), see [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/pull/51) and [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/pull/56).
+Stereoscopic images are created using a custom-written algorithm.

-3D Photography using Context-aware Layered Depth Inpainting by Virginia Tech Vision and Learning Lab , or [3D-Photo-Inpainting](https://github.com/vt-vl-lab/3d-photo-inpainting) is used to generate a `3D inpainted mesh` and render `videos` from said mesh.
+3D Photography using Context-aware Layered Depth Inpainting by Virginia Tech Vision and Learning Lab, or [3D-Photo-Inpainting](https://github.com/vt-vl-lab/3d-photo-inpainting), is used to generate a `3D inpainted mesh` and render `videos` from said mesh.

-[Rembg](https://github.com/danielgatis/rembg) by [@DanielGatis](https://github.com/danielgatis) support added by [@graemeniedermayer](https://github.com/graemeniedermayer), using [U-2-Net](https://github.com/xuebinqin/U-2-Net) by [@xuebinqin](https://github.com/xuebinqin) to remove backgrounds.
+Rembg uses [U-2-Net](https://github.com/xuebinqin/U-2-Net) and [IS-Net](https://github.com/xuebinqin/DIS).

 ## Depthmap Examples
 [![screenshot](examples.png)](https://raw.githubusercontent.com/thygate/stable-diffusion-webui-depthmap-script/main/examples.png)
@@ -20,32 +20,30 @@ video by [@graemeniedermayer](https://github.com/graemeniedermayer), more exampl
 ![](https://user-images.githubusercontent.com/54073010/210012661-ef07986c-2320-4700-bc54-fad3899f0186.png)
 images generated by [@semjon00](https://github.com/semjon00) from CC0 photos, more examples [here](https://github.com/thygate/stable-diffusion-webui-depthmap-script/pull/56#issuecomment-1367596463).

-
 ## Install instructions
-The script is now also available to install from the `Available` subtab under the `Extensions` tab in the WebUI.
+### As extension
+The extension can be installed directly from the WebUI. Navigate to the `Extensions` tab, click `Available`, then `Load from`, and install the `Depth Maps` extension. Alternatively, the extension can be installed from URL: `https://github.com/thygate/stable-diffusion-webui-depthmap-script`.

 ### Updating
 In the WebUI, in the `Extensions` tab, in the `Installed` subtab, click `Check for Updates` and then `Apply and restart UI`.

-### Automatic installation
-In the WebUI, in the `Extensions` tab, in the `Install from URL` subtab, enter this repository
-`https://github.com/thygate/stable-diffusion-webui-depthmap-script`
-and click install and restart.
+### Standalone
+Clone the repository, install the requirements from `requirements.txt`, and launch using `main.py`.

->Model `weights` will be downloaded automatically on first use and saved to /models/midas, /models/leres and /models/pix2pix
+>Model weights will be downloaded automatically on first use and saved to /models/midas, /models/leres and /models/pix2pix. ZoeDepth models are stored in the torch cache folder.


 ## Usage
-Select the "DepthMap vX.X.X" script from the script selection box in either txt2img or img2img, or go to the Depth tab when using existing images.
+Select the "DepthMap" script from the script selection box in either txt2img or img2img, or go to the Depth tab when using existing images.
 ![screenshot](options.png)

-The models can `Compute on` GPU and CPU, use CPU if low on VRAM.
+The models can `Compute on` GPU or CPU; use CPU if low on VRAM.

-There are seven models available from the `Model` dropdown. For the first model, res101, see [AdelaiDepth/LeReS](https://github.com/aim-uofa/AdelaiDepth/tree/main/LeReS) for more info. The others are the midas models: dpt_beit_large_512, dpt_beit_large_384, dpt_large_384, dpt_hybrid_384, midas_v21, and midas_v21_small. See the [MiDaS](https://github.com/isl-org/MiDaS) repository for more info. The newest dpt_beit_large_512 model was trained on a 512x512 dataset but is VERY VRAM hungry.
+There are ten models available from the `Model` dropdown. For the first model, res101, see [AdelaiDepth/LeReS](https://github.com/aim-uofa/AdelaiDepth/tree/main/LeReS) for more info. The next six are the MiDaS models: dpt_beit_large_512, dpt_beit_large_384, dpt_large_384, dpt_hybrid_384, midas_v21, and midas_v21_small. See the [MiDaS](https://github.com/isl-org/MiDaS) repository for more info. The newest dpt_beit_large_512 model was trained on a 512x512 dataset but is VERY VRAM hungry. The last three models are [ZoeDepth](https://github.com/isl-org/ZoeDepth) models.

 Net size can be set with `net width` and `net height`, or will be the same as the input image when `Match input size` is enabled. There is a trade-off between structural consistency and high-frequency details with respect to net size (see [observations](https://github.com/compphoto/BoostingMonocularDepth#observations)).

-`Boost` will enable multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) and will significantly improve the results. Mitigating the observations mentioned above. Net size is ignored when enabled. Best results with res101.
+`Boost` will enable multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) and will significantly improve the results, mitigating the observations mentioned above, at the cost of much longer compute time. Best results with res101.

 `Clip and renormalize` allows for clipping the depthmap on the `near` and `far` side, the values in between will be renormalized to fit the available range. Set both values equal to get a b&w mask of a single depth plane at that value. This option works on the 16-bit depthmap and allows for 1000 steps to select the clip values.
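
For intuition, the clip-and-renormalize operation described above amounts to the following (a minimal NumPy sketch of the concept, not the extension's actual code; the function name and the handling of the equal-values case are assumptions):

```python
import numpy as np

def clip_and_renormalize(depth: np.ndarray, near: int, far: int) -> np.ndarray:
    """Clip a 16-bit depthmap to [near, far], then stretch that range back to full scale."""
    if near == far:
        # Degenerate case: a black-and-white mask of the single depth plane at that value.
        return np.where(depth >= near, 65535, 0).astype(np.uint16)
    clipped = np.clip(depth.astype(np.float64), near, far)
    return ((clipped - near) / (far - near) * 65535).astype(np.uint16)
```
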
@@ -55,8 +53,6 @@ Regardless of global settings, `Save DepthMap` will always save the depthmap in

 To see the generated output in the webui `Show DepthMap` should be enabled. When using Batch img2img this option should also be enabled.

-To make the depthmap easier to analyze for human eyes, `Show HeatMap` shows an extra image in the WebUI that has a color gradient applied. It is not saved.
-
 When `Combine into one image` is enabled, the depthmap will be combined with the original image, the orientation can be selected with `Combine axis`. When disabled, the depthmap will be saved as a 16 bit single channel PNG as opposed to a three channel (RGB), 8 bit per channel image when the option is enabled.

 When either `Generate Stereo` or `Generate anaglyph` is enabled, a stereo image pair will be generated. `Divergence` sets the amount of 3D effect that is desired. `Balance between eyes` determines where the (inevitable) distortion from filling up gaps will end up, -1 Left, +1 Right, and 0 balanced.
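
The `Divergence` and `Balance between eyes` parameters can be pictured with a deliberately naive sketch of depth-based pixel shifting (illustrative only, not the extension's algorithm; the function and its conventions are hypothetical):

```python
import numpy as np

def naive_stereo_shift(image: np.ndarray, depth: np.ndarray, divergence: float) -> np.ndarray:
    """Shift each pixel horizontally in proportion to its depth.

    image: HxWx3 uint8; depth: HxW floats normalized to [0, 1], 1 = nearest.
    divergence: strength of the 3D effect, as a percentage of image width.
    """
    h, w, _ = image.shape
    out = np.zeros_like(image)
    max_shift = divergence / 100.0 * w  # nearer pixels move the most
    for y in range(h):
        for x in range(w):
            nx = x + int(depth[y, x] * max_shift)
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
    # The shift leaves gaps (disocclusions); filling them distorts the image.
    # `Balance between eyes` decides which eye's view absorbs that distortion.
    return out
```

The other eye's view would use a negated divergence; a real implementation also resolves pixel collisions by depth and fills the gaps.
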
@@ -78,17 +74,19 @@ If you often get out of memory errors when computing a depthmap on GPU while usi
 ## FAQ

 * `Can I use this on existing images ?`
-  - Yes, you can now use the Depth tab to easily process existing images.
-  - Yes, in img2img, set denoising strength to 0. This will effectively skip stable diffusion and use the input image. You will still have to set the correct size, and need to select `Crop and resize` instead of `Just resize` when the input image resolution does not match the set size perfectly.
-* `Can I run this on google colab ?`
+  - Yes, you can use the Depth tab to easily process existing images.
+  - Another way of doing this is to use img2img with denoising strength set to 0. This will effectively skip stable diffusion and use the input image. You will still have to set the correct size, and need to select `Crop and resize` instead of `Just resize` when the input image resolution does not match the set size perfectly.
+* `Can I run this on Google Colab?`
   - You can run the MiDaS network on their colab linked here https://pytorch.org/hub/intelisl_midas_v2/
   - You can run BoostingMonocularDepth on their colab linked here : https://colab.research.google.com/github/compphoto/BoostingMonocularDepth/blob/main/Boostmonoculardepth.ipynb
-
-## Forks and Related
-
-* Several scripts by [@Extraltodeus](https://github.com/Extraltodeus) using depth maps : https://github.com/Extraltodeus?tab=repositories
-
-### More updates soon .. Feel free to comment and share in the discussions.
+  - Running this program on Colab is not officially supported, but it may work. Consider more suitable ways of running it; if you still decide to try, a standalone installation may be easier to manage.
+* `What other depth-related projects could I check out?`
+  - Several [scripts](https://github.com/Extraltodeus?tab=repositories) by [@Extraltodeus](https://github.com/Extraltodeus) using depth maps.
+  - Geo11 and [Depth3D](https://github.com/BlueSkyDefender/Depth3D) for playing existing games in 3D.
+* `How can I know what changed in the new version of the script?`
+  - You can view the git log or refer to the `CHANGELOG.md` file.
+
+### Feel free to comment and share in the discussions!

 ## Acknowledgements

main.py (2 additions, 3 deletions)

@@ -1,6 +1,4 @@
 # This launches DepthMap without the AUTOMATIC1111/stable-diffusion-webui
-# If DepthMap is installed as an extension,
-# you may want to change the working directory to the stable-diffusion-webui root.

 import argparse
 import os
@@ -11,7 +9,8 @@

 def maybe_chdir():
     """Detects if DepthMap was installed as a stable-diffusion-webui script, but run without current directory set to
-    the stable-diffusion-webui root. Changes current directory if needed, to aviod clutter."""
+    the stable-diffusion-webui root. Changes current directory if needed.
+    This is to avoid re-downloading models and putting results into the wrong folder."""
     try:
         file_path = pathlib.Path(__file__)
         path = file_path.parts
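
The body of `maybe_chdir` is truncated in this diff. One plausible shape of the detection it describes, sketched as an illustration rather than taken from the repository (the `extensions` path check and the error handling are assumptions):

```python
import os
import pathlib

def maybe_chdir():
    """Illustrative sketch: if this file lives under <webui-root>/extensions/<repo>/,
    change the working directory to <webui-root>."""
    try:
        path = pathlib.Path(__file__).parts
        if 'extensions' in path:
            # Everything before 'extensions' is assumed to be the webui root.
            os.chdir(pathlib.Path(*path[:path.index('extensions')]))
    except Exception:
        pass  # best effort: never block startup over a failed chdir
```
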

src/common_ui.py (4 additions, 3 deletions)

@@ -222,9 +222,10 @@ def open_folder_action():


 def depthmap_mode_video(inp):
-    gr.HTML(value="Single video mode allows generating videos from videos. Every frame of the video is processed, "
-                  "please adjust generation settings, so that generation is not too slow. For the best results, "
-                  "Use a zoedepth model, since they provide the highest level of temporal coherency.")
+    gr.HTML(value="Single video mode allows generating videos from videos. Please "
+                  "keep in mind that all the frames of the video need to be processed - therefore it is important to "
+                  "pick settings so that the generation is not too slow. For the best results, "
+                  "use a zoedepth model, since they provide the highest level of coherency between frames.")
     inp += gr.File(elem_id='depthmap_vm_input', label="Video or animated file",
                    file_count="single", interactive=True, type="file")
     inp += gr.Dropdown(elem_id="depthmap_vm_smoothening_mode", label="Smoothening", type="value", choices=['none'])

src/misc.py (1 addition, 1 deletion)

@@ -24,7 +24,7 @@ def call_git(dir):

 REPOSITORY_NAME = "stable-diffusion-webui-depthmap-script"
 SCRIPT_NAME = "DepthMap"
-SCRIPT_VERSION = "v0.4.2"
+SCRIPT_VERSION = "v0.4.3"
 SCRIPT_FULL_NAME = f"{SCRIPT_NAME} {SCRIPT_VERSION} ({get_commit_hash()})"
