Commit 92cf04a

Minor cleanup
1 parent: 3f7439c · commit: 92cf04a

3 files changed: +24 −39 lines changed

README.md

Lines changed: 7 additions & 7 deletions
@@ -1,5 +1,5 @@
 # High Resolution Depth Maps for Stable Diffusion WebUI
-This program is an addon for [AUTOMATIC1111's Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) that creates depth maps. Using either generated or custom depth maps, it can also create 3D stereo image pairs (as side-by-side or anaglyph), normalmaps and 3D meshes. The outputs of the script can be viewed directly or used as an asset for a 3D engine. Please see [wiki](https://github.com/thygate/stable-diffusion-webui-depthmap-script/wiki/Viewing-Results) to learn more. The program has integration with [Rembg](https://github.com/danielgatis/rembg). It also supports batch processing, processing of videos, and can also be run in standalone mode, without Stable Diffusion WebUI.
+This program is an addon for [AUTOMATIC1111's Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) that creates depth maps. Using either generated or custom depth maps, it can also create 3D stereo image pairs (side-by-side or anaglyph), normalmaps and 3D meshes. The outputs of the script can be viewed directly or used as an asset for a 3D engine. Please see [wiki](https://github.com/thygate/stable-diffusion-webui-depthmap-script/wiki/Viewing-Results) to learn more. The program has integration with [Rembg](https://github.com/danielgatis/rembg). It also supports batch processing, processing of videos, and can also be run in standalone mode, without Stable Diffusion WebUI.
 
 To generate realistic depth maps from individual images, this script uses code and models from the [MiDaS](https://github.com/isl-org/MiDaS) and [ZoeDepth](https://github.com/isl-org/ZoeDepth) repositories by Intel ISL, or LeReS from the [AdelaiDepth](https://github.com/aim-uofa/AdelaiDepth) repository by Advanced Intelligent Machines. Multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) is used to generate high resolution depth maps.
 
@@ -22,15 +22,15 @@ images generated by [@semjon00](https://github.com/semjon00) from CC0 photos, mo
 
 ## Install instructions
 ### As extension
-The script can be installed directly from WebUI. Please navigate to `Extensions` tab, then click `Available`, `Load from` and then install the `Depth Maps` extension. Alternatively, the extension can be installed from URL: `https://github.com/thygate/stable-diffusion-webui-depthmap-script`.
+The script can be installed directly from WebUI. Please navigate to `Extensions` tab, then click `Available`, `Load from` and then install the `Depth Maps` extension. Alternatively, the extension can be installed from the URL: `https://github.com/thygate/stable-diffusion-webui-depthmap-script`.
 
 ### Updating
 In the WebUI, in the `Extensions` tab, in the `Installed` subtab, click `Check for Updates` and then `Apply and restart UI`.
 
 ### Standalone
 Clone the repository, install the requirements from `requirements.txt`, launch using `main.py`.
 
->Model weights will be downloaded automatically on their first use and saved to /models/midas, /models/leres and /models/pix2pix. Zoedepth models are stored in torch cache folder.
+>Model weights will be downloaded automatically on their first use and saved to /models/midas, /models/leres and /models/pix2pix. Zoedepth models are stored in the torch cache folder.
 
 
 ## Usage
@@ -43,7 +43,7 @@ There are ten models available from the `Model` dropdown. For the first model, r
 
 Net size can be set with `net width` and `net height`, or will be the same as the input image when `Match input size` is enabled. There is a trade-off between structural consistency and high-frequency details with respect to net size (see [observations](https://github.com/compphoto/BoostingMonocularDepth#observations)).
 
-`Boost` will enable multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) and will significantly improve the results, mitigating the observations mentioned above, and the cost of much larger compute time. Best results with res101.
+`Boost` will enable multi-resolution merging as implemented by [BoostingMonocularDepth](https://github.com/compphoto/BoostingMonocularDepth) and will significantly improve the results, mitigating the observations mentioned above, at the cost of much larger compute time. Best results with res101.
 
 `Clip and renormalize` allows for clipping the depthmap on the `near` and `far` side, the values in between will be renormalized to fit the available range. Set both values equal to get a b&w mask of a single depth plane at that value. This option works on the 16-bit depthmap and allows for 1000 steps to select the clip values.
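As an editorial aside, the `Clip and renormalize` behaviour described in the hunk above can be pictured with a short NumPy sketch. This is not the extension's code; the function name, the 0..1 slider convention and the equal-values mask behaviour are assumptions made purely for illustration.

```python
import numpy as np

def clip_and_renormalize(depth_u16: np.ndarray, far: float, near: float) -> np.ndarray:
    """Illustrative only: clip a 16-bit depthmap to [far, near] and stretch it back to full range."""
    lo, hi = sorted((int(far * 65535), int(near * 65535)))
    if lo == hi:
        # Both sliders equal: everything collapses to a b&w mask around that depth plane.
        return np.where(depth_u16 >= lo, 65535, 0).astype(np.uint16)
    clipped = np.clip(depth_u16.astype(np.float64), lo, hi)
    return ((clipped - lo) / (hi - lo) * 65535.0).astype(np.uint16)
```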

@@ -76,17 +76,17 @@ If you often get out of memory errors when computing a depthmap on GPU while usi
 * `Can I use this on existing images ?`
   - Yes, you can use the Depth tab to easily process existing images.
   - Another way of doing this would be to use img2img with denoising strength to 0. This will effectively skip stable diffusion and use the input image. You will still have to set the correct size, and need to select `Crop and resize` instead of `Just resize` when the input image resolution does not match the set size perfectly.
-* `Can I run this on Google Colab ?`
+* `Can I run this on Google Colab?`
   - You can run the MiDaS network on their colab linked here https://pytorch.org/hub/intelisl_midas_v2/
   - You can run BoostingMonocularDepth on their colab linked here : https://colab.research.google.com/github/compphoto/BoostingMonocularDepth/blob/main/Boostmonoculardepth.ipynb
   - Running this program on Colab is not officially supported, but it may work. Please look for more suitable ways of running this. If you still decide to try, standalone installation may be easier to manage.
 * `What other depth-related projects could I check out?`
   - Several [scripts](https://github.com/Extraltodeus?tab=repositories) by [@Extraltodeus](https://github.com/Extraltodeus) using depth maps.
-  - Geo11 and [Depth3D](https://github.com/BlueSkyDefender/Depth3D) for playing existing games in 3D.
+  - geo-11 and [Depth3D](https://github.com/BlueSkyDefender/Depth3D) for playing existing games in 3D.
 * `How can I know what changed in the new version of the script?`
   - You can see the git history log or refer to the `CHANGELOG.md` file.
 
-### Feel free to comment and share in the discussions!
+### Feel free to comment and share in the discussions! Submitting issues and merge requests is heavily appreciated!
 
 ## Acknowledgements

src/common_ui.py

Lines changed: 8 additions & 6 deletions
@@ -122,7 +122,7 @@ def main_ui_panel(is_depth_tab):
         inp += go.GEN_SIMPLE_MESH, gr.Checkbox(label="Generate simple 3D mesh")
         with gr.Column(visible=False) as mesh_options:
             with gr.Row():
-                gr.HTML(value="Generates fast, accurate only with ZoeDepth models and no boost, no custom maps")
+                gr.HTML(value="Generates fast, accurate only with ZoeDepth models and no boost, no custom maps.")
             with gr.Row():
                 inp += go.SIMPLE_MESH_OCCLUDE, gr.Checkbox(label="Remove occluded edges")
                 inp += go.SIMPLE_MESH_SPHERICAL, gr.Checkbox(label="Equirectangular projection")
@@ -133,10 +133,10 @@ def main_ui_panel(is_depth_tab):
         inp += go.GEN_INPAINTED_MESH, gr.Checkbox(
             label="Generate 3D inpainted mesh")
         with gr.Column(visible=False) as inpaint_options_row_0:
-            gr.HTML("Generation is sloooow, required for generating videos")
+            gr.HTML("Generation is sloooow. Required for generating videos from mesh.")
            inp += go.GEN_INPAINTED_MESH_DEMOS, gr.Checkbox(
                label="Generate 4 demo videos with 3D inpainted mesh.")
-            gr.HTML("More options for generating video can be found in the Generate video tab")
+            gr.HTML("More options for generating video can be found in the Generate video tab.")
 
     with gr.Box():
         # TODO: it should be clear from the UI that there is an option of the background removal
@@ -184,12 +184,14 @@ def update_default_net_size(model_type):
     inp[go.CLIPDEPTH_FAR].change(
         fn=lambda a, b: a if b < a else b,
         inputs=[inp[go.CLIPDEPTH_FAR], inp[go.CLIPDEPTH_NEAR]],
-        outputs=[inp[go.CLIPDEPTH_NEAR]]
+        outputs=[inp[go.CLIPDEPTH_NEAR]],
+        show_progress=False
     )
     inp[go.CLIPDEPTH_NEAR].change(
         fn=lambda a, b: a if b > a else b,
         inputs=[inp[go.CLIPDEPTH_NEAR], inp[go.CLIPDEPTH_FAR]],
-        outputs=[inp[go.CLIPDEPTH_FAR]]
+        outputs=[inp[go.CLIPDEPTH_FAR]],
+        show_progress=False
    )
 
    inp.add_rule(stereo_options, 'visible-if', go.GEN_STEREO)
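For context on the hunk above: `show_progress=False` tells Gradio not to flash its progress overlay every time one slider's `.change` handler rewrites the other slider's value. A minimal standalone sketch of that pattern follows; the slider names, ranges and clamping direction are illustrative, not the extension's actual components.

```python
import gradio as gr

with gr.Blocks() as demo:
    low = gr.Slider(0, 1000, value=0, label="low")
    high = gr.Slider(0, 1000, value=1000, label="high")
    # Keep high >= low without triggering the progress animation on every nudge.
    low.change(
        fn=lambda lo, hi: lo if lo > hi else hi,
        inputs=[low, high],
        outputs=[high],
        show_progress=False,
    )
    high.change(
        fn=lambda hi, lo: hi if hi < lo else lo,
        inputs=[high, low],
        outputs=[low],
        show_progress=False,
    )

if __name__ == "__main__":
    demo.launch()
```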
@@ -558,7 +560,7 @@ def run_generate(*inputs):
 
     # Deciding what mesh to display (and if)
     display_mesh_fi = None
-    if not backbone.get_opt('depthmap_script_show_3d', True):
+    if backbone.get_opt('depthmap_script_show_3d', True):
         display_mesh_fi = mesh_simple_fi
     if backbone.get_opt('depthmap_script_show_3d_inpaint', True):
         if inpainted_mesh_fi is not None and len(inpainted_mesh_fi) > 0:

src/core.py

Lines changed: 9 additions & 26 deletions
@@ -125,9 +125,6 @@ def core_generation_funnel(outpath, inputimages, inputdepthmaps, inputnames, inp
     if not inputdepthmaps_complete:
         print("Loading model(s) ..")
         model_holder.ensure_models(inp[go.MODEL_TYPE], device, inp[go.BOOST])
-        model = model_holder.depth_model
-        pix2pix_model = model_holder.pix2pix_model
-
     print("Computing output(s) ..")
     # iterate over input images
     for count in trange(0, len(inputimages)):
@@ -231,25 +228,17 @@ def core_generation_funnel(outpath, inputimages, inputdepthmaps, inputnames, inp
             yield count, 'foreground_mask', mask_image
 
         # A weird quirk: if user tries to save depthmap, whereas custom depthmap is used,
-        # depthmap will not be outputed, even if output_depth_combine is used.
+        # custom depthmap will be outputed
         if inp[go.DO_OUTPUT_DEPTH]:
-            if inputdepthmaps[count] is None:
-                img_depth = cv2.bitwise_not(img_output) if inp[go.OUTPUT_DEPTH_INVERT] else img_output
-                if inp[go.OUTPUT_DEPTH_COMBINE]:
-                    axis = 1 if inp[go.OUTPUT_DEPTH_COMBINE_AXIS] == 'Horizontal' else 0
-                    img_concat = Image.fromarray(np.concatenate(
-                        (inputimages[count], convert_i16_to_rgb(img_depth, inputimages[count])),
-                        axis=axis))
-                    yield count, 'concat_depth', img_concat
-                else:
-                    yield count, 'depth', Image.fromarray(img_depth)
+            img_depth = cv2.bitwise_not(img_output) if inp[go.OUTPUT_DEPTH_INVERT] else img_output
+            if inp[go.OUTPUT_DEPTH_COMBINE]:
+                axis = 1 if inp[go.OUTPUT_DEPTH_COMBINE_AXIS] == 'Horizontal' else 0
+                img_concat = Image.fromarray(np.concatenate(
+                    (inputimages[count], convert_i16_to_rgb(img_depth, inputimages[count])),
+                    axis=axis))
+                yield count, 'concat_depth', img_concat
             else:
-                # TODO: make it better
-                # Yes, this seems stupid, but this is, logically, what should happen -
-                # and this improves clarity of some other code.
-                # But we won't return it if there is only one image.
-                if len(inputimages) > 1:
-                    yield count, 'depth', Image.fromarray(img_output)
+                yield count, 'depth', Image.fromarray(img_depth)
 
         if inp[go.GEN_STEREO]:
             # print("Generating stereoscopic image(s)..")
@@ -335,17 +324,11 @@ def core_generation_funnel(outpath, inputimages, inputdepthmaps, inputnames, inp
     if backbone.get_opt('depthmap_script_keepmodels', True):
         model_holder.offload() # Swap to CPU memory
     else:
-        if 'model' in locals():
-            del model
-        if 'pix2pixmodel' in locals():
-            del pix2pix_model
         model_holder.unload_models()
-
     gc.collect()
     backbone.torch_gc()
 
     # TODO: This should not be here
-    mesh_fi = None
     if inp[go.GEN_INPAINTED_MESH]:
         try:
             mesh_fi = run_3dphoto(device, inpaint_imgs, inpaint_depths, inputnames, outpath,
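The cleanup hunk above either keeps models resident or releases them, depending on the `depthmap_script_keepmodels` option. Below is a hedged sketch of that keep-vs-unload pattern; `ToyModelHolder` and `cleanup` are invented for illustration and do not mirror the extension's actual `model_holder`/`backbone` APIs.

```python
import gc
import torch

class ToyModelHolder:
    def __init__(self, model: torch.nn.Module):
        self.model = model

    def offload(self):
        # Keep the weights around, but move them off the GPU.
        self.model.to("cpu")

    def unload_models(self):
        # Drop the reference entirely so the memory can actually be reclaimed.
        self.model = None

def cleanup(holder: ToyModelHolder, keep_models: bool) -> None:
    if keep_models:
        holder.offload()
    else:
        holder.unload_models()
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```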

0 commit comments