ControlNet-LLLite, a novel method for ControlNet with SDXL, is added. See [documen
## Change History
### Jan 23, 2024 / 2024/1/23: v0.8.2
- [Experimental] The `--fp8_base` option is added to the training scripts for LoRA etc. The base model (U-Net, and Text Encoder when training modules for Text Encoder) can be trained with fp8. PR [#1057](https://github.com/kohya-ss/sd-scripts/pull/1057) Thanks to KohakuBlueleaf!
  - Please specify `--fp8_base` in `train_network.py` or `sdxl_train_network.py`.
  - PyTorch 2.1 or later is required.
  - If you use xformers with PyTorch 2.1, please see the [xformers repository](https://github.com/facebookresearch/xformers) and install the version appropriate for your CUDA version.
  - Sample image generation during training consumes a lot of memory. It is recommended to turn it off.
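Since `--fp8_base` is a command-line flag, enabling it might look like the following sketch (the model path and the other flags shown are illustrative assumptions, not taken from this changelog):

```shell
# Hypothetical invocation: train an SDXL LoRA with the base model weights held in fp8.
# Requires PyTorch 2.1 or later.
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path /path/to/sdxl_base.safetensors \
  --network_module networks.lora \
  --mixed_precision bf16 \
  --fp8_base
```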
- [Experimental] The network multiplier can be specified for each dataset in the training scripts for LoRA etc.
  - This is an experimental option and may be removed or changed in the future.
  - For example, if you train with state A as `1.0` and state B as `-1.0`, you may be able to generate by switching between state A and B depending on the LoRA application rate.
  - Also, if you prepare five states and train them as `0.2`, `0.4`, `0.6`, `0.8`, and `1.0`, you may be able to generate by switching between the states smoothly depending on the application rate.
  - Please specify `network_multiplier` in `[[datasets]]` in the `.toml` file.
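Conceptually, the multiplier scales the low-rank delta that the LoRA adds to each base weight, so `-1.0` trains a direction opposite to `1.0`. A minimal sketch of that idea (plain Python with scalar stand-ins and hypothetical names, not the actual training code):

```python
def apply_lora(w_base, down, up, multiplier=1.0):
    """Apply a LoRA update: w_base + multiplier * (up * down).

    `down` and `up` are scalar stand-ins for the low-rank factor
    matrices; `multiplier` scales how strongly the update is applied.
    """
    delta = up * down  # stand-in for the low-rank product B @ A
    return w_base + multiplier * delta

# Training dataset A with multiplier 1.0 and dataset B with -1.0 pushes
# the weights in opposite directions along the same learned delta.
state_a = apply_lora(10, 2, 3, multiplier=1.0)   # 10 + 6 = 16
state_b = apply_lora(10, 2, 3, multiplier=-1.0)  # 10 - 6 = 4
```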
- Some options are added to `networks/extract_lora_from_models.py` to reduce memory usage.
  - The `--load_precision` option specifies the precision used when loading the models. If a model is saved in fp16, specifying `--load_precision fp16` reduces memory usage without losing precision.
  - The `--load_original_model_to` option specifies the device onto which the original model is loaded, and `--load_tuned_model_to` does the same for the derived model. Both default to `cpu`, but you can specify `cuda` etc. Loading one of them onto the GPU reduces memory usage. These options are available only for SDXL.
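Combined, the new options might be used as in this sketch (the paths are placeholders, and `--sdxl`, `--model_org`, `--model_tuned`, and `--save_to` are assumed existing arguments of the script, so verify against its `--help` output):

```shell
# Extract a LoRA from the difference between a tuned SDXL model and its base.
# fp16 loading plus putting only one model on the GPU keeps memory usage low.
python networks/extract_lora_from_models.py \
  --sdxl \
  --model_org /path/to/base_sdxl.safetensors \
  --model_tuned /path/to/tuned_sdxl.safetensors \
  --save_to /path/to/extracted_lora.safetensors \
  --load_precision fp16 \
  --load_original_model_to cpu \
  --load_tuned_model_to cuda
```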
- The gradient synchronization in LoRA training with multi-GPU is improved. PR [#1064](https://github.com/kohya-ss/sd-scripts/pull/1064) Thanks to KohakuBlueleaf!
- The code for Intel IPEX support is improved. PR [#1060](https://github.com/kohya-ss/sd-scripts/pull/1060) Thanks to akx!
- Fixed a bug in multi-GPU Textual Inversion training.
- `.toml` example for network multiplier:

```toml
[general]
[[datasets]]
resolution = 512
batch_size = 8
network_multiplier = 1.0

... subset settings ...

[[datasets]]
resolution = 512
batch_size = 8
network_multiplier = -1.0

... subset settings ...
```
### Jan 17, 2024 / 2024/1/17: v0.8.1
- Fixed a bug where VRAM usage without Text Encoder training was larger than before in the training scripts for LoRA etc. (`train_network.py`, `sdxl_train_network.py`).