
Commit af92870

Update README.md
1 parent 2baed6c commit af92870

5 files changed: +67 -10 lines changed

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ __pycache__/
 *.py[cod]
 *$py.class

-./output/*
+output/*
 ./datasets/raw/*
 ./datasets/processed/*
 ./trained_models/*

README.md

Lines changed: 62 additions & 4 deletions
@@ -1,4 +1,4 @@
-# SkeletonDiffusion - Nonisotropic Gaussian Diffusion for Reaslitic 3D Human Motion Prediction (CVPR 2025)
+# SkeletonDiffusion - Nonisotropic Gaussian Diffusion for Realistic 3D Human Motion Prediction (CVPR 2025)
 **[Website](https://ceveloper.github.io/publications/skeletondiffusion/)** |
 **[Paper](https://arxiv.org/abs/2501.06035)** |
 **[Video](https://www.youtube.com/watch?v=W9GzdDXN41M)** |
@@ -52,6 +52,9 @@ Together with the latent diffusion model SkeletonDiffusion, we introduce a train
 - [Training](#training)
 - [Autoencoder](#1-autoencoder)
 - [Diffusion](#2-diffusion)
+- [About Training Time](#about-training-time)
+- [How to Resume Training](#how-to-resume-training)
+- [Running our Implementation as Isotropic](#running-our-implementation-as-isotropic)

 ## Nonisotropic Gaussian Diffusion - Plug-and-play
 In our paper SkeletonDiffusion, nonisotropic diffusion is performed by extracting correlations from the adjacency matrix of the human skeleton. If you are working on a problem described by an adjacency matrix, or if the correlations between the components of your problem (for us, human body joints) are available, you can try training your diffusion model with our nonisotropic Gaussian diffusion implementation.
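As a rough illustration of the idea (a sketch only, not the repository's exact pipeline), an adjacency matrix can be turned into a symmetric positive definite covariance whose eigendecomposition supplies correlated, nonisotropic noise. The names `A`, `alpha`, and `Sigma_N` are hypothetical; `Lambda_N` merely echoes the eigenvalue argument of `compute_covariance_matrices` in `src/core/diffusion/nonisotropic.py`:

```python
import torch

# Toy adjacency matrix of a 4-joint kinematic chain (hypothetical example;
# for SkeletonDiffusion this would be the human skeleton's adjacency matrix).
A = torch.tensor([[0., 1., 0., 0.],
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.]])
N = A.shape[0]

# One simple way to get a valid (symmetric positive definite) covariance from
# connectivity: identity plus a scaled adjacency term. Any alpha < 1/max_degree
# keeps the matrix diagonally dominant and hence positive definite.
alpha = 0.4
Sigma_N = torch.eye(N) + alpha * A

# Eigendecomposition: a nonisotropic scheme perturbs data along this eigenbasis
# instead of using the identity covariance of standard isotropic diffusion.
Lambda_N, U = torch.linalg.eigh(Sigma_N)  # eigenvalues, eigenvectors

# Correlated noise eps ~ N(0, Sigma_N): neighboring joints receive similar noise.
eps = U @ (Lambda_N.sqrt() * torch.randn(N))
```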
@@ -132,7 +135,9 @@ conda create --name skeldiff python=3.10 -y
 conda activate skeldiff
 conda config --env --add channels pytorch
 conda config --env --add channels conda-forge
-conda install pytorch==2.0.1 torchvision==0.15.2 cudatoolkit=11.8 -c pytorch -y
+conda install pytorch==2.0.1 torchvision==0.15.2 cudatoolkit==11.8 -c pytorch -y
+# If the previous line installs the CPU-only version of PyTorch, substitute it with the following:
+# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
 conda install numpy=1.25 scikit-image=0.24.0 scipy=1.11.3 pillow=9.4.0 pip ignite=0.4.13 pyyaml=6.0.1 einops=0.7.0 hydra-core=1.3.2 zarr=2.18.0 tensorboard=2.15.0 tabulate=0.9.0 cdflib=1.2.3 ipython=8.16.1 jupyter==1.0.0 tqdm
 matplotlib=3.8.0 -c conda-forge -y
 pip install denoising-diffusion-pytorch==1.9.4 git+https://github.com/nghorbani/human_body_prior@4c246d8
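After installing, a quick generic PyTorch check (not a project script) confirms that a CUDA build was picked up rather than the CPU-only one:

```python
import torch

# Expect a version like "2.0.1+cu118" (conda builds may omit the suffix) and True.
print(torch.__version__, torch.cuda.is_available())
```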
@@ -159,7 +164,7 @@ We follow the same dataset creation pipeline as https://github.com/BarqueroGerma

 Download the *SMPL+H G* files for **22 datasets**: ACCAD, BMLhandball, BMLmovi, BMLrub, CMU, DanceDB, DFaust, EKUT, EyesJapanDataset, GRAB, HDM05, HUMAN4D, HumanEva, KIT, MoSh, PosePrior (MPI_Limits), SFU, SOMA, SSM, TCDHands, TotalCapture, and Transitions. Then, move the **tar.bz2** files to `./datasets/raw/AMASS` (DO NOT extract them).

-Now, download the 'DMPLs for AMASS' from [here](https://smpl.is.tue.mpg.de), and the 'Extended SMPL+H model' from [here](https://mano.is.tue.mpg.de/). Move both extracted folders (dmpls, smplh) to `./datasets/annotations/AMASS/body_models`. Then, run:
+Now, download the 'DMPLs for AMASS' from [here](https://smpl.is.tue.mpg.de), and the 'Extended SMPL+H model' from [here](https://mano.is.tue.mpg.de/). Move both extracted folders (dmpls, smplh) to `./datasets/annotations/AMASS/bodymodels`. Then, run:
 ```bash
 cd src
 python -m data.create_amass_dataset --gpu --if_extract_zip
@@ -254,9 +259,32 @@ python train_autoencoder.py dataset=h36m model.num_epochs=200

 ### 2. Diffusion

-Train our Nonisotropic diffusion on top of the previously trained latent space and autoencoder, you will need a 48GB GPU (A40). This part of the training is quite slow, due to necessaity of encoding and decoding latent embeddings via the recurrent autoencoder.
+To train our Nonisotropic diffusion on top of the previously trained latent space and autoencoder, you will need a 48GB GPU (A40).
+
 ![alt text](./figures/arch_diffusion.jpg)

+
+### About Training Time
+The diffusion part of the training is quite slow, due to the need to encode and decode latent embeddings via the recurrent autoencoder. If you want to reduce the training time, you can train less performant models by relaxing the diffusion training objective (see Appendix E.4 and the AMASS results below).
+
+| Model | Training Time (AMASS) | APD $\uparrow$ | CMD $\downarrow$ | str mean $\downarrow$ | str RMSE $\downarrow$ |
+|----------------------------------------------|---------------------|-------|--------|------|------|
+| k=1 | ~1d 7h (<48GB GPU) | 4.987 | 16.574 | 3.50 | 4.56 |
+| k=50 latent argmin | ~1d 14h (<48GB GPU) | 8.497 | 12.885 | 3.17 | 4.35 |
+| k=50 motion space argmin (SkeletonDiffusion) | ~6d (48GB GPU) | 9.456 | 11.418 | 3.15 | 4.45 |
+
+To train the model with _k=1_ (without loss relaxation), append ```model.train_pick_best_sample_among_k=1``` to your training arguments:
+```bash
+python train_diffusion.py model.train_pick_best_sample_among_k=1 <your training arguments>
+```
+
+To train the model with k=50 but select the sample whose loss is backpropagated in latent space (_k=50 latent argmin_), append ```model.similarity_space=latent_space``` to your training arguments:
+```bash
+python train_diffusion.py model.similarity_space=latent_space <your training arguments>
+```
+
 #### AMASS
 ```bash
 python train_diffusion.py model=skeleton_diffusion model.pretrained_autoencoder_path=./output/hmp/amass/autoencoder/<your folder>/checkpoints/<checkpoint name>.pt dataset_folder_log_path=amass model.num_epochs=150 model.lr_scheduler_kwargs.warmup_duration=75
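The best-of-k relaxation added above is conceptually simple; here is a minimal sketch (illustrative only, with hypothetical shapes and function names, not the repository's training loop) of backpropagating only through the candidate closest to the target:

```python
import torch

def best_of_k_loss(pred_k: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """pred_k: (k, d) candidate predictions for one target of shape (d,).
    Computes an MSE per candidate and returns the minimum, so gradients flow
    only through the argmin candidate. k=1 recovers the standard objective."""
    per_candidate = ((pred_k - target.unsqueeze(0)) ** 2).mean(dim=-1)  # (k,)
    return per_candidate.min()

pred = torch.randn(50, 16, requires_grad=True)  # k=50 candidates, 16-dim latent
target = torch.randn(16)
best_of_k_loss(pred, target).backward()  # only the argmin candidate gets gradient
```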
@@ -272,3 +300,33 @@ python train_diffusion.py model=skeleton_diffusion model.pretrained_autoencoder_
 ```bash
 python train_diffusion.py model=skeleton_diffusion model.pretrained_autoencoder_path=./output/hmp/h36m/autoencoder/<your folder>/checkpoints/<checkpoint name>.pt dataset_folder_log_path=h36m model.num_epochs=100 model.diffusion_arch.attn_heads=8 model.lr_scheduler_kwargs.gamma_decay=0.85 model.lr_scheduler_kwargs.warmup_duration=25
 ```
+
+### How to Resume Training
+
+To resume training from an experiment repository and a saved checkpoint, you can run the corresponding train script and append a few arguments:
+
+```bash
+python train_<model>.py if_resume_training=True load=True output_log_path=<path to experiment repository> load_path=<path to .pt checkpoint> <your other arguments>
+```
+
+For an example checkpoint _./output/hmp/amass/diffusion/June30_11-35-08/checkpoints/checkpoint_144.pt_, you would run:
+```bash
+python train_diffusion.py if_resume_training=True load=True output_log_path=./output/hmp/amass/diffusion/June30_11-35-08 load_path=./output/hmp/amass/diffusion/June30_11-35-08/checkpoints/checkpoint_144.pt <your other training arguments of the previous call>
+```
+
+### Running our Implementation as Isotropic
+
+Our Nonisotropic implementation also supports isotropic diffusion (our _isotropic_ ablations in Table 7). This may be useful if you want to use our codebase for other projects and want to reduce classes/complexity.
+
+To run our nonisotropic diffusion as isotropic with a suitable choice of covariance matrix:
+
+```bash
+python train_diffusion.py model=skeleton_diffusion_run_code_as_isotropic <your training arguments>
+```
+
+To run the isotropic diffusion codebase as in BeLFusion or lucidrain:
+```bash
+python train_diffusion.py model=isotropic_diffusion <your training arguments>
+```
+
+For the same random initialization and environment, both trainings return exactly the same weights.
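That determinism claim is straightforward to verify. A sketch, assuming hypothetical checkpoint paths and that each checkpoint is either a plain state dict or wraps one under a "model" key:

```python
import torch

# Hypothetical checkpoints from two trainings with identical seed and environment.
ckpt_a = torch.load("run_isotropic_a/checkpoint_final.pt", map_location="cpu")
ckpt_b = torch.load("run_isotropic_b/checkpoint_final.pt", map_location="cpu")

# Unwrap if the weights are nested under a "model" key (an assumption here).
sd_a = ckpt_a.get("model", ckpt_a)
sd_b = ckpt_b.get("model", ckpt_b)

assert sd_a.keys() == sd_b.keys(), "parameter sets differ"
for name in sd_a:
    # torch.equal, not allclose: the two runs should be bit-identical.
    assert torch.equal(sd_a[name], sd_b[name]), f"mismatch in {name}"
print("all parameters identical")
```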

configs/config_train_diffusion/model/skeleton_diffusion_run_code_as_isotropic.yaml

Lines changed: 1 addition & 3 deletions
@@ -1,6 +1,4 @@
-_pretrained_GM_checkpoint: checkpoint_final
-pretrained_autoencoder_path: '${model.pretrained_GM_folder}/checkpoints/${model._pretrained_GM_checkpoint}.pt'
-pretrained_GM_folder: ${eval:"'./models/final_checkpoints/H36M/hmp/autoencoder/January19_19-24-04_ID1137310' if ${eval:"'${_load_saved_aoutoenc}'.split('-')[1] == 'h36m'"} else './models/final_checkpoints/AMASS/hmp/autoencoder/May11_10-35-09_ID1185354'"}
+pretrained_autoencoder_path: './output/models/AMASS/hmp/autoencoder/checkpoints/checkpoint_final.pt'


 lr: 1.e-3

src/core/diffusion/nonisotropic.py

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ def verify_noise_scale(diffusion):
     print("current: ", (zeta_noise**2).sum(-1).mean(0))
     print("original standard gaussian diffusion: ",(1-alphas) * zeta_noise.shape[-1])

-def compute_covariance_matrices(diffusion: torch.nn.Module, Lambda_N: torch.Tensor, diffusion_covariance_type='ani-isotropic', gamma_scheduler = 'cosine'):
+def compute_covariance_matrices(diffusion: torch.nn.Module, Lambda_N: torch.Tensor, diffusion_covariance_type='skeleton-diffusion', gamma_scheduler = 'cosine'):
     N, *_ = Lambda_N.shape
     alphas = 1. - diffusion.betas
     def _alpha_sumprod(alphas, t):
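The hunk cuts off at `_alpha_sumprod`, whose body is not shown. For orientation only (a generic reference point, not the repository's helper): Gaussian diffusion schedules like the one above are built from the betas via cumulative products of the alphas, sketched here with a hypothetical linear schedule:

```python
import torch

betas = torch.linspace(1e-4, 2e-2, 1000)       # hypothetical linear beta schedule
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)  # bar_alpha_t = prod_{s<=t} alpha_s
```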

src/data/create_amass_dataset.py

Lines changed: 2 additions & 1 deletion
@@ -106,7 +106,8 @@ def extract_dataset(original_folder, precomputed_folder, models_dir, datasets=No
         bm_fname = os.path.join(models_dir, 'smplh/{}/model.npz'.format(gender))
         dmpl_fname = os.path.join(models_dir, 'dmpls/{}/model.npz'.format(gender))

-        bm[gender] = BodyModel(bm_path=bm_fname, num_betas=num_betas, model_type='smplh').to(comp_device)
+        bm[gender] = BodyModel(bm_fname=bm_fname, num_betas=num_betas, model_type='smplh').to(comp_device)
+
         faces[gender] = c2c(bm[gender].f)

     print("All resources initialized")
