NanoHTNet: Nano Human Topology Network for Efficient 3D Human Pose Estimation,
Jialun Cai, Mengyuan Liu, Hong Liu, Shuheng Zhou, Wenhao Li
In IEEE Transactions on Image Processing (TIP), 2025
Here, we primarily open-source the model framework code for NanoHTNet. The code for the POSECLR contrastive learning paradigm will be introduced and open-sourced in future work.
To get started as quickly as possible, follow the instructions in this section. They should allow you to train a model from scratch and test our pretrained models.
Make sure you have the following dependencies installed before proceeding:
- Python 3.7+
- PyTorch >= 1.10.0
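A quick way to confirm your environment meets these requirements is a small version check like the one below (this helper is our own convenience sketch, not part of the repository):

```python
import sys

def check_env():
    """Print the detected Python and PyTorch versions and return True
    if they meet the stated requirements (Python 3.7+, PyTorch >= 1.10.0)."""
    ok = sys.version_info >= (3, 7)
    print(f"Python {sys.version.split()[0]}: {'OK' if ok else 'needs 3.7+'}")
    try:
        import torch
        print(f"PyTorch {torch.__version__}")
        # Strip any local build suffix such as "+cu118" before comparing.
        major_minor = tuple(int(x) for x in torch.__version__.split("+")[0].split(".")[:2])
        ok = ok and major_minor >= (1, 10)
    except ImportError:
        print("PyTorch not installed")
        ok = False
    return ok

if __name__ == "__main__":
    check_env()
```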
Please download the dataset here and refer to VideoPose3D to set up the Human3.6M dataset under the './dataset' directory:
${POSE_ROOT}/
|-- dataset
| |-- data_3d_h36m.npz
| |-- data_2d_h36m_gt.npz
| |-- data_2d_h36m_cpn_ft_h36m_dbb.npz

Let's take a receptive field of 243 frames and an actual input of 9 frames as an example. The pretrained model is available here; please download it and put it in the './ckpt' directory. To reproduce the performance reported in the paper, run:
python main.py \
--reload \
--previous_dir ./ckpt/demo \
--frames 243 \
--gpu 0 \
--keep_frames 9
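Before running, you can sanity-check that the dataset files listed above are in place with a short script like this (a convenience sketch of ours, not part of the repository; the printed npz keys will depend on how the files were generated):

```python
import os
import numpy as np

# Expected files under ./dataset, matching the layout shown above.
EXPECTED = [
    "data_3d_h36m.npz",
    "data_2d_h36m_gt.npz",
    "data_2d_h36m_cpn_ft_h36m_dbb.npz",
]

def check_dataset(root="./dataset"):
    """Report which expected Human3.6M files are present and, for those
    that are, print their top-level npz keys."""
    status = {}
    for name in EXPECTED:
        path = os.path.join(root, name)
        status[name] = os.path.isfile(path)
        if status[name]:
            with np.load(path, allow_pickle=True) as data:
                print(f"{name}: keys = {sorted(data.keys())}")
        else:
            print(f"{name}: missing")
    return status

if __name__ == "__main__":
    check_dataset()
```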
If you want to train your own model, run:
python main.py --train --frames 243 -n "your_model_name" --keep_frames 9 --gpu 0
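The `--frames 243 --keep_frames 9` pairing reflects the example above: a 243-frame receptive field with only 9 frames actually fed to the network. As a rough illustration of the idea (the uniform index selection below is hypothetical and not necessarily NanoHTNet's actual sampling strategy), the kept frames could be spread across the window like this:

```python
import numpy as np

def sample_keep_frames(receptive_field=243, keep_frames=9):
    """Pick `keep_frames` indices spread uniformly across a window of
    `receptive_field` frames (a hypothetical illustration only)."""
    idx = np.linspace(0, receptive_field - 1, num=keep_frames)
    return idx.round().astype(int)

if __name__ == "__main__":
    print(sample_keep_frames())
```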
Our code is extended from the following repositories. We thank the authors for releasing their code.
This project is licensed under the terms of the MIT license.

