Commit 6f65d7e

docs: improve README with model demo videos and publication table - Add model demos, track MOV files with Git LFS, enhance publication section with detailed table and summaries
1 parent 71f19b0 commit 6f65d7e

7 files changed, +108 -16 lines changed

.gitattributes

Lines changed: 42 additions & 0 deletions
@@ -1,3 +1,45 @@
+# Model files
+*.pt
+*.pth
+*.onnx
+*.h5
+*.hdf5
+*.pkl
+*.bin
+*.weights
+*.model
+# Dataset files
+*.npy
+*.npz
+*.csv
+*.json
+*.jsonl
+*.json.gz
+*.tar
+*.tar.gz
+*.zip
+# Media files
+*.mp4
+*.avi
+*.mov filter=lfs diff=lfs merge=lfs -text
+*.wmv
+*.flv
+*.mkv
+*.mp3
+*.wav
+*.png
+*.jpg
+*.jpeg
+*.gif
+*.bmp
+*.tiff
+# Other large files
+*.iso
+*.dmg
+*.exe
+*.dll
+*.so
+*.dylib
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 cosmos1/models/diffusion/nemo/post_training/multicamera/multi_camera_video_batch_holding_cup.pt filter=lfs diff=lfs merge=lfs -text
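
In Git's attribute format, a pattern line that lists no attributes assigns nothing to the matching paths, so among the patterns added in this hunk only `*.mov` is actually routed through Git LFS (alongside the three pre-existing entries). The following is a minimal sketch for listing which patterns in a `.gitattributes` file carry `filter=lfs`. The function name `lfs_tracked_patterns` and the sample text are illustrative assumptions, not part of the repository, and the whitespace split does not handle quoted patterns that contain spaces.

```python
def lfs_tracked_patterns(gitattributes_text: str) -> list[str]:
    """Return the patterns whose attribute list includes filter=lfs."""
    tracked = []
    for raw in gitattributes_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # First token is the pattern; the rest are attributes (possibly none).
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            tracked.append(pattern)
    return tracked


# Hypothetical excerpt modeled on the hunk above.
SAMPLE = """\
# Media files
*.mp4
*.mov filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.pt
*.pt filter=lfs diff=lfs merge=lfs -text
"""

if __name__ == "__main__":
    print(lfs_tracked_patterns(SAMPLE))
```

Running `git check-attr filter -- <path>` inside a checkout is the authoritative way to confirm how Git resolves a specific path; the sketch only inspects the file text.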

README.md

Lines changed: 51 additions & 16 deletions
@@ -2,28 +2,63 @@
 <img src="assets/nvidia-cosmos-header.png" alt="NVIDIA Cosmos Header">
 </p>
 
-## GitHub project for NVIDIA Cosmos: https://github.com/nvidia-cosmos
+[NVIDIA Cosmos](https://www.nvidia.com/cosmos/) is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster. Cosmos contains
 
-NVIDIA Cosmos now includes three subprojects:
+1. Pre-trained models (available via Hugging Face) under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) that allows free commercial use.
+2. Pre-training, post-training, and inference code (available in native PyTorch) under the [Apache 2 License](https://www.apache.org/licenses/LICENSE-2.0).
 
-### Cosmos-Predict1: https://github.com/nvidia-cosmos/cosmos-predict1
-- Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
+There are three main model families in Cosmos World Foundation Model Platform.
 
-<video src="https://github.com/user-attachments/assets/2ee7386b-8808-4db2-b38a-87ab679339f9">
-Your browser does not support the video tag.
-</video>
+1. [Cosmos Predict](https://github.com/nvidia-cosmos/cosmos-predict1): a collection of general-purpose world models for future state prediction.
 
-### Cosmos-Transfer1: https://github.com/nvidia-cosmos/cosmos-transfer1
-- Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
+2. [Cosmos Transfer](https://github.com/nvidia-cosmos/cosmos-transfer1): a collection of multimodal conditional world generation model for various domain transfer applications such as Sim2Real.
 
-<video src="https://github.com/user-attachments/assets/cf10262d-e8db-4996-813d-914332f3e00e">
-Your browser does not support the video tag.
-</video>
+3. [Cosmos Reason](https://github.com/nvidia-cosmos/cosmos-reason1): a collection of Physical AI reasoning models for planning and critics.
 
+Being a minimalist, we have these individual models in individual repositories under [nvidia-github](https://github.com/nvidia-cosmos).
 
-### Cosmos-Reason1: https://github.com/nvidia-cosmos/cosmos-reason1
-- Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.
 
------------------------------------------------------------
+| Example Model Behavior |
+|--------|
+| [Cosmos-Predict Text2World](https://github.com/nvidia-cosmos/cosmos-predict1) |
+| <video width="1080" controls> <source src="assets/cosmos-predict1/predict1_text2world.mov" type="video/quicktime"> Your browser does not support the video tag.</video>|
+| [Cosmos-Predict Video2World](https://github.com/nvidia-cosmos/cosmos-predict1) |
+| <video width="1080" controls> <source src="assets/cosmos-predict1/predict1_video2world.mov" type="video/quicktime"> Your browser does not support the video tag. </video> |
+| [Cosmos-Transfer LiDAR + HDMap Conditional Inputs -> World](https://github.com/nvidia-cosmos/cosmos-transfer1) |
+| <video width="1080" controls> <source src="assets/cosmos-transfer1/transfer1_lidarhdmap.mov" type="video/quicktime"> Your browser does not support the video tag. </video> |
+| [Cosmos-Transfer Multimodal Conditional Inputs -> World](https://github.com/nvidia-cosmos/cosmos-transfer1) |
+| <video width="1080" controls> <source src="assets/cosmos-transfer1/transfer1_lidarhdmap.mov" type="video/quicktime"> Your browser does not support the video tag. </video> |
+| [Cosmos-Reason Physical AI Planning](https://github.com/nvidia-cosmos/cosmos-transfer1) |
+| <video width="1080" controls> <source src="assets/cosmos-transfer1/transfer1_multimodal.mov" type="video/quicktime"> Your browser does not support the video tag. </video> |
 
-This repository will be archived soon. To check out the initial release of NVIDIA Cosmos, please follow [README_CES2025.md](README_CES2025.md).
+### Cosmos Publication
+
+<table>
+<tr>
+<th width="40%">Paper Title</th>
+<th width="30%">Summary</th>
+<th width="15%">Authors</th>
+<th width="15%">Date</th>
+</tr>
+<tr>
+<td><a href="https://arxiv.org/abs/2503.15558">Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning</a></td>
+<td>Introduces a reasoning model for physical AI that combines common sense knowledge with embodied reasoning capabilities.</td>
+<td>NVIDIA</td>
+<td>2025-03-19</td>
+</tr>
+<tr>
+<td><a href="https://arxiv.org/abs/2503.14492">Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control</a></td>
+<td>Presents a multimodal model for conditional world generation with adaptive control mechanisms.</td>
+<td>NVIDIA</td>
+<td>2025-03-18</td>
+</tr>
+<tr>
+<td><a href="https://arxiv.org/abs/2501.03575">Cosmos World Foundation Model Platform for Physical AI</a></td>
+<td>Overview of the Cosmos platform, its architecture, and applications in physical AI systems. Introduction of Cosmos-Predict1 world models.</td>
+<td>NVIDIA</td>
+<td>2025-01-06</td>
+</tr>
+</table>
+
+### Developer
+For native PyTorch developers, we provide native PyTorch training and inference scripts in [nvidia-github](https://github.com/nvidia-cosmos). For Nemo developers, please refer to [README_CES2025.md](README_CES2025.md).
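
The demo table added in this hunk embeds five `<video>` elements, and as shown the `transfer1_lidarhdmap.mov` source appears in two different rows. A small sketch for listing the `<source src="...">` paths in README text and flagging any path that is used more than once could look like this. The function names, the regular expression, and the abbreviated sample are assumptions for illustration; they are not part of the repository, and a regex is only a rough stand-in for a real HTML parser.

```python
import re
from collections import Counter

# Matches the src attribute of <source> tags written with double quotes.
SOURCE_RE = re.compile(r'<source\s+src="([^"]+)"')


def video_sources(markdown: str) -> list[str]:
    """Return every <source src=...> path in document order."""
    return SOURCE_RE.findall(markdown)


def repeated_sources(markdown: str) -> list[str]:
    """Return paths referenced more than once, in first-seen order."""
    counts = Counter(video_sources(markdown))
    return [src for src, n in counts.items() if n > 1]


# Abbreviated rows modeled on the table in the hunk above.
SAMPLE = """
| <video controls> <source src="assets/cosmos-predict1/predict1_text2world.mov" type="video/quicktime"> </video> |
| <video controls> <source src="assets/cosmos-predict1/predict1_video2world.mov" type="video/quicktime"> </video> |
| <video controls> <source src="assets/cosmos-transfer1/transfer1_lidarhdmap.mov" type="video/quicktime"> </video> |
| <video controls> <source src="assets/cosmos-transfer1/transfer1_lidarhdmap.mov" type="video/quicktime"> </video> |
| <video controls> <source src="assets/cosmos-transfer1/transfer1_multimodal.mov" type="video/quicktime"> </video> |
"""

if __name__ == "__main__":
    for src in repeated_sources(SAMPLE):
        print("referenced more than once:", src)
```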
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:89e1f788da739432c6c67fb7cf7e781526d35cfb7170f033a738e8e57a3b2515
+size 2409615

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:60db56ab0b69d75a735d75e1bb2284052288d4fe55ea603294119cf44c19f06f
+size 1652618

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0f84db068e723cd7949b680028419a16ca24417fc48e0789f2f483c3d52ac5b
+size 1100892

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2cc5f47becc8d6c8776846b915a5617b502dc078009910f18d4e22d3bd9bab17
+size 3321988

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8772f3b08f4a37c62b0e7fa46faa252c2ad9cb8ec2dc9264102c32ef14bd010f
+size 1381224
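
Each of the five added files above is a Git LFS pointer rather than the video itself: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading such a pointer, assuming exactly the three keys shown in this diff, might look as follows. The `LfsPointer` class and `parse_lfs_pointer` function are hypothetical names introduced for illustration and are not a Git LFS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    version: str
    oid: str   # hex digest without the "sha256:" prefix
    size: int  # size of the real object in bytes


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the three-line pointer format shown in the diff above."""
    # Each non-empty line is "key value", separated by a single space.
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256" or len(digest) != 64:
        raise ValueError(f"unexpected oid: {fields['oid']!r}")
    int(digest, 16)  # raises ValueError if the digest is not hexadecimal
    return LfsPointer(fields["version"], digest, int(fields["size"]))


# First pointer from the diff above, copied verbatim.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:89e1f788da739432c6c67fb7cf7e781526d35cfb7170f033a738e8e57a3b2515
size 2409615
"""

if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    print(pointer.oid[:8], pointer.size)
```

Summing the `size` fields of the five pointers gives 9,866,337 bytes, which is roughly the amount of video data this commit moves into LFS storage.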
