|
2 | 2 | <img src="assets/nvidia-cosmos-header.png" alt="NVIDIA Cosmos Header">
|
3 | 3 | </p>
|
4 | 4 |
|
5 |
| -## GitHub project for NVIDIA Cosmos: https://github.com/nvidia-cosmos |
| 5 | +[NVIDIA Cosmos](https://www.nvidia.com/cosmos/) is a developer-first world foundation model platform designed to help Physical AI developers build their systems better and faster. Cosmos contains: |
6 | 6 |
|
7 |
| -NVIDIA Cosmos now includes three subprojects: |
| 7 | +1. Pre-trained models (available via Hugging Face) under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) that allows free commercial use. |
| 8 | +2. Pre-training, post-training, and inference code (available in native PyTorch) under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0). |
8 | 9 |
|
9 |
| -### Cosmos-Predict1: https://github.com/nvidia-cosmos/cosmos-predict1 |
10 |
| -- Cosmos-Predict1 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications. |
| 10 | +The Cosmos World Foundation Model Platform has three main model families: |
11 | 11 |
|
12 |
| -<video src="https://github.com/user-attachments/assets/2ee7386b-8808-4db2-b38a-87ab679339f9"> |
13 |
| - Your browser does not support the video tag. |
14 |
| -</video> |
| 12 | +1. [Cosmos Predict](https://github.com/nvidia-cosmos/cosmos-predict1): a collection of general-purpose world models for future state prediction. |
15 | 13 |
|
16 |
| -### Cosmos-Transfer1: https://github.com/nvidia-cosmos/cosmos-transfer1 |
17 |
| -- Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments. |
| 14 | +2. [Cosmos Transfer](https://github.com/nvidia-cosmos/cosmos-transfer1): a collection of multimodal conditional world generation models for domain transfer applications such as Sim2Real. |
18 | 15 |
|
19 |
| -<video src="https://github.com/user-attachments/assets/cf10262d-e8db-4996-813d-914332f3e00e"> |
20 |
| - Your browser does not support the video tag. |
21 |
| -</video> |
| 16 | +3. [Cosmos Reason](https://github.com/nvidia-cosmos/cosmos-reason1): a collection of Physical AI reasoning models for planning and critics. |
22 | 17 |
|
| 18 | +To keep each project minimal, we host each model family in its own repository under the [nvidia-cosmos](https://github.com/nvidia-cosmos) GitHub organization. |
23 | 19 |
|
24 |
| -### Cosmos-Reason1: https://github.com/nvidia-cosmos/cosmos-reason1 |
25 |
| -- Cosmos-Reason1 models understand the physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes. |
26 | 20 |
|
27 |
| ------------------------------------------------------------ |
| 21 | +| Example Model Behavior | |
| 22 | +|--------| |
| 23 | +| [Cosmos-Predict Text2World](https://github.com/nvidia-cosmos/cosmos-predict1) | |
| 24 | +| <video width="1080" controls> <source src="assets/cosmos-predict1/predict1_text2world.mov" type="video/quicktime"> Your browser does not support the video tag.</video>| |
| 25 | +| [Cosmos-Predict Video2World](https://github.com/nvidia-cosmos/cosmos-predict1) | |
| 26 | +| <video width="1080" controls> <source src="assets/cosmos-predict1/predict1_video2world.mov" type="video/quicktime"> Your browser does not support the video tag. </video> | |
| 27 | +| [Cosmos-Transfer LiDAR + HDMap Conditional Inputs -> World](https://github.com/nvidia-cosmos/cosmos-transfer1) | |
| 28 | +| <video width="1080" controls> <source src="assets/cosmos-transfer1/transfer1_lidarhdmap.mov" type="video/quicktime"> Your browser does not support the video tag. </video> | |
| 29 | +| [Cosmos-Transfer Multimodal Conditional Inputs -> World](https://github.com/nvidia-cosmos/cosmos-transfer1) | |
| 30 | +| <video width="1080" controls> <source src="assets/cosmos-transfer1/transfer1_lidarhdmap.mov" type="video/quicktime"> Your browser does not support the video tag. </video> | |
| 31 | +| [Cosmos-Reason Physical AI Planning](https://github.com/nvidia-cosmos/cosmos-reason1) | |
| 32 | +| <video width="1080" controls> <source src="assets/cosmos-transfer1/transfer1_multimodal.mov" type="video/quicktime"> Your browser does not support the video tag. </video> | |
28 | 33 |
|
29 |
| -This repository will be archived soon. To check out the initial release of NVIDIA Cosmos, please follow [README_CES2025.md](README_CES2025.md). |
| 34 | +### Cosmos Publications |
| 35 | + |
| 36 | +<table> |
| 37 | + <tr> |
| 38 | + <th width="40%">Paper Title</th> |
| 39 | + <th width="30%">Summary</th> |
| 40 | + <th width="15%">Authors</th> |
| 41 | + <th width="15%">Date</th> |
| 42 | + </tr> |
| 43 | + <tr> |
| 44 | + <td><a href="https://arxiv.org/abs/2503.15558">Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning</a></td> |
| 45 | + <td>Introduces a reasoning model for physical AI that combines common sense knowledge with embodied reasoning capabilities.</td> |
| 46 | + <td>NVIDIA</td> |
| 47 | + <td>2025-03-19</td> |
| 48 | + </tr> |
| 49 | + <tr> |
| 50 | + <td><a href="https://arxiv.org/abs/2503.14492">Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control</a></td> |
| 51 | + <td>Presents a multimodal model for conditional world generation with adaptive control mechanisms.</td> |
| 52 | + <td>NVIDIA</td> |
| 53 | + <td>2025-03-18</td> |
| 54 | + </tr> |
| 55 | + <tr> |
| 56 | + <td><a href="https://arxiv.org/abs/2501.03575">Cosmos World Foundation Model Platform for Physical AI</a></td> |
| 57 | + <td>Gives an overview of the Cosmos platform, its architecture, and its applications in physical AI systems, and introduces the Cosmos-Predict1 world models.</td> |
| 58 | + <td>NVIDIA</td> |
| 59 | + <td>2025-01-06</td> |
| 60 | + </tr> |
| 61 | +</table> |
| 62 | + |
| 63 | +### Developer |
| 64 | +For PyTorch developers, we provide native PyTorch training and inference scripts in the [nvidia-cosmos](https://github.com/nvidia-cosmos) GitHub organization. For NeMo developers, please refer to [README_CES2025.md](README_CES2025.md). |