Setup environment:

```bash
conda create -n osgym python=3.10
```

Install libGL:

```bash
sudo apt-get update
sudo apt-get install libgl1 libglx-mesa0
```

Install required Linux headers:

```bash
sudo apt-get install linux-headers-$(uname -r)
```

Install essential build tools:

```bash
sudo apt-get install python3-dev build-essential
```

Then install the Python dependencies:

```bash
pip install -r requirements.txt
```

Install Docker
Setup Docker apt repository:
```bash
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
```

Install Docker:

```bash
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```

Verify the installation:
```bash
sudo docker run hello-world
```

Launch server:

```bash
./start_workers.sh
```

Clean up server:

```bash
./clean.sh
```

Launch server locally:

```bash
./start_workers.sh --local
```

Benchmark speed:

```bash
cd examples
python test_osgym.py
```

API Reference

POST `/reset`

- Description: Initializes a new environment with a given task configuration.
- Request Body:

```json
{
  "task_config": { ... },  // Task configuration JSON
  "timeout": 1000          // Timeout in seconds
}
```

- Response:

```json
{
  "screenshot": "<base64-encoded image>",
  "problem": "<task instruction>",
  "vm_id": <int>
}
```
POST `/step`

- Description: Executes an action in the environment.
- Request Body:

```json
{
  "action": "<action string>",
  "vm_id": <int>
}
```

- Response:

```json
{
  "screenshot": "<base64-encoded image>",
  "is_finish": <bool>,
  "reward": <float>
}
```
POST `/shutdown`

- Description: Shuts down and releases a VM.
- Request Body:

```json
{ "vm_id": <int or 'all'> }
```

- Response:

```json
{ "vm_id": <int or 'all'> }
```
GET `/screenshot`

- Description: Gets a screenshot of the current VM state.
- Query Parameters:
  - `vmId`: VM ID (integer)
- Response:

```json
{
  "screenshot": "<base64-encoded image>",
  "vm_id": <int>
}
```
Reset an environment:
```bash
curl -X POST http://localhost:20000/reset \
  -H "Content-Type: application/json" \
  -d '{"task_config": {...}, "timeout": 1000}'
```

Step in the environment:

```bash
curl -X POST http://localhost:20000/step \
  -H "Content-Type: application/json" \
  -d '{"action": "<|think_start|><|think_end|><|action_start|>click(100,100)<|action_end|>", "vm_id": 0}'
```

Shutdown a VM:

```bash
curl -X POST http://localhost:20000/shutdown \
  -H "Content-Type: application/json" \
  -d '{"vm_id": 0}'
```

- Authentication: The API restricts access by IP. Set the `OSGYM_ALLOWED_IPS` environment variable to allow your client IPs.
- VM IDs: VM IDs are managed by the server. Use the `vm_id` returned by `/reset` for subsequent calls.
- Screenshots: All screenshots are returned as base64-encoded PNG images.
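Putting the endpoints together, a minimal rollout loop might look like the sketch below; the placeholder policy and empty task configuration are illustrative only:

```python
import requests

BASE = "http://localhost:20000"  # one worker port; see the distributed setup below

def my_policy(obs):
    # Placeholder policy: always click a fixed point; replace with a real agent.
    return "<|think_start|><|think_end|><|action_start|>click(100,100)<|action_end|>"

# Sketch only: task_config is task-specific and left empty here.
obs = requests.post(f"{BASE}/reset", json={"task_config": {}, "timeout": 1000}).json()
vm_id = obs["vm_id"]

done = False
while not done:
    action = my_policy(obs)  # produce an action string from the screenshot/problem
    obs = requests.post(f"{BASE}/step", json={"action": action, "vm_id": vm_id}).json()
    done = obs["is_finish"]

requests.post(f"{BASE}/shutdown", json={"vm_id": vm_id})
```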
OSGym is designed for distributed, scalable RL/data collection. Each worker runs an environment server (FastAPI) on a different port, as specified in config.yaml.
To start all workers (one per port in config.yaml):
```bash
./start_workers.sh
```

- By default, this launches on all interfaces (`0.0.0.0`).
- For local-only testing, use:

```bash
./start_workers.sh --local
```
Each worker runs a FastAPI server (see API Reference above) and manages its own set of VMs.
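Because each worker is an independent server, a client can spread episodes across them. The sketch below is illustrative only: the port list is an assumption standing in for the ports listed in your config.yaml, and the helper simply round-robins `/reset` calls across workers:

```python
import itertools
import requests

# Assumption: these ports mirror the per-worker ports in your config.yaml.
WORKER_PORTS = [20000, 20001, 20002, 20003]
_port_cycle = itertools.cycle(WORKER_PORTS)

def reset_on_next_worker(task_config, timeout=1000):
    """Send /reset to the next worker in round-robin order (illustrative only)."""
    port = next(_port_cycle)
    obs = requests.post(
        f"http://localhost:{port}/reset",
        json={"task_config": task_config, "timeout": timeout},
    ).json()
    return port, obs
```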
To stop all workers and clean up Docker containers and processes:
```bash
./clean.sh
```

OSGym provides a data server and dataloader for efficient RL training and experience replay.
The data server (see examples/data_server.py) manages:
- ReplayBuffer: Stores and samples experience tuples for RL.
- MultiTurnDataloader: Handles parallel environment rollout, batching, and preprocessing for RL agents.
- `ReplayBuffer`
  - Stores (state, action, reward, next_state, done) tuples.
  - Supports discounted reward calculation and outcome-only storage.
  - Sampling: `ReplayBuffer.sample(batch_size)`
- `MultiTurnDataloader`
  - Launches multiple environment workers in parallel (using multiprocessing).
  - Collects rollouts, preprocesses data (tokenization, multimodal support), and batches for training.
  - Example usage:

```python
from examples.data_server import MultiTurnDataloader

dataloader = MultiTurnDataloader(
    env_class=YourEnvClass,
    env_configs=[...],        # List of env configs (one per worker)
    tokenizer=your_tokenizer,
    processor=your_processor, # Optional, for multimodal
    batch_size=8,
    ...
)

for batch in dataloader:
    # batch is a dict of tensors ready for model input
    ...
```
To sample training batches from the replay buffer and inspect its contents:

```python
batch = dataloader.sample_from_buffer(batch_size=32)
# batch is a dict with keys: input_ids, attention_mask, position_ids, responses, reward, etc.

dataloader.print_stats_in_replay_buffer()
```

Each worker uses a desktop environment server (Flask, see desktop_env/server/main.py) to interact with the OS.
Some useful endpoints (for advanced users):
| Endpoint | Method | Description |
|---|---|---|
| `/screenshot` | GET | Capture current screen (with cursor) |
| `/terminal` | GET | Get terminal output (Linux only) |
| `/accessibility` | GET | Get accessibility tree (A11y) |
| `/execute` | POST | Execute a shell command |
| `/list_directory` | POST | List directory contents |
| `/file` | POST | Download a file |
These are used internally by OSGym, but can be accessed directly for debugging or custom automation.
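For example, a debugging sketch that pulls a raw screen capture straight from a worker's desktop-environment server; the host, port, and the assumption that `/screenshot` returns raw image bytes are illustrative only, so check desktop_env/server/main.py for the actual values:

```python
import requests

# Assumptions for illustration: the Flask server is reachable at this address,
# and GET /screenshot responds with raw image bytes.
DESKTOP_SERVER = "http://localhost:5000"

resp = requests.get(f"{DESKTOP_SERVER}/screenshot")
with open("raw_screen.png", "wb") as f:
    f.write(resp.content)
```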
- Start workers: `./start_workers.sh`
- Interact via API: Use `/reset`, `/step`, `/shutdown` as described above.
- Collect data: Use the `MultiTurnDataloader` and `ReplayBuffer` for RL training.
- Benchmark: Run `python examples/test_osgym.py` to test and benchmark the system.
MIT License. Free for research and commercial use.
Some parts of the code were borrowed from OSWorld. Thanks for their open-source contribution!