Authors: Evan Zheran Liu*, Kelvin Guu*, Panupong (Ice) Pasupat*, Tianlin Shi, Percy Liang (* equal contribution)
Source code accompanying our ICLR 2018 paper:
Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration
The goal of this project is to train machine learning models (agents) to do things in a browser that can be specified in natural language, e.g. "Book a flight from San Francisco to New York for Dec 23rd."
- We tested the system with Python 3.9 (e.g., 3.9.3 on Mac and 3.9.13 on Windows)
- Python
  - Install virtualenv:

    ```
    pip3 install virtualenv
    ```

  - Create a virtual environment named `p5`:

    ```
    cd path/to/wge
    python3 -m venv p5
    ```

  - Activate the virtualenv on Mac:

    ```
    source p5/bin/activate
    ```

  - Activate the virtualenv on Windows from the command line (if you are using Windows PowerShell, see the next step):

    ```
    .\p5\Scripts\activate.bat
    ```

  - Activate the virtualenv on Windows from PowerShell:

    ```
    .\p5\Scripts\activate.ps1
    ```
- Python dependencies:

  ```
  pip install -r requirements.txt
  ```

  - If this gives you problems, try again with pip's `--ignore-installed` flag.
- (Mac) Go to the `miniwob-sandbox` folder and run the recording script from the terminal:

  ```
  cd miniwob-sandbox
  python record.py
  ```
- (Windows) The `miniwob-sandbox` folder is a link, which may not work on Windows. If that is the case, use the full path `third-party\miniwob-sandbox` and run the recording script from the command line:

  ```
  cd third-party\miniwob-sandbox
  python record.py
  ```
- If everything works out, you should see the message `Listening on http://localhost:8032/`.
- Open your browser and enter `http://localhost:8032/` in the address bar. You'll see an "Error: 404 Not Found" message, which indicates the server is working correctly.
- Open the task environment:
  - Press Cmd+O (Mac) or Ctrl+O (Windows)
  - Navigate to `miniwob-sandbox/html/miniwob`
  - Select an environment file (choose from):
    - `click-checkboxes-soft.html`
    - `email-inbox-forward-nl.html`
    - `social-media.html`
- To begin recording, append `?record=true` to the URL in your address bar. For example: `file:///path/to/wge/miniwob-sandbox/html/miniwob/social-media.html?record=true`
- Record 10 demonstrations for each environment (30 total recordings).
- Ensure the recording server is still running.
- Open the viewer:
  - Press Cmd+O (Mac) or Ctrl+O (Windows)
  - Navigate to `miniwob-sandbox/viewer`
  - Select `viewer.html`
  - The address should look like `file:///path/to/wge/miniwob-sandbox/viewer/viewer.html`
- Your recordings will appear in the left panel.
- Download GloVe from https://nlp.stanford.edu/data/glove.6B.zip and, after extraction, place it in the `wge/data` directory.
- Run the miniwob server: go to `path/to/wge/miniwob-sandbox/html/` and run the supplied `http-serve`.
  - For Mac:

    ```
    cd path/to/wge/miniwob-sandbox/html/
    ./http-serve
    ```

  - For Windows:

    ```
    cd path\to\wge\miniwob-sandbox\html\
    .\http-serve.bat
    ```
- The server should now be running at `http://localhost:8080/`.
- Next, set the environment variables:
  - For Mac, each time you open a new terminal to run an experiment, set these environment variables:

    ```
    export REPO_DIR=/path/to/wge/
    export RL_DATA=/path/to/wge/data/
    export RL_DEMO_DIR=/path/to/wge/miniwob-sandbox/out/
    export MINIWOB_BASE_URL='http://localhost:8080/'
    ```

  - For Windows, each time you open a new PowerShell window to run an experiment, set these environment variables (the `$env:` syntax below is PowerShell-specific):

    ```
    $env:REPO_DIR="\path\to\wge\"
    $env:RL_DATA="\path\to\wge\data\"
    $env:RL_DEMO_DIR="\path\to\wge\miniwob-sandbox\out\"
    $env:MINIWOB_BASE_URL="http://localhost:8080/"
    ```
- Once you've followed the above steps, test `MiniWoBEnvironment` by running:

  ```
  cd /path/to/wge/
  pytest wge/tests/miniwob/test_environment.py -s
  ```
- To train a model on a task, say `email-inbox-forward-nl`, run:

  ```
  python main.py configs/default-base.txt --task email-inbox-forward-nl
  ```

- Change the task name (the last argument) to train on other tasks.
- All training runs are managed by the `MiniWoBTrainingRuns` object.
- The most important methods on `MiniWoBTrainingRun` are:
  - `__init__`: the policy, the environment, demonstrations, etc., are all loaded here.
  - `train`: the actual training of the policy happens here.
During training, there are several key systems involved:

- the environment
- policies
  - the model policy
  - the exploration policy
- episode generators
  - the basic episode generator
  - the best-first episode generator
- the replay buffer
All environments implement the `Environment` interface. A policy interacts with the environment by calling the environment's `step` method and passing in actions.

Note that an environment object is batched: it actually represents a batch of environments, each running in parallel (so that we can train faster), as illustrated in the sketch below.
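For intuition, here is a minimal sketch of that batched convention. The class below is illustrative only; its name and signatures are assumptions, not the repo's actual `Environment` API:

```python
class ToyBatchedEnvironment:
    """Illustrative only: a batch of parallel environment instances
    that reset and step in lockstep. Not the repo's actual class."""

    def __init__(self, num_instances):
        self._states = [0] * num_instances  # toy per-instance state

    def reset(self):
        self._states = [0] * len(self._states)
        return list(self._states)  # one state per parallel instance

    def step(self, actions):
        # Expects one action per instance; returns one new state per instance.
        assert len(actions) == len(self._states)
        self._states = [s + a for s, a in zip(self._states, actions)]
        return list(self._states)

env = ToyBatchedEnvironment(num_instances=4)
states = env.reset()             # [0, 0, 0, 0]
states = env.step([1, 0, 2, 1])  # all instances advance together
```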
See the `Policy` interface. The most important methods are `act`, `update_from_episodes`, and `update_from_replay_buffer`.

Note that all of these methods are also batched (i.e., they operate on multiple episodes in parallel).
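As a rough sketch of the shape of this interface (the signatures below are assumptions, not the repo's actual definitions):

```python
from abc import ABC, abstractmethod

class PolicySketch(ABC):
    """Illustrative sketch of the Policy interface described above;
    method signatures are assumptions, not the repo's actual code."""

    @abstractmethod
    def act(self, states):
        """Return one action per state (batched across environments)."""

    @abstractmethod
    def update_from_episodes(self, episodes):
        """Update the policy from a batch of freshly generated episodes."""

    @abstractmethod
    def update_from_replay_buffer(self, replay_buffer):
        """Update the policy from episodes sampled from the replay buffer."""
```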
The model policy is the main one that we are trying to train. See `MiniWoBPolicy` as an example.
See the `EpisodeGenerator` interface. An `EpisodeGenerator` runs a `Policy` on an `Environment` to produce an `Episode`.
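Conceptually, an episode generator is a rollout loop like the hypothetical sketch below (names and signatures are assumptions; the repo's generators, e.g. the best-first one, are more sophisticated):

```python
def generate_episodes_sketch(policy, env, max_steps):
    """Hypothetical rollout loop: run a batched policy on a batched
    environment and collect one episode per parallel instance."""
    states = env.reset()
    episodes = [[] for _ in states]  # one transition list per instance
    for _ in range(max_steps):
        actions = policy.act(states)     # batched action selection
        next_states = env.step(actions)  # all instances advance in lockstep
        for episode, state, action in zip(episodes, states, actions):
            episode.append((state, action))
        states = next_states
    return episodes
```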
See the `ReplayBuffer` interface. A `ReplayBuffer` stores episodes produced by the exploration policy. The final model policy is trained off episodes sampled from the replay buffer.
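A minimal sketch of that pattern (illustrative only; the repo's actual `ReplayBuffer` may store and sample episodes differently):

```python
import random

class ReplayBufferSketch:
    """Illustrative only: keep the most recent episodes from the
    exploration policy and sample batches to train the model policy."""

    def __init__(self, max_size):
        self._episodes = []
        self._max_size = max_size

    def extend(self, episodes):
        self._episodes.extend(episodes)
        # Evict the oldest episodes once the buffer is over capacity.
        self._episodes = self._episodes[-self._max_size:]

    def sample(self, batch_size):
        return random.sample(self._episodes,
                             min(batch_size, len(self._episodes)))
```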
All configs are in the `configs` folder. They are specified in HOCON format. The arguments to `main.py` should be a list of paths to config files; `main.py` then merges these config files according to the rules explained here.
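For intuition, here is a minimal sketch of the usual HOCON merge behavior, using the `pyhocon` package; whether the repo uses `pyhocon` and exactly these rules is an assumption:

```python
# Illustrative only: standard HOCON merging, where values from a later
# config override values from an earlier one, key by key.
from pyhocon import ConfigFactory, ConfigTree

base = ConfigFactory.parse_string('train { batch_size = 32, learning_rate = 0.001 }')
override = ConfigFactory.parse_string('train { learning_rate = 0.01 }')

merged = ConfigTree.merge_configs(base, override)
print(merged['train']['learning_rate'])  # 0.01 -- the later file wins
print(merged['train']['batch_size'])     # 32 -- inherited from the base config
```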