
Commit 4493d1f

sweeps notebook for oreilly class
1 parent c89043b commit 4493d1f

File tree

1 file changed: +382 −0 lines changed

examples/keras-fashion/sweeps.ipynb

Lines changed: 382 additions & 0 deletions
@@ -0,0 +1,382 @@
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "pygments_lexer": "ipython3",
      "nbconvert_exporter": "python",
      "version": "3.6.4",
      "file_extension": ".py",
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "name": "python",
      "mimetype": "text/x-python"
    },
    "colab": {
      "name": "Hyperparameter Sweeps with W&B.ipynb",
      "provenance": [],
      "private_outputs": true,
      "collapsed_sections": []
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QJc0KMZlzdtO",
        "colab_type": "text"
      },
      "source": [
        "# Introduction to Hyperparameter Sweeps\n",
        "\n",
        "Searching through high-dimensional hyperparameter spaces for the most performant model can get unwieldy very fast. Hyperparameter sweeps provide an organized and efficient way to conduct a battle royale of models and pick the most accurate one. They do this by automatically searching through combinations of hyperparameter values (e.g. learning rate, batch size, number of hidden layers, optimizer type) to find the optimal values.\n",
        "\n",
        "In this tutorial we'll see how you can run sophisticated hyperparameter sweeps in 3 easy steps using Weights and Biases.\n",
        "\n",
        "## Sweeps: An Overview\n",
        "\n",
        "Running a hyperparameter sweep with Weights & Biases is very easy. There are just 3 simple steps:\n",
        "\n",
        "1. **Define the sweep:** we do this by creating a dictionary or a [YAML file](https://docs.wandb.com/library/sweeps/configuration) that specifies the parameters to search through, the search strategy, the optimization metric, and so on.\n",
        "\n",
        "2. **Initialize the sweep:** with one line of code we initialize the sweep and pass in the dictionary of sweep configurations:\n",
        "`sweep_id = wandb.sweep(sweep_config)`\n",
        "\n",
        "3. **Run the sweep agent:** also accomplished with one line of code, we call `wandb.agent()` and pass the `sweep_id` to run, along with a function that defines your model architecture and trains it:\n",
        "`wandb.agent(sweep_id, function=train)`\n",
        "\n",
        "And voila! That's all there is to running a hyperparameter sweep! In the notebook below, we'll walk through these 3 steps in more detail.\n",
        "\n",
        "\n",
        "We highly encourage you to fork this notebook, tweak the parameters, or try the model with your own dataset!\n",
        "\n",
        "## Resources\n",
        "- [Sweeps docs →](https://docs.wandb.com/library/sweeps)\n",
        "- [Launching from the command line →](https://www.wandb.com/articles/hyperparameter-tuning-as-easy-as-1-2-3)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ITmB8ev90U_t",
        "colab_type": "text"
      },
      "source": [
        "# Setup\n",
        "Start out by installing the experiment tracking library and setting up your free W&B account:\n",
        "\n",
        "\n",
        "* **pip install wandb** – Install the W&B library\n",
        "* **import wandb** – Import the wandb library\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "trusted": true,
        "id": "0m3yWmVgaJ5Y",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# WandB – Install the W&B library\n",
        "%pip install wandb -q\n",
        "import wandb\n",
        "from wandb.keras import WandbCallback"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "trusted": true,
        "colab_type": "code",
        "id": "3rv3mo4IuJn-",
        "colab": {}
      },
      "source": [
        "!pip install wandb -qq\n",
        "from keras.datasets import fashion_mnist\n",
        "from keras.models import Sequential\n",
        "from keras.layers import Conv2D, MaxPooling2D, Dropout, Dense, Flatten\n",
        "from keras.utils import np_utils\n",
        "from keras.optimizers import RMSprop, SGD, Adam, Nadam\n",
        "from keras.callbacks import ReduceLROnPlateau, ModelCheckpoint, Callback, EarlyStopping\n",
        "\n",
        "import os\n",
        "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'\n",
        "import tensorflow as tf\n",
        "tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)\n",
        "\n",
        "import wandb\n",
        "from wandb.keras import WandbCallback\n",
        "\n",
        "# Load the Fashion-MNIST dataset\n",
        "(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()\n",
        "labels=[\"T-shirt/top\",\"Trouser\",\"Pullover\",\"Dress\",\"Coat\",\n",
        "        \"Sandal\",\"Shirt\",\"Sneaker\",\"Bag\",\"Ankle boot\"]\n",
        "\n",
        "img_width=28\n",
        "img_height=28\n",
        "\n",
        "# Normalize pixel values to [0, 1]\n",
        "X_train = X_train.astype('float32') / 255.\n",
        "X_test = X_test.astype('float32') / 255.\n",
        "\n",
        "# reshape input data to (samples, width, height, channels)\n",
        "X_train = X_train.reshape(X_train.shape[0], img_width, img_height, 1)\n",
        "X_test = X_test.reshape(X_test.shape[0], img_width, img_height, 1)\n",
        "\n",
        "# one hot encode outputs\n",
        "y_train = np_utils.to_categorical(y_train)\n",
        "y_test = np_utils.to_categorical(y_test)\n",
        "num_classes = y_test.shape[1]"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AXcXCVlDU4ME",
        "colab_type": "text"
      },
      "source": [
        "## 1. Define the Sweep\n",
        "\n",
        "Weights & Biases sweeps give you powerful levers to configure your sweeps exactly how you want them, with just a few lines of code. The sweep config can be defined as a dictionary or a [YAML file](https://docs.wandb.com/library/sweeps).\n",
        "\n",
        "Let's walk through some of the options together:\n",
        "* **Metric** – This is the metric the sweep is attempting to optimize. Metrics take a `name` (this metric should be logged by your training script) and a `goal` (maximize or minimize). \n",
        "* **Search Strategy** – Specified using the `method` key. We support several different search strategies with sweeps. \n",
        "  * **Grid Search** – Iterates over every combination of hyperparameter values.\n",
        "  * **Random Search** – Iterates over randomly chosen combinations of hyperparameter values.\n",
        "  * **Bayesian Search** – Builds a probabilistic model that maps hyperparameters to the probability of a given metric score, and chooses the parameters most likely to improve the metric. Bayesian optimization spends more effort choosing each set of hyperparameter values, so that fewer combinations need to be tried overall.\n",
        "* **Stopping Criteria** – The strategy for deciding when to kill off poorly performing runs, so more combinations can be tried faster. We offer several custom scheduling algorithms like [HyperBand](https://arxiv.org/pdf/1603.06560.pdf) and Envelope.\n",
        "* **Parameters** – A dictionary containing the hyperparameter names and, for each, either discrete values, max and min values, or a distribution from which to sample values to sweep over.\n",
        "\n",
        "You can find a list of all configuration options [here](https://docs.wandb.com/library/sweeps/configuration)."
      ]
    },
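    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "For reference, here is a sketch of what an equivalent YAML config might look like if you prefer a file-based setup – field names follow the [configuration docs](https://docs.wandb.com/library/sweeps/configuration), and the values shown are illustrative, not the full config used below:\n",
        "\n",
        "```yaml\n",
        "method: random\n",
        "metric:\n",
        "  name: accuracy\n",
        "  goal: maximize\n",
        "parameters:\n",
        "  learning_rate:\n",
        "    values: [0.01, 0.001, 0.0001]\n",
        "  optimizer:\n",
        "    values: [adam, sgd]\n",
        "```\n",
        "\n",
        "In this notebook we define the same kind of config as a Python dictionary instead."
      ]
    },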
    {
      "cell_type": "code",
      "metadata": {
        "id": "EdaHO-3M8ly3",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Configure the sweep – specify the parameters to search through, the search strategy, the optimization metric, and so on.\n",
        "sweep_config = {\n",
        "    'method': 'random',  # grid, random\n",
        "    'metric': {\n",
        "        'name': 'accuracy',\n",
        "        'goal': 'maximize'\n",
        "    },\n",
        "    'parameters': {\n",
        "        'epochs': {\n",
        "            'values': [2, 5, 10]\n",
        "        },\n",
        "        'batch_size': {\n",
        "            'values': [256, 128, 64, 32]\n",
        "        },\n",
        "        'dropout': {\n",
        "            'values': [0.3, 0.4, 0.5]\n",
        "        },\n",
        "        'conv_layer_size': {\n",
        "            'values': [16, 32, 64]\n",
        "        },\n",
        "        'weight_decay': {\n",
        "            'values': [0.0005, 0.005, 0.05]\n",
        "        },\n",
        "        'learning_rate': {\n",
        "            'values': [1e-2, 1e-3, 1e-4, 3e-4, 3e-5, 1e-5]\n",
        "        },\n",
        "        'optimizer': {\n",
        "            'values': ['adam', 'nadam', 'sgd', 'rmsprop']\n",
        "        },\n",
        "        'activation': {\n",
        "            'values': ['relu', 'elu', 'selu', 'softmax']\n",
        "        }\n",
        "    }\n",
        "}"
      ],
      "execution_count": 0,
      "outputs": []
    },
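    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Instead of a fixed list of `values`, a parameter can also be sampled from a range – handy with random or Bayesian search. A minimal sketch (the variable name `sweep_config_v2` and the bounds are illustrative; the `distribution`/`min`/`max` keys follow the configuration docs):\n",
        "\n",
        "```python\n",
        "# A hypothetical variant that samples the learning rate instead of listing values\n",
        "sweep_config_v2 = {\n",
        "    'method': 'random',\n",
        "    'metric': {'name': 'accuracy', 'goal': 'maximize'},\n",
        "    'parameters': {\n",
        "        'learning_rate': {\n",
        "            'distribution': 'uniform',  # sample uniformly from [min, max]\n",
        "            'min': 1e-5,\n",
        "            'max': 1e-2\n",
        "        },\n",
        "        'dropout': {'values': [0.3, 0.4, 0.5]}\n",
        "    }\n",
        "}\n",
        "```"
      ]
    },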
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ll_zFhpaU6cu",
        "colab_type": "text"
      },
      "source": [
        "## 2. Initialize the Sweep"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "kdc7RBBaU0F3",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Initialize a new sweep\n",
        "# Arguments:\n",
        "#  – sweep_config: the sweep config dictionary defined above\n",
        "#  – entity: the username (or team) for the sweep\n",
        "#  – project: the project name for the sweep\n",
        "sweep_id = wandb.sweep(sweep_config, entity=\"sweep\", project=\"sweeps-tutorial\")"
      ],
      "execution_count": 0,
      "outputs": []
    },
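    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "If you'd rather work from a terminal, this step has a CLI equivalent: `wandb sweep <your-config>.yaml` creates the sweep from a YAML file like the sketch above and prints a sweep ID you can hand to `wandb agent`. See the command-line article linked in the Resources section for details."
      ]
    },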
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FT4dKp1xVeeT",
        "colab_type": "text"
      },
      "source": [
        "### Define Your Neural Network\n",
        "Before we can run the sweep, let's define a function that creates and trains our neural network.\n",
        "\n",
        "In the function below, we define a simplified version of a VGG19 model in Keras, and add the following lines of code to log model metrics, visualize performance and output, and track our experiments easily:\n",
        "* **wandb.init()** – Initialize a new W&B run. Each run is a single execution of the training script.\n",
        "* **wandb.config** – Save all your hyperparameters in a config object. This lets you use our app to sort and compare your runs by hyperparameter values.\n",
        "* **callbacks=[WandbCallback()]** – Fetch all layer dimensions and model parameters and log them automatically to your W&B dashboard.\n",
        "* **wandb.log()** – Logs custom objects – these can be images, videos, audio files, HTML, plots, point clouds etc. Here, image logging is handled by the WandbCallback, which overlays the actual and predicted labels on Fashion-MNIST validation images."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "trusted": true,
        "id": "aIhxl7glaJ5k",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# The sweep calls this function with each set of hyperparameters\n",
        "def train():\n",
        "    # Default values for hyper-parameters we're going to sweep over\n",
        "    config_defaults = {\n",
        "        'epochs': 5,\n",
        "        'batch_size': 128,\n",
        "        'weight_decay': 0.0005,\n",
        "        'learning_rate': 1e-3,\n",
        "        'activation': 'relu',\n",
        "        'optimizer': 'nadam',\n",
        "        'hidden_layer_size': 128,\n",
        "        'conv_layer_size': 16,\n",
        "        'dropout': 0.5,\n",
        "        'momentum': 0.9,\n",
        "        'seed': 42\n",
        "    }\n",
        "\n",
        "    # Initialize a new wandb run\n",
        "    wandb.init(config=config_defaults)\n",
        "\n",
        "    # Config is a variable that holds and saves hyperparameters and inputs\n",
        "    config = wandb.config\n",
        "\n",
        "    # Define the model architecture – a simplified version of VGG19\n",
        "    model = Sequential()\n",
        "\n",
        "    # Two Conv2D layers with config.conv_layer_size filters, followed by MaxPooling2D\n",
        "    model.add(Conv2D(filters=config.conv_layer_size, kernel_size=(3, 3), padding='same',\n",
        "                     activation=config.activation, input_shape=(img_width, img_height, 1)))\n",
        "    model.add(Dropout(config.dropout))\n",
        "\n",
        "    model.add(Conv2D(filters=config.conv_layer_size, kernel_size=(3, 3),\n",
        "                     padding='same', activation=config.activation))\n",
        "    model.add(MaxPooling2D(pool_size=(2, 2)))\n",
        "\n",
        "    model.add(Flatten())\n",
        "    model.add(Dense(config.hidden_layer_size, activation=config.activation))\n",
        "\n",
        "    model.add(Dense(num_classes, activation=\"softmax\"))\n",
        "\n",
        "    # Define the optimizer\n",
        "    if config.optimizer == 'sgd':\n",
        "        optimizer = SGD(lr=config.learning_rate, decay=1e-5, momentum=config.momentum, nesterov=True)\n",
        "    elif config.optimizer == 'rmsprop':\n",
        "        optimizer = RMSprop(lr=config.learning_rate, decay=1e-5)\n",
        "    elif config.optimizer == 'adam':\n",
        "        optimizer = Adam(lr=config.learning_rate, beta_1=0.9, beta_2=0.999, clipnorm=1.0)\n",
        "    elif config.optimizer == 'nadam':\n",
        "        optimizer = Nadam(lr=config.learning_rate, beta_1=0.9, beta_2=0.999, clipnorm=1.0)\n",
        "\n",
        "    model.compile(loss=\"categorical_crossentropy\", optimizer=optimizer, metrics=['accuracy'])\n",
        "\n",
        "    model.fit(X_train, y_train, batch_size=config.batch_size,\n",
        "              epochs=config.epochs,\n",
        "              validation_data=(X_test, y_test),\n",
        "              callbacks=[WandbCallback(data_type=\"image\", validation_data=(X_test, y_test), labels=labels),\n",
        "                         EarlyStopping(patience=10, restore_best_weights=True)])"
      ],
      "execution_count": 0,
      "outputs": []
    },
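    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Before launching the full sweep, it can help to sanity-check this function once on its own – when `train()` isn't invoked by a sweep agent, `wandb.init()` simply starts a normal run using `config_defaults`:\n",
        "\n",
        "```python\n",
        "# Run a single training job with the default hyperparameters\n",
        "train()\n",
        "```"
      ]
    },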
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qVxXIXXTyOLC",
        "colab_type": "text"
      },
      "source": [
        "## 3. Run the sweep agent"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "1gD9qhA9yOYs",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "# Run the sweep agent\n",
        "# Arguments:\n",
        "#  – sweep_id: the sweep_id to run – this was returned above by wandb.sweep()\n",
        "#  – function: function that defines your model architecture and trains it\n",
        "wandb.agent(sweep_id, train)"
      ],
      "execution_count": 0,
      "outputs": []
    },
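    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "With `method: random` the agent will keep sampling new combinations indefinitely, so you may want to cap it. `wandb.agent()` accepts a `count` argument for this (the value 10 below is just an example):\n",
        "\n",
        "```python\n",
        "# Stop the agent after 10 runs\n",
        "wandb.agent(sweep_id, train, count=10)\n",
        "```"
      ]
    },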
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lWE4BXssxsDJ",
        "colab_type": "text"
      },
      "source": [
        "# Visualize Sweep Results\n",
        "\n",
        "## Parallel Coordinates Plot\n",
        "This plot maps hyperparameter values to model metrics. It's useful for homing in on the combinations of hyperparameters that led to the best model performance.\n",
        "\n",
        "![](https://assets.website-files.com/5ac6b7f2924c652fd013a891/5e190366778ad831455f9af2_s_194708415DEC35F74A7691FF6810D3B14703D1EFE1672ED29000BA98171242A5_1578695138341_image.png)\n",
        "\n",
        "## Hyperparameter Importance Plot\n",
        "The hyperparameter importance plot surfaces which hyperparameters were the best predictors of, and most highly correlated with, desirable values of your metrics.\n",
        "\n",
        "![](https://assets.website-files.com/5ac6b7f2924c652fd013a891/5e190367778ad820b35f9af5_s_194708415DEC35F74A7691FF6810D3B14703D1EFE1672ED29000BA98171242A5_1578695757573_image.png)\n",
        "\n",
        "These visualizations can save you both time and compute on expensive hyperparameter optimizations by homing in on the parameters (and value ranges) that matter most, and are thereby worthy of further exploration.\n",
        "\n",
        "# Next step – Get your hands dirty with sweeps\n",
        "We created a simple training script and [a few flavors of sweep configs](https://github.com/wandb/examples/tree/master/keras-cnn-fashion) for you to play with. We highly encourage you to give these a try. This repo also has examples to help you try more advanced sweep features like [Bayesian Hyperband](https://app.wandb.ai/wandb/examples-keras-cnn-fashion/sweeps/us0ifmrf?workspace=user-lavanyashukla) and [Hyperopt](https://app.wandb.ai/wandb/examples-keras-cnn-fashion/sweeps/xbs2wm5e?workspace=user-lavanyashukla)."
      ]
    }
  ]
}
