4 changes: 2 additions & 2 deletions ml/census_train_and_eval/README.md
@@ -35,7 +35,7 @@ You can read more about this model and its use [here](https://research.googleblo

We're using Estimators because they give us built-in support for distributed training and evaluation (along with other nice features). You should nearly always use Estimators to create your TensorFlow models. You can build a Custom Estimator if none of the pre-made Estimators suit your purpose.
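For a concrete picture of what using a pre-made Estimator looks like, here is a minimal sketch; the feature columns, hidden-layer sizes, and model directory below are illustrative placeholders, not the exact configuration used in this example:

```
import tensorflow as tf

# Illustrative feature columns (the real example builds these from the census data).
age = tf.feature_column.numeric_column('age')
education = tf.feature_column.indicator_column(
    tf.feature_column.categorical_column_with_vocabulary_list(
        'education', ['Bachelors', 'HS-grad', 'Masters']))

# A pre-made Estimator: a simple feed-forward classifier.
# `model_dir` is where checkpoints and summaries get written.
estimator = tf.estimator.DNNClassifier(
    feature_columns=[age, education],
    hidden_units=[100, 70, 50, 25],
    n_classes=2,
    model_dir='/tmp/census_model')
```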

See the accompanying [notebook](using_tf.estimator.train_and_evaluate.ipynb) for the details of defining our Estimator, including specifying the expected format of the input data.
See the accompanying [notebook](https://nbviewer.jupyter.org/github/amygdala/code-snippets/blob/master/ml/census_train_and_eval/using_tf.estimator.train_and_evaluate.ipynb#First-step:-create-an-Estimator) for the details of defining our Estimator, including specifying the expected format of the input data.
The data is in csv format, and looks like this:

```
…
```

@@ -83,7 +83,7 @@ The `Dataset` API is much more performant than using `feed_dict` or the queue-ba

In this simple example, our datasets are too small for the Datasets API to make a large difference, but with larger datasets it becomes much more important.

The `input_fn` definition is the following. It uses a couple of helper functions that are defined in the [notebook](using_tf.estimator.train_and_evaluate.ipynb).
The `input_fn` definition is the following. It uses a couple of helper functions that are defined in the [notebook](https://nbviewer.jupyter.org/github/amygdala/code-snippets/blob/master/ml/census_train_and_eval/using_tf.estimator.train_and_evaluate.ipynb#Define-input-functions-(using-Datasets)).
`parse_label_column` is used to convert the label strings (in our case, ' <=50K' and ' >50K') into [one-hot](https://en.wikipedia.org/wiki/One-hot) encodings.
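
To make that pattern concrete, here is a rough sketch of an `input_fn` built on the `Dataset` API, with a `parse_label_column` helper that turns the label strings into one-hot vectors. The column names, defaults, and batch size are illustrative, not the notebook's exact code:

```
import tensorflow as tf

# Illustrative schema -- the real census data has more columns.
CSV_COLUMNS = ['age', 'education', 'income_bracket']
CSV_DEFAULTS = [[0], [''], ['']]
LABELS = [' <=50K', ' >50K']

def parse_label_column(label_string_tensor):
    # Map label strings to integer ids, then to one-hot vectors.
    table = tf.contrib.lookup.index_table_from_tensor(tf.constant(LABELS))
    return tf.one_hot(table.lookup(label_string_tensor), len(LABELS))

def input_fn(filename, batch_size=40, shuffle=True, num_epochs=None):
    def decode_csv(line):
        columns = tf.decode_csv(line, record_defaults=CSV_DEFAULTS)
        return dict(zip(CSV_COLUMNS, columns))

    dataset = tf.data.TextLineDataset(filename).map(decode_csv)
    if shuffle:
        dataset = dataset.shuffle(buffer_size=10000)
    dataset = dataset.repeat(num_epochs).batch(batch_size)
    features = dataset.make_one_shot_iterator().get_next()
    label = features.pop('income_bracket')
    return features, parse_label_column(label)
```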


ml/census_train_and_eval/using_tf.estimator.train_and_evaluate.ipynb
@@ -106,7 +106,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from __future__ import division\n",
@@ -675,7 +677,9 @@
"If you take a look in the `trainer` subdirectory of this directory, you'll see that it contains essentially the same code that's in this notebook, just packaged for deployment. `trainer.task` is the entry point, and when that file is run, it calls `tf.estimator.train_and_evaluate`. \n",
"(You can read more about how to package your code [here](https://cloud.google.com/ml-engine/docs/packaging-trainer)). \n",
"\n",
"We'll test training via `gcloud` locally first, to make sure that we have everything packaged up correctly."
"We'll test training via `gcloud` locally first, to make sure that we have everything packaged up correctly.\n",
"\n",
"### Test training locally via `gcloud`"
]
},
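As a reference point for that local test, a `gcloud ml-engine local train` invocation typically looks something like the cell below. The trainer-specific flags after the `--` separator (data paths, step count, job dir) are placeholders here, since the exact argument names depend on the code in `trainer/`:

```
# Run the packaged trainer locally to check that the package is set up correctly.
!gcloud ml-engine local train \
    --module-name trainer.task \
    --package-path trainer/ \
    -- \
    --train-files $TRAIN_FILE \
    --eval-files $EVAL_FILE \
    --train-steps 1000 \
    --job-dir /tmp/census-local
```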
{
@@ -878,7 +882,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"!cat config_custom_gpus.yaml"
@@ -898,7 +904,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"job_name = \"census_job_%s\" % (int(time.time()))\n",
@@ -914,7 +922,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"!gcloud ml-engine jobs submit training $JOB_NAME --scale-tier $SCALE_TIER \\\n",
@@ -958,7 +968,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# We'll use the `hptuning_config.yaml` file for this run.\n",
@@ -968,7 +980,9 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"!gcloud ml-engine jobs submit training $JOB_NAME --scale-tier $SCALE_TIER \\\n",