Commit 0d179aa

skrish13 authored and soumith committed
Updated datasets.rst, combined all commits (pytorch#931)
Added MNIST in the docs
Updated incomplete cifar doc
Updated the datasets.rst to include all datasets
1 parent 5b171ad commit 0d179aa

1 file changed: +60 -7 lines changed

docs/source/torchvision/datasets.rst

Lines changed: 60 additions & 7 deletions
@@ -3,11 +3,13 @@ torchvision.datasets

 The following dataset loaders are available:

+- `MNIST`_
 - `COCO (Captioning and Detection)`_
 - `LSUN Classification`_
 - `ImageFolder`_
 - `Imagenet-12`_
 - `CIFAR10 and CIFAR100`_
+- `STL10`_

 Datasets have the API:

@@ -33,6 +35,15 @@ but they all take the keyword args:
 transforms it. For example, take in the caption string and return a
 tensor of word indices.

+MNIST
+~~~~~
+
+``dset.MNIST(root, train=True, transform=None, target_transform=None, download=False)``
+
+- ``root`` : root directory of the dataset, where ``processed/training.pt`` and ``processed/test.pt`` exist.
+- ``train`` : ``True`` = Training set, ``False`` = Test set
+- ``download`` : ``True`` = downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, place the processed dataset (function available in mnist.py) in the ``processed`` folder.
+
 COCO
 ~~~~
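
As a note on the MNIST entry added in this hunk: the ``root`` description implies two files, ``processed/training.pt`` and ``processed/test.pt``. The sketch below only checks for those two paths. The helper name ``mnist_processed_files_exist`` is hypothetical and is not a torchvision function; the placeholder files are empty and are not real MNIST data.

```python
import os
import tempfile


def mnist_processed_files_exist(root):
    """Check for the two processed files the MNIST docs expect under root.

    Hypothetical helper for illustration only; it is not a torchvision function.
    """
    processed_dir = os.path.join(root, "processed")
    expected = ("training.pt", "test.pt")
    return all(os.path.isfile(os.path.join(processed_dir, name)) for name in expected)


with tempfile.TemporaryDirectory() as root:
    # Nothing has been placed under root yet.
    before = mnist_processed_files_exist(root)

    # Create empty placeholder files in the documented locations.
    os.makedirs(os.path.join(root, "processed"))
    for name in ("training.pt", "test.pt"):
        open(os.path.join(root, "processed", name), "wb").close()
    after = mnist_processed_files_exist(root)

print(before, after)
```

A check like this could be run before constructing the dataset with ``download=False``, to confirm that ``root`` points at the expected layout.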

@@ -82,11 +93,42 @@ LSUN
 ``dset.LSUN(db_path, classes='train', [transform, target_transform])``

 - db\_path = root directory for the database files
-- classes =
-- ‘train’ - all categories, training set
-- ‘val’ - all categories, validation set
-- ‘test’ - all categories, test set
-- [‘bedroom\_train’, ‘church\_train’, …] : a list of categories to load
+- ``classes`` = ``'train'`` (all categories, training set), ``'val'`` (all categories, validation set), ``'test'`` (all categories, test set)
+- [``'bedroom_train'``, ``'church_train'``, …] : a list of categories to load
+
+ImageFolder
+~~~~~~~~~~~
+
+A generic data loader where the images are arranged in this way:
+
+::
+
+    root/dog/xxx.png
+    root/dog/xxy.png
+    root/dog/xxz.png
+
+    root/cat/123.png
+    root/cat/nsdf3.png
+    root/cat/asd932_.png
+
+``dset.ImageFolder(root="root folder path", [transform, target_transform])``
+
+It has the members:
+
+- ``self.classes`` - The class names as a list
+- ``self.class_to_idx`` - Corresponding class indices
+- ``self.imgs`` - The list of (image path, class-index) tuples
+
+Imagenet-12
+~~~~~~~~~~~
+
+This is simply implemented with an ImageFolder dataset.
+
+The data is preprocessed `as described
+here <https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset>`__
+
+`Here is an
+example <https://github.com/pytorch/examples/blob/27e2a46c1d1505324032b1d94fc6ce24d5b67e97/imagenet/main.py#L48-L62>`__.

 CIFAR
 ~~~~~
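
As a note on the ImageFolder section added in this hunk: the three documented members (``classes``, ``class_to_idx``, ``imgs``) can be understood as the result of scanning the ``root/<class>/<image>`` layout. The sketch below is a plain-Python illustration of that idea, not torchvision's implementation. The function name ``scan_image_folder``, the extension list, and the alphabetical ordering of classes are assumptions made for this example.

```python
import os
import tempfile

# Assumed set of image extensions for this sketch.
IMG_EXTENSIONS = (".png", ".jpg", ".jpeg")


def scan_image_folder(root):
    """Build classes, class_to_idx and imgs from a root/<class>/<image> layout."""
    # One class per sub-directory of root; sorted so indices are deterministic.
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    imgs = []
    for name in classes:
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            if fname.lower().endswith(IMG_EXTENSIONS):
                imgs.append((os.path.join(class_dir, fname), class_to_idx[name]))
    return classes, class_to_idx, imgs


def build_demo_tree(root):
    """Recreate the dog/cat layout from the docs with empty placeholder files."""
    layout = {
        "dog": ["xxx.png", "xxy.png", "xxz.png"],
        "cat": ["123.png", "nsdf3.png", "asd932_.png"],
    }
    for cls, files in layout.items():
        os.makedirs(os.path.join(root, cls))
        for fname in files:
            open(os.path.join(root, cls, fname), "wb").close()


with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)
    classes, class_to_idx, imgs = scan_image_folder(root)

print(classes)        # ['cat', 'dog']
print(class_to_idx)   # {'cat': 0, 'dog': 1}
print(len(imgs))      # 6
```

Each entry of ``imgs`` pairs an image path with the index of its class, matching the "(image path, class-index) tuples" wording in the docs.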
@@ -99,11 +141,22 @@ CIFAR
 ``cifar-10-batches-py``
 - ``train`` : ``True`` = Training set, ``False`` = Test set
 - ``download`` : ``True`` = downloads the dataset from the internet and
-  puts it in root directory. If dataset already downloaded, do
+  puts it in the root directory. If the dataset is already downloaded, does nothing.
+
+STL10
+~~~~~
+
+``dset.STL10(root, split='train', transform=None, target_transform=None, download=False)``
+
+- ``root`` : root directory of the dataset, containing the ``stl10_binary`` folder
+- ``split`` : ``'train'`` = Training set, ``'test'`` = Test set, ``'unlabeled'`` = Unlabeled set, ``'train+unlabeled'`` = Training + Unlabeled set (missing labels are marked as ``-1``)
+- ``download`` : ``True`` = downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, does nothing.

+.. _MNIST: #mnist
 .. _COCO (Captioning and Detection): #coco
 .. _LSUN Classification: #lsun
 .. _ImageFolder: #imagefolder
 .. _Imagenet-12: #imagenet-12
 .. _CIFAR10 and CIFAR100: #cifar
-.. _COCO API to be installed: https://github.com/pdollar/coco/tree/master/PythonAPI
+.. _STL10: #stl10
+.. _COCO API to be installed: https://github.com/pdollar/coco/tree/master/PythonAPI
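
As a note on the STL10 ``split`` values documented in this diff: the ``'train+unlabeled'`` split marks missing labels as ``-1``. The sketch below shows one way such a combined label list could look and how the labeled samples can be picked out again. The helper ``combine_labels`` is hypothetical, and placing training samples before unlabeled ones is an assumption of this example, not something the docs state.

```python
def combine_labels(train_labels, num_unlabeled):
    """Labels for a 'train+unlabeled' style split: unlabeled samples get -1.

    Hypothetical helper; assumes training samples come first.
    """
    return list(train_labels) + [-1] * num_unlabeled


labels = combine_labels([3, 0, 7], num_unlabeled=2)

# Indices of samples that carry a real label (anything other than -1).
labeled_indices = [i for i, y in enumerate(labels) if y != -1]

print(labels)           # [3, 0, 7, -1, -1]
print(labeled_indices)  # [0, 1, 2]
```

Code that computes a supervised loss on such a split would need to skip the ``-1`` entries, for example by filtering with ``labeled_indices``.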
