
Commit d5c72b3

committed
more work in progress
1 parent 39673b6 commit d5c72b3


5 files changed: 29 additions and 66 deletions

tutorial/bigger_toc_css.rst

Lines changed: 0 additions & 60 deletions
This file was deleted.
File renamed without changes.

tutorial/general_concepts.rst

Lines changed: 26 additions & 3 deletions
@@ -121,7 +121,31 @@ The names of the classes is stored in the last attribute, namely
 Handling categorical features
 -----------------------------

-TODO
+Sometimes samples are described by categorical descriptors that
+have no obvious numerical representation. For instance, assume that
+each flower is further described by a color name among a fixed list
+of color names::
+
+    color in ['purple', 'blue', 'red']
+
+The simplest way to turn this categorical feature into numerical
+features suitable for machine learning is to create one new feature
+for each distinct color name, set to ``1.0`` if the sample belongs
+to that category and to ``0.0`` if not.
+
+In this case, the enriched iris feature set would be:
+
+
+:Features in the extended Iris dataset:
+
+0. sepal length in cm
+1. sepal width in cm
+2. petal length in cm
+3. petal width in cm
+4. color#purple (1.0 or 0.0)
+5. color#blue (1.0 or 0.0)
+6. color#red (1.0 or 0.0)
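
The encoding described in this hunk can be sketched in a few lines of plain Python. This is only an illustration and not part of the commit: the helper names `one_hot` and `extend_sample` are hypothetical, and the four measurements are illustrative values rather than data from the tutorial.

```python
# Color vocabulary as listed in the tutorial text.
COLORS = ['purple', 'blue', 'red']


def one_hot(color, categories=COLORS):
    """Return one 1.0/0.0 indicator per category, in the order of `categories`."""
    if color not in categories:
        raise ValueError("unknown category: %r" % (color,))
    return [1.0 if color == c else 0.0 for c in categories]


def extend_sample(measurements, color):
    """Append the color indicators to the four numerical measurements."""
    return list(measurements) + one_hot(color)


# Illustrative sample: sepal length, sepal width, petal length, petal width (cm).
sample = extend_sample([5.1, 3.5, 1.4, 0.2], 'blue')
print(sample)  # [5.1, 3.5, 1.4, 0.2, 0.0, 1.0, 0.0]
```

The resulting vector has seven entries, matching features 0 to 6 in the list above. Exactly one of the three color entries is `1.0` for any given sample.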
+

 Extracting features from unstructured data
 ------------------------------------------
@@ -184,8 +208,7 @@ How to evaluate the quality of feature extraction strategy
 ----------------------------------------------------------

 The rule of thumb is two samples that seem close or related to
-
-And conversely, samples that seem close in
+TODO


 Supervised Learning: ``model.fit(X, y)``

tutorial/index.rst

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 setup
 general_concepts
 working_with_text_data
-exercices
+exercises
 which_algorithm
 finding_help

tutorial/which_algorithm.rst

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
-How to select the right algoritm for the task
-=============================================
+How to select the right algorithm for the task
+==============================================


 Some practical hints for selecting the right algorithm when facing
