2 parents b7998c2 + aec8dd0, commit a8e7e48
sklearn/feature_extraction/dict_vectorizer.py
@@ -37,6 +37,11 @@ class DictVectorizer(BaseEstimator, TransformerMixin):
     a feature "f" that can take on the values "ham" and "spam" will become two
     features in the output, one signifying "f=ham", the other "f=spam".
 
+    However, note that this transformer will only do a binary one-hot encoding
+    when feature values are of type string. If categorical features are
+    represented as numeric values such as int, the DictVectorizer can be
+    followed by OneHotEncoder to complete binary one-hot encoding.
+
     Features that do not occur in a sample (mapping) will have a zero value
     in the resulting array/matrix.
 
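The behaviour described by the added docstring paragraph can be sketched as follows. This is a minimal illustration, not code from the commit: the sample dicts and variable names are made up, and it assumes a scikit-learn installation in which OneHotEncoder accepts the numeric column that DictVectorizer produces.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.preprocessing import OneHotEncoder

# String-valued feature: DictVectorizer one-hot encodes it directly,
# producing one output column per observed "name=value" pair.
string_samples = [{"f": "ham"}, {"f": "spam"}]  # illustrative data
string_vec = DictVectorizer(sparse=False)
string_encoded = string_vec.fit_transform(string_samples)
string_names = string_vec.feature_names_

# Integer-coded categorical feature: DictVectorizer treats the value as a
# plain number and passes it through as a single column.
int_samples = [{"f": 0}, {"f": 1}, {"f": 2}]  # illustrative data
int_vec = DictVectorizer(sparse=False)
int_passthrough = int_vec.fit_transform(int_samples)

# Following up with OneHotEncoder expands that column into one indicator
# column per distinct category code.
onehot = OneHotEncoder()
int_encoded = onehot.fit_transform(int_passthrough).toarray()

print(string_names, string_encoded.shape)
print(int_passthrough.shape, int_encoded.shape)
```

In a real pipeline, numeric columns that are genuinely continuous would also be expanded by this second step, so the OneHotEncoder would likely need to be restricted to the categorical columns only.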