Commit c779569

Merge branch 'master' into pr_3651
2 parents bcb6d3b + a8585a4 commit c779569

520 files changed: +124224 -93336 lines changed

.gitattributes

Lines changed: 3 additions & 2 deletions
@@ -1,9 +1,10 @@
 /sklearn/__check_build/_check_build.c -diff
-/sklearn/_hmmc.c -diff
 /sklearn/_isotonic.c -diff
-/sklearn/cluster/_hierarchical.c -diff
+/sklearn/cluster/_dbscan_inner.cpp -diff
+/sklearn/cluster/_hierarchical.cpp -diff
 /sklearn/cluster/_k_means.c -diff
 /sklearn/datasets/_svmlight_format.c -diff
+/sklearn/decomposition/_online_lda.c -diff
 /sklearn/ensemble/_gradient_boosting.c -diff
 /sklearn/feature_extraction/_hashing.c -diff
 /sklearn/linear_model/cd_fast.c -diff

.landscape.yml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pylint:
+  disable:
+    - unpacking-non-sequence
+ignore-paths:
+  - sklearn/externals

.mailmap

Lines changed: 10 additions & 2 deletions
@@ -6,6 +6,7 @@ Andreas Mueller <[email protected]> <Andreas [email protected].
66
77
88
Andreas Mueller <[email protected]> <andy@marvin>
9+
910
Arnaud Joly <[email protected]>
1011
1112
@@ -20,6 +21,7 @@ Brian Cheung <[email protected]> <cow@rusty.(none)>
2021
2122
Christian Osendorfer <[email protected]>
2223
Clay Woolam <[email protected]>
24+
2325
Denis Engemann <[email protected]>
2426
2527
@@ -60,14 +62,20 @@ Kyle Kastner <[email protected]>
6062
Lars Buitinck <[email protected]> <Lars@.(none)>
6163
6264
65+
66+
67+
Loic Esteve <[email protected]>
6368
Manoj Kumar <[email protected]>
6469
Matthieu Perrot <[email protected]> <revilyo@earth.(none)>
6570
Maheshakya Wijewardena <[email protected]>
6671
Michael Bommarito <[email protected]>
6772
Michael Eickenberg <[email protected]>
6873
Michael Eickenberg <[email protected]> <[email protected]>
74+
75+
6976
Nelle Varoquaux <[email protected]>
70-
77+
78+
Nelle Varoquaux <[email protected]> <nelle@[email protected]>
7179
7280
7381
@@ -102,4 +110,4 @@ X006 <x006@x006-icsl.(none)> <x006@x006laptop.(none)>
102110
103111
104112
Yannick Schwartz <[email protected]> <ys218403@is220245.(none)>
105-
113+

.travis.yml

Lines changed: 19 additions & 4 deletions
@@ -1,10 +1,18 @@
 language: python
-virtualenv:
-  system_site_packages: true
+# make it explicit that we favor the new container-based travis workers
+sudo: false
+addons:
+  apt:
+    packages:
+      # Only used by the DISTRIB="ubuntu" setting
+      - libatlas3gf-base
+      - libatlas-dev
+      - python-scipy
 env:
   matrix:
-    - DISTRIB="ubuntu" PYTHON_VERSION="2.7" INSTALL_ATLAS="true"
-      COVERAGE="true"
+    # This environment tests that scikit-learn can be built against
+    # versions of numpy, scipy with ATLAS that comes with Ubuntu Precise 12.04
+    - DISTRIB="ubuntu" PYTHON_VERSION="2.7" COVERAGE="true"
     # This environment tests the oldest supported anaconda env
     - DISTRIB="conda" PYTHON_VERSION="2.6" INSTALL_MKL="false"
       NUMPY_VERSION="1.6.2" SCIPY_VERSION="0.11.0"
@@ -19,3 +27,10 @@ after_success:
     # because the coverage report failed to be published.
     - if [[ "$COVERAGE" == "true" ]]; then coveralls || echo "failed"; fi
 cache: apt
+notifications:
+  webhooks:
+    urls:
+      - https://webhooks.gitter.im/e/4ffabb4df010b70cd624
+    on_success: change # options: [always|never|change] default: always
+    on_failure: always # options: [always|never|change] default: always
+    on_start: false # default: false

CONTRIBUTING.md

Lines changed: 4 additions & 3 deletions
@@ -22,6 +22,7 @@ GitHub:
 2. Clone this copy to your local disk:
 
         $ git clone git@github.com:YourLogin/scikit-learn.git
+        $ cd scikit-learn
 
 3. Create a branch to hold your changes:
 
@@ -39,9 +40,9 @@ GitHub:
 
         $ git push -u origin my-feature
 
-Finally, go to the web page of the your fork of the scikit-learn repo,
+Finally, go to the web page of your fork of the scikit-learn repo,
 and click 'Pull request' to send your changes to the maintainers for
-review. request. This will send an email to the committers.
+review. This will send an email to the committers.
 
 (If any of the above seems like magic to you, then look up the
 [Git documentation](http://git-scm.com/documentation) on the web.)
@@ -64,7 +65,7 @@ following rules before submitting a pull request:
   to other methods available in scikit-learn.
 
 - At least one paragraph of narrative documentation with links to
-```` references in the literature (with PDF links when possible) and
+  references in the literature (with PDF links when possible) and
   the example.
 
 The documentation should also include expected time and space

COPYING

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 New BSD License
 
-Copyright (c) 2007–2014 The scikit-learn developers.
+Copyright (c) 2007–2015 The scikit-learn developers.
 All rights reserved.
 
 

Makefile

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ cython:
 ctags:
 	# make tags for symbol based navigation in emacs and vim
 	# Install with: sudo apt-get install exuberant-ctags
-	$(CTAGS) -R *
+	$(CTAGS) --python-kinds=-i -R sklearn
 
 doc: inplace
 	$(MAKE) -C doc html

README.rst

Lines changed: 8 additions & 2 deletions
@@ -1,10 +1,16 @@
 .. -*- mode: rst -*-
 
-|Travis|_
+|Travis|_ |AppVeyor|_ |Coveralls|_
 
 .. |Travis| image:: https://api.travis-ci.org/scikit-learn/scikit-learn.png?branch=master
 .. _Travis: https://travis-ci.org/scikit-learn/scikit-learn
 
+.. |AppVeyor| image:: https://ci.appveyor.com/api/projects/status/github/scikit-learn/scikit-learn?branch=master&svg=true
+.. _AppVeyor: https://ci.appveyor.com/project/sklearn-ci/scikit-learn/history
+
+.. |Coveralls| image:: https://coveralls.io/repos/scikit-learn/scikit-learn/badge.svg?branch=master
+.. _Coveralls: https://coveralls.io/r/scikit-learn/scikit-learn
+
 scikit-learn
 ============
 
@@ -38,7 +44,7 @@ scikit-learn is tested to work under Python 2.6, Python 2.7, and Python 3.4.
 (using the same codebase thanks to an embedded copy of
 `six <http://pythonhosted.org/six/>`_). It should also work with Python 3.3.
 
-The required dependencies to build the software are NumPy >= 1.6.2,
+The required dependencies to build the software are NumPy >= 1.6.1,
 SciPy >= 0.9 and a working C/C++ compiler.
 
 For running the examples Matplotlib >= 1.1.1 is required and for running the
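
The minimum versions named in the README hunk above (NumPy >= 1.6.1, SciPy >= 0.9) can be checked with a small helper. The following is only a sketch: `parse_version`, `meets_minimum` and the `MINIMUMS` table are hypothetical names introduced here for illustration, not part of scikit-learn, and the parser only looks at the leading digits of each dotted component.

```python
import re

# Minimum versions quoted in the README hunk; the dict itself is an
# illustrative assumption, not something scikit-learn defines.
MINIMUMS = {"numpy": "1.6.1", "scipy": "0.9"}


def parse_version(text):
    """Turn '1.6.2' or '1.10.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted versions, padding the shorter one with zeros."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    print(meets_minimum("1.6.2", MINIMUMS["numpy"]))
    print(meets_minimum("0.11.0", MINIMUMS["scipy"]))
```

Comparing integer tuples avoids the string-comparison trap where "0.11" sorts before "0.9".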

appveyor.yml

Lines changed: 18 additions & 8 deletions
@@ -1,5 +1,6 @@
 # AppVeyor.com is a Continuous Integration service to build and run tests under
 # Windows
+# https://ci.appveyor.com/project/sklearn-ci/scikit-learn
 
 environment:
   global:
@@ -11,23 +12,28 @@ environment:
     WHEELHOUSE_UPLOADER_SECRET:
       secure: BQm8KfEj6v2Y+dQxb2syQvTFxDnHXvaNktkLcYSq7jfbTOO6eH9n09tfQzFUVcWZ
 
+    # Make sure we don't download large datasets when running the test on
+    # continuous integration platform
+    SKLEARN_SKIP_NETWORK_TESTS: 1
+
   matrix:
-    - PYTHON: "C:\\Python27_32"
+    - PYTHON: "C:\\Python27"
       PYTHON_VERSION: "2.7.8"
       PYTHON_ARCH: "32"
 
-    - PYTHON: "C:\\Python27_64"
+    - PYTHON: "C:\\Python27-x64"
       PYTHON_VERSION: "2.7.8"
       PYTHON_ARCH: "64"
 
-    - PYTHON: "C:\\Python34_32"
+    - PYTHON: "C:\\Python34"
       PYTHON_VERSION: "3.4.1"
       PYTHON_ARCH: "32"
 
-    - PYTHON: "C:\\Python34_64"
+    - PYTHON: "C:\\Python34-x64"
       PYTHON_VERSION: "3.4.1"
       PYTHON_ARCH: "64"
 
+
 install:
   # Install Python (from the official .msi of http://python.org) and pip when
   # not already installed.
@@ -39,7 +45,7 @@ install:
   - "python -c \"import struct; print(struct.calcsize('P') * 8)\""
 
   # Install the build and runtime dependencies of the project.
-  - "%CMD_IN_ENV% pip install -r continuous_integration/appveyor/requirements.txt"
+  - "%CMD_IN_ENV% pip install --timeout=60 --trusted-host 28daf2247a33ed269873-7b1aad3fab3cc330e1fd9d109892382a.r6.cf2.rackcdn.com -r continuous_integration/appveyor/requirements.txt"
   - "%CMD_IN_ENV% python setup.py bdist_wheel bdist_wininst -b doc/logos/scikit-learn-logo.bmp"
   - ps: "ls dist"
 
@@ -55,9 +61,7 @@ test_script:
   - "mkdir empty_folder"
   - "cd empty_folder"
 
-  # Skip joblib tests that require multiprocessing as they are prone to random
-  # slow down
-  - "python -c \"import nose; nose.main()\" -s sklearn"
+  - "python -c \"import nose; nose.main()\" -s -v sklearn"
 
   # Move back to the project folder
   - "cd .."
@@ -71,3 +75,9 @@ on_success:
   # On Windows, Apache Libcloud cannot find a standard CA cert bundle so we
   # disable the ssl checks.
   - "python -m wheelhouse_uploader upload --no-ssl-check --local-folder=dist sklearn-windows-wheels"
+
+notifications:
+  - provider: Webhook
+    url: https://webhooks.gitter.im/e/0dc8e57cd38105aeb1b4
+    on_build_success: false
+    on_build_failure: True
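
The install step in the appveyor.yml hunk above runs a one-liner, `struct.calcsize('P') * 8`, to print the interpreter's bitness so it can be compared against `PYTHON_ARCH`. The sketch below spells the same check out; the function name `interpreter_bits` is a hypothetical name chosen here for illustration and is not part of the CI scripts.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter, in bits.

    struct.calcsize('P') is the size in bytes of a C 'void *', so a
    32-bit build gives 4 (32 bits) and a 64-bit build gives 8 (64 bits).
    """
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python %d.%d is a %d-bit build"
          % (sys.version_info[0], sys.version_info[1], interpreter_bits()))
```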

benchmarks/bench_20newsgroups.py

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+from __future__ import print_function, division
+from time import time
+import argparse
+import numpy as np
+
+from sklearn.dummy import DummyClassifier
+
+from sklearn.datasets import fetch_20newsgroups_vectorized
+from sklearn.metrics import accuracy_score
+from sklearn.utils.validation import check_array
+
+from sklearn.ensemble import RandomForestClassifier
+from sklearn.ensemble import ExtraTreesClassifier
+from sklearn.ensemble import AdaBoostClassifier
+from sklearn.linear_model import LogisticRegression
+from sklearn.naive_bayes import MultinomialNB
+
+ESTIMATORS = {
+    "dummy": DummyClassifier(),
+    "random_forest": RandomForestClassifier(n_estimators=100,
+                                            max_features="sqrt",
+                                            min_samples_split=10),
+    "extra_trees": ExtraTreesClassifier(n_estimators=100,
+                                        max_features="sqrt",
+                                        min_samples_split=10),
+    "logistic_regression": LogisticRegression(),
+    "naive_bayes": MultinomialNB(),
+    "adaboost": AdaBoostClassifier(n_estimators=10),
+}
+
+
+###############################################################################
+# Data
+
+if __name__ == "__main__":
+
+    parser = argparse.ArgumentParser()
+    parser.add_argument('-e', '--estimators', nargs="+", required=True,
+                        choices=ESTIMATORS)
+    args = vars(parser.parse_args())
+
+    data_train = fetch_20newsgroups_vectorized(subset="train")
+    data_test = fetch_20newsgroups_vectorized(subset="test")
+    X_train = check_array(data_train.data, dtype=np.float32,
+                          accept_sparse="csc")
+    X_test = check_array(data_test.data, dtype=np.float32, accept_sparse="csr")
+    y_train = data_train.target
+    y_test = data_test.target
+
+    print("20 newsgroups")
+    print("=============")
+    print("X_train.shape = {0}".format(X_train.shape))
+    print("X_train.format = {0}".format(X_train.format))
+    print("X_train.dtype = {0}".format(X_train.dtype))
+    print("X_train density = {0}"
+          "".format(X_train.nnz / np.product(X_train.shape)))
+    print("y_train {0}".format(y_train.shape))
+    print("X_test {0}".format(X_test.shape))
+    print("X_test.format = {0}".format(X_test.format))
+    print("X_test.dtype = {0}".format(X_test.dtype))
+    print("y_test {0}".format(y_test.shape))
+    print()
+
+    print("Classifier Training")
+    print("===================")
+    accuracy, train_time, test_time = {}, {}, {}
+    for name in sorted(args["estimators"]):
+        clf = ESTIMATORS[name]
+        try:
+            clf.set_params(random_state=0)
+        except (TypeError, ValueError):
+            pass
+
+        print("Training %s ... " % name, end="")
+        t0 = time()
+        clf.fit(X_train, y_train)
+        train_time[name] = time() - t0
+        t0 = time()
+        y_pred = clf.predict(X_test)
+        test_time[name] = time() - t0
+        accuracy[name] = accuracy_score(y_test, y_pred)
+        print("done")
+
+    print()
+    print("Classification performance:")
+    print("===========================")
+    print()
+    print("%s %s %s %s" % ("Classifier ", "train-time", "test-time",
+                           "Accuracy"))
+    print("-" * 44)
+    for name in sorted(accuracy, key=accuracy.get):
+        print("%s %s %s %s" % (name.ljust(16),
+                               ("%.4fs" % train_time[name]).center(10),
+                               ("%.4fs" % test_time[name]).center(10),
+                               ("%.4f" % accuracy[name]).center(10)))
+
+    print()
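
The timing pattern used by the benchmark above (take a timestamp, call `fit`, take the difference, then repeat for `predict`) can be tried without downloading the 20 newsgroups data. The sketch below is an assumption-laden miniature, not the benchmark itself: `MajorityClassifier`, `accuracy` and `benchmark` are hypothetical stand-ins written for this example, and the toy lists replace the real sparse matrices.

```python
from __future__ import print_function
from time import time


class MajorityClassifier(object):
    """Hypothetical stand-in estimator: predicts the most common label."""

    def fit(self, X, y):
        counts = {}
        for label in y:
            counts[label] = counts.get(label, 0) + 1
        self.label_ = max(counts, key=counts.get)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / float(len(y_true))


def benchmark(estimators, X_train, y_train, X_test, y_test):
    """Return {name: (train_time, test_time, accuracy)} for each estimator."""
    results = {}
    for name in sorted(estimators):
        clf = estimators[name]
        t0 = time()
        clf.fit(X_train, y_train)
        train_time = time() - t0
        t0 = time()
        y_pred = clf.predict(X_test)
        test_time = time() - t0
        results[name] = (train_time, test_time, accuracy(y_test, y_pred))
    return results


# Toy data, purely illustrative.
X_train, y_train = [[0], [1], [2], [3]], [1, 1, 1, 0]
X_test, y_test = [[4], [5]], [1, 0]
results = benchmark({"majority": MajorityClassifier()},
                    X_train, y_train, X_test, y_test)

# Report in order of increasing accuracy, as the benchmark's final loop does.
for name in sorted(results, key=lambda n: results[n][2]):
    print("%s %.4fs %.4fs %.4f" % ((name.ljust(16),) + results[name]))
```

Any object exposing `fit` and `predict` can be dropped into the `estimators` dict, which mirrors how the benchmark looks classifiers up by name in `ESTIMATORS`.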
