
Commit 254b815

DOC explain fork related issues in FAQ + stopgap for Python 3.4
1 parent de48574 commit 254b815

1 file changed: +59 −9 lines


doc/faq.rst

Lines changed: 59 additions & 9 deletions
@@ -106,7 +106,8 @@ See :ref:`adding_graphical_models`.
 .. _adding_graphical_models:
 
 Will you add graphical models or sequence prediction to scikit-learn?
------------------------------------------------------------------------
+---------------------------------------------------------------------
+
 Not in the foreseeable future.
 scikit-learn tries to provide a unified API for the basic tasks in machine
 learning, with pipelines and meta-algorithms like grid search to tie
@@ -124,16 +125,20 @@ do structured prediction:
   approximate inference; defines the notion of sample as an instance of
   the graph structure)
 
-* `seqlearn <http://larsmans.github.io/seqlearn/>`_ handles sequences only (focuses on
-  exact inference; has HMMs, but mostly for the sake of completeness;
-  treats a feature vector as a sample and uses an offset encoding for
-  the dependencies between feature vectors)
+* `seqlearn <http://larsmans.github.io/seqlearn/>`_ handles sequences only
+  (focuses on exact inference; has HMMs, but mostly for the sake of
+  completeness; treats a feature vector as a sample and uses an offset encoding
+  for the dependencies between feature vectors)
 
 Will you add GPU support?
--------------------------
-No, or at least not in the near future. The main reason is that GPU support will introduce many software dependencies and introduce platform specific issues.
-scikit-learn is designed to be easy to install on a wide variety of platforms.
-Outside of neural networks, GPUs don't play a large role in machine learning today, and much larger gains in speed can often be achieved by a careful choice of algorithms.
+-------------------------
+
+No, or at least not in the near future. The main reason is that GPU support
+will introduce many software dependencies and introduce platform specific
+issues. scikit-learn is designed to be easy to install on a wide variety of
+platforms. Outside of neural networks, GPUs don't play a large role in machine
+learning today, and much larger gains in speed can often be achieved by a
+careful choice of algorithms.
 
 Do you support PyPy?
 --------------------
@@ -190,3 +195,48 @@ DBSCAN with Levenshtein distances::
 
 Similar tricks can be used, with some care, for tree kernels, graph kernels,
 etc.
+
+
+Why do I sometimes get a crash/freeze with n_jobs > 1 under OSX or Linux?
+-------------------------------------------------------------------------
+
+Several scikit-learn tools such as ``GridSearchCV`` and ``cross_val_score``
+rely internally on Python's ``multiprocessing`` module to parallelize execution
+onto several Python processes by passing ``n_jobs > 1`` as argument.
+
+The problem is that Python ``multiprocessing`` does a ``fork`` system call
+without following it with an ``exec`` system call for performance reasons. Many
+libraries like (some versions of) Accelerate / vecLib under OSX, (some versions
+of) MKL, the OpenMP runtime of GCC, NVIDIA's CUDA (and probably many others)
+manage their own internal thread pool. Upon a call to ``fork``, the thread pool
+state in the child process is corrupted: the thread pool believes it has many
+threads while only the main thread's state has been forked. It is possible to
+change the libraries to make them detect when a fork happens and reinitialize
+the thread pool in that case: we did that for OpenBLAS (merged upstream in
+master since 0.2.10) and we contributed a `patch
+<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035>`_ to GCC's OpenMP runtime
+(not yet reviewed).
+
+But in the end the real culprit is Python's ``multiprocessing`` that does
+``fork`` without ``exec`` to reduce the overhead of starting and using new
+Python processes for parallel computing. Unfortunately this is a violation of
+the POSIX standard and therefore some software editors like Apple refuse to
+consider the lack of fork-safety in Accelerate / vecLib as a bug.
+
+In Python 3.4+ it is now possible to configure ``multiprocessing`` to use the
+'forkserver' or 'spawn' start methods (instead of the default 'fork') to manage
+the process pools. This should make it possible to not be subject to this issue
+anymore. To set the 'forkserver' mode globally for your program, insert the
+following instructions in your main script::
+
+    import multiprocessing
+
+    # other imports, custom code, load data, define model...
+
+    if __name__ == '__main__':
+        multiprocessing.set_start_method('forkserver')
+
+        # call scikit-learn utils with n_jobs > 1 here
+
+You can find more details on the new start methods in the `multiprocessing
+documentation <https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods>`_.
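
The stopgap in the patch assumes 'forkserver' is available, but that start method exists only on POSIX systems, while 'spawn' is available on every platform (and is the only option on Windows). A minimal stdlib sketch, not part of the patch itself, for checking what the running platform supports before opting in:

```python
import multiprocessing

# 'fork' and 'forkserver' exist only on POSIX systems; 'spawn' is
# available on every platform and is the sole start method on Windows.
available = multiprocessing.get_all_start_methods()

# Prefer the fork-safe 'forkserver' where the platform offers it,
# otherwise fall back to 'spawn'.
method = 'forkserver' if 'forkserver' in available else 'spawn'
print(method)
```

`multiprocessing.set_start_method(method)` would then be called once, inside the ``if __name__ == '__main__':`` guard as shown in the patch.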

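The last hunk's context lines come from the FAQ's existing example of clustering strings by handing DBSCAN a precomputed Levenshtein distance matrix. A stdlib-only sketch of building such a matrix; the ``levenshtein`` helper below is an illustrative dynamic-programming implementation, not the distance function the FAQ example actually uses:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3

# A square matrix of pairwise distances in this form is what estimators
# accepting metric='precomputed' (such as DBSCAN) expect as input.
words = ["kitten", "sitting", "mitten"]
D = [[levenshtein(u, v) for v in words] for u in words]
```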