
Commit b919b73

First draft of the profiling sections
1 parent 1148ab6 commit b919b73

10 files changed

+533
-2
lines changed

09-profiling-introduction.html

Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
<!DOCTYPE html>
2+
<html>
3+
<head>
4+
<meta charset="utf-8">
5+
<meta name="generator" content="pandoc">
6+
<title>Software Carpentry: Profiling</title>
7+
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
8+
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
9+
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
10+
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
11+
<link rel="stylesheet" type="text/css" href="css/swc.css" />
12+
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
13+
<meta charset="UTF-8" />
14+
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
15+
<!--[if lt IE 9]>
16+
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
17+
<![endif]-->
18+
</head>
19+
<body class="lesson">
20+
<div class="container card">
21+
<div class="banner">
22+
<a href="http://software-carpentry.org" title="Software Carpentry">
23+
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
24+
</a>
25+
</div>
26+
<article>
27+
<div class="row">
28+
<div class="col-md-10 col-md-offset-1">
29+
<a href="index.html"><h1 class="title">Profiling</h1></a>
30+
<h2 class="subtitle">Introduction</h2>
31+
<aside class="callout panel panel-info">
32+
<div class="panel-heading">
33+
<h2 id="quote-by-donald-knuth"><span class="glyphicon glyphicon-pushpin"></span>Quote by Donald Knuth</h2>
34+
</div>
35+
<div class="panel-body">
36+
<p>“We should forget about small efficiencies, say about 97% of the time: <strong>premature optimization is the root of all evil</strong>. Yet we should not pass up our opportunities in that critical 3%”</p>
37+
</div>
38+
</aside>
39+
<p>Know what to optimize, and spend your time where it pays off:</p>
40+
<div class="figure">
41+
<img src="img/Optimizing-different-parts.svg" alt="Optimizing two tasks (By Gorivero, Wikimedia, Public Domain)" />
42+
<p class="caption">Optimizing two tasks (<a href="https://commons.wikimedia.org/w/index.php?curid=3366573">By Gorivero, Wikimedia, Public Domain</a>)</p>
43+
</div>
44+
<p>General approach:</p>
45+
<ol style="list-style-type: decimal">
46+
<li>Make sure that things are <em>correct</em> (fast but wrong does not help you)!</li>
47+
<li>Write tests so that you can be confident that your code is still correct after optimizing it.</li>
48+
<li>Measure the total run time, decide whether you need to optimize the code in the first place (see graphic below).</li>
49+
<li>Profile the code to decide where an optimization could be the most useful.</li>
50+
<li>Optimize it, then go back to step 3 and measure again.</li>
51+
</ol>
52+
<div class="figure">
53+
<img src="img/is_it_worth_the_time.png" alt="Is it worth the time? XKCD comic, licensed CC BY-NC 2.5" />
54+
<p class="caption">Is it worth the time? <a href="http://xkcd.com/1205/">XKCD comic</a>, licensed <a href="http://creativecommons.org/licenses/by-nc/2.5/">CC BY-NC 2.5</a></p>
55+
</div>
56+
</div>
57+
</div>
58+
</article>
59+
<div class="footer">
60+
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
61+
<a class="label swc-blue-bg" href="https://github.com/paris-swc/python-testing-debugging-profiling">Source</a>
62+
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
63+
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
64+
</div>
65+
</div>
66+
<!-- Javascript placed at the end of the document so the pages load faster -->
67+
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
68+
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
69+
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
70+
</body>
71+
</html>

09-profiling-introduction.md

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
---
2+
layout: page
3+
title: Profiling
4+
subtitle: Introduction
5+
minutes: 5
6+
---
7+
8+
> ## Quote by Donald Knuth {.callout}
9+
> "We should forget about small efficiencies, say about 97% of the time:
10+
> **premature optimization is the root of all evil**. Yet we should not pass
11+
> up our opportunities in that critical 3%"
12+
13+
Know what to optimize, and spend your time where it pays off:
14+
15+
![Optimizing two tasks ([By Gorivero, Wikimedia, Public Domain](https://commons.wikimedia.org/w/index.php?curid=3366573))](img/Optimizing-different-parts.svg)
16+
17+
General approach:
18+
19+
1. Make sure that things are *correct* (fast but wrong does not help you)!
20+
2. Write tests so that you can be confident that your code is still correct
21+
after optimizing it.
22+
3. Measure the total run time, decide whether you need to optimize the code in
23+
the first place (see graphic below).
24+
4. Profile the code to decide where an optimization could be the most useful.
25+
5. Optimize it, then go back to step 3 and measure again.
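
The first three steps of this approach can be sketched in plain Python. This is only an illustration under assumptions: `slow_sum_of_squares`, `fast_sum_of_squares` and `measure` are hypothetical names invented for this sketch, not part of the lesson's code, and `time.perf_counter` stands in for whatever timing tool you prefer.

```python
import time


def slow_sum_of_squares(n):
    """Reference implementation: straightforward and easy to check."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n):
    """Candidate optimization using the closed-form formula."""
    return (n - 1) * n * (2 * n - 1) // 6


def test_fast_matches_slow():
    """Steps 1 and 2: a test that pins down the correct behaviour."""
    for n in [0, 1, 2, 10, 1000]:
        assert fast_sum_of_squares(n) == slow_sum_of_squares(n)


def measure(func, *args):
    """Step 3: wall-clock time of a single call, in seconds."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


# Check correctness first, then compare the total run times.
test_fast_matches_slow()
print("slow:", measure(slow_sum_of_squares, 1000000))
print("fast:", measure(fast_sum_of_squares, 1000000))
```

If the test fails, the timing comparison is meaningless, which is why the correctness check comes first.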
26+
27+
![Is it worth the time? [XKCD comic](http://xkcd.com/1205/), licensed [CC BY-NC 2.5](http://creativecommons.org/licenses/by-nc/2.5/)](img/is_it_worth_the_time.png)

10-profiling-basic.html

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
<!DOCTYPE html>
2+
<html>
3+
<head>
4+
<meta charset="utf-8">
5+
<meta name="generator" content="pandoc">
6+
<title>Software Carpentry: Profiling</title>
7+
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
8+
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
9+
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
10+
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
11+
<link rel="stylesheet" type="text/css" href="css/swc.css" />
12+
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
13+
<meta charset="UTF-8" />
14+
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
15+
<!--[if lt IE 9]>
16+
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
17+
<![endif]-->
18+
</head>
19+
<body class="lesson">
20+
<div class="container card">
21+
<div class="banner">
22+
<a href="http://software-carpentry.org" title="Software Carpentry">
23+
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
24+
</a>
25+
</div>
26+
<article>
27+
<div class="row">
28+
<div class="col-md-10 col-md-offset-1">
29+
<a href="index.html"><h1 class="title">Profiling</h1></a>
30+
<h2 class="subtitle">Total runtime measurements</h2>
31+
<p>IPython offers two useful commands to measure the time a single line or cell of code takes to execute:</p>
32+
<ul>
33+
<li><code>%time</code> will time the total runtime in a simple way (much like the command <code>time</code> in a UNIX shell) – if your command/script takes a very long time to run, this is what you want to use.</li>
34+
<li><code>%timeit</code> will repeat the time measurements many times: by default it will do 3 trials, where each trial will execute the command N times. The number N is chosen so that the total test run takes a couple of seconds, and the reported time is that of the best (fastest) trial. This gives much more precise measurements for short-running commands.</li>
35+
</ul>
36+
<pre class="sourceCode python"><code class="sourceCode python">In [<span class="dv">2</span>]: square_ar = numpy.random.rand(<span class="dv">1000</span>, <span class="dv">1000</span>)
37+
In [<span class="dv">3</span>]: %time w, v = numpy.linalg.eig(square_ar)</code></pre>
38+
<pre class="output"><code>CPU times: user 4.54 s, sys: 240 ms, total: 4.78 s
39+
Wall time: 2.44 s</code></pre>
40+
<p>For small computations that are repeated many times, <code>%timeit</code> is the better tool:</p>
41+
<pre class="sourceCode python"><code class="sourceCode python">In [<span class="dv">4</span>]: %timeit square_ar.var()</code></pre>
42+
<pre class="output"><code>The slowest run took 5.01 times longer than the fastest. This could mean that an intermediate result is being cached.
43+
100 loops, best of 3: 6.05 ms per loop</code></pre>
44+
<p>We get a warning message, most likely because the very first run was much slower than the other runs due to cache effects (data that was previously used is held in fast memory and can be reused very efficiently). Nowadays a lot of performance optimization revolves around the efficient use of memory in general and caches in particular. Whether we are interested in the results including these effects or not depends on our question, but if we are only interested in the “pure computation” time then one strategy is to scale up the problem size:</p>
45+
<pre class="sourceCode python"><code class="sourceCode python">In [<span class="dv">5</span>]: square_ar = numpy.random.rand(<span class="dv">3000</span>, <span class="dv">3000</span>)
46+
In [<span class="dv">6</span>]: %timeit square_ar.var()</code></pre>
47+
<pre class="output"><code>10 loops, best of 3: 101 ms per loop</code></pre>
48+
<p>To time a series of statements, <code>%%timeit</code> can be placed on the first line of a Jupyter notebook cell; it then times the full cell.</p>
49+
<section class="challenge panel panel-success">
50+
<div class="panel-heading">
51+
<h2 id="sum-vs.sum"><span class="glyphicon glyphicon-pencil"></span>sum vs. sum</h2>
52+
</div>
53+
<div class="panel-body">
54+
<p>numpy has a <code>sum</code> function, but <code>sum</code> is also a standard built-in function in Python. Both can be used with all kinds of Python sequences, e.g. with Python lists or numpy arrays. Use <code>a = numpy.arange(1000000)</code> and <code>l = list(range(1000000))</code> as example data and compare the runtime of <code>sum</code> vs. <code>numpy.sum</code> for the two variables. Which function is faster? Can you guess why?</p>
55+
</div>
56+
</section>
57+
</div>
58+
</div>
59+
</article>
60+
<div class="footer">
61+
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
62+
<a class="label swc-blue-bg" href="https://github.com/paris-swc/python-testing-debugging-profiling">Source</a>
63+
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
64+
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
65+
</div>
66+
</div>
67+
<!-- Javascript placed at the end of the document so the pages load faster -->
68+
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
69+
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
70+
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
71+
</body>
72+
</html>

10-profiling-basic.md

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
---
2+
layout: page
3+
title: Profiling
4+
subtitle: Total runtime measurements
5+
minutes: 10
6+
---
7+
8+
IPython offers two useful commands to measure the time a single line or cell of
9+
code takes to execute:
10+
11+
* `%time` will time the total runtime in a simple way
12+
(much like the command `time` in a UNIX shell) -- if your command/script takes
13+
a very long time to run, this is what you want to use.
14+
* `%timeit` will repeat the time measurements many times: by default it will do
15+
3 trials, where each trial will execute the command N times. The number N is
16+
chosen so that the total test run takes a couple of seconds and the reported
17+
time will be that of the best (fastest) trial. This gives much more precise
18+
measurements for short-running commands.
19+
20+
~~~ {.python}
21+
In [2]: square_ar = numpy.random.rand(1000, 1000)
22+
In [3]: %time w, v = numpy.linalg.eig(square_ar)
23+
~~~
24+
~~~ {.output}
25+
CPU times: user 4.54 s, sys: 240 ms, total: 4.78 s
26+
Wall time: 2.44 s
27+
~~~
28+
29+
For small computations that are repeated many times, `%timeit` is the better
30+
tool:
31+
32+
~~~ {.python}
33+
In [4]: %timeit square_ar.var()
34+
~~~
35+
~~~ {.output}
36+
The slowest run took 5.01 times longer than the fastest. This could mean that an intermediate result is being cached.
37+
100 loops, best of 3: 6.05 ms per loop
38+
~~~
39+
40+
We get a warning message, most likely because the very first run was much slower
41+
than the other runs due to cache effects (data that was previously used is in
42+
fast memory and can be reused very efficiently). Nowadays a lot of performance
43+
optimization revolves around the efficient use of memory in general and caches
44+
in particular. Whether we are interested in the results including these effects
45+
or not depends on our question, but if we are only interested in the "pure
46+
computation" time then one strategy is to scale up the problem size:
47+
48+
~~~ {.python}
49+
In [5]: square_ar = numpy.random.rand(3000, 3000)
50+
In [6]: %timeit square_ar.var()
51+
~~~
52+
~~~ {.output}
53+
10 loops, best of 3: 101 ms per loop
54+
~~~
55+
56+
For the timing of a series of statements, `%%timeit` can be used in the first
57+
line of a Jupyter notebook cell to time the full cell.
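
Outside of IPython, the standard library's `timeit` module offers comparable measurements. The following is a minimal sketch, not a description of how `%timeit` is implemented: the variable names, the plain Python list used as example data, and the fixed trial and loop counts are assumptions made for illustration.

```python
import timeit

data = list(range(10000))  # example data, chosen arbitrarily

n_trials = 3   # number of trials, mirroring "best of 3" above
n_loops = 100  # executions per trial; fixed here, whereas %timeit picks N itself

# timeit.repeat returns one total time (in seconds) per trial.
trial_times = timeit.repeat(lambda: sum(data), repeat=n_trials, number=n_loops)

# Report the best (fastest) trial, divided by the loop count.
best_per_loop = min(trial_times) / n_loops
print("{} loops, best of {}: {:.3g} ms per loop".format(
    n_loops, n_trials, best_per_loop * 1e3))
```

Taking the minimum rather than the mean follows the same reasoning as the "best trial" rule described above: slower trials mostly reflect interference from other activity on the machine.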
58+
59+
> ## sum vs. sum {.challenge}
60+
> numpy has a `sum` function, but `sum` is also a standard built-in function
61+
> in Python. Both can be used with all kinds of Python sequences, e.g. with
62+
> Python lists or numpy arrays. Use `a = numpy.arange(1000000)` and
63+
> `l = list(range(1000000))` as example data and compare the runtime of `sum`
64+
> vs. `numpy.sum` for the two variables. Which function is faster? Can you guess
65+
> why?

11-profiling-detailed.html

Lines changed: 98 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,98 @@
1+
<!DOCTYPE html>
2+
<html>
3+
<head>
4+
<meta charset="utf-8">
5+
<meta name="generator" content="pandoc">
6+
<title>Software Carpentry: Profiling</title>
7+
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
8+
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
9+
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
10+
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
11+
<link rel="stylesheet" type="text/css" href="css/swc.css" />
12+
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
13+
<meta charset="UTF-8" />
14+
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
15+
<!--[if lt IE 9]>
16+
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
17+
<![endif]-->
18+
</head>
19+
<body class="lesson">
20+
<div class="container card">
21+
<div class="banner">
22+
<a href="http://software-carpentry.org" title="Software Carpentry">
23+
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
24+
</a>
25+
</div>
26+
<article>
27+
<div class="row">
28+
<div class="col-md-10 col-md-offset-1">
29+
<a href="index.html"><h1 class="title">Profiling</h1></a>
30+
<h2 class="subtitle">Detailed runtime measurements</h2>
31+
<p>The tools from the previous section can help us decide whether we need to optimize in the first place (is the total run time short enough?) and can guide us when we want to replace code with a better-performing alternative (such as a more specialized numpy function instead of a general built-in function). They do not tell us <em>what</em> to optimize, though.</p>
32+
<p>As a first example, let’s have a look at the classic <a href="https://en.wikipedia.org/wiki/Fibonacci_number" title="Fibonacci number (wikipedia)">Fibonacci sequence</a>, where each number is the sum of the two preceding numbers. This can be written down directly as a recursive function (note that for simplicity we leave out all error checking, e.g. for negative numbers):</p>
33+
<pre class="sourceCode python"><code class="sourceCode python"><span class="kw">def</span> fibonacci(n):
34+
    <span class="kw">if</span> n &lt; <span class="dv">2</span>:
35+
        <span class="kw">return</span> n <span class="co"># fibonacci(0) == 0, fibonacci(1) == 1</span>
36+
    <span class="kw">else</span>:
37+
        <span class="kw">return</span> fibonacci(n - <span class="dv">2</span>) + fibonacci(n - <span class="dv">1</span>)</code></pre>
38+
<p>This seems to work fine, but the runtime increases with <code>n</code> in a dramatic fashion:</p>
39+
<pre class="sourceCode python"><code class="sourceCode python">%time fibonacci(<span class="dv">10</span>)</code></pre>
40+
<pre><code>CPU times: user 0 ns, sys: 0 ns, total: 0 ns
41+
Wall time: 52.7 µs
42+
43+
55</code></pre>
44+
<pre class="sourceCode python"><code class="sourceCode python">%time fibonacci(<span class="dv">20</span>)</code></pre>
45+
<pre><code>CPU times: user 12 ms, sys: 0 ns, total: 12 ms
46+
Wall time: 9.63 ms
47+
48+
6765</code></pre>
49+
<pre class="sourceCode python"><code class="sourceCode python">%time fibonacci(<span class="dv">30</span>)</code></pre>
50+
<pre><code>CPU times: user 696 ms, sys: 0 ns, total: 696 ms
51+
Wall time: 695 ms
52+
53+
832040</code></pre>
54+
<pre class="sourceCode python"><code class="sourceCode python">%time fibonacci(<span class="dv">35</span>)</code></pre>
55+
<pre><code>CPU times: user 7.49 s, sys: 24 ms, total: 7.52 s
56+
Wall time: 7.52 s
57+
58+
9227465</code></pre>
59+
<p>To get an idea what is going on, we can use <code>%prun</code>, which runs a command with Python’s built-in profiler:</p>
60+
<pre class="sourceCode python"><code class="sourceCode python">%prun fibonacci(<span class="dv">35</span>)</code></pre>
61+
<pre class="output"><code>29860706 function calls (4 primitive calls) in 10.621 seconds
62+
63+
Ordered by: internal time
64+
65+
ncalls tottime percall cumtime percall filename:lineno(function)
66+
29860703/1 10.621 0.000 10.621 10.621 &lt;ipython-input-6-b471b8bf6ddb&gt;:1(fibonacci)
67+
1 0.000 0.000 10.621 10.621 {built-in method builtins.exec}
68+
1 0.000 0.000 10.621 10.621 &lt;string&gt;:1(&lt;module&gt;)
69+
1 0.000 0.000 0.000 0.000 {method &#39;disable&#39; of &#39;_lsprof.Profiler&#39; objects}</code></pre>
70+
<p>The output gives us three pieces of information for every function called during the execution of the command: the number of times the function was called (<code>ncalls</code>), the total time spent in that function itself (<code>tottime</code>) and the time spent in that function, including all time spent in functions called by that function (<code>cumtime</code>). It is not surprising that all of the time is spent in the <code>fibonacci</code> function (after all, that’s the only function we have) but the function got called 29860703 times! There is no way we are going to get decent performance from this function without changing the approach fundamentally.</p>
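<p>The same profiler can be used outside of IPython through the standard library modules <code>cProfile</code> and <code>pstats</code>. The following is a minimal sketch under assumptions: it repeats the recursive function so that it is self-contained, writes the base case as <code>n in (0, 1)</code> (equivalent for non-negative integers), and uses a small <code>n</code> so that it finishes quickly.</p>

```python
import cProfile
import io
import pstats


def fibonacci(n):
    # Recursive definition, repeated here so the sketch is self-contained.
    if n in (0, 1):
        return n
    return fibonacci(n - 2) + fibonacci(n - 1)


# Collect profiling data only while the call of interest is running.
profiler = cProfile.Profile()
profiler.enable()
result = fibonacci(20)
profiler.disable()

# Sort by internal time ("tottime") and show the first few entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

<p>The report lists the same columns as the <code>%prun</code> output, so it can be read in the same way.</p>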
71+
<section class="challenge panel panel-success">
72+
<div class="panel-heading">
73+
<h2 id="a-better-fibonacci-sequence"><span class="glyphicon glyphicon-pencil"></span>A better Fibonacci sequence</h2>
74+
</div>
75+
<div class="panel-body">
76+
<p>Do you know of a better way to write the Fibonacci function? Can you imagine specific ways of using that function where you would prefer yet another approach?</p>
77+
</div>
78+
</section>
79+
<p>Optimizing a calculation by using a fundamentally different approach is called “algorithmic optimization” and it is potentially the most powerful way to increase the performance of a program. Whenever a program appears to be slow, the first check should be whether there is a function that takes a lot of time and is called more often than expected (e.g. we calculate a measure on 1000 values and expect a function to be called about 1000 times as well, but it is called 1000*1000 times instead).</p>
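<p>The situation described in the parenthesis can be reproduced with a small, made-up example. In this sketch, <code>measure</code>, <code>pairwise_slow</code> and <code>pairwise_fast</code> are hypothetical names, and a call counter stored on the function stands in for the <code>ncalls</code> column a profiler would report.</p>

```python
def measure(value):
    """Stand-in for an expensive per-value calculation (hypothetical)."""
    measure.calls += 1  # count every call, like a profiler's ncalls column
    return value * value


measure.calls = 0


def pairwise_slow(values):
    total = 0
    for a in values:
        for b in values:
            # measure(b) is recomputed for every a: n * n calls in total.
            total += a * measure(b)
    return total


def pairwise_fast(values):
    # Compute each measure once up front: n calls in total.
    measured = [measure(b) for b in values]
    total = 0
    for a in values:
        for m in measured:
            total += a * m
    return total


values = list(range(1000))

measure.calls = 0
slow_result = pairwise_slow(values)
slow_calls = measure.calls

measure.calls = 0
fast_result = pairwise_fast(values)
fast_calls = measure.calls

print(slow_calls, fast_calls)  # prints: 1000000 1000
```

<p>Both functions return the same result, but the second one calls <code>measure</code> once per value instead of once per pair.</p>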
80+
<p>TODO: Show an example with a nested loop</p>
81+
<p>TODO: Demonstrate snakeviz</p>
82+
<p>TODO: Demonstrate line profiler</p>
83+
</div>
84+
</div>
85+
</article>
86+
<div class="footer">
87+
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
88+
<a class="label swc-blue-bg" href="https://github.com/paris-swc/python-testing-debugging-profiling">Source</a>
89+
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
90+
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
91+
</div>
92+
</div>
93+
<!-- Javascript placed at the end of the document so the pages load faster -->
94+
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
95+
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
96+
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
97+
</body>
98+
</html>
