Commit 6e7becb

Merge remote-tracking branch 'jakevdp/master'
2 parents 4a5371f + 599aa0f commit 6e7becb

54 files changed, +4446 -846 lines changed

.gitmodules

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+[submodule "website/plugins/ipynb"]
+path = website/plugins/ipynb
+url = git://github.com/danielfrg/pelican-ipynb.git
+[submodule "website/plugins/pelican-plugins"]
+path = website/plugins/pelican-plugins
+url = git://github.com/getpelican/pelican-plugins.git

README.md

Lines changed: 18 additions & 83 deletions
@@ -1,97 +1,32 @@
 # Python Data Science Handbook

+[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jakevdp/PythonDataScienceHandbook/master?filepath=notebooks%2FIndex.ipynb)
+
 This repository contains the entire [Python Data Science Handbook](http://shop.oreilly.com/product/0636920034919.do), in the form of (free!) Jupyter notebooks.

 ![cover image](notebooks/figures/PDSH-cover.png)

-The book was written and tested with Python 3.5, though older Python versions (including Python 2.7) should work in nearly all cases.
+## How to Use this Book
+
+- Read the book in its entirety online at https://jakevdp.github.io/PythonDataScienceHandbook/
+
+- Run the code using the Jupyter notebooks available in this repository's [notebooks](notebooks) directory.
+
+- Launch a live notebook server with these notebooks using [binder](https://beta.mybinder.org/): [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jakevdp/PythonDataScienceHandbook/master?filepath=notebooks%2FIndex.ipynb)
+
+- Buy the printed book through [O'Reilly Media](http://shop.oreilly.com/product/0636920034919.do)
+
+## About
+
+The book was written and tested with Python 3.5, though other Python versions (including Python 2.7) should work in nearly all cases.

 The book introduces the core libraries essential for working with data in Python: particularly [IPython](http://ipython.org), [NumPy](http://numpy.org), [Pandas](http://pandas.pydata.org), [Matplotlib](http://matplotlib.org), [Scikit-Learn](http://scikit-learn.org), and related packages.
 Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project,
 [A Whirlwind Tour of Python](https://github.com/jakevdp/WhirlwindTourOfPython): it's a fast-paced introduction to the Python language aimed at researchers and scientists.

-The following listing links to the notebooks in this repository, rendered through the [nbviewer](http://nbviewer.jupyter.org) service:
-
----
-## [Table of Contents](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/Index.ipynb)
-
-### [Preface](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/00.00-Preface.ipynb)
-
-### [1. IPython: Beyond Normal Python](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.00-IPython-Beyond-Normal-Python.ipynb)
-- [Help and Documentation in IPython](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.01-Help-And-Documentation.ipynb)
-- [Keyboard Shortcuts in the IPython Shell](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.02-Shell-Keyboard-Shortcuts.ipynb)
-- [IPython Magic Commands](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.03-Magic-Commands.ipynb)
-- [Input and Output History](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.04-Input-Output-History.ipynb)
-- [IPython and Shell Commands](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.05-IPython-And-Shell-Commands.ipynb)
-- [Errors and Debugging](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.06-Errors-and-Debugging.ipynb)
-- [Profiling and Timing Code](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.07-Timing-and-Profiling.ipynb)
-- [More IPython Resources](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/01.08-More-IPython-Resources.ipynb)
-
-### [2. Introduction to NumPy](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.00-Introduction-to-NumPy.ipynb)
-- [Understanding Data Types in Python](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.01-Understanding-Data-Types.ipynb)
-- [The Basics of NumPy Arrays](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.02-The-Basics-Of-NumPy-Arrays.ipynb)
-- [Computation on NumPy Arrays: Universal Functions](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.03-Computation-on-arrays-ufuncs.ipynb)
-- [Aggregations: Min, Max, and Everything In Between](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.04-Computation-on-arrays-aggregates.ipynb)
-- [Computation on Arrays: Broadcasting](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.05-Computation-on-arrays-broadcasting.ipynb)
-- [Comparisons, Masks, and Boolean Logic](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.06-Boolean-Arrays-and-Masks.ipynb)
-- [Fancy Indexing](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.07-Fancy-Indexing.ipynb)
-- [Sorting Arrays](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.08-Sorting.ipynb)
-- [Structured Data: NumPy's Structured Arrays](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/02.09-Structured-Data-NumPy.ipynb)
-
-### [3. Data Manipulation with Pandas](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.00-Introduction-to-Pandas.ipynb)
-- [Introducing Pandas Objects](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.01-Introducing-Pandas-Objects.ipynb)
-- [Data Indexing and Selection](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.02-Data-Indexing-and-Selection.ipynb)
-- [Operating on Data in Pandas](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.03-Operations-in-Pandas.ipynb)
-- [Handling Missing Data](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.04-Missing-Values.ipynb)
-- [Hierarchical Indexing](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.05-Hierarchical-Indexing.ipynb)
-- [Combining Datasets: Concat and Append](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.06-Concat-And-Append.ipynb)
-- [Combining Datasets: Merge and Join](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.07-Merge-and-Join.ipynb)
-- [Aggregation and Grouping](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.08-Aggregation-and-Grouping.ipynb)
-- [Pivot Tables](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.09-Pivot-Tables.ipynb)
-- [Vectorized String Operations](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.10-Working-With-Strings.ipynb)
-- [Working with Time Series](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.11-Working-with-Time-Series.ipynb)
-- [High-Performance Pandas: eval() and query()](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.12-Performance-Eval-and-Query.ipynb)
-- [Further Resources](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.13-Further-Resources.ipynb)
-
-### [4. Visualization with Matplotlib](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.00-Introduction-To-Matplotlib.ipynb)
-- [Simple Line Plots](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.01-Simple-Line-Plots.ipynb)
-- [Simple Scatter Plots](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.02-Simple-Scatter-Plots.ipynb)
-- [Visualizing Errors](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.03-Errorbars.ipynb)
-- [Density and Contour Plots](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.04-Density-and-Contour-Plots.ipynb)
-- [Histograms, Binnings, and Density](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.05-Histograms-and-Binnings.ipynb)
-- [Customizing Plot Legends](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.06-Customizing-Legends.ipynb)
-- [Customizing Colorbars](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.07-Customizing-Colorbars.ipynb)
-- [Multiple Subplots](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.08-Multiple-Subplots.ipynb)
-- [Text and Annotation](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.09-Text-and-Annotation.ipynb)
-- [Customizing Ticks](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.10-Customizing-Ticks.ipynb)
-- [Customizing Matplotlib: Configurations and Stylesheets](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.11-Settings-and-Stylesheets.ipynb)
-- [Three-Dimensional Plotting in Matplotlib](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.12-Three-Dimensional-Plotting.ipynb)
-- [Geographic Data with Basemap](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.13-Geographic-Data-With-Basemap.ipynb)
-- [Visualization with Seaborn](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.14-Visualization-With-Seaborn.ipynb)
-- [Further Resources](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/04.15-Further-Resources.ipynb)
-
-### [5. Machine Learning](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.00-Machine-Learning.ipynb)
-- [What Is Machine Learning?](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.01-What-Is-Machine-Learning.ipynb)
-- [Introducing Scikit-Learn](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.02-Introducing-Scikit-Learn.ipynb)
-- [Hyperparameters and Model Validation](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.03-Hyperparameters-and-Model-Validation.ipynb)
-- [Feature Engineering](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.04-Feature-Engineering.ipynb)
-- [In-Depth: Naive Bayes Classification](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.05-Naive-Bayes.ipynb)
-- [In-Depth: Linear Regression](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.06-Linear-Regression.ipynb)
-- [In-Depth: Support Vector Machines](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.07-Support-Vector-Machines.ipynb)
-- [In-Depth: Decision Trees and Random Forests](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.08-Random-Forests.ipynb)
-- [In-Depth: Principal Component Analysis](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.09-Principal-Component-Analysis.ipynb)
-- [In-Depth: Manifold Learning](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.10-Manifold-Learning.ipynb)
-- [In-Depth: k-Means Clustering](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.11-K-Means.ipynb)
-- [In-Depth: Gaussian Mixture Models](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.12-Gaussian-Mixtures.ipynb)
-- [In-Depth: Kernel Density Estimation](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.13-Kernel-Density-Estimation.ipynb)
-- [Application: A Face Detection Pipeline](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.14-Image-Features.ipynb)
-- [Further Machine Learning Resources](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.15-Learning-More.ipynb)
-
-### [Appendix: Figure Code](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/06.00-Figure-Code.ipynb)
-
----
-
-## Required Packages
+See [Index.ipynb](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/Index.ipynb) for an index of the notebooks available to accompany the text.
+
+## Software

 The code in the book was tested with Python 3.5, though most (but not all) will also work correctly with Python 2.7 and other older Python versions.

notebooks/01.02-Shell-Keyboard-Shortcuts.ipynb

Lines changed: 7 additions & 2 deletions
@@ -23,8 +23,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Keyboard Shortcuts in the IPython Shell\n",
-"\n",
+"# Keyboard Shortcuts in the IPython Shell"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "If you spend any amount of time on the computer, you've probably found a use for keyboard shortcuts in your workflow.\n",
 "Most familiar perhaps are the Cmd-C and Cmd-V (or Ctrl-C and Ctrl-V) for copying and pasting in a wide variety of programs and systems.\n",
 "Power-users tend to go even further: popular text editors like Emacs, Vim, and others provide users an incredible range of operations through intricate combinations of keystrokes.\n",

notebooks/01.03-Magic-Commands.ipynb

Lines changed: 7 additions & 2 deletions
@@ -23,8 +23,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# IPython Magic Commands\n",
-"\n",
+"# IPython Magic Commands"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "The previous two sections showed how IPython lets you use and explore Python efficiently and interactively.\n",
 "Here we'll begin discussing some of the enhancements that IPython adds on top of the normal Python syntax.\n",
 "These are known in IPython as *magic commands*, and are prefixed by the ``%`` character.\n",

notebooks/01.04-Input-Output-History.ipynb

Lines changed: 7 additions & 2 deletions
@@ -23,8 +23,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Input and Output History\n",
-"\n",
+"# Input and Output History"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "Previously we saw that the IPython shell allows you to access previous commands with the up and down arrow keys, or equivalently the Ctrl-p/Ctrl-n shortcuts.\n",
 "Additionally, in both the shell and the notebook, IPython exposes several ways to obtain the output of previous commands, as well as string versions of the commands themselves.\n",
 "We'll explore those here."

notebooks/01.05-IPython-And-Shell-Commands.ipynb

Lines changed: 7 additions & 2 deletions
@@ -23,8 +23,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# IPython and Shell Commands\n",
-"\n",
+"# IPython and Shell Commands"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "When working interactively with the standard Python interpreter, one of the frustrations is the need to switch between multiple windows to access Python tools and system command-line tools.\n",
 "IPython bridges this gap, and gives you a syntax for executing shell commands directly from within the IPython terminal.\n",
 "The magic happens with the exclamation point: anything appearing after ``!`` on a line will be executed not by the Python kernel, but by the system command-line.\n",

notebooks/01.06-Errors-and-Debugging.ipynb

Lines changed: 7 additions & 2 deletions
@@ -23,8 +23,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Errors and Debugging\n",
-"\n",
+"# Errors and Debugging"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "Code development and data analysis always require a bit of trial and error, and IPython contains tools to streamline this process.\n",
 "This section will briefly cover some options for controlling Python's exception reporting, followed by exploring tools for debugging errors in code."
 ]

notebooks/01.07-Timing-and-Profiling.ipynb

Lines changed: 7 additions & 2 deletions
@@ -23,8 +23,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Profiling and Timing Code\n",
-"\n",
+"# Profiling and Timing Code"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "In the process of developing code and creating data processing pipelines, there are often trade-offs you can make between various implementations.\n",
 "Early in developing your algorithm, it can be counterproductive to worry about such things. As Donald Knuth famously quipped, \"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.\"\n",
 "\n",

notebooks/01.08-More-IPython-Resources.ipynb

Lines changed: 7 additions & 2 deletions
@@ -23,8 +23,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# More IPython Resources\n",
-"\n",
+"# More IPython Resources"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "In this chapter, we've just scratched the surface of using IPython to enable data science tasks.\n",
 "Much more information is available both in print and on the Web, and here we'll list some other resources that you may find helpful."
 ]
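
Note on the notebook hunks: each one makes the same mechanical change, moving the level-1 title out of its markdown cell into a cell of its own. The commit does not show how the change was produced. The following is only a minimal Python sketch of one way to do it, assuming the notebook has been loaded as a plain dict (for example with `json.load`) and that each cell's `source` is a list of strings, as it appears in the hunks. The function name `split_title_cell` is hypothetical.

```python
import copy


def split_title_cell(notebook):
    """Return a copy of `notebook` with title headings split into their own cells.

    Hypothetical helper: a markdown cell whose source starts with a level-1
    heading followed by a blank line is replaced by two markdown cells, one
    holding only the heading and one holding the remaining text.
    Assumes `source` is a list of strings (the on-disk .ipynb layout).
    """
    nb = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    new_cells = []
    for cell in nb["cells"]:
        source = cell.get("source", [])
        is_title_plus_body = (
            cell.get("cell_type") == "markdown"
            and len(source) > 2
            and source[0].startswith("# ")
            and source[1].strip() == ""
        )
        if is_title_plus_body:
            # Heading cell: drop the trailing newline, as in the diff.
            title = {
                "cell_type": "markdown",
                "metadata": {},
                "source": [source[0].rstrip("\n")],
            }
            # Body cell: same cell, minus the heading and the blank line.
            body = dict(cell, source=source[2:])
            new_cells.extend([title, body])
        else:
            new_cells.append(cell)
    nb["cells"] = new_cells
    return nb
```

Cells that do not match the pattern, including code cells and markdown cells without a leading title, pass through unchanged.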
