Commit efe946e

Update README.md
1 parent 177212e commit efe946e

1 file changed: +92 -11 lines changed

README.md

Lines changed: 92 additions & 11 deletions
@@ -1,24 +1,28 @@
1-
###### NB: Homepage [here](http://datasciencemasters.org), Community version [HERE](https://github.com/datasciencemasters/go/wiki)
1+
###### NB: Homepage [here](http://datasciencemasters.org), Community version [HERE](https://github.com/datasciencemasters/go/)
22

3-
***
4-
5-
The Curriculum for learning Data Science, Open Source and at your fingertips.
3+
## The Open-Source Curriculum for learning Data Science
64

7-
## The Internet is Your Oyster
5+
### The Internet is Your Oyster
86

97
I didn't want to wait. I wanted to work on things I care about **now**. Why sleep through grad school lectures tomorrow when you can hack on interesting questions today?
108

9+
*See [My Curriculum](http://bit.ly/corthelldata)*
10+
1111
With Coursera, ebooks, Stack Overflow, and GitHub -- all free and open -- how can you afford not to take advantage of an open source education?
1212

13-
## The Motivation
13+
### The Motivation
1414

1515
We need more Data Scientists.
1616

1717
> ...by 2018 the United States will experience a shortage of 190,000 skilled data scientists, and 1.5 million managers and analysts capable of reaping actionable insights from the big data deluge.
1818
1919
-- [McKinsey Report Highlights the Impending Data Scientist Shortage](http://blog.gopivotal.com/news-2/mckinsey-report-highlights-the-impending-data-scientist-shortage) 23 July 2013
2020

21-
## An Academic Shortfall
21+
> There are little to no Data Scientists with 5 years experience, because the job simply did not exist.
22+
23+
-- David Hardtke [How To Hire A Data Scientist](http://blog.bright.com/2012/11/13/how-to-hire-a-data-scientist/) 13 Nov 2012
24+
25+
### An Academic Shortfall
2226

2327
Classic academic conduits aren't providing Data Scientists -- this talent gap will be closed differently.
2428

@@ -30,12 +34,89 @@ Classic academic conduits aren't providing Data Scientists -- this talent gap wi
3034
3135
-- James Kobielus, [Closing the Talent Gap](http://www.ibmbigdatahub.com/blog/data-scientist-closing-talent-gap) 17 Jan 2013
3236

33-
## The Open Source Curriculum
37+
### Ready?
3438

35-
**[Start Here](http://datasciencemasters.org)**.
39+
***
40+
41+
## The Open Source Data Science Curriculum
42+
43+
Start here.
44+
* **Intro to Data Science** [UW / Coursera](https://www.coursera.org/course/datasci)
45+
* *Topics:* Python NLP on Twitter API, Distributed Computing Paradigm, MapReduce/Hadoop & Pig Script, SQL/NoSQL, Relational Algebra, Experiment design, Statistics, Graphs, Amazon EC2, Visualization.
46+
47+
### Math
48+
* Linear Algebra / Levandosky [Stanford / Book](http://www.amazon.com/Linear-Algebra-Steven-Levandosky/dp/0536667470/ref=sr_1_1?ie=UTF8&qid=1376546498&sr=8-1&keywords=linear+algebra+levandosky#)
49+
* Linear Programming (Math 407) [University of Washington / Course](http://www.math.washington.edu/~burke/crs/407/lectures/)
50+
* Statistics [Stats in a Nutshell / Book](http://shop.oreilly.com/product/9780596510497.do)
51+
* Problem-Solving Heuristics "How To Solve It" [Polya / Book](http://en.wikipedia.org/wiki/How_to_Solve_It)
52+
* Coding the Matrix: Linear Algebra through Computer Science Applications [Brown / Coursera](https://www.coursera.org/course/matrix)
53+
* Think Bayes [Allen Downey / Book](http://www.greenteapress.com/thinkbayes/)
54+
55+
### Computing
56+
* **Algorithms**
57+
* Algorithms Design & Analysis I [Stanford / Coursera](https://www.coursera.org/course/algo)
58+
* Algorithm Design [Kleinberg & Tardos / Book](http://www.amazon.com/Algorithm-Design-Jon-Kleinberg/dp/0321295358/ref=sr_1_1?ie=UTF8&qid=1376702127&sr=8-1&keywords=kleinberg+algorithms)
59+
60+
* **Databases**
61+
* Introduction to Databases [Stanford / Coursera](https://www.coursera.org/course/db)
62+
63+
* **Data Mining**
64+
* Mining Massive Data Sets [Stanford / Book](http://i.stanford.edu/~ullman/mmds.html)
65+
* Mining The Social Web [O'Reilly / Book](http://shop.oreilly.com/product/0636920010203.do)
66+
* Introduction to Information Retrieval [Stanford / Book](http://nlp.stanford.edu/IR-book/information-retrieval-book.html)
67+
68+
* **Machine Learning**
69+
* Machine Learning / Ng [Stanford / Coursera](https://www.coursera.org/course/ml)
70+
* Programming Collective Intelligence [O'Reilly / Book](http://shop.oreilly.com/product/9780596529321.do)
71+
* Statistics [The Elements of Statistical Learning](http://www-stat.stanford.edu/~tibs/ElemStatLearn/)
72+
73+
* **Probabilistic Graphical Models**
74+
* Probabilistic Programming and Bayesian Methods for Hackers [GitHub / Tutorials](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers)
75+
* PGMs / Koller [Stanford / Coursera](https://www.coursera.org/course/pgm)
76+
77+
* **Natural Language Processing**
78+
* NLP with Python [O'Reilly / Book](http://shop.oreilly.com/product/9780596516499.do)
79+
80+
* **Analysis**
81+
* Python for Data Analysis [O'Reilly / Book](http://www.kqzyfj.com/click-7040302-11260198?url=http%3A%2F%2Fshop.oreilly.com%2Fproduct%2F0636920023784.do&cjsku=0636920023784)
82+
* Big Data Analysis with Twitter [UC Berkeley / Lectures](http://blogs.ischool.berkeley.edu/i290-abdt-s12/)
83+
* Social and Economic Networks: Models and Analysis [Stanford / Coursera](https://www.coursera.org/course/networksonline)
84+
* Information Visualization ["Envisioning Information" Tufte / Book](http://www.amazon.com/Envisioning-Information-Edward-R-Tufte/dp/0961392118/ref=sr_1_8?ie=UTF8&qid=1376709039&sr=8-8&keywords=information+design)
85+
86+
* **Python** (Learning)
87+
* New To Python: [Learn Python the Hard Way](http://learnpythonthehardway.org/), [Google's Python Class](http://code.google.com/edu/languages/google-python-class/)
88+
89+
* **Python** (Libraries)
90+
* Basic Packages [Python, virtualenv, NumPy, SciPy, matplotlib and IPython](http://www.lowindata.com/2013/installing-scientific-python-on-mac-os-x/)
91+
* [Data Science in IPython Notebooks](http://nborwankar.github.io/LearnDataScience/) (Linear Regression, Logistic Regression, Random Forests, K-Means Clustering)
92+
* Bayesian Inference | [pymc](https://github.com/pymc-devs/pymc)
93+
* Labeled data structure objects, statistical functions, etc. [pandas](https://github.com/pydata/pandas) (See: Python for Data Analysis)
94+
* Python wrapper for the Twitter API [twython](https://github.com/ryanmcgrath/twython)
95+
* Tools for Data Mining & Analysis [scikit-learn](http://scikit-learn.org/stable/)
96+
* Network Modeling & Viz [networkx](http://networkx.github.io/)
97+
* Natural Language Toolkit [NLTK](http://nltk.org/)
98+
99+
### Capstone Project
100+
* [Toy Data Ideas](http://www.quora.com/Programming-Challenges-1/What-are-some-good-toy-problems-in-data-science)
101+
* Capstone Analysis of Your Own Design; [Quora](http://www.quora.com/Programming-Challenges-1/What-are-some-good-toy-problems-in-data-science)'s Idea Compendium
102+
103+
***
104+
### Further Study Resources:
105+
* [Coursera](http://coursera.org)
106+
* [Khan Academy](https://www.khanacademy.org/math/probability/random-variables-topic/random_variables_prob_dist/v/term-life-insurance-and-death-probability)
107+
* [Wolfram Alpha](http://www.wolframalpha.com/input/?i=torus)
108+
* [Wikipedia](http://en.wikipedia.org/wiki/List_of_cognitive_biases)
109+
* [Quora](http://www.quora.com/Programming-Challenges-1/What-are-some-good-toy-problems-in-data-science)
110+
* Kindle .mobis
111+
* Great PopSci Read: [The Signal and The Noise](http://www.amazon.com/Signal-Noise-Predictions-Fail-but-ebook/dp/B007V65R54/ref=tmm_kin_swatch_0?_encoding=UTF8&sr=8-1&qid=1376699450) Nate Silver
112+
* Zipfian Academy's [List of Resources](http://blog.zipfianacademy.com/post/46864003608/a-practical-intro-to-data-science)
113+
* [A Software Engineer's Guide to Getting Started w Data Science](http://www.rcasts.com/2012/12/software-engineers-guide-to-getting.html)
114+
* Data Scientist Interviews [Metamarkets](http://metamarkets.com/category/data-science/)
36115

37116
## Contribute
38117

39-
Please Share and Contribute -- **it's Open Source**!
118+
Please Share and Contribute Your Ideas -- **it's Open Source!**
119+
120+
Here's [my transcript](https://github.com/datasciencemasters/go/wiki/%5BTranscript%5D-Clare-Corthell); Please **showcase your own** on the [wiki](https://github.com/datasciencemasters/go/wiki/_pages)!
40121

41-
Follow me on Twitter [@clarecorthell](http://twitter.com/clarecorthell)
122+
[Follow me on Twitter @clarecorthell](http://twitter.com/clarecorthell)
