Commit 219d0c2

committed
updates from jeremy
1 parent b1fc313 commit 219d0c2

9 files changed: +1846 -593 lines changed

1-what-is-nlp.ipynb

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -478,7 +478,7 @@
478478
"name": "python",
479479
"nbconvert_exporter": "python",
480480
"pygments_lexer": "ipython3",
481-
"version": "3.7.1"
481+
"version": "3.7.3"
482482
}
483483
},
484484
"nbformat": 4,

2-svd-nmf-topic-modeling.ipynb

Lines changed: 3 additions & 3 deletions
@@ -390,7 +390,7 @@
390390
"\n",
391391
"Lemmatization uses the rules about a language. The resulting tokens are all actual words\n",
392392
"\n",
393-
"\"Stemming is the poor-man\u2019s lemmatization.\" (Noah Smith, 2011) Stemming is a crude heuristic that chops the ends off of words. The resulting tokens may not be actual words. Stemming is faster."
393+
"\"Stemming is the poor-man’s lemmatization.\" (Noah Smith, 2011) Stemming is a crude heuristic that chops the ends off of words. The resulting tokens may not be actual words. Stemming is faster."
394394
]
395395
},
396396
{
@@ -1756,9 +1756,9 @@
17561756
"name": "python",
17571757
"nbconvert_exporter": "python",
17581758
"pygments_lexer": "ipython3",
1759-
"version": "3.7.1"
1759+
"version": "3.7.3"
17601760
}
17611761
},
17621762
"nbformat": 4,
17631763
"nbformat_minor": 2
1764-
}
1764+
}
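
To make the stemming/lemmatization contrast in the changed cell concrete, here is a toy sketch in plain Python. The suffix list and the small lemma table are made up for illustration; they are not the rules of NLTK, spaCy, or any other library, and a real lemmatizer relies on a full vocabulary and morphological analysis.

```python
# Toy illustration only: SUFFIXES and LEMMAS are hypothetical, hand-picked values.
SUFFIXES = ("ing", "ies", "ed", "es", "s")
LEMMAS = {"flies": "fly", "studies": "study", "running": "run", "was": "be"}


def crude_stem(word):
    """Chop the first matching suffix off the end of a word (no language rules)."""
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word):
    """Look the word up in a table of known forms; fall back to the word itself."""
    return LEMMAS.get(word, word)


words = ["flies", "studies", "running", "was"]
stems = [crude_stem(w) for w in words]      # may not be actual words
lemmas = [toy_lemmatize(w) for w in words]  # always actual words

for w, s, l in zip(words, stems, lemmas):
    print(f"{w:10} stem={s:8} lemma={l}")
```

The stems here (`fli`, `stud`, `runn`) are not actual words, while the lemmas are, which is the trade-off the cell describes: stemming is cheap but crude.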

2b-odds-and-ends.ipynb

Lines changed: 1 addition & 2 deletions
@@ -122,7 +122,6 @@
122122
]
123123
},
124124
{
125-
"attachments": {},
126125
"cell_type": "markdown",
127126
"metadata": {
128127
"hidden": true
@@ -1093,7 +1092,7 @@
10931092
"name": "python",
10941093
"nbconvert_exporter": "python",
10951094
"pygments_lexer": "ipython3",
1096-
"version": "3.7.1"
1095+
"version": "3.7.3"
10971096
}
10981097
},
10991098
"nbformat": 4,

3-logreg-nb-imdb.ipynb

Lines changed: 4 additions & 4 deletions
@@ -76,7 +76,7 @@
7676
"source": [
7777
"The [large movie review dataset](http://ai.stanford.edu/~amaas/data/sentiment/) contains a collection of 50,000 reviews from IMDB, We will use the version hosted as part [fast.ai datasets](https://course.fast.ai/datasets.html) on AWS Open Datasets. \n",
7878
"\n",
79-
"The dataset contains an even number of positive and negative reviews. The authors considered only highly polarized reviews. A negative review has a score \u2264 4 out of 10, and a positive review has a score \u2265 7 out of 10. Neutral reviews are not included in the dataset. The dataset is divided into training and test sets. The training set is the same 25,000 labeled reviews.\n",
79+
"The dataset contains an even number of positive and negative reviews. The authors considered only highly polarized reviews. A negative review has a score ≤ 4 out of 10, and a positive review has a score ≥ 7 out of 10. Neutral reviews are not included in the dataset. The dataset is divided into training and test sets. The training set is the same 25,000 labeled reviews.\n",
8080
"\n",
8181
"The **sentiment classification task** consists of predicting the polarity (positive or negative) of a given text."
8282
]
@@ -3251,7 +3251,7 @@
32513251
{
32523252
"data": {
32533253
"text/plain": [
3254-
"'xxbos xxmaj this is not really a zombie film , if we \\'re xxunk zombies as the dead walking around . xxmaj here the protagonist , xxmaj xxunk xxmaj louque ( played by an xxunk young xxmaj dean xxmaj xxunk ) , xxunk control of a method to create zombies , though in fact , his \\' method \\' is to mentally project his thoughts and control other living people \\'s minds turning them into xxunk slaves . xxmaj this is an interesting concept for a movie , and was done much more effectively by xxmaj xxunk xxmaj lang in his series of \\' xxmaj dr. xxmaj mabuse \\' films , including \\' xxmaj dr. xxmaj mabuse the xxmaj xxunk \\' ( xxunk ) and \\' xxmaj the xxmaj testament of xxmaj dr. xxmaj mabuse \\' ( 1933 ) . xxmaj here it is unfortunately xxunk to his quest to regain the love of his former fianc\u00e9e , xxmaj claire xxmaj duvall ( played by the xxmaj anne xxmaj xxunk look alike with a bad xxunk , xxmaj dorothy xxmaj stone ) which is really the major theme . \\n \\n xxmaj the movie has an intriguing beginning , as xxmaj louque is sent on a military xxunk expedition to xxmaj xxunk to end the cult of zombies that came from there . xxmaj at some type of compound ( where we get great 30s sets and clothes ) he xxunk his xxunk to xxmaj claire , and then barely five minutes later , she gives him back his ring xxunk her love for his pal , xxmaj xxunk xxmaj greyson ( xxmaj robert xxmaj xxunk ) . xxmaj it \\'s unintentionally funny the way they talk to each other without making eye contact . xxmaj this would have been a great movie for \\' xxmaj mystery xxmaj science xxmaj theater xxunk \\' , if they had n\\'t already xxunk it . \\n \\n xxmaj it \\'s never shown how xxmaj louque actually learns the \\' xxunk \\' secret , but he then uses it to kill his enemies , create a giant army of xxunk carrying soldiers and body guards . 
xxmaj we wo n\\'t see such sheer force of will until xxmaj john xxmaj xxunk in \\' xxmaj the xxmaj brain xxmaj from xxmaj planet xxmaj xxunk \\' ( xxunk ) . \\n \\n xxmaj finally xxmaj claire xxunk to marry him if he will let xxmaj greyson live and return to xxmaj america . xxmaj louque agrees , but actually turns him into one of his xxunk slaves . xxmaj on their wedding night he realizes that xxmaj claire will only begin to love him if he gives up his \\' powers . \\' xxmaj to gain her love , he does so , causing the \\' revolt \\' of the title , in which all his slaves xxunk and attack his compound and kill him . xxmaj greyson xxunk xxmaj claire , and we seem to be at the end of a parable : \" xxmaj whom the xxunk would destroy , they first make mad . \" \\n \\n xxmaj so really then , it \\'s not that bad of a film , despite the low imdb rating it currently has . xxmaj on repeated viewings ( ? ) one can see the xxunk in the well formed script ! xxmaj dean xxmaj xxunk had yet to develop into a good actor , and is almost unrecognizable in his xxunk -- is that really his own hair ? xxmaj we remember him more for his xxunk , old man roles in \\' xxmaj white xxmaj christmas \\' ( xxunk ) , \\' x xxmaj the xxmaj unknown \\' ( 1956 ) and \\' xxmaj king xxmaj xxunk \\' ( 1958 ) . xxmaj the story xxunk a lot of its basic themes from the xxmaj xxunk brothers better , earlier film \\' xxmaj white xxmaj zombie \\' ( xxunk ) in which xxunk xxmaj robert xxmaj xxunk ( as xxmaj charles xxmaj xxunk ) uses \\' xxunk \\' to win the love of xxmaj xxunk xxmaj xxunk ( as xxmaj xxunk xxmaj parker ) . \\n \\n xxmaj if you want real zombie movies ( of which there are hundreds ! 
) i \\'d start with \\' xxmaj white xxmaj zombie \\' ( xxunk ) , \\' xxmaj king of the xxmaj zombies \\' ( xxunk ) , \\' i xxmaj walked with a xxmaj zombie \\' ( xxunk ) , \\' xxmaj night of the xxmaj living xxmaj dead \\' ( xxunk ) , \\' xxmaj the xxmaj last xxmaj man on xxmaj earth \\' ( 1964 ) and its two xxunk . xxmaj in the modern era of classy films , there are \\' xxmaj horror xxmaj express \\' ( 1972 ) , \\' xxmaj the xxmaj xxunk and the xxmaj xxunk \\' ( xxunk ) , \\' 28 xxmaj days xxmaj later \\' ( 2002 ) and its sequel , as well as many , many , others too numerous to mention . \\n \\n xxmaj this one is not really a zombie film . xxmaj judging this movie on its own terms , it \\'s more of a semi - xxmaj gothic romance . xxmaj as such it ranks a little below some of xxmaj universal \\'s bottom billed b horror movies of the late 30s and early xxunk . xxmaj so i \\'ll give it a 5 .'"
3254+
"'xxbos xxmaj this is not really a zombie film , if we \\'re xxunk zombies as the dead walking around . xxmaj here the protagonist , xxmaj xxunk xxmaj louque ( played by an xxunk young xxmaj dean xxmaj xxunk ) , xxunk control of a method to create zombies , though in fact , his \\' method \\' is to mentally project his thoughts and control other living people \\'s minds turning them into xxunk slaves . xxmaj this is an interesting concept for a movie , and was done much more effectively by xxmaj xxunk xxmaj lang in his series of \\' xxmaj dr. xxmaj mabuse \\' films , including \\' xxmaj dr. xxmaj mabuse the xxmaj xxunk \\' ( xxunk ) and \\' xxmaj the xxmaj testament of xxmaj dr. xxmaj mabuse \\' ( 1933 ) . xxmaj here it is unfortunately xxunk to his quest to regain the love of his former fiancée , xxmaj claire xxmaj duvall ( played by the xxmaj anne xxmaj xxunk look alike with a bad xxunk , xxmaj dorothy xxmaj stone ) which is really the major theme . \\n \\n xxmaj the movie has an intriguing beginning , as xxmaj louque is sent on a military xxunk expedition to xxmaj xxunk to end the cult of zombies that came from there . xxmaj at some type of compound ( where we get great 30s sets and clothes ) he xxunk his xxunk to xxmaj claire , and then barely five minutes later , she gives him back his ring xxunk her love for his pal , xxmaj xxunk xxmaj greyson ( xxmaj robert xxmaj xxunk ) . xxmaj it \\'s unintentionally funny the way they talk to each other without making eye contact . xxmaj this would have been a great movie for \\' xxmaj mystery xxmaj science xxmaj theater xxunk \\' , if they had n\\'t already xxunk it . \\n \\n xxmaj it \\'s never shown how xxmaj louque actually learns the \\' xxunk \\' secret , but he then uses it to kill his enemies , create a giant army of xxunk carrying soldiers and body guards . xxmaj we wo n\\'t see such sheer force of will until xxmaj john xxmaj xxunk in \\' xxmaj the xxmaj brain xxmaj from xxmaj planet xxmaj xxunk \\' ( xxunk ) . 
\\n \\n xxmaj finally xxmaj claire xxunk to marry him if he will let xxmaj greyson live and return to xxmaj america . xxmaj louque agrees , but actually turns him into one of his xxunk slaves . xxmaj on their wedding night he realizes that xxmaj claire will only begin to love him if he gives up his \\' powers . \\' xxmaj to gain her love , he does so , causing the \\' revolt \\' of the title , in which all his slaves xxunk and attack his compound and kill him . xxmaj greyson xxunk xxmaj claire , and we seem to be at the end of a parable : \" xxmaj whom the xxunk would destroy , they first make mad . \" \\n \\n xxmaj so really then , it \\'s not that bad of a film , despite the low imdb rating it currently has . xxmaj on repeated viewings ( ? ) one can see the xxunk in the well formed script ! xxmaj dean xxmaj xxunk had yet to develop into a good actor , and is almost unrecognizable in his xxunk -- is that really his own hair ? xxmaj we remember him more for his xxunk , old man roles in \\' xxmaj white xxmaj christmas \\' ( xxunk ) , \\' x xxmaj the xxmaj unknown \\' ( 1956 ) and \\' xxmaj king xxmaj xxunk \\' ( 1958 ) . xxmaj the story xxunk a lot of its basic themes from the xxmaj xxunk brothers better , earlier film \\' xxmaj white xxmaj zombie \\' ( xxunk ) in which xxunk xxmaj robert xxmaj xxunk ( as xxmaj charles xxmaj xxunk ) uses \\' xxunk \\' to win the love of xxmaj xxunk xxmaj xxunk ( as xxmaj xxunk xxmaj parker ) . \\n \\n xxmaj if you want real zombie movies ( of which there are hundreds ! ) i \\'d start with \\' xxmaj white xxmaj zombie \\' ( xxunk ) , \\' xxmaj king of the xxmaj zombies \\' ( xxunk ) , \\' i xxmaj walked with a xxmaj zombie \\' ( xxunk ) , \\' xxmaj night of the xxmaj living xxmaj dead \\' ( xxunk ) , \\' xxmaj the xxmaj last xxmaj man on xxmaj earth \\' ( 1964 ) and its two xxunk . 
xxmaj in the modern era of classy films , there are \\' xxmaj horror xxmaj express \\' ( 1972 ) , \\' xxmaj the xxmaj xxunk and the xxmaj xxunk \\' ( xxunk ) , \\' 28 xxmaj days xxmaj later \\' ( 2002 ) and its sequel , as well as many , many , others too numerous to mention . \\n \\n xxmaj this one is not really a zombie film . xxmaj judging this movie on its own terms , it \\'s more of a semi - xxmaj gothic romance . xxmaj as such it ranks a little below some of xxmaj universal \\'s bottom billed b horror movies of the late 30s and early xxunk . xxmaj so i \\'ll give it a 5 .'"
32553255
]
32563256
},
32573257
"execution_count": 323,
@@ -7011,9 +7011,9 @@
70117011
"name": "python",
70127012
"nbconvert_exporter": "python",
70137013
"pygments_lexer": "ipython3",
7014-
"version": "3.7.1"
7014+
"version": "3.7.3"
70157015
}
70167016
},
70177017
"nbformat": 4,
70187018
"nbformat_minor": 2
7019-
}
7019+
}
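
The `text/plain` output in the second hunk of this file contains marker tokens that appear to follow the fastai convention: `xxbos` at the beginning of a text, `xxmaj` before a word that was capitalized, and `xxunk` in place of a word outside the vocabulary. The sketch below imitates that convention in plain Python. `toy_mark` and `vocab` are hypothetical names, and this is not fastai's tokenizer (it does not split punctuation or contractions, for instance).

```python
def toy_mark(text, vocab):
    """Lowercase whitespace-separated words and add marker tokens.

    Hypothetical helper for illustration only.
    """
    tokens = ["xxbos"]  # beginning-of-text marker
    for word in text.split():
        if word[0].isupper():
            tokens.append("xxmaj")  # the next token was capitalized
        lowered = word.lower()
        # Words missing from the vocabulary collapse to a single unknown token.
        tokens.append(lowered if lowered in vocab else "xxunk")
    return " ".join(tokens)


vocab = {"this", "is", "not", "really", "a", "zombie", "film"}
marked = toy_mark("This is not really a zombie flick", vocab)
print(marked)
```

Because `flick` is not in the toy vocabulary, it becomes `xxunk`, in the same way that rare names and years are replaced in the review shown in the diff.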

3b-more-details.ipynb

Lines changed: 3 additions & 3 deletions
@@ -38,7 +38,7 @@
3838
"- **it gives us a truncated SVD** (whereas with traditional SVD, we are usually throwing away small singular values and their corresponding columns)\n",
3939
"\n",
4040
"If you were curious to know more, two keys are:\n",
41-
"- It is often useful to be able to reduce dimensionality of data in a way that preserves distances. The Johnson\u2013Lindenstrauss lemma is a classic result of this type. [Johnson-Lindenstrauss Lemma](https://en.wikipedia.org/wiki/Johnson%E2%80%93Lindenstrauss_lemma): a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved (proof uses random projections).\n",
41+
"- It is often useful to be able to reduce dimensionality of data in a way that preserves distances. The Johnson–Lindenstrauss lemma is a classic result of this type. [Johnson-Lindenstrauss Lemma](https://en.wikipedia.org/wiki/Johnson%E2%80%93Lindenstrauss_lemma): a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved (proof uses random projections).\n",
4242
"- We haven't found a better general SVD method, we'll just use the method we have on a smaller matrix. \n",
4343
"\n",
4444
"Below is an over-simplified version of `randomized_svd` (you wouldn't want to use this in practice, but it covers the core ideas). The main part to notice is that we multiply our original matrix by a smaller random matrix (`M @ rand_matrix`) to produce `smaller_matrix`, and then use our same `np.linalg.svd` as before:\n",
@@ -971,9 +971,9 @@
971971
"name": "python",
972972
"nbconvert_exporter": "python",
973973
"pygments_lexer": "ipython3",
974-
"version": "3.7.1"
974+
"version": "3.7.3"
975975
}
976976
},
977977
"nbformat": 4,
978978
"nbformat_minor": 2
979-
}
979+
}
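
The Johnson–Lindenstrauss statement quoted in the first hunk of this file can be checked numerically. The following is a minimal sketch in plain Python, assuming a Gaussian random projection; the dimensions (1000 down to 400), the seeds, and the helper names are arbitrary choices for illustration and do not come from the notebook.

```python
import math
import random


def random_projection(points, k, seed=0):
    """Project d-dimensional points down to k dimensions with a random Gaussian matrix."""
    rng = random.Random(seed)
    d = len(points[0])
    # Entries have variance 1/k, so squared lengths are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix] for p in points]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(1)
points = [[rng.uniform(-1.0, 1.0) for _ in range(1000)] for _ in range(4)]
projected = random_projection(points, k=400)

# Ratio of projected distance to original distance for every pair of points.
ratios = [
    distance(projected[i], projected[j]) / distance(points[i], points[j])
    for i in range(4)
    for j in range(i + 1, 4)
]
print([round(r, 3) for r in ratios])
```

The ratios should all sit close to 1, meaning pairwise distances are nearly preserved even though more than half of the dimensions were dropped.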

4-regex.ipynb

Lines changed: 7 additions & 7 deletions
@@ -507,7 +507,7 @@
507507
"metadata": {},
508508
"outputs": [],
509509
"source": [
510-
"re_punc = re.compile(\"([\\\"\\''().,;:/_?!\u2014\\-])\") # add spaces around punctuation\n",
510+
"re_punc = re.compile(\"([\\\"\\''().,;:/_?!—\\-])\") # add spaces around punctuation\n",
511511
"re_apos = re.compile(r\"n ' t \") # n't\n",
512512
"re_bpos = re.compile(r\" ' s \") # 's\n",
513513
"re_mult_space = re.compile(r\" *\") # replace multiple spaces with just one\n",
@@ -1027,10 +1027,10 @@
10271027
"metadata": {},
10281028
"outputs": [],
10291029
"source": [
1030-
"message = \"\ud83d\ude12\ud83c\udfa6 \ud83e\udd22\ud83c\udf55\"\n",
1030+
"message = \"😒🎦 🤢🍕\"\n",
10311031
"\n",
1032-
"re_frown = re.compile(r\"\ud83d\ude12|\ud83e\udd22\")\n",
1033-
"re_frown.sub(r\"\ud83d\ude0a\", message)"
1032+
"re_frown = re.compile(r\"😒|🤢\")\n",
1033+
"re_frown.sub(r\"😊\", message)"
10341034
]
10351035
},
10361036
{
@@ -1114,7 +1114,7 @@
11141114
"1. We use regex as a metalanguage to find string patterns in blocks of text\n",
11151115
"1. `r\"\"` are your IRL friends for Python regex\n",
11161116
"1. We are just doing binary classification so use the same performance metrics\n",
1117-
"1. You'll make a lot of mistakes in regex \ud83d\ude29. \n",
1117+
"1. You'll make a lot of mistakes in regex 😩. \n",
11181118
" - False Positive: Thinking you are right but you are wrong\n",
11191119
" - False Negative: Missing something"
11201120
]
@@ -1272,9 +1272,9 @@
12721272
"name": "python",
12731273
"nbconvert_exporter": "python",
12741274
"pygments_lexer": "ipython3",
1275-
"version": "3.7.1"
1275+
"version": "3.7.3"
12761276
}
12771277
},
12781278
"nbformat": 4,
12791279
"nbformat_minor": 1
1280-
}
1280+
}
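
Most of the text changes in this commit swap `\uXXXX` escapes for the literal characters. A plausible cause, stated here as an assumption rather than something the commit message confirms, is that the notebooks were re-saved by a tool that writes JSON without ASCII escaping. The sketch below uses only the standard library to show that both spellings decode to the same string, so the emoji regex from `4-regex.ipynb` behaves identically before and after.

```python
import json
import re

message = "😒🎦 🤢🍕"

# Default json.dumps escapes non-ASCII characters; emoji outside the Basic
# Multilingual Plane are written as UTF-16 surrogate pairs (as in the removed lines).
escaped = json.dumps(message)
# ensure_ascii=False keeps the characters literal (as in the added lines).
literal = json.dumps(message, ensure_ascii=False)

# Both serialized forms decode back to the same Python string.
same = json.loads(escaped) == json.loads(literal) == message

# The regex from the notebook cell: replace either frowning emoji with a smile.
re_frown = re.compile(r"😒|🤢")
result = re_frown.sub("😊", message)

print(escaped)
print(literal)
print(same, result)
```

Since the decoded strings are equal, the change affects only how the `.ipynb` file looks on disk and in diffs, not what the cells do when run.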
