adamduan
diff --git a/build/doctrees/environment.pickle
0 Bytes b/build/doctrees/environment.pickle
diff --git a/build/doctrees/目录/ch1.doctree
1.2 KB b/build/doctrees/目录/ch1.doctree
diff --git a/build/doctrees/目录/ch10.doctree
0 Bytes b/build/doctrees/目录/ch10.doctree
diff --git a/build/doctrees/目录/ch4.doctree
0 Bytes b/build/doctrees/目录/ch4.doctree
diff --git a/build/doctrees/目录/ch7.doctree
-77 Bytes b/build/doctrees/目录/ch7.doctree
diff --git a/build/doctrees/目录/ch8.doctree
0 Bytes b/build/doctrees/目录/ch8.doctree
diff --git a/build/doctrees/目录/ch9.doctree
0 Bytes b/build/doctrees/目录/ch9.doctree
diff --git a/build/doctrees/目录/参考答案.doctree
-74 Bytes b/build/doctrees/目录/参考答案.doctree
diff --git a/build/html/_sources/目录/ch1.rst.txt
Lines changed: 23 additions & 11 deletions b/build/html/_sources/目录/ch1.rst.txt
diff --git a/build/html/_sources/目录/ch7.rst.txt
Lines changed: 3 additions & 3 deletions b/build/html/_sources/目录/ch7.rst.txt
diff --git a/build/html/_sources/目录/参考答案.rst.txt
Lines changed: 4 additions & 4 deletions b/build/html/_sources/目录/参考答案.rst.txt
diff --git a/build/html/searchindex.js
Lines changed: 1 addition & 1 deletion b/build/html/searchindex.js
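@@ -164,7 +164,7 @@ The zip function packs multiple iterables into an iterable of tuples
     np.eye(3) # 3x3 identity matrix
     np.eye(3, k=1) # pseudo-identity matrix with the ones shifted 1 step off the main diagonal
     np.full((2,3), 10) # the tuple gives the shape, 10 is the fill value
-    np.full((2,3), [1,2,3]) # fill the values of each column by passing a list
+    np.full((2,3), [1,2,3]) # fill every row with the same list
 
 【c】Random matrices: ``np.random``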
 
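@@ -182,6 +182,12 @@ The zip function packs multiple iterables into an iterable of tuples
     a, b = 5, 15
     (b - a) * np.random.rand(3) + a
 
+More generally, the existing library function can be used instead:
+
+.. ipython:: python
+
+    np.random.uniform(5, 15, 3)
+
 ``randn`` draws samples from the standard normal distribution :math:`N\rm{(\mathbf{0}, \mathbf{I})}`: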
 
 .. ipython:: python
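@@ -196,6 +202,12 @@ The zip function packs multiple iterables into an iterable of tuples
     sigma, mu = 2.5, 3
     mu + np.random.randn(3) * sigma
 
+Likewise, the samples can be generated with an existing function:
+
+.. ipython:: python
+
+    np.random.normal(3, 2.5, 3)
+
 ``randint`` takes the minimum, the maximum (exclusive), and the output shape of the random integers to generate: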
 
 .. ipython:: python
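@@ -211,7 +223,7 @@ The zip function packs multiple iterables into an iterable of tuples
     np.random.choice(my_list, 2, replace=False, p=[0.1, 0.7, 0.1 ,0.1])
     np.random.choice(my_list, (3,3))
 
-When the number of returned elements equals the length of the original list, this is equivalent to using the ``permutation`` function, i.e. shuffling the original list:
+When the number of returned elements equals the length of the original list, sampling without replacement is equivalent to using the ``permutation`` function, i.e. shuffling the original list: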
 
 .. ipython:: python
 
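@@ -417,9 +429,9 @@ The zip function packs multiple iterables into an iterable of tuples
 
     res = np.ones((3,2))
     res
-    res * np.array([[2,3]]) # expand the first dimension to 3
-    res * np.array([[2],[3],[4]]) # expand the second dimension to 2
-    res * np.array([[2]]) # equivalent to two expansions
+    res * np.array([[2,3]]) # the second array's first dimension is expanded to 3
+    res * np.array([[2],[3],[4]]) # the second array's second dimension is expanded to 2
+    res * np.array([[2]]) # equivalent to two expansions: the second array's two dimensions are expanded to 3 and 2
 
 【c】Operations between one-dimensional and two-dimensional arrays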
 
@@ -468,11 +480,11 @@ other  --                            sum(abs(x)**ord)**(1./ord)
 
 .. ipython:: python
 
-    martix_target =  np.arange(4).reshape(-1,2)
-    martix_target 
-    np.linalg.norm(martix_target, 'fro')
-    np.linalg.norm(martix_target, np.inf)
-    np.linalg.norm(martix_target, 2)
+    matrix_target =  np.arange(4).reshape(-1,2)
+    matrix_target 
+    np.linalg.norm(matrix_target, 'fro')
+    np.linalg.norm(matrix_target, np.inf)
+    np.linalg.norm(matrix_target, 2)
 
 .. ipython:: python
 
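@@ -574,4 +586,4 @@ Ex4: Improving the performance of a matrix computation
 Ex5: Maximum length of consecutive integers
 -------------------------------------------
 
-Given a ``Numpy`` array of integers, return the maximum length of a subarray of increasing consecutive integers. For example, for the input [1,2,5,6,7], [5,6,7] is the longest subarray of increasing consecutive integers, so the output is 3; for the input [3,2,1,2,3,4,6], [1,2,3,4] is the longest subarray of increasing consecutive integers, so the output is 4. Make full use of the built-in ``Numpy`` functions. (Hint: consider the ``nonzero, diff`` functions.)
+Given a ``Numpy`` array of integers, return the maximum length of a subarray of strictly increasing consecutive integers. For example, for the input [1,2,5,6,7], [5,6,7] is the longest subarray of increasing consecutive integers, so the output is 3; for the input [3,2,1,2,3,4,6], [1,2,3,4] is the longest subarray of increasing consecutive integers, so the output is 4. Make full use of the built-in ``Numpy`` functions. (Hint: consider the ``nonzero, diff`` functions.)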
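
Review note on Ex5: the hint in the exercise text (``nonzero, diff``) can be followed as in the minimal sketch below. It is one possible approach under the reading that "consecutive" means each element equals the previous one plus 1, and it is not taken from the repository's reference answers. The function name ``longest_consecutive_run`` is made up for illustration.

```python
import numpy as np

def longest_consecutive_run(arr):
    """Length of the longest run in which each element is the previous one plus 1."""
    arr = np.asarray(arr)
    if arr.size == 0:
        return 0
    # Indices where the step between neighbours is not exactly +1 start a new run.
    breaks = np.nonzero(np.diff(arr) != 1)[0] + 1
    # Add both ends of the array so that every run is bracketed by two edges.
    edges = np.r_[0, breaks, arr.size]
    # Distances between successive edges are the run lengths.
    return int(np.diff(edges).max())

# The two examples from the exercise statement.
print(longest_consecutive_run([1, 2, 5, 6, 7]))        # 3
print(longest_consecutive_run([3, 2, 1, 2, 3, 4, 6]))  # 4
```

The sketch avoids Python-level loops, which is in the spirit of the exercise's request to rely on built-in ``Numpy`` functions.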
@@ -13,15 +13,15 @@
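 1. Missing-data statistics
 --------------------------
 
-Missing data can be inspected with ``isna`` or ``isnull`` (the two functions are identical), which report whether each cell is missing; combining them with ``sum`` gives the proportion of missing values in each column:
+Missing data can be inspected with ``isna`` or ``isnull`` (the two functions are identical), which report whether each cell is missing; combining them with ``mean`` gives the proportion of missing values in each column: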
 
 .. ipython:: python
     
     df = pd.read_csv('data/learn_pandas.csv',
                      usecols = ['Grade', 'Name', 'Gender', 'Height',
                                 'Weight', 'Transfer'])
     df.isna().head()
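-    df.isna().sum()/df.shape[0] # proportion of missing values
+    df.isna().mean() # proportion of missing values
 
 To view the rows in which a particular column is missing or not missing, use ``isna`` or ``notna`` on the ``Series`` for boolean indexing. For example, to view the rows with a missing height: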
 
@@ -329,7 +329,7 @@ Ex1: Testing the association between missing values and class
 
     df = pd.read_csv('data/missing_chi.csv')
     df.head()
-    df.isna().sum()/df.shape[0]
+    df.isna().mean()
     df.y.value_counts(normalize=True)
 
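 In fact, whether a value is missing is sometimes a feature in its own right, and in some settings it may be associated with whether the label is positive or negative. In statistics, the chi-squared test can be used to assess whether missingness and the sign of the label are associated. The samples fall into four cases: positive with the feature missing, negative with the feature missing, positive with the feature present, and negative with the feature present; let the corresponding sample counts be :math:`n_{11}, n_{10}, n_{01}, n_{00}`. If the two are unrelated, the expected number of positives among the feature-missing samples should be close to the total number of feature-missing samples :math:`\times` the overall proportion of positives, that is: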
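
Review note: the expected count described in the paragraph above can be checked numerically. The sketch below only restates that sentence in code (feature-missing total times overall positive rate); the function name and the counts are made up for illustration and do not come from ``data/missing_chi.csv``.

```python
def expected_missing_positive(n11, n10, n01, n00):
    """Expected number of positives among feature-missing samples under independence."""
    total = n11 + n10 + n01 + n00
    missing_total = n11 + n10             # all samples with the feature missing
    positive_rate = (n11 + n01) / total   # overall proportion of positives
    return missing_total * positive_rate

# Hypothetical counts: 100 feature-missing samples, 200 positives out of 1000.
e11 = expected_missing_positive(n11=30, n10=70, n01=170, n00=730)
print(e11)  # 100 * 0.2, i.e. about 20 expected versus 30 observed
```

A large gap between the observed :math:`n_{11}` and this expected value is what the chi-squared statistic accumulates across the four cells.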
 
@@ -137,10 +137,10 @@ Ex1: Pokemon dataset
 
 .. ipython:: python
 
-    L_full = [' '.join([i, j]) if i!=j else i for j in dp_dup['Type 1'
-             ].unique() for i in dp_dup['Type 1'].unique()]
-    L_part = [' '.join([i, j]) if type(j)!=float else i for i, j in zip(
-             attr_dup['Type 1'], attr_dup['Type 2'])]
+    L_full = [i+' '+j for i in df['Type 1'].unique() for j in (
+              df['Type 1'].unique().tolist() + [''])]
+    L_part = [i+' '+j for i, j in zip(df['Type 1'], df['Type 2'
+             ].replace(np.nan, ''))]
     res = set(L_full).difference(set(L_part))
     len(res) # too many to print
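
Review note: the rewritten ``L_full`` / ``L_part`` lines can be tried on a small frame. The sketch below mirrors the added lines on a hypothetical four-row table (the rows are invented, not taken from the Pokemon dataset) and uses ``fillna('')`` in place of ``replace(np.nan, '')``, which is assumed to behave the same for this column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the dataset used in the exercise.
df = pd.DataFrame({
    'Type 1': ['Grass', 'Fire', 'Water', 'Fire'],
    'Type 2': ['Poison', np.nan, np.nan, 'Flying'],
})

types = df['Type 1'].unique().tolist()
# Every pairing of a first type with a second type, plus the "no second type" case ('').
L_full = [i + ' ' + j for i in types for j in types + ['']]
# Pairings that actually occur; a missing second type becomes the empty string.
L_part = [i + ' ' + j for i, j in zip(df['Type 1'], df['Type 2'].fillna(''))]

res = set(L_full).difference(L_part)
print(len(res))  # 10: 12 candidate pairings minus 'Fire ' and 'Water '
```

Two things are visible on the toy data: single-type entries are keyed with a trailing space (``'Fire '``), and candidates such as ``'Grass Grass'`` now appear in ``L_full``, whereas the removed version collapsed ``i == j`` to a single type name. Whether that change is intended is worth confirming with the author.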