
Commit 0e70736

Initial HMM commit
1 parent af60d35 commit 0e70736

File tree

2 files changed: +159 −1 lines


FSA.ipynb

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
    }
   },
   "source": [
-   "# FSA Assignment\n",
+   "https://www.linkedin.com/in/oscar-kosar-kosarewicz/# FSA Assignment\n",
    " Implement the FSA variable selection method for linear models and binary classification with the logistic loss, as\n",
    " described in the slides. Use the parameters s = 0.0001, μ = 30, N iter = 500. Take special care to normalize each column of the X matrix to have zero mean and variance 1, and to use the same mean and standard deviation that you used for normalizing the train set also for normalizing the test set.\n"
   ]

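The FSA cell above stresses normalizing with train-set statistics only. A minimal sketch of just that step — the `standardize` helper name is mine, not part of the assignment code:

```python
import numpy as np

def standardize(X_train, X_test):
    """Column-wise zero mean / unit variance, using train statistics only."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0] = 1.0          # guard against constant columns
    # The SAME mu/sigma are applied to the test split, as the cell requires.
    return (X_train - mu) / sigma, (X_test - mu) / sigma
```

Fitting the scaler on the full data (train + test) would leak test information into training, which is exactly what the instructions warn against.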
HMM.ipynb

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "source": [
+    "# HMM Assignment\n",
+    "1. Download the dataset hmm_pb1.csv from Canvas. It represents a sequence of\n",
+    "dice rolls $x$ from the Dishonest casino model discussed in class. The model parameters\n",
+    "are exactly those presented in class. The states of $Y$ are 1=’Fair’ and 2=’Loaded’.\n"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "#### Import dependencies"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 407,
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "from matplotlib import pyplot as plt\n",
+    "from sklearn.cluster import KMeans\n",
+    "from os.path import join\n",
+    "from scipy.stats import multivariate_normal\n",
+    "from itertools import repeat\n",
+    "from random import randint"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "#### Data loading functions"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 408,
+   "outputs": [],
+   "source": [
+    "def get_pb1():\n",
+    "    return load_data(\"hmm_pb1.csv\")\n",
+    "\n",
+    "def get_pb2():\n",
+    "    return load_data(\"hmm_pb2.csv\")\n",
+    "\n",
+    "def load_data(filename):\n",
+    "    path = \"data/HMM/\"\n",
+    "    data = np.loadtxt(join(path, filename), delimiter=',')\n",
+    "    return data\n"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "a) Implement the Viterbi algorithm and find the most likely sequence $y$ that generated the observed $x$.\n",
+    "Use the log probabilities, as shown in the HMM slides from\n",
+    "Canvas. Report the obtained sequence $y$ of 1’s and 2’s for verification. (2 points)"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "b) Implement the forward and backward algorithms and run them on the observed\n",
+    "$x$. You should memorize a common factor $u_t$ for the $\\alpha_t^k$\n",
+    "to avoid floating point underflow, since the $\\alpha_t^k$ quickly become very small. The same holds for\n",
+    "$\\beta_t^k$. Report $\\alpha_{125}^1 / \\alpha_{125}^2$ and $\\beta_{125}^1 / \\beta_{125}^2$,\n",
+    "where the counting starts from $t = 1$. (3 points)"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "2. Download the dataset hmm_pb2.csv from Canvas. It represents a sequence of\n",
+    "10000 dice rolls $x$ from the Dishonest casino model but with other values for the $a$ and\n",
+    "$b$ parameters than those from class. Having so many observations, you are going to\n",
+    "learn the model parameters.\n"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "Implement and run the Baum-Welch algorithm using the forward and backward\n",
+    "algorithms that you already implemented for Pb 1. You can initialize the $\\pi, a, b$ with\n",
+    "your guess, or with some random probabilities (make sure that $\\pi$ sums to 1 and that\n",
+    "$a_{ij}, b^i_k$\n",
+    "sum to 1 for each $i$). The algorithm converges quite slowly, so you might need\n",
+    "to run it for up to 1000 iterations or more for the parameters to converge.\n",
+    "Report the values of $\\pi, a, b$ that you have obtained. (4 points)\n",
+    "\n"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  }
+ ],
+ "metadata": {
+  "authors": [
+   {
+    "name": "Oscar Kosar-Kosarewicz"
+   },
+   {
+    "name": "Nicholas Phillips"
+   }
+  ],
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
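Part 1(a) of the notebook asks for Viterbi in log space. A sketch of that recursion under generic parameters — states are indexed 0=Fair, 1=Loaded here, and the class's actual transition/emission values are not reproduced:

```python
import numpy as np

def viterbi(x, pi, A, B):
    """Most likely state path, computed in log space.
    x: 0-based observation indices; pi: initial state probabilities;
    A[i, j]: transition i -> j; B[i, k]: emission of symbol k in state i."""
    T, K = len(x), len(pi)
    logA, logB = np.log(A), np.log(B)
    delta = np.zeros((T, K))              # best log-prob ending in each state
    psi = np.zeros((T, K), dtype=int)     # backpointers
    delta[0] = np.log(pi) + logB[:, x[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # scores[i, j]: come from i, land in j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, x[t]]
    # Backtrack from the best final state
    y = np.zeros(T, dtype=int)
    y[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        y[t] = psi[t + 1, y[t + 1]]
    return y
```

Adding 1 to the returned indices gives the 1='Fair', 2='Loaded' labeling the assignment asks to report.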

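For part 1(b), one standard way to carry the common factor $u_t$ the cell mentions is to normalize each forward vector to sum to 1 and reuse the same factors in the backward pass. A sketch (not the authors' submission); note the ratios $\alpha_t^1/\alpha_t^2$ asked for in the report are unaffected by this scaling:

```python
import numpy as np

def forward_backward(x, pi, A, B):
    """Scaled forward/backward passes.
    Returns scaled alpha, scaled beta, and per-step factors u, where
    alpha[t] sums to 1 and (alpha[t] * beta[t]) are the state posteriors."""
    T, K = len(x), len(pi)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    u = np.zeros(T)
    alpha[0] = pi * B[:, x[0]]
    u[0] = alpha[0].sum()
    alpha[0] /= u[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
        u[t] = alpha[t].sum()                 # common factor, prevents underflow
        alpha[t] /= u[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1])
        beta[t] /= u[t + 1]                   # reuse the forward factors
    return alpha, beta, u
```

With this convention the total log-likelihood is `np.log(u).sum()`, and `alpha[124, 0] / alpha[124, 1]` gives the $\alpha_{125}^1/\alpha_{125}^2$ ratio (counting from $t=1$).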
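For problem 2, a compact Baum-Welch sketch built on the same scaled passes. This is an illustration of the EM updates, not the authors' code; initial guesses and iteration counts follow the cell's advice:

```python
import numpy as np

def baum_welch(x, pi, A, B, n_iter=1000):
    """EM re-estimation of (pi, A, B) from one observation sequence x."""
    x = np.asarray(x)
    T, K = len(x), len(pi)
    pi, A, B = pi.copy(), A.copy(), B.copy()
    for _ in range(n_iter):
        # E-step: scaled forward pass
        alpha = np.zeros((T, K)); u = np.zeros(T)
        alpha[0] = pi * B[:, x[0]]
        u[0] = alpha[0].sum(); alpha[0] /= u[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
            u[t] = alpha[t].sum(); alpha[t] /= u[t]
        # E-step: scaled backward pass (reusing the forward factors u)
        beta = np.zeros((T, K)); beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, x[t + 1]] * beta[t + 1])) / u[t + 1]
        gamma = alpha * beta                       # P(Y_t = i | x), rows sum to 1
        # Expected transition counts xi[t, i, j]
        xi = (alpha[:-1, :, None] * A[None] *
              (B[:, x[1:]].T * beta[1:])[:, None, :] / u[1:, None, None])
        # M-step: renormalize expected counts
        pi = gamma[0] / gamma[0].sum()
        A = xi.sum(axis=0)
        A /= A.sum(axis=1, keepdims=True)
        for k in range(B.shape[1]):
            B[:, k] = gamma[x == k].sum(axis=0)
        B /= B.sum(axis=1, keepdims=True)
    return pi, A, B
```

Each update keeps $\pi$, the rows of $a$, and the rows of $b$ on the probability simplex, which is the constraint the cell asks you to maintain at initialization as well.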