IE684 Lab08
Try to solve all problems on your own. If you have difficulties, ask the instructor or TAs.
In this session, we continue the theme of decomposing a problem with optimization procedures to handle large data, applied to classification problems.
The implementation of the optimization algorithms in this lab will involve extensive use of the numpy Python package. It would be useful for you to get to know some of the functionality of the numpy package. For details, please consult https://numpy.org/doc/stable/index.html
For plotting purposes, please use the matplotlib.pyplot package. You can find examples at https://matplotlib.org/examples/.
Please follow the instructions given below to prepare your solution notebooks:
There are only 3 exercises in this lab.
You can either print the answers using the print command in your code or write the text in a separate text cell. To add text in your notebook, click +Text. Some questions require you to provide proper explanations; for such questions, write the explanations in a text cell. Some questions require the answers to be written in LaTeX notation (write the comments and observations with appropriate equations in LaTeX only). Some questions require plotting certain graphs. Please make sure that the plots are present in the submitted notebooks.
After completing this lab’s exercises, click File → Download .ipynb and save your files to your local laptop/desktop. Create a folder with the name YOURROLLNUMBER_IE684_Lab08 and copy your .ipynb files to the folder. Then zip the folder to create YOURROLLNUMBER_IE684_Lab08.zip. Then upload only the .zip file to Moodle. There will be some penalty for students who do not follow the proper naming conventions in their submissions.
In the last lab, we developed an optimization method to solve the optimization problem associated with binary classification problems. Recall that for a dataset $D = \{(x_i, y_i)\}_{i=1}^n$ where $x_i \in \mathcal{X} \subseteq \mathbb{R}^d$ and $y_i \in \{+1, -1\}$, we solve:

$$\min_{w \in \mathbb{R}^d} f(w) = \frac{\lambda}{2}\|w\|_2^2 + \frac{1}{n}\sum_{i=1}^{n} L(y_i, w^\top x_i) \qquad (1)$$
Solving the optimization problem (1) facilitates learning a classification rule h : X → {+1, −1}. We used the following prediction rule for a test sample $\hat{x}$:

$$h(\hat{x}) = \operatorname{sign}(w^\top \hat{x}) \qquad (2)$$
In the last lab, we used a decomposition of f(w) and solved an equivalent problem of the following form:

$$\min_{w} f(w) = \min_{w} \sum_{i=1}^{n} f_i(w) \qquad (3)$$
In this lab, we will consider a constrained variant of the optimization problem (1).
For a dataset $D = \{(x_i, y_i)\}_{i=1}^n$ where $x_i \in \mathcal{X} \subseteq \mathbb{R}^d$ and $y_i \in \{+1, -1\}$, we solve:

$$\min_{w \in \mathbb{R}^d} f(w) = \frac{\lambda}{2}\|w\|_2^2 + \frac{1}{n}\sum_{i=1}^{n} L(y_i, w^\top x_i), \quad \text{s.t. } w \in C \qquad (4)$$

As before, we use the decomposition of f(w) and solve the equivalent problem:

$$\min_{w \in C} f(w) = \min_{w \in C} \sum_{i=1}^{n} f_i(w) \qquad (5)$$
Exercise 1 (Data Preparation) Load the wine dataset from the scikit-learn package using the following code. We will load the features into the matrix A such that the i-th row of A contains the features of the i-th sample. The label vector will be loaded into y.
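The handout's loading snippet is not reproduced here; the following is a minimal sketch along the same lines, assuming the standard scikit-learn API (load_wine):

```python
import numpy as np
from sklearn.datasets import load_wine

wine = load_wine()
A = wine.data          # i-th row holds the features of the i-th sample
y = wine.target        # integer class labels
print(np.unique(y))    # inspect the class label values (step 1 below)
```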
1. Check the number of classes C and the class label values in the wine data. Check if the class labels are from the set {0, 1, . . . , C − 1} or from the set {1, 2, . . . , C}.
2. When loading the labels into y, do the following:
• If the class labels are from the set {0, 1, . . . , C − 1}, convert classes 0, 2, 3, . . . , C − 1 to −1.
• If the class labels are from the set {1, 2, . . . , C}, convert classes 2, 3, . . . , C to −1.
Thus, you will have class labels eventually belonging to the set {+1, −1}.
3. Normalize the columns of the A matrix such that each column has entries in the range [−1, 1].
4. Create an index array of size equal to the number of samples. Use this index array to partition the data and labels into train and test splits. In particular, use the first 80% of the indices to create the training data and labels, and the remaining 20% to create the test data and labels. Store them in the variables train_data, train_label, test_data, test_label. (A sketch covering steps 2-4 is given below.)
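A hedged sketch of steps 2-4, assuming the labels loaded above are the wine labels {0, 1, 2} (so class 1 maps to +1 and classes 0 and 2 map to −1); the max-absolute-value scaling is one simple choice that puts every column in [−1, 1]:

```python
# Step 2: binarize the labels; wine labels are {0, ..., C-1}, so classes
# 0, 2, ..., C-1 go to -1 and class 1 becomes +1.
y = np.where(y == 1, 1, -1)

# Step 3: scale each column of A by its maximum absolute value,
# so every entry lies in [-1, 1].
A = A / np.max(np.abs(A), axis=0)

# Step 4: 80%/20% train/test split via an index array.
n_samples = A.shape[0]
idx = np.arange(n_samples)
n_train = int(0.8 * n_samples)
train_data, train_label = A[idx[:n_train]], y[idx[:n_train]]
test_data, test_label = A[idx[n_train:]], y[idx[n_train:]]
```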
Exercise 2
1. To solve the problem (5), we shall use the following method (denoted by ALG-LAB8). Assume that the training data contains $n_{\text{train}}$ samples.
For t = 1, 2, 3, . . . , do:
(a) Sample i uniformly at random from $\{1, 2, \ldots, n_{\text{train}}\}$.
(b) $w_{t+1} = \text{Proj}_C(w_t - \eta_t \nabla f_i(w_t))$.
The notation $\text{Proj}_C(z) = \arg\min_{u \in C} \|u - z\|_2$ denotes the orthogonal projection of a point z onto the set C. In other words, we wish to find a point $u^* \in C$ which is closest to z in terms of the ℓ2 distance. For specific examples of the set C, the orthogonal projection has a nice closed form.
2. When $C = \{w \in \mathbb{R}^d : \|w\|_\infty \le 1\}$, find an expression for $\text{Proj}_C(z)$. (Recall: for $w = [w_1\ w_2\ \ldots\ w_d]^T \in \mathbb{R}^d$, we have $\|w\|_\infty = \max\{|w_1|, |w_2|, \ldots, |w_d|\}$.) A sketch is given after this list.
3. Consider the hinge loss function $L_h$. Use the Python modules developed in the last lab to compute the loss function $L_h$ and the objective function value. Also, use the modules developed in the last lab to compute the gradient (or sub-gradient) of $f_i(w)$ for the loss function $L_h$. Denote the (sub-)gradient by $g_i(w) = \nabla_w f_i(w)$.
4. Define a module to compute the orthogonal projection onto the set C.
5. Modify the code template given in the last lab to implement ALG-LAB8. A sketch is given after this list.
6. In OPT1, use num_epochs = 500 and step size $\eta_t = \frac{1}{t}$. For each $\lambda \in \{10^{-3}, 10^{-2}, 0.1, 1, 10\}$, perform the following tasks:
• Plot the objective function value in every epoch. Use different colors for different λ values.
• Plot the test set accuracy in every epoch. Use different colors for different λ values.
• Plot the train set accuracy in every epoch. Use different colors for different λ values.
• Tabulate the final test set accuracy and train set accuracy for each λ value.
• Explain your observations.
7. Repeat the experiments (with num_epochs = 500 and with your modified stopping criterion) for different loss functions $L_l$ and $L_{sh}$. Explain your observations.
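For items 2-4 above, a minimal sketch is given below, assuming the standard hinge loss $L_h(y, t) = \max\{0, 1 - yt\}$ and taking $f_i(w) = \frac{\lambda}{2}\|w\|_2^2 + L_h(y_i, w^\top x_i)$, so that f(w) in (4) is the average of the $f_i$; if your last-lab modules fold the 1/n factor into $f_i$ (so that $f = \sum_i f_i$ as in (5)), scale accordingly. The names (proj_linf, hinge_loss, obj, grad_fi) are illustrative, not prescribed by the handout:

```python
import numpy as np

def proj_linf(z, r=1.0):
    # Projection onto C = {w : ||w||_inf <= r}: clip each coordinate to [-r, r].
    return np.clip(z, -r, r)

def hinge_loss(y, t):
    # Hinge loss L_h(y, t) = max(0, 1 - y*t), applied elementwise.
    return np.maximum(0.0, 1.0 - y * t)

def obj(w, data, label, lam):
    # Objective (4): (lam/2)||w||_2^2 + (1/n) * sum_i L_h(y_i, w^T x_i).
    return 0.5 * lam * np.dot(w, w) + np.mean(hinge_loss(label, data @ w))

def grad_fi(w, xi, yi, lam):
    # Sub-gradient of f_i(w) = (lam/2)||w||_2^2 + L_h(y_i, w^T x_i).
    g = lam * w
    if yi * np.dot(w, xi) < 1.0:   # hinge is active: subtract y_i * x_i
        g -= yi * xi
    return g
```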
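Item 5 refers to a code template from the last lab that is not reproduced here; the loop below is a hedged reconstruction of ALG-LAB8 built on the modules above (the epoch structure and the name alg_lab8 are assumptions):

```python
def alg_lab8(train_data, train_label, lam, num_epochs=500, proj=proj_linf, seed=0):
    # ALG-LAB8: projected stochastic sub-gradient descent with step eta_t = 1/t.
    n, d = train_data.shape
    w = np.zeros(d)
    t = 1
    obj_history = []
    rng = np.random.default_rng(seed)
    for epoch in range(num_epochs):
        for _ in range(n):             # one epoch = n sampled updates
            i = rng.integers(n)        # sample i uniformly at random
            g = grad_fi(w, train_data[i], train_label[i], lam)
            w = proj(w - (1.0 / t) * g)
            t += 1
        obj_history.append(obj(w, train_data, train_label, lam))
    return w, obj_history
```

Per-epoch train and test accuracies can be tracked similarly by recording np.mean(np.sign(data @ w) == label) for each split inside the epoch loop.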
Exercise 3
1. When $C = \{w \in \mathbb{R}^d : \|w\|_1 \le 1\}$, find an expression for $\text{Proj}_C(z)$. (Recall: for $w = [w_1\ w_2\ \ldots\ w_d]^T \in \mathbb{R}^d$, we have $\|w\|_1 = \sum_{i=1}^d |w_i|$.) A sketch is given after this list.
2. Consider the hinge loss function $L_h$. Use the Python modules developed in the last lab to compute the loss function $L_h$ and the objective function value. Also, use the modules developed in the last lab to compute the gradient (or sub-gradient) of $f_i(w)$ for the loss function $L_h$. Denote the (sub-)gradient by $g_i(w) = \nabla_w f_i(w)$.
3. Define a module to compute the orthogonal projection onto the set C.
4. In OPT1, use num_epochs = 500 and step size $\eta_t = \frac{1}{t}$. For each $\lambda \in \{10^{-3}, 10^{-2}, 0.1, 1, 10\}$, perform the following tasks:
• Plot the objective function value in every epoch. Use different colors for different λ values.
• Plot the test set accuracy in every epoch. Use different colors for different λ values.
• Plot the train set accuracy in every epoch. Use different colors for different λ values.
• Tabulate the final test set accuracy and train set accuracy for each λ value.
• Explain your observations.
5. Repeat the experiments (with num_epochs = 500 and with your modified stopping criterion) for different loss functions $L_l$ and $L_{sh}$. Explain your observations.
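Unlike the ℓ∞ case, the projection onto the ℓ1 ball is not coordinate-wise, but it does admit an exact O(d log d) computation via soft-thresholding with a data-dependent threshold (see Duchi et al., "Efficient Projections onto the ℓ1-Ball for Learning in High Dimensions", ICML 2008). A hedged sketch, with proj_l1 an illustrative name:

```python
import numpy as np

def proj_l1(z, r=1.0):
    # Euclidean projection of z onto C = {w : ||w||_1 <= r}.
    if np.sum(np.abs(z)) <= r:
        return z                                 # already inside the ball
    u = np.sort(np.abs(z))[::-1]                 # magnitudes, descending
    css = np.cumsum(u)
    k = np.arange(1, z.size + 1)
    rho = np.nonzero(u > (css - r) / k)[0][-1]   # last index with positive gap
    theta = (css[rho] - r) / (rho + 1.0)         # soft-threshold level
    return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)
```

The ALG-LAB8 sketch from Exercise 2 can then be reused unchanged by passing proj=proj_l1.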