
IE684, IEOR Lab

Lab 08 March 11, 2024

Instructions: (Please read carefully and follow them!)

Try to solve all problems on your own. If you have difficulties, ask the instructor or TAs.
In this session, we will carry the theme of decomposing a problem with optimization procedures that handle large data over to classification problems.
The implementation of the optimization algorithms in this lab will involve extensive use of the numpy Python package, so it would be useful for you to get to know some of its functionality. For details on numpy, please consult https://numpy.org/doc/stable/index.html.
For plotting purposes, please use the matplotlib.pyplot package. You can find examples at https://matplotlib.org/examples/.

Please follow the instructions given below to prepare your solution notebooks:

• Please use different notebooks for solving different Exercise problems.


• The notebook name for Exercise 1 should be YOURROLLNUMBER IE684 Lab08 Ex1.ipynb.
• Similarly, the notebook name for Exercise 2 should be YOURROLLNUMBER IE684 Lab08 Ex2.ipynb, and so on.

There are only 3 exercises in this lab. Try to solve all the problems on your own. If you have difficulties, ask the instructor or TAs.

You can either print the answers using the print command in your code or write the text in a separate text cell. To add text in your notebook, click +Text. Some questions require you to provide proper explanations; for such questions, write proper explanations in a text cell. Some questions require the answers to be written in LaTeX notation. (Write the comments and observations, with appropriate equations, in LaTeX only.) Some questions require plotting certain graphs. Please make sure that the plots are present in the submitted notebooks.

After completing this lab’s exercises, click File → Download .ipynb and save your files to your local laptop/desktop.
Create a folder with the name YOURROLLNUMBER IE684 Lab08 and copy your .ipynb files to the folder. Then zip
the folder to create YOURROLLNUMBER IE684 Lab08.zip. Then upload only the .zip file to Moodle. There will be
some penalty for students who do not follow the proper naming conventions in their submissions.

Please check the submission deadline announced on Moodle.


In the last lab, we developed an optimization method to solve the optimization problem associated with binary classification problems. Recall that for a dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$ where $x_i \in X \subseteq \mathbb{R}^d$, $y_i \in \{+1, -1\}$, we solve:

$$\min_{w \in \mathbb{R}^d} f(w) = \frac{\lambda}{2} \|w\|_2^2 + \frac{1}{n} \sum_{i=1}^{n} L(y_i, w^\top x_i) \qquad (1)$$

where we considered the following loss functions:

• $L_h(y_i, w^\top x_i) = \max\{0, 1 - y_i w^\top x_i\}$ (hinge)

• $L_l(y_i, w^\top x_i) = \log(1 + \exp(-y_i w^\top x_i))$ (logistic)

• $L_{sh}(y_i, w^\top x_i) = (\max\{0, 1 - y_i w^\top x_i\})^2$ (squared hinge)

Solving the optimization problem (1) facilitates learning a classification rule $h : X \to \{+1, -1\}$. We used the following prediction rule for a test sample $\hat{x}$:

$$h(\hat{x}) = \operatorname{sign}(w^\top \hat{x}) \qquad (2)$$
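For concreteness, here is a minimal numpy sketch of the three losses and the objective in (1); the vectorized forms and function names are illustrative assumptions, not the exact modules from the last lab:

import numpy as np

def margins(w, X, y):
    # Compute y_i * w^T x_i for all samples at once (rows of X are samples).
    return y * (X @ w)

def hinge_loss(m):          # L_h
    return np.maximum(0.0, 1.0 - m)

def logistic_loss(m):       # L_l
    return np.log1p(np.exp(-m))

def squared_hinge_loss(m):  # L_sh
    return np.maximum(0.0, 1.0 - m) ** 2

def objective(w, X, y, lam, loss):
    # f(w) in (1): (lam/2) * ||w||_2^2 + average loss over the n samples.
    return 0.5 * lam * (w @ w) + loss(margins(w, X, y)).mean()

The prediction rule (2) is then simply np.sign applied to X @ w for the test samples.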

In the last lab, we used a decomposition of f(w) and solved an equivalent problem of the following form:

$$\min_{w} f(w) = \min_{w} \sum_{i=1}^{n} f_i(w) \qquad (3)$$

In this lab, we will consider a constrained variant of the optimization problem (1).
For a dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$ where $x_i \in X \subseteq \mathbb{R}^d$, $y_i \in \{+1, -1\}$, we solve:

$$\min_{w \in \mathbb{R}^d} f(w) = \frac{\lambda}{2} \|w\|_2^2 + \frac{1}{n} \sum_{i=1}^{n} L(y_i, w^\top x_i), \quad \text{s.t. } w \in C \qquad (4)$$

where $C \subseteq \mathbb{R}^d$ is a closed convex set.


Hence we will develop an optimization method to solve the following equivalent constrained problem of (4):

$$\min_{w \in C} f(w) = \min_{w \in C} \sum_{i=1}^{n} f_i(w) \qquad (5)$$

Let’s start coding now!


Exercise 1 (Data Preparation) Load the wine dataset from the scikit-learn package (a loading sketch is given after this list). We will load the features into the matrix A such that the i-th row of A contains the features of the i-th sample. The label vector will be loaded into y.

1. Check the number of classes C and the class label values in the wine data. Check whether the class labels are from the set $\{0, 1, \ldots, C-1\}$ or from the set $\{1, 2, \ldots, C\}$.
2. When loading the labels into y, do the following:

• If the class labels are from the set $\{0, 1, \ldots, C-1\}$, convert classes $0, 2, 3, \ldots, C-1$ to $-1$.
• If the class labels are from the set $\{1, 2, \ldots, C\}$, convert classes $2, 3, \ldots, C$ to $-1$.

Thus, you will have class labels eventually belonging to the set $\{+1, -1\}$.

3. Normalize the columns of the matrix A so that each column has entries in the range $[-1, 1]$.
4. Create an index array of size equal to the number of samples. Use this index array to partition the data and labels into train and test splits. In particular, use the first 80% of the indices to create the training data and labels, and use the remaining 20% to create the test data and labels. Store them in the variables train_data, train_label, test_data, test_label.

5. Write a Python function that implements the prediction rule.


6. Write a Python function that takes as input the model parameter w, the data features, and the labels, and returns the accuracy on that data. (Use the predict function.)
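A minimal sketch of this data preparation pipeline, assuming the standard sklearn.datasets.load_wine loader (the code snippet referenced above); the max-absolute-value column scaling and the helper names predict/accuracy are illustrative assumptions, not the official solution:

import numpy as np
from sklearn.datasets import load_wine

# Load features into A (i-th row = features of the i-th sample) and labels into y.
wine = load_wine()
A, y = wine.data.astype(float), wine.target.copy()

C = len(np.unique(y))
print("class labels:", np.unique(y))   # wine labels are {0, 1, ..., C-1}

# Labels are from {0, ..., C-1}: class 1 stays +1, every other class becomes -1.
y = np.where(y == 1, 1, -1)

# Normalize each column of A to the range [-1, 1] (wine feature columns are nonzero).
A = A / np.abs(A).max(axis=0)

# 80/20 train/test split via an index array over the samples.
idx = np.arange(A.shape[0])
n_train = int(0.8 * len(idx))
train_data, train_label = A[idx[:n_train]], y[idx[:n_train]]
test_data, test_label = A[idx[n_train:]], y[idx[n_train:]]

def predict(w, data):
    # Prediction rule (2): h(x) = sign(w^T x), vectorized over the rows of data.
    return np.sign(data @ w)

def accuracy(w, data, label):
    # Fraction of samples whose predicted sign matches the given label.
    return np.mean(predict(w, data) == label)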


Exercise 2 An Optimization Algorithm

1. To solve the problem (5), we shall use the following method (denoted by ALG-LAB8). Assume that the training data contains $n_{\text{train}}$ samples.
For $t = 1, 2, 3, \ldots$, do:

(a) Sample i uniformly at random from $\{1, 2, \ldots, n_{\text{train}}\}$.
(b) $w_{t+1} = \operatorname{Proj}_C(w_t - \eta_t \nabla f_i(w_t))$.

The notation $\operatorname{Proj}_C(z) = \arg\min_{u \in C} \|u - z\|_2$ denotes the orthogonal projection of the point z onto the set C. In other words, we wish to find a point $u^* \in C$ which is closest to z in terms of $\ell_2$ distance. For specific examples of the set C, the orthogonal projection has a nice closed form.
2. When $C = \{w \in \mathbb{R}^d : \|w\|_\infty \leq 1\}$, find an expression for $\operatorname{Proj}_C(z)$. (Recall: for $w = [w_1\ w_2\ \ldots\ w_d]^\top \in \mathbb{R}^d$, we have $\|w\|_\infty = \max\{|w_1|, |w_2|, \ldots, |w_d|\}$.) A hedged sketch of this projection appears after this exercise.
3. Consider the hinge loss function $L_h$. Use the Python modules developed in the last lab to compute the loss function $L_h$ and the objective function value. Also, use the modules developed in the last lab to compute the gradient (or sub-gradient) of $f_i(w)$ for the loss function $L_h$. Denote the (sub-)gradient by $g_i(w) = \nabla_w f_i(w)$.
4. Define a module to compute the orthogonal projection onto the set C.
5. Modify the code template given in the last lab to implement ALG-LAB8. Use the following code template.

import numpy as np

def OPT1(data, label, lam, num_epochs):  # 'lam', since 'lambda' is a reserved word in Python

    t = 1
    # initialize w
    # w = ???
    arr = np.arange(data.shape[0])
    for epoch in range(num_epochs):
        np.random.shuffle(arr)       # shuffle every epoch
        for i in np.nditer(arr):     # pass through the data points
            # step = ???
            # Update w using w <- Proj_C(w - step * g_i(w))
            t = t + 1
            if t > 1e4:
                t = 1
    return w

6. In OPT1, use num_epochs = 500 and step = $\frac{1}{t}$. For each $\lambda \in \{10^{-3}, 10^{-2}, 0.1, 1, 10\}$, perform the following tasks:

• Plot the objective function value in every epoch. Use different colors for different λ values.
• Plot the test set accuracy in every epoch. Use different colors for different λ values.
• Plot the train set accuracy in every epoch. Use different colors for different λ values.
• Tabulate the final test set accuracy and train set accuracy for each λ value.
• Explain your observations.
7. Repeat the experiments (with num_epochs = 500 and with your modified stopping criterion) for different loss functions $L_l$ and $L_{sh}$. Explain your observations.
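A minimal sketch of the pieces items 2-4 ask for. The projection onto the $\ell_\infty$ ball clips each coordinate to $[-1, 1]$; the sub-gradient below assumes the decomposition $f_i(w) = \frac{\lambda}{2n}\|w\|_2^2 + \frac{1}{n} L_h(y_i, w^\top x_i)$, which is one common reading of (5) — your decomposition from the last lab may differ:

import numpy as np

def proj_linf_ball(z, radius=1.0):
    # Projection onto C = {w : ||w||_inf <= radius}: clip every coordinate.
    return np.clip(z, -radius, radius)

def hinge_subgradient(w, x_i, y_i, lam, n):
    # A sub-gradient g_i(w) of f_i(w) = (lam/(2n))||w||^2 + (1/n) L_h(y_i, w^T x_i).
    g = (lam / n) * w
    if y_i * (w @ x_i) < 1.0:   # hinge active; otherwise the sub-gradient of L_h is 0
        g = g - (y_i / n) * x_i
    return g

With these pieces, step (b) of ALG-LAB8 becomes
w = proj_linf_ball(w - step * hinge_subgradient(w, data[i], label[i], lam, n)).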


Exercise 3 A different constraint set

1. When $C = \{w \in \mathbb{R}^d : \|w\|_1 \leq 1\}$, find an expression for $\operatorname{Proj}_C(z)$. (Recall: for $w = [w_1\ w_2\ \ldots\ w_d]^\top \in \mathbb{R}^d$, we have $\|w\|_1 = \sum_{i=1}^{d} |w_i|$.) A hedged sketch of this projection appears after this exercise.
2. Consider the hinge loss function $L_h$. Use the Python modules developed in the last lab to compute the loss function $L_h$ and the objective function value. Also, use the modules developed in the last lab to compute the gradient (or sub-gradient) of $f_i(w)$ for the loss function $L_h$. Denote the (sub-)gradient by $g_i(w) = \nabla_w f_i(w)$.
3. Define a module to compute the orthogonal projection onto the set C.
4. In OPT1, use num_epochs = 500 and step = $\frac{1}{t}$. For each $\lambda \in \{10^{-3}, 10^{-2}, 0.1, 1, 10\}$, perform the following tasks:

• Plot the objective function value in every epoch. Use different colors for different λ values.
• Plot the test set accuracy in every epoch. Use different colors for different λ values.
• Plot the train set accuracy in every epoch. Use different colors for different λ values.
• Tabulate the final test set accuracy and train set accuracy for each λ value.
• Explain your observations.

5. Repeat the experiments (with num_epochs = 500 and with your modified stopping criterion) for different loss functions $L_l$ and $L_{sh}$. Explain your observations.
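Unlike the $\ell_\infty$ case, the projection onto the $\ell_1$ ball is not a coordinate-wise clip; a standard way to compute it is soft-thresholding with a threshold found from the sorted magnitudes (the sort-based algorithm of Duchi et al., 2008). A minimal sketch for the radius-1 ball of this exercise, offered as one reasonable implementation rather than the required derivation:

import numpy as np

def proj_l1_ball(z, radius=1.0):
    # Euclidean projection of z onto C = {w : ||w||_1 <= radius}.
    z = np.asarray(z, dtype=float)
    if np.abs(z).sum() <= radius:
        return z.copy()                          # already feasible
    u = np.sort(np.abs(z))[::-1]                 # magnitudes, sorted descending
    css = np.cumsum(u)
    k = np.arange(1, z.size + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)      # soft-threshold level
    return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

Swapping the projection module from Exercise 2 for proj_l1_ball is the only change needed in the ALG-LAB8 update.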
