
FUNDAMENTALS OF OPTIMIZATION
Week 2: Convex Optimization

Outline

1. Definitions
• Convex set

• Convex combination

• Convex function

• Exercises

2. Unconstrained Optimization

3. Constrained Optimization

Convex set

• A set of points C ⊆ ℝⁿ is convex if for every pair of points x, y ∈ C, any point on the line segment between x and y is also in C
• That is, if x, y ∈ C, then tx + (1 − t)y ∈ C for all t ∈ [0,1]

(Figures: a convex set vs. a set that is not convex)

Examples of convex set

• Line segments: C = {x + t(y − x) | t ∈ [0,1]} for some x, y ∈ ℝⁿ
• Lines, planes, hyperplanes, etc. also define convex sets

Examples of convex set

• Half spaces: C = {x ∈ ℝⁿ | wᵀx + b ≤ 0} for some w ∈ ℝⁿ and b ∈ ℝ
• Balls of radius ε: C = {x ∈ ℝⁿ | ‖x‖₂ ≤ ε} for some ε ≥ 0, where

  ‖x‖₂ = √(xᵀx) = √(Σᵢ₌₁ⁿ xᵢ²)

• This is called the Euclidean norm; ‖x − y‖₂ is also called the Euclidean distance, because it equals the length of the line segment between the points x and y

Convex combination

• We say that y ∈ ℝⁿ is a convex combination of the points x(1), …, x(k) ∈ ℝⁿ if y = λ₁x(1) + ⋯ + λₖx(k) for some choice of λ₁, …, λₖ ∈ [0,1] such that λ₁ + ⋯ + λₖ = 1
• Let C be the set of all points y that can be obtained as a convex combination of the x(1), …, x(k)
• In the special case k = 2, C is just a line segment
• C is a convex set, called the convex hull of x(1), …, x(k) (see the numerical sketch below)
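
As a quick numerical illustration (not from the original slides), the sketch below draws a random convex combination of three points in the plane with numpy; the weights are sampled so that they are nonnegative and sum to 1, so the resulting point always lies in their convex hull (here, a triangle):

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # the points x(1), x(2), x(3)

lam = rng.dirichlet(np.ones(3))   # lambda_i >= 0 with lambda_1 + lambda_2 + lambda_3 = 1
y = lam @ X                       # y = lambda_1*x(1) + lambda_2*x(2) + lambda_3*x(3)
print(lam, y)                     # y lies inside the triangle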

Convex function

• A function f: C → ℝ is convex if C is a convex set and

  f(tx + (1 − t)y) ≤ t·f(x) + (1 − t)·f(y)

  for all x, y ∈ C and t ∈ [0,1] (a quick check follows below)

• f is called concave if −f is convex
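
As a quick check of the definition (not on the original slide), f(x) = x² is convex on ℝ, since for all x, y and t ∈ [0,1]:

  t·f(x) + (1 − t)·f(y) − f(tx + (1 − t)y) = t·x² + (1 − t)·y² − (tx + (1 − t)y)² = t(1 − t)(x − y)² ≥ 0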

Exercises

1. Is S = {(x, y) | 0 ≤ x ≤ a, 0 ≤ y ≤ b} a convex set?
2. Is the union of two convex sets a convex set?
3. Is S = {x ∈ ℝⁿ | ‖x − a‖ ≤ r} a convex set?
4. Is S = {(x, y, z) | z ≥ x² + y²} a convex set?

Convex properties

• Nonnegative weighted sums of convex functions are convex, i.e., if f₁: ℝⁿ → ℝ and f₂: ℝⁿ → ℝ are convex functions and c₁, c₂ ≥ 0, then

  g(x) = c₁·f₁(x) + c₂·f₂(x)

  is a convex function.

• The pointwise maximum of convex functions is convex, i.e., if f₁: ℝⁿ → ℝ and f₂: ℝⁿ → ℝ are convex functions, then

  g(x) = max{f₁(x), f₂(x)}

  is a convex function (see the example below).
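
For example, |x| = max(x, −x) is convex: x and −x are linear (hence convex) functions, so their pointwise maximum is convex by the second property; the same argument shows that max(x, 0) is convex.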

Convex properties

• A differentiable function f: ℝⁿ → ℝ is convex on a convex set C if and only if

  f(x) ≥ f(y) + ∇f(y)ᵀ(x − y)

  for all x, y ∈ C (see the example below)
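
For example, for f(x) = x² the condition reads x² ≥ y² + 2y(x − y), which rearranges to (x − y)² ≥ 0 and therefore always holds.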

Exercises

• Which of the following functions are convex?

  • exp(x)
  • exp(−x)
  • log(x)
  • sin(x)
  • x²
  • x⁸
  • max(x, 0)
  • x
  • |x|

Outline

1. Definitions

2. Unconstrained Optimization
• Introduction to unconstrained optimization

• Descent method

• Gradient descent method

• Newton method

3. Constrained Optimization

Unconstrained convex optimization

• Unconstrained, smooth convex optimization problem:

  min f(x)

• f: Rn → R is convex and twice differentiable
• dom f = Rn: no constraint
• Assumption: the problem is solvable, with f* = minₓ f(x) and x* = argminₓ f(x)
• To find x*, one could solve the equation ∇f(x*) = 0: not easy to solve analytically
• An iterative scheme is preferred: compute a minimizing sequence x(0), x(1), … s.t. f(x(k)) → f(x*) as k → ∞
• The algorithm stops at some point x(k) when the error is within an acceptable tolerance: f(x(k)) − f* ≤ ε

Local minimizer

• x* is a local minimizer for f: Rn → R if f(x*) ≤ f(x) for all x with ||x* − x|| ≤ ε (ε > 0 is a constant)
• x* is a global minimizer for f: Rn → R if f(x*) ≤ f(x) for all x ∈ Rn

(Figures: f(x) = e^x has no minimizer; f(x) = −x + e^(−x) has no minimizer)

Local minimizer

• x* is a local minimizer for f: Rn → R if f(x*) ≤ f(x) for all x with ||x* − x|| ≤ ε (ε > 0 is a constant)
• x* is a global minimizer for f: Rn → R if f(x*) ≤ f(x) for all x ∈ Rn

(Figures: f(x) = e^x + e^(−x) − 3x² + x has two local minimizers and one global minimizer; f(x) = e^x + e^(−x) − 3x² has two global minimizers)

Local minimizer

• x* is a local minimizer for f: Rn → R if f(x*) ≤ f(x) for all x with ||x* − x|| ≤ ε (ε > 0 is a constant)
• Theorem (Necessary condition for a local minimum): If x* is a local minimizer for f: Rn → R, then ∇f(x*) = 0 (x* is also called a stationary point of f)

(Figures: f(x) = e^x + e^(−x) − 3x² + x has two local minimizers and one global minimizer; f(x) = e^x + e^(−x) − 3x² has two global minimizers)

Local minimizer

Example
• f(x, y) = x² + y² − 2xy + x
• ∇f(x, y) = (2x − 2y + 1, 2y − 2x)ᵀ = 0 has no solution

→ there is no minimizer of f(x, y)

Local minimizer

• Theorem (Necessary condition for a local minimum): If x* is a local minimizer for f: Rn → R, then ∇f(x*) = 0 (x* is also called a stationary point of f)
• Theorem (Sufficient condition for a local minimum): Assume x* is a stationary point and that ∇²f(x*) is positive definite; then x* is a local minimizer

• ∇²f(x) is the Hessian matrix of f:

             [ ∂²f(x)/∂x₁∂x₁   ∂²f(x)/∂x₁∂x₂   ...   ∂²f(x)/∂x₁∂xₙ ]
  ∇²f(x) =   [ ∂²f(x)/∂x₂∂x₁   ∂²f(x)/∂x₂∂x₂   ...   ∂²f(x)/∂x₂∂xₙ ]
             [      ...             ...        ...        ...      ]
             [ ∂²f(x)/∂xₙ∂x₁   ∂²f(x)/∂xₙ∂x₂   ...   ∂²f(x)/∂xₙ∂xₙ ]

Local minimizer

• A symmetric matrix A ∈ Rn×n is called positive definite if det(Aᵢ) > 0 for i = 1, …, n, where Aᵢ is the leading principal i × i submatrix of A (rows and columns 1 through i). This determinant test is Sylvester's criterion (see the sketch below).
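
A small numpy sketch of this determinant test (the helper name is illustrative, not from the slides):

import numpy as np

def is_positive_definite(A):
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Sylvester's criterion: every leading principal minor must be positive
    return all(np.linalg.det(A[:i, :i]) > 0 for i in range(1, n + 1))

print(is_positive_definite([[2, 0], [0, 2]]))     # True
print(is_positive_definite([[2, -2], [-2, -2]]))  # False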

Examples

Example: f(x, y) = e^(x² + y²)

∇f(x, y) = (2x·e^(x² + y²), 2y·e^(x² + y²))ᵀ = 0 has the unique solution x* = (0, 0)

∇²f(x*) = [[2, 0], [0, 2]] is positive definite → (0, 0) is a minimizer of f

Examples

• Example: f(x, y) = x² − y² − 2xy + x

  ∇f(x, y) = (2x − 2y + 1, −2x − 2y)ᵀ = 0 has the unique solution x* = (−1/4, 1/4)

  ∇²f(x*) = [[2, −2], [−2, −2]] is not positive definite
  → the sufficient condition cannot conclude anything about x*

Descent method

Determine starting point x(0) ∈ Rn;
k ← 0;
while (stop condition not reached) {
    Determine a search direction pk ∈ Rn;
    Determine a step size αk > 0 s.t. f(x(k) + αk·pk) < f(x(k));
    x(k+1) ← x(k) + αk·pk;
    k ← k + 1;
}

Stop condition may be
• ||∇f(x(k))|| ≤ ε
• ||x(k+1) − x(k)|| ≤ ε
• k > K (maximum number of iterations)

Gradient descent method

• Gradient descent schema

  x(k) = x(k-1) − αk·∇f(x(k-1))

  init x(0);
  k = 1;
  while (stop condition not reached) {
      specify step size αk;
      x(k) = x(k-1) − αk·∇f(x(k-1));
      k = k + 1;
  }

• αk might be specified in such a way that f(x(k-1) − αk·∇f(x(k-1))) is minimized (exact line search): solve ∂f/∂αk = 0 (see the sketch below)
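
Below is a minimal gradient descent sketch in Python (not from the original slides; the step size α = 0.1, the starting point x0 = 1.0, and the tolerance are illustrative assumptions), applied to the exercise f(x) = x² + 5·sin(x) from the following slides:

import math

def gradient_descent(df, x0, alpha=0.1, eps=1e-8, max_iter=1000):
    x = x0
    for k in range(max_iter):
        g = df(x)              # gradient at the current point
        if abs(g) <= eps:      # stop condition: |f'(x(k))| <= eps
            break
        x = x - alpha * g      # x(k) = x(k-1) - alpha_k * f'(x(k-1))
    return x

df = lambda x: 2*x + 5*math.cos(x)    # derivative of f(x) = x^2 + 5*sin(x)
print(gradient_descent(df, x0=1.0))   # converges to roughly x* = -1.11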

Minimize f(x) = x⁴ − 2x³ − 64x² + 2x + 63

Minimize f(x) = x⁴ + 3x² − 10x + 4

Minimize f(x) = x² + 5·sin(x)

Minimize f(x, y) = x² + y² + xy − x − y

Minimize f(x) = x₁² + x₂² + x₃² − x₁x₂ − x₂x₃ + x₁ + x₃

• αk might be specified in such a way that f(x(k-1) − αk·∇f(x(k-1))) is minimized: solve ∂f/∂αk = 0

Newton method

• The second-order Taylor approximation g of f at x is

  f(x + h) ≈ g(x + h) = f(x) + hᵀ∇f(x) + (1/2)·hᵀ∇²f(x)·h

• which is a convex quadratic function of h
• g(x + h) is minimized when ∂g/∂h = 0 → h = −∇²f(x)⁻¹·∇f(x)

Generate x(0); // starting point
k = 0;
while (stop condition not reached) {
    x(k+1) ← x(k) − ∇²f(x(k))⁻¹·∇f(x(k));
    k = k + 1;
}

Minimize f(x) = x₁² + x₂² + x₃² − x₁x₂ − x₂x₃ + x₁ + x₃

import numpy as np

def newton(f, df, Hf, x0):
    x = x0
    for i in range(10):
        iH = np.linalg.inv(Hf(x))   # inverse of the Hessian at x
        D = np.array(df(x))         # gradient as a numpy vector
        print('df = ', D)
        y = iH.dot(D)               # Newton step: Hf(x)^(-1) . grad f(x)
        if np.linalg.norm(y) == 0:  # stop when the step vanishes
            break
        x = x - y
        print('Step ', i, ': ', x, ' f = ', f(x))

Minimize f(x) = x₁² + x₂² + x₃² − x₁x₂ − x₂x₃ + x₁ + x₃

def main():
    print('main start....')
    # function f to be minimized
    f = lambda x: x[0]**2 + x[1]**2 + x[2]**2 - x[0]*x[1] - x[1]*x[2] + x[0] + x[2]
    # gradient of f
    df = lambda x: [2*x[0] + 1 - x[1], -x[0] + 2*x[1] - x[2], -x[1] + 2*x[2] + 1]
    # Hessian of f (constant, since f is quadratic)
    Hf = lambda x: [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
    x0 = np.array([0, 0, 0])
    newton(f, df, Hf, x0)

if __name__ == '__main__':
    main()
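
Since f here is quadratic, its second-order Taylor approximation is exact, so the Newton iteration reaches the minimizer in a single step: the run above stops at x* = (−1, −1, −1) with f* = −1, and the loop then exits because the next step is zero.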

Outline

1. Definitions

2. Unconstrained Optimization

3. Constrained Optimization
• General constrained optimization problem

• Lagrange multiplier method

General constrained optimization problem

• Optimization problem in the standard form

  (P) minimize f(x)
      s.t. gᵢ(x) = 0, ∀i ∈ {1, 2, …, m}
      x ∈ X ⊆ Rn

  with x ∈ Rn, and assume D = ∩ᵢ₌₁ᵐ dom gᵢ is not empty.
• Denote f* the optimal value of f(x)
• If f and the gᵢ (i = 1, 2, …, m) are convex functions, (P) is a convex optimization problem.

Lagrangian function

• Optimization problem in the standard form

  (P) minimize f(x)
      s.t. gᵢ(x) = 0, ∀i ∈ {1, 2, …, m}
      x ∈ X ⊆ Rn

• The Lagrangian function of the above problem is defined as L: Rn × Rm → R,

  L(x, λ) = f(x) + Σᵢ₌₁ᵐ λᵢ·gᵢ(x)

Lagrangian function - properties

• Optimization problem in the standard form

  (P) minimize f(x)
      s.t. gᵢ(x) = 0, ∀i ∈ {1, 2, …, m}
      x ∈ X ⊆ Rn

• The Lagrangian function of the above problem is L: Rn × Rm → R,

  L(x, λ) = f(x) + Σᵢ₌₁ᵐ λᵢ·gᵢ(x)

• Theorem: If x* is an optimal solution of the problem, then there exists λ* such that ∇ₓ,λ L(x*, λ*) = 0

Lagrange multiplier method

• The method of Lagrange multipliers is a technique in mathematics to find the local maxima or minima of a function f(x₁, x₂, …, xₙ) subject to constraints gᵢ(x₁, …, xₙ) = 0, ∀i ∈ {1, …, m}.
• Method:
  • Step 1: Solve the following system

    ∇f(x₁, …, xₙ) = Σᵢ₌₁ᵐ λᵢ·∇gᵢ(x₁, …, xₙ)
    g₁(x₁, …, xₙ) = 0
    …
    gₘ(x₁, …, xₙ) = 0

  • Step 2: This yields candidate values of (x₁, x₂, …, xₙ), which can be plugged into f(x₁, x₂, …, xₙ) to get the extremum value if it exists.

Example 1:

• Problem: Find the maximum and minimum of f(x, y) = 5x − 3y subject to the constraint x² + y² = 136.
• Solution:
  • The region of possible solutions lies on a disk of radius √136, which is a closed and bounded region: −√136 ≤ x, y ≤ √136
  • The Lagrangian function: L(x, y, λ) = 5x − 3y − λ(x² + y² − 136)
  • As a result of the theorem, we have

    ∂L/∂x = 5 − 2λx = 0
    ∂L/∂y = −3 − 2λy = 0
    x² + y² = 136

  • Notice that we can't have λ = 0, since that would not satisfy the first two equations. So, since we know that λ ≠ 0, we can solve the first two equations for x and y respectively. This gives

    x = 5/(2λ),  y = −3/(2λ)
Example 1

• Problem: Find the maximum and minimum of f(x, y) = 5x − 3y subject to the constraint x² + y² = 136.
• Solution:
  • Plugging these into the constraint gives

    25/(4λ²) + 9/(4λ²) = 136

  • We can solve this for λ:

    λ² = 1/16  →  λ = ±1/4

  • Now that we know λ, we can find the points that will be potential maximums and/or minimums:
    • If λ = −1/4 we get x = −10, y = 6
    • If λ = 1/4 we get x = 10, y = −6
  • So, the maximum of f is f(10, −6) = 68 and the minimum is f(−10, 6) = −68
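
A short symbolic check of this example (a sketch assuming the sympy library is available; the variable names are illustrative):

import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = 5*x - 3*y
g = x**2 + y**2 - 136

L = f - lam*g   # the Lagrangian from the previous slide
candidates = sp.solve([sp.diff(L, x), sp.diff(L, y), g], [x, y, lam], dict=True)
for s in candidates:
    print(s, ' f =', f.subs(s))
# prints the two candidates: (10, -6) with f = 68 and (-10, 6) with f = -68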

Example 2:

• Problem:
  • Objective function: maximize f(x, y) = xy
  • Constraint: g(x, y) = 10x + 20y − 400 = 0
• Solution:
  • Form the Lagrange function:
    L(x, y, μ) = f(x, y) − μ·g(x, y)
    L(x, y, μ) = xy − μ(10x + 20y − 400)
  • Set each first-order partial derivative equal to zero:

    ∂L/∂x = y − 10μ = 0
    ∂L/∂y = x − 20μ = 0
    ∂L/∂μ = −(10x + 20y − 400) = 0

Example 2:

• Problem:
  • Objective function: maximize f(x, y) = xy
  • Constraint: g(x, y) = 10x + 20y − 400 = 0
• Solution:
  • Setting each first-order partial derivative equal to zero gives y = 10μ and x = 20μ; substituting into the constraint: 10(20μ) + 20(10μ) − 400 = 0 → μ = 1
  • So, x = 20, y = 10, and the maximum is f(20, 10) = 200

Example 3:

• Problem:
  • Objective function: maximize f(x, y) = x + y
  • Constraint: g(x, y) = x² + y² − 2 = 0
• Solution:
  • Form the Lagrange function:
    L(x, y, λ) = x + y + λ(x² + y² − 2)
  • Set each first-order partial derivative equal to zero:

    ∂L/∂x = 1 + 2λx = 0
    ∂L/∂y = 1 + 2λy = 0
    ∂L/∂λ = x² + y² − 2 = 0

Example 3

• Problem:
  • Objective function: maximize f(x, y) = x + y
  • Constraint: g(x, y) = x² + y² − 2 = 0
• Solution:
  • Set each first-order partial derivative equal to zero:
    ∂L/∂x = 1 + 2λx = 0
    ∂L/∂y = 1 + 2λy = 0
    ∂L/∂λ = x² + y² − 2 = 0
  • The first two equations give x = y = −1/(2λ); combined with the constraint, (x, y) ∈ {(1, 1), (−1, −1)}
  • So, the maximum is attained at (x, y) = (1, 1), where f(1, 1) = 2

THANK YOU !
