Chapter 1 - Theory of Maxima and Minima
min_x f(x),  f : R^n -> R,  x = [x1 x2 ... xn]^T  (1.1)

subject to:

gi(x) = 0,  i = 1, ..., m  (1.4)

hj(x) ≤ 0,  j = 1, ..., p  (1.5)
The goal of the problem is to find the vector of parameters x which minimizes (or maximizes) a given scalar function, possibly subject to some restrictions on the allowed parameter values. The function f to be optimized is termed the objective function; the elements of the vector x are the control or decision variables; the restrictions (1.4) and (1.5) are the equality and inequality constraints.
The value x* of the variable which solves the problem is a minimizer (or maximizer) of the function f subject to the constraints (1.4) and (1.5), and f(x*) is the minimum (or maximum) value of the function subject to the same constraints.
If the number of constraints m + p is zero, the problem is called an unconstrained optimization problem.
Example 1.1 An example of a constrained optimization problem is:

min (1 - x1)^2 + (1 - x2)^2  (1.6)

subject to:

x1 + x2 - 1 ≤ 0  (1.7)

x1^3 - x2 ≤ 0  (1.8)
A point x0 is a strong local minimum if there exists some ε > 0 such that:

f(x0) < f(x),  when 0 < |x - x0| < ε  (1.11)

A point x0 is a weak local minimum if there exists some ε > 0 such that:

f(x0) ≤ f(x),  when |x - x0| < ε  (1.12)
Figure 1.3: Minimum points. x1 : weak local minimum, x2 : global minimum, x3 : strong local minimum
The gradient of a multivariable function f at a point x0 is the vector of first partial derivatives:

∇f(x0) = [ ∂f/∂x1  ∂f/∂x2  ...  ∂f/∂xn ]^T  (1.13)

A stationary point of a multivariable function is a point where the gradient is zero:

∇f(x0) = 0  (1.14)
A stationary point of a single-variable function is a point where the first derivative f'(x0) equals zero.
Example 1.2 Consider the cases in Figure 1.4.
Figure 1.4: Plots of f(x) on an interval [a, b]; cases a), b), c)
If the first k - 1 derivatives of f vanish at a stationary point x0, the Taylor expansion of f around x0 reduces to:

f(x) = f(x0) + (1/k!) f^(k)(x0)(x - x0)^k  (1.17)
If k is even, then (x - x0)^k is positive. Thus, if f^(k)(x0) > 0, then f(x0) is a minimum; if f^(k)(x0) < 0, then f(x0) is a maximum (Figure 1.5).
If k is odd, then (x - x0)^k changes sign: it is positive for x > x0 and negative for x < x0. If f^(k)(x0) > 0, the second term in equation (1.17) is positive for x > x0 and negative for x < x0. If f^(k)(x0) < 0, the second term in equation (1.17) is negative for x > x0 and positive for x < x0. In either case the stationary point is an inflection point (Figure 1.5).
Example 1.3 Find and classify the stationary points of the function:

f(x) = x^5/5 - x^3/3  (1.18)

The first derivative is:

f'(x) = x^4 - x^2  (1.19)

The stationary points are obtained by setting the first derivative equal to zero:

x^4 - x^2 = x^2(x^2 - 1) = 0,  with solutions x1 = 1, x2 = -1, x3 = x4 = 0  (1.20)

The second derivative:

f''(x) = 4x^3 - 2x  (1.21)

takes the values f''(1) = 2 > 0 and f''(-1) = -2 < 0.  (1.22)
Because f''(1) > 0 and f''(-1) < 0, the stationary point x1 = 1 is a local minimum and x2 = -1 is a local maximum. Since the second derivative is zero at x3,4 = 0, an analysis of higher-order derivatives is necessary. The third derivative of f:

f'''(x) = 12x^2 - 2  (1.23)

is nonzero at x3,4 = 0. Since the order of the first nonzero derivative is 3, i.e. it is odd, the stationary points x3 = x4 = 0 are inflection points. A plot of the function showing the local minimum, maximum and inflection points is shown in Figure 1.6.
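For polynomials, the higher-order derivative test used above can be mechanized exactly. The sketch below stores f(x) = x^5/5 - x^3/3 as a coefficient list with exact rational coefficients and classifies a stationary point by the order and sign of its first nonvanishing derivative; the helper names are illustrative.

```python
from fractions import Fraction

# f(x) = x^5/5 - x^3/3, stored lowest-degree first with exact coefficients
f = [Fraction(0), Fraction(0), Fraction(0), Fraction(-1, 3), Fraction(0), Fraction(1, 5)]

def deriv(p):
    # coefficient list of the derivative polynomial
    return [k * c for k, c in enumerate(p)][1:]

def evaluate(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def classify(p, x0):
    """Classify a stationary point x0 of polynomial p by the first
    nonzero derivative at x0 (the higher-order derivative test)."""
    d, order = deriv(p), 1
    while True:
        v = evaluate(d, x0)
        if v != 0:
            break
        d, order = deriv(d), order + 1
    if order % 2 == 1:                 # odd order: inflection point
        return "inflection"
    return "minimum" if v > 0 else "maximum"

print(classify(f, 1), classify(f, -1), classify(f, 0))   # minimum maximum inflection
```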
Figure 1.6: Plot of f(x) = x^5/5 - x^3/3
Example 1.4 Find the extrema of the function:

f(x) = 4x - x^3/3  (1.24)

on the interval:

-3 ≤ x ≤ 1  (1.25)

The first derivative is:

f'(x) = 4 - x^2  (1.26)

The stationary points of the function are x10 = -2 and x20 = 2. Since the variable is constrained by (1.25) and x20 is out of the bounds, we shall analyze the other stationary point and the boundaries. The second derivative:

f''(x10) = -2x10 = -2(-2) = 4 > 0  (1.27)

is positive at x10, thus -2 is a local minimum, as shown in Figure 1.7. According to the theorem of Weierstrass, the function must have a maximum value on the interval [-3, 1]. On the boundaries, the function takes the values f(-3) = -3 and f(1) = 11/3. Thus, the point x = 1 is the maximizer on [-3, 1].
Figure 1.7: Plot of the function on the interval [-3, 1]
If f(x1, x2) has a local extremum at a point (x10, x20) and has continuous partial derivatives at this point, then:

fx1(x10, x20) = 0,  fx2(x10, x20) = 0  (1.30)
The second partial derivatives test classifies the point as a local maximum or local minimum.
Define the second derivative test discriminant as:

D2 = fx1x1 fx2x2 - (fx1x2)^2  (1.31)

i.e. the determinant of the matrix of second partial derivatives:

H2 = [ fx1x1  fx1x2 ; fx1x2  fx2x2 ]  (1.32)

If D2 > 0 and fx1x1 > 0, the stationary point is a local minimum; if D2 > 0 and fx1x1 < 0, it is a local maximum; if D2 < 0, it is a saddle point.
Example 1.5 Locate the stationary points of the function:

f(x1, x2) = x1^2 - x2^2  (1.33)

Setting the first partial derivatives equal to zero:

fx1 = 2x1 = 0  (1.34)

fx2 = -2x2 = 0  (1.35)

The function has only one stationary point: (x10, x20) = (0, 0). Compute the second derivatives:

fx1x1 = 2,  fx1x2 = 0,  fx2x2 = -2  (1.36)

and the discriminant:

D2 = | fx1x1  fx1x2 ; fx1x2  fx2x2 | = fx1x1 fx2x2 - (fx1x2)^2 = 2(-2) - 0^2 = -4 < 0  (1.37)
According to the second derivative test, the point (0, 0) is a saddle point. A mesh and contour plot of the function
is shown in Figure 1.8.
Figure 1.8: Mesh and contour plot of f(x1, x2) = x1^2 - x2^2
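The discriminant of the second derivative test can also be estimated numerically. Below is a minimal sketch with central finite differences, applied to f(x1, x2) = x1^2 - x2^2 at the stationary point (0, 0); for a quadratic function these estimates are exact up to rounding.

```python
def f(x1, x2):
    return x1**2 - x2**2

h = 1e-4

def second_partials(f, x1, x2):
    # Central finite-difference estimates of fx1x1, fx2x2 and fx1x2.
    fxx = (f(x1 + h, x2) - 2 * f(x1, x2) + f(x1 - h, x2)) / h**2
    fyy = (f(x1, x2 + h) - 2 * f(x1, x2) + f(x1, x2 - h)) / h**2
    fxy = (f(x1 + h, x2 + h) - f(x1 + h, x2 - h)
           - f(x1 - h, x2 + h) + f(x1 - h, x2 - h)) / (4 * h**2)
    return fxx, fyy, fxy

fxx, fyy, fxy = second_partials(f, 0.0, 0.0)
D2 = fxx * fyy - fxy**2
print(fxx, fyy, fxy, D2)    # 2.0 -2.0 0.0 -4.0: D2 < 0, so (0, 0) is a saddle
```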
Example 1.6 Locate the stationary points of the function:

f(x1, x2) = -x1^2 - x2^2  (1.38)

Setting the first partial derivatives equal to zero:

fx1 = -2x1 = 0  (1.39)

fx2 = -2x2 = 0  (1.40)

The function has only one stationary point: (x10, x20) = (0, 0). The second derivatives are:

fx1x1 = -2 < 0,  fx1x2 = 0,  fx2x2 = -2  (1.41)

and the discriminant:

D2 = | fx1x1  fx1x2 ; fx1x2  fx2x2 | = fx1x1 fx2x2 - (fx1x2)^2 = (-2)(-2) - 0^2 = 4 > 0  (1.42)
Thus, the function has a maximum at (0, 0), because fx1 x1 < 0 and D2 > 0. The graph of the function is shown
in Figure 1.9.
Figure 1.9: Mesh plot of f(x1, x2) = -x1^2 - x2^2
Example 1.7 Locate the stationary points of the function:

f(x1, x2) = x1^2 + 2x1x2 + x2^4  (1.43)

Setting the first partial derivatives equal to zero:

fx1 = 2x1 + 2x2 = 0  (1.44)

fx2 = 2x1 + 4x2^3 = 0  (1.45)

the stationary points are: (0, 0), (1/√2, -1/√2) and (-1/√2, 1/√2).
The second derivatives are:

fx1x1 = 2,  fx1x2 = 2,  fx2x2 = 12x2^2  (1.46)

and the discriminant:

D2 = | fx1x1  fx1x2 ; fx1x2  fx2x2 | = | 2  2 ; 2  12x2^2 | = 24x2^2 - 4  (1.47)

At (0, 0), D2 = -4 < 0, so the origin is a saddle point. At (1/√2, -1/√2) and (-1/√2, 1/√2), D2 = 8 > 0 and fx1x1 = 2 > 0, so both points are local minima.
Figure 1.10: Mesh and contour plot of f(x1, x2) = x1^2 + 2x1x2 + x2^4
where H2 is the Hessian matrix defined by:

H2 = [ fx1x1  fx1x2  ...  fx1xn ;
       fx2x1  fx2x2  ...  fx2xn ;
       ...
       fxnx1  fxnx2  ...  fxnxn ]  (1.49)
If x is sufficiently close to x0, the terms containing (xi - xi0)^k, k > 2, become very small and the higher order terms can be neglected. The first derivatives of f are zero at a stationary point, thus the relation (1.48) can be written as:

f(x) = f(x0) + (1/2)(x - x0)^T H2 (x - x0)  (1.50)

The sign of the quadratic form which occurs in (1.50) as the second term on the right-hand side decides the character of the stationary point x0, in a way analogous to single-variable functions.
According to (Hancock, 1960), we can determine whether the quadratic form is positive or negative by evaluating the signs of the determinants of the upper-left submatrices of H2:
If the determinants of all the upper-left submatrices are positive:

D1 > 0,  D2 > 0,  ...,  Dn > 0  (1.52)

the quadratic form is positive definite and x0 is a minimum. If the determinants alternate in sign, starting with a negative one:

D1 < 0,  D2 > 0,  D3 < 0,  ...  (1.53)

the quadratic form is negative definite and x0 is a maximum. If the determinants are nonzero but follow neither pattern, x0 is a saddle point.

Example 1.8 Locate and classify the stationary points of the function:

f(x1, x2, x3) = x1^3 - 3x1 + x2^2 + x3^2  (1.54)

The first partial derivatives, 3x1^2 - 3, 2x2 and 2x3, vanish at the stationary points (1, 0, 0) and (-1, 0, 0). The Hessian matrix is:

H2 = [ 6x1  0  0 ; 0  2  0 ; 0  0  2 ]  (1.55)

At the point (1, 0, 0):

D1 = 6 > 0,  D2 = | 6  0 ; 0  2 | = 12 > 0,  D3 = | 6  0  0 ; 0  2  0 ; 0  0  2 | = 24 > 0  (1.56)

thus (1, 0, 0) is a minimum. At the point (-1, 0, 0):

D1 = -6 < 0,  D2 = | -6  0 ; 0  2 | = -12 < 0,  D3 = | -6  0  0 ; 0  2  0 ; 0  0  2 | = -24 < 0  (1.57)

The determinants satisfy neither (1.52) nor (1.53). Since the Hessian matrix is diagonal, the eigenvalues are easily determined as -6, 2, 2. Because they are nonzero and do not all have the same sign, (-1, 0, 0) is a saddle point.
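The sign pattern of the leading principal minors can be checked mechanically. The sketch below hard-codes the 3x3 determinant formulas and the diagonal Hessian of this example, assuming the reconstructed function x1^3 - 3x1 + x2^2 + x3^2.

```python
def minors(H):
    """Leading principal minors D1, D2, D3 of a 3x3 matrix."""
    d1 = H[0][0]
    d2 = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    d3 = (H[0][0] * (H[1][1] * H[2][2] - H[1][2] * H[2][1])
          - H[0][1] * (H[1][0] * H[2][2] - H[1][2] * H[2][0])
          + H[0][2] * (H[1][0] * H[2][1] - H[1][1] * H[2][0]))
    return d1, d2, d3

def hessian(x1):
    # Hessian of f = x1^3 - 3*x1 + x2^2 + x3^2 (diagonal)
    return [[6 * x1, 0, 0], [0, 2, 0], [0, 0, 2]]

print(minors(hessian(1)))    # (6, 12, 24): all positive -> minimum
print(minors(hessian(-1)))   # (-6, -12, -24): neither sign pattern -> saddle
```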
1.3 Optimization problems with constraints

The general form of a constrained optimization problem is:

min_x f(x)  (1.59)

subject to

gi(x) = 0,  i = 1, ..., m  (1.60)

hj(x) ≤ 0,  j = 1, ..., p  (1.61)
where x is a vector of n independent variables, [x1 x2 . . . xn ]T , and f , gi and hj are scalar multivariate
functions.
Example 1.9 Consider the problem of minimizing the function:

f(x1, x2) = x1^2 + x2^2  (1.62)

subject to:

g(x1, x2) = x1 + 2x2 + 4 = 0  (1.63)

The function and the constraint are illustrated in Figure 1.11, as a 3D surface and contour lines.
Figure 1.11: Mesh and contour plot of f(x1, x2) = x1^2 + x2^2 and the constraint x1 + 2x2 + 4 = 0
The purpose now is to minimize the function x1^2 + x2^2 subject to the condition that the variables x1 and x2 lie on the line x1 + 2x2 + 4 = 0. Graphically, the minimizer must be located on the curve obtained as the intersection of the function surface and the vertical plane that passes through the constraint line. In the x1-x2 plane, the minimizer can be found as the point where the line x1 + 2x2 + 4 = 0 and a level curve of x1^2 + x2^2 are tangent.
An inequality constraint of the form hj(x) ≤ 0 can be turned into an equality by adding a squared slack variable:

hj(x) + sj^2 = 0,  j = 1, ..., p  (1.64)

Notice that the slack variables sj are squared so that the added quantity is nonnegative. They are additional unknowns, so the number of unknown variables increases to n + p.
An inequality given in the form:

hj(x) ≥ 0,  j = 1, ..., p  (1.65)

can also be turned into an equality if a nonnegative quantity is subtracted from the left-hand side:

hj(x) - sj^2 = 0,  j = 1, ..., p  (1.66)
Example 1.10 Consider an optimization problem with the inequality constraint:

x1 + x2 ≤ 4  (1.68)

The constraint is converted into an equality by adding the squared slack variable s1^2:

x1 + x2 - 4 + s1^2 = 0  (1.69)
1.3.3 Analytical methods for optimization problems with equality constraints. Solution by substitution

This method can be applied when it is possible to solve the constraint equations for m of the variables, the number of constraints being smaller than the total number of variables (m < n). The solution of the constraint equations is then substituted into the objective function. The new problem has n - m unknowns and no constraints, so the techniques for unconstrained optimization can be applied.
Example 1.11 Let w, h, d be the width, height and depth of a box (a rectangular parallelepiped). Find the optimal
shape of the box to maximize the volume, when the sum w + h + d is 120.
The problem can be formulated as:

max_{(w,h,d)} whd  (1.70)

subject to

w + h + d - 120 = 0  (1.71)
We shall solve (1.71) for one of the variables, for example d, and then substitute the result into (1.70):

d = 120 - w - h  (1.72)

The new unconstrained problem is:

max_{(w,h)} wh(120 - w - h)  (1.73)

Let:

f(w, h) = wh(120 - w - h) = 120wh - w^2 h - wh^2  (1.74)

Compute the stationary points from:

∂f/∂w = 120h - 2wh - h^2 = h(120 - 2w - h) = 0  (1.75)

∂f/∂h = 120w - w^2 - 2wh = w(120 - w - 2h) = 0  (1.76)
The solutions are: w = h = 0 (not convenient) and w = h = 40.
Determine whether (40, 40) is a minimum or maximum point. Write the determinant:

D2 = | ∂^2 f/∂w^2  ∂^2 f/∂w∂h ; ∂^2 f/∂w∂h  ∂^2 f/∂h^2 | = | -2h  120 - 2w - 2h ; 120 - 2w - 2h  -2w | = | -80  -40 ; -40  -80 | = 4800  (1.77)

Since

D1 = ∂^2 f/∂w^2 = -80 < 0  and  D2 = 4800 > 0

the point (40, 40) is a maximum. From (1.72) we have d = 120 - 40 - 40 = 40, thus the box should have all sides equal: w = h = d = 40.
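The first- and second-order conditions of the box example can be verified directly; the partial derivatives below are those of f(w, h) = wh(120 - w - h) derived above.

```python
def volume(w, h):
    # d = 120 - w - h substituted into V = w*h*d
    return w * h * (120 - w - h)

def dV_dw(w, h):
    return 120 * h - 2 * w * h - h**2

def dV_dh(w, h):
    return 120 * w - w**2 - 2 * w * h

# First-order conditions vanish at the candidate (40, 40):
print(dV_dw(40, 40), dV_dh(40, 40))     # 0 0

# Second-order check: D1 = -2h < 0 and D2 = 4wh - (120 - 2w - 2h)^2 > 0
D1 = -2 * 40
D2 = 4 * 40 * 40 - (120 - 2 * 40 - 2 * 40)**2
print(D1, D2, volume(40, 40))           # -80 4800 64000
```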
Example 1.12 Find the maximum of the function:

f(x1, x2) = x1^2 x2^2  (1.78)

subject to:

x1 + x2 = 4  (1.79)
Figure 1.12: Mesh and contour plot of f(x1, x2) = x1^2 x2^2 and the constraint x1 + x2 - 4 = 0
As shown in Figure 1.12, the constrained maximum of f(x1, x2) must be located on the curve resulting from the intersection of the function surface and the vertical plane that passes through the constraint line. In the plot showing the level curves of f(x1, x2), the point that maximizes the function is located where the line x1 + x2 - 4 = 0 is tangent to a level curve.
Analytically, this point can be determined by the method of substitution, as follows. Solve (1.79) for x2:

x2 = 4 - x1  (1.80)

and replace it into the objective function. The new unconstrained problem is:

max_{x1} x1^2 (4 - x1)^2  (1.81)

The stationary points of f(x1) = x1^2 (4 - x1)^2 are calculated by setting the first derivative equal to zero:

2x1(4 - x1)^2 - 2x1^2(4 - x1) = 2x1(4 - x1)(4 - 2x1) = 0  (1.82)

with solutions x1 = 0, x1 = 2 and x1 = 4. The points x1 = 0 and x1 = 4 give f = 0, while x1 = 2 (and thus x2 = 2) gives the constrained maximum f(2, 2) = 16.
Example 1.13 Find the minimum of the function:

f(x1, x2) = 20 - x1 x2  (1.83)

subject to:

x1 + x2 = 6  (1.84)

Substitute x2 = 6 - x1 from (1.84) into (1.83) and obtain the unconstrained problem:

min_{x1} 20 - x1(6 - x1)  (1.85)

The first derivative of f(x1) = 20 - 6x1 + x1^2 set equal to zero gives the stationary point x10 = 3.  (1.86)
Because the second derivative f''(x10) = 2 is positive, the stationary point is a minimizer of f(x1). Because x2 = 6 - x1, the point (x10 = 3, x20 = 3) minimizes the function f(x1, x2) subject to the constraint (1.84). As shown in Figure 1.13, the minimum obtained is located on the parabola resulting from the intersection of the function surface and the vertical plane containing the constraint line; in the contour plot, it is the point where the constraint line is tangent to a level curve.
Figure 1.13: Mesh and contour plot of f(x1, x2) = 20 - x1 x2 and the constraint x1 + x2 - 6 = 0
1.3.4 The method of Lagrange multipliers

Consider the problem of minimizing a function subject to equality constraints:

min_x f(x)  (1.87)

subject to

gi(x) = 0,  i = 1, ..., m  (1.88)
As an example, consider the problem of finding the minimum of a real-valued function f(x1, x2) subject to the constraint g(x1, x2) = 0. Let f(x1, x2) = 20 - x1 x2 and g(x1, x2) = x1 + x2 - 6 = 0, as shown in Figure 1.14. The gradient direction of f(x1, x2) is also shown, as arrows, in the same figure.
Figure 1.14: Contour lines of f(x1, x2) = 20 - x1 x2, the constraint g(x1, x2) = 0 and the gradient direction of f
At a constrained extremum, the gradient of the objective function is a linear combination of the gradients of the constraints:

∇f(x) + λ^T ∇g(x) = 0  (1.89)

Define the Lagrangian function:

L(x, λ) = f(x) + λ^T g(x)  (1.91)

where g is a column vector function containing the m constraints gi(x), and λ is a column vector of m unknown values, called Lagrange multipliers. The function above can be written in an expanded form as:

L(x1, x2, ..., xn, λ1, λ2, ..., λm) = f(x1, x2, ..., xn) + λ1 g1(x1, x2, ..., xn) + ... + λm gm(x1, x2, ..., xn)  (1.92)

To locate the stationary points, the gradient of the Lagrangian function is set equal to zero:

∇L(x, λ) = 0  (1.93)
The necessary conditions for an optimum are obtained by setting the first partial derivatives of the Lagrangian function with respect to xi, i = 1, ..., n and λj, j = 1, ..., m equal to zero. There are n + m nonlinear algebraic equations to be solved for n + m unknowns, as follows:

∂L(x, λ)/∂x1 = ∂f(x)/∂x1 + Σ_{j=1}^{m} λj ∂gj(x)/∂x1 = 0
∂L(x, λ)/∂x2 = ∂f(x)/∂x2 + Σ_{j=1}^{m} λj ∂gj(x)/∂x2 = 0
...
∂L(x, λ)/∂xn = ∂f(x)/∂xn + Σ_{j=1}^{m} λj ∂gj(x)/∂xn = 0
∂L(x, λ)/∂λ1 = g1(x) = 0
...
∂L(x, λ)/∂λm = gm(x) = 0  (1.94)
Example 1.14 Find the stationary points of the function f(x1, x2) = 20 - x1 x2 subject to the constraint x1 + x2 = 6, using the method of Lagrange multipliers.
Define the Lagrangian function:

L(x1, x2, λ) = 20 - x1 x2 + λ(x1 + x2 - 6)  (1.95)

and set its partial derivatives equal to zero:

∂L/∂x1 = -x2 + λ = 0  (1.96)

∂L/∂x2 = -x1 + λ = 0  (1.97)

∂L/∂λ = x1 + x2 - 6 = 0  (1.98)

The first two equations give x1 = x2 = λ, and the constraint then gives the stationary point x10 = 3, x20 = 3, with λ0 = 3.
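The stationarity system of Example 1.14 is linear, so it can be solved in a few lines; the sketch below also confirms the geometric condition ∇f + λ∇g = 0 at the solution.

```python
# dL/dx1 = -x2 + lam = 0 and dL/dx2 = -x1 + lam = 0 give x1 = x2 = lam;
# the constraint x1 + x2 = 6 then gives lam = 3.
lam = 6 / 2
x1, x2 = lam, lam

grad_f = (-x2, -x1)     # gradient of f = 20 - x1*x2
grad_g = (1, 1)         # gradient of g = x1 + x2 - 6
residual = (grad_f[0] + lam * grad_g[0], grad_f[1] + lam * grad_g[1])
print(x1, x2, lam, residual)    # 3.0 3.0 3.0 (0.0, 0.0)
```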
A stationary point (x0, λ0) of the Lagrangian is a strict local minimum of f subject to the constraints if:

(-1)^m det(Hp) > 0,  p = m + 1, ..., n  (1.100)

where Hp is the bordered matrix:

Hp = [ ∂^2 L(x0, λ0)/∂x1∂x1  ...  ∂^2 L(x0, λ0)/∂x1∂xp   ∂g1(x0)/∂x1  ...  ∂gm(x0)/∂x1 ;
       ...                                                 ...                          ;
       ∂^2 L(x0, λ0)/∂xp∂x1  ...  ∂^2 L(x0, λ0)/∂xp∂xp   ∂g1(x0)/∂xp  ...  ∂gm(x0)/∂xp ;
       ∂g1(x0)/∂x1  ...  ∂g1(x0)/∂xp   0  ...  0 ;
       ...                                       ;
       ∂gm(x0)/∂x1  ...  ∂gm(x0)/∂xp   0  ...  0 ]  (1.99)
The similar result for strict local maxima is obtained by changing (-1)^m in (1.100) to (-1)^p (Avriel, 2003).
For p = n, the matrix in (1.100) is the bordered Hessian matrix of the problem. The elements are in fact the second derivatives of the Lagrangian with respect to all its n + m variables, xi and λj. The columns on the right and the last rows are more easily recognized as second derivatives of L if we notice that:
∂L(x, λ)/∂λj = gj(x)  and  ∂^2 L(x, λ)/∂λj ∂xi = ∂gj(x)/∂xi  (1.102)

Because gj(x) does not depend on λ, the zeros in the lower-right corner of the matrix result from:

∂L(x, λ)/∂λj = gj(x)  and  ∂^2 L(x, λ)/∂λj ∂λi = 0  (1.103)
When p < n, the matrices Hp are obtained by excluding the rows and columns that correspond to the variables x_{p+1}, ..., xn.
Example 1.15 For the problem from Example 1.14, we shall prove that the stationary point is a minimum, according to the sufficient condition defined above. The function to be minimized is f(x1, x2) = 20 - x1 x2 and the constraint is g(x1, x2) = x1 + x2 - 6 = 0.
The number of variables in this case is n = 2, the number of constraints m = 1, and p = m + 1 = 2. The only matrix we shall analyze is:
H2 = [ ∂^2 L/∂x1^2   ∂^2 L/∂x1∂x2   ∂^2 L/∂x1∂λ ;
       ∂^2 L/∂x2∂x1  ∂^2 L/∂x2^2    ∂^2 L/∂x2∂λ ;
       ∂^2 L/∂λ∂x1   ∂^2 L/∂λ∂x2    ∂^2 L/∂λ^2 ]  (1.104)

which, by (1.102) and (1.103), can be written as:

H2 = [ ∂^2 L/∂x1^2   ∂^2 L/∂x1∂x2   ∂g/∂x1 ;
       ∂^2 L/∂x2∂x1  ∂^2 L/∂x2^2    ∂g/∂x2 ;
       ∂g/∂x1        ∂g/∂x2         0 ]  (1.105)

Using the results obtained in Example 1.14, the second derivatives of the Lagrangian function are:

∂^2 L/∂x1^2 = ∂(-x2 + λ)/∂x1 = 0
∂^2 L/∂x2^2 = ∂(-x1 + λ)/∂x2 = 0
∂^2 L/∂x1∂x2 = ∂(-x2 + λ)/∂x2 = -1
∂g/∂x1 = ∂(x1 + x2 - 6)/∂x1 = 1
∂g/∂x2 = ∂(x1 + x2 - 6)/∂x2 = 1  (1.106)

and the bordered Hessian:

H2 = [ 0  -1  1 ; -1  0  1 ; 1  1  0 ]  (1.107)

Its determinant is det(H2) = -2, and the sufficient condition (1.100) for a minimum is satisfied:

(-1)^1 det(H2) = 2 > 0  (1.108)
thus, the stationary point (3, 3) is a minimizer of the function f subject to the constraint g = 0.
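The bordered-Hessian test of Example 1.15 reduces to a single 3x3 determinant. The sketch below hard-codes it, assuming the Lagrangian signs reconstructed above (L = 20 - x1 x2 + λ(x1 + x2 - 6)).

```python
def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Bordered Hessian at (3, 3): second derivatives of L in the upper-left
# 2x2 block, the gradient of g = x1 + x2 - 6 as the border.
H2 = [[0, -1, 1],
      [-1, 0, 1],
      [1, 1, 0]]

m = 1                          # number of equality constraints
d = det3(H2)
print(d, (-1)**m * d > 0)      # -2 True: (3, 3) is a constrained minimum
```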
Example 1.16 Find the highest and lowest points on the surface f(x1, x2) = x1 x2 + 25 over the circle x1^2 + x2^2 = 18.
Figure 1.15 shows a graphical representation of the problem. The constraint on x1 and x2 places the variables on the circle centered at the origin with radius √18. On this circle, the values of the function f(x1, x2) are located on the curve shown in the mesh plot of Figure 1.15. It is clear from the picture that there are two maxima and two minima, which will be determined using the method of Lagrange multipliers.
Figure 1.15: Mesh and contour plot of f(x1, x2) = 25 + x1 x2 and the constraint x1^2 + x2^2 = 18
The problem may be reformulated as: optimize f(x1, x2) = x1 x2 + 25 subject to the constraint g(x1, x2) = x1^2 + x2^2 - 18 = 0.
Define the Lagrangian function:

L(x1, x2, λ) = x1 x2 + 25 + λ(x1^2 + x2^2 - 18)  (1.109)

Compute the first partial derivatives of L and set them equal to zero:

Lx1 = x2 + 2λx1 = 0
Lx2 = x1 + 2λx2 = 0
Lλ = x1^2 + x2^2 - 18 = 0  (1.110)

The system has four solutions:

x10 = 3, x20 = 3, λ0 = -1/2
x10 = -3, x20 = -3, λ0 = -1/2
x10 = 3, x20 = -3, λ0 = 1/2
x10 = -3, x20 = 3, λ0 = 1/2  (1.111)

Build the bordered Hessian matrix and check the sufficient conditions for maxima and minima:

H2 = [ 2λ  1  2x1 ; 1  2λ  2x2 ; 2x1  2x2  0 ]  (1.112)

Because the number of constraints is m = 1, the number of variables is n = 2 and p = 2, the sufficient condition for a stationary point to be a minimizer of f subject to g = 0 is:

(-1)^1 det(H2) > 0,  or  det(H2) < 0  (1.113)

and for a maximizer:

(-1)^2 det(H2) > 0,  or  det(H2) > 0  (1.114)

At each stationary point:

(3, -3, 1/2):  det(H2) = | 1  1  6 ; 1  1  -6 ; 6  -6  0 | = -144 < 0  (1.115)

(-3, 3, 1/2):  det(H2) = | 1  1  -6 ; 1  1  6 ; -6  6  0 | = -144 < 0  (1.116)

(3, 3, -1/2):  det(H2) = | -1  1  6 ; 1  -1  6 ; 6  6  0 | = 144 > 0  (1.117)

(-3, -3, -1/2):  det(H2) = | -1  1  -6 ; 1  -1  -6 ; -6  -6  0 | = 144 > 0  (1.118)

Thus, the function f subject to g = 0 has two minima, at (3, -3, 1/2) and (-3, 3, 1/2), and two maxima, at (3, 3, -1/2) and (-3, -3, -1/2).
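All four stationary points of Example 1.16 can be classified in one loop, assuming the bordered Hessian entries derived above from L = x1 x2 + 25 + λ(x1^2 + x2^2 - 18).

```python
def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Stationary points (x1, x2, lam) of L = x1*x2 + 25 + lam*(x1^2 + x2^2 - 18)
points = [(3, 3, -0.5), (-3, -3, -0.5), (3, -3, 0.5), (-3, 3, 0.5)]

results = []
for x1, x2, lam in points:
    H2 = [[2 * lam, 1, 2 * x1],
          [1, 2 * lam, 2 * x2],
          [2 * x1, 2 * x2, 0]]
    d = det3(H2)
    # m = 1 constraint: (-1)^1 * det > 0, i.e. det < 0, means a minimum
    kind = "min" if -d > 0 else "max"
    results.append(((x1, x2), d, kind))
    print((x1, x2), d, kind)
```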
Example 1.17 Find the minimum and maximum distance from the point P(3, 4, 0) to the sphere:

x^2 + y^2 + z^2 = 4  (1.119)

The square of the distance from P to a point (x, y, z) is:

f(x, y, z) = (x - 3)^2 + (y - 4)^2 + z^2  (1.120)

subject to:

g(x, y, z) = x^2 + y^2 + z^2 - 4 = 0  (1.121)

Define the Lagrangian function:

L(x, y, z, λ) = (x - 3)^2 + (y - 4)^2 + z^2 + λ(x^2 + y^2 + z^2 - 4)  (1.122)
and set the partial derivatives equal to zero to compute the stationary points:

Lx = 2x - 6 + 2λx = 0
Ly = 2y - 8 + 2λy = 0
Lz = 2z + 2λz = 0
Lλ = x^2 + y^2 + z^2 - 4 = 0  (1.123)
The system (1.123) has two solutions:

(S1): x10 = 6/5, y10 = 8/5, z10 = 0, λ10 = 3/2  (1.124)

and

(S2): x20 = -6/5, y20 = -8/5, z20 = 0, λ20 = -7/2  (1.125)
It is clear from Figure 1.16 that we must find a minimum and a maximum distance between the point P and the
sphere, thus the sufficient conditions for maximum or minimum have to be checked.
For p = 2, the bordered matrix is:

H22 = [ Lxx  Lxy  Lxλ ; Lxy  Lyy  Lyλ ; Lxλ  Lyλ  Lλλ ] = [ Lxx  Lxy  gx ; Lxy  Lyy  gy ; gx  gy  0 ]  (1.126)

For p = 3:

H23 = [ Lxx  Lxy  Lxz  gx ; Lxy  Lyy  Lyz  gy ; Lxz  Lyz  Lzz  gz ; gx  gy  gz  0 ]  (1.127)

The sufficient conditions for a minimum in this case are written as:

(-1)^1 det(H22) > 0  and  (-1)^1 det(H23) > 0  (1.128)

and for a maximum:

(-1)^2 det(H22) > 0  and  (-1)^3 det(H23) > 0  (1.129)
The second derivatives of the Lagrangian function with respect to all its variables are:

Lxx = 2 + 2λ,  Lyy = 2 + 2λ,  Lzz = 2 + 2λ
Lxy = 0,  Lxz = 0,  Lyz = 0
Lxλ = gx = 2x,  Lyλ = gy = 2y,  Lzλ = gz = 2z  (1.130)
From (1.126) and (1.127) we obtain:

H22 = [ 2 + 2λ  0  2x ; 0  2 + 2λ  2y ; 2x  2y  0 ]  (1.131)

H23 = [ 2 + 2λ  0  0  2x ; 0  2 + 2λ  0  2y ; 0  0  2 + 2λ  2z ; 2x  2y  2z  0 ]  (1.132)
For the first stationary point (S1), where 2 + 2λ = 5, 2x = 12/5, 2y = 16/5 and 2z = 0, the determinants of H22 and H23 are:

det H22 = | 5  0  12/5 ; 0  5  16/5 ; 12/5  16/5  0 | = -80  (1.133)

det H23 = | 5  0  0  12/5 ; 0  5  0  16/5 ; 0  0  5  0 ; 12/5  16/5  0  0 | = -400  (1.134)

Both satisfy the minimum conditions (1.128), since (-1)(-80) = 80 > 0 and (-1)(-400) = 400 > 0, thus (S1) = (6/5, 8/5, 0) is the point of the sphere closest to P.
For the second stationary point (S2), where 2 + 2λ = -5, 2x = -12/5, 2y = -16/5 and 2z = 0:

det H22 = | -5  0  -12/5 ; 0  -5  -16/5 ; -12/5  -16/5  0 | = 80  (1.136)

det H23 = | -5  0  0  -12/5 ; 0  -5  0  -16/5 ; 0  0  -5  0 ; -12/5  -16/5  0  0 | = -400  (1.137)

These satisfy the maximum conditions (1.129), since (-1)^2 (80) = 80 > 0 and (-1)^3 (-400) = 400 > 0, thus (S2) = (-6/5, -8/5, 0) is the point of the sphere farthest from P.
(1.139)
g1 (x, y, z) = x2 + y 2 + z 2 1 = 0
(1.140)
g2 (x, y, z) = x + y + z 1 = 0
(1.141)
subject to:
In this case we have two constraints and 3 variables. Two new unknowns will be introduced and the Lagrange
function is written as:
L(x, y, z, 1 , 2 ) = x + 2y + z + 1 (x2 + y 2 + z 2 1) + 2 (x + y + z 1)
(1.142)
(1.143)
L1
= x +y +z 1=0
L2
= x+y+z1=0
x0 = 0, y0 = 1, z0 = 0, 10 = 12 , 20 = 1
(S2 ) : x0 =
2
3,
y0 =
13 ,
z0 =
2
3,
10 =
1
2,
20 =
53
The number p from (1.100) is 3 in this case, therefore we have to analyze the sign of the determinant of the bordered Hessian:

det H2 = | Lxx  Lxy  Lxz  g1x  g2x ; Lxy  Lyy  Lyz  g1y  g2y ; Lxz  Lyz  Lzz  g1z  g2z ; g1x  g1y  g1z  0  0 ; g2x  g2y  g2z  0  0 |
       = | 2λ1  0  0  2x  1 ; 0  2λ1  0  2y  1 ; 0  0  2λ1  2z  1 ; 2x  2y  2z  0  0 ; 1  1  1  0  0 |  (1.144)

For (S1), with 2λ1 = -1:

det H2 = | -1  0  0  0  1 ; 0  -1  0  2  1 ; 0  0  -1  0  1 ; 0  2  0  0  0 ; 1  1  1  0  0 | = -8 < 0  (1.145)

With m = 2 and p = 3, the condition for a maximum, (-1)^3 det(H2) > 0, is satisfied, thus (S1) = (0, 1, 0) is a maximizer.
For (S2), with 2λ1 = 1:

det H2 = | 1  0  0  4/3  1 ; 0  1  0  -2/3  1 ; 0  0  1  4/3  1 ; 4/3  -2/3  4/3  0  0 ; 1  1  1  0  0 | = 8 > 0  (1.146)

The condition for a minimum, (-1)^2 det(H2) > 0, is satisfied, thus (S2) = (2/3, -1/3, 2/3) is a minimizer.
1.3.5 Optimization problems with inequality constraints. The Karush-Kuhn-Tucker conditions

Consider the general problem:

min f(x)  (1.151)

subject to

gi(x) = 0,  i = 1, ..., m  (1.152)

hj(x) ≤ 0,  j = 1, ..., p  (1.153)
Define the Lagrangian function:

L(x, λ, μ) = f(x) + Σ_{i=1}^{m} λi gi(x) + Σ_{j=1}^{p} μj hj(x)  (1.154)

where:
λ = [λ1 λ2 ... λm]^T and μ = [μ1 μ2 ... μp]^T are vector multipliers,
g = [g1(x) g2(x) ... gm(x)]^T and h = [h1(x) h2(x) ... hp(x)]^T are vector functions.
The necessary (Karush-Kuhn-Tucker) conditions for a point x0 to be a local minimizer of f are:

∇f(x0) + Σ_{i=1}^{m} λi ∇gi(x0) + Σ_{j=1}^{p} μj ∇hj(x0) = 0  (1.155)

gi(x0) = 0,  i = 1, ..., m  (1.156)

hj(x0) ≤ 0,  j = 1, ..., p  (1.157)

μj hj(x0) = 0,  j = 1, ..., p  (1.158)

μj ≥ 0,  j = 1, ..., p  (1.159)

λi unrestricted in sign,  i = 1, ..., m  (1.160)
In a few cases it is possible to solve the KKT conditions (and therefore the optimization problem) analytically, but the sufficient conditions are difficult to verify.
Example 1.19 Minimize:

f(x1, x2) = e^(-3x1) + e^(-2x2)  (1.162)

subject to:

x1 + x2 ≤ 2  (1.163)

x1 ≥ 0  (1.164)

x2 ≥ 0  (1.165)

Write the constraints in the standard form hj(x) ≤ 0:

x1 + x2 - 2 ≤ 0  (1.166)

-x1 ≤ 0  (1.167)

-x2 ≤ 0  (1.168)

The Lagrangian function is:

L(x1, x2, μ1, μ2, μ3) = e^(-3x1) + e^(-2x2) + μ1(x1 + x2 - 2) + μ2(-x1) + μ3(-x2)  (1.169)

and the KKT conditions are:

-3e^(-3x1) + μ1 - μ2 = 0  (1.170)

-2e^(-2x2) + μ1 - μ3 = 0  (1.171)

μ1(x1 + x2 - 2) = 0  (1.172)

μ2(-x1) = 0  (1.173)

μ3(-x2) = 0  (1.174)

μ1 ≥ 0  (1.175)

μ2 ≥ 0  (1.176)

μ3 ≥ 0  (1.177)
First we may notice that x1 ≥ 0 and x2 ≥ 0, thus each variable can be either zero or strictly positive. Therefore, we have four cases:

1.) x1 = 0, x2 = 0. The relations (1.170), (1.171), (1.172) become:

-3 + μ1 - μ2 = 0  (1.178)

-2 + μ1 - μ3 = 0  (1.179)

μ1(-2) = 0  (1.180)

From (1.180), μ1 = 0, and then (1.178) gives μ2 = -3 < 0, so the constraint (1.176) is not satisfied. This case does not give a solution of the problem.

2.) x1 = 0, x2 > 0. Because x2 is strictly positive, from (1.174) we obtain μ3 = 0, and the relations become:

-3 + μ1 - μ2 = 0  (1.181)

-2e^(-2x2) + μ1 = 0  (1.182)

μ1(x2 - 2) = 0  (1.183)

From (1.182) we obtain μ1 = 2e^(-2x2) ≠ 0, so the relation (1.183) is satisfied only for x2 = 2. Then μ1 = 2e^(-4). From (1.181) we obtain μ2 = -3 + 2e^(-4) < 0, and the constraint (1.176) is not satisfied. This case will not give a solution of the problem.
3.) x1 > 0, x2 = 0. Because x1 is strictly positive, from (1.173) we obtain μ2 = 0, and the relations (1.170), (1.171), (1.172) become:

-3e^(-3x1) + μ1 = 0  (1.184)

-2 + μ1 - μ3 = 0  (1.185)

μ1(x1 - 2) = 0  (1.186)

From (1.184) we obtain μ1 = 3e^(-3x1) ≠ 0, so the relation (1.186) is satisfied only for x1 = 2. Then μ1 = 3e^(-6). From (1.185) we obtain μ3 = -2 + 3e^(-6) < 0, and the constraint (1.177) is not satisfied. This situation is not a solution of the problem either.
4.) x1 > 0, x2 > 0. Since neither x1 nor x2 is zero, from (1.173) and (1.174) we obtain μ2 = 0 and μ3 = 0. The relations (1.170), (1.171), (1.172) become:

-3e^(-3x1) + μ1 = 0  (1.187)

-2e^(-2x2) + μ1 = 0  (1.188)

μ1(x1 + x2 - 2) = 0  (1.189)

The value of μ1 cannot be zero, because the exponentials in (1.187) and (1.188) are never zero. Then x1 + x2 - 2 = 0, or x2 = 2 - x1.
Equating the two expressions for μ1 from (1.187) and (1.188), with x2 = 2 - x1:

3e^(-3x1) = 2e^(-2(2 - x1)),  or  e^(-5x1 + 4) = 2/3  (1.190)

The solution is:

x1 = (1/5)(4 - ln(2/3)) ≈ 0.88,  x2 = 2 - x1 ≈ 1.12  (1.191)

Since μ1 = 3e^(-3x1) > 0, all the KKT conditions are satisfied, and this point is the solution of the problem.
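The KKT solution of case 4 can be verified numerically in a few lines; the candidate below assumes the reconstructed objective e^(-3x1) + e^(-2x2) and the active constraint x1 + x2 = 2.

```python
import math

# Case 4 candidate: active constraint x1 + x2 = 2
x1 = (4 - math.log(2 / 3)) / 5
x2 = 2 - x1
mu1 = 3 * math.exp(-3 * x1)

# Stationarity of L = e^(-3*x1) + e^(-2*x2) + mu1*(x1 + x2 - 2)
g1 = -3 * math.exp(-3 * x1) + mu1    # dL/dx1
g2 = -2 * math.exp(-2 * x2) + mu1    # dL/dx2
print(round(x1, 3), round(x2, 3), mu1 > 0, abs(g1) < 1e-12, abs(g2) < 1e-12)
```

Both stationarity residuals vanish to machine precision and μ1 > 0, so the candidate satisfies all the KKT conditions.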
1.4 Exercises

1. Locate the stationary points of the following functions and determine their character:

a) f(x) = x^7
b) f(x) = 8x - x^4/4
c) f(x) = 50 - 6x + x^3/18

2. Consider the functions f : R^2 -> R:

a) f(x1, x2) = x1 x2
b) f(x1, x2) = x1^2/2 + x2^2 - 3x1 + 2x2 - 5
c) f(x1, x2) = x1^2 + x2^3 + 6x1 - 12x2 + 5
d) f(x1, x2) = 4x1 x2 - x1^4 - x2^4
e) f(x1, x2) = x1 x2 e^(-(x1^2 + x2^2)/2)

Compute the stationary points and determine their character using the second derivative test.
3. Find the global minimum of the function:

f(x1, x2) = (x1 - 2)^2 + (x2 - 1)^2

in the region:

0 ≤ x1 ≤ 1,  0 ≤ x2 ≤ 2

4. Find the extrema of the function f : R^3 -> R:

f(x1, x2, x3) = 2x1^2 + 3x2^2 + 4x3^2 - 4x1 - 12x2 - 16x3

5. Use the method of substitution to solve the constrained optimization problem:

subject to

x1 + 3x2 - 10 = 0

6. Use the method of Lagrange multipliers to find the maximum and minimum values of f subject to the given constraints:

a) f(x1, x2) = 3x1 - 2x2,  x1^2 + 2x2^2 = 44
b) f(x1, x2, x3) = x1^2 - 2x2 + 2x3^3,  x1^2 + x2^2 + x3^2 = 1
c) f(x1, x2) = x1^2 - x2^2,  x1^2 + x2^2 = 1

7. Minimize the surface area of a cylinder with a given volume.
Bibliography
Avriel, M. (2003). Nonlinear Programming: Analysis and Methods. Courier Dover Publications.
Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.
Hancock, H. (1960). Theory of Maxima and Minima. Ginn and Company.
Hiriart-Urruty, J.-B. (1996). L'Optimisation. Que sais-je? Presses Universitaires de France.
Renze, J. and Weisstein, E. (2004). Extreme value theorem. From MathWorld, A Wolfram Web Resource. http://mathworld.wolfram.com/ExtremeValueTheorem.html.
Weisstein, E. (2004). Second derivative test. From MathWorld, A Wolfram Web Resource. http://mathworld.wolfram.com/SecondDerivativeTest.html.