Finite Element Method For Electromagnetics
ANTENNAS,
MICROWAVE CIRCUITS,
John L. Volakis
Arindam Chatterjee
Leo C. Kempel
Arindam Chatterjee has developed three-dimensional computer simulation of electromagnetic fields for scattering
and microwave circuits, and is currently a member of the finite element development group for the HFSS finite
element commercial package at Hewlett-Packard.
Leo C. Kempel developed three-dimensional antenna simulation packages using the finite element-boundary
integral method and has extensive experience with all popular numerical techniques in electromagnetics. He
is currently at Mission Research Corporation, Florida, conducting research and development on all aspects of
electromagnetics.
IEEE Press
445 Hoes Lane
P.O. Box 1331
Piscataway, NJ 08855-1331
1-800-678-IEEE (Toll-Free U.S.A. and Canada)
or 1-732-981-0060
IEEE Order No. PC5698
FINITE ELEMENT METHOD
FOR ELECTROMAGNETICS
IEEE/OUP SERIES ON ELECTROMAGNETIC WAVE THEORY
D. S. Jones
University of Dundee
David R. Jackson
University of Houston
John L. Volakis
University of Michigan
Arindam Chatterjee
Hewlett-Packard
Leo C. Kempel
IEEE
PRESS
IEEE Press Marketing
Attn: Special Sales
445 Hoes Lane, P.O. Box 1331
Piscataway, NJ 08855-1331
Fax: (732) 981-9334
All rights reserved. No part of this book may be reproduced in any form,
nor may it be stored in a retrieval system or transmitted in any form,
without written permission from the publisher.
ISBN 0-7803-3425-6
IEEE Order Number: PC5698
Editorial Board
Roger F. Hoyt, Editor in Chief
IEEE Antennas & Propagation Society, Sponsor
AP-S Liaison to IEEE Press, Robert Mailloux
Technical Reviewers
Contents
PREFACE
ACKNOWLEDGMENTS
1.11.5 Some Matrix Definitions
1.11.6 Comparison of Solution Methods and Their Convergence
1.11.7 Field Formulation Issues
3.1 Introduction
Program 89
Appendix 2: Useful Integration Formulae for One-Dimensional FEM Analysis
Concepts are presented including the Poynting, uniqueness, superposition, and duality
CHAPTER 6 THREE-DIMENSIONAL PROBLEMS: RADIATION AND SCATTERING
CHAPTER 7 THREE-DIMENSIONAL FE-BI METHOD
INDEX
The finite element method (FEM), together with its hybrid versions (finite element-boundary
integral, finite element-absorbing boundary condition, finite element-mode
matching, etc.), is one of the most successful frequency domain computational methods for
electromagnetic simulations. It combines geometrical adaptability and material
generality for modeling arbitrary geometries and materials of any composition. The
latter is particularly important in electromagnetics since most applications
dealing with antennas, microwave circuits, scatterers, motor and generator
modeling, etc. require the simulation of nonmetallic/composite materials. Also, the
hybridization of the finite element method with integral equation techniques leads to
fully rigorous approaches which combine the best aspects of volume and surface
formulation techniques.
Because of its unique features, the finite element method is becoming the
workhorse for electromagnetic modeling and simulations. Many research and
development codes are now available from universities and industry, and these have
demonstrated the utility and capability of the method. Also, a number of commercial
finite element analysis packages are currently available. Typically, these packages do
not yet incorporate the more rigorous hybrid versions of the FEM. However, they
are rapidly evolving to more sophisticated and capable packages which incorporate
new technologies in geometrical modeling, simulation engines, and solvers.
With the increasing importance of electromagnetics simulation packages using
the FEM, this book should serve as a valuable text for students, practicing engineers,
and researchers in electromagnetics. The original goal of writing the book was to
serve as a text for beginning graduate students interested in the application of the
finite element method and its hybrid versions to electromagnetics. However, the
authors also recognized a need to report (in a coherent manner) the many recent
J. L. Volakis
A. Chatterjee
L. C. Kempel
June
Ann Arbor, MI
Acknowledgments
Interest in the finite element method (FEM) at the Radiation Laboratory of the
University of Michigan began in 1987 with the first author and his graduate students.
The motivation was to model large domains without restrictions in geometry and
material composition. At that time, two graduate students, Timothy J. Peters and
Kasra Barkeshli, had completed successful implementations of boundary integral
solutions using k-space methods. This O(N log N) approach paved the way for a
fully O(N) finite element-boundary integral algorithm which combined the rigor of
the boundary integral for mesh truncation and the generality of the FEM for
volume/domain modeling. The first of these hybrid implementations was developed
by Dr. Jian-Ming Jin, a graduate assistant of John Volakis, resulting in a highly
successful finite element-boundary integral computer program. Versions of this code
are still in use by government, industrial, and academic researchers in the United
States. Another graduate student of Professor Volakis, Dr. Jeffery D. Collins,
furthered this work to a body-of-revolution with an integral mesh enclosure. His later
students, among them the co-authors, Dr. Chatterjee and Dr. Kempel, and Dr.
Daniel C. Ross, Dr. Jian Gong, and Dr. Tayfun Özdemir, made significant
contributions toward the understanding of 3D problems in antennas, scattering, and
microwave circuits.
The authors are indebted to the entire research group of Professor Volakis for
Tayfun Özdemir for proofreading and in providing data and figures. The authors are
grateful to Dr. Sunil Bindiganavale for co-authoring the section on fast integral
methods in Chapter 8. He also helped in proofreading various sections of the
manuscript. The acknowledgments would be incomplete without mentioning Mr. Richard
Carnes, whose expertise in LaTeX made the typesetting of the book a much easier
task. Ruby Sowards typed some sections of the book, and Patti Wolfe helped in
preparing several figures. The authors are thankful to each of them.
A lot of encouragement was received from several people throughout the
project. The authors would like to particularly acknowledge Professor Donald G.
Dudley, Series Editor of IEEE Press, who was instrumental in publishing this
book with the Press and was supportive throughout the preparation of the
manuscript. The comments and constructive criticisms of the early chapters and the final
manuscript by Professor Andreas Cangellaris, Professor Jin-Fa Lee, Dr. Daniel T.
McGrath, and the anonymous reviewers are very much appreciated. The authors are
also thankful to the entire IEEE Press staff (John E. Griffin, Linda C. Matarazzo,
criticisms. Leo Kempel would like to express his appreciation for the support and
patience provided by his wife, Cathy.
John L. Volakis
Leo C. Kempel
August 1997
Ann Arbor, MI
Chapter 1 Fundamental Concepts
$\omega = 2\pi f$ rad/sec since the finite element method for electromagnetics utilizes
time-harmonic fields. The interested reader is referred to one of the excellent general
electromagnetics texts cited in this chapter's introduction for a discussion of the
time-dependent form of Maxwell's equations. For our purposes, we begin with the
time-harmonic form of Maxwell's equations.
The time-harmonic electric field is related to the time-dependent electric field
by (assuming that $j = \sqrt{-1}$)
$$\mathcal{E}(x,y,z;t) = \hat{x}E_{x0}\cos(\omega t+\phi_x) + \hat{y}E_{y0}\cos(\omega t+\phi_y) + \hat{z}E_{z0}\cos(\omega t+\phi_z) = \mathrm{Re}\left[\mathbf{E}(x,y,z)\,e^{j\omega t}\right] \qquad (1.1)$$
where
$$\mathbf{E}(x,y,z) = \hat{x}E_{x0}e^{j\phi_x} + \hat{y}E_{y0}e^{j\phi_y} + \hat{z}E_{z0}e^{j\phi_z} \qquad (1.2)$$
is referred to as the field phasor, and similar representations can be employed for the
other field quantities. Introducing these into the time-dependent Maxwell's
equations [5], we obtain a simplified set
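$$\nabla\times\mathbf{H} = \mathbf{J} + j\omega\epsilon\mathbf{E} \qquad (1.3)$$
$$\nabla\times\mathbf{E} = -\mathbf{M} - j\omega\mu\mathbf{H} \qquad (1.4)$$
$$\nabla\cdot(\mu\mathbf{H}) = \rho_m \qquad (1.5)$$
$$\nabla\cdot(\epsilon\mathbf{E}) = \rho \qquad (1.6)$$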
where the corresponding vector field and current phasors are
E = electricfield intensity in volts/meter (V/'m)
H = magnetic field intensity in amperes/meter (A/m)
J = electriccurrent density in amperes/meter2 (A/m2)
M= magnetic current density in volts/meter2 (V/m2)
and the two scalar charge phasors are
Both the magnetic current density (M) and the magnetic charge density (pm) are
fictitious quantities introduced for convenience.
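The phasor convention of (1.1) and (1.2) can be checked numerically. The following is a minimal sketch, assuming the $e^{j\omega t}$ time convention used in this chapter; the function name `phasor_to_time` and all sample values are hypothetical and serve only as an illustration.

```python
import numpy as np

def phasor_to_time(E_phasor, omega, t):
    """Return Re{E e^{j omega t}} for a complex 3-component field phasor."""
    E_phasor = np.asarray(E_phasor, dtype=complex)
    return np.real(E_phasor * np.exp(1j * omega * t))

# Hypothetical amplitudes (V/m) and phases (rad) for the x, y, z components.
f = 1.0e9                      # frequency in Hz (illustrative value)
omega = 2.0 * np.pi * f        # angular frequency in rad/sec
amplitudes = np.array([1.0, 0.5, 0.0])
phases = np.array([0.0, np.pi / 2.0, 0.0])

# Phasor as in (1.2): each component is amplitude * exp(j * phase).
E = amplitudes * np.exp(1j * phases)

t = 0.25 / f                   # a quarter period
E_t = phasor_to_time(E, omega, t)

# Direct evaluation of the cosine form (1.1) for comparison.
E_direct = amplitudes * np.cos(omega * t + phases)
```

Both routes give the same instantaneous field, which is the sense in which the phasor carries all the information of the time-harmonic field.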
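Implied in these time-harmonic equations are constitutive relations for an
isotropic medium
$$\mathbf{D} = \epsilon\mathbf{E} = \epsilon_0\epsilon_r\mathbf{E} \qquad (1.7)$$
$$\mathbf{B} = \mu\mathbf{H} = \mu_0\mu_r\mathbf{H} \qquad (1.8)$$
$$\mathbf{J} = \sigma\mathbf{E} \qquad (1.9)$$
$$\mathbf{M} = \sigma_m\mathbf{H} \qquad (1.10)$$
together with the continuity equations
$$\nabla\cdot\mathbf{J} + j\omega\rho = 0 \qquad (1.11)$$
$$\nabla\cdot\mathbf{M} + j\omega\rho_m = 0 \qquad (1.12)$$
where
$\epsilon_0$ = free space permittivity = $8.854 \times 10^{-12}$ farads/meter (F/m)
$\mu_0$ = free space permeability = $4\pi \times 10^{-7}$ henrys/meter (H/m)
$\epsilon_r$ = medium's relative permittivity constant
$\mu_r$ = medium's relative permeability constant
$\sigma$ = electric current conductivity in mhos/m
$\sigma_m$ = magnetic current conductivity in ohms/m ($\Omega$/m)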
The first two of these ($\epsilon_0$ and $\mu_0$) are fundamental constants while the others describe
the specific material. For example, $\epsilon_r$ is a measure of the material's electric storage
capacity while $\sigma$ is a measure of the material's ability to conduct electric currents or
alternatively as an Ohmic loss mechanism. The relative permeability $\mu_r$ and magnetic
conductivity $\sigma_m$ are the magnetic field analogues to $\epsilon_r$ and $\sigma$, respectively. For the
purposes of the finite element method, all four of these material quantities may vary
spatially (inhomogeneous) and spectrally (dispersive).
The current densities J and M appearing in (1.3) and (1.4) do not include the
$$\mathbf{J} = \mathbf{J}_i + \mathbf{J}_c = \mathbf{J}_i + \sigma\mathbf{E} \qquad (1.13)$$
$$\mathbf{M} = \mathbf{M}_i + \mathbf{M}_c = \mathbf{M}_i + \sigma_m\mathbf{H} \qquad (1.14)$$
where the subscript "i" denotes impressed currents while the subscript "c" refers to
conduction currents. When these are substituted into (1.3) and (1.4), the familiar
form of Maxwell's equations is obtained
(1.15)
(1.16)
where
$$\epsilon_r = \epsilon' - j\epsilon'' = \epsilon'(1 - j\tan\delta) \qquad (1.17)$$
and
(1.18)
Any one of the representations given in (1.17) and (1.18) is likely to be found in the
literature with the quantities
$$\tan\delta = \frac{\epsilon''}{\epsilon'} \qquad (1.19)$$
$$\nabla\times\mathbf{H} = \mathbf{J}_i + j\omega\epsilon\mathbf{E} \qquad (1.21)$$
where the phasor form of (1.11) and (1.12) was employed to rewrite (1.5) and (1.6) as
given above.
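As an illustration of how the conduction current can be absorbed into a complex permittivity, the sketch below assumes the standard result of substituting (1.13) into (1.3), namely a complex relative permittivity $\epsilon_r - j\sigma/(\omega\epsilon_0)$, and the loss tangent defined as the ratio of its imaginary to real parts. The function names and the sample material values are hypothetical.

```python
import math

EPS0 = 8.854e-12  # free space permittivity in F/m

def complex_rel_permittivity(eps_r, sigma, freq_hz):
    """Fold the conduction current sigma*E into a complex relative permittivity.

    Assumes the e^{j omega t} convention, so loss appears as a negative
    imaginary part: eps_r - j*sigma/(omega*eps0).
    """
    omega = 2.0 * math.pi * freq_hz
    return complex(eps_r, -sigma / (omega * EPS0))

def loss_tangent(eps_complex):
    """Electric loss tangent tan(delta) = eps''/eps' for eps' - j*eps''."""
    return -eps_complex.imag / eps_complex.real

# Hypothetical material: eps_r = 4, sigma = 0.01 mhos/m, at 1 GHz.
eps_c = complex_rel_permittivity(4.0, 0.01, 1.0e9)
tan_delta = loss_tangent(eps_c)
```

With these helpers a lossy dielectric can be passed to a solver as a single complex number, which is how loss is usually carried in time-harmonic finite element codes.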
$$\oint_C \mathbf{H}\cdot d\mathbf{l} = \iint_S (\mathbf{J}_i + j\omega\epsilon\mathbf{E})\cdot d\mathbf{S} \qquad (1.25)$$
$$\oint_C \mathbf{E}\cdot d\mathbf{l} = -\iint_S (\mathbf{M}_i + j\omega\mu\mathbf{H})\cdot d\mathbf{S} \qquad (1.26)$$
where $C$ is the contour bounding the open surface $S$ illustrated in Fig. 1.1 and
$d\mathbf{S} = \hat{n}\,dS$. The circle through the single integral indicates integration over a closed
contour, whereas the same symbol through the surface integral denotes integration
over the closed surface $S_c$ which encloses the corresponding volume $V$. The surface $S$
associated with the integrals (1.25) and (1.26) is completely unrelated to $S_c$ which
encloses the volume $V$.
Expressions (1.21) and (1.22) imply six scalar equations for the solution of the
six components associated with E and H. Thus, for time-harmonic fields, (1.21) and
(1.22) or (1.25) and (1.26) are sufficient for a solution of the electric and magnetic
any vector A. Equations (1.21)-(1.28) can be easily modified for anisotropic material
as well. This requires that $\epsilon\mathbf{E}$ and $\mu\mathbf{H}$ be replaced by $\bar{\bar{\epsilon}}\cdot\mathbf{E}$ and $\bar{\bar{\mu}}\cdot\mathbf{H}$, respectively,
where $\bar{\bar{\epsilon}}$ and $\bar{\bar{\mu}}$ represent $3 \times 3$ tensors [7].
In this text, open scattering and radiation problems will be considered.
Consequently, any valid and unique solution of the electric and/or magnetic fields
must also satisfy the Sommerfeld radiation condition, which describes the field
behavior at infinity
corresponding free space wavelength. This simply states that the field is outgoing and of
the form $e^{-jk_0 r}/r$ as $r \to \infty$.
wave equation.
Specifically, by taking the curl of (1.21) or (1.22) and making use of the other,
the following vector wave equations are obtained:
$$\nabla\times\left(\frac{\nabla\times\mathbf{E}}{\mu_r}\right) - k_0^2\epsilon_r\mathbf{E} = -jk_0Z_0\mathbf{J} - \nabla\times\left(\frac{\mathbf{M}}{\mu_r}\right)$$
$$\nabla\times\left(\frac{\nabla\times\mathbf{H}}{\epsilon_r}\right) - k_0^2\mu_r\mathbf{H} = -jk_0Y_0\mathbf{M} + \nabla\times\left(\frac{\mathbf{J}}{\epsilon_r}\right) \qquad (1.30)$$
where the upper set of equations are for solution of the electric field while the lower
set is for solution of the magnetic field. In (1.30), $\epsilon_r$ denotes the relative permittivity
of the media and $\mu_r$ indicates the relative permeability of the media. For free space,
these two quantities are both unity. Also, $Z_0 = 1/Y_0 = \sqrt{\mu_0/\epsilon_0}$ is the free space wave
impedance. In materials other than free space, the wave impedance and wavenumber
are given by $Z = 1/Y = \sqrt{\mu/\epsilon}$ and $k = \omega\sqrt{\mu\epsilon}$, respectively.
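The wave impedance and wavenumber defined above are simple to evaluate. The sketch below is illustrative only: the function names are hypothetical, and the constants are the values of $\epsilon_0$ and $\mu_0$ quoted earlier in this chapter.

```python
import math

EPS0 = 8.854e-12           # free space permittivity in F/m
MU0 = 4.0e-7 * math.pi     # free space permeability in H/m

def wave_impedance(eps_r=1.0, mu_r=1.0):
    """Wave impedance Z = sqrt(mu/eps) in ohms for a lossless medium."""
    return math.sqrt((MU0 * mu_r) / (EPS0 * eps_r))

def wavenumber(freq_hz, eps_r=1.0, mu_r=1.0):
    """Wavenumber k = omega*sqrt(mu*eps) in rad/m for a lossless medium."""
    omega = 2.0 * math.pi * freq_hz
    return omega * math.sqrt(MU0 * mu_r * EPS0 * eps_r)

Z0 = wave_impedance()                  # free space wave impedance
k0 = wavenumber(300.0e6)               # free space wavenumber at 300 MHz
k_diel = wavenumber(300.0e6, eps_r=4)  # same frequency inside eps_r = 4
```

For a lossy medium the same formulas apply with complex material parameters, in which case `cmath.sqrt` would be needed in place of `math.sqrt`.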
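$$\nabla\times(\phi\mathbf{A}) = \phi\,\nabla\times\mathbf{A} + \nabla\phi\times\mathbf{A} \qquad (1.31)$$
$$\frac{1}{\mu_r}\nabla\times\nabla\times\mathbf{E} - k_0^2\epsilon_r\mathbf{E} + \left[\nabla\left(\frac{1}{\mu_r}\right)\times\nabla\times\mathbf{E}\right] = -jk_0Z_0\mathbf{J} - \nabla\times\left(\frac{\mathbf{M}}{\mu_r}\right) \qquad (1.32)$$
$$\frac{1}{\epsilon_r}\nabla\times\nabla\times\mathbf{H} - k_0^2\mu_r\mathbf{H} + \left[\nabla\left(\frac{1}{\epsilon_r}\right)\times\nabla\times\mathbf{H}\right] = -jk_0Y_0\mathbf{M} + \nabla\times\left(\frac{\mathbf{J}}{\epsilon_r}\right) \qquad (1.33)$$
for the magnetic field. The significance of this form of the wave equation is that for
homogeneous materials, the terms within the bracket are zero. Most implementations
of the finite element method assume a homogeneous material within each finite
element and hence this bracketed term can be set to zero for those cases.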
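Another important version of (1.30) for homogeneous media is obtained by
utilizing the vector identity
$$\nabla\times\nabla\times\mathbf{E} = \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E} \qquad (1.34)$$
in (1.32) to get
$$-\nabla(\nabla\cdot\mathbf{E}) + \nabla^2\mathbf{E} + k_0^2\epsilon_r\mu_r\mathbf{E} = jk_0Z_0\mu_r\mathbf{J} + \nabla\times\mathbf{M} \qquad (1.35)$$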
$$\nabla^2\mathbf{E} + k^2\mathbf{E} = 0 \qquad (1.36)$$
These equations represent three vector field components each of which satisfies the
Helmholtz or scalar wave equation
$$\nabla^2\psi + k^2\psi = 0 \qquad (1.37)$$
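As a quick numerical illustration of the scalar wave equation, the sketch below checks that the outgoing spherical wave $e^{-jkr}/r$ satisfies (1.37) away from the origin. It assumes spherical symmetry, so that the Laplacian reduces to $(1/r)\,d^2(r\psi)/dr^2$, and uses a second-order central difference; the grid and the value of $k$ are arbitrary choices made for the example.

```python
import numpy as np

k = 2.0 * np.pi                      # wavenumber for a unit wavelength
r = np.linspace(1.0, 2.0, 2001)      # radial grid that avoids the origin
h = r[1] - r[0]

# Outgoing spherical wave.
psi = np.exp(-1j * k * r) / r

# Radial Laplacian for a spherically symmetric field:
# (1/r) d^2(r*psi)/dr^2, approximated with a central difference.
r_psi = r * psi
laplacian = (r_psi[2:] - 2.0 * r_psi[1:-1] + r_psi[:-2]) / h**2 / r[1:-1]

# Residual of the Helmholtz equation at the interior grid points.
residual = laplacian + k**2 * psi[1:-1]
max_residual = np.max(np.abs(residual))
```

The residual is limited only by the finite-difference truncation error, which shrinks as the grid spacing is reduced.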
Although for the majority of this text we are concerned with dynamic
electromagnetics, some examples of static electromagnetics are included to illustrate basic finite
element principles. Therefore, we present the basic equations of electrostatics and
1.3.1 Electrostatics
$$\mathbf{E} = -\nabla\phi(\mathbf{r}) \qquad (1.39)$$
With this field expression (1.39), (1.38) reduces to Poisson's equation (here given in
terms of general scalar fields and sources since Poisson's equation is also used for
magnetostatics)
$$\nabla\cdot[\epsilon\nabla\phi(\mathbf{r})] = f(\mathbf{r}) \qquad (1.40)$$
Closed form solutions are available for only a limited number of boundary
conditions [V]. Hence, it is usually more practical to employ a numerical method such as
the finite element or boundary element methods.
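To make the idea of a numerical solution concrete, the following is a minimal sketch of a one-dimensional finite element solution of Poisson's equation in the form (1.40), under simplifying assumptions that are not taken from the text: constant $\epsilon$ and $f$, linear elements on a uniform mesh, and $\phi = 0$ at both ends. The function name `solve_poisson_1d` is hypothetical.

```python
import numpy as np

def solve_poisson_1d(n_elems, eps, f, length=1.0):
    """Linear FEM for d/dx(eps * dphi/dx) = f on [0, length].

    Assumes constant eps and f and homogeneous Dirichlet conditions
    phi(0) = phi(length) = 0. The weak form is
    integral(eps * w' * phi') dx = -integral(w * f) dx.
    """
    n_nodes = n_elems + 1
    h = length / n_elems
    K = np.zeros((n_nodes, n_nodes))
    b = np.zeros(n_nodes)

    # Element stiffness matrix and load vector for linear shape functions.
    ke = (eps / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    be = -f * h / 2.0 * np.ones(2)

    # Assemble element contributions into the global system.
    for e in range(n_elems):
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += ke
        b[idx] += be

    # Impose the Dirichlet conditions by solving only for interior nodes.
    interior = np.arange(1, n_nodes - 1)
    phi = np.zeros(n_nodes)
    phi[interior] = np.linalg.solve(K[np.ix_(interior, interior)], b[interior])
    x = np.linspace(0.0, length, n_nodes)
    return x, phi

# Example: eps = 1, f = -1, whose exact solution is x*(1 - x)/2.
x, phi = solve_poisson_1d(8, 1.0, -1.0)
```

A dense matrix is used here only for brevity; practical finite element codes store the system in sparse form.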
The potential attributed to a volume charge is given by the integral relation
$$V(\mathbf{r}) = \iiint_V \frac{\rho_v(\mathbf{r}')}{\epsilon}\,G_{3D}(\mathbf{r},\mathbf{r}')\,dV' \qquad (1.42)$$
where the three-dimensional static Green's function is given by
(1.43)
and volume charges are denoted by $\rho_v$. This representation is the solution of (1.41) in
unbounded space and a pictorial relationship of the primed and unprimed
parameters is shown in Fig. 1.2.
For two-dimensional situations (e.g., the sources of excitation run from
$z = -\infty$ to $z = \infty$ and are invariant with respect to $z$), the following potential
integral is appropriate
$$V(\mathbf{r}) = \iint_S \frac{\rho_s(\mathbf{r}')}{\epsilon}\,G_{2D}(\mathbf{r},\mathbf{r}')\,dS' \qquad (1.44)$$
where $\rho_s$ denotes surface charges and $G_{2D}$ is the two-dimensional static Green's
function
(1.45)
These integral relations are used to determine the potential, $V$, at some point
in space due to an impressed charge distribution. They are used to derive integral
equations for the total potential due to an impressed source subject to boundary
conditions on surfaces within the domain. Such an integral equation (for surface
problems) is given by
(1.46)
where $\partial V(\mathbf{r})/\partial n = \hat{n}\cdot\nabla V(\mathbf{r})$,
$\partial G(\mathbf{r},\mathbf{r}')/\partial n' = \hat{n}'\cdot\nabla' G(\mathbf{r},\mathbf{r}') = -\hat{n}'\cdot\nabla G(\mathbf{r},\mathbf{r}')$, and $\hat{n}'$
denotes the outward normal to the integration surface $S$. Note that $\hat{n}' = \hat{n}(\mathbf{r}')$ implies
that the unit normal is a function of the integration variables, whereas $\hat{n} = \hat{n}(\mathbf{r})$ is a
function of the observation variables.
For perfect electric conductors, (1.46) can be rewritten in terms of unknown
surface charges. Specifically, by making use of (1.39) and the relation
$\rho_s/\epsilon = \hat{n}\cdot\mathbf{E} = -\hat{n}\cdot\nabla V(\mathbf{r})$, we obtain the usual expression
1.3.2 Magnetostatics
where the static field density is assumed to be related to the magnetic field intensity
by the expression
(1.49)
Note that in (1.48), we have not assumed a fictitious magnetic charge density.
Rather, the fundamental sources of static magnetic fields are currents, J.
We can define a magnetic vector potential in terms of these currents
$$\mathbf{A}(\mathbf{r}) = \mu\iiint_V \mathbf{J}(\mathbf{r}')\,G(\mathbf{r},\mathbf{r}')\,dV' \qquad (1.50)$$
where the Green's function is the same three-dimensional function used for
electrostatics (1.43) or the two-dimensional function (1.45). For the latter case, the integral
must be reduced to a two-dimensional one over the domain of J. With the introduction
of this vector potential, solution of (1.48) with (1.49) yields the expression
Surface equivalent currents are very useful in the formulation and execution of a
numerical solution of Maxwell's equations. Their introduction can be readily
justified in the context of the surface equivalence principle, e.g., two sources that produce
the same field within a region are said to be equivalent within that region.
The surface equivalence principle states that the field exterior (or interior) to a
given (possibly fictitious) surface may be exactly represented by equivalent currents
placed on that surface and allowed to radiate into the region external (or internal) to
that surface. For the exterior case, these equivalent currents are given in terms of the
total exterior (E, H) fields while the interior fields are assumed to be zero (this is
Love's equivalence principle). The appropriate currents for representing the fields
exterior to the surface are given by
$$\hat{n}\times\mathbf{H} = \mathbf{J}_s \qquad (1.52)$$
For the interior fields, the negative of (1.52) is used. The radiated fields due to these
equivalent currents are given by the integral expressions
\302\246
ti x H(r')dS' A.53)
'II = A.55)
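where $\bar{\bar{I}} = \hat{x}\hat{x} + \hat{y}\hat{y} + \hat{z}\hat{z}$ is the unit dyad and the corresponding scalar Green's
function is given by
$$G_0(\mathbf{r},\mathbf{r}') = \frac{e^{-jk_0R}}{4\pi R}, \qquad R = |\mathbf{r}-\mathbf{r}'| \qquad (1.56)$$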
Also,
V x G0(r. r') = -V x [lG(i(r,r')] = -VG0(r.r') x / A.57)
A.58)
H(r) = \320\231
[-J(r')
x VGn(r. r') -i ^ M(r')&'0(r. r')
I
A.59)
'kitZa
[M(r') x
'
-ji-\342\200\224
lj(r')
\302\253\320\276\320\273
(A-(l\302\253)-J
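$\mathbf{E} \to \mathbf{H}$, $\mathbf{H} \to -\mathbf{E}$, $\mathbf{J} \to \mathbf{M}$, $\mathbf{M} \to -\mathbf{J}$, $\mu \to \epsilon$, $\epsilon \to \mu$, $Z_0 \to Y_0$, $Y_0 \to Z_0$.
We can rewrite (1.58) in a less singular form by noting the identities
$$\nabla G_0 = -\nabla' G_0 \qquad (1.61)$$
to deduce that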
J(r') \342\200\242
WC/,,(r. r')dS' = -vli J(r') -
V'<7,,(r. r')(fS'
V .J(r')VGnU.r')clS'
which is most commonly used for integral equation numerical solutions and is also
valid for open surfaces since the normal components of J to the perimeter edges
of the surface vanish. The corresponding H field expression is again obtained by
duality.
For far zone computations ($r \to \infty$), the Green's function (1.56) can be
simplified as
$$G_0(\mathbf{r},\mathbf{r}') \approx \frac{e^{-jk_0r}}{4\pi r}\,e^{jk_0\hat{r}\cdot\mathbf{r}'} \qquad (1.63)$$
Using this in (1.58) and (1.59), carrying out the vector derivative operations, and
retaining only the terms that decay as $O(1/r)$, we get
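$$\mathbf{E}(\mathbf{r}) \approx jk_0\frac{e^{-jk_0r}}{4\pi r}\oint_S\left[\hat{r}\times\mathbf{M}(\mathbf{r}') + Z_0\,\hat{r}\times(\hat{r}\times\mathbf{J}(\mathbf{r}'))\right]e^{jk_0\hat{r}\cdot\mathbf{r}'}\,dS' \qquad (1.64)$$
$$\mathbf{H}(\mathbf{r}) \approx jk_0\frac{e^{-jk_0r}}{4\pi r}\oint_S\left[-\hat{r}\times\mathbf{J}(\mathbf{r}') + Y_0\,\hat{r}\times(\hat{r}\times\mathbf{M}(\mathbf{r}'))\right]e^{jk_0\hat{r}\cdot\mathbf{r}'}\,dS' \qquad (1.65)$$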
These are referred to as the far zone field expressions and are typically used for the
evaluation of antenna radiated fields or for the calculation of the radar cross section
(RCS) of a target. An acceptable criterion for using (1.64) and (1.65) is
$$r > \frac{2D^2}{\lambda}$$
where $D$ is the largest antenna or target dimension. In this case, the phase error in the
intervening approximations is maintained at less than $\pi/8$.
A typical setup for computing the radiation from volume sources at points in the
near and far zones is depicted in Fig. 1.2.
Figure 1.4 also shows the spherical angles commonly used in
electromagnetics and to be used throughout this text.
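The far zone criterion is easy to apply in practice. The following is a small sketch assuming the conventional form $r > 2D^2/\lambda$; the function names and the sample dimensions are hypothetical.

```python
def far_zone_distance(largest_dimension, wavelength):
    """Minimum far zone range 2*D^2/lambda, in the same length unit as the inputs."""
    return 2.0 * largest_dimension**2 / wavelength

def is_far_zone(r, largest_dimension, wavelength):
    """True when the observation distance r exceeds the far zone range."""
    return r > far_zone_distance(largest_dimension, wavelength)

# Hypothetical example: a 1 m aperture at a wavelength of 0.1 m (3 GHz).
r_min = far_zone_distance(1.0, 0.1)
near = is_far_zone(5.0, 1.0, 0.1)
far = is_far_zone(50.0, 1.0, 0.1)
```

Because the range grows with the square of the dimension, electrically large targets can have far zone distances that are impractical to reach in measurement, which is one reason the far zone expressions are evaluated from computed surface currents instead.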
can be imposed for relating the electric or magnetic fields in the exterior region to the
magnetic or electric currents, respectively. The radiated field expressions, (1.53) and
(1.54), now simplify to
implying that only a single current is required for the representation of each field
quantity.
One use of this Green's function is in the calculation of the scattering by a
perfect magnetic conductor (a fictitious material) through the use of equivalent
electric currents.
If the Neumann boundary condition is imposed
the resulting field integral expressions are again given in terms of only one current
This Green's function ($\bar{\bar{G}}_2$) is useful for calculating the scattering by a perfect electric
conductor using electric currents. Also, this Green's function will be used to relate
the electric field quantities of an interior finite element formulation to the magnetic
field of the bounding surface.
The above dyadic Green's functions, $\bar{\bar{G}}_1$ and $\bar{\bar{G}}_2$, are commonly termed the first
and the second kind dyadic Green's functions, respectively. A good discussion of
dyadic Green's functions used in electromagnetics is given in [10].
The above expressions are for three-dimensional fields. In the case of
two-dimensional fields (e.g., one dimension, such as $z$, is invariant), similar expressions
are used. These are scalar and typically written in terms of TM$_z$ and TE$_z$
polarizations
E. = i t')]dl' A. 73)
~'*<>Z\302\260I H,(r')[G2n(r,
?-<r')|\"^7 G2\302\260(t-r<)]dV
H = 2 x V?r, A.76)
-\321\204- \320\225=^!\321\205\320\243\320\257;
All of the expressions presented in this section are given for currents radiating
in free space. Fields within a homogeneous media can be determined by replacing $k_0$
with $k$ and $Z_0$ with $Z$ in all of these expressions. Throughout this text, $k$ will denote
$$(\mathbf{H}_1 - \mathbf{H}_2)\cdot\hat{\ell} = [\mathbf{J}_i\cdot(\hat{n}_1\times\hat{\ell})]\,\Delta h \qquad (1.77)$$
which is valid provided $\epsilon\mathbf{E}$ is finite at the interface. When (1.26) is applied to the
$$(\mathbf{E}_1 - \mathbf{E}_2)\cdot\hat{\ell} = -[\mathbf{M}_i\cdot(\hat{n}_1\times\hat{\ell})]\,\Delta h \qquad (1.78)$$
Section 1.5 Natural Boundary Conditions
Figure 1.5 Geometries for deriving the boundary conditions (a) for tangential
components, and (b) for normal components.
(1.79)
$$\mathbf{M}_{is} = \mathbf{M}_i\,\Delta h \qquad (1.80)$$
$$\hat{n}_1\times(\mathbf{H}_1 - \mathbf{H}_2) = \mathbf{J}_{is} \qquad (1.81)$$
$$\hat{n}_1\times(\mathbf{E}_1 - \mathbf{E}_2) = -\mathbf{M}_{is} \qquad (1.82)$$
The quantities $\mathbf{J}_{is}$ and $\mathbf{M}_{is}$ are referred to as the impressed electric and magnetic
surface current densities in A/m and V/m, respectively, at the interface. Note that if
$\mathbf{E}_2$ and $\mathbf{H}_2$ are zero, these conditions are identical to (1.52) except that in this case $\mathbf{J}_{is}$
and $\mathbf{M}_{is}$ refer to actual impressed currents rather than equivalent currents.
To generate the boundary conditions corresponding to (1.27) and (1.28), we
select $S_c$ to be the surface of a small pill box, shown in Fig. 1.5(b), enclosing the
volume $V$. The pill box is positioned at the dielectric interface so that half of its
volume is in medium 1 and the other half in medium 2. It is again assumed that
$\Delta h \to 0$ so that only its flat surfaces need be considered in performing the
integrations. Through direct integration of (1.27) we obtain the interface conditions
$$\hat{n}_1\cdot(\mu_1\mathbf{H}_1 - \mu_2\mathbf{H}_2) = \rho_{ms} \qquad (1.83)$$
$$\hat{n}_1\cdot(\epsilon_1\mathbf{E}_1 - \epsilon_2\mathbf{E}_2) = \rho_s \qquad (1.84)$$
where $\rho_s$ denotes the unbounded electric surface charge density in C/m$^2$ at the
interface and $\rho_{ms}$ is the corresponding fictitious surface magnetic charge density in
Wb/m$^2$.
The boundary/interface conditions (1.81)-(1.84), although derived for
time-harmonic fields, are applicable for instantaneous fields as well. In the time-harmonic
case, only (1.81) and (1.82) are required in conjunction with (1.23) and (1.24) for a
$$\hat{n}_1\times(\mathbf{H}_1 - \mathbf{H}_2) = \mathbf{J}_{is} \qquad (1.85)$$
$$\hat{n}_1\times(\mathbf{E}_1 - \mathbf{E}_2) = 0 \qquad (1.86)$$
$$\hat{n}_1\cdot(\mu_1\mathbf{H}_1 - \mu_2\mathbf{H}_2) = 0 \qquad (1.87)$$
$$\hat{n}_1\cdot(\epsilon_1\mathbf{E}_1 - \epsilon_2\mathbf{E}_2) = \rho_s \qquad (1.88)$$
The first two of these state that the tangential electric fields are continuous across the
interface whereas the tangential magnetic fields are discontinuous at the same
location by an amount equal to the impressed electric current. Unless a source (i.e., free
charge) is actually placed at the interface, $\mathbf{J}_{is}$ is also zero and in that case, the
tangential magnetic fields will be continuous across the media as well.
When medium 2 is a perfect electric conductor then $\mathbf{E}_2 = \mathbf{H}_2 = 0$. In addition,
$$\hat{n}_1\times\mathbf{H}_1 = \mathbf{J}_{is} \qquad (1.89)$$
$$\hat{n}_1\cdot(\epsilon_1\mathbf{E}_1) = \rho_s \qquad (1.92)$$
The first two of these now imply that the tangential electric field vanishes on the
surface of the perfect electric conductor whereas the tangential magnetic field is
In the previous section, the boundary conditions which must be imposed at the
interface of different dielectrics were presented. Sometimes, it is difficult to utilize
these conditions since excessive computational cost is required or the resulting
formulation is numerically unstable such as the case of a thin dielectric sheet. In many
cases, much simpler approximate boundary conditions that account for the presence
of an inhomogeneous medium, coated metallic surface, or a thin dielectric layer can
be employed to simulate the actual surface. Below we discuss two types of such
approximate conditions: impedance boundary and sheet transition conditions. The
interested reader is directed to [11] for a general treatment of approximate
conditions.
\321\205\320\235 A.94)
reflected field attributed to (1.94) is identical to that due to the natural boundary
conditions. Then,
(1.95)
This is exact for an infinite planar interface while it is approximate for a curved
$$\left|\mathrm{Im}\left(\sqrt{\epsilon_r\mu_r}\right)\right| k_0 r_i \gg 1 \qquad (1.96)$$
where Im(·) denotes the imaginary part of the complex argument and the principal
radii of curvature, $r_i$, is associated with the surface at a point. This condition assures
that the material is sufficiently lossy so that the fields which penetrate into the
material do not re-emerge at some other point.
[Figure: impedance surface models, panels (a)-(c); panel (c) shows a dielectric coating ($\epsilon$, $\mu$) on a perfect conductor replaced by an impedance surface.]
shorted transmission line model with length corresponding to the coating
thickness, $t$:
(1.97)
The SIBCs can be applied for modeling surfaces whose material properties vary
slowly in the transverse plane. For a planar interface, the coating can have a varying
composition in the normal dimension, and Rytov [12] found the following
impedance
is useful where $N = \sqrt{\epsilon_r\mu_r}$ is the index of refraction and the normal derivative is
applied at the surface.
More accurate approximate conditions can be developed by incorporating
higher order derivatives in their constructions. These are referred to as Generalized
Impedance Boundary Conditions (GIBCs), and these are discussed in [11].
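$$\mathbf{J} = \sigma\mathbf{E} \qquad (1.99)$$
$$\mathbf{J}_s = \tau\mathbf{J} \qquad (1.100)$$
and from (1.99)
$$\hat{n}\times[\hat{n}\times(\mathbf{E}^+ + \mathbf{E}^-)] = -2Z_0R_e\mathbf{J}_s$$
$$\hat{n}\times[\hat{n}\times(\mathbf{E}^+ + \mathbf{E}^-)] = -2Z_0R_e\,\hat{n}\times(\mathbf{H}^+ - \mathbf{H}^-), \qquad \hat{n}\times(\mathbf{E}^+ - \mathbf{E}^-) = 0 \qquad (1.104)$$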
As long as the loss in the layer is sufficient to assure that no multiple field
penetrations will occur, these resistive transition conditions may be used for curved layers.
The dual to the resistive sheet condition is the conductive sheet condition which
The normalized conductivity of this sheet is denoted by $R_m$ with units mhos per
square. This condition is required for the simulation of materials which have
non-trivial permeability. Also, a special combination of coincident resistive and
conductive sheets with respective resistivity and conductivity
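$$\mathbf{J} = jk_0Y_0(\epsilon_r - 1)\mathbf{E} \qquad (1.107)$$
and (1.100). It follows that the tangential components of the field are given by
$$\mathbf{E}_t = Z_0R_e\mathbf{J}_s \qquad (1.108)$$
with
$$R_e = \frac{-j}{k_0\tau(\epsilon_r - 1)} \qquad (1.109)$$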
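The normalized resistivity of a thin dielectric layer can be evaluated directly. The sketch below assumes the form $R_e = -j/[k_0\tau(\epsilon_r - 1)]$, which follows from combining the polarization current (1.107) with the sheet current (1.100) and the sheet condition (1.108); the function name and the sample layer parameters are hypothetical.

```python
import math

C0 = 299792458.0  # speed of light in vacuum in m/s

def sheet_resistivity(eps_r, thickness, freq_hz):
    """Normalized resistivity R_e = -j / (k0 * tau * (eps_r - 1)).

    eps_r may be complex (eps' - j*eps'') for a lossy layer; thickness
    is the layer thickness tau in meters.
    """
    k0 = 2.0 * math.pi * freq_hz / C0
    return -1j / (k0 * thickness * (eps_r - 1.0))

# Hypothetical lossy layer: eps_r = 4 - 2j, 1 mm thick, at 10 GHz.
eps_r = 4.0 - 2.0j
tau = 1.0e-3
freq = 10.0e9
R_e = sheet_resistivity(eps_r, tau, freq)

# Consistency check: E_t = Z0*R_e*J_s with J_s = tau*j*k0*Y0*(eps_r - 1)*E
# requires R_e * (j*k0*tau*(eps_r - 1)) to equal one.
k0 = 2.0 * math.pi * freq / C0
closure = R_e * (1j * k0 * tau * (eps_r - 1.0))
```

A lossless layer gives a purely reactive $R_e$, while dielectric loss contributes a positive real part, as expected of a resistive sheet.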
A dual conductive sheet is given by
is known as the complex Poynting vector and has units of Watts/m$^2$. It represents the
complex power density of the wave, and it is therefore important to understand the
source and nature of this power. To do so, we refer to (1.21) and (1.22), where by
$$\mathbf{H}^*\cdot\nabla\times\mathbf{E} = -\mathbf{M}_i\cdot\mathbf{H}^* - j\omega\mu\mathbf{H}\cdot\mathbf{H}^* = -\mathbf{M}_i\cdot\mathbf{H}^* - j\omega\mu|\mathbf{H}|^2 \qquad (1.113)$$
$$\nabla\cdot(\mathbf{E}\times\mathbf{H}^*) = j\omega\epsilon^*|\mathbf{E}|^2 - j\omega\mu|\mathbf{H}|^2 - \mathbf{J}_i^*\cdot\mathbf{E} - \mathbf{M}_i\cdot\mathbf{H}^* \qquad (1.115)$$
which is an identity valid everywhere in space. Integrating both sides of this over a
volume V containing all sources, and invoking the divergence theorem yields
A.116)
(E x \320\251 ds =
\342\200\242
Pei + Pmi -Pd A.117)
s,
X- 1\321\202<\320\246
(E x H*) \302\246
ds
-
lo^W, - Wm\\
-i Im f f [j;
\342\200\242
E + Mr H*]dv (I.I 18)
f
where
$$P_{ei} = -\frac{1}{2}\iiint_V \mathrm{Re}(\mathbf{J}_i^*\cdot\mathbf{E})\,dv = \text{average outgoing power due to the impressed current } \mathbf{J}_i \qquad (1.119)$$
$$P_{mi} = -\frac{1}{2}\iiint_V \mathrm{Re}(\mathbf{M}_i\cdot\mathbf{H}^*)\,dv = \text{average outgoing power due to the impressed current } \mathbf{M}_i \qquad (1.120)$$
Wc =
- I I I ener|E|2 dv = average electric energy stored in \320\243 A.122)
(ExH')-rfe A.124)
convenient method of analysis will yield the correct solution to the problem.
The most common form of the uniqueness theorem is: In a region $V$ completely
occupied with dissipative media, a harmonic field (E, H) is uniquely determined by the
impressed currents in that region plus the tangential components of the electric or
magnetic fields on the closed surface $S_c$ bounding $V$. This theorem may be proved
by assuming for the moment that two solutions exist, denoted by ($\mathbf{E}_1$, $\mathbf{H}_1$) and
($\mathbf{E}_2$, $\mathbf{H}_2$). Both fields must, of course, satisfy Maxwell's equations (1.21) and (1.22)
with the same impressed currents ($\mathbf{J}_i$, $\mathbf{M}_i$). We have
$$\nabla\times\mathbf{H}_1 = \mathbf{J}_i + j\omega\epsilon\mathbf{E}_1, \qquad \nabla\times\mathbf{H}_2 = \mathbf{J}_i + j\omega\epsilon\mathbf{E}_2$$
$$\nabla\times\mathbf{E}_1 = -\mathbf{M}_i - j\omega\mu\mathbf{H}_1, \qquad \nabla\times\mathbf{E}_2 = -\mathbf{M}_i - j\omega\mu\mathbf{H}_2$$
(1.126)
(1.127)
1.9 SUPERPOSITION THEOREM
The superposition theorem states that for a linear medium, the total field intensity
(1.130)
(1.131)
By adding these two sets of equations, it is clear that the total field due to both
sources combined is given by
(1.130) and (1.131), respectively.
The duality theorem relates to the interchangeability of the electric and magnetic
fields, currents, charges, or material properties. We observe from (1.3) and (1.4) that
the first can be obtained from the second via the interchanges
$\mathbf{M} \to -\mathbf{J}$, $\mathbf{E} \to \mathbf{H}$, $\mathbf{H} \to -\mathbf{E}$, $\mathbf{J} \to \mathbf{M}$
The duality theorem can reduce formulation and computational effort when
1.11 NUMERICAL TECHNIQUES
$$\mathcal{L}u - f = 0 \qquad (1.135)$$
subject to appropriate boundary or transition conditions
$$B(u) = 0 \qquad (1.136)$$
within the domain ($\Omega$) and on its boundary ($S_c = \partial\Omega$). In these, the operator $\mathcal{L}$ is
based on one of the following: an integral representation of the fields such as
(1.52)-(1.53), on the vector wave equation (1.30), or the Helmholtz equation (1.36) for
scalar fields. It is understood that $u$ must be replaced by a vector field $\mathbf{u}$ when dealing
with the vector wave equations (1.30) or (1.35). The forcing function $f$ is a known
electromagnetics. One such solution, the fields due to a magnetic dipole in the presence
of an infinite metallic plane or cylinder, will be used in Chapter 7 to form the
appropriate dyadic Green's function for those geometries.However, most useful
electromagnetic scattering and radiation problemscannotbe solved using analytical
methods. Rather, an approximate numerical solution is sought which in some way
closely resembles the exact solution. Two methods of formulating such an approxi-
approximatesolution are: the Ritz method and the method of weighted residuals.
stationary point of a variational functional. For operators which are self-adjoint and
positive-definite (see later subsection for definitions), the stationary point of the
following functional
$$\langle a, b \rangle = \int_\Omega a\,b\,d\Omega \tag{1.139}$$
The choice of this inner product extends the validity of the variational expressions to
vectorial fields. When the operator $\mathcal{L}\mathbf{u}$ and $\mathbf{f}$ in (1.137) are chosen as
2The method was originally introduced by Rayleigh in 1877 and was extended by Ritz in 1909.
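$$\mathcal{L}\mathbf{u} = \nabla \times \left(\frac{1}{\mu_r}\nabla \times \mathbf{u}\right) - k_0^2\epsilon_r\mathbf{u} \tag{1.140}$$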
/\\\321\217\\
A.141)
it can be shown that setting the first variation of $F(\mathbf{u})$ to zero is equivalent to
satisfying the vector wave equation (1.30) over the computational domain $\Omega$.
Similarly, when
(1.142)
setting the first variation of $F(u)$ to zero is equivalent to satisfying the
inhomogeneous Helmholtz wave equation
in which $\nabla u \cdot \hat{n} = \partial u/\partial n$, $C$ denotes the contour enclosing the region $\Omega$ (see Fig. 1.8),
and $\hat{n}$ is the unit normal vector to $C$. Note that in deriving (1.145) we used the
identity
$$u\nabla^2 u = -\nabla u \cdot \nabla u + \nabla \cdot (u\nabla u) \tag{1.146}$$
and the divergence theorem
$$\iint_\Omega \nabla \cdot (u\nabla u)\,d\Omega = \oint_C u\,\nabla u \cdot \hat{n}\,ds \tag{1.147}$$
Next we proceed to evaluate the first variation of $F(u)$ given by
$$\delta F = F(u + \Delta\eta) - F(u) \tag{1.148}$$
where $\Delta \to 0$ is a scalar quantity. The evaluation of $\delta F$ involves the quantities
$$(u + \Delta\eta)^2 = u^2 + (\Delta\eta)^2 + 2(\Delta\eta)u \tag{1.149}$$
$$[\nabla u + \nabla(\Delta\eta)] \cdot [\nabla u + \nabla(\Delta\eta)] = \nabla u \cdot \nabla u + 2\nabla(\Delta\eta) \cdot \nabla u + \nabla(\Delta\eta) \cdot \nabla(\Delta\eta) \tag{1.150}$$
\320\264\320\270
3(\320\224\320\274) \320\264\320\270 \320\255(\320\224\321\213) .
\320\264 \320\220
F(u + % \342\200\242
Vm + klu2] dU - fudU
\\ f
\320\224\320\270) [-V\302\253 f [
2JJq JJq
9m
\342\200\236
\302\246
Vm
+ \320\224 f [-mVm + katr]dQ
I
F(u + \321\212
F(u)
\320\220\320\270) + A
[ [
m[V
\342\200\242
Vk + k\\u -f] </?2
-\320\233 \320\270*\320\250 +
+ A.153)
Jc dn 2 Jcl
\302\2611\\\320\270? \320\270?\\\320\260
\320\255\302\273
BnJ
where we also used the divergence theorem (1.147) and the identity (1.146) to obtain
the second and third terms. Clearly, the last two terms in (1.153) cancel each other,
leading to
$$\delta F = F(u + \Delta\eta) - F(u) = \Delta \iint_\Omega \eta\,[\nabla^2 u + k_0^2 u - f]\,d\Omega \tag{1.154}$$
of N basis functions
= <\320\230\302\273\320\223<\320\235'} A.156)
\320\233\">
./=1
[w)fda A.157)
JJ
where we used the inner product definition
$$\langle \{u\}, \{v\} \rangle = \{u\}^T\{v\}$$
for discrete data vectors. This functional is extremized by allowing all partial
derivatives with respect to the coefficients, $\{u\}$, to vanish
(1.158)
The elements of the matrix $[A]$ and excitation vector $\{b\}$ are given by
$$A_{ij} = \iint_\Omega w_i\,\mathcal{L}w_j\,d\Omega, \qquad b_i = \iint_\Omega w_i f\,d\Omega \tag{1.160}$$
that no physical significance can be attached to the stationary point of the functional
(1.137). In mechanical systems, for example, minimizing this functional represents
minimization of the total potential energy of the system. However, since
electromagnetics involves complex quantities, such a statement may not be asserted.
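$$\mathcal{L}(\mathbf{u}) = \begin{cases} \nabla \times (\bar{\mu}_r^{-1} \cdot \nabla \times \mathbf{u}) - k_0^2\,\bar{\epsilon}_r \cdot \mathbf{u}, & \mathbf{u} = \mathbf{E} \\ \nabla \times (\bar{\epsilon}_r^{-1} \cdot \nabla \times \mathbf{u}) - k_0^2\,\bar{\mu}_r \cdot \mathbf{u}, & \mathbf{u} = \mathbf{H} \end{cases} \tag{1.161}$$
$$\mathbf{f} = \begin{cases} -jk_0 Z_0\mathbf{J} - \nabla \times (\bar{\mu}_r^{-1} \cdot \mathbf{M}), & \mathbf{u} = \mathbf{E} \\ -jk_0 Y_0\mathbf{M} + \nabla \times (\bar{\epsilon}_r^{-1} \cdot \mathbf{J}), & \mathbf{u} = \mathbf{H} \end{cases} \tag{1.162}$$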
28 Fundamental Concepts \302\246
Chapter 1
F(H) = lf [VxH-f;'-VxH-^H-^H]f/r- f
H(dV A.163)
where
\320\241\321\203\321\205
\342\202\254\320\243\320\243
are the permittivity and permeability tensors of the media and \320\257
represents a
volume. In general, for arbitrary anisotropy, this functional will lead to an
asymmetric (non-Hermitian) system. One way to obtain a symmetric system is to use the
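functional
$$F(\mathbf{u}) = \langle \mathcal{L}\mathbf{u}, \mathbf{u}_a \rangle - \langle \mathbf{u}, \mathbf{f}_a \rangle - \langle \mathbf{u}_a, \mathbf{f} \rangle \tag{1.165}$$
where $\mathbf{u}_a$ and $\mathbf{f}_a$ satisfy the partial differential equation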
)= <v,?eii> A.167)
The method of weighted residuals [17], [18] begins with the residual
A.168)
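$$\int_\Omega \left[t_i\,\mathcal{L}\{w\}^T\{u\} - t_i f\right] d\Omega = 0, \qquad i = 1, 2, 3, \ldots, N \tag{1.169}$$
$$\left[\int_\Omega w_i\,\mathcal{L}w_j\,d\Omega\right]\{u\} = \left\{\int_\Omega w_i f\,d\Omega\right\} \tag{1.170}$$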
which is identical to the Ritz procedure given above. Thus, Galerkin's method leads
to the same linear system (1.159) as the Ritz method.
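As an illustration of how Galerkin testing turns an operator equation into a matrix system of the form (1.159), the following minimal sketch solves the one-dimensional Helmholtz problem $u'' + k^2u = 1$ on $(0, 1)$ with $u(0) = u(1) = 0$. The choice of problem, the linear "hat" expansion functions, the integration by parts of the second derivative, and all function names are assumptions made for this example only; they are not taken from the text.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting)."""
    n = len(diag)
    d = diag[:]
    r = rhs[:]
    for i in range(1, n):
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def galerkin_helmholtz_1d(k, num_elements):
    """Galerkin solution of u'' + k^2 u = 1 on (0, 1) with u(0) = u(1) = 0."""
    h = 1.0 / num_elements
    n = num_elements - 1  # number of interior nodes (unknowns)
    # A_ij = integral of (-w_i' w_j' + k^2 w_i w_j); the second derivative
    # has been moved onto the testing function by integration by parts.
    a_diag = -2.0 / h + k * k * (2.0 * h / 3.0)
    a_off = 1.0 / h + k * k * (h / 6.0)
    b = [h] * n  # b_i = integral of w_i * f with f = 1
    u = solve_tridiagonal([a_off] * (n - 1), [a_diag] * n, [a_off] * (n - 1), b)
    x = [(i + 1) * h for i in range(n)]
    return x, u


def exact_solution(k, x):
    """Closed-form solution of the same boundary value problem."""
    b = (1.0 - math.cos(k)) / math.sin(k)
    return (1.0 - math.cos(k * x) - b * math.sin(k * x)) / (k * k)
```

Calling `galerkin_helmholtz_1d(2.0, 50)` returns the interior node positions and the coefficient vector, which can be compared node by node with `exact_solution`. The value of $k$ should stay away from the resonances $k = n\pi$ of the unit interval, where the continuous problem itself has no unique solution.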
As a generalization, when $F(u)$ is chosen as
$$F(u) = \frac{1}{2}\langle \mathcal{L}u, u^* \rangle - \langle f, u^* \rangle \tag{1.171}$$
the matrix $[A]$. They are often used in evaluating the numerical system's condition,
which in turn affects the stability of the solution. That is, of interest is how a small
change in the excitation or right hand side of the matrix system (1.159) affects the
$\{u\} = \{u_1, u_2, u_3, \ldots, u_N\}^T$ is the Euclidean norm. It is defined by3
$$\|u\|_2 = \sqrt{\langle \{u\}, \{u\} \rangle} = \sqrt{\{u\}^T\{u\}} \tag{1.172}$$
$$\|u\|_2 = \sqrt{\sum_{i=1}^{N} |u_i|^2} = \sqrt{\{u\}^T\{u^*\}} \tag{1.173}$$
for a complex vector $\{u\}$. Here $|u_i|$ implies the absolute value of the quantity.
Throughout the book, the notation $\|u\|$ will imply the Euclidean norm of a vector
or data column unless otherwise noted.
Infinity Norm. The infinity norm of a data vector is defined by
$$\|u\|_\infty = \max_{1 \le i \le N} |u_i| \tag{1.174}$$
This norm is also referred to as the uniform vector norm or maximum magnitude norm.
Hölder Norm. The Hölder or p-norm is a generalization of the Euclidean norm:
$$\|u\|_p = \left(\sum_{i=1}^{N} |u_i|^p\right)^{1/p} \tag{1.175}$$
where $|u_i|^p$ denotes the $p$th power of the quantity $|u_i|$.
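The three vector norms above are straightforward to evaluate. The sketch below is illustrative only; the function names are hypothetical and the inputs are plain Python sequences of real or complex numbers.

```python
def euclidean_norm(u):
    """Euclidean norm: square root of the sum of |u_i|^2, as in (1.172)-(1.173)."""
    return sum(abs(x) ** 2 for x in u) ** 0.5


def infinity_norm(u):
    """Infinity (maximum magnitude) norm, as in (1.174)."""
    return max(abs(x) for x in u)


def holder_norm(u, p):
    """Holder p-norm, as in (1.175); p = 2 recovers the Euclidean norm."""
    return sum(abs(x) ** p for x in u) ** (1.0 / p)
```

For the vector `[3, -4]` the Euclidean norm is 5 and the infinity norm is 4. As `p` grows, `holder_norm` approaches the infinity norm, which is one way to see why the latter carries the $\infty$ subscript.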
\\\\A\\\\F
= (LI76)
$$\|A\|_\infty = \max_{1 \le i \le N} \sum_{j=1}^{N} |A_{ij}| \tag{1.177}$$
This specific norm is also referred to as the row-sum norm. Similarly the column-sum
matrix norm is given by
$$\|A\|_1 = \max_{1 \le j \le N} \sum_{i=1}^{N} |A_{ij}| \tag{1.178}$$
The infinity norm is referred to as the natural norm of $[A]$. It can be shown that
$$\|A\|_\infty = \max_{\|u\|_\infty = 1} \|[A]\{u\}\|_\infty = \max_{1 \le i \le N} \sum_{j=1}^{N} |A_{ij}| \tag{1.179}$$
$$\mathrm{Cond}(A) = \|A\|\,\|A^{-1}\|$$
>
?dh1 A.180)
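To make the matrix norms and the condition number of (1.180) concrete, the following sketch evaluates them for a $2 \times 2$ matrix stored as nested lists. The helper names and the restriction to $2 \times 2$ matrices (so that the inverse can be written explicitly) are assumptions of this example.

```python
def row_sum_norm(a):
    """Largest absolute row sum (the row-sum, or infinity, norm)."""
    return max(sum(abs(x) for x in row) for row in a)


def column_sum_norm(a):
    """Largest absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def frobenius_norm(a):
    """Square root of the sum of squared magnitudes of all entries."""
    return sum(abs(x) ** 2 for row in a for x in row) ** 0.5


def inverse_2x2(a):
    """Explicit inverse of a 2 x 2 matrix."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is singular")
    return [[s / det, -q / det], [-r / det, p / det]]


def condition_number(a, norm=row_sum_norm):
    """Cond(A) = ||A|| ||A^-1|| in the chosen norm."""
    return norm(a) * norm(inverse_2x2(a))
```

For `[[1.0, 2.0], [3.0, 4.0]]` the row-sum norm is 7, the row-sum norm of the inverse is 3, and the condition number is therefore 21. A nearly singular matrix such as `[[1.0, 1.0], [1.0, 1.0001]]` gives a condition number several orders of magnitude larger.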
\\\\A-AA\\\\ .
= 10\"
1\320\230\320\230
||u \342\200\224
uAll
,0-,
where
That is, $s$ is always smaller than $t$, implying a larger error for the final solution
vector. More specifically, an error in the seventh decimal place for the norm of $[A]$
(i.e., $t = 7$) translates to an error in the third decimal place for (the norm of) the
1.11.5 Some Matrix Definitions
$\{u\}^T[A]\{u\} > 0$, $\{u\} \neq 0$    Positive-definite
$\{u\}^T[A]\{u\} \geq 0$, $\{u\} \neq 0$    Nonnegative
7'rMI\302\253)) <0 Indefinite
Given the nature of the matrix or operator, we can immediately make a
statement about the eigenvalues of that operator. Some of the most common
relationships between operators and eigenvalues are given in Table 1.2.
$\nabla \times (\mu_r^{-1}\nabla \times \mathbf{u}) - k_0^2\epsilon_r\mathbf{u}$ do not guarantee positive-definiteness. That is, if the operator
is not positive-definite, the Rayleigh-Ritz method fails to ensure minimization of the
functional since a global stationary point may not exist. However, the application of
Galerkin's method to yield a discrete system does not require that the operator is
Rayleigh-Ritz    minimize $\frac{1}{2}\langle \mathcal{L}u, u \rangle - \langle f, u \rangle$
Galerkin         solve $\langle \mathcal{L}u, w_j \rangle - \langle f, w_j \rangle = 0$
Least Squares    minimize $Q(u) = \langle \mathcal{L}u - f, \mathcal{L}u - f \rangle$
5Hilbert space refers to a linear space where a given inner product has been defined and which is
complete with respect to this inner product.
6Because of the complex $\epsilon_r$ and $\mu_r$ in electromagnetics, the operators may be symmetric but not
Hermitian (i.e., self-adjoint). From Section 1.11.5, Hermitian operators have positive and real eigenvalues
and are therefore positive.
$$\langle \mathcal{L}u, v \rangle = \langle u, \mathcal{L}^a v \rangle \tag{1.183}$$
and is referred to as the adjoint operator. Clearly, (1.182) is obtained by multiplying
the right and left hand sides of $\mathcal{L}u = f$ by $\mathcal{L}^a$. The new operator $\mathcal{D}$, where
$$\mathcal{D}u = g \tag{1.184}$$
($g = \mathcal{L}^a f$) is now positive-definite and self-adjoint. The corresponding matrix system
resulting from (1.184) is of the form
$$[A^*]^T[A]\{u\} = [A^*]^T\{b\} \tag{1.185}$$
or
$$[B]\{u\} = \{g\}$$
where $[B] = [A^*]^T[A]$. It is thus seen that the desired property of positive-definiteness
comes at the price of squaring the matrix condition number. As is well known, large
matrix condition numbers lead to less accurate solutions and slower convergence when an
iterative solver is used.
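The squaring of the condition number can be checked numerically. The sketch below is restricted, as an assumption of this example, to real $2 \times 2$ matrices, for which $[A^*]^T[A]$ reduces to $[A]^T[A]$ and the eigenvalues of a symmetric matrix have a closed form. The condition number is measured in the 2-norm (ratio of the largest to the smallest singular value), and the function names are hypothetical.

```python
import math


def transpose_times(a):
    """Return A^T A for a real 2 x 2 matrix A."""
    return [[sum(a[k][i] * a[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sym_eigs_2x2(s):
    """Eigenvalues (smallest, largest) of a real symmetric 2 x 2 matrix."""
    mean = 0.5 * (s[0][0] + s[1][1])
    radius = math.hypot(0.5 * (s[0][0] - s[1][1]), s[0][1])
    return mean - radius, mean + radius


def cond2(a):
    """2-norm condition number: largest over smallest singular value of A."""
    smallest, largest = sym_eigs_2x2(transpose_times(a))
    return math.sqrt(largest / smallest)
```

With `a = [[2.0, 1.0], [0.0, 1.0]]`, `cond2(a)` is about 2.618, while `cond2(transpose_times(a))` is about 6.854, which is its square.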
It should be remarked that minimization of the functional for the Least
Squares Method is equivalent to solving the differential equation (1.182).
Consequently, the Least Squares Method leads to positive-definite systems at the
expense of squaring the matrix condition. Also, the Least Squares Method minimizes
the square of the norm (as -*\302\246
\320\277 u), viz.
Again, it is important to note that nothing can be said about convergence unless the
operator is positive-definite.
The finite element method can assume various forms depending on the desired
field quantity. Many applications prefer either a total or secondary electric field
formulation. Other applications desire a result in terms of either the total or
secondary magnetic field. Some applications can utilize a potential formulation. Thus, even
though Maxwell's equations relate these various quantities, an accurate field
computation often demands a particular formulation. The advantages and disadvantages
of each of these formulations are discussed below.
The total electric field formulation is a very popular choice. This is because
enforcement of the boundary conditions associated with perfect electric conductors
(pec) is particularly easy. Since the tangential electric fields on a pec surface must
vanish, the edges of the mesh associated with those surfaces are a priori set to zero.
Three methods are commonly used in practice to enforce this condition. The first is
accomplished by forcing a null field condition to zero out all entries of the matrix
associated with that edge (except for the self-term which is set to unity), and by also
setting the excitation entry to zero. Thus, as the unknown fields are solved, the edges
lying on pec surfaces are forced to zero. The second method involves a preprocessing
step where the edges associated with a pec surface are removed from the list of
unknowns. Thus, the number of edges is greater than the number of unknowns
and matrix entries for these pec edges are never computed. This approach has the
advantage of reducing the order of the matrix and therefore reducing memory and
compute cycle demands. The third method, useful when an iterative matrix solver is
employed, involves forcing the unknowns associated with the pec edges to zero
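The first two ways of enforcing the pec condition can be compared on a small dense system. The sketch below is illustrative only: the matrix is stored as nested lists, edges are identified by zero-based indices, and both the row and the column of each pec edge are zeroed (zeroing the column as well is an assumption made here to keep the matrix symmetric). All function names are hypothetical.

```python
def solve_dense(a, b):
    """Gaussian elimination with partial pivoting on copies of a and b."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def enforce_pec_by_zeroing(a, b, pec_edges):
    """First method: zero the entries, unit self-term, zero excitation."""
    a = [row[:] for row in a]
    b = b[:]
    for e in pec_edges:
        for j in range(len(b)):
            a[e][j] = 0.0
            a[j][e] = 0.0
        a[e][e] = 1.0
        b[e] = 0.0
    return a, b


def enforce_pec_by_elimination(a, b, pec_edges):
    """Second method: remove the pec edges from the list of unknowns."""
    pec = set(pec_edges)
    free = [i for i in range(len(b)) if i not in pec]
    a_red = [[a[i][j] for j in free] for i in free]
    b_red = [b[i] for i in free]
    return a_red, b_red, free
```

Both routes give the same values on the free edges; the second solves a smaller system, which is the memory and compute saving mentioned above.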
computational burden. However, this is not the only valid formulation and in certain
boundary conditions (see Chapters 4 and 6). However, they also have an added
advantage when a boundary integral is used for mesh closure. Experience has
shown that phase errors in the computed interior field tend to increase within the
mesh at locations distant from boundaries on which boundary conditions are
imposed. This is due to unavoidable numerical inaccuracies that increase as the effect
of the boundary conditions propagates throughout the mesh. That is, previous errors
in the adjoining field are incorporated and magnified as the field is evaluated at a
more distant field point. Since boundary conditions are always enforced with total
fields, the total field formulation enforces such conditions only on the boundaries of
the mesh and pec surfaces (for E-field formulations). For very large computational
domains, significant distance can lie between a field point within the interior of the
mesh and the mesh boundary. Hence, the potential for error propagation throughout
the mesh. A scattered field formulation enforces the boundary conditions on the
mesh boundary, pec surfaces, and dissimilar material interfaces. Therefore, the
distance between a boundary condition and any interior field point is reduced and
accordingly the phase error throughout the mesh may also be reduced. The scattered
field formulation has the disadvantage of higher matrix order (i.e., more unknowns
and equations) and explicit enforcement of the boundary conditions associated with
pec surfaces and material discontinuities.
Magnetic field formulations are preferred for applications where the desired
result is the magnetic field within the computational domain. This is due to the fact
that although Maxwell's equations relate the electric and magnetic fields, in practice
one quantity cannot be accurately obtained from the other by numerical
differentiation. This is due to the inherent error occurring when continuous derivatives are
replaced with discrete differences. Rather, a computationally expensive integral
expression is necessary for accurate field differentiation provided a suitable
Green's function is available. Hence, an accurate solution demands a formulation
consistent with the desired result.
Finally, some finite element practitioners utilize a potential formulation which
employs the scalar or vector potentials as the unknown quantities (see Chapter 5).
The use of this approach is related to the hybrid finite element-boundary integral
method where the singularity of the integral equation associated with the boundary
can be reduced with a potential formulation. However, if this reduction is present,
we note that a numerical differentiation operation may be required to obtain the
desired field quantity and this operation may lead to inaccuracy.
REFERENCES
[2] R. E. Collin. Field Theory of Guided Waves. IEEE Press, New York, 1991.
[3] C. A. Balanis. Advanced Engineering Electromagnetics. McGraw-Hill, New
York, 1989.
York, 1968.
edition, 1993.
The finite element method is used for modeling a wide class of problems by breaking
solely specifying the basis functions. The element choice, however, needs human
intervention and intelligence to ensure a reliable solution of the problem at hand.
As will be shown later in this chapter, the development of a special class of elements
which mimic the character of electric/magnetic fields has proved to be the key in
obtaining robust solutions to three-dimensional problems in electromagnetics.
shape functions have been used extensively in civil and mechanical engineering
applications as well as in scalar electromagnetic field problems. However, a full
three-dimensional vector formulation brings out numerous deficiencies in these
traditional element shape functions [1], [2]. Edge-based vector basis functions with
unknowns associated with element edges have thus been derived to overcome the
problems related to nodal basis and these are now extensively used for solving
three-dimensional electromagnetics problems. We will also describe the hierarchical nature
of the edge-based functions and their possible applicability in \321\203\321\202-based refinement
techniques.
38 Shape Functions for Scalar and Vector Finite Elements \302\246
Chapter 2
distinguishing property of PDE techniques and leads to very sparse matrices in finite elements,
whereas IE techniques give rise to full, dense matrices resulting in poor scalability as
problem size increases.
or when the extra terms are symmetric with respect to one another, as in the
following incomplete third-order polynomial
$$u(x, y) = c_1 + c_2 x + c_3 y + c_4 x^2 + c_5 xy + c_6 y^2 + c_7 x^2 y + c_8 x y^2 \tag{2.2}$$
Such approximation functions have the characteristic that, for fixed x or y, they are
polynomials that will yield the highest order of approximation for a minimum
number of unknowns associated with that element shape. The two examples shown above
apply to two dimensions, but their extension to three-dimensional elements is
straightforward. Typically, the higher the order of the approximating polynomial,
the lower the error in the final solution if element size remains constant. As usual,
there is a trade-off here between the desired accuracy and the degrees of freedom
required to solve the problem.
2.2.3 Continuity
$n$th order are said to be $C^n$ continuous. For elliptic PDEs of order $2k$ ($k = 1, 2$),
the continuity requirement is $C^{k-1}$ for Galerkin methods. In most electromagnetic
problems, functions which exhibit $C^0$ continuity (i.e., function continuity) are used
since the discontinuous first derivatives are integrable. However, it is difficult to
Section 2.3 Node-Based
\302\246 Elements 39
impose continuity of order 1 and higher since the determination of suitable shape
functions is very complicated. For example, ninth-order polynomials are required to
obtain $C^1$ continuity for tetrahedral elements. In electromagnetics, Wong and
Cendes [3] used $C^1$ node-based triangles to avoid the problem of spurious modes
in the determination of cavity resonances. All shape functions derived in the
following sections impose function continuity or $C^0$ continuity (not derivative continuity)
between elements.
In node-based finite elements, the form of the sought function in the element is
controlled by the function values at its nodes. The approximating function can be written as
$$u^e(x, y) = \sum_{i=1}^{p} u_i^e N_i^e(x, y) \tag{2.3}$$
Since the expression (2.3) must be valid for any nodal variable $u_i^e$, the basis function
$N_i^e(x, y)$ must be unity at node $i$ and zero for all remaining nodes within the element.
and more systematic to construct higher order bases in the Lagrange family while
structure; for example, the bounding curve of the cross section of an infinite cylinder.
These basis functions can also be used in conjunction with higher-dimensional finite
with endpoints $x_1^e$ and $x_2^e$. The basis functions for element $e$ are then defined as
$$N_1^e(x) = \frac{x_2^e - x}{x_2^e - x_1^e}, \qquad N_2^e(x) = \frac{x - x_1^e}{x_2^e - x_1^e}$$
The basis functions have unit magnitude at one node and vanish at all others with
linear variation between the nodes. Higher order basis functions can be constructed
are the nodes of the one-dimensional element, and we are interested in finding the
basis function for the $j$th node $x_j^e$, then the corresponding Lagrange polynomial
describing this basis function is given by [4]
$$N_j^e(x) = \frac{(x - x_1^e)(x - x_2^e) \cdots (x - x_{j-1}^e)(x - x_{j+1}^e) \cdots (x - x_n^e)}{(x_j^e - x_1^e)(x_j^e - x_2^e) \cdots (x_j^e - x_{j-1}^e)(x_j^e - x_{j+1}^e) \cdots (x_j^e - x_n^e)}$$
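The Lagrange construction is short enough to write out directly. The sketch below evaluates the basis function attached to one node of a one-dimensional element; the function name and the zero-based node index are assumptions of this example.

```python
def lagrange_basis(nodes, j, x):
    """Value at x of the Lagrange polynomial that is 1 at nodes[j], 0 at the others."""
    value = 1.0
    for m, x_m in enumerate(nodes):
        if m != j:
            value *= (x - x_m) / (nodes[j] - x_m)
    return value
```

With two nodes the result is the linear basis function described earlier, and with three equispaced nodes it is the quadratic one. In either case the basis functions of an element sum to unity at any point `x`.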
2.3.2.1 Rectangular and Quadrilateral Elements. The simple shape of the
rectangular element permits its shape functions to be written down merely by
inspection. On examining the element shape given in Fig. 2.1, the shape functions can be
cast in the form
where $x_c^e$ and $y_c^e$ denote the coordinates of the midpoints of the element edges, $h_x^e$ and
$h_y^e$ represent the edge lengths and $A^e$ denotes the area of the element. Each basis
function $N_i^e$ has unit magnitude at the $i$th node, vanishes at the remaining three
nodes, and varies linearly. Hence it can be used in (2.3) to represent $u^e$. Higher order
rectangular elements include the eight node (three equispaced nodes per edge) and
the twelve node (four equispaced nodes per edge) quadrilateral element discussed in
[4]. However, these elements can model only regular geometries and decline in
accuracy with excessive shape distortion. Thus, often they are not very useful in practice.
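$$\begin{aligned} x_1 &= a - b - c + d, & y_1 &= a' - b' - c' + d' \\ x_2 &= a + b - c - d, & y_2 &= a' + b' - c' - d' \\ x_3 &= a + b + c + d, & y_3 &= a' + b' + c' + d' \\ x_4 &= a - b + c - d, & y_4 &= a' - b' + c' - d' \end{aligned} \tag{2.6}$$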
On solving for the unknown coefficients in the above equation, the basis functions
can be cast in the following form
$$N_i^e = \frac{1}{4}(1 + \xi_0)(1 + \eta_0), \qquad i = 1, \ldots, 4 \tag{2.7}$$
where
$$\xi_0 = \xi\xi_i \quad \text{and} \quad \eta_0 = \eta\eta_i \tag{2.8}$$
The variables $(\xi_i, \eta_i)$ denote the coordinates of the $i$th node in the $(\xi, \eta)$ coordinate
system. The linear quadrilateral is also known as an isoparametric element since the
shape functions defining the geometry and the nodal values are the same.
Higher order quadrilateral elements include the eight-node element (four
corner nodes and four midside nodes) and the twelve-node element (four equispaced
nodes per edge). The basis functions for such elements can be found in [5].
Due to the irregular shape of the quadrilateral element, it is not easy to
integrate the basis functions in the xy plane. To facilitate and generalize the integration
process, the concept of the Jacobian is introduced. The Jacobian matrix, or more
specifically its determinant, transforms the infinitesimal area or volume element from
one coordinate system to another. If we consider the above example, we are
essentially transforming between the global $(x, y)$ coordinates and the local $(\xi, \eta)$
coordinates. By the chain-rule of partial differentiation, we can express the $\xi$ derivative of
$N_i^e$ as
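$$\frac{\partial N_i^e}{\partial \xi} = \frac{\partial N_i^e}{\partial x}\frac{\partial x}{\partial \xi} + \frac{\partial N_i^e}{\partial y}\frac{\partial y}{\partial \xi}$$
Treating the $\eta$ derivative in the same way gives
$$\begin{Bmatrix} \partial N_i^e/\partial \xi \\ \partial N_i^e/\partial \eta \end{Bmatrix} = \begin{bmatrix} \partial x/\partial \xi & \partial y/\partial \xi \\ \partial x/\partial \eta & \partial y/\partial \eta \end{bmatrix} \begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \end{Bmatrix} = [J] \begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \end{Bmatrix}$$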
Since $(x, y)$ are known explicitly in terms of the local coordinates $(\xi, \eta)$, the Jacobian
matrix can be found explicitly in terms of local coordinates. Care must be taken in
the choice of the local coordinate system such that the Jacobian matrix is
non-singular. To find the derivatives with respect to $x$ and $y$, we merely need to invert
the Jacobian matrix to yield
$$\begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \end{Bmatrix} = [J]^{-1} \begin{Bmatrix} \partial N_i^e/\partial \xi \\ \partial N_i^e/\partial \eta \end{Bmatrix}$$
The technique can easily be generalized for n-dimensional transformations if
necessary. The infinitesimal area element $dA$ can now be written as
$$dx\,dy = \det[J]\,d\xi\,d\eta \tag{2.11}$$
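As a small check of the area transformation (2.11), the sketch below builds the Jacobian of a four-node quadrilateral from the bilinear shape functions (2.7) and integrates its determinant over the square $-1 \le \xi, \eta \le 1$ to recover the element area. The counterclockwise node ordering with local coordinates $(-1,-1)$, $(1,-1)$, $(1,1)$, $(-1,1)$, the use of a $2 \times 2$ Gauss rule, and the function names are assumptions of this example.

```python
import math

# Local coordinates (xi_i, eta_i) of nodes 1 to 4, counterclockwise.
NODE_SIGNS = [(-1, -1), (1, -1), (1, 1), (-1, 1)]


def shape_functions(xi, eta):
    """Bilinear shape functions N_i = (1 + xi*xi_i)(1 + eta*eta_i)/4."""
    return [0.25 * (1 + xi * s) * (1 + eta * t) for s, t in NODE_SIGNS]


def shape_derivatives(xi, eta):
    """Derivatives of the shape functions with respect to xi and eta."""
    d_xi = [0.25 * s * (1 + eta * t) for s, t in NODE_SIGNS]
    d_eta = [0.25 * t * (1 + xi * s) for s, t in NODE_SIGNS]
    return d_xi, d_eta


def jacobian(xi, eta, xy):
    """Jacobian [[dx/dxi, dy/dxi], [dx/deta, dy/deta]] for node coordinates xy."""
    d_xi, d_eta = shape_derivatives(xi, eta)
    return [
        [sum(d * x for d, (x, _) in zip(d_xi, xy)),
         sum(d * y for d, (_, y) in zip(d_xi, xy))],
        [sum(d * x for d, (x, _) in zip(d_eta, xy)),
         sum(d * y for d, (_, y) in zip(d_eta, xy))],
    ]


def det2(j):
    """Determinant of a 2 x 2 matrix."""
    return j[0][0] * j[1][1] - j[0][1] * j[1][0]


def quad_area(xy):
    """Element area as the integral of det[J] using a 2 x 2 Gauss rule."""
    g = 1.0 / math.sqrt(3.0)
    return sum(det2(jacobian(a, b, xy)) for a in (-g, g) for b in (-g, g))
```

For the quadrilateral with vertices `(0, 0)`, `(2, 0)`, `(3, 2)`, `(0, 1)` the routine returns 3.5, in agreement with the shoelace formula. A negative or zero determinant at a sample point would signal a badly ordered or excessively distorted element.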
I \321\205 \321\203
I
=\302\246 l
\321\207 4 /2 B.1-
2J
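Figure 2.3 Triangular element.
$$L_1^e = \frac{\Delta_1}{\Delta} = \frac{\text{Area } P23}{\text{Area } 123}, \qquad L_2^e = \frac{\Delta_2}{\Delta} = \frac{\text{Area } P31}{\text{Area } 123}, \qquad L_3^e = \frac{\Delta_3}{\Delta} = \frac{\text{Area } P12}{\text{Area } 123}$$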
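$$x = \sum_{i=1}^{3} L_i^e x_i^e, \qquad y = \sum_{i=1}^{3} L_i^e y_i^e, \qquad \sum_{i=1}^{3} L_i^e = 1 \tag{2.14}$$
$$\Delta_1 + \Delta_2 + \Delta_3 = \Delta.$$
Alternatively, $L_1^e$, $L_2^e$, and $L_3^e$ can be obtained in terms of $x$, $y$, and
the vertex coordinates by solving the system of equations (2.14).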
The coordinate $L_1^e$ is zero on the edge opposite to vertex 1 and unity at vertex 1.
Its variation along the height of the triangle is displayed in Fig. 2.4. The remaining
two area coordinates associated with the other two vertices behave similarly,
vanishing on the edge opposite to the corresponding vertex and having unit magnitude at
the vertex it belongs to. This feature combined with spatial locality and $C^0$ continuity
qualifies the area coordinates as suitable basis functions $N_i^e$ for a triangle when the
$$N_i^e = L_i^e \tag{2.15}$$
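The area-ratio definition of the area coordinates translates directly into code. In the sketch below, points are `(x, y)` tuples and signed areas are used so that the vertex ordering cancels in the ratios; the function names are hypothetical.

```python
def signed_area(p, q, r):
    """Signed area of triangle pqr (positive for counterclockwise ordering)."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def area_coordinates(point, v1, v2, v3):
    """Area coordinates (L1, L2, L3) of point in the triangle v1, v2, v3."""
    total = signed_area(v1, v2, v3)
    return (
        signed_area(point, v2, v3) / total,  # Area P23 / Area 123
        signed_area(point, v3, v1) / total,  # Area P31 / Area 123
        signed_area(point, v1, v2) / total,  # Area P12 / Area 123
    )
```

At a vertex the matching coordinate is unity and the other two vanish, the three coordinates always sum to one, and the point itself is recovered as the coordinate-weighted sum of the vertices, as in (2.14).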
Higher order basis functions for triangles can be derived using the procedure
given in [4] and [6]. In general, the shape function $N_i^e$ for node $i$, labeled as $(I, J, K)$,
is given by
= n B.16)
Figure 2.4 Area coordinates of a triangle.
where $L_i^e$ are the area coordinates defined previously and $P^n(L_i^e)$ is the
polynomial
j -
.t=o
B.17)
Using the formulae given above, the basis functions for a quadratic triangle
$$N_i^e = L_i^e(2L_i^e - 1), \qquad i = 1, 2, 3 \qquad \text{(corner nodes)} \tag{2.18}$$
$$N_i^e = 4L_j^e L_k^e \qquad \text{(midside nodes)}$$
$$u^e(x, y, z) = \sum_{i=1}^{8} u_i^e N_i^e(x, y, z) \tag{2.21}$$
avoided on writing down the required basis functions by mere inspection. Since
the basis function $N_i^e$ must be unity at node $i$ and zero at the remaining nodes,
the eight interpolation functions can be written down as
where $x_c^e$, $y_c^e$, and $z_c^e$ denote the coordinates of the center of the element, $h_x^e$, $h_y^e$, and $h_z^e$
represent the edge lengths of the element and $V$ is the element volume.
Brick elements of the Serendipity family are derived in [4]. To obtain higher
order basis functions, a progressively larger number of nodes is uniformly placed on
the element edges. Bricks with $n$th-order interpolation functions (with $n + 1$ nodes per
edge) require $8 + 12(n - 1)$ degrees of freedom. Higher order bricks are rarely
used in electromagnetics applications since the regular shape requirement and
decline of accuracy with excessive shape distortion place severe limitations on the
mapping the element in the $xyz$ coordinate system onto a standard cube in a new
$\xi\eta\zeta$ coordinate system. We proceed along lines similar to the derivation of bases for the
quadrilateral element in Section 2.3.2. We express the Cartesian coordinates $(x, y, z)$
in terms of $(\xi, \eta, \zeta)$ as follows:
x = o,
=
\321\203 a2 + *rf + e2n+ && + e,Jv +\320\250 + &ft + Arf 4f B-22)
2=
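$$N_i^e = \frac{1}{8}(1 + \xi\xi_i)(1 + \eta\eta_i)(1 + \zeta\zeta_i) \tag{2.23}$$
with $(\xi_i, \eta_i, \zeta_i)$ denoting the coordinates of the $i$th node in the $\xi\eta\zeta$ coordinate system.
As before, the relationship between the $(x, y, z)$ and $(\xi, \eta, \zeta)$ coordinates is given by
$$x = \sum_{i=1}^{8} N_i^e x_i^e, \qquad y = \sum_{i=1}^{8} N_i^e y_i^e, \qquad z = \sum_{i=1}^{8} N_i^e z_i^e$$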
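comes to our aid for calculating the volume integral over the arbitrary hexahedral
$$\begin{Bmatrix} \partial N_i^e/\partial \xi \\ \partial N_i^e/\partial \eta \\ \partial N_i^e/\partial \zeta \end{Bmatrix} = \begin{bmatrix} \partial x/\partial \xi & \partial y/\partial \xi & \partial z/\partial \xi \\ \partial x/\partial \eta & \partial y/\partial \eta & \partial z/\partial \eta \\ \partial x/\partial \zeta & \partial y/\partial \zeta & \partial z/\partial \zeta \end{bmatrix} \begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \\ \partial N_i^e/\partial z \end{Bmatrix} = [J] \begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \\ \partial N_i^e/\partial z \end{Bmatrix}$$
The volume element transformation from the global $xyz$ to the local $\xi\eta\zeta$ coordinate
system is expressed as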
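$$L_1 = \frac{\text{Volume } P234}{\text{Volume } 1234}, \quad L_2 = \frac{\text{Volume } P341}{\text{Volume } 1234}, \quad L_3 = \frac{\text{Volume } P412}{\text{Volume } 1234}, \quad L_4 = \frac{\text{Volume } P123}{\text{Volume } 1234} \tag{2.26}$$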
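and any position within the element is specified by
$$x = \sum_{i=1}^{4} L_i x_i, \qquad y = \sum_{i=1}^{4} L_i y_i, \qquad z = \sum_{i=1}^{4} L_i z_i$$
with $(x_i, y_i, z_i)$ being the coordinates of node $i$. As for the two-dimensional case
(triangular elements), the basis functions $N_i^e$ are equal to the volume coordinates,
i.e.,
$$N_i^e = L_i, \qquad i = 1, \ldots, 4 \tag{2.27}$$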
Quadratic shape functions for a tetrahedron necessitate the use of ten node
points: the four corner nodes and the remaining six at the midpoints of the edges.
The shape functions for the quadratic tetrahedron are given by
$$\begin{aligned} N_i^e &= L_i(2L_i - 1), & i &= 1, \ldots, 4 & &\text{(corner nodes)} \\ N_i^e &= 4L_jL_k, & i &= 5, \ldots, 10 & &\text{(midside nodes; } j \text{ and } k \text{ are endpoints of each edge)} \end{aligned} \tag{2.28}$$
Similarly to triangular elements, volume coordinates greatly simplify
integration over tetrahedral elements. A useful formula for integrating over the volume of a
tetrahedron is
$$\iiint_{\text{volume}} L_1^a L_2^b L_3^c L_4^d \,dx\,dy\,dz = 6V\,\frac{a!\,b!\,c!\,d!}{(a + b + c + d + 3)!} \tag{2.29}$$
where $a$, $b$, $c$, and $d$ are integers and $V$ is the volume of the tetrahedron.
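Formula (2.29) reduces element integrals of products of volume coordinates to a few factorials. A minimal sketch, with a hypothetical function name:

```python
from math import factorial


def tet_monomial_integral(a, b, c, d, volume):
    """Integral of L1^a L2^b L3^c L4^d over a tetrahedron of the given volume, per (2.29)."""
    numerator = factorial(a) * factorial(b) * factorial(c) * factorial(d)
    return 6.0 * volume * numerator / factorial(a + b + c + d + 3)
```

With all exponents zero the formula returns the volume itself. The integral of a single volume coordinate is $V/4$, that of $L_i^2$ is $V/10$, and that of $L_iL_j$ with $i \neq j$ is $V/20$; the last two are the products needed when two linear nodal basis functions are integrated against each other.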
/=1.4
/ = 3.6
where $\zeta = 2(z - z_c)/\text{height}$ varies linearly from $-1$ to $+1$ over the height of the
prism and is zero at the midpoint $z_c$ of the vertical edge (joining nodes 1-4, 2-5,
or 3-6 in Fig. 2.7). Here $L_i$ refer to the area coordinates of the triangle that forms the
cross section of the prism (i.e., the triangle formed by nodes 123 or 456).
The quadratic node-based triangular prism has 15 nodes: one each at the
corners and at the midpoints of each of the nine edges. Shape functions for the
quadratic and cubic triangular prisms can be found in [4]. For a more efficient
discretization using the fewest unknowns, prisms and bricks can be combined.
This is easily done because prisms and bricks can be readily connected by sharing
the same nodes and edges at their boundaries.
employed to represent vector electric or magnetic fields. First, spurious modes are
observed when modeling cavity problems using node-based elements [7]. Nodal basis
functions impose continuity in all three spatial components whereas edge bases
Section 2.4 \302\246
Edge-Based Elements 49
standard Sobolev space of functions of order $s$ in $\Omega$ and by $\|\cdot\|_s$ the norm on this
space, we can then define the space
$$H(\mathrm{curl}; \Omega) = \{\mathbf{u} \in (L^2(\Omega))^3 \mid \nabla \times \mathbf{u} \in (L^2(\Omega))^3\}$$
\320\246\320\225-\320\225'-\321\203,,,
<\320\241\320\233\320\220-||\320\225||\320\264.+1 B.31)
provided $\omega$ is not an interior eigenvalue and $h$ is sufficiently small. In (2.31), $h$ is the
independent of $h$. Thus higher order bases lead to lower errors in the solution when
the sampling size is sufficiently small. The convergence is also optimal in $h$.
Edge No.   $i_1$   $i_2$
1          1       2
2          4       3
3          1       4
4          2       3
where $\hat{x}$, $\hat{y}$, and $\hat{z}$ are the unit vectors in the Cartesian coordinate system. The above
basis functions have unity value along one edge and zero over all others, i.e.,
? B.32)
where now $E_i^e$ denotes the average tangential field along the $i$th edge. The basis
they have a tangential component only along the $i$th edge and none along the other
edges. They are also divergenceless within the element and possess a constant
nonzero curl. It should be noted that by taking the cross-product of $\hat{z}$ with $\mathbf{W}_i$, we obtain
basis functions which possess normal continuity across element boundaries, have zero
curl and non-zero divergence. The latter are ideal for representing surface current
densities and are known as rooftop basis functions in electromagnetics. They have
found extensive use in the solution of integral equations [19] and hybrid finite
element-boundary integral implementations.
Edge-based vector bases for quadrilateral elements can be derived by carrying
out the transformation detailed in the derivation of nodal basis for quadrilaterals in
the previous section and then taking the gradient of the resulting expression for each
edge. These bases have two shortcomings. First, the integrals associated with
edge-based quadrilateral elements do not lend themselves to easy evaluation. Second, they
may not be divergence free. However, their ability to model complicated shapes with
fewer unknowns than tetrahedra and the inherent property of enforcing
tangential continuity across elements make them attractive for use in
two-dimensional vector formulations.
$$\mathbf{W}_k^e = \mathbf{N}_{ij}^e = l_{ij}^e\,(L_i^e\nabla L_j^e - L_j^e\nabla L_i^e), \qquad i, j = 1, 2, 3 \tag{2.33}$$
where $\mathbf{W}_k^e$ denotes the basis function for the $k$th edge of the $e$th element and $l_{ij}^e = l_k^e$
is the length of the edge formed by nodes $i$ and $j$ of the triangle. The vector field
(2.34)
where $E_k^e$ denotes the tangential field along the $k$th edge. It can be easily shown that
the edge-based functions defined in (2.33) have the following properties within the
element
$$\nabla \times \mathbf{N}_{ij}^e = 2\,l_{ij}^e\,\nabla L_i^e \times \nabla L_j^e$$
If $\hat{e}_1$ is the unit vector pointing from node 1 to node 2 in Fig. 2.3, then
$\hat{e}_1 \cdot \nabla L_1^e = -1/l_{12}^e$ and $\hat{e}_1 \cdot \nabla L_2^e = 1/l_{12}^e$. Since $L_1^e$ is a linear function that varies
1 1 2
2 2 3
3 3 1
from unity at node 1 to zero at node 2 and $L_2^e$ is unity at node 2 and zero at node 1,
we have
$$\hat{e}_1 \cdot \mathbf{N}_{12}^e = L_1^e + L_2^e = 1 \tag{2.35}$$
along the entire length of edge 1. This implies that $\mathbf{N}_{12}^e$ has a constant tangential
component along edge 1. Moreover, since $L_1^e$ vanishes along edge 2, $\nabla L_1^e$ is normal to
edge 2, $L_2^e$ vanishes along edge 3, and $\nabla L_2^e$ is normal to edge 3, $\mathbf{N}_{12}^e$ has no tangential
component along these edges. Similar observations apply to $\mathbf{N}_{23}^e$ and $\mathbf{N}_{31}^e$. Thus,
tangential continuity is preserved across inter-element boundaries but normal
continuity is not. Fig. 2.9 shows the actual variation of the basis function for the edge of
a right triangle that is opposite to the node associated with the right angle. A
different method of constructing edge bases for triangular elements is given in [20],
[21].
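The tangential behavior described above can be verified numerically. The sketch below evaluates the edge basis (2.33) on a triangle and projects it onto each edge; the zero-based node and edge numbering (edge `k` joins the node pairs listed in `EDGES`, following the table above) and the function names are assumptions of this example.

```python
import math

# Edge k joins local nodes (i, j); zero-based version of the edge table.
EDGES = [(0, 1), (1, 2), (2, 0)]


def area_coordinate_gradients(verts):
    """Constant gradients of the three area coordinates of a triangle."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    return [
        ((y2 - y3) / two_area, (x3 - x2) / two_area),
        ((y3 - y1) / two_area, (x1 - x3) / two_area),
        ((y1 - y2) / two_area, (x2 - x1) / two_area),
    ]


def edge_basis(k, L, verts):
    """Edge basis l_ij (L_i grad L_j - L_j grad L_i) at area coordinates L."""
    i, j = EDGES[k]
    grads = area_coordinate_gradients(verts)
    length = math.dist(verts[i], verts[j])
    return tuple(length * (L[i] * grads[j][c] - L[j] * grads[i][c])
                 for c in range(2))


def tangential_component(k_basis, k_edge, t, verts):
    """Projection of basis k_basis onto edge k_edge at fraction t along it."""
    i, j = EDGES[k_edge]
    L = [0.0, 0.0, 0.0]
    L[i], L[j] = 1.0 - t, t
    length = math.dist(verts[i], verts[j])
    unit = tuple((verts[j][c] - verts[i][c]) / length for c in range(2))
    w = edge_basis(k_basis, L, verts)
    return w[0] * unit[0] + w[1] * unit[1]
```

For any non-degenerate triangle, `tangential_component(k, k, t, verts)` equals one for every `t` between 0 and 1, while the projection onto either of the other two edges is zero, which is the property expressed by (2.35) and the discussion following it.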
Higher order vector basis functions involve adding a node at the midpoint of
each edge and including the contribution of facet elements to the approximating
function. Unknowns in the triangular element are assigned as shown in Fig. 2.10 [17].
The tangential projection of the vector field along edge $\{i, j\}$ is determined by two
unknowns, $E_i^j$ and $E_j^i$, and two facet unknowns, $F_1$ and $F_2$, are provided to allow a
quadratic approximation of the normal component along two of the three edges.
Only two facet unknowns are required to make the range space of the curl operator
complete to first order. Therefore, there are eight degrees of freedom for each
triangular element. Since the edge variables provide common unknowns across element
boundaries, tangential continuity of the field over the boundary is assured. However,
an obvious disadvantage of these elements is that the two facet variables cannot be
symmetrically assigned. This disadvantage can be avoided by employing third-order
edge bases [22]. The higher order approximation to the vector field within the
element is given by
B.36)
where we have arbitrarily chosen the facet variables to lie on edges 1 and 2. These
variables are local unknowns associated with each separate triangular element and
are included to provide a linear approximation for $\nabla_t \times \mathbf{E}_t$, where the subscript $t$
denotes the tangential components of the operator. This property turns out to be
very important in the selection of the order of the basis function to be used in the
modeling process. The basis described by (2.36) can be classified as belonging to the
$H^1(\mathrm{curl})$ space. The $H^k(\mathrm{curl})$ space consists of those vectors whose inner products
are square integrable and whose curl consists of complete polynomials of order $k$.
The basis given in (2.33) thus belongs to the $H^0(\mathrm{curl})$ space, since its curl is merely
constant within the finite element. The basis generated by excluding the facial
contribution would result in six unknowns, two per edge, but the order of the
approximation would still be $H^0(\mathrm{curl})$ and does not add to the accuracy of modeling the
H-field while doubling the unknown count. It should also be noted that the form of the
facet bases in (2.36) differs from that in the original paper [17]. This is due to recent
analysis [22] that shows that the Nedelec constraints [10] are met by (2.36), resulting
in smaller dispersion error and better conditioned matrices.
Edge-based elements have greatly facilitated the finite element analysis of three-dimensional structures in electromagnetics. Linear nodal bases, with their problem of spurious modes and difficulty in maintaining only tangential continuity across material interfaces, are not as convenient for electromagnetic field simulations in three dimensions. On the other hand, the introduction of edge-based shape functions provides a robust way of treating general three-dimensional problems having material inhomogeneities and structural irregularities like sharp edges and corners.
In the following section, we will first consider the simple rectangular bricks and then proceed to present edge-based shape functions for more complicated finite elements such as tetrahedra and curvilinear hexahedra. The chapter is concluded with a brief discussion on hierarchical edge elements.
Chapter 2: Shape Functions for Scalar and Vector Finite Elements
are defined as in Table 2.3, the vector field within the element can be expressed as
\[ \mathbf{E}^e = \sum_{k=1}^{12} E_k^e \mathbf{W}_k^e \tag{2.37} \]
where $E_k^e$ represents the value of the electric field along the $k$th edge of the $e$th element. The vector bases $\mathbf{W}_k^e$ defined for the rectangular brick element have zero divergence and a nonzero curl. Furthermore, the expansion (2.37) guarantees tangential continuity of the electric field across the surfaces of the elements.
A rectangular brick element has limitations in the sense that it is unable to model irregular geometries. For this reason, the analog of the two-dimensional
Edge no.   Node i1   Node i2
1          1         2
2          4         3
3          5         6
4          8         7
5          1         4
6          5         8
7          2         3
8          6         7
9          1         5
10         2         6
11         4         8
12         3         7
Figure 2.12 Mapping of a hexahedral element to a unit cube.
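expression for the shape function in a hexahedral element given in (2.23), we may write the corresponding edge bases as
\[ \mathbf{W}_k^e = \frac{\ell_k}{8}(1 + \eta_k\eta)(1 + \zeta_k\zeta)\,\nabla\xi, \qquad \text{edges} \parallel \xi\text{-axis} \tag{2.38} \]
\[ \mathbf{W}_k^e = \frac{\ell_k}{8}(1 + \xi_k\xi)(1 + \zeta_k\zeta)\,\nabla\eta, \qquad \text{edges} \parallel \eta\text{-axis} \tag{2.39} \]
[...] edge elements and generally result in about half the number of unknowns generated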
2.4.2.2 Tetrahedral Elements. Tetrahedra are, by far, the most popular elements [...] arbitrary three-dimensional geometries and are also well suited for automatic mesh generation. The derivation of shape functions for these elements follows the same pattern as that for triangular vector basis functions. If we consider the tetrahedron shown in Fig. 2.6 and define the edge numbers according to Table 2.4, we have
\[ \mathbf{W}_k = \mathbf{N}_{ij} = \ell_{ij}\,(L_i^e \nabla L_j^e - L_j^e \nabla L_i^e), \qquad i, j = 1, \ldots, 4 \tag{2.41} \]
where again $\ell_{ij} = \ell_k$ denotes the length of the edge between nodes $i$ and $j$, which in turn define the $k$th edge. The vector field within the element can then be expanded as
\[ \mathbf{E}^e = \sum_{k=1}^{6} E_k^e \mathbf{W}_k \tag{2.42} \]
Edge no.   Node i1   Node i2
1          1         2
2          1         3
3          1         4
4          2         3
5          4         2
6          3         4
Section 2.4 Edge-Based Elements
nodes 1 and 2 in Fig. 2.6. Since $\nabla L_2^e$ is orthogonal to facet {1,3,4} and $\nabla L_1^e$ is orthogonal to facet {2,3,4}, the field turns around the axis 3-4 and is normal to planes containing nodes 3 and 4. The field thus has only tangential continuity across element faces. Edge elements can also be described as Whitney elements of degree one and can be broadly classified as belonging to the $H^0(\mathrm{curl})$ space.
Whitney elements of the second degree are called facet elements because they are constant over the face of the tetrahedron. The vector function for the facet element can be written as
\[ \mathbf{W}_{ijk} = 2\,(L_i^e \nabla L_j^e \times \nabla L_k^e + L_j^e \nabla L_k^e \times \nabla L_i^e + L_k^e \nabla L_i^e \times \nabla L_j^e), \qquad i, j, k = 1, \ldots, 4 \tag{2.43} \]
As explained in [23], we now have a central field (as if emanating from node 4 in Fig. 2.6) on each of the two tetrahedra that share the face {1,2,3}. The field can be imagined as coming from the 'source' 4, growing, crossing the facet, and vanishing into the 'well' 4', the fourth vertex of the other tetrahedron. Thus, this field has normal continuity and the flux across the facet forms the degree of freedom for the element.
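Alternative expressions for the linear basis inside a tetrahedron have been derived in [14]. They are given by
\[ \mathbf{W}_{7-i}^e = \mathbf{f}_{7-i} + \mathbf{g}_{7-i} \times \mathbf{r} \tag{2.44} \]
\[ \mathbf{f}_{7-i} = \frac{b_{7-i}}{6V_e}\,\mathbf{r}_{i_1} \times \mathbf{r}_{i_2} \tag{2.45} \]
\[ \mathbf{g}_{7-i} = \frac{b_i\,b_{7-i}}{6V_e}\,\mathbf{e}_i \tag{2.46} \]
in which $i = 1, 2, \ldots, 6$, $V_e$ is the volume of the tetrahedral element, $\mathbf{e}_i = (\mathbf{r}_{i_2} - \mathbf{r}_{i_1})/b_i$ is the unit vector of the $i$th edge, and $b_i = |\mathbf{r}_{i_2} - \mathbf{r}_{i_1}|$ is the length of the $i$th edge, with $\mathbf{r}_{i_1}$ and $\mathbf{r}_{i_2}$ denoting the position vectors of the $i_1$ and $i_2$ nodes. It can be shown that [...] where $i_1$ and $i_2$ are given in Table 2.4. The basis functions given in (2.44) have zero divergence and constant curl ($\nabla \times \mathbf{W}_i^e = 2\mathbf{g}_i$). The form of the basis functions given in (2.44) is similar to the zeroth-order edge elements postulated by Nedelec [10].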
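The zero-divergence and constant-curl properties quoted above can be verified in two lines. The sketch below assumes a basis of the form $\mathbf{f} + \mathbf{g} \times \mathbf{r}$ with $\mathbf{f}$ and $\mathbf{g}$ constant vectors (the form implied by the quoted curl), and uses $\nabla \cdot \mathbf{r} = 3$, $(\mathbf{g} \cdot \nabla)\mathbf{r} = \mathbf{g}$, and $\nabla \times \mathbf{r} = 0$:

```latex
\begin{align*}
\nabla \times (\mathbf{f} + \mathbf{g} \times \mathbf{r})
  &= \mathbf{g}\,(\nabla \cdot \mathbf{r}) - (\mathbf{g} \cdot \nabla)\,\mathbf{r}
   = 3\mathbf{g} - \mathbf{g} = 2\mathbf{g}, \\
\nabla \cdot (\mathbf{f} + \mathbf{g} \times \mathbf{r})
  &= \mathbf{r} \cdot (\nabla \times \mathbf{g}) - \mathbf{g} \cdot (\nabla \times \mathbf{r}) = 0 .
\end{align*}
```

Both results hold at every point of the element, which is why the curl of such a basis carries no information beyond a single constant vector per edge.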
The order of the polynomial approximation for the first-order edge element given in (2.41) or (2.44) can be taken as 0.5. This is because the value of the basis function is constant, i.e., $O(1)$, along the edge it supports and is linear everywhere else within the element. Mur and de Hoop [12] presented edge elements which are consistently linear, yielding a linear approximation of the field both inside each tetrahedron and along its edges and faces. However, the curl of the basis is still
$O(1)$. Since this requires two unknowns per edge, there are twelve degrees of freedom per element. The basis functions in [12] are derived by first defining the outwardly directed vectorial areas of the faces as
[...]
where $\mathbf{r}_i$, $i = 1, \ldots, 4$ denote the position vectors of the vertices of the tetrahedron and $i, j, k, l$ are cyclic. Then the edge-based vectorial expansion function is defined [...]
normal to its corresponding edge in two dimensions and normal to its corresponding face in three dimensions. The basis functions with consistently linear interpolation in the tetrahedron can thus be rewritten in a more convenient notation as
\[ \ldots, \qquad i, j = 1, \ldots, 4 \tag{2.49} \]
Still higher order basis functions are sometimes necessary for rapidly varying fields. The second-order edge basis ($O(r^{1.5})$) for a tetrahedral element was first presented by Lee, Sun, and Cendes [24]. We need 20 degrees of freedom to achieve a quadratic approximation of the vector field inside a tetrahedron (see Fig. 2.13). Accordingly, the field within a tetrahedron can be written as
\[ \ldots \tag{2.50} \]
[...] now has 30 unknowns: three along each edge and three on each face.
prisms lies in the fact that they yield fewer unknowns than tetrahedra while retaining the ability to mesh arbitrary geometries, unlike hexahedra. Moreover, it is sometimes possible to extrude the volume mesh out of an existing surface mesh using triangular prisms. This feature, however, may not always lead to good quality elements, especially when the geometry is non-planar with sharp corners. Finally, it is not easy to construct edge basis functions for such elements. Özdemir and Volakis [25] proposed edge-based shape functions for right-angled and distorted triangular prisms. The vector basis functions derived in [25] are a combination of the edge basis over the triangular cross section and a linear variation over the height of the prism. A sketch of the basis function over the triangular and quadrilateral faces is presented in Fig. 2.14. One of the shortcomings of these bases is the lack of tangential continuity across element faces when the prisms are distorted, i.e., the vertical arms are not at right angles to the plane of the triangular faces.
Figure 2.14 Sketch of edge basis function over triangular and quadrilateral faces of prism element. [Courtesy of T. Özdemir.]
The geometry of an arbitrary triangular prism is shown in Fig. 2.15. The edge [...]
\[ \mathbf{W}_k = \ell_{ij}\,(L_i \nabla L_j - L_j \nabla L_i)\,s, \qquad i, j = 1, 2, 3; \quad k = 1, 2, 3 \tag{2.51} \]
\[ \mathbf{W}_k = \ell_{ij}\,(L_i \nabla L_j - L_j \nabla L_i)\,(1 - s), \qquad i, j = 4, 5, 6; \quad k = 4, 5, 6 \tag{2.52} \]
and the vertical edges are
[...]
In the above equations, $L_i$ are the node-based shape functions (area coordinates of the triangle) defined earlier and $s$ is a normalized parameter which is zero at the bottom face and unity at the top face of the prism. It should be noted that the basis functions for the top and the bottom edges are exactly similar to that of a triangular basis scaled by a dimensionless parameter. For the vertical edges, the vector $\mathbf{v}$ is a linear weighting of the unit vectors $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$ associated with the vertical arms and is defined as
[...]
\[ \mathbf{W}_k = \boldsymbol{\alpha}_k + \boldsymbol{\beta}_k \times \mathbf{r}, \qquad k = 1, \ldots, \text{number of edges} \tag{2.55} \]
where $\boldsymbol{\alpha}_k$, $\boldsymbol{\beta}_k$ are constants and $\mathbf{r}$ is the position vector inside the finite element. The edge bases for the top and bottom faces fall into the Nedelec form but the bases for the vertical edges do not, hence the loss of tangential continuity across element faces.
[...] (2.56)
Alexopoulos also proposed a curved brick superparametric element with 8, 27, and 64 nodes in [27] for solving scattering problems. Superparametric elements attach unknowns to fewer points than are required to define the geometry. The advantage of curvilinear elements lies in the fact that they can model curved surfaces with more accuracy and fewer unknowns than rectilinear elements. Analytical surfaces and even complicated non-planar surface features can thus be modeled exactly at low computational cost. However, many mesh generation packages cannot construct curvilinear elements for arbitrary geometries.
any element of higher order [4]. Hierarchical elements find use in a class of adaptive finite elements (called p-refinement) where the order of approximation is improved by raising the order of the polynomial basis functions instead of refining the mesh density. However, there is usually a trade-off, since higher order basis functions extract a heavy price in terms of computer resources. A major problem with going to higher order bases is the increased density of the finite element matrix and the [...] slowly varying and higher order elements in regions where the field varies rapidly.
The implementation of hierarchical vector elements can be difficult, especially at the transition boundaries where elements of one order merge into the elements of higher or lower order. If several vector elements share an edge, the field tangent to the edge must be made identical in each of the tetrahedra. This is done by carefully matching the coefficients of the vector basis function corresponding to that edge. For tangential continuity across a face, the same equality must be enforced between the coefficients of all the edge and facet functions associated with the face. Table 2.5 given in [28] shows the basis functions for hierarchical vector finite elements. It should be mentioned that for the zeroth-order edge element, describing the polynomial approximation as being of order 0.5 is somewhat of a misnomer. It should be taken to mean that the field variation along the edge is constant, i.e., $O(r^0)$, and the variation normal to the edge is $O(r^1)$. Averaging the orders, albeit a mathematically dubious procedure, yields the described polynomial order. On the plus side, the table offers a concise view of the hierarchical nature of these edge elements. Higher order basis functions are constructed by systematically adding the extra terms up to the desired order. It should be noted that the bases for the tetrahedron with six and 20
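Element Type   Polynomial Order   Unknowns per Element   Basis Function
Edge           0.5                6                      $L_i \nabla L_j - L_j \nabla L_i$
Edge           1                  12                     $\nabla(L_i L_j)$
Face           1.5                20                     $L_i(L_j \nabla L_k - L_k \nabla L_j)$
Face                                                     $L_j(L_k \nabla L_i - L_i \nabla L_k)$
Edge           2                  30                     $\nabla[L_i L_j (L_i - L_j)]$
Face                                                     $\nabla[L_i L_j L_k]$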
REFERENCES
[2] J. P. Webb. Edge elements and what they can do for you. IEEE Trans. Magnetics, 29:1460-1465, 1993.
[3] S. H. Wong and Z. J. Cendes. Combined finite element-modal solution of three-dimensional eddy current problems. IEEE Trans. Magnetics, 24(6), November 1988.
[4] O. C. Zienkiewicz. The Finite Element Method. McGraw-Hill, New York, third edition, 1979.
[5] K. Tuncer, D. Norrie, and F. Brezzi. Finite Element Handbook. McGraw-Hill, New York, 1987.
[6] J. M. Jin. The Finite Element Method in Electromagnetics. John Wiley & Sons, New York, 1993.
[7] Z. J. Cendes and P. Silvester. Numerical solution of dielectric loaded waveguides: I-Finite element analysis. IEEE Trans. Microwave Theory Tech., 18:1124-1131, 1970.
[8] X. Yuan, D. R. Lynch, and K. Paulsen. Importance of normal field continuity in inhomogeneous scattering calculations. IEEE Trans. Microwave Theory Tech., 39:638-642, April 1991.
[9] H. Whitney. Geometric Integration Theory. Princeton Univ. Press, NJ, 1957.
[10] J. C. Nedelec. Mixed finite elements in $R^3$. Numer. Math., 35:315-341, 1980.
[11] M. Hano. Finite element analysis of dielectric-loaded waveguides. IEEE Trans. [...]
[...] 1987.
[24] J. F. Lee, D. K. Sun, and Z. J. Cendes. Tangential vector finite elements for [...] element antenna analysis. IEEE Trans. Antennas Propagat., pp. 788-797, May 1997.
[26] J. S. Wang and N. Ida. Curvilinear and higher order 'edge' finite elements in [...]
3.1 INTRODUCTION
The finite element method (FEM) belongs to the class of partial differential equation (PDE) methods. Its origin is frequently traced to Courant [1], who in the 1940s first discussed piecewise approximations in the appendix of his paper. In the 1950s, Argyris [2] began putting together the many mathematical ideas (domain partitioning, assembly, boundary conditions, etc.) that comprise the FEM for aircraft structural analysis. The introduction of FEM to the engineering community occurred in the 1960s, and some feel that the conferences on finite elements held in 1965, 1968, and 1970 at the Wright Patterson Air Force Base in Dayton, Ohio, U.S. played an important role in advancing the method. Finite element activity in electrical engineering also began in the late 1960s with the papers by Silvester [3] (see also the reprints volume [4] and Arlett, Bahrani and Zienkiewicz [5]) addressing applications to waveguide and cavity analysis. Later developments on absorbing boundary conditions [...]
Chapter 3: Overview of the Finite Element Method: One-Dimensional Examples
that the memory needed for a solution of an FEM system is proportional to the number of unknowns $N$. For most cases these memory requirements may range from $10N$ to $40N$ depending on the type of problem considered and the employed basis or expansion functions approximating the field within the computation domain. This is in contrast to boundary integral solutions, which lead to fully populated systems having $O(N^2)$ storage and $O(N^3)$ CPU requirements. However, it should be pointed out that the number of unknowns for boundary integral equations is generally much smaller than that of the FEM for the same problem. Nevertheless, when dealing with nonmetallic structures, the FEM and its hybrid versions are the most attractive choice.
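To make the storage comparison concrete, the following sketch counts stored matrix entries for the two approaches. It is only an illustration: the function names are hypothetical, the per-unknown factor is taken from the $10N$ to $40N$ range quoted above, and the fully populated system is assumed to store all $N^2$ entries.

```python
import math


def fem_storage_entries(num_unknowns, entries_per_unknown):
    """Stored entries for a sparse FEM system, growing linearly with N."""
    return entries_per_unknown * num_unknowns


def boundary_integral_storage_entries(num_unknowns):
    """Stored entries for a fully populated system, growing as N squared."""
    return num_unknowns ** 2


def dense_unknowns_for_same_storage(fem_unknowns, entries_per_unknown):
    """Largest dense-system size that fits in the storage of the FEM system."""
    budget = fem_storage_entries(fem_unknowns, entries_per_unknown)
    return math.isqrt(budget)
```

For instance, an FEM system with 100,000 unknowns at 40 entries per unknown occupies the same storage as a fully populated system with only 2,000 unknowns, which is why the smaller unknown count of boundary integral methods does not automatically translate into lower memory use.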
The geometrical adaptability and low memory requirements of the FEM have made it one of the most popular numerical methods in all branches of engineering. Its application to boundary value problems [6] involves the subdivision of the computational domain (region where the fields are to be determined) into smaller elements [7], [8]. For two-dimensional problems, these elements are typically triangles or quadrilaterals as discussed in Chapter 2 and illustrated in Fig. 3.1. Additional example meshes are given in Figs. 3.2 and 3.3, with the latter referring to a three-dimensional mesh around a sphere.
The subdivision of the domain into small elements is referred to as meshing or discretization of the geometry and is an important part of the FEM solution procedure. By keeping the elements small enough (typically less than 1/10 of a wavelength per side), the field interior to the element can be safely approximated by some linear or, if necessary, higher order expansion. The collection of these elements and [...] arbitrary and rather complex fields in terms of unknown coefficients which may represent the field values at the nodes (node-based basis) or the average field values over the edges (edge-based basis).
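The one-tenth-of-a-wavelength guideline translates directly into a mesh size estimate. The sketch below is a rough illustration, assuming that the relevant wavelength is the one inside the material (free-space wavelength divided by $\sqrt{\epsilon_r \mu_r}$); the function names and default arguments are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, free space


def max_element_size(frequency_hz, rel_permittivity=1.0, rel_permeability=1.0,
                     elements_per_wavelength=10):
    """Largest element side (m) allowed by the elements-per-wavelength rule."""
    index = math.sqrt(rel_permittivity * rel_permeability)
    wavelength = SPEED_OF_LIGHT / (frequency_hz * index)
    return wavelength / elements_per_wavelength


def elements_needed(length_m, frequency_hz, **material):
    """Minimum number of equal elements along a straight segment of given length."""
    return math.ceil(length_m / max_element_size(frequency_hz, **material))
```

At 1 GHz, a 1 m segment in free space needs at least 34 elements under this rule, and about twice as many when the segment is filled with a dielectric of relative permittivity 4.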
In the context of the FEM, the equations for the unknown coefficients of the expansions are constructed by enforcing the wave equation in a weighted (average) sense over each element. A subsequent step involves the application of the boundary conditions leading to a matrix system of the form
Figure 3.1 Example illustrations of finite element meshes: (a) shielded strip-conductor transmission line problem; (b) shielded circular conductor transmission line problem.
Figure 3.2 Finite element mesh around an airfoil for scattering computations. [Courtesy of Daniel C. Ross.]
employing established storage schemes such as the compressed row and ITPACK formats discussed in Chapter 9. Direct solvers such as LU decomposition are still better suited for smaller size systems since they require storage of the entire matrix including its zero entries.
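As a rough illustration of compressed row storage, the sketch below keeps only the nonzero entries of a matrix in three arrays and uses them for a matrix-vector product, the core operation of iterative solvers. The array names (`values`, `col_index`, `row_ptr`) follow common usage and are not taken from the text; the format details in Chapter 9 may differ.

```python
def to_compressed_row(dense):
    """Convert a dense matrix (list of rows) to compressed row storage."""
    values, col_index, row_ptr = [], [], [0]
    for row in dense:
        for j, entry in enumerate(row):
            if entry != 0.0:
                values.append(entry)
                col_index.append(j)
        # row_ptr[i + 1] marks where row i ends inside `values`
        row_ptr.append(len(values))
    return values, col_index, row_ptr


def csr_matvec(values, col_index, row_ptr, x):
    """Multiply a compressed-row matrix by the vector x."""
    y = []
    for i in range(len(row_ptr) - 1):
        total = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * x[col_index[k]]
        y.append(total)
    return y
```

For a tridiagonal matrix of order $N$, this stores about $3N$ values instead of $N^2$, and the product touches only those stored values.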
The steps involved in the generation and solution of an FEM system can be summarized as follows:
• Define the problem's computational domain
• Choose mesh truncation schemes (in the case of open domain problems)
• Choose discrete elements and shape functions
• Generate mesh (preprocessing)
• Enforce the wave equation over each element (or Laplace's/Poisson's equation for statics) to generate the element matrices
• Apply boundary conditions and assemble element matrices to form the overall sparse system (3.1)
• Ensure matrix symmetry (for domains with reciprocal materials)
• Choose solver and solve matrix system
• Postprocess field data to extract parameters of interest (such as eigenvalues, capacitance, impedance, insertion loss, scattering matrix, radar cross section and so on)
In this chapter we will present these steps for one-dimensional problems before [...]
\[ -\frac{d}{dx}\left(p(x)\frac{dU}{dx}\right) + q(x)\,U(x) = f(x), \qquad 0 < x < x_a \tag{3.2} \]
where $p(x)$, $q(x)$, and $f(x)$ are known functions and $U(x)$ is the unknown field or voltage quantity. Depending on the interpretation of $U(x)$, this equation can represent any of the following problems illustrated in Fig. 3.4.
[Figure 3.4: one-dimensional problems represented by (3.2), including (b) the parallel plate problem and plane wave incidence, with perpendicular ($H_z$) or parallel ($E_z$) polarization, on a metal-backed dielectric slab; the governing differential equations and boundary conditions (3.4)-(3.6) accompany the figure.]
Section 3.4 The Weighted Residual Method
where R is the reflection coefficient of the coated ground plane and is not known
until the FEM solution is completed. Thus, the total field
\[ E_z = E_z^{\mathrm{inc}} + \ldots \quad \text{or} \quad H_z = H_z^{\mathrm{inc}} + \ldots \tag{3.7} \]
satisfies the stated boundary condition. As indicated in Fig. 3.4, $E_z^{\mathrm{inc}}$ and $H_z^{\mathrm{inc}}$ represent the z components of the plane wave incident upon the metal-backed dielectric slab.
We proceed now with the solution of (3.2) on the basis of the finite element method. As a first step we introduce the residual
\[ R(x) = -\frac{d}{dx}\left(p(x)\frac{dU}{dx}\right) + q(x)\,U(x) - f(x) \tag{3.8} \]
which must of course be zero in accordance with the stated problem. However, it is impractical to enforce $R(x) = 0$ at every point in the domain from $x = 0$ to $x = x_a$ [...]
\[ \int_{\text{Domain of } W_m} W_m(x)\,R(x)\,dx = 0 \tag{3.9} \]
[Figure 3.5: subdivision of the domain into line segments (elements $e = 1, 2, \ldots$) with global nodes $x_1, \ldots, x_N$ and nodal unknowns $U_1, \ldots, U_N$.]
If Wm(x) = S(x - xm) or WJx)
- (xm+]
= 8[x + xm)/2], the resulting
linear system.
\302\246
The choice of Wm{x) is not completely arbitrary. For the mathematical steps
in the FEM procedure to hold rigorously, Wm{x) and its derivative must be
at least square integrable over the domain. Specifically, for the problem al
< oo
\320\223
duy
\"Jo J*-.v=O
The first right hand side (RHS) term can be evaluated by enforcing the known boundary conditions at the endpoints. Its effect on the overall system will be considered later.
Step 2 Derive the weak form of the differential equation. The weak form of the differential equation is most appropriate for numerical solution and is obtained by substituting (3.10) into (3.9). We have
\[ \int_0^{x_a}\left[p(x)\frac{dW_m}{dx}\frac{dU}{dx} + q(x)\,W_m(x)\,U(x) - W_m(x)\,f(x)\right]dx - \left[p(x)\,W_m(x)\frac{dU}{dx}\right]_{x=0}^{x=x_a} = 0 \tag{3.11} \]
which holds provided (3.2) is valid. However, because of the integral, the weak form (3.11) enforces the differential equation in an average (and therefore weaker) sense. Equation (3.11) is often referred to as a variational statement of the problem. What is remarkable about (3.11) is that it incorporates in a single mathematical statement the requirements imposed by the differential equation and the boundary conditions at the endpoints. That is, upon substitution of the boundary conditions on $U(x)$ and
Section 3.5 Discretization of the "Weak" Differential Equation
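$W_m(x)$. This is an essential step in all numerical solution procedures, and the previous chapter served to introduce the various classes of basis functions used in a discrete representation of the unknown function. We choose the linear representation
\[ U(x) = \sum_{e=1}^{N_e}\sum_{i=1}^{2} U_i^e\,N_i^e(x) \tag{3.12} \]
where $U_i^e$ are the unknown coefficients of the expansion and (see Fig. 3.6)
\[ N_1^e(x) = \begin{cases} \dfrac{x_2^e - x}{x_2^e - x_1^e}, & x_1^e < x < x_2^e \\ 0, & \text{otherwise} \end{cases} \qquad N_2^e(x) = \begin{cases} \dfrac{x - x_1^e}{x_2^e - x_1^e}, & x_1^e < x < x_2^e \\ 0, & \text{otherwise} \end{cases} \]
[...] (3.14)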
Figure 3.6 Illustration of the eth segment or element and the linear shape functions: (a) the nodal expansion functions; (b) overlay of the nodal expansion functions.
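The linear shape functions are simple enough to evaluate directly. The sketch below assumes the standard two-node form, in which $N_1^e$ falls linearly from 1 at $x_1^e$ to 0 at $x_2^e$ and $N_2^e$ does the opposite; the function names are illustrative.

```python
def linear_shape_functions(x, x1, x2):
    """Return (N1, N2) for the element spanning [x1, x2]; both vanish outside it."""
    if x < x1 or x > x2:
        return 0.0, 0.0
    length = x2 - x1
    return (x2 - x) / length, (x - x1) / length


def interpolate(x, x1, x2, u1, u2):
    """Field inside the element as a combination of its two nodal values."""
    n1, n2 = linear_shape_functions(x, x1, x2)
    return u1 * n1 + u2 * n2
```

Inside the element the two functions sum to one, so the interpolated field passes through the nodal values $U_1^e$ and $U_2^e$ and varies linearly between them.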
combination of the field or potential values at each node. Although not necessary for this one-dimensional example, two types of node-numbering schemes are typically used to facilitate the programming and implementation of the finite element solution.
Local Node Numbers. These are node numbers that are unique only within a single element. For the line segments in Fig. 3.5, each segment is formed by two nodes having the local node numbers 1 and 2, as illustrated in Fig. 3.6. That is, the notation $x_1^e$ refers to the location of local node 1 of the $e$th element. Similarly, $U_1^e$ refers to the field or potential at node 1 of the $e$th element. Using this type of notation, we can develop formulations and equations for a single element which can then be incorporated into the overall solution by attaching the superscript $e$ to all local or element variables. In this manner, the uniqueness of the equations is maintained when combined with those from the other elements. The elements (two-node segments for one-dimensional problems) [...]
Global Node Numbers. Each node of the discretized domain is also given a unique number from 1 to $N$, as shown for example in Fig. 3.5. The assignment of these global numbers is necessary since eventually all unknowns from each element must be collected (a process referred to as element assembly) into a matrix system [...] $x_n$, where the first notation refers to the local numbering and hereon the single subscript will be reserved for the global numbering notation. As can be realized, since every interior node in Fig. 3.5 belongs to two elements, multiple local notations can refer to the same global node or field value. As an example (see Fig. 3.6), the node location $x_n$ is identical to the locations implied by the notations $x_1^e$ and $x_2^{e-1}$. Likewise, for the field values we can state that $U_n = U_1^e = U_2^{e-1}$ and so on.
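The correspondence between the two numbering schemes can be written as a small lookup. This sketch assumes the one-dimensional convention described above, with elements and nodes numbered from 1 and element $e$ spanning global nodes $e$ and $e + 1$; the function names are hypothetical.

```python
def global_node(element, local_node):
    """Global node number of local node 1 or 2 of the given element (1-based)."""
    if local_node not in (1, 2):
        raise ValueError("a line element has local nodes 1 and 2 only")
    return element + local_node - 1


def local_notations(node, num_elements):
    """All (element, local_node) pairs that refer to the given global node."""
    pairs = []
    if node - 1 >= 1:
        pairs.append((node - 1, 2))  # right end of the element to the left
    if node <= num_elements:
        pairs.append((node, 1))      # left end of the element to the right
    return pairs
```

An interior node is returned with two local notations, matching $U_n = U_1^e = U_2^{e-1}$, while each end node of the domain is returned with only one.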
When the expansion (3.14) is substituted into (3.11) we get
\[ \sum_{e=1}^{N_e}\int_{x_1^e}^{x_2^e}\left\{\sum_{i=1}^{2} U_i^e\left[p(x)\frac{dW_m}{dx}\frac{dN_i^e}{dx} + q(x)\,W_m(x)\,N_i^e(x)\right] - f(x)\,W_m(x)\right\}dx - \left[p(x)\,W_m(x)\frac{dU}{dx}\right]_{x=0}^{x=x_a} = 0 \tag{3.15} \]
The latter terms in the brackets are due to contributions from the endpoints of the domain and their evaluation is subject to the specific boundary conditions. This equation now explicitly shows how the boundary conditions enter into the construction of the linear system. Hereon, we will refer to their contributions as [endpoints] since we have not yet specified the type of boundary condition to be imposed.
We are now ready to make different choices for the weighting function to generate a system of linear equations for the solution of $\{U_n\}$. As stated earlier, this step is also referred to as testing and Galerkin's method is usually employed [...]
\[ \sum_{e=1}^{N_e}\int_{x_1^e}^{x_2^e}\left\{\sum_{i=1}^{2} U_i^e\left[p(x)\frac{dN_j^e}{dx}\frac{dN_i^e}{dx} + q(x)\,N_j^e(x)\,N_i^e(x)\right] - f(x)\,N_j^e(x)\right\}dx + [\text{endpoints}] = 0 \tag{3.16a} \]
where
\[ [\text{endpoints}] = -\left[p(x)\,N_j^e(x)\frac{dU}{dx}\right]_{x=0}^{x=x_a} \tag{3.16b} \]
Since $N_j^e(x)$ is nonzero only over the $e$th element, the summation over the elements can be eliminated at this stage. In other words, integration is carried out only over the nonzero portion of the integrand. We can rewrite (3.16a) in matrix form as
\[ [A_{ji}^e]\{U_i^e\} + [\text{endpoints}] = \{b_j^e\} \tag{3.17a} \]
which can be considered as the weighted discrete form of the differential equation. This matrix system provides a relationship only between the two nodes forming the $e$th element and is therefore a localized relationship among the node fields/potentials. The endpoint contributions appear only when $e = 1$ or $e = N_e$ and vanish when the [...]
(3.18)
(3.19)
\[ b_j^e = \int_{x_1^e}^{x_2^e} N_j^e(x)\,f(x)\,dx \tag{3.20} \]
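For a concrete feel of the element quantities, the sketch below evaluates them in closed form under simplifying assumptions that are not stated in the text: $p$, $q$, and $f$ are taken as constant over the element, the shape functions are the linear ones, and the element matrix entry is assumed to be $A_{ji}^e = \int [p\,N_j'^e N_i'^e + q\,N_j^e N_i^e]\,dx$, which is what Galerkin testing of the weak form yields. The function names are illustrative.

```python
def element_matrix(x1, x2, p, q):
    """2 x 2 element matrix for constant p and q with linear shape functions."""
    length = x2 - x1
    stiffness = p / length       # from the product of shape function slopes (+-1/length)
    mass = q * length / 6.0      # from the integrals of N_i N_j: length/3 or length/6
    return [
        [stiffness + 2.0 * mass, -stiffness + mass],
        [-stiffness + mass, stiffness + 2.0 * mass],
    ]


def element_load(x1, x2, f):
    """Element excitation vector for constant f: each node receives half of f * length."""
    length = x2 - x1
    return [f * length / 2.0, f * length / 2.0]
```

The element matrix is symmetric, and when $q = 0$ each of its rows sums to zero, reflecting the fact that a constant field has no derivative.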
The above testing procedure will result in $2N_e$ equations obtained by letting $e = 1, 2, \ldots, N_e$ in (3.17). Since only $N_e + 1$ unique unknowns exist, it is necessary to condense or combine the $2N_e$ equations down to $N_e + 1$. The additional set of equations is a result of testing at the same $n$th node from the left using the testing function $N_2^{e-1}(x)$ and from the right using the testing function $N_1^e(x)$. Their reduction to $N_e + 1$ equations is referred to as assembly of the element equations and is a standard step in all finite element solutions.
The essence of the assembly procedure is to take the average of the test equations from the left and right of the $n$th node. That is, we consider the weighted average
\[ \int_{x_1^{e-1}}^{x_2^{e-1}} N_2^{e-1}(x)\,R(x)\,dx + \int_{x_1^e}^{x_2^e} N_1^e(x)\,R(x)\,dx = 0 \]
or
\[ \int_{x_{m-1}}^{x_{m+1}} T_m(x)\,R(x)\,dx = 0, \qquad m = 2, 3, \ldots, N - 1 \tag{3.24} \]
[...]
\[ T_m(x) = \begin{cases} \dfrac{x - x_{m-1}}{x_m - x_{m-1}}, & x_{m-1} < x < x_m \\ \dfrac{x_{m+1} - x}{x_{m+1} - x_m}, & x_m < x < x_{m+1} \\ 0, & \text{otherwise} \end{cases} \tag{3.25} \]
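From (3.24), the test equation at the $m$th node has the explicit form
\[ \sum_{n=m-1}^{m+1} A_{mn}\,U_n = b_m \tag{3.26} \]
where
\[ A_{mn} = \int_{x_{m-1}}^{x_{m+1}}\left[p(x)\frac{dT_m}{dx}\frac{dT_n}{dx} + q(x)\,T_m(x)\,T_n(x)\right]dx = \begin{cases} A_{22}^{m-1} + A_{11}^{m}, & n = m \\ A_{21}^{m-1}, & n = m - 1 \\ A_{12}^{m}, & n = m + 1 \end{cases} \tag{3.27} \]
\[ b_m = \int_{x_{m-1}}^{x_{m+1}} T_m(x)\,f(x)\,dx = b_2^{m-1} + b_1^{m} \tag{3.28} \]
[...]
\[ \begin{bmatrix} A_{11}^1 & A_{12}^1 \\ A_{21}^1 & A_{22}^1 \end{bmatrix} \begin{Bmatrix} U_1 \\ U_2 \end{Bmatrix} = \begin{Bmatrix} b_1^1 \\ b_2^1 \end{Bmatrix} \tag{3.29} \]
\[ \begin{bmatrix} A_{11}^2 & A_{12}^2 \\ A_{21}^2 & A_{22}^2 \end{bmatrix} \begin{Bmatrix} U_2 \\ U_3 \end{Bmatrix} = \begin{Bmatrix} b_1^2 \\ b_2^2 \end{Bmatrix} \tag{3.30} \]
and
\[ \begin{bmatrix} A_{11}^3 & A_{12}^3 \\ A_{21}^3 & A_{22}^3 \end{bmatrix} \begin{Bmatrix} U_3 \\ U_4 \end{Bmatrix} = \begin{Bmatrix} b_1^3 \\ b_2^3 \end{Bmatrix} \tag{3.31} \]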
These are six equations for the four unknown coefficients $U_1$, $U_2$, $U_3$, and $U_4$. To reduce (3.29)-(3.31) down to four equations, we simply add those which correspond to testing from the left and right of the node. This addition of equations amounts to simply performing the first sum in (3.15) and is not an arbitrary decision in the process. For nodes 1 and 4, there is only testing from one side and therefore the first equation of (3.29) and the second equation of (3.31) are left unchanged. For node 2, we add the second equation of (3.29) and the first equation of (3.30). Likewise for node 3, we add the second equation of (3.30) with the first equation of (3.31). The resulting (i.e., assembled) system of four equations is
\[ \begin{bmatrix} A_{11}^1 & A_{12}^1 & 0 & 0 \\ A_{21}^1 & A_{22}^1 + A_{11}^2 & A_{12}^2 & 0 \\ 0 & A_{21}^2 & A_{22}^2 + A_{11}^3 & A_{12}^3 \\ 0 & 0 & A_{21}^3 & A_{22}^3 \end{bmatrix} \begin{Bmatrix} U_1 \\ U_2 \\ U_3 \\ U_4 \end{Bmatrix} = \begin{Bmatrix} b_1^1 \\ b_2^1 + b_1^2 \\ b_2^2 + b_1^3 \\ b_2^3 \end{Bmatrix} \tag{3.32} \]
where we employed the global node notation for the $\{U\}$ vector. In placing this into the compact notation
\[ [A]\{U\} = \{b\} \tag{3.33} \]
we note that [A] is a tridiagonal matrix regardless of the number of elements used for the tessellation of the line segment $0 < x < x_a$. That is, a maximum of three nonzero entries appear in each row of [A] and, except for the top and bottom row, the diagonal entry and its adjacent elements are the only nonzero entries. We can state that the bandwidth of the matrix is three regardless of the number of elements. Simply put, as the number of nodes/elements increases, the matrix takes the generic sparse form
\[ [A] = \begin{bmatrix} x & x & 0 & 0 & \cdots & & & \\ x & x & x & 0 & 0 & \cdots & & \\ 0 & x & x & x & 0 & 0 & \cdots & \\ 0 & 0 & x & x & x & 0 & 0 & \\ & 0 & 0 & 0 & x & x & x & 0 \\ & & \cdots & 0 & 0 & x & x & x \\ & & & \cdots & 0 & 0 & x & x \end{bmatrix} \tag{3.34} \]
where the "x" symbols imply a nonzero entry and the "..." denote a continuation of the zero entries. In applied mechanics [A] is referred to as the stiffness matrix because of the similarity of (3.33) to the equation $Kx = f$ for the deflection $x$ of a linear spring with stiffness $K$ under an applied force $f$. In electromagnetics, [A] can be interpreted in several ways depending on the physical quantity represented by $U(x)$. For example, if $U(x)$ = voltage or electric field, then [A] can represent an admittance matrix with $\{b\}$ being the electric current excitation. Alternatively, if [...]
Because of the sparsity of [A], only its nonzero entries are stored when solving the matrix system (3.33). Also, global numbering as used in (3.26)-(3.28) must be employed in defining the assembled matrix system. Clearly, the matrix system in (3.32) is identical to that in (3.26)-(3.28) except for the first and last row of (3.32), which correspond to the omitted $m = 1$ and $m = N$ equations in (3.26). In practice, the assembly of [A] is done by employing (3.27) and (3.28) directly or by implementing a double loop and keeping track of the correspondence between the local and global numbering. A possible double loop which generates the nonzero entries of [A] is
C     Initialize the [A] matrix to zero
      DO 10 m = 1, N
      DO 10 n = 1, N
 10   A(m,n) = 0.0
C     Loop through all elements and construct [A]
      DO 20 e = 1, N-1
C     Compute element matrix [Ae]
      DO 30 i = 1, 2
      DO 30 j = 1, 2
 30   Compute AE(i,j) from equations (3.21) and (3.22)
C     Assemble [Ae] into global matrix
C     (assumed completion: global node = e + local node - 1)
      DO 40 i = 1, 2
      DO 40 j = 1, 2
 40   A(e+i-1,e+j-1) = A(e+i-1,e+j-1) + AE(i,j)
 20   CONTINUE
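The same double loop can be sketched in Python. This is an illustrative version rather than the listing from the text: it assumes the element matrices and excitation vectors have already been computed, uses 0-based indices (so local node `i` of element `e` maps to global index `e + i`), and stores the full matrix for readability even though only the tridiagonal band is nonzero.

```python
def assemble(num_nodes, element_matrices, element_loads):
    """Add each 2 x 2 element matrix and 2-entry load into the global system."""
    A = [[0.0] * num_nodes for _ in range(num_nodes)]
    b = [0.0] * num_nodes
    for e, (Ae, be) in enumerate(zip(element_matrices, element_loads)):
        for i in range(2):
            b[e + i] += be[i]
            for j in range(2):
                # neighboring elements overlap on the shared node's diagonal entry
                A[e + i][e + j] += Ae[i][j]
    return A, b
```

With three identical elements, the diagonal entries of the two interior nodes receive contributions from both neighboring elements, while the first and last rows receive only one, which reproduces the pattern of the assembled four-node system.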
boundary nodes, but the fields/potentials and their derivatives at the boundary nodes must be independently specified for a unique solution of the differential equation. As a reference, we remark that if the spatial variable $x$ in (3.2) was replaced by the time variable $t$, the boundary conditions would become the initial conditions of the temporal response $U(t)$.
at the left or right endpoint of the domain. In two- and three-dimensional problems,
it can be stated as
$$\hat{n} \cdot \nabla U = \frac{\partial U}{\partial n} = 0 \quad \text{on } S \text{ or } C \qquad (3.36)$$
where $\hat{n}$ denotes the outgoing unit normal vector of the domain boundary, as
illustrated in Fig. 3.8. In acoustics this is referred to as the hard boundary
condition, and in electromagnetics the magnetic field obeys this condition on metallic
boundaries.
The Neumann boundary condition is the easiest to enforce numerically in
FEM solutions. In this case, the [endpoint] contributions (3.16b) vanish and the
elemental equations (3.17b) lead to the global system (3.32) without any special
considerations.
In acoustics this is referred to as the soft boundary condition. For two- and
three-dimensional electromagnetic problems, the Dirichlet boundary condition is satisfied
U2\\_ 1
Thus, even though the [endpoint] contributions are not computable, we can still
solve for the node fields or potentials. We remark that if (3.38) were a system of N
equations, the reduced system (3.39) would consist of N - 2 equations after the
enforcement of the Dirichlet boundary conditions.
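The reduction described above can be sketched as a small Python routine. This is an illustration under stated assumptions, not the book's code: the function name `condense_dirichlet` is hypothetical, the matrix is held as a dense list of lists, and the four-node matrix entries are made-up numbers. Rows that test at a node with a prescribed value are dropped, and the known values are moved to the right-hand side.

```python
def condense_dirichlet(A, b, fixed):
    """Remove the equations of nodes whose values are prescribed.

    fixed maps a 0-based node index to its prescribed value. Returns the
    reduced matrix, the modified excitation column and the free node list.
    """
    n = len(A)
    free = [m for m in range(n) if m not in fixed]
    A_red = [[A[m][k] for k in free] for m in free]
    # known node values are multiplied by their column and moved to the right side
    b_red = [b[m] - sum(A[m][k] * val for k, val in fixed.items()) for m in free]
    return A_red, b_red, free

# Four-node (three-element) illustration with made-up entries
A = [[ 2.0, -1.0,  0.0,  0.0],
     [-1.0,  4.0, -1.0,  0.0],
     [ 0.0, -1.0,  4.0, -1.0],
     [ 0.0,  0.0, -1.0,  2.0]]
b = [0.0, 1.0, 1.0, 0.0]

# U1 = 0 (homogeneous) and U4 = 5 (inhomogeneous) are prescribed
A_red, b_red, free = condense_dirichlet(A, b, {0: 0.0, 3: 5.0})
```

With both end nodes prescribed the four equations reduce to two, and only a nonzero prescribed value changes the excitation column.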
We may also encounter situations when the field or potential at the end node is
assigned a specified value. An example of this situation is the parallel plate capacitor
problem where the upper plate has a potential equal to Vu. As a more general
example, let us consider the situation where in reference to the three-element example
in Fig. 3.7, we set
= 2o. ^7 C.40)
which are typically referred to as inhomogeneous Dirichlet and Neumann boundary
conditions, respectively.
Substituting these values into the system C.38) gives
A 12 0
A\\\\ \320\220\\\320\263 Al C.41)
0 \302\246L b]
which can be solved for U2, U3 and U4. We again point out that, when dealing with
N nodes, (3.41) would be a system of N - 1 equations and, with the exception of the
first and last equations, the rest will have three nonzero elements as illustrated by the
matrix (3.34).
The procedure of reducing the system (3.38) to (3.39) or (3.41) is often referred
to as condensation of boundary conditions. This reduction is typically performed
during the assembly process by eliminating, for example, the rows which test at a
boundary node assigned a specified value. Finally, we remark that the condensation
process modifies the excitation column implying that the specification of a potential
and its normal derivative. Referring to Fig. 3.8, it is typically stated as [9]
$$\frac{\partial U}{\partial n} + \alpha U = 0 \quad \text{on } S \text{ or } C \qquad (3.42)$$
where α is a constant. This boundary condition has been found very useful in
modeling the presence of thin dielectric coatings without a need to tessellate the
region interior to the dielectric. In finite element simulations, (3.42) also plays the
role of the radiation condition or a first order absorbing condition (to be discussed
later). These boundary conditions will be discussed extensively in later chapters and
are used for truncating the computational domain of open domain problems as in
the case of scattering by an airfoil (see Fig. 3.2). They basically provide a statement
on the field behavior at the boundary nodes. The need to mesh the region beyond the
condition gives the proper field behavior beyond the boundary enclosure.
A generalization of (3.42) is
$$\frac{\partial U}{\partial n} + \alpha U = \beta \quad \text{on } S \text{ or } C \qquad (3.43)$$
where α and β are constants and we can refer to this as the inhomogeneous boundary
condition. The treatment of (3.43) is no different than that for (3.42). For the
one-dimensional case, the conditions at the two endpoints read
$$\frac{dU}{dx} + \alpha_0 U = \beta_0 \ \text{ at } x = x_0 = 0, \qquad \frac{dU}{dx} + \alpha_a U = \beta_a \ \text{ at } x = x_a \qquad (3.44)$$
When these conditions are used in the finite element solution of the three-element
segment example (see Fig. 3.7), the resulting system is again of the same form as
(3.38), with
$$[\text{endpoint}]_1 = p(0)\,(\beta_0 - \alpha_0 U_1)$$
$$[\text{endpoint}]_2 = -p(x_a)\,(\beta_a - \alpha_a U_4)$$
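and when these results are incorporated into (3.38), after rearranging, we obtain the
system
$$
\left( [A] + \begin{bmatrix} -\alpha_0 p(0) & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \alpha_a p(x_a) \end{bmatrix} \right)
\begin{Bmatrix} U_1 \\ U_2 \\ U_3 \\ U_4 \end{Bmatrix}
=
\begin{Bmatrix} b_1^1 \\ b_2^1 + b_1^2 \\ b_2^2 + b_1^3 \\ b_2^3 \end{Bmatrix}
+
\begin{Bmatrix} -\beta_0 p(0) \\ 0 \\ 0 \\ \beta_a p(x_a) \end{Bmatrix}
\qquad (3.45)
$$
where [A] denotes the assembled matrix of (3.38).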
This system can now be solved for {U} and we remark that the middle two equations
are unchanged by the imposition of the impedance boundary conditions. It is clear
that when N equations are involved, the imposition of the impedance boundary
condition will only alter the first and last equations of the overall system.
3.8 EXAMPLES
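$$-\frac{d^2 E_y(x)}{dx^2} + \pi^2 E_y(x) = 2\pi^2 \sin \pi x, \qquad 0 < x < 1 \qquad (3.46)$$
Also, from Fig. 3.9 and the formulae (3.21) and (3.22), we find that
$$A_{11}^e = A_{22}^e = \frac{1}{\Delta x} + \frac{\pi^2 \Delta x}{3} = 10.328987$$
and
$$A_{12}^e = A_{21}^e = -\frac{1}{\Delta x} + \frac{\pi^2 \Delta x}{6} = -9.835507$$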
Figure 3.9 Tessellation of a line segment into ten equal-length elements (Δx = 0.1, N = number of nodes = 11, Ne = number of elements = 10).
$$
\begin{bmatrix} 10.328987 & -9.835507 \\ -9.835507 & 10.328987 \end{bmatrix}
\begin{Bmatrix} E_{y1}^e \\ E_{y2}^e \end{Bmatrix}
=
\begin{Bmatrix} b_1^e \\ b_2^e \end{Bmatrix}
\qquad (3.49)
$$
$$b_i^e = 2\pi^2 \int_{x_1^e}^{x_2^e} N_i^e(x)\, \sin \pi x \; dx$$
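$$
\begin{bmatrix}
20.6580 & -9.8355 & 0 & \cdots & 0 \\
-9.8355 & 20.6580 & -9.8355 & & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & -9.8355 & 20.6580 & -9.8355 \\
0 & \cdots & 0 & -9.8355 & 20.6580
\end{bmatrix}
\begin{Bmatrix} E_{y2} \\ E_{y3} \\ E_{y4} \\ \vdots \\ E_{y10} \end{Bmatrix}
=
\begin{Bmatrix} b_2 \\ b_3 \\ b_4 \\ \vdots \\ b_{10} \end{Bmatrix}
$$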
with
reciprocal permittivity and permeability tensors are inherently reciprocal, and this property is
exhibited in the Hermitian form of the matrix.
Solving the above system via matrix inversion, for example, gives the following results:

Node   Numerical   Exact    Difference
  1    0           0        0
  2    0.3103      0.3090   0.0013
  3    0.5902      0.5878   0.0024
  4    0.8123      0.8090   0.0033
  5    0.9550      0.9511   0.0039
  6    1.0041      1.0000   0.0041
  7    0.9550      0.9511   0.0039
  8    0.8123      0.8090   0.0033
  9    0.5902      0.5878   0.0024
 10    0.3103      0.3090   0.0013
 11    0           0        0
As seen, the difference between the exact and numerical solution is in the third decimal place,
indicating that the employed number of elements is sufficient for an accurate representation
of the field distribution. To reduce the solution error, more elements can be used for
tessellation of the line segment. However, as Δx → 0, the system condition number
$\|\mathcal{A}\|\,\|\mathcal{A}^{-1}\|$ grows, and the solution accuracy may deteriorate with
increasing N unless the machine precision is also increased with the inclusion of additional
decimal places in the calculation of the matrix entries and in carrying out the system solution.
Numerical precision is particularly important for solving problems with many thousands of
unknowns (the case with many practical problems).
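The numbers of Example 3.1 can be checked with a short Python sketch. It is an illustration under assumptions, not the authors' program: the excitation entries are taken to be the exact integrals of 2π² sin(πx) against the linear shape functions (this choice reproduces the tabulated value 1.0041 at the center node), the Dirichlet values at both ends are set to zero, and the helper name `thomas` is hypothetical.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1] in equation i;
    lower[0] and upper[-1] are ignored. No pivoting is performed.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

n_el, h = 10, 0.1
a_diag = 2.0 * (1.0 / h + math.pi ** 2 * h / 3.0)    # 20.6580
a_off = -1.0 / h + math.pi ** 2 * h / 6.0            # -9.8355
interior = range(1, n_el)                            # nodes 2..10 (0-based 1..9)
# assumed load: exact integral of 2*pi^2*sin(pi*x) against each hat function
load = [4.0 * (1.0 - math.cos(math.pi * h)) / h * math.sin(math.pi * m * h)
        for m in interior]
size = len(load)
E_int = thomas([a_off] * size, [a_diag] * size, [a_off] * size, load)
E_fem = [0.0] + E_int + [0.0]                        # add the two Dirichlet nodes
E_exact = [math.sin(math.pi * m * h) for m in range(n_el + 1)]
max_diff = max(abs(a - b) for a, b in zip(E_fem, E_exact))
```

Because the matrix is tridiagonal and diagonally dominant here, the elimination needs no pivoting and costs only O(N) operations.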
Having the node fields, the field is found from (3.12) or (3.14). Specifically,
$$E_y(x) = 10 \sum_{e=1}^{10} \left[ E_{y1}^e\,(0.1e - x) + E_{y2}^e\,(x - 0.1e + 0.1) \right] P_{\Delta x}(x - 0.1e + 0.05)$$
where
$$P_{\Delta x}(x) = \begin{cases} 1 & |x| < \Delta x/2 \\ 0 & \text{otherwise} \end{cases}$$
is a pulse function.
EXAMPLE 3.2
The field reflected by a metal-backed dielectric slab due to a plane wave excitation $E_z^{inc} = e^{jk_0 x}$
is given by
$$E_z^{r} = R\, e^{-jk_0 x}$$
where
and
are the wave impedances in free space and in the dielectric medium, respectively.
Solution
As discussed at the beginning of Section 3.2, the pertinent differential equation is
$$\frac{d^2 E_z}{dx^2} + k_0^2 \epsilon_r E_z = 0$$
We will set xa = t + Δt, where Δt is the chosen element length, as illustrated in Fig. 3.10. The
choice of Δt (the element size) should be somewhere between λd/10 and λd/20, where
λd = 2π/Re(kd) is the wavelength in the material. For our case, λd = λ0/2 and the
sampling rate is 20 elements per wavelength in the dielectric since Re(εr) = 4. From (3.21)
and (3.22) the entries of the elemental equation are given by (with p = 1, $q^e = -k_0^2 \epsilon_{re}$)
$$A_{11}^e = A_{22}^e = \frac{1}{\Delta t} - \frac{k_0^2 \epsilon_{re} \Delta t}{3} = 40 - 0.32899\,\epsilon_{re}$$
$$A_{12}^e = A_{21}^e = -\frac{1}{\Delta t} - \frac{k_0^2 \epsilon_{re} \Delta t}{6} = -40 - 0.164493\,\epsilon_{re}$$
$$b_1^e = b_2^e = 0$$
where $\epsilon_{re}$ denotes the relative permittivity of the eth element and we have suppressed the
presence of the factor λ0 (the free-space wavelength) since it cancels out in the final result.
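After assembly, we will obtain 11 equations since $E_{z1} = E_z(x = 0) = 0$. Except for the last
and first, all other equations of the assembled system will be of the form
$$A_{21}^{m-1} E_{z,m-1} + \left(A_{22}^{m-1} + A_{11}^{m}\right) E_{z,m} + A_{12}^{m} E_{z,m+1} = 0$$
whereas the last equation is
$$A_{21}^{11} E_{z11} + \left(A_{22}^{11} + jk_0\right) E_{z12} = 2jk_0\, e^{jk_0(t+\Delta t)}$$
That is, $b_{12} = 2jk_0\, e^{jk_0(t+\Delta t)}$ is the only nonzero entry of the excitation column and is a result of
the assumed plane wave incidence.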
Upon solution of the assembled FEM system, the reflection coefficient is extracted from
the node field values as
$$R_{FEM} = \frac{E_{z12} - E_z^{inc}(x = t + \Delta t)}{E_z^{inc}(x = t + \Delta t)}$$
A plot of R as a function of the loss parameter β = Im(εr) is given in Fig. 3.11 (courtesy of
S. Legault).
is seen of RFEM are in good agreement with the exact result
even when the discrete layers in the dielectric are reduced from ten down to only N,. - 1 = 2
(corresponding lo a sampling rate of eight elements per \\d). In this case the decay of the field as it
propagates in the dielectric is very rapid and the employed discretization of the dielectric slab is
not sufficiently fine to pick up the large changes in the field values from one discreteslab
(element) to the next. This is a good example of the fundamental assumptions made when
diseretizing the computational domain. For large jS, the field decrease is slower and thus less
samples are neededto approximate the field within the region while maintaining the same level
of accuracy.
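The procedure of Example 3.2 can be sketched in Python with complex arithmetic. This is a hedged illustration, not the program used for Fig. 3.11: it assumes p = 1 and q = -k0²εr as stated in the example, takes the exact reflection magnitude from the tanh expression that appears in the MATLAB listing of this chapter, and uses the hypothetical helper names `solve_dense`, `slab_reflection` and `exact_reflection`.

```python
import cmath
import math

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (works for complex entries)."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def slab_reflection(eps_r, n_diel, t=0.25, k0=2.0 * math.pi):
    """|R| from FEM: n_diel elements in the slab plus one air element."""
    h = t / n_diel
    n_nodes = n_diel + 2
    xa = t + h
    p = 1.0
    A = [[0j] * n_nodes for _ in range(n_nodes)]
    for e in range(n_nodes - 1):
        eps = eps_r if e < n_diel else 1.0       # last element is air
        q = -k0 ** 2 * eps
        a_self = p / h + q * h / 3.0
        a_cross = -p / h + q * h / 6.0
        A[e][e] += a_self
        A[e + 1][e + 1] += a_self
        A[e][e + 1] += a_cross
        A[e + 1][e] += a_cross
    b = [0j] * n_nodes
    # impedance condition dE/dx + j k0 E = 2 j k0 exp(j k0 xa) at x = xa
    A[-1][-1] += 1j * k0 * p
    b[-1] += 2j * k0 * cmath.exp(1j * k0 * xa) * p
    free = range(1, n_nodes)                     # E = 0 at the metal (node 0)
    E = solve_dense([[A[m][k] for k in free] for m in free], [b[m] for m in free])
    e_inc = cmath.exp(1j * k0 * xa)
    return abs((E[-1] - e_inc) / e_inc)

def exact_reflection(eps_r, t=0.25, k0=2.0 * math.pi):
    """|R| from the closed-form expression used in the chapter's listing."""
    zeta0 = 120.0 * math.pi
    zeta1 = zeta0 / cmath.sqrt(eps_r)
    th = cmath.tanh(1j * k0 * cmath.sqrt(eps_r) * t)
    return abs((zeta1 * th - zeta0) / (zeta1 * th + zeta0))
```

For example, `slab_reflection(4 - 1j, 10)` can be compared with `exact_reflection(4 - 1j)`; for a lossless slab both magnitudes equal one, since no power is absorbed.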
[Figure 3.11: legend entries: Exact; Elements in Diel. = 10; Elements in Diel. = 5; Elements in Diel. = 2]
EXAMPLE 3.3
unchanged.
Solution
By selecting μr = εr, the impedance of the wave inside and outside the dielectric is
ζ = 120π
and thus the wave does not exhibit any reflection at x = t (i.e., at the dielectric interface).
The interface is then referred to in the literature as reflectionless. Also, as it enters the
dielectric, the wave is absorbed due to the nonzero imaginary components of the constitutive
parameters. If β is chosen so that the wave has decayed to negligible levels by the time it
exits into the air medium, the dielectric layer can be considered as "perfectly absorbing"
since no reflected field is returned. Such layers have been proposed recently [11], and it has
been shown that certain anisotropic layers can lead to perfectly matched interfaces for
incidences away from normal. These types of layers are important in finite element
simulations because they can be used for simulating a nonreflecting surface. The latter is essential
in solving open domain problems, as is the case with scattering and antenna radiation
as β increases, the numerically computed reflection coefficient begins to deviate from the
decay of the field within the absorber as β is increased. Thus, higher sampling is needed to
better model the field values from one discrete layer to the other.
[Figure: legend entries: Exact solution; Elements in Diel. = 5; Elements in Diel. = 10; Elements in Diel. = 20]
% This MATLAB code can be used to reproduce the data in Fig. 3.11
% Courtesy of Lars Andersen
% t      = thickness of slab
% k0     = free space propagation constant
% xa     = location of the left computational domain endpoint
% alphaa = alpha coeff. to be used in the boundary condition at xa; see (3.43)
% betaa  = beta coeff. to be used in the boundary condition at xa; see (3.43)
% p      = p coefficient appearing in the differential equation (3.2)
% epsr   = relative permittivity of the slab
% beta   = as defined in Fig. 3.11
% q      = coefficient appearing in the differential equation (3.2)
% N      = number of nodes (N-1 = number of layers)
% R_FEM  = computed reflection coefficient (to be plotted)
% Initialization
clear;
N=7;
t=0.25;
Dx=t/(N-2);
xa=t+Dx;
p=-1;
k0=2*pi;
f=0;
alphaa=j*k0;
betaa=2*j*k0*exp(j*k0*xa);
% values to generate plot in Fig. 3.11
Z=26;
%epsr=4-j*beta;
for z=1:Z,
  beta=(z-1)/(Z-1)*5;
  epsr=4-j*beta;
  % reset the global matrix and excitation column
  for m=1:N,
    b(m,1)=0;
    for n=1:N,
      A(m,n)=0;
    end
  end
  % loop through the elements; the last element is the air layer
  for n=1:N-1,
    xe1=(n-1)*xa/(N-1);
    xe2=n*xa/(N-1);
    if n==N-1,
      eps=1;
    else
      eps=epsr;
    end
    q=eps*k0^2;
    Ael(1,1)=p/abs(xe2-xe1)+q*abs(xe2-xe1)/3;
    Ael(2,2)=Ael(1,1);
    Ael(1,2)=-p/abs(xe2-xe1)+q*abs(xe2-xe1)/6;
    Ael(2,1)=Ael(1,2);
    bel(1)=0;   % assumed: this line is not legible in the source (f=0, so no element excitation)
    bel(2)=bel(1);
    A(n,n)=A(n,n)+Ael(1,1);
    A(n,n+1)=A(n,n+1)+Ael(1,2);
    A(n+1,n)=A(n+1,n)+Ael(2,1);
    A(n+1,n+1)=A(n+1,n+1)+Ael(2,2);
    b(n,1)=b(n,1)+bel(1);
  end
  % Dirichlet condition at the metal backing (node 1)
  A(1,1)=1;
  for n=2:N,
    A(n,1)=0;
    A(1,n)=0;
  end
  % impedance condition at xa
  A(N,N)=A(N,N)+alphaa*p;
  b(N)=b(N)+betaa*p;
  x=A\b;
  % the next four lines are not legible in the source and are assumed here
  k1=k0*sqrt(epsr);
  zeta0=120*pi;
  zeta1=zeta0/sqrt(epsr);
  R_FEM=abs((x(N)-exp(j*k0*xa))/exp(j*k0*xa));
  R_exact=abs((zeta1*tanh(j*k1*t)-zeta0)/(zeta1*tanh(j*k1*t)+zeta0));
  Rf(z,1)=beta;
  Rf(z,2)=R_FEM;
  Re(z,1)=beta;
  Re(z,2)=R_exact;
end
clf;
plot(Rf(:,1),Rf(:,2),'r');
hold on;
plot(Re(:,1),Re(:,2));
xlabel('beta');
ylabel('|R|');
% End of program
- -
xn)q{x)dx * + q(xn+l)]
p'*\342\200\236+,
-v)(.v
^ [q{xn)
/2 = (x - \321\205\342\200\236_,J
<?(\302\246*>rfx
(j-J p
- .vJ
(xn+1 q(x) dx % -^
[iq{xn) + <jf(.vtt+1)]
=\302\273 + /K*,,-i')]
p{x)dx \\[\321\200(\321\205\342\200\236)
= -L \320\223\" -
.v,,_,)/(.v) *
(.v
^- [2/(-vn) +/(*\342\200\236_
=
\320\263
\320\223
.v,,
In all cases, $h_n = |x_{n+1} - x_n|$ and $h_{n-1} = |x_n - x_{n-1}|$ are assumed to be small. These
are derived by introducing a linear approximation for the functions q(x), f(x), and
p(x). For example, in deriving the approximation for $I_4$, p(x) was approximated as
\"n-l \302\253n-1
REFERENCES
[7] S. C. Chapra and R. P. Canale. Numerical Methods for Engineers. McGraw-
anisotropic artificial absorber for truncating finite element meshes. IEEE Tram
Applications
4.1 INTRODUCTION
In this chapter we cover many aspects of the finite element method (and hybrid
versions) at sufficient detail to provide the reader with a comfortable level of
understanding its implementation. Such an understanding is essential for a
three-dimensional analysis where many of the steps must be discussed symbolically due to the
size of the matrices even for very small problem sizes. We begin by first performing a
reduction of Maxwell's equations to two-dimensional wave equations and proceed
with the solution of the latter by following the same FEM steps discussed in the
previous chapter. That is, we generate the weak form of the wave equation and carry
out its discretization with the introduction of linear shape functions. Matrix assembly
and boundary conditions are then discussed for determining the propagation
constants in waveguides, and we give examples of this type of analysis. In proceeding
with the solution of open domain problems, we first discuss absorbing boundary
conditions and material absorbers for truncating the finite element mesh. The
chapters.
4.2 TWO-DIMENSIONAL WAVE EQUATIONS
1. Carry out the FEM solution and find V(x, y) in the absence of dielectric.
2. Determine the charge per unit length of one of the conductors by carrying
out the integration
where $\hat{n}$ denotes the outward directed unit normal and the contour is shown
in Fig. 4.1.
3. Evaluate the capacitance per unit length of the free space filled transmission
line from
(4.3)
[Figure 4.1: finite element mesh and integration contour for evaluating the charge on the enclosed conductor]
\320\264\320\272
}
z ... .. D.4)
=
?\342\200\236 D.5)
D.6)
for TE incidence (also referred to as $H_z$ polarization) or
E' = fg D.7)
alternatively written as
where $\hat{k}^i = -(\hat{x}\cos\phi_0 + \hat{y}\sin\phi_0)$ is the direction of the incident wave and
excitation, the fields scattered by the cylinder will also be z directed and thus in
the TE case the vector wave equation becomes
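$$\nabla_t \times \left[ \frac{1}{\epsilon_r} \nabla_t \times (\hat{z} H_z) \right] - \hat{z}\, k_0^2 \mu_r H_z = -\hat{z}\, j\omega\epsilon_0 M_{iz} \qquad (4.8)$$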
the vector wave equation D.8) is now reduced to the scalar wave equation
D.41
Scattered <ft
field \\
Scattering
cylinder
Contour, \320\241
simply given by D.6). component The Hfal is referred to as the scattered magnetic
field (z component only) and is the unknown quantity of interest. As will be dis-
discussed later, there are advantages in solving for tff01 directly and in this case the
pertinent wave equation is
\302\246 \\ + = -V, \342\200\242
VfH{\\
_ ^fr =f(x,y)
+j(OSoMt:
V,
(~ V,#fat klnrff?al
(y
D.\320\237)
obtained by substituting D.10) into D.9). From duality, the corresponding wave
equation for TM incidenceis
Vr
\302\246 + ftfer?f\302\253
= -V,
\342\200\242 - kierEl+>\320\274\320\276^- D\320\2332)
f-L V,?f\") (~ V,?i)
where denote
??\320\270\" .*ie z component of the scattered electricfield and El. is the \320\263
component of the incident electric field generated by zjt in isolation or is simply
given by D.7).
|U,(a-,v)
\302\246jU/X-v.y)
where β is the propagation constant along z and V(x, y) is the field value over the
waveguide's cross section. For a waveguide whose nonmetallic region Ω is filled with
= 0 (4.17)
where $\nabla_t$ denotes the surface gradient and we set $\partial^2 H_z/\partial z^2 = -\beta^2 H_z$. This is also
called the Helmholtz equation and is basically the scalar equivalent of the curl-curl
vector formulation. We note that once $H_z$ is found from (4.17), Maxwell's equations
can be used to obtain the other field components using the expressions
Ey
= Ex
= -Utofi/ytX8HI/8y),
\320\270<\320\276\321\206//)(\320\264\320\230:/\320\264\321\205), Hx = (-jfi/fWHJSx). =
\320\251
V, V, = 0 D.19)
x(\342\200\224 xE,)-(^,-^)E,
in which
D.20)
is the total transverse electric field in the guide. Again, (4.19) is valid only for cases
where the field variation is independent of the third dimension.
Alternatively, for the TM modes ($H_z = 0$, $E_z \neq 0$), the appropriate scalar and
vector wave equations are simply the duals of (4.17)-(4.19). However, the boundary
conditions are of the Dirichlet type when solving for $E_z$ and of the Neumann type
when solving for $H_z$. As was shown in Chapter 3, the relation $\partial H_z/\partial n = 0$ serves as
V = 1 e = x
V, + ? V, -m
| + 1-jfii
\321\203 D11,
implying that
Figure 4.3 Waveguide configuration: (a) cross section of waveguide; (b) three-dimensional view.
- V, x
\342\200\224 x
V, E,
- z x \342\200\224
[(V,?. \321\205
f]
-\320\254\321\203/\320\227\320\225,)
Me Mr
+ Vf x (V'?:+/)8E') x D.22)
\\j f]
V, x \342\200\224x
V, E,
- ^ (V,?. -
*ge,E,
+\320\243/\320\227\320\225,)
= 0 D.23)
Mr Mr
V, x \\\342\200\224
(V,E. +J0E,) x - Ager?:J= 0 D.24)
fj
for the transverse and z components of the wave equation. Clearly, (4.23) and (4.24)
represent a coupled pair of differential equations which either needs to be decoupled
or solved as is. Decoupling them using the divergence property yields a
nonsymmetric generalized eigenvalue problem, the solution of which is numerically
inefficient. However, using simple variable transformations [2],
(4.25)
the coupled pair of differential equations (4.23) and (4.24) can be expressed as
V, x V, x +
<\320\233 p2
\342\200\224
(V,e, + e,) = klere,
(\342\200\224
r. 1= D.26)
/?4 x (V,e: + e,) x fJk2oere.J
\\\342\200\224 fJ
The coupled pair of differential equations (4.26) can now be solved for β², the
<V* +''>\342\200\242*
=
; D.28)
V, x e, = 0
From the above presentation, a general form of the two-dimensional wave
equation is
$$\nabla \cdot [p(x, y)\, \nabla U(x, y)] + k_0^2\, q(x, y)\, U(x, y) = f(x, y) \qquad (4.29)$$
$$U(x, y) = H_z(x, y), \qquad p(x, y) = \frac{1}{\epsilon_r}, \qquad q(x, y) = \mu_r$$
The steps to be followed for the solution of (4.29) via the FEM parallel those
given in Sections 3.4 to 3.7 for the solution of the corresponding one-dimensional
(ordinary) differential equation. They involve
• casting of the original wave equation to its weak form to obtain a single
  functional incorporating the conditions imposed by the wave equation and
  the boundary conditions.
• tessellation of the computational domain allowing for a discretization of the
  weak form to a linear system of equations element by element.
• assembly of the element equation and imposition of the boundary conditions
  to obtain the final linear system of equations.
In this section we carry out the first two steps, and in the subsequent section we
consider the assembly of elemental equations to solve for the fields and eigenvalues
associated with a metallic waveguide (closed domain problem).
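$$R(x, y) = \nabla \cdot [p(\mathbf{r})\, \nabla U(\mathbf{r})] + k_0^2\, q(\mathbf{r})\, U(\mathbf{r}) - f(\mathbf{r}) \qquad (4.30)$$
$$\iint_\Omega W(\mathbf{r})\, R(\mathbf{r})\, dx\, dy = 0 \qquad (4.31)$$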
Section 3.4, the weighting function must again be compatible with the boundary
conditions and be square integrable over the domain. Its derivative must also be
square integrable.
To reduce the order of the derivatives in the residual and introduce the bound-
Making use of (4.32) and (4.33) in the weighted residual equation (4.31)
yields
$$\iint_\Omega \left[ -p(\mathbf{r})\, \nabla W(\mathbf{r}) \cdot \nabla U(\mathbf{r}) + k_0^2\, q(\mathbf{r})\, W(\mathbf{r})\, U(\mathbf{r}) - W(\mathbf{r})\, f(\mathbf{r}) \right] ds + \oint_C p(\mathbf{r})\, W(\mathbf{r})\, [\hat{n} \cdot \nabla U(\mathbf{r})]\, dl = 0 \qquad (4.34)$$
This is the weak form of the two-dimensional scalar wave equation and should be
compared to the one-dimensional weak form in (3.11). Again, we note the presence
of the boundary integral over C which allows for the imposition of the boundary
conditions. Thus, (4.34) provides a single statement incorporating the conditions
implied by the wave equation and the pertinent boundary conditions. As noted
for the one-dimensional case, this is at the heart of the finite element method.
where $U_i^e$ are the unknown coefficients of the expansion and represent the field or
potential values at the nodes of each triangle. This representation is therefore
referred to as a node-based expansion. As usual, Ne denotes the number of elements
used for tessellating the domain. The procedure of tessellation is referred to as
meshing and typically each side of the triangle is chosen to be less than 1/10 of a
wavelength.
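From Chapter 2, the explicit form of the shape functions $N_i^e(x, y)$ is
$$N_i^e(x, y) = \begin{cases} \dfrac{1}{2\Omega^e} \left( a_i^e + b_i^e x + c_i^e y \right) & (x, y) \in \Omega^e \\ 0 & \text{otherwise} \end{cases} \qquad (4.36)$$
where
$$a_i^e = x_j^e y_k^e - x_k^e y_j^e, \qquad b_i^e = y_j^e - y_k^e, \qquad c_i^e = x_k^e - x_j^e$$
with the indices (i, j, k) following the cyclical rule. That is, (i, j, k) = (1,2,3), (2,3,1),
or (3,1,2) for the first, second, and third local nodes, respectively. The shape
functions $N_i^e(x, y)$ are pictorially illustrated in Fig. 4.5. They are equal to unity at the ith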
Figure 4.5 Node coordinates for the eth triangle and illustration of the node-based expansion functions $N_i^e(x, y)$.
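The node-based expansion functions can be evaluated with a few lines of Python. The sketch below assumes the standard linear-triangle form N_i = (a_i + b_i x + c_i y)/(2Ω) with cyclic indices; the names `shape_coefficients` and `shape_functions` and the sample triangle are hypothetical and serve only as illustration.

```python
def shape_coefficients(xy):
    """Signed area and (a_i, b_i, c_i) of a triangle.

    xy is [(x1, y1), (x2, y2), (x3, y3)]; (i, j, k) follow the cyclical rule.
    """
    coeffs = []
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        a = xy[j][0] * xy[k][1] - xy[k][0] * xy[j][1]
        b = xy[j][1] - xy[k][1]
        c = xy[k][0] - xy[j][0]
        coeffs.append((a, b, c))
    area = 0.5 * sum(a for a, _, _ in coeffs)    # a1 + a2 + a3 = 2 * area
    return area, coeffs

def shape_functions(xy, x, y):
    """Values of the three linear shape functions at the point (x, y)."""
    area, coeffs = shape_coefficients(xy)
    return [(a + b * x + c * y) / (2.0 * area) for a, b, c in coeffs]

tri = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.25)]      # sample triangle
at_node_1 = shape_functions(tri, *tri[0])
centroid = shape_functions(tri, 0.5 / 3.0, 0.25 / 3.0)
```

Each function equals one at its own node and zero at the other two, and the three values sum to one anywhere inside the triangle, which is what makes the expansion coefficients equal to the nodal field values.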
E E
w
f f f D.39)
Jc J Jn-
Notice that we have not used the expansion (4.35) to approximate $\hat{n} \cdot \nabla U = \partial U/\partial n$ on the
boundary C since the behavior of $\partial U/\partial n$ on C must be provided through the boundary
conditions.
A linear set of equations can now be obtained by employing Galerkin's method
where we choose the weighting function W(x, y) equal to the expansion basis
$N_i^e(x, y)$, i = 1, 2, 3. Doing so yields
$$\sum_{j=1}^{3} U_j^e \iint_{\Omega^e} \left[ -p(\mathbf{r})\, \nabla N_i^e(\mathbf{r}) \cdot \nabla N_j^e(\mathbf{r}) + k_0^2\, q(\mathbf{r})\, N_i^e(\mathbf{r})\, N_j^e(\mathbf{r}) \right] dx\, dy + \int_{C_s} p(\mathbf{r})\, N_i^e(\mathbf{r})\, \hat{n} \cdot \nabla U(\mathbf{r})\, dl = \iint_{\Omega^e} N_i^e(\mathbf{r})\, f(\mathbf{r})\, dx\, dy, \qquad i = 1, 2, 3 \qquad (4.40)$$
where we have temporarily dropped the sum over the elements since $N_i^e(x, y)$ is
nonzero only over the eth element. We will later perform the summation over all
elements during the assembly process of the matrix system. Also, the presence of the
boundary integral is required only if the eth element has an edge bordering the
contour C. The contour segment $C_s$ refers to the edge of the eth triangle which is
part of C
choices of i = 1, 2, 3. If we assume that on C the field satisfies the Neumann
boundary condition $\hat{n} \cdot \nabla U = \partial U/\partial n = 0$, we have
$$
\begin{bmatrix} A_{11}^e & A_{12}^e & A_{13}^e \\ A_{21}^e & A_{22}^e & A_{23}^e \\ A_{31}^e & A_{32}^e & A_{33}^e \end{bmatrix}
\begin{Bmatrix} U_1^e \\ U_2^e \\ U_3^e \end{Bmatrix}
=
\begin{Bmatrix} b_1^e \\ b_2^e \\ b_3^e \end{Bmatrix}
\qquad (4.41)
$$
or
$$[A^e]\{U^e\} = \{b^e\}$$
which is the element matrix system. The explicit form of the matrix entries is
4 = -/ \302\246 dx dy
V/Vf (r)
jJ V/Vj(r)
where imply
((\320\270) column vectors)
=
[\320\232'] L\" =
f J NfNfdxdy] qe^JNf\\{Nj)Tdxdy
Evaluating the entries of the two submatrices yields [note that the constants
$b_i^e$ below are those in (4.38) and are not related to the excitation column in (4.41)]
(4.44)
$$\frac{\Omega^e}{12} \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix} \qquad (4.45)$$
The latter is independent of the triangle coordinates (except for the area multiplier)
and is referred to as a universal element matrix [7]. By making use of (4.44) and
(4.45) in (4.42), the $[A^e]$ matrix entries can be more compactly written as
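$$A_{ij}^e = -\frac{p^e}{4\Omega^e} \left( b_i^e b_j^e + c_i^e c_j^e \right) + k_0^2\, q^e\, \frac{\Omega^e}{12} \left( 1 + \delta_{ij} \right) \qquad (4.46)$$
where
$$\delta_{ij} = \begin{cases} 1 & i = j \\ 0 & \text{otherwise} \end{cases}$$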
As was the case for the one-dimensional solutions, the next step in the finite
element procedure is the assembly of the element equations (4.41). This refers to the
procedure of carrying out the sum
? {/,') D.47)
+ Atf'US+l) = D.48)
W2+l }_;H!l+>
1=0 i=0
(e'+2)th
e'thelement
for node 1 with the matrix entries as given in (4.46). The corresponding equations for
the other nodes are very similar, except for differences in the superscripts/subscripts
and the order of the sum. We should remark that the assembled equation (4.48) for
node 1 is the same regardless of the number of elements/nodes contained in the
computational domain. That is, even if the entire domain contains thousands of
nodes, (4.48) will still involve only six nodal fields, implying that the corresponding
row of the assembled matrix system [A]{U} = {b} will contain only six nonzero
entries even though the rank of [A] is in the thousands. Thus, the assembled finite
element matrix is always very sparse and this is a major advantage and characteristic
of the FEM. The bandwidth and structure of [A] is determined by the connectivity of
the nodes, a result of the tessellation scheme. As can be understood, the bandwidth
of the matrix [A] is strongly dependent on the node numbering scheme. We can
reduce the bandwidth by numbering the adjacent nodes using consecutive numbers.
This is difficult to achieve, but sparse matrix storage schemes such as those presented in
Chapter 8 can be used to maintain their efficiency for matrices of the same sparsity
but different bandwidths. Parallel computing architectures take advantage of
sparsity but in the case of vector processors, narrow matrix bandwidths must also be
maintained for substantial efficiency improvements [8].
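The dependence of the bandwidth on node numbering, and its independence from the number of nonzero entries, can be illustrated with a small Python sketch. The four-triangle connectivity and the renumbering below are illustrative assumptions, and the function names `half_bandwidth` and `sparsity_pattern` are hypothetical.

```python
def half_bandwidth(triangles):
    """Largest difference between global node numbers within any element."""
    return max(max(tri) - min(tri) for tri in triangles)

def sparsity_pattern(triangles):
    """Map each global node to the set of columns with nonzero entries."""
    rows = {}
    for tri in triangles:
        for m in tri:
            rows.setdefault(m, set()).update(tri)
    return rows

# illustrative four-triangle connectivity (global node numbers)
triangles = [(2, 4, 1), (5, 4, 2), (3, 5, 2), (5, 6, 4)]

# same mesh with nodes 1 and 6 exchanged (a poorer numbering)
swap = {1: 6, 6: 1}
renumbered = [tuple(swap.get(g, g) for g in tri) for tri in triangles]

bw_original = half_bandwidth(triangles)
bw_renumbered = half_bandwidth(renumbered)
nnz_original = sum(len(cols) for cols in sparsity_pattern(triangles).values())
nnz_renumbered = sum(len(cols) for cols in sparsity_pattern(renumbered).values())
```

Renumbering changes where the nonzero entries sit relative to the diagonal but not how many there are, which is why storage schemes that keep only the nonzero entries are insensitive to the numbering.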
A key issue in performing the assembly as dictated by (4.47) is the
transformation from local to global nodes. This was discussed for the one-dimensional analysis.
However, because of the easily predictable connectivity of the elements (i.e., each
element was sequentially numbered and each node was shared by the two adjacent
segments) the local to global transformation was not an issue for the
one-dimensional case. The issue of node numbering becomes apparent when we look at the
assembled equation (4.48).
Since the unknowns $U_i^e$ must be eventually put into a single column (with one
subscript), it is necessary to have a readily available mapping between the local and
global nodes which are associated with the eth element. Thus, in addition to the node
geometry data provided to the finite element program, we must also provide
information about the local and global node numbering schemes. Four tables may be
required before carrying out the matrix assembly routine:
• Node Location Table
  A listing of all mesh nodes (interior and boundary nodes) using global
  numbers and their corresponding (x, y) coordinates. This table specifies
  the geometry of the input configuration.
• Triangle Connectivity Table
  The global nodes comprising each triangle are given by this table. For
  example, by referring to Fig. 4.7 we observe that element #3 (e = 3) is
  formed by nodes 3, 5, and 2, as given in line 3 of the table. Basically, the
  table defines three arrays: n(1, e), n(2, e), and n(3, e). The first of these
  provides the correspondence between local node 1 of the eth element and the
  global nodes. The other two arrays provide the same correspondence
  information for the other two nodes of the triangle. Let's say for example that we
  are working with the local nodes of element e = 4 and want to get the
  corresponding global nodes of that element. Then the value of n(1, 4) will
  give the global number of local node 1, n(2, 4) will provide the global
  number of local node 2, and so on.
Node location table
Global node #   x     y
1               x1    y1
2               x2    y2
3               x3    y3
...             ...   ...

Triangle connectivity table
Element e   n(1,e)   n(2,e)   n(3,e)
1           2        4        1
2           5        4        2
3           3        5        2
4           5        6        4
...         ...      ...      ...

Boundary element connectivity table
Surface edge number s   ns(1,s)   ns(2,s)
1                       6         4
2                       4         1
3                       1         2

Figure 4.7 Geometry and connectivity data tables required for matrix assembly.
edges (line elements) on the outer boundary of the mesh and their associated
nodes. This table can be generated using the data in the previous two tables.
The data manipulations required to generate the surface node and element
information are typically part of the data preprocessor and are an important
step before assembling the final system. An example of such a boundary
element table is given in Fig. 4.7. For the two-dimensional case, it suffices to
identify all segments which bound the mesh and the nodes which form those
elements. Again, the listed arrays provide the correspondence between the
local surface element nodes and the global nodes.
• Material Group Table
  This is a look-up table for the material specification of each element. In
  practice, the same material covers blocks or sections of the domain and it is
  therefore not necessary to specify the material parameters for each
  individual element. Instead, one may choose to attach a material code column to
  the element connectivity table. That is, the material of each element is
  specified through the "code" which is in turn associated with specific values
  of εr and μr.
The first two of the above tables are always required but the latter two may or
may not be needed depending on the application at hand. Also, it may be convenient
to introduce other tables when different elements are used or more complex
geometries are modeled.
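The tables described above map naturally onto simple data structures. The Python sketch below is an illustration only: the connectivity rows are the four triangles listed in Fig. 4.7, treated here as if they formed the whole mesh, while the material codes, the material values and the function name `boundary_edges` are hypothetical. It also shows one way the boundary element table could be generated from the connectivity table, by keeping the edges that belong to exactly one triangle.

```python
from collections import Counter

# triangle connectivity table: element number -> (n(1,e), n(2,e), n(3,e))
triangles = {1: (2, 4, 1), 2: (5, 4, 2), 3: (3, 5, 2), 4: (5, 6, 4)}

# material code column attached to the connectivity table (hypothetical codes)
material_code = {1: 1, 2: 1, 3: 2, 4: 2}
# material group table: code -> constitutive parameters (hypothetical values)
materials = {1: {"eps_r": 1.0, "mu_r": 1.0}, 2: {"eps_r": 4.0, "mu_r": 1.0}}

def boundary_edges(triangles):
    """Return the edges that belong to exactly one triangle.

    Each edge is reported as (first node, second node) in the order in which
    it is traversed by its triangle.
    """
    count = Counter()
    oriented = {}
    for tri in triangles.values():
        for i in range(3):
            edge = (tri[i], tri[(i + 1) % 3])
            key = frozenset(edge)
            count[key] += 1
            oriented[key] = edge
    return sorted(oriented[key] for key, c in count.items() if c == 1)

edges = boundary_edges(triangles)
eps_of_element_3 = materials[material_code[3]]["eps_r"]
```

For this four-triangle patch the result contains the pairs (6, 4), (4, 1) and (1, 2) of the boundary table in Fig. 4.7; the other pairs it reports would be interior edges in a larger mesh.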
D.49)
where the dimension of {U} is 16 and $\{b^e\}$ was set to zero since no excitation is
assumed. A procedure for carrying out the assembly is as follows:
Note the correspondence between the local and global nodes. For example,
$U_1^{e=10} = U_6$, where, as usual, the single subscript refers to the global node 6. Thus,
the element matrix for e = 10 is
\320\27310
.19
Aw \320\220\\\320\263
\320\220\320\236
Uu = 0 D.50)
,10
\320\260\\1 \320\233
32
'\320\220], An An' Hi
Ah Ah Ah
= 0 D.51)
.Ah Ah Ah.
Triangle connectivity table
Element e   n(1,e)   n(2,e)   n(3,e)
 1           1        2        5
 2           2        6        5
 3           2        3        7
 4           2        7        6
 5           3        4        8
 6           3        8        7
 7           5        6       10
 8           5       10        9
 9           6        7       11
10           6       11       10
11           7        8       11
12           8       12       11
13           9       10       14
14           9       14       13
15          10       11       15
16          10       15       14
17          11       12       15
18          12       16       15

Node location table
Node #   x     y
 1       0     0
 2       0.5   0
 3       1.0   0
 4       1.5   0
 5       0     0.25
 6       0.5   0.25
 7       1.0   0.25
 8       1.5   0.25
 9       0     0.5
10       0.5   0.5
11       1.0   0.5
12       1.5   0.5
13       0     0.75
14       0.5   0.75
15       1.0   0.75
16       1.5   0.75

Figure 4.8 Geometry, node location, connectivity, and boundary node data tables
required for matrix assembly.
Ax \320\220\320\263 V2'
\320\220\320\267'
Ax A>
\320\220\320\263 u7 = 0 D.52)
Ax A*.
\320\220\320\263
\320\2707
= 0 D.53)
A\\\\
U \320\270
.Ax
\320\220]] \320\220722^23
= 0 D.54)
Adding the element equations in (4.50) to (4.54) which refer to testing at a common
node, yields the system
A\\2 0 0
Ah An + An a]3 0
a\\+a\\{
A . .9
An + Au + a\\\\
.9
0 + '
\320\220\\\321\212 \320\220
0 All 4? + /ill 0
U2
D.551
U \320\270
We can continue assembling element equations onto this system until all elements
have been accounted for. However, the third equation in (4.55), referring to testing
at node 6, will not change through the remaining assembly process. Clearly, this
equation is no different than the generic equation (4.48). Thus, we have established a
pattern for assembling the final system of equations. Specifically, note that in the
final assembled system, the entries will be given by the sum
$$A_{n(i,e_l),\, n(j,e_l)} = \sum_{\text{all } e_l} A_{ij}^{e_l} \qquad (4.56)$$
The index $e_l$ of the sum (see Fig. 4.8) must be kept identical in $n(i, e_l)$ and
$n(j, e_l)$ when carrying out the sum over $e_l$, i and j. For example,
$A_{6,6} = A_{11}^{10} + A_{11}^{9} + A_{22}^{7} + A_{33}^{4} + A_{22}^{2}$, and so on. A computer (MATLAB) routine for carrying out the
assembly is given in the appendix where the coefficients are adjusted for the chosen
polarization. For a wavenumber of $k_0 = 2\pi$ (i.e., λ = 1), the numerical values of the
assembled [A] matrix are given below and can be used in conjunction with
a given excitation vector {b} to obtain the waveguide fields across its cross section.
[Matrix display: numerical values of the assembled 16 x 16 matrix [A] for $k_0 = 2\pi$.]
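The assembly pattern described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the appendix routine: the connectivity and node coordinates are those of Fig. 4.8, while the element matrix is an assumed form for linear triangles with $p = q = 1$ (gradient term minus $k_0^2$ times the mass term), so its sign and material weighting may differ from (4.42). The names `element_matrix` and `assemble` are hypothetical.

```python
import math

# Connectivity n(i, e) of Fig. 4.8: element e -> global nodes of local nodes 1, 2, 3.
CONNECTIVITY = [
    (1, 2, 5), (2, 6, 5), (2, 3, 7), (2, 7, 6), (3, 4, 8), (3, 8, 7),
    (5, 6, 10), (5, 10, 9), (6, 7, 11), (6, 11, 10), (7, 8, 11), (8, 12, 11),
    (9, 10, 14), (9, 14, 13), (10, 11, 15), (10, 15, 14), (11, 12, 15), (12, 16, 15),
]
# Node coordinates of Fig. 4.8: x = 0, 0.5, 1.0, 1.5 and y = 0, 0.25, 0.5, 0.75.
NODES = [(0.5 * (n % 4), 0.25 * (n // 4)) for n in range(16)]


def element_matrix(xy, k0):
    """Assumed element matrix of a linear triangle (p = q = 1):
    integral of grad(N_i).grad(N_j) - k0^2 N_i N_j over the element."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    b = (y2 - y3, y3 - y1, y1 - y2)
    c = (x3 - x2, x1 - x3, x2 - x1)
    area = 0.5 * abs(b[0] * c[1] - b[1] * c[0])
    return [[(b[i] * b[j] + c[i] * c[j]) / (4.0 * area)
             - k0 ** 2 * area / 12.0 * (2.0 if i == j else 1.0)
             for j in range(3)] for i in range(3)]


def assemble(k0):
    """Add every element entry A^e_ij to the global entry A[n(i,e)][n(j,e)]."""
    size = len(NODES)
    A = [[0.0] * size for _ in range(size)]
    for nodes in CONNECTIVITY:
        Ae = element_matrix([NODES[g - 1] for g in nodes], k0)
        for i, gi in enumerate(nodes):
            for j, gj in enumerate(nodes):
                A[gi - 1][gj - 1] += Ae[i][j]
    return A


# Elements that contribute to the equation for testing at global node 6.
elements_at_node_6 = [e for e, nodes in enumerate(CONNECTIVITY, start=1) if 6 in nodes]
A = assemble(2.0 * math.pi)
print(elements_at_node_6, A[5][5])
```

The list `elements_at_node_6` contains the five elements whose equations are summed in the row for node 6, which is why that row is complete once those elements have been processed.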
                $\gamma a$ ($a/b = 2$)
TE     TM     Analytical [10]    FEM Calculation
10     -      3.142              3.144
20     -      6.285              6.308
01     -      6.285              6.308
11     11     7.027              7.027
12     12     12.958             13.201
21     21     8.889              8.993
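As a cross-check on the tabulated values, the analytical column can be compared with the textbook cutoff wavenumbers of a rectangular guide, $\gamma a = \pi\sqrt{m^2 + (n\,a/b)^2}$. The sketch below assumes that this is the formula behind the analytical column; the helper name `gamma_a` and the dictionary `TABLE` are hypothetical, and the tabulated numbers are copied from the table above.

```python
import math


def gamma_a(m, n, a_over_b=2.0):
    """Normalized cutoff wavenumber (gamma * a) of mode (m, n) in a rectangular guide."""
    return math.pi * math.sqrt(m ** 2 + (n * a_over_b) ** 2)


# (m, n) -> (analytical value, FEM value) as listed in the table above.
TABLE = {
    (1, 0): (3.142, 3.144),
    (2, 0): (6.285, 6.308),
    (0, 1): (6.285, 6.308),
    (1, 1): (7.027, 7.027),
    (1, 2): (12.958, 13.201),
    (2, 1): (8.889, 8.993),
}

for (m, n), (analytical, fem) in TABLE.items():
    error_percent = 100.0 * abs(fem - analytical) / analytical
    print(f"mode {m}{n}: formula {gamma_a(m, n):.3f}, table {analytical:.3f}, "
          f"FEM error {error_percent:.2f}%")
```

The tabulated analytical values agree with the formula to within 0.01, and the FEM error grows with mode order, staying below 2% for the modes listed.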
[Figure: mode field plot for the rectangular waveguide with $a/b = 2$. Courtesy of Reddy et al. [9].]
mesh for this type of cross section is illustrated in Fig. 4.3. Calculations were carried
out by Reddy et al. [9] using 340 elements to model the cross section between the
inner and outer conductor for $r_1/r_2 = 4$. The first few eigenvalues (cutoff wave
numbers) and eigenvectors are given in Table 4.2 and Fig. 4.11, respectively. For
= 4)
plays the same role as the [endpoints] for one-dimensional problems (see Chapter 3).
As discussed in Section 3.7, since the field values at the boundary nodes are zero, it is
not necessary to test at these nodes. Instead, we set the boundary node fields to zero
whenever they appear in the system. By avoiding testing (or weighting) at the
boundary nodes (or elements), the integrals over $C_s$ do not enter in the construction of the
final system and can thus be neglected altogether. We also remark that the choice of
omitting testing at the boundary nodes is equivalent to setting the weighting
functions to zero when testing at these nodes.
Although the above arguments are sufficient to proceed with the FEM matrix
assembly while neglecting the presence of the boundary integral, they are
nevertheless difficult to visualize without going through some of the details. Therefore, below
we will (for a moment) proceed with the assumption that the boundary integral
contribution is needed. We begin with the discretization of the boundary integral
where $N_s$ denotes the number of boundary edges (12, for example, in Fig. 4.8) and $\Psi_i^s$
carries the usual notation. That is, $\Psi_i^s$ refers to the value of $\partial E_z/\partial n$
at the $i$th local
node of the $s$th segment of the boundary (see Fig. 4.7). The expansion bases are
linear interpolation functions between the node values of $\Psi(\mathbf{r})$. They
are of the same
form as those discussed in Chapter 2 (see Section 2.3.1) and Chapter 3. For a
constant value of $x$ or $y$, the two-dimensional shape functions reduce to the same
linear one-dimensional expansion functions given in (2.3). Thus, we can write
x\\ - x , .
,boundaries
-
\342\200\224
on v = constant
4 *
= v? - v
IJ(r)
,,s
on.v =
,,.)\302\246
constant boundaries
\320\263> 1
\320\243
L\\(t) = 1 - Z.f(r). e C,
\320\263 D.59|
otherwise
where is
\321\204 the angular variable ranging from = 0
\321\204 to =
\321\204 2\320\273\\
\[
[A^e]\{U^e\} + [B^e]\{\Psi^e\} = \{b^e\}
\tag{4.61}
\]
where the entries of $[A^e]$ and $\{b^e\}$ are given by (4.42) and (4.43). Those of $[B^e]$ are
computed from
Nf(t)L](i)dl D.62)
or
L](r)L)(r)dl D.63)
($dl = dx$ or $dy$ for rectangular boundaries) and are associated with the last left hand
side term of (4.39). The latter expression (4.63) for $B^e_{ij}$ results from the identification that
the $s$th surface or boundary segment must be an edge of the $e$th element for a
nonzero value of $B^e_{ij}$. It is also understood that the boundary matrix $[B^e]$ will be
nonzero only when the $e$th element is associated with a pair of nodes on the outer
surface/boundary of the computational domain.
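The line integrals of products of the linear edge functions have a simple closed form. As a hedged illustration, the sketch below evaluates $\int L_i^s L_j^s\,dl$ over a straight edge of length $\ell$, for which the result is $\ell/3$ on the diagonal and $\ell/6$ off the diagonal, and checks it against a midpoint-rule quadrature. Any constant prefactor or sign carried by the boundary term is deliberately omitted, and the function names are hypothetical.

```python
def edge_matrix_closed_form(length):
    """Integral of L_i * L_j over an edge for linear functions L_1 = 1 - t/l, L_2 = t/l."""
    return [[length / 3.0, length / 6.0],
            [length / 6.0, length / 3.0]]


def edge_matrix_numeric(length, samples=2000):
    """Same integrals evaluated with the midpoint rule, for comparison."""
    h = length / samples
    result = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(samples):
        t = (k + 0.5) * h
        shape = (1.0 - t / length, t / length)  # L_1 and L_2 at the sample point
        for i in range(2):
            for j in range(2):
                result[i][j] += shape[i] * shape[j] * h
    return result


closed = edge_matrix_closed_form(0.5)   # a horizontal boundary edge of Fig. 4.8
numeric = edge_matrix_numeric(0.5)
print(closed, numeric)
```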
The assembly of the elemental equation (4.61) again amounts to summing the
equations from each element weighting at the same global node. For the specific
rectangular waveguide example shown in Fig. 4.8, the resulting assembled system
will be of the form
\[
[A]\{U\} + [B]\{\Psi\} = \{b\}
\tag{4.64}
\]
with $\{\Psi\} = \{\Psi_1, \Psi_2, \ldots, \Psi_{16}\}^T$ and $\{b\} = \{b_1, b_2, \ldots, b_{16}\}^T$. The only
nonzero entries $B_{mn}$ of the $16 \times 16$ matrix $[B]$ couple boundary nodes that share a
boundary edge (for example, $B_{13,14}$ and $B_{14,14}$),
in which $\Psi_n$
denotes the outward directed normal derivative of $E_z$ at the $n$th node.
The values of $\Psi$ at the interior nodes are irrelevant because they are associated with
all zero rows, included for the proper addition of the [A] and [B] matrices. It is
understood that [A] is a very sparse matrix, as discussed earlier in the chapter.
We observe that the above system involves 12 nontrivial unknowns from the
$\{\Psi\}$ column plus 16 unknowns from the $\{U\}$ column for a total of 28 unknowns.
Clearly, this number of unknowns is much greater than the available 16 equations.
For a solution for $\{U\}$ and $\{\Psi\}$ we must add 12 more equations or conditions on the
values of $\{U\}$ and $\{\Psi\}$. This is done through the introduction of the Dirichlet
boundary conditions satisfied by $\{U\} = \{E_z\}$ on the boundary nodes. The procedure begins
by separating the node field column as
\[
\{U\} = \begin{Bmatrix} \{U^I\} \\ \{U^S\} \end{Bmatrix}
\tag{4.65}
\]
where in the case of Fig. 4.8, $\{U^I\} = \{U_6, U_7, U_{10}, U_{11}\}^T$ are the interior node fields
and $\{U^S\}$ contains the boundary node variables. Also, we formally define the column
$\{\Psi^S\}$ as
$\{\Psi^S\} = \{\Psi_1, \Psi_2, \Psi_3, \Psi_4, \Psi_5, \Psi_8, \Psi_9, \Psi_{12}, \Psi_{13}, \Psi_{14}, \Psi_{15}, \Psi_{16}\}^T$,
which excludes all interior node values of $\Psi$.
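Using this notation, we can rewrite the system (4.64) as
\[
\begin{bmatrix} [A^{II}] & [A^{IS}] \\ [A^{SI}] & [A^{SS}] \end{bmatrix}
\begin{Bmatrix} \{U^I\} \\ \{U^S\} \end{Bmatrix}
+
\begin{bmatrix} [0] & [0] \\ [0] & [B^{SS}] \end{bmatrix}
\begin{Bmatrix} \{0\} \\ \{\Psi^S\} \end{Bmatrix}
=
\begin{Bmatrix} \{b^I\} \\ \{b^S\} \end{Bmatrix}
\]
in which $[A^{II}]$ refers to the submatrix of [A] containing the interactions among the
interior nodes.
$[A^{IS}]$ and $[A^{SI}]$ are associated with interactions among exterior and
interior nodes, and $[A^{SS}]$ refers to the interactions among the boundary or outer
surface nodes. Similarly to $\{U^I\}$ and $\{U^S\}$, the excitation subvectors $\{b^I\}$ and $\{b^S\}$ are
associated with the interior and boundary nodes, respectively. Setting $\{U^S\} = 0$ gives
\[
[A^{II}]\{U^I\} = \{b^I\}
\tag{4.66}
\]
\[
[A^{SI}]\{U^I\} + [B^{SS}]\{\Psi^S\} = \{b^S\}
\tag{4.67}
\]
The interior node fields are now decoupled from $\{\Psi\}$
and we can therefore proceed to
solve (4.66) without a need to consider the solution of $\{\Psi\}$. Thus, the boundary
integral can be neglected when assembling the FEM system provided $U$ or its normal
derivative are zero on the boundary $C$. After $\{U^I\}$ is found, we can return to (4.67)
and solve
\[
[B^{SS}]\{\Psi^S\} = \{\tilde b\}
\tag{4.68}
\]
where
$\{\tilde b\} = \{b^S\} - [A^{SI}]\{U^I\}$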
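The two-step procedure above (solve for the interior fields, then form the right-hand side for the boundary unknowns) can be sketched as follows. This is a minimal illustration assuming zero Dirichlet values on the boundary nodes; the dense Gaussian-elimination solver, the function names, and the 3 x 3 test matrix are all hypothetical stand-ins for the assembled FEM system.

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [rhs[k]] for k, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x


def submatrix(A, rows, cols):
    """Extract the block of A with the given row and column indices."""
    return [[A[r][c] for c in cols] for r in rows]


def solve_tm(A, b, interior, boundary):
    """Solve A_II U_I = b_I, then form b_tilde = b_S - A_SI U_I for the boundary step."""
    U_I = solve(submatrix(A, interior, interior), [b[i] for i in interior])
    A_SI = submatrix(A, boundary, interior)
    b_tilde = [b[s] - sum(a * u for a, u in zip(row, U_I))
               for s, row in zip(boundary, A_SI)]
    return U_I, b_tilde


# Toy system: node 1 is interior, nodes 0 and 2 are boundary nodes (0-based indices).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [0.0, 6.0, 0.0]
U_I, b_tilde = solve_tm(A, b, interior=[1], boundary=[0, 2])
print(U_I, b_tilde)
```

For the mesh of Fig. 4.8 the interior index set would correspond to nodes 6, 7, 10 and 11, so the first solve involves only a 4 x 4 system.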
In the case of an eigenvalue problem, the excitation column $\{b^I\}$ is set to zero
and (4.66) is written as
where the two [K] matrices are identical to those in (4.57) except that only the entries
associated with the interior nodes are kept and that $p^e$, $q^e$ are defined for the TM
polarization case. Some values of $\gamma$ for the TM modes, obtained from (4.69), are
given in Tables 4.1 and 4.2 for the rectangular and coaxial waveguides. Also, Fig.
4.12 displays the fields of the lowest order TM mode for each of these empty
waveguides. The analysis can be carried out using the matlab program in the appendix.
For the mesh in Fig. 4.8, the corresponding $[A^{II}]$ and the two [K] matrices are
Figure 4.12 Calculated fields for the lowest TM modes of the rectangular ($a/b = 2$)
and coaxial ($r_1/r_2 = 4$) waveguides. [Courtesy of Reddy et al. [9].]
The above TM mode analysis confirms that the boundary integral in (4.39)
could be neglected from the start when dealing with metallic domain enclosures. The
final system for the TE and TM analysis is then obtained by enforcing the boundary
conditions on $U(\mathbf{r})$ only. For the TE case, testing is imposed on the boundary and
interior nodes without any specification for the boundary values of $U(\mathbf{r})$. However,
for TM analysis, the boundary values of $U(\mathbf{r})$ are set to zero a priori. Thus testing at
the boundary nodes is avoided and the final TM system is smaller than that corre-
T'M da
|
+ - f
n \302\246
Vtf.scal(r)] dl=\\l (is D.70)
\320\251\320\263)\320\224\320\263)
Fr J(.\302\253w+ J
where
/(r) = -V \342\200\242
\320\224 1 +ja>E0Mi:
is the excitation function and $C^{\text{inner}}$, $C^{\text{outer}}$ represent the closure boundaries of the
computation domain, as depicted in Fig. 4.13. For
$H_z^i = e^{jk_0(x\cos\phi_0 + y\sin\phi_0)}$ and
$M_i = 0$, it follows that
\321\217; D.71)
D-72)
\320\201 E
i i i
where $H_i^{e,\text{scat}}$ denotes the unknown scattered field values at the nodes and $N_i^e(x,y)$ is
given by (4.36). Subsequently, on choosing Galerkin's testing (i.e., $W = N_j^e$), we
obtain the element equations
r -1
f f \320\2231 1
\320\257*' \302\246 + tiVrNiNJ dx
E E VNi
\342\200\224L
e<-
VN> dy
/=l
\342\204\226| J-)S2,L J
i \342\200\242 - \302\246
e
' Vtff\302\260l]dl
\320\233'\320\224\320\275 j '6
N}[n V//f\"]dl
D.73)
,.=i
It has been assumed that the interior domain (dielectric and free space)
bounded by $C^{\text{inner}}$ and $C^{\text{outer}}$ has been subdivided into triangles, whereas the
contours $C^{\text{outer}}$ and $C^{\text{inner}}$ have been subdivided into $N_{s1}$ and $N_{s2}$ line segments,
respectively. Note also that the excitation function $f^e$ will be nonzero only when $\epsilon_r$ and $\mu_r$
are not equal to unity, i.e., only if the element is within the material region $\Omega_d$
illustrated in Fig. 4.13.
$\hat{n}\cdot\nabla H_z^{\text{total}}$ was zero on the metal boundary. However, this is not the case here because
$H_z^{\text{scat}}$ represents the scattered and not the total magnetic field. Since $H_z = H_z^i + H_z^{\text{scat}}$,
it follows that
\[
\hat{n}\cdot\left(\nabla H_z^i + \nabla H_z^{\text{scat}}\right) = \hat{n}\cdot\nabla H_z = 0, \qquad \mathbf{r} \in C^{\text{inner}}
\tag{4.74}
\]
and thus
\[
\hat{n}\cdot\nabla H_z^{\text{scat}} = -\hat{n}\cdot\nabla H_z^i, \qquad \mathbf{r} \in C^{\text{inner}}
\tag{4.75}
\]
Figure 4.13 Illustration of the contours $C^{\text{inner}}$ and $C^{\text{outer}}$ as well as boundary vectors
for the scattering geometry.
where $E_{\text{tan}}$ refers to the total tangential electric field on $C^{\text{inner}}$. Similarly, for the
scattered field
\".? D.77,
Z,, \320\262\320\263
- VtfP\" n =
\302\246
E'ttn r e Cinntir D.7h
+\320\244
with EJan
= Zfl;-(.vsin0o-Pcos0ok/*(l(lfClls*e+1'sin'*ul. Thus, the boundary integral
over Cnncr can be moved to the right hand side of D.73) to be included as pan of
the excitation column. This type of detail in treating a boundary condition (or
constraint) demonstrates how a knowledge of the field over a boundary becoi\302\273
Before casting (4.73) onto a matrix system, we must also consider the boundary
conditions on $C^{\text{outer}}$. So far, no information has been given with regard to the
boundary condition that must be satisfied by $H_z^{\text{scat}}$ on $C^{\text{outer}}$. For scattering
problems, the field continues to propagate to infinity and at large distances from the
two-dimensional scatterer it has the form $A e^{-jk_0 r}/\sqrt{r}$, where $r$ denotes the radial distance
from the origin and $A$ is a constant. Thus, since
1 e~
00 D.79)
This is the well-known first-order Bayliss-Turkel [12] absorbing boundary condition
(ABC) and, by its nature, it can only be enforced on circular boundaries such as that
shown in Fig. 4.14. In this case, $\hat{r}$ coincides with the normal $\hat{n}$, and we then rewrite
(4.79) as
(4.80)
Figure 4.14 Illustration of a circular ABC (labels: nonreflecting or ABC surface, incident plane wave, reflected wave).
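A quick numerical check helps to see why a condition of this type absorbs outgoing cylindrical waves. The sketch below assumes the first-order condition takes the standard Bayliss-Turkel form $\partial U/\partial r + (jk_0 + 1/2r)U = 0$ (the text's (4.79) and (4.80) are taken to be equivalent to it) and verifies by central differences that $A e^{-jk_0 r}/\sqrt{r}$ satisfies it, whereas dropping the $1/2r$ term leaves a residual of magnitude $|U|/2r$. The function names are hypothetical.

```python
import cmath
import math


def outgoing_wave(r, k0, amplitude=1.0):
    """Large-distance form of a two-dimensional outgoing wave."""
    return amplitude * cmath.exp(-1j * k0 * r) / math.sqrt(r)


def abc_residual(r, k0, include_curvature_term=True, h=1e-6):
    """Residual of dU/dr + (j k0 + 1/(2r)) U, with dU/dr from central differences."""
    dU = (outgoing_wave(r + h, k0) - outgoing_wave(r - h, k0)) / (2.0 * h)
    factor = 1j * k0 + (1.0 / (2.0 * r) if include_curvature_term else 0.0)
    return dU + factor * outgoing_wave(r, k0)


k0 = 2.0 * math.pi          # free-space wavenumber for a unit wavelength
full = abs(abc_residual(2.0, k0))
plane_wave_only = abs(abc_residual(2.0, k0, include_curvature_term=False))
print(full, plane_wave_only)
```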
systems. In the 1980s (see Senior and Volakis [13] for a review of ABCs) much work
was carried out, aimed at deriving ABCs which provide a better simulation of
nonreflecting surfaces even when placed at a fraction of a wavelength from the scatterer
[12], [14], [15], [16]. These improved ABCs are associated with higher order
tangential derivatives. For
example, (4.79) is referred to as a first-order ABC because it
involves a single derivative (with respect to the tangent) of the field. The general form
of the second-order ABC is
where $t$ and $n$ are shown in Fig. 4.14. For the second-order Bayliss-Turkel ABC [12],
the coefficients $\alpha$ and $\beta$ are given by
a = \\
-Ao(
D.83i
with
second-order ABC (4.83) is the Engquist and Majda [14] ABC and is extensively
used in connection with Finite Difference-Time Domain solutions.
f 1 f I / \320\257'\320\235\320\226\320\2331\\
-
Nf[n \342\200\242
Vtffdl] dt = - Nf I <5\320\257^\"
' 4- \320\254
-^~ dt
J (\320\274\320\275\320\270\321\202
Er J (-outer Er \\ at' I
where we used integration by parts to transfer one of the derivatives from the field to
the testing function as was done in obtaining the weak form of the wave equation.
To proceed with the discretization of (4.84) we must introduce an expansion for the
field on the boundary $C^{\text{outer}}$. Choosing the linear expansion basis (4.58) or (4.59), we
have
4 4- *]
As noted before, $L_i^s(\mathbf{r}) = N_i^e(\mathbf{r})$
when $\mathbf{r} \in C^{\text{outer}}$ and $s$ is an edge on $C^{\text{outer}}$ belonging
to the $e$th element, as depicted in Fig. 4.7.
With the evaluation of the boundary integral over $C^{\text{inner}}$ as given by (4.78) and
the discretization of the other over $C^{\text{outer}}$ as given by (4.85), we are ready to cast
The entries of the $[A^e]$ matrix are again given by [see (4.46)]
The excitation column entries consist of two components: one from the
excitation function $f(\mathbf{r})$ and another from the boundary condition on $C^{\text{inner}}$. Specifically
[again, these entries are not related to the $b^e$ in (4.87)],
% = f f dxdy
\320\222\320\224\320\223 -Jp \\
Nf(r)(E'tan
\342\200\242
!)dt D.89)
where s2 is a segment on cmner belonging to the eth element. Substituting for/1\" and
= $ A
- \320\233 \\ Nf(r) *
[
which can be evaluated in closed form for each of the line segments and triangles.
The assembly of the elemental equations (4.86) is carried out in the same
manner as done for the TM waveguide mode analysis. The resulting global system
will be of the form
D\320\233
]|54[ ]|)w
where we used the superscripts "interior" and "boundary" to indicate the separation
of the node field column as done in (4.65).
This system is similar in all respects to (4.65) for the TM mode analysis.
However, there is a major difference in that $\{\Psi\}$ has now been replaced with the
field itself on the boundary. Thus, the convenient decomposition to a pair of smaller
systems is no longer possible, nor is it needed. Since the ABC permitted the
elimination of $\{\Psi\}$, no additional equations are required for the solution of
$\{H_z^{\text{scat}}\}$. Addition
of the two left hand matrices gives
[\320\231 |
where the entire matrix is sparse and the system can be solved using an iterative
solver (see Chapter 9).
\[
\sigma = \lim_{r\to\infty} 2\pi r\,\frac{|U^{\text{scat}}(\mathbf{r})|^2}{|U^{i}(\mathbf{r})|^2}
\tag{4.93}
\]
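The field outside the computational domain is obtained by application of the surface
equivalence principle, which gives
\[
U^{\text{scat}}(\mathbf{r}) = -\oint_{C_K}\left\{\left[\hat{n}'\cdot\nabla' U(\mathbf{r}')\right]G_{2D}(\mathbf{r},\mathbf{r}') - \left[\hat{n}'\cdot\nabla' G_{2D}(\mathbf{r},\mathbf{r}')\right]U(\mathbf{r}')\right\}dl'
\tag{4.94}
\]
in which the integration contour $C_K$ can be of
any shape or form and can be located at any distance from the scatterer. Also,
$G_{2D}(\mathbf{r},\mathbf{r}')$ is the two-dimensional Green's function
\[
G_{2D}(\mathbf{r},\mathbf{r}') = -\frac{j}{4}H_0^{(2)}(k_0|\mathbf{r}-\mathbf{r}'|)
\tag{4.95}
\]
where $H_0^{(2)}$ denotes the zeroth-order Hankel function of the second kind. This
Green's function can be
interpreted as the field generated by a line source at $\mathbf{r}'$
since it satisfies the differential equation
\[
\nabla^2 G_{2D} + k_0^2 G_{2D} = -\delta(\mathbf{r}-\mathbf{r}')
\tag{4.96}
\]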
J = x H
\320\264 =
-j^p- n x [f x V?J D.97)
L D.98)
J'd4
Also, the equivalent magnetic current is given by
M = E x =
\321\217 ?? = lM, D.99)
and therefore D.94) can be rewritten as
?!\302\260V)
= i l-jk0Z0J.(r)G2D(r.,') + M,[n \302\246
V'C20(r. r')]} dl' D. 100)
Although $C_K$ can be arbitrary, it is best to choose it so that the integrated fields
are most accurate. For purely metallic scatterers, the computed field and its
derivative are most accurate near the metallic surface. Thus, it is appropriate to choose $C_K$
to be near to or coincide with the metallic surface of the scatterer. For TM incidence,
when $C_K$ is coincident with the metallic surface, $U(\mathbf{r}) = 0$ for $\mathbf{r} \in C_K$, and thus
\[
E_z^{\text{scat}}(\mathbf{r}) = \frac{j}{4}\oint_{C_K}\left[\hat{n}'\cdot\nabla' E_z(\mathbf{r}')\right]H_0^{(2)}(k_0|\mathbf{r}-\mathbf{r}'|)\,dl'
\tag{4.101}
\]
Likewise, for TE incidence, $\hat{n}\cdot\nabla U(\mathbf{r}) = 0$ on $C_K$ and (4.94) reduces to
\[
H_z^{\text{scat}}(\mathbf{r}) = -\frac{j}{4}\oint_{C_K} H_z(\mathbf{r}')\left[\hat{n}'\cdot\nabla' H_0^{(2)}(k_0|\mathbf{r}-\mathbf{r}'|)\right]dl'
\tag{4.102}
\]
However, when dealing with coated metallic or purely dielectric scatterers, (4.101)
and (4.102) cannot be used. In this case, the contour $C_K$ is placed above the outer
surface of the dielectric (see Fig. 4.16) and the scattered field must be computed from
(4.100) or its dual.
by the original FEM tessellation of the domain. For simplicity, let us choose Q to
with
^
\320\244(\320\263')
- 2rrK cos(<p
-
*'
D.104!
= \320\277
\320\244(\320\263)
\342\200\242
VU(t)
2, + r - 2rrK cos@
- <
lA2)(k0Jr2
_ \320\224\320\276
r r u(x'\\
V
4 AJo + ~
y/r2 4
D.1051
\320\263-\320\263'
\320\257'2)(*\320\276|\320\263-\320\263'|)
= -\321\204')] D.106)
-\302\253\320\276 [\320\263\320\272-\320\263\321\201\320\276$(\321\204
|\320\263-\320\263'|
in which $H_1^{(2)}$ denotes the first-order Hankel function of the second kind. Also, $\phi'$
refers to the angle between $\mathbf{r}'$ and the $x$-axis, as shown in Fig. 4.16.
For far zone computations (i.e., $r \to \infty$), we can simplify the above integrands
(4.107)
\[
|\mathbf{r}-\mathbf{r}'| \approx
\begin{cases}
r & \text{for amplitude terms} \\
r - r_K\cos(\phi-\phi') & \text{for phase terms}
\end{cases}
\tag{4.108}
\]
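The quality of the phase approximation can be checked directly against the exact distance $\sqrt{r^2 + r_K^2 - 2 r r_K\cos(\phi-\phi')}$. The short sketch below is illustrative only (the function names and the sample radius are hypothetical); it shows that the error of $r - r_K\cos(\phi-\phi')$ is of order $r_K^2/2r$ and therefore vanishes as the observation point recedes.

```python
import math


def exact_distance(r, phi, r_k, phi_prime):
    """Distance between the observation point (r, phi) and the source point (r_k, phi')."""
    return math.sqrt(r * r + r_k * r_k - 2.0 * r * r_k * math.cos(phi - phi_prime))


def far_zone_distance(r, phi, r_k, phi_prime):
    """Far zone approximation used in the phase terms."""
    return r - r_k * math.cos(phi - phi_prime)


r_k = 0.5                     # contour radius in wavelengths (illustrative value)
for r in (10.0, 100.0, 1000.0):
    # phi - phi' = 90 degrees gives the largest error of the approximation
    error = exact_distance(r, math.pi / 2, r_k, 0.0) - far_zone_distance(r, math.pi / 2, r_k, 0.0)
    print(r, error, r_k ** 2 / (2.0 * r))
```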
Substituting (4.107) and (4.108) into (4.104) and (4.105) and discretizing the integral
yields the far zone approximations
(4.109)
(4.110)
In these, $\Delta\phi = 2\pi/N_K$ denotes the angular extent of each discrete arc segment and $\phi_n$
is the value of $\phi'$ at the midpoint of the $n$th element. For this case, $U_n$ and $\Psi_n$ refer to
the average value of the field and its normal derivative at the midpoint of the $n$th
segment.
From (4.93), the echowidth can be computed from
)+*0 \320\201u\302\273
D.111)
n=\\ n=l
Another expression for the far zone field using equivalent surface electric and
magnetic currents is
(4.112)
for TE incidence. These are obtained from (4.100) once (4.107) and (4.108) are used.
For this case M = ?E: and J = -z/^/(k0Z0).
Figure 4.17 Geometry of the coated circular cylinder and illustration of the ABC
boundary (labels: $a = 0.25\lambda$, $\epsilon_r = 4$, coating outer radius $0.3\lambda_0$, ABC boundary at
$r_0 = 0.3\lambda_0$, $0.4\lambda_0$, $0.5\lambda_0$, $0.6\lambda_0$).
Figure 4.18 Finite element solution of the near-zone TM incidence scattered field
measured at $r = 0.275\lambda_0$ as a function of angle (degrees in the $\phi$ direction).
The geometry is shown in Fig. 4.17 and the
four curves refer to the radii at which the ABC was placed ($r_0 = 0.3\lambda_0$,
$0.4\lambda_0$, $0.5\lambda_0$, and $0.6\lambda_0$). [After Peterson and Castillo [17], © IEEE,
1989.]
Section 4.4 Two-Dimensional Scattering
specific example was considered by Peterson and Castillo [17] and was used to assess
the accuracy of the second-order ABC (4.81) by comparison with the exact
eigenfunction solution [18]. In Fig. 4.18 we show the near zone field $E_z^{\text{scat}}$ due to a plane
TM incidence for different values of the ABC radius $r_0$. As expected, the computed
$E_z^{\text{scat}}$ field is quite inaccurate when $r_0 = 0.3\lambda_0$ since the ABC boundary is then
coincident with the outer boundary of the scattering geometry. By its derivation, the
second-order ABC is valid for large $r_0$ since it neglects [13] terms beyond $O(r^{-9/2})$ as
well as nonradial waves. Clearly, the choice of $r_0 = 0.3\lambda_0$ violates this assumption.
However, the accuracy of the solution improves substantially when $r_0$ is increased to
$0.4\lambda_0$, and continues to improve as $r_0$ is increased. Typically, it may be necessary to
increase $r_0$ as much as $2\lambda_0$ to obtain very accurate results. This is especially true for
the TE incidence where nonspecular fields caused by traveling and surface waves are
of importance.
Another example application of the finite element method with ABCs for a
noncircular cylinder is shown in Fig. 4.19. The geometry is a metallic triangular
[Figure 4.19: metallic triangular cylinder geometry (dimension $0.7071\lambda$) and bistatic echowidth plot.]
triangular elements [17]. For a TM plane wave impinging from the negative $x$-axis,
the echowidth is shown in Fig. 4.19 as a function of the angle $\phi$ measured from the
$x$-axis. For this bistatic pattern, $\phi = 0°$ corresponds to the direction of forward scat-
lossy environment (see, for example, Fig. 4.20). The tapered shape provides a better
impedance matching, whereas the loss in the material causes absorption of the
entering waves.
In a finite element analysis, we can also use material absorbers for mesh
truncation and this approach will be referred to as the finite element-artificial absorber
(FE-AA) method. For numerical simulations, it is not necessary to make use of
material parameters or profiles which are physical. Instead, we can use any fictitious
(i.e., artificial) material profile and employ it for mesh truncation, as illustrated in
Fig. 4.21. The shown absorber can be curved, if necessary, to minimize the
computational volume, depending on the scatterer's or radiator's shape.
Not being restricted by the material choices, an optimizer such as the simplex
method can be used to determine the material parameters $\epsilon_r$ and $\mu_r$ and the
thickness $t$ to
minimize reflections over all visible incidence angles. This approach was used by
Özdemir and Volakis [19] to obtain the parameters given in Fig. 4.21. A homo-
[Figure 4.21: metal-backed artificial absorber used for mesh truncation (labels: metal backing, very small reflection, material parameter $1 - j2.7$, thickness $t = 0.15\lambda_0$).]
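To get a feel for how such an absorber performs, the normal-incidence reflection coefficient of a metal-backed layer can be evaluated with the usual transmission-line formula. The sketch below is only a rough estimate under stated assumptions: it takes $\epsilon_r = \mu_r = 1 - j2.7$ (the figure label gives the value $1 - j2.7$, and assigning it to both parameters is an assumption), thickness $t = 0.15\lambda_0$, an $e^{j\omega t}$ time convention, and normal incidence only, whereas the optimization described above targets all visible incidence angles. The function name is hypothetical.

```python
import cmath
import math


def metal_backed_reflection(eps_r, mu_r, thickness_in_wavelengths):
    """Normal-incidence reflection coefficient of a slab backed by a perfect conductor."""
    # Electrical thickness k*t with k = k0 * sqrt(eps_r * mu_r)
    kt = 2.0 * math.pi * thickness_in_wavelengths * cmath.sqrt(eps_r * mu_r)
    eta = cmath.sqrt(mu_r / eps_r)            # slab impedance relative to free space
    z_in = 1j * eta * cmath.tan(kt)           # input impedance of the shorted slab
    return (z_in - 1.0) / (z_in + 1.0)


R = metal_backed_reflection(1 - 2.7j, 1 - 2.7j, 0.15)
print(abs(R), 20.0 * math.log10(abs(R)))
```

With equal relative permittivity and permeability the slab is impedance matched at normal incidence, so the reflection is set entirely by the round-trip attenuation, $|R| = e^{-2(2.7)k_0 t}$, which is roughly 44 dB down for the assumed values.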
An example of using the artificial absorber for mesh truncation is shown in Fig.
4.22. This configuration is a rectangular groove situated in an otherwise flat metallic
plane (ground plane). We are interested in computing the scattered field due to a
TE plane wave excitation. For our formulation, the excitation is simulated by setting
\[
\mathbf{E}^{\text{scat}}_{\text{tan}} =
\begin{cases}
0 & \text{on the absorber metal backing} \\
-\mathbf{E}^{i}_{\text{tan}} & \text{on cavity and ground plane metallic surfaces}
\end{cases}
\]
and the resulting matrix system is identical to that in (4.91) except that [B] is set to
zero and
= *
-f^ f
\342\204\226)A')<bdy
that coincident with the ground plane ($y = 0$). Since we are interested in the fields in
the $y > 0$ region, we can arbitrarily set those below the aperture to zero for the
application of the surface equivalence principle. The surface magnetic currents are
then given by
\[
\mathbf{M} =
\begin{cases}
\mathbf{E}\times\hat{n} & 0 < x < w,\ y = 0 \\
0 & \text{otherwise}
\end{cases}
\tag{4.114}
\]
and, after application of image theory,
\[
\mathbf{M}^{\text{equ}} =
\begin{cases}
2\mathbf{E}\times\hat{n} = 2\mathbf{E}\times\hat{y} & 0 < x < w,\ y = 0 \\
0 & \text{otherwise}
\end{cases}
\tag{4.115}
\]
and $\mathbf{J} = 0$, and these radiate in free space. The fact that the electric current vanishes
is a substantial simplification since the integration of the radiation integral is limited
over the aperture of the groove. Specifically, from (4.113)
Figure 4.22 Illustration of a groove recessed in a perfectly conducting ground plane. The
artificial absorber in Fig. 4.21 is used for mesh truncations.
Figure 4.23 Illustration of the surface equivalence principle for computing the scattered
fields from a groove: (a) original geometry; (b) setup for surface equivalence
($\mathbf{J} = \hat{n}\times\mathbf{H}$, $\mathbf{M} = \mathbf{E}\times\hat{n}$); (c) equivalent
currents after application of image theory ($\mathbf{M} = 2\mathbf{E}\times\hat{n}$, $0 < x < w$).
\302\246.116)
and from the finite element solution, the magnetic current is given by
Mequ
= 2(E x >\342\200\242)
= 2~(E'X + ?*tcal)
= ?2 Z,,si D.117)
and we should emphasize that $H_i^e$ refers to the scattered field at the $i$th node of the
$e$th element. The latter sum is over the three nodes of the element bordering the
aperture at the computation point for
$M^{\text{equ}}$. From (4.93), the corresponding echowidth
is
D.118)
4(Z0J
Bistatic echowidth calculations for the rectangular groove depicted in Fig. 4.22
are given in Fig. 4.24. The curve corresponds to a groove width of $w = 2.5\lambda_0$ and a
depth of $d = 0.2\lambda_0$. The incident plane wave was incoming at an angle of 70° from
the face of the ground plane and the absorber was placed $0.15\lambda_0$ from the top of the
groove. As seen, the echowidth computations using the FE-AA method are in good
agreement with those based on the rigorous FE-BI method discussed next (see [22],
[23]). However, care must be given when using artificial absorbers for mesh
truncation. The accuracy of the results is not assured, and this is more so for the near zone
fields. Also, the convergence rate of the iterative solver may deteriorate for certain
absorber parameters.
where $H_z = H_z^i + H_z^{\text{scat}}$. Here, the quantities $\partial H_z/\partial n$ and $H_z$ are not related through
D.120)
f; +hsj
2
D.121)
s,=l
in which
' 2 D.122)
0 otherwise
1&\320\244
is the pulse function and we have assumed a circular contour $C^{\text{outer}}$, as shown in
Fig. 4.25.
Basically, the above expansions (4.120) and (4.121) approximate the values of
$H_z$ and $\partial H_z/\partial n$ over each boundary segment by a constant value equal to the average
of their values at the bordering nodes. They are less accurate than the linear
expansions but provide substantial simplification in discretizing Kirchhoff's boundary
integral (4.119). Substituting (4.120) and (4.121) into (4.119) yields
\321\206>.
v'G2fi(r, l' II -- Hi(r)
r) dl' Hi(r)== R(t), e C\302\260wcr
\320\263 D.123)
J
R(T)W(r)dl=
I
Choosing $W(\mathbf{r}) = \delta(\phi - \phi_{q_1})$, $q_1 = 1, 2, \ldots, N_{s1}$, implying point matching, yields the
system
2
= \302\253i(r,,). q\\
= 1,2 NSi D.124)
in which $\mathbf{r}_{q_1}$ denotes
the location of the midpoint in the $q_1$th testing segment. The
entries $G_{s_1 q_1}$ and their normal-derivative counterparts are the integrals
J -i,th
tf' = '
= +
f ^
\320\272\321\200\320\270\320\273\320\270
[ u
G20(r,
\302\253jm\302\253u
r')|- c//' D.126)
in which $r_a$ denotes the radius of $C^{\text{outer}}$. For $q_1 \neq s_1$, the evaluation of these integrals
can be carried out via the simple midpoint method since $r_a\Delta\phi$ is typically very small.
However, when $q_1 = s_1$, the argument of the Hankel function vanishes. Noting that
\[
H_0^{(2)}(x) \approx A_1 + A_2 x^2 - j\frac{2}{\pi}\ln x + j\frac{x^2}{2\pi}\ln x
\tag{4.127}
\]
with $A_1 = 1 - j\frac{2}{\pi}(\ln\gamma - \ln 2)$, $A_2 = -\frac{1}{4} + j\frac{1}{2\pi}(\ln\gamma - 1 - \ln 2)$ and $\gamma = 1.781072418$,
it is clear that the integrands in (4.125) and (4.126) become singular when $q_1 = s_1$.
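The small-argument behaviour quoted above can be checked numerically. The sketch below assumes the expansion has the form $H_0^{(2)}(x) \approx A_1 + A_2 x^2 - j\frac{2}{\pi}\ln x + j\frac{x^2}{2\pi}\ln x$, which follows from the standard power series of $J_0$ and $Y_0$ together with the constants $A_1$, $A_2$ and $\gamma$ given in the text, and compares it with the full series. The function names are hypothetical, and $\gamma$ is taken as the exponential of Euler's constant.

```python
import math

EULER_CONSTANT = 0.5772156649015329
GAMMA = math.exp(EULER_CONSTANT)      # 1.781072418..., the constant quoted in the text


def hankel0_second_kind(x, terms=20):
    """H0^(2)(x) = J0(x) - j Y0(x) from the ascending power series of J0 and Y0."""
    q = x * x / 4.0
    term = 1.0            # (-1)^k q^k / (k!)^2, starting at k = 0
    harmonic = 0.0        # running harmonic number H_k
    j0 = 0.0
    y_series = 0.0
    for k in range(terms):
        if k > 0:
            term *= -q / (k * k)
            harmonic += 1.0 / k
            y_series -= harmonic * term      # adds (-1)^(k+1) H_k q^k / (k!)^2
        j0 += term
    y0 = (2.0 / math.pi) * ((math.log(x / 2.0) + EULER_CONSTANT) * j0 + y_series)
    return complex(j0, -y0)


A1 = 1.0 - 1j * (2.0 / math.pi) * (math.log(GAMMA) - math.log(2.0))
A2 = -0.25 + 1j * (1.0 / (2.0 * math.pi)) * (math.log(GAMMA) - 1.0 - math.log(2.0))


def hankel0_small_argument(x):
    """Assumed small-argument expansion built from A1 and A2."""
    return (A1 + A2 * x * x
            - 1j * (2.0 / math.pi) * math.log(x)
            + 1j * (x * x / (2.0 * math.pi)) * math.log(x))


for x in (0.05, 0.1, 0.2):
    print(x, abs(hankel0_second_kind(x) - hankel0_small_argument(x)))
```

The logarithmic term is the source of the singularity when the testing and integration segments coincide, which is why the self term is integrated analytically.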
However, the evaluation of G,|5| can be readily carried out by substituting D.\320\251
work with the rightmost expression in (4.126). That is, we first integrate the small
argument expansion (4.127) and then perform the differentiation before setting
$\mathbf{r} = \mathbf{r}_{q_1}$. Doing so yields
(4.129)
In the above, the 1/2 term can be also viewed as a result of the identity
*.
' dl'
S1 Oft
where the bar through the integral implies principal value and $\mathbf{r}^{\pm}$ denotes that the
evaluation point may be just inside (+) or just outside (-) the contour $C$. However,
principal value concept for evaluating these integrals since the 1/2 term can be
extracted in the limit as the testing point approaches the integration surface or as
done above where the differentiation is carried out after the integration.
We can now approximate $G_{s_1 q_1}$ and its normal-derivative counterpart as
=
\302\246vi\320\257\\
D.130)
\302\246i\\Qi
\320\244
Gv = D.131)
\320\270\321\217]
\320\223 1 , oil
\320\223\320\276 \320\276
D.133)
[[\320\273*]
we obtain a total of $N + N_s$ equations for the $N$ node field values and $N_s$ values of $\Psi$
on $C^{\text{outer}}$. The entries of [A] and [B] are identical to those given in reference to (4.65).
Although more cumbersome, this approach is rigorous and would be exact apart
from the numerical approximation required in obtaining the linear systems (4.132)
and (4.133). It is commonly referred to as the finite element-boundary integral
(FE-BI) method, and scattering results based on it were used for reference in Fig. 4.24.
It should be noted that the boundary integral subsystem (4.132) is associated
with possible fictitious resonances [25, 26] and the solution fails when the resonances
are excited. To suppress them, one can simply introduce a small imaginary part in $k_0$
[27] or employ the combined field formulation [28].
4.5 EDGE ELEMENTS
In previous sections, FEM solutions were carried out using node-based scalar basis
functions based on expanding the unknown quantity in terms of its nodal values, i.e.,
its values at element nodes. Such an expansion is suitable for modeling a scalar
quantity. This is indeed the case for static potentials or a single field component
as is the case for homogeneous waveguides and 2D scattering. For inhomogeneous
waveguides, it is necessary to work with the vector wave equations (4.23)-(4.26)
requiring an expansion of the transverse vector field component $\mathbf{E}_t$ or $\mathbf{H}_t$.
However, it has been found that node-based expansions are not ideal for
representing the vector nature of an electromagnetic field. Node-based expansions require
specification of field values at element nodes where the field may not be defined
(corners). Also, the implementation of boundary conditions occurring in electromag-
radius of R = 25 cm and one of the dielectric regions is an offset cylinder of radius
a = 10 cm. Mode field solutions are given when the wavenumber of the inner offset
seen, the mode field solution using the node-based elements is substantially in error,
and corrective approaches have been extensively studied [29], [30], [31]. Initially, an
Figure 4.26 Failure of the node-based FEM implementation to predict the correct
mode fields inside a coaxial guide with an offset center conductor. Top:
mesh interior to the conducting cylinder; the offset inner cylinder material
wavenumber is $k^2_{inner} = 2000$, and that of the remaining region is
$k^2_{outer} = (196.1, 39.22)$; bottom left: mode solution using node-based
elements; bottom right: reference mode fields. [After Paulsen and
Lynch, © IEEE, 1991.]
approach referred to as the penalty method [32] was employed to reformulate the
weak wave equation (or variational functional) in conjunction with the node-based
elements. However, in recent years it has been recognized that edge-based elements
or Tangential Vector Finite Elements (TVFEs) remove the shortcomings of node-based
elements.
As discussed in Chapter 2, TVFEs are based on expanding the unknown
quantity in terms of its average values along element edges. The corresponding
basis functions are vector basis functions as opposed to the scalar basis functions
(separate expansion for each component) used when node-based finite elements are applied.
TVFEs enforce tangential field continuity along element boundaries but allow
normal field discontinuities and have been shown to be free of the shortcomings of
node-based elements [2, 33, 34, 35]. By using TVFEs, field values are not specified
where the field is not defined, spurious modes can automatically be eliminated, and
Dirichlet boundary conditions are easily imposed.
To describe the Whitney element in greater detail (see also Chapter 2), let
us consider the rectangular (x, y) coordinate system shown in Fig. 4.27. As usual, we
denote the coordinates of the first, second, and third node of a triangular element by
$(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$, respectively. Also, we denote the edge from node 1 to
node 2 as edge #1, the edge from node 2 to node 3 as edge #2, and the edge from
node 3 to node 1 as edge #3. The length of the kth edge will be denoted $l_k$, and the
unit vector directed from node i toward node j will be referred to as $\hat{e}_{ij}$
or $\hat{e}_k$. We will
assume that a point P internal to the triangle has the coordinates (x, y). The
geometry is illustrated in Fig. 4.27.
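Next, we define $A$ as the area of $\triangle 123$, $A_1$ as the area of $\triangle P23$, $A_2$ as the area of
$\triangle P31$, and $A_3$ as the area of $\triangle P12$.
Using the simplex or area coordinates $L_1$, $L_2$, and $L_3$
(given by $L_1 = N_1^e = A_1/A$, $L_2 = N_2^e = A_2/A$, and $L_3 = N_3^e = A_3/A$) defined earlier,
the three edge basis functions are
$$ \mathbf{W}_1^e = \mathbf{N}_{12} = l_1 (L_1 \nabla L_2 - L_2 \nabla L_1) \tag{4.134} $$
$$ \mathbf{W}_2^e = \mathbf{N}_{23} = l_2 (L_2 \nabla L_3 - L_3 \nabla L_2) \tag{4.135} $$
$$ \mathbf{W}_3^e = \mathbf{N}_{31} = l_3 (L_3 \nabla L_1 - L_1 \nabla L_3) \tag{4.136} $$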
Following the usual notation, the basis function $\mathbf{W}_k^e$ is associated with the kth edge
of the eth element. As noted in Chapter 2, the $l_k$ factor serves as a normalization
parameter. Each basis function can be shown to be divergence free
(i.e., $\nabla \cdot \mathbf{W}_k^e = 0$),
to ensure tangential field continuity across element boundaries, and to allow normal
field variation across element boundaries. The field $\mathbf{F}^e$ (either an electric field $\mathbf{E}^e$ or a
magnetic field $\mathbf{H}^e$) in the eth element is expanded as
$$ \mathbf{F}^e = \sum_{k=1}^{3} F_k^e \mathbf{W}_k^e \tag{4.137} $$
where the unknown coefficient $F_k^e$ represents the average field value along edge #k of
the eth element. That is, the field along the kth edge is expanded in the direction of
the unit vector introduced for the kth edge. In the case of an electromagnetic scattering
problem, the scattered field can be expanded via (4.137) and the known incident
field can then be added to form the total field.
For the triangle in Fig. 4.28, for which $(x_1, y_1) = (0, 0)$, $(x_2, y_2) = (1, 0)$, and
$(x_3, y_3) = (0, 1)$, the three vector basis functions $\mathbf{W}_1^e$, $\mathbf{W}_2^e$, and $\mathbf{W}_3^e$ associated with
edges #1, #2, and #3 are plotted in Figs. 4.29 to 4.31.
The vector basis function $\mathbf{W}_k^e$ provides a constant tangential component (with
unit magnitude) along edge #k and zero tangential component along the two other
edges. The normal component of $\mathbf{W}_k^e$, however, varies linearly along all three
edges.
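The tangential behavior just described can be checked numerically. The following is a minimal Python sketch, not the book's code: the function names (`area_coordinates`, `whitney_basis`, `tangential_component`) and the zero-based edge table `EDGES` are assumptions introduced only for illustration, and the basis functions follow the form $l_k(L_i \nabla L_j - L_j \nabla L_i)$ with the edge ordering 1-2, 2-3, 3-1 used in the text.

```python
import math

# Local edges as (start node, end node), zero-based: edge #1 = 1->2, #2 = 2->3, #3 = 3->1
EDGES = ((0, 1), (1, 2), (2, 0))


def area_coordinates(nodes, point):
    """Return the area coordinates L_i at `point` and their (constant) gradients."""
    (x1, y1), (x2, y2), (x3, y3) = nodes
    x, y = point
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # twice the signed area
    a = (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)
    b = (y2 - y3, y3 - y1, y1 - y2)
    c = (x3 - x2, x1 - x3, x2 - x1)
    L = [(a[i] + b[i] * x + c[i] * y) / two_area for i in range(3)]
    grad = [(b[i] / two_area, c[i] / two_area) for i in range(3)]
    return L, grad


def whitney_basis(nodes, point):
    """Evaluate the three edge basis functions W_k = l_k (L_i grad L_j - L_j grad L_i)."""
    L, grad = area_coordinates(nodes, point)
    basis = []
    for i, j in EDGES:
        length = math.dist(nodes[i], nodes[j])
        basis.append((length * (L[i] * grad[j][0] - L[j] * grad[i][0]),
                      length * (L[i] * grad[j][1] - L[j] * grad[i][1])))
    return basis


def tangential_component(nodes, basis_index, edge_index, s):
    """Tangential component of W_basis at fractional position s (0..1) along an edge."""
    i, j = EDGES[edge_index]
    (xi, yi), (xj, yj) = nodes[i], nodes[j]
    length = math.dist(nodes[i], nodes[j])
    tx, ty = (xj - xi) / length, (yj - yi) / length
    point = (xi + s * (xj - xi), yi + s * (yj - yi))
    wx, wy = whitney_basis(nodes, point)[basis_index]
    return wx * tx + wy * ty
```

For the triangle with nodes (0, 0), (1, 0), (0, 1), calling `tangential_component(nodes, k, e, s)` should give a value of one when `k == e` and zero otherwise, independent of `s`.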
[Figures 4.28 to 4.31: the triangle with nodes (0, 0), (1, 0), and (0, 1), and vector plots of the basis functions $\mathbf{W}_1^e$, $\mathbf{W}_2^e$, and $\mathbf{W}_3^e$ over it; both axes run from 0 to 1.]
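$$ \mathbf{A} \cdot \nabla_t \times (\nabla_t \times \mathbf{B}) = (\nabla_t \times \mathbf{A}) \cdot (\nabla_t \times \mathbf{B}) - \nabla_t \cdot [\mathbf{A} \times (\nabla_t \times \mathbf{B})] \tag{4.138} $$
$$ \iint_{\Omega} \nabla_t \cdot (\mathbf{A} \times \mathbf{B}) \, ds = \oint_C (\mathbf{A} \times \mathbf{B}) \cdot \hat{n} \, dl \tag{4.139} $$
$$ \iint_{\Omega} \mathbf{A} \cdot \nabla_t \times (\nabla_t \times \mathbf{B}) \, ds = \iint_{\Omega} (\nabla_t \times \mathbf{A}) \cdot (\nabla_t \times \mathbf{B}) \, ds - \oint_C [\mathbf{A} \times (\nabla_t \times \mathbf{B})] \cdot \hat{n} \, dl \tag{4.140} $$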
(see Fig. 4.4 for a definition of $\hat{n}$, $\Omega$, and C) in conjunction with the weighted residual
equation
X ~ = \" =
T(' V' X dy \302\260' \342\200\242
-2-3
[Vf (~ E') y2E'ldx
where $\mathbf{T}_i$ is the weighting function (we did not use the usual notation to avoid
confusion with the TVFE basis notation). The resulting weak equation becomes
x T,)\342\200\242V, x - \302\246
\321\2031!, E,l dxdy
- I T, \302\246
\320\2231
(V, x Er) x =
\321\201//
JJ [(V, A E,) \320\270]
D.142)
=
[\320\232$]{\320\232) \320\233\320\232\320\226) D\320\23343)
for the eigenvalue problem. The column $\{E^e\}$ now refers to the edge field values of
the eth element, and from (4.142)
\320\232\321\210=\\\\
K-Wjdxrfr D.145)
D\320\23346)
hieDmD\"
Km
= +
(\320\233 h + h + h + h) D.147)
-\320\251^
D.148)
= + v dx dy D.150)
^r. (Am \320\222\342\200\236
Al,Bm)\\\\
D.151)
Jdxdy D.152)
in which
=
\320\220\321\210 D.153)
\320\260\320\251-({\342\204\226
=
\320\222\321\202 D.154)
\320\263,\320\251-^\320\254\320\247
=
\320\241\342\200\236, D.155)
\302\253^-\302\253^
Dm = -Bm D.156)
and the remaining parameters are given by (4.38). We note that the subscripts m and
n refer to edge numbers, whereas the subscripts i and j are associated with the node
numbers as specified in Fig. 4.27. The above closed-form expressions are not
necessarily less expensive to evaluate than a direct numerical evaluation of the integrals
(4.144) and (4.145). Thus, in practice, one may simply opt to use Gaussian
integration for matrix element evaluation.
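As an illustration of the numerical alternative, the sketch below evaluates the two kinds of element integrals that arise for first-order edge elements on a triangle: the curl-curl integral and the basis-function overlap integral. It is a hedged example rather than the book's implementation. The function names, the omission of material factors, and the choice of the three-point edge-midpoint rule (exact for quadratic integrands, which is all the overlap integral requires here) are assumptions made for the sketch.

```python
import math

# Local edges as (start node, end node), zero-based
EDGES = ((0, 1), (1, 2), (2, 0))


def triangle_geometry(nodes):
    """Return the area, the gradients of the area coordinates, and the edge lengths."""
    (x1, y1), (x2, y2), (x3, y3) = nodes
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    grad = [((y2 - y3) / two_area, (x3 - x2) / two_area),
            ((y3 - y1) / two_area, (x1 - x3) / two_area),
            ((y1 - y2) / two_area, (x2 - x1) / two_area)]
    lengths = [math.dist(nodes[i], nodes[j]) for i, j in EDGES]
    return abs(two_area) / 2.0, grad, lengths


def whitney_at(L, grad, lengths):
    """Edge basis functions evaluated at a point given by its area coordinates L."""
    return [(lengths[k] * (L[i] * grad[j][0] - L[j] * grad[i][0]),
             lengths[k] * (L[i] * grad[j][1] - L[j] * grad[i][1]))
            for k, (i, j) in enumerate(EDGES)]


def element_matrices(nodes):
    """Return (curl_curl, overlap) 3x3 matrices for one triangle, without material factors."""
    area, grad, lengths = triangle_geometry(nodes)
    # Three-point rule: edge midpoints in area coordinates, each with weight area/3
    quadrature_points = [(0.5, 0.5, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5)]
    overlap = [[0.0] * 3 for _ in range(3)]
    for L in quadrature_points:
        W = whitney_at(L, grad, lengths)
        for m in range(3):
            for n in range(3):
                overlap[m][n] += (area / 3.0) * (W[m][0] * W[n][0] + W[m][1] * W[n][1])
    # curl W_k = 2 l_k (grad L_i x grad L_j)_z is constant over the element
    curl = [2.0 * lengths[k] * (grad[i][0] * grad[j][1] - grad[i][1] * grad[j][0])
            for k, (i, j) in enumerate(EDGES)]
    curl_curl = [[area * curl[m] * curl[n] for n in range(3)] for m in range(3)]
    return curl_curl, overlap
```

Because the curl of each basis function is constant, only the overlap integral actually needs quadrature; a higher-order rule would be required if the material parameters varied within the element.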
The assembly of the element equations (4.143) is carried out in the usual
manner. For TVFEs, each edge (or unknown) is shared by two elements only (unless
the edge is on C) and thus the resulting assembled global matrix has greater sparsity
than the corresponding global matrix for node-based elements. This makes up for
the larger number of unknowns associated with TVFEs, and typically the storage
requirements are about the same for the two types of elements. However, the
preprocessing stage of writing a FEM code based on TVFEs is more involved. In
conducting and dielectric boundaries, and unique directions for all edges (unknowns)
must be generated since these are used during the element matrix construction and
assembly process.
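One common way to generate the edge list and the unique edge directions is sketched below in Python. This is an assumed convention, not necessarily the one used by the authors: each global edge is stored with its lower-numbered node first, and an element receives a sign of +1 when its local edge direction (1 to 2, 2 to 3, 3 to 1) agrees with that global direction and -1 otherwise. The name `build_edge_table` is hypothetical.

```python
def build_edge_table(triangles):
    """Number the edges of a triangular mesh and record their orientation per element.

    triangles: list of (n1, n2, n3) global node numbers for each element.
    Returns (edge_nodes, element_edges, element_signs), where edge_nodes[g] is the
    (low node, high node) pair of global edge g, element_edges[e][k] is the global
    number of local edge k of element e, and element_signs[e][k] is +1 or -1.
    """
    edge_index = {}
    edge_nodes = []
    element_edges = []
    element_signs = []
    for tri in triangles:
        ids, signs = [], []
        for a, b in ((0, 1), (1, 2), (2, 0)):
            start, end = tri[a], tri[b]
            key = (min(start, end), max(start, end))
            if key not in edge_index:
                edge_index[key] = len(edge_nodes)  # first time this edge is seen
                edge_nodes.append(key)
            ids.append(edge_index[key])
            signs.append(1 if start < end else -1)
        element_edges.append(ids)
        element_signs.append(signs)
    return edge_nodes, element_edges, element_signs
```

For two triangles (1, 2, 5) and (2, 6, 5) that share the edge between nodes 2 and 5, the table contains five edges, and the shared edge appears with opposite signs in the two elements, which is what keeps the tangential field single valued across the interface.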
waveguide with PEC walls. The waveguide is assumed to be homogeneously filled with a
$\mathbf{H}_t$. In either case, we are interested in determining the eigenvalues $\gamma^2$ from which we
Assembly of the element equations yields the global matrix equation system
(4.157)
which is identical (in form) to (4.57) but where $\{F\}$ now represents the field values
along element edges rather than at element nodes. For TM polarization the number
of unknowns equals the number of global edges, whereas for TE
polarization the
number of unknowns equals the number of global edges minus the number of
(4.158)
with the column vector $\{H^{scat}\}$ representing the unknown scattered magnetic field
components along the edges of the triangular mesh. Following the analysis in Section
(4.159)
f(v,Xrj.^v,xw;)-k\\ntvrm\302\246 w;Jda
=
\320\272, I w;
\342\200\242
\\a
x
(- v, x di
\320\275\320\2331
Figure 4.32 Coated square cylinder illuminated by a TM polarized plane wave. [Figure labels include $d = 0.15\lambda_0$.]
in which $\Omega^d$ refers to the region occupied by the dielectric coating and $C^d$ is the
contour on the outer boundary of the dielectric as depicted in Fig. 4.32 (see also Fig.
4.16). The computational domain will be terminated by a metal-backed artificial
absorber, and therefore we can neglect the presence of any boundary integrals. The
absorber was $0.5\lambda_0$ thick and placed at a distance $0.5\lambda_0$ from the boundary between
the coating and free space. The absorber's relative permeability and permittivity were
$$ \mathbf{M}_s = \mathbf{E} \times \hat{n} = (E_z \hat{z}) \times (n_x \hat{x} + n_y \hat{y}) = -n_y E_z \hat{x} + n_x E_z \hat{y} = M_x \hat{x} + M_y \hat{y} \tag{4.162} $$
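The component form of (4.162) translates directly into code. The short Python sketch below uses a hypothetical helper name, `magnetic_current`, and assumes the outward unit normal is supplied as a pair $(n_x, n_y)$; it works for real or complex $E_z$.

```python
def magnetic_current(e_z, normal):
    """Return (M_x, M_y) for M = E x n with E = E_z z and n = (n_x, n_y)."""
    n_x, n_y = normal
    return (-n_y * e_z, n_x * e_z)
```

For example, on a contour segment whose normal points along $+\hat{y}$, a field $E_z = 2$ gives a magnetic current directed along $-\hat{x}$.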
in which E is found from E = ^~. These currents can be integrated using the
radiation integrals to give the far zone scattered field. For TM incidence, we find
|\320\223' \342\200\224
[~ZqJ:(t') Mx(r')sin\321\204' + MY{
D.163)
where the reader is referred to Fig. 4.16 for a definition of the primed parameters.
In Figs. 4.33 to 4.36, the computed equivalent currents $\mathbf{J}_s$ and $\mathbf{M}_s$
on $C^d$
are
compared with those obtained using a moment method (denoted as MoM on the
analyses give nearly identical results (for magnitude as well as phase). The
corresponding bistatic echo width as computed from (4.163) and the dual of (4.93) is given
in Fig. 4.37. Again, the FEM results are in agreement with the moment method data.
[Figures 4.33 to 4.36: magnitude and phase of the equivalent currents plotted against normalized distance, with MoM and FEM curves overlaid.]
[Figure 4.37: plot against observation angle.]
Figure 4.37 Bistatic RCS of cylinder in Fig. 4.32. (Results in Figs. 4.33 to 4.37 are
courtesy of L. S. Andersen.) The RCS was computed by integrating
equivalent electric and magnetic currents ($\mathbf{J}_s$, $\mathbf{M}_s$) on a contour a
small distance from $C^d$.
Ah Ah Ah A\\,
A\\\\ Ah Ah A%,
-A%\\ Ah Ae4i AW j
i= 1,2.3.4
*' 18
v 36
Appendix 2: Sample MATLAB Code for Implementing the Matrix Assembly
% Mesh data: 18 triangular elements on a 4 x 4 grid of 16 nodes
N_elements=18;
N_nodes=16;
n_1st_local_node=[1 2 2 2 3 3 5 5 6 6 7 8 9 9 10 10 11 12];
n_2nd_local_node=[2 6 3 7 4 8 6 10 7 11 8 12 10 14 11 15 12 16];
n_3rd_local_node=[5 5 7 6 8 7 10 9 11 10 11 11 14 13 15 14 15 15];
mu=ones(1,N_elements);
eps=ones(1,N_elements);
lambda=1;        % wavelength is not given in the source listing; assumed value
ko=2*pi/lambda;
for e=1:N_elements;
  n(1,e)=n_1st_local_node(e);
  n(2,e)=n_2nd_local_node(e);
  n(3,e)=n_3rd_local_node(e);
end;
non_cond=[6 7 10 11];   % interior (non-conducting) nodes
% Nodes locations table:
% (missing from the source listing; a uniform grid with a=1.5 cm and
%  b=.75 cm, numbered row by row, is assumed here)
xnodes=[0 .5 1 1.5 0 .5 1 1.5 0 .5 1 1.5 0 .5 1 1.5];
ynodes=[0 0 0 0 .25 .25 .25 .25 .5 .5 .5 .5 .75 .75 .75 .75];
% Initialization Process:
A=zeros(N_nodes);
Kdel=zeros(N_nodes);
K=zeros(N_nodes);
Kedel=zeros(3);
for e=1:N_elements;
  % E_z/TM polarization :
  pe=1/mu(e);
  qe=eps(e);
  % H_z/TE polarization (overrides the values above; both choices
  % coincide here because mu=eps=1) :
  pe=1/eps(e);
  qe=mu(e);
  for i=1:3;
    x(i)=xnodes(n(i,e));
    y(i)=ynodes(n(i,e));
  end;
  Area=.5*abs((x(2)-x(1))*(y(3)-y(1))-(x(3)-x(1))*(y(2)-y(1)));
  for i=1:3;
    % cyclic successors ip1, ip2 of local node i
    i1=0;
    if i==3;
      i1=3;
    end;
    ip1=(i+1)-i1;
    i2=0;
    if ip1==3;
      i2=3;
    end;
    ip2=ip1+1-i2;
    bi=y(ip1)-y(ip2);
    ci=x(ip2)-x(ip1);
    for j=1:3;
      j1=0;
      if j==3;
        j1=3;
      end;
      jp1=(j+1)-j1;
      j2=0;
      if jp1==3;
        j2=3;
      end;
      jp2=(jp1+1)-j2;
      bj=y(jp1)-y(jp2);
      cj=x(jp2)-x(jp1);
      Ae(i,j)=-(pe*(bi*bj+ci*cj))/(4*Area);
      Kedel(i,j)=-Ae(i,j);
      if i==j;
        Ke(i,j)=qe*(Area/6);
        Ae(i,j)=Ae(i,j)+(ko^2)*qe*(Area/6);
      else;
        Ke(i,j)=qe*(Area/12);
        Ae(i,j)=Ae(i,j)+(ko^2)*qe*(Area/12);
      end;
      % add the element entries to the global matrices
      A(n(i,e),n(j,e))=A(n(i,e),n(j,e))+Ae(i,j);
      Kdel(n(i,e),n(j,e))=Kdel(n(i,e),n(j,e))+Kedel(i,j);
      K(n(i,e),n(j,e))=K(n(i,e),n(j,e))+Ke(i,j);
    end;
  end;
end;
K_TE=K;
Kdel_TE=Kdel;
A_TE=A;
K_TM=K(non_cond,non_cond);
Kdel_TM=Kdel(non_cond,non_cond);
A_TM=A(non_cond,non_cond);
eig_squares_TE=eig(Kdel_TE,K_TE);
eig_values_indices_TE=find(eig_squares_TE >= 0);
eig_values_TE=sqrt(eig_squares_TE(eig_values_indices_TE));
eig_values_TE=sort(eig_values_TE);
figure(1)
plot(eig_values_TE,'*');
xlabel('The Modes (TE case)');
ylabel('The eigenvalues');
title('Eigen values for TE modes in a rectangular waveguide');
grid;
legend('a=1.5 cm and b=.75 cm');
figure(2)
eig_squares_TM=eig(Kdel_TM,K_TM);
eig_values_indices_TM=find(eig_squares_TM >= 0);
eig_values_TM=sqrt(eig_squares_TM(eig_values_indices_TM));
eig_values_TM=sort(eig_values_TM);
plot(eig_values_TM,'d');
ylabel('The eigenvalues');
title('Eigen values for TM modes in a rectangular waveguide');
grid;
legend('a=1.5 cm and b=.75 cm');
e_no=1:16;
table=[e_no' xnodes' ynodes'];
diary data.data
% The General K matrix
K
% The TE A matrix
A_TE
% The TE K matrix
K_TE
% The TE Kdel matrix
Kdel_TE
%*****************
% The TM A matrix
A_TM
% The TM K matrix
K_TM
%*****************
table
diary
REFERENCES
waveguides using tangential vector finite elements. IEEE Trans. Microwave Theon
York, 1961.
Three-Dimensional Problems: Closed Domain
5.1 INTRODUCTION
Finite elements have been used extensively to model open- and closed-domain
electromagnetic problems in scalar form in two and three dimensions [1], [2], [3].
However, the true power of the finite element method is revealed in three-dimensional
volume formulations, since surface-based integral equation methods have great
difficulty in dealing with material and structural inhomogeneities. As explained in
earlier chapters, finite elements do not suffer from these shortcomings. But for a long
time [4], a reliable full vector formulation proved to be extremely difficult to
implement. Discretization of the curl-curl version of the wave equations (1.30) usually
resulted in the appearance of nonphysical or spurious modes. The cause of the
problem is the traditional nodal basis functions that are used to discretize the
unknown field variable.1 The reasons for the failure of node-based elements in
The first part of this chapter describes the variational formulation for the
1.11.1), the variational formulation leads to the same system of equations as the
weighted residual method employed in Chapters 3 and 4. We also formulate the
problem in terms of vector potentials. The field formulation and the potential
formulation are equivalent; however, each has its pros and cons. After the formulation
to obtain the linear system of equations, we briefly describe the problem of spurious
solutions encountered with node-based elements. Generation and assembly of the
finite element coefficient matrix using tetrahedra and bricks are given, and issues
related to modeling sources for circuit problems are discussed. This is very critical for
the computation of the scattering parameters (S parameters) in a circuit. We end the
chapter by presenting a few applications of the finite element method pertaining to
cavity resonators and packaged circuit configurations.
5.2 FORMULATION
Figure 5.1 Inhomogeneous structure enclosed by a mesh termination surface $S_0$ assumed to
be a perfect electric conductor (PEC). [Figure label: resistive or impedance sheet.]
As done in the previous chapters, the problem statement is to satisfy the vector
wave equation
$$ \nabla \times \left( \frac{1}{\mu_r} \nabla \times \mathbf{E} \right) - k_0^2 \epsilon_r \mathbf{E} = -j k_0 Z_0 \mathbf{J}^i - \nabla \times \left( \frac{1}{\mu_r} \mathbf{M}^i \right) \tag{5.1} $$
surface $S_0$ enclosing the volume V. Here, $\mathbf{J}^i$ and $\mathbf{M}^i$ are the electric and magnetic
current sources, respectively, contained within the volume. These current sources are
usually known a priori and form the excitation for the problem. In this chapter, we
take the variational approach to formulating the finite element solution. This
approach is often employed in the literature to construct the linear system, but as
discussed in Chapter 1, it typically leads to a system that is identical to that obtained
from the weighted residual method. The variational formulation is therefore another
method for finding the solution of a given boundary value problem.
The calculus of variations originates from a generalization of the elementary
theory of maxima and minima of functions. In the variational technique, we strive to
find the extrema of functionals (see Chapter 1). A functional can loosely be taken
to mean a function which depends on the entire course of one or more functions
within the domain of interest. For the wave equation, we can express the functional
for the total electric field as
F(E)=- E Vx\342\200\224(VxE)-
Mr
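where V is the entire computational domain. For the sake of simplicity, we will
assume a source-free volume domain and consider that the only sources used to
excite the circuit are coupled through the ports of the geometry. Using the vector
identity
$$ \mathbf{A} \cdot (\nabla \times \mathbf{B}) = \mathbf{B} \cdot (\nabla \times \mathbf{A}) - \nabla \cdot (\mathbf{A} \times \mathbf{B}) \tag{5.3} $$
together with the divergence theorem, the functional can be written as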
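$$ F(\mathbf{E}) = \frac{1}{2} \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{E}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{E} \right] dV + \frac{1}{2} \oint_{S_0} \left[ \mathbf{E} \cdot (\hat{n} \times \nabla \times \mathbf{E}) \right] dS + \int_V \mathbf{E} \cdot \mathbf{f}^i \, dV \tag{5.4} $$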
In (5.4), $\mathbf{f}^i$ is the source function given by
$$ \mathbf{f}^i = j k_0 Z_0 \mathbf{J}^i + \nabla \times \left( \frac{1}{\mu_r} \mathbf{M}^i \right) \tag{5.5} $$
and $S_0$ denotes the surfaces within V for which the tangential component of E and/or
H is discontinuous. We remark that the corresponding formulation based on the
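weighted residual method would lead to the weak form of the vector wave equation
$$ \langle \mathbf{R}, \mathbf{W} \rangle = \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{W}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{W} \right] dV + \oint_{S_0} \left[ \mathbf{W} \cdot (\hat{n} \times \nabla \times \mathbf{E}) \right] dS + \int_V \mathbf{W} \cdot \mathbf{f}^i \, dV \tag{5.6} $$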
$$ \hat{n}_K \times (\hat{n}_K \times \mathbf{E}) = -R \, \hat{n}_K \times (\mathbf{H}^+ - \mathbf{H}^-) \tag{5.7} $$
must be enforced, where $\mathbf{H}^{\pm}$ denotes the total magnetic field above and below the
sheet, R is the resistivity in Ohms per square meter, and $\hat{n}_K$ is the unit normal vector
to the sheet pointing in the upward direction (+ side). For an impenetrable
impedance surface, the appropriate boundary condition is (see Section 1.6)
$$ \hat{n}_K \times (\hat{n}_K \times \mathbf{E}) = -\eta \, \hat{n}_K \times \mathbf{H} \tag{5.8} $$
conditions, the functional for the total electric field can be more explicitly written as
F(E) = f (V x
\320\223\342\200\224 E) \342\200\242
(V x E) - kl(rE \302\246
\\E-P(E)]dS E.9.
+7*oZbf ^(\321\217\321\205\320\225).(\321\217\321\205\320\225)\302\253\320\2224-|
where is
\320\232 the surface resistivity of a resistive card and equals the surface impedance
for an impedance sheet. It has also been assumed that the relation between tangential
H and tangential E on the surface So can be expressed in terms of the differential
equation
x V
\320\273 x E = f(E) E.10)
Note that the factor of j in E.9) was dropped on the assumption that in carrying out
the extremization of F(E), the differentiation will be applied to P(E) as well. Again.
we note that the corresponding weighted residual equation which would lead to the
same linear system is [13]
\342\200\224 x
(V E) \342\200\242
(V x W) - /&,E \342\200\242
wl dV
\320\273
sK is0
$$ \mathbf{E} \cdot (\hat{n} \times \nabla \times \mathbf{E}) = -j k_0 Z_0 \, \mathbf{E} \cdot (\hat{n} \times \mathbf{H}) = -j k_0 Z_0 \, \mathbf{H} \cdot (\mathbf{E} \times \hat{n}) $$
Therefore, the integral over $S_0$ in (5.9) vanishes for PEC and PMC surfaces comprising
the inner boundaries of the volume V. Thus, the surface integral over $S_0$ reduces
to the integral over the outer boundary or the mesh termination surface. However,
for packaged structures which are bounded with electric walls (PEC surfaces), the
integral expression over the outer boundary vanishes too. For open structures, this is
not the case and leads to the use of absorbing boundary conditions (ABCs) which are
x E)} -
=
[(V x E) \302\246 \342\200\242 \342\200\242
\302\246
f(E) {[\320\264]-1 (V fcjjE {? E}] dV
-i
+jk0Z0 f
x
(\320\273 E) \342\200\242
(/?x E) dS + I E \342\200\242
/>(E) dS E.12)
\320\232
ish JSll
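where
$$ [\mu_r] = \begin{bmatrix} \mu_{xx} & \mu_{xy} & \mu_{xz} \\ \mu_{yx} & \mu_{yy} & \mu_{yz} \\ \mu_{zx} & \mu_{zy} & \mu_{zz} \end{bmatrix} \tag{5.13} $$
and
$$ [\epsilon_r] = \begin{bmatrix} \epsilon_{xx} & \epsilon_{xy} & \epsilon_{xz} \\ \epsilon_{yx} & \epsilon_{yy} & \epsilon_{yz} \\ \epsilon_{zx} & \epsilon_{zy} & \epsilon_{zz} \end{bmatrix} \tag{5.14} $$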
The symmetry of the final system of equations now depends on the symmetry of $[\mu_r]$
and $[\epsilon_r]$. For packaged structures, when the surface integral over the outer mesh
boundary $S_0$ vanishes, we are left with the functional
F(E)=
J [(VxE)
( E.15)
The functional representation for open boundaries is the subject of the next chapter.
The extremum of the functional can be found by using the Rayleigh-Ritz
minimization procedure, which amounts to differentiating F with respect to E and then
setting it to zero, as explained in Chapter 1. In practice, the differentiation is done after
introducing the expansion for E. By setting the derivative with respect to each
coefficient of the expansion to zero, a set of equations is obtained which is said to
be stationary with respect to the first variation in F.
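As a brief worked illustration of this step, suppose the functional is purely quadratic with no surface or source terms, and that E is expanded in N real vector basis functions $\mathbf{W}_j$ with unknown coefficients $E_j$. The overall factor of 1/2, the isotropic material parameters, and the symbols $A_{ij}$ and $B_{ij}$ are assumptions made for this sketch.

```latex
\begin{aligned}
\mathbf{E} &\approx \sum_{j=1}^{N} E_j \mathbf{W}_j, \\
F &= \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} E_i E_j \left( A_{ij} - k_0^2 B_{ij} \right), \\
A_{ij} &= \int_V \frac{1}{\mu_r} (\nabla \times \mathbf{W}_i) \cdot (\nabla \times \mathbf{W}_j) \, dV, \qquad
B_{ij} = \int_V \epsilon_r \mathbf{W}_i \cdot \mathbf{W}_j \, dV, \\
\frac{\partial F}{\partial E_i} &= \sum_{j=1}^{N} \left( A_{ij} - k_0^2 B_{ij} \right) E_j = 0, \qquad i = 1, \ldots, N.
\end{aligned}
```

The last line uses the symmetry $A_{ij} = A_{ji}$ and $B_{ij} = B_{ji}$, so each coefficient contributes twice to the double sum and the factor of 1/2 cancels. The N stationarity conditions together form a matrix equation in the unknown coefficients.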
The finite element problem can also be formulated in terms of vector and scalar
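design behavior could have significant importance. The potentials used as the
solution variables are the magnetic vector potential A and the electric scalar potential $\Phi$.
The potentials are defined in terms of electric and magnetic fields as [14]
$$ \nabla \times \mathbf{A} = \mathbf{B} \tag{5.16} $$
$$ \mathbf{E} = -j\omega (\mathbf{A} + \nabla \Phi) \tag{5.17} $$
where B is the magnetic flux density and $\omega$ is the angular frequency. Again, assuming
a source-free domain and using Maxwell's equations, the curl-curl wave equation becomes
$$ \nabla \times \left( \frac{1}{\mu} \nabla \times \mathbf{A} \right) - \omega^2 \epsilon (\mathbf{A} + \nabla \Phi) = 0 \tag{5.18} $$
The boundary conditions for perfect electric and perfect magnetic materials are a
little more complicated than in the field formulation. The perfect electric boundary
conditions in terms of potentials are
$$ \mathbf{A} \times \hat{n} = \frac{j}{\omega} (\mathbf{E} \times \hat{n}) = 0, \qquad \Phi = 0 \tag{5.19} $$
while the perfect magnetic boundary condition is
$$ \frac{1}{\mu} (\nabla \times \mathbf{A}) \times \hat{n} = \mathbf{H} \times \hat{n} = 0 \tag{5.20} $$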
F(A , 4\320\233
~ \320\223--VxA-VxA-
\320\223'
ty2e(A + - A I
\320\243\320\244) dV
\320\244) -=\\
2jy\\_fj, ']\302\246
E.21)
2J
numerical
stability of the algorithm.
As in the field formulation, the surface integral over $S_0$ vanishes for PEC and
PMC boundaries as well as for interior surfaces without transition conditions. The
sheet boundary condition can also be applied rather easily, as in the field formulation,
by setting
x V
\320\273 x A
E.22)
where K is defined as the surface impedance. The functional given by (5.21) is
discretized by using edge basis functions for representing the vector potential and
nodal basis functions for the scalar potential. Use of the edge-based vector potential
eliminates spurious modes (to be discussed in the next section) in the solution
spectrum and also helps in the enforcement of boundary conditions on edges and corners
of the desired structure. It should be noted that the potential formulation results in a
larger solution space for an identical mesh topology than the field formulation. This
is due to the extra nodal unknowns required to discretize the scalar potential $\Phi$.
However, as noted earlier in this section, the formulation allows robust analysis even
for the lower frequencies in the analysis spectrum [15]. The solution for (5.21) is
obtained by extremizing $F(\mathbf{A}, \Phi)$
with respect to A and $\Phi$.
5.3 ORIGIN OF SPURIOUS SOLUTIONS
Conventional finite element basis functions give rise to spurious solutions when
(5.31) is solved. As Wong and Cendes point out in [16], the origin of these spurious
solutions lies in the infinitely degenerate eigenvalue k = 0 in the eigenvalue spectrum.
Given the eigenvalue matrix system along with the PEC condition $\hat{n} \times \mathbf{E} = 0$ on the
exactly everywhere in the source-free solution region. Then the only solution
corresponding to the k = 0 eigenvalue is the trivial one E = 0. In finite elements, solving
an eigenproblem along with a constraint (5.23) is well known [17]. Researchers have
$$ \mathbf{E} = \nabla \times \mathbf{C} \tag{5.24} $$
Since substitution of (5.24) for E into (5.15) results in second derivatives, we need to
construct first derivative continuous elements or $C^1$ elements. As shown in [16],
discretization of E using node-based $C^1$ elements eliminates the problem of spurious
solutions since the nullspace of the curl operator is modeled exactly. The added
constraint of normal derivative continuity across inter-element boundaries provides
the excess degrees of freedom to increase the size of the $\nabla \Phi$
subspace. However, $C^1$
elements are not commonly found in finite elements and may need to be explicitly
derived for the problem at hand.
Another method of eliminating spurious modes, without getting rid of the
eigenvalue k = 0, is by using edge elements [20]. Webb in [21] provides an elegant
rationale as to why spurious modes do not appear with edge-based elements and why
they are likely to be present with node-based vectorial elements.
Let us consider a single tetrahedral element. With a vector first-order node-based
formulation, the element has 12 degrees of freedom (three each for the four
5.4 MATRIX GENERATION AND ASSEMBLY
To discretize the electric field E within this volume, we subdivide the volume into
$$ \mathbf{E}^e = \sum_{j=1}^{m} E_j^e \mathbf{W}_j^e \tag{5.25} $$
where $\mathbf{W}_j^e$ are the edge-based vector basis functions and $E_j^e$ denotes the expansion
coefficients of the basis $\mathbf{W}_j^e$. The upper summation index m represents the number of
edges comprising the element, and the superscript e stands for the element number.
On substituting the expansion (5.25) into (5.15) and setting $\partial F(\mathbf{E}) / \partial E_i^e = 0$, we obtain
the system of equations
\320\274 \320\274\320\274
- -
? {Ae}{Ee} ki J218\"^&)
=
J2\342\204\226\342\204\226)
10} E.26)
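where
$$ A_{ij}^e = \int_{V^e} \frac{1}{\mu_r} (\nabla \times \mathbf{W}_i^e) \cdot (\nabla \times \mathbf{W}_j^e) \, dv \tag{5.27} $$
$$ B_{ij}^e = \int_{V^e} \epsilon_r \mathbf{W}_i^e \cdot \mathbf{W}_j^e \, dv \tag{5.28} $$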
Cf =jk0Z0 1 Wf
-
(n x U)Js ~\\ -(fix Wf)
- x
\321\204 Wj)dS E.29)
|_Jy Jsj k J
In the preceding equations, all matrices and vectors following the summation sign
have been augmented using global numbers.
The surface area $S^e$ indicates the boundary of element e. As mentioned earlier,
all free faces, i.e., faces on the boundary, have zero contribution to the surface
integral since we consider only packaged structures in this chapter. This reduces
the original unknown count and eliminates the need to generate equations for
those edges/unknowns which would otherwise have to be included in the solution.
We can further simplify the surface integral contribution to [C] by taking advantage
of the inter-element continuity afforded by finite element basis functions. Due to the
continuity of tangential H at the interface between two elements, an element face
lying inside the body does not contribute to the integral over $S^e$ in the final assembly
of the element equations. As a result, the last term of (5.26) merely reduces to the
integral contribution over imperfectly conducting or impedance surfaces ($S_K$).
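The augmentation of element matrices with global numbers amounts to a scatter-add into the global matrix. The Python sketch below shows the idea under stated assumptions: the names `assemble`, `element_edges`, and `element_signs` are hypothetical, dense nested lists stand in for the sparse storage a real code would use, and the sign convention (a factor of +1 or -1 per local edge, reflecting whether its local direction agrees with the global one) is one common choice rather than the book's stated method.

```python
def assemble(num_unknowns, element_matrices, element_edges, element_signs):
    """Scatter-add local element matrices into a dense global matrix.

    element_matrices[e][i][j] : local matrix entry of element e
    element_edges[e][i]       : global unknown number of local edge i of element e
    element_signs[e][i]       : +1 or -1 orientation of local edge i of element e
    """
    global_matrix = [[0.0] * num_unknowns for _ in range(num_unknowns)]
    for local, edges, signs in zip(element_matrices, element_edges, element_signs):
        for i, (gi, si) in enumerate(zip(edges, signs)):
            for j, (gj, sj) in enumerate(zip(edges, signs)):
                # Each entry picks up the product of the two edge orientation signs
                global_matrix[gi][gj] += si * sj * local[i][j]
    return global_matrix
```

Entries belonging to an edge shared by two elements receive one contribution from each element, which is why each row of the assembled matrix couples only the few unknowns of the neighboring elements.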
For simplicity, let us assume that no impedance surfaces exist in the structure.
As will become clear in the derivation below, it is difficult to solve the problem in the
presence of impedance structures. With all surface integrals reduced to zero, we are
left with the equation
$$ [A]\{E\} - k_0^2 [B]\{E\} = 0 \tag{5.30} $$
which corresponds to the generalized eigenvalue problem for finding cavity
resonances. The matrices [A] and [B] are $N \times N$
symmetric, sparse matrices with N being
the total number of edges resulting from the subdivision of the structure excluding
the edges on the boundary. Their entries are given by (5.27) and (5.28). As usual, $\{E\}$
$$ [A]\{E\} = k [B]\{E\} \tag{5.31} $$
whose solution will yield the resonant field distribution $\{E\}$ and the corresponding
wavenumber $k_0$. The inclusion of impedance structures in (5.31) would require us to
solve a nonlinear eigenproblem given by
[A)\\E] = E.32)
where $k_0$ is the desired eigenvalue and [R] is the contribution from the surface
impedance terms. This problem is usually solved by obtaining an initial guess
from the solution of the linear eigenproblem and subsequently using this solution
to find the true solution iteratively. For more discussion on such a solution process,
the interested reader is referred to [22].
As mentioned in Chapters 3 and 4, the finite element matrices are sparse and
can be filled very quickly. If the element shapes used for discretization are simple,
then the local element matrices $A^e$ and $B^e$
can be derived analytically and used to fill
the global system after taking the proper numbering into account. For tetrahedra
using $H_0$(curl) elements (see Chapter 2), the element matrices are given by
(b6,b6) (b6lb4)
\342\200\224(\320\254\320\261,
\320\252$) (b6, b3) -(b6, b2) 0\302\2736.
bl)
-(bs.be) (bs,b5) -(bs, b4) -(bs, b3) 0>5. \320\2542) -(b5, bi)
1 1 (\320\2544,\320\2545)
-(\320\2544,\320\2545) (b4.b4) (b4. b3) -(b4,b2) (b4, bi)
-(b3lb5)
(\320\254\320\267.\320\254\320\261) (b3<b4) (b3.b.o -(b3. b2) (b3, b,)
where $(\mathbf{b}_i, \mathbf{b}_j)$ denotes the dot product of the edge vectors $\mathbf{b}_i$ and $\mathbf{b}_j$ of the tetrahedral
element. Referring to Fig. 2.6 and Table 2.4 as an example, we note that
$$ \mathbf{b}_1 = (x_2 - x_1)\hat{x} + (y_2 - y_1)\hat{y} + (z_2 - z_1)\hat{z} $$
b2
= (x-i --
[23]
\342\200\224
J
t- 2/23 /|4
+/\320\267\320\267) + 2/34 + /\320\267\320\267
-\320\233\320\267 -/.4 + /34 2/,4 + /34
-
/\320\277- /12- /\321\206- 2(\320\233,-/.4 /J
\342\200\224
/24 2/,2-/24 2\320\233\320\267/34
1-2/24 1-2/34 +/44) + /34
-\320\233\320\267 -/|4 + /44 -/14+ /44
/22- 2/,2 -/23 /.2 -/24 2(/22-/23 /22-/23- /23-/33-
+ /23
2\320\233\320\267 + /\320\267\320\267
-\320\233\320\267 + /34
-\320\233\320\267 /24
+/\320\267\320\267) 4 \302\2462/34 2/24+ /34
-
/22- /23- 2/,2 -/24 /22 /23- ^^ -/24 2/23-/34
E.34)
with
$f_{ij} = \mathbf{F}_i \cdot \mathbf{F}_j$. Here, $\mathbf{F}_i$ is the inward oriented normal vector to the tetrahedron's
face opposite to node i and has an amplitude equal to the area of the triangular
face.
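The face vectors and their dot products are straightforward to compute from the node coordinates. The Python sketch below is illustrative only: the helper names and the zero-based node numbering are assumptions. It builds each $\mathbf{F}_i$ from half the cross product of two edge vectors of the face opposite node i and flips it, if necessary, so that it points toward node i (that is, into the tetrahedron).

```python
def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def face_area_vectors(nodes):
    """Return F_i for i = 0..3: inward normal of the face opposite node i, scaled by its area."""
    vectors = []
    for i in range(4):
        a, b, c = (nodes[k] for k in range(4) if k != i)
        normal = tuple(0.5 * v for v in cross(sub(b, a), sub(c, a)))  # magnitude = face area
        if dot(normal, sub(nodes[i], a)) < 0.0:
            normal = tuple(-v for v in normal)  # make it point toward node i (inward)
        vectors.append(normal)
    return vectors


def face_dot_products(nodes):
    """Return the 4x4 table f_ij = F_i . F_j."""
    F = face_area_vectors(nodes)
    return [[dot(F[i], F[j]) for j in range(4)] for i in range(4)]
```

A useful sanity check on any implementation is that the four face vectors of a closed tetrahedron sum to zero.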
The vector edge-based expansion functions for rectangular bricks were
presented in [24], [25]. These basis functions were reviewed in Chapter 2 and consist of
12 unknowns, one for each of the 12 edges in the rectangular brick element. They are
rather simple to derive analytically and are presented here for the sake of
completeness. The $[A^e]$ elemental matrix is given by
E.35)
2 -2 1 -1
2 2 -1 1
E.36)
1 _j 2 -2
1 1 _2 2
168 Three-Dimensional Problems: Closed Domain \302\246
Chapter 5
2 1 -2 -1
1 2 -1 -2
E.37)
2 -1 2 1
1 _2 1 2
2 1 -2 -1
2 -1 2 1
E.38)
1 2 -1 -2
-1 -2 1
and $[\ ]^T$ denotes the matrix transpose (see Fig. 2.11 for the definition of
parameters and node specification).² Also, $h_x$, $h_y$, and $h_z$ denote the lengths of the edges as
shown in Fig. 2.11. Substituting the values of $[K_1]$, $[K_2]$, and $[K_3]$ in (5.35) yields
a (12 × 12) system for the $[A^e]$ element matrix. The values for the matrix $[B^e]$ are
given by
'
0 0
0 [L{\\ 0 E.39)
36
0 0 [1,1
where
$$ [L_1] = \begin{bmatrix} 4 & 2 & 2 & 1 \\ 2 & 4 & 1 & 2 \\ 2 & 1 & 4 & 2 \\ 1 & 2 & 2 & 4 \end{bmatrix} \tag{5.40} $$
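The entries of $[L_1]$ can be cross-checked by integrating products of the brick edge basis functions exactly. The Python sketch below does this for the four x-directed edges under explicit assumptions: unit relative permittivity, basis functions of the form $\phi_p(y)\,\phi_q(z)\,\hat{x}$ with $\phi_0(t) = 1 - t/h$ and $\phi_1(t) = t/h$, and an edge ordering chosen for the sketch rather than taken from Fig. 2.11. Under these assumptions the 4 × 4 block equals $(h_x h_y h_z / 36)[L_1]$.

```python
from fractions import Fraction


def linear_overlap(p, q, h):
    """Exact integral over [0, h] of phi_p * phi_q, with phi_0 = 1 - t/h and phi_1 = t/h."""
    return Fraction(h) / 3 if p == q else Fraction(h) / 6


def x_edge_mass_block(hx, hy, hz):
    """Overlap integrals of the four x-directed edge functions of an hx x hy x hz brick.

    Each function is phi_p(y) * phi_q(z) along x; the label order below is an
    assumed edge ordering: (p, q) = (0,0), (1,0), (0,1), (1,1).
    """
    labels = [(0, 0), (1, 0), (0, 1), (1, 1)]
    return [[Fraction(hx) * linear_overlap(p1, p2, hy) * linear_overlap(q1, q2, hz)
             for (p2, q2) in labels]
            for (p1, q1) in labels]
```

Since the four functions sum to one at every point, the entries of the block must add up to the brick volume, which agrees with the entries of $[L_1]$ summing to 36.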
5.5 SOURCE MODELING
To model microwave circuits, we usually need sources at the input ports to excite the
circuit. The source modeling issue is critical since it provides the input boundary
condition for the problem. A small error in the source model could lead to large
errors in the 3D simulation of circuit parameters. In providing the source model for a
microwave circuit, it is assumed that the field is already accurately solved (stabilized)
²The reader is cautioned that the matrix given in [25] contains misprints which have been corrected
in (5.35).
port and use a feeder transmission line of suitable length to excite the actual circuit.
The disadvantage of this method is that the feeder structure increases problem size
and wastes valuable computer resources. Moreover, complicated transmission line
where
since only the dominant mode excitation is being considered for now. In (5.42), $\mathbf{M}_{ij}$ is the modal pattern (eigenvector) for mode $j$ at port $i$, $\gamma_{ij}$ is the propagation constant of the $j$th mode in the $i$th port, and $a_{ij}$ are the scattering coefficients to be determined in the analysis. As noted earlier, the modal information is obtained from a 2.5D analysis of the port eigenvalue problem.
It remains to match the fields at the ports to the edge-based volume finite elements in the interior of the domain. Using the continuity condition for tangential electric fields at material interfaces, we set the projection of the modal approximation equal to projections on the port plane from the finite element basis functions inside the computational volume. Tangential continuity of edge-based finite elements across inter-element boundaries simplifies the field matching condition. Moreover, continuity of the tangential magnetic field at the port surface is automatically satisfied since it is a natural boundary condition of the electric field formulation. The field matching condition is imposed through the surface integrals on the port planes. Assuming unit power at each port, we can normalize the surface integrals for each mode on port surfaces such that
-!-E(> x V x
E,v E.44)
-I .-=o
On substituting (5.44) into the surface integral of the electric field functional, we obtain
. No. of Modes
\342\200\224
(Ex V x E) \342\200\242
\320\2535 =->/xo
Yl
i s, Mr /=l
. . / No.o
.ofModes
\342\200\224
(ExVxE).?rfS = /e**0 \320\234- E.47)
2)
{Ml\\[ASlS,){M2)
[M2)[AS!Sl)[M2}+Ja>n[S]2_
E.48)
subscripts $S_1$ and $S_2$ denote edges on ports 1 and 2, respectively. The matrix entries of
= x \342\200\242
V x ~
At f G- V
W/ W* *k'W/ dV
\342\200\242w*)
The column $\{E\}$ represents the interior volume field coefficients as in (5.31), whereas $\{a_1\}$ and $\{a_2\}$ stand for the unknown modal coefficients of the expansions (5.41) and (5.42). Thus, the transfinite method gives rise to a partly sparse and a partly dense matrix. However, the dimension of the dense part is limited to the sum of the number of modes per port and the number of ports in the structure. It is, therefore, always much smaller than the sparse portion of the matrix which represents the volume unknowns in the geometry. Other circuit excitations, like a voltage or a current source, can be modeled in the usual way by placing a voltage or current element and integrating over the volume of the element or elements occupied by the source.
Section 5.6 \302\246
Applications 171
More details on source modeling are given in Chapter 7 for antenna and scattering applications. The integrals over the sources are given in (5.4) and (5.5).

5.6 APPLICATIONS

5.6.1 Cavity Resonators
formulation using tetrahedral elements predicts the first six distinct non-trivial eigenvalues with less than four percent error and is seen to provide better accuracy than rectangular brick elements. Both the tetrahedral and the brick elements used in the computation are $H_0(\mathrm{curl})$ elements. The maximum edge length for the rectangular brick elements is 0.15 cm whereas that for the tetrahedral elements is 0.2 cm. Figure 5.2 shows that the tetrahedral elements have slightly less error when the same number of unknowns as with bricks is used for modeling the rectangular cavity. However, it cannot be categorically stated that the brick elements are better than tetrahedrals for modeling rectangular structures. The primary advantage of tetrahedra lies in their generality in being able to automatically mesh arbitrary structures. This is
Mode    Analytical   Computed (bricks),   Computed (tetra.),   Error (%)   Error (%)
                     270 Unknowns         260 Unknowns         (bricks)    (tetra.)
TM110   7.025        7.182                6.977                -2.23       .70
[Figure 5.2: Performance comparison of rec... (caption truncated); curves for Tetrahedron and Brick elements plotted against the number of unknowns.]
not true for rectangular bricks unless staircasing is permitted to model curved surfaces, as is done in the finite difference method. Bricks are used primarily for convenience in meshing and can lead to a reduction in unknowns in solids with
percent.
Finally, Table 5.3 presents the eigenvalues of the geometry illustrated in Fig. 5.3. This is a closed metallic cavity with a ridge along one of its faces. Note that even with a relatively coarse initial mesh (267 unknowns), the dominant eigenvalues are recovered with less than two percent error. However, a much finer mesh is needed to obtain a reasonable approximation to the modal field pattern or the eigenvector of the geometry.
TABLE 5.3 Ten Lowest Nontrivial Eigenvalues ($k_0$, cm$^{-1}$) for the Ridged Waveguide Geometry: (a) 267 Unknowns; (b) 671 Unknowns

[Figure 5.3: ridged cavity geometry; labeled dimensions 1.0 cm and 0.5 cm.]

As the degeneracy of the eigenvalues increases, the eigenvalue problem becomes increasingly ill-conditioned and the numerical solution is correspondingly less accurate [30]. Therefore, for the partially filled rectangular cavity, the absence of degenerate modes gives results which are accurate to within one percent of the exact eigensolutions. As expected, the solution yields a number of eigenvalues equal to the number of degrees of freedom (unknowns). Of these, there is an inherent presence of zero eigenvalues, the number of which equals the number of internal nodes. These eigenvalues correspond to the dimension of the nullspace of the curl-curl operator and were explained in an earlier section. The zero eigenvalues are easily identifiable, and because they do not correspond to physical modes, they are always discarded.
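The origin of the zero eigenvalues can be illustrated on the smallest possible mesh. The sketch below is not the book's formulation: it uses the purely topological node-to-edge (discrete gradient) and edge-to-face (discrete curl) incidence matrices of a single tetrahedron, with no boundary conditions, and the helper name `rank` is introduced here. Because the discrete curl of a discrete gradient vanishes, every nodal potential produces an edge field in the nullspace of the curl-curl matrix. For this free tetrahedron the nullspace dimension is the number of nodes minus one; with metallic walls, as in the text, only the internal nodes contribute.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by exact Gaussian elimination (rational arithmetic)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

num_nodes = 4
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
edge_index = {e: n for n, e in enumerate(edges)}

# Discrete gradient: edge (a, b) takes the potential difference phi_b - phi_a.
G = [[0] * num_nodes for _ in edges]
for n, (a, b) in enumerate(edges):
    G[n][a], G[n][b] = -1, 1

# Discrete curl: circulation around face (i, j, k) = e_ij + e_jk - e_ik.
C = [[0] * len(edges) for _ in faces]
for m, (i, j, k) in enumerate(faces):
    C[m][edge_index[(i, j)]] = 1
    C[m][edge_index[(j, k)]] = 1
    C[m][edge_index[(i, k)]] = -1

# curl(grad) = 0 for every nodal potential
CG = [[sum(C[m][e] * G[e][v] for e in range(len(edges))) for v in range(num_nodes)]
      for m in range(len(faces))]

# The curl-curl matrix C^T C has the same nullspace as C.
num_zero_eigenvalues = len(edges) - rank(C)
print(num_zero_eigenvalues)
```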
This configuration serves two purposes: first, the CPW ground serves as the microstrip ground and second, the CPW aperture and the top conductor in the microstrip run in parallel with the separation to reduce cross-talk. Without the rectangular via hole, the transition geometry has a significant cross-talk of about -15 dB at 20 GHz. However, as the frequency increases, the transmission coefficient ($S_{21}$) increases to about -4 dB due to radiation effects from the open ends of the microstrip and the CPW. On drilling a via hole connecting the CPW and the microstrip, the crosstalk levels stay below 20 dB for a wider frequency range (from 20 GHz to 50 GHz), as shown in Fig. 5.5. This is just one of a wide range of applications where a full-wave three-dimensional solution can be used to meet design specifications for critical parts of a complicated circuit.
Another type of common inter-chip feed-through is the hermetic bead transition. In Fig. 5.6, a coax-to-microstrip transition is modeled by approximating the
governs the insertion loss at lower frequencies with the loss increasing as the gap
spacing is enlarged. At higher frequencies, impedance mismatch will further degrade
interconnect performance.
The reflection and insertion losses for the hermetic transition were computed by Yook et al. [32] using the finite element method with first order edge-based tetrahedral elements. The results are given in Fig. 5.7 and, as expected, the insertion loss increases from close to 0 dB at 10 GHz to nearly -3 dB at 25 GHz. To lower the insertion loss, the air gap spacing can be decreased and the geometrical parameters can be optimized to reduce the impedance mismatch at higher frequencies.

[Figure: plot with a dB axis and an FEM curve; surviving caption text: "... = 10.8. (After Yook et al. © IEEE [32].)"]
[Figure 5.7: Scattering parameters of the hermetic bead transition shown in Fig. 5.6; $S_{11}$ and $S_{21}$ computed with FEM and FDTD versus frequency (GHz), 10 to 25 GHz. (After Yook et al. © IEEE [32].)]
In this appendix we present the matrix entries needed for finite element analysis using right triangular prisms (see Chapter 2 for their basis functions). Figure 5.8 shows a right triangular prism as an edge-based vector finite element [33], [34]. The top and bottom surfaces are identical and parallel to each other, while the vertical arms are perpendicular to the base of the prism. The vector electric field inside the element is
Figure 5.8 Right prism with edge-based unknowns: (a) perspective view; (b) top view. [Courtesy of T. Özdemir.]
Appendix \302\246
Edge-Based Right Triangular Prisms 177
an interpolation among the nine vector unknowns each parallel to and constant
along a particular edge of the prism.
The prism is specified by its height, the lengths of, and the angle between, two sides of its triangular surface, namely, $c$, $d_2$, $d_3$, and $\alpha_1$, respectively. First, we need to compute some scalar and vector quantities that will be used later in computing the matrix elements. These are illustrated in Fig. 5.8 and given by
E.50)
$$h_1 = d_2 \sin\alpha_3 \qquad (5.53)$$
$$h_2 = d_3 \sin\alpha_1 \qquad (5.54)$$
$$h_3 = d_1 \sin\alpha_2 \qquad (5.55)$$
= -I
hi E.56)
u2
= cosarjl - sina3^ E.57)
=
cosofjl
\320\270\320\267 + sinai'? E.58)
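The triangle heights used in the prism matrices can be checked with a few lines of arithmetic. The sketch below is illustrative and rests on assumptions not spelled out in this appendix: side $d_i$ is taken to lie opposite node $i$, $\alpha_i$ is the interior angle at node $i$, and the heights follow the relations $h_1 = d_2 \sin\alpha_3$, $h_2 = d_3 \sin\alpha_1$, $h_3 = d_1 \sin\alpha_2$. The function name `prism_base_geometry` is hypothetical. Each height times its opposite side must equal twice the triangle area.

```python
import math

def prism_base_geometry(d2, d3, alpha1):
    """Return sides, angles, heights and area of the triangular base.

    Assumes side d_i is opposite node i and alpha_i is the angle at node i.
    """
    # Law of cosines for the remaining side, then for the angle at node 2
    d1 = math.sqrt(d2 ** 2 + d3 ** 2 - 2 * d2 * d3 * math.cos(alpha1))
    alpha2 = math.acos((d1 ** 2 + d3 ** 2 - d2 ** 2) / (2 * d1 * d3))
    alpha3 = math.pi - alpha1 - alpha2
    # Heights measured from node i onto the opposite side d_i
    h1 = d2 * math.sin(alpha3)
    h2 = d3 * math.sin(alpha1)
    h3 = d1 * math.sin(alpha2)
    area = 0.5 * d2 * d3 * math.sin(alpha1)
    return (d1, d2, d3), (alpha1, alpha2, alpha3), (h1, h2, h3), area

# Right triangle with sides 3, 4, 5 as a simple test case
sides, angles, heights, area = prism_base_geometry(3.0, 4.0, math.pi / 2)
for d, h in zip(sides, heights):
    print(d, h, d * h / 2)  # last column should equal the area
```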
The edge-based basis functions in their final forms are given by (see also Chapter 2)
E-59)
E.60)
E.61)
Z-^-L^Jd-.- E.62)
$
= c/2
- \320\246 A
-
z/c) E.63)
(^Z.,1 Ji ^
Ki;=fZ.'i E.65)
K5 = fZ.5 E.66)
K5 = fLS E.67)
where
are the usual nodal shape functions for a triangle in the $(\xi, \eta)$ coordinate frame as
EWWCU = f (V x (V x WJ) dV
W?) \342\200\242
f [
cos ft,
_\342\200\224
d,dt /cos/?b , cosftm
\"\320\263
\320\233\320\220\320\257 Kkll
cosfty,,,
I I L I, \320\220\320\272\321\202
, I I \"l I, ;. Kjn
t V \320\234\320\270 \320\251\"\321\202 \"\320\272\"\321\202 \\"\320\277
sin
+ i /, \302\253n
\321\214\320\247ft* ft,,,,) E.\320\231)
(VxWl')-(VxMj)(/K
(\320\250
+TATX7Tsin^sinA\"n)
3 \320\233\321\203\320\230\320\220\320\233,\342\200\236\302\253\342\200\236
/
?^A'Q = f f [ (V x WO \342\200\242
(V x K?) t/K
E.7Di
=
\302\246\342\200\236 (vxM;v(VxM';)rf^
[[[
= ?^^C,Y E.7!l
= (vxm;v(VxK?)^
\\\\\\
= -EWKCU E.7:
= Kf)
\320\265\320\272\320\272\321\201\342\200\236 (v x (v x
\342\200\242
K)dv
[ [J
cos ft,
s
2 h,h,
\342\200\242
dv
w^ w;
\342\200\242
dv
w;- ivr;
= i EWWDit E.75i
= Wf-KJrfK
EWKDlt
JJJ
= 0 E.76)
EMMDie=\\\\\\ M',-M',(IV
J J J i\"
= EWWDlf E.77)
=
?\320\233\320\250>\342\200\236 Mf.K?rfK
MI-K'idF
jj|J
J .1 rr
=0 E.78)
j j j \321\203
= cx\302\253 E.79)
ifr = *
@
[ ar + or, otherwise
Air/, f /?i
=
\320\245\342\200\236 +
\320\270>\320\270*, [(colofj -cota2)(^nv + i]rws) + 2(&wr + $rws)]
-\321\203- | j
+ \342\200\224
12
[3(cota3
\342\200\224
cot^K^vtr ... + 1r%s) + 2rjrnt(cot2 o--.\342\200\224
cotor2cota3 + cot2aj)
w, = 1,
&=-?-. 4i=0
cosa3
i j \320\272 m \302\253
1 2 3 1 2 3
2 3 1 2 3 1
3 1 2 3 1 2
[Figure 5.9: (a) rectangular cavity with its actual mesh, labeled dimensions 1 cm and 0.5 cm; (b) percent error in the computed eigenvalues:]

Mode    k, cm^-1 (Exact)   % Error (Prism)   % Error (Brick)   % Error (Tetra)
TE101   5.236              0.73              -1.36             0.44
TE201   7.531              0.64              -3.13             0.56
TM111   8.179              0.22              -2.09             2.29
The percent error for triangular prisms along with the results for bricks and tetrahedrals [29] is given in the table next to the cavity mesh in Fig. 5.9 and should be compared with Table 5.1. The number of segments along the x-, y-, and z-directed edges were seven, four, and five, respectively, for both the triangular prism and the brick discretizations, resulting in 382 edge unknowns in the triangular prism case and 270 edge unknowns in the brick case. The tetrahedral discretization, on the other hand, resulted in 260 unknowns. As seen, the performance of the triangular prisms is comparable to that of bricks and tetrahedrals.
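The exact values against which these errors are measured follow from the closed-form resonances of an empty rectangular cavity, $k = \pi\sqrt{(m/a)^2 + (n/b)^2 + (p/c)^2}$. The sketch below is a hedged illustration: only the 1 cm and 0.5 cm dimensions are legible in the figure, and the third dimension of 0.75 cm is an assumption chosen because it reproduces the listed exact values. The function names are introduced here, and the sign convention for the error (exact minus computed) is inferred from the brick entry of Table 5.1.

```python
import math

def cavity_wavenumber(m, n, p, a, b, c):
    """Resonant wavenumber (cm^-1) of mode (m, n, p) in an a x b x c cm cavity."""
    return math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (p / c) ** 2)

def percent_error(exact, computed):
    """Signed percent error, taken as (exact - computed) / exact."""
    return 100.0 * (exact - computed) / exact

# Assumed cavity dimensions in cm; c = 0.75 is not legible in the source.
a, b, c = 1.0, 0.5, 0.75

k_101 = cavity_wavenumber(1, 0, 1, a, b, c)
k_110 = cavity_wavenumber(1, 1, 0, a, b, c)
k_111 = cavity_wavenumber(1, 1, 1, a, b, c)
print(round(k_101, 3), round(k_110, 3), round(k_111, 3))

# Brick result quoted in Table 5.1 for the 7.025 cm^-1 mode
print(round(percent_error(7.025, 7.182), 2))
```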
As a second test, we consider the eigenvalues of a cylindrical-circular cavity with metal walls, as shown in Fig. 5.10. The table accompanying the figure shows the percentage error in calculating the first five eigenvalues. Note that the prism modeling is quite good given that the discretization results in only four edges along the radius as well as the axis of the cylinder. This example shows the advantage of the triangular elements over rectangular ones in being able to model cavities with arbitrary cross section.
[Figure 5.10: (a) cylindrical cavity, radius = 1 cm; (b) table of exact and computed eigenvalues, including TM010 with exact value 2.405.]
REFERENCES
[12] T. B. A. Senior. Combined resistive and conductive sheets. IEEE Trans.
for driven high frequency problems using hybrid edge and nodal finite elements. IEEE Trans. Microwave Theory Tech., 44(1):15-23, January 1996.
[16] S. H. Wong and Z. J. Cendes. Combined finite element-modal solution of three-dimensional eddy current problems. IEEE Trans. Magnetics, 24(6), November 1988.
[17] O. C. Zienkiewicz. The Finite Element Method. McGraw-Hill, New York, third edition, 1979.
[18] B. M. A. Rahman and J. B. Davies. Penalty function improvement of waveguide solution by finite elements. IEEE Trans. Microwave Theory Tech., 32:922-928, August 1984.
[19] J. P. Webb. Finite element analysis of dispersion in waveguides with sharp metal edges. IEEE Trans. Microwave Theory Tech., 36(12):1819-1824, December 1988.
[20] A. Bossavit. Solving Maxwell's equations in a closed cavity, and the question of spurious modes. IEEE Trans. Magnetics, 26(2):702-705, March 1990.
[21] J. P. Webb. Edge elements and what they can do for you. IEEE Trans.
objects by the hybrid moment and finite element method. IEEE Trans. Antennas
May 1997.
[34] Z. S. Sacks and J. F. Lee. A finite element time domain method using prism elements for microwave cavities. IEEE Trans. Electromagnetic Compatibility, November 1995.
Three-Dimensional Problems: Radiation and Scattering

6.1 INTRODUCTION
184 Three-Dimensional Problems: Radiation and Scattering \302\246
Chapter d
back into the computational domain. These techniques maintain sparsity but may worsen convergence properties for iterative equation solvers. Even for direct solvers, the ill conditioning of the matrix could lead to unstable or incorrect solutions.
In the first part of this chapter, we present a survey of the more popular vector ABCs and artificial absorbers. Detailed derivations for the ABCs along with recent advances made in understanding their behavior are included. In the following sections, we formulate the open domain problem in terms of the linear functional and incorporate the ABC into the finite element system. The scattered and total field functionals are then presented. In the last section of the chapter, we include examples
of various applications solved using finite elements and absorbing boundary condi-
The motivation for applying ABCs to simulate open domain problems was discussed
in detail in Chapter 4. In three dimensions, the advantages of locality and subsequent
scalability are even clearer. All 3D finite element formulations rely on a vector representation of the underlying variable to maintain generality over a wide class
where $k_0$ is the free-space wave number. We also assume that the field has a well-
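$$\mathbf{x} = n\,\hat{n} + \mathbf{x}_0(t_1, t_2) \qquad (6.2)$$
where $\hat{n}$ is the unit normal and $\mathbf{x}_0(t_1, t_2)$ denotes the surface of the reference phase front. The curl of a vector in the above coordinate system is given by
$$\nabla \times \mathbf{E} = \nabla_t \times \mathbf{E} + \hat{n} \times \frac{\partial \mathbf{E}}{\partial n} \qquad (6.3)$$
where $\nabla_t \times \mathbf{E}$ is called the surface curl involving only the tangential derivatives and is defined in [12], [13] as
$$\nabla_t \times \mathbf{E} = -\hat{n} \times \nabla_t E_n + \hat{t}_2 \kappa_1 E_{t_1} - \hat{t}_1 \kappa_2 E_{t_2} + \hat{n}\, \nabla \cdot (\mathbf{E} \times \hat{n}) \qquad (6.4)$$
In (6.4), $\kappa_1$ and $\kappa_2$ denote the principal curvatures of the surface under consideration, $E_{t_1}$, $E_{t_2}$ are the tangential components, and $E_n$ is the normal component of the electric field on the surface. The principal curvatures are associated with the principal directions $\hat{t}_1$, $\hat{t}_2$ of a surface and are given by [11]
$$\kappa_1 = \frac{1}{R_1} = \frac{1}{h_1} \frac{\partial h_1}{\partial n} \qquad (6.5)$$
$$\kappa_2 = \frac{1}{R_2} = \frac{1}{h_2} \frac{\partial h_2}{\partial n} \qquad (6.6)$$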
where $h_1$, $h_2$ are the metric coefficients and $R_1$, $R_2$ are the principal radii of curvature
Using the aforementioned coordinates, the Wilcox expansion for a vector
$\hat{n} \times \nabla \times \mathbf{E}$ (i.e., $\hat{n} \times \mathbf{H}$ where $\mathbf{H}$ is the magnetic field) and the tangential components of the electric field on the surface. Taking the curl of the electric field expansion given by (6.7) and crossing it with the normal vector, we have
^pt
Ayr
\320\247\320\233
p=Q
where = \321\203/\320\251\320\251
\320\270 and
(\320\243,
orJxVxE- (Jka +
-
~K) E,
\320\272,\342\200\236
= 0+ 0(n~3) F.101
cation boundary farther away from the scatterer or employ higher order boundary conditions which satisfy higher order terms of (6.7).
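The way a first-order condition annihilates the leading term of an outgoing expansion and leaves an $O(n^{-3})$ residual can be seen in a scalar analog. The sketch below is not the vector ABC of this section: it applies the scalar operator $\partial/\partial r + jk + 1/r$ to a two-term outgoing expansion $e^{-jkr}(a_1/r + a_2/r^2)$, for which the residual works out to $-a_2 e^{-jkr}/r^3$. The function names and the finite-difference step are choices made here for illustration.

```python
import cmath

def field(r, k, a1=1.0, a2=1.0):
    """Two-term outgoing expansion exp(-jkr) * (a1/r + a2/r^2)."""
    return cmath.exp(-1j * k * r) * (a1 / r + a2 / r ** 2)

def first_order_residual(r, k, h=1e-6):
    """Apply (d/dr + jk + 1/r) to the field, using a central difference for d/dr."""
    du_dr = (field(r + h, k) - field(r - h, k)) / (2 * h)
    return du_dr + (1j * k + 1.0 / r) * field(r, k)

k = 2 * cmath.pi  # wavenumber for a unit wavelength

res_near = abs(first_order_residual(10.0, k))
res_far = abs(first_order_residual(20.0, k))

# Doubling the distance should reduce the residual by about 2**3 = 8
print(res_near, res_far, res_near / res_far)
```

Moving the boundary twice as far out lowers the residual by roughly a factor of eight, which is the trade-off the text describes between enlarging the mesh and raising the order of the boundary condition.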
To further reduce the order of the residual error, we include the tangential components of the curl of (6.9). This yields
~K-)
*-,\342\200\236 E,]
\302\246
\320\225
\321\200\320\272,\342\200\236\320\232
\342\200\242\321\200>
F.11)
\320\270\"+|
where $\kappa_g = \kappa_1 \kappa_2$ is the Gaussian curvature. Using the result derived in (6.9) and simplifying, (6.11) reduces to
nx V x [ft x V x E - {jl<o + -
\320\232)
\320\272\321\202 \320\225,]
\\
F.12)
If we take a closer look at the term in the square brackets on the RHS of (6.12), we find that it can be written as
- - - -
+ + 3*7,,
(\320\224\320\276 A
\320\232 V \321\205
\321\205 \320\225 (jk{) + \320\272\321\202
*^- {w \320\233\"-)
\320\225(}
=\" x E -
7,?\321\200\342\200\236+\321\200*-)\342\200\236\320\225/\342\200\236
F.13)
L -\320\232-)\320\225,
Now the dominant terms on the RHS of (6.12) can be eliminated by considering the higher order operator
\320\223n V x -(jkjk0 + \320\252\320\272\321\202 x V x E - (jk0+-\302\253\302\246\342\200\236,
-f -)E,)
-^--^-J |(\302\253
The residual of (6.14) can be reduced further to yield the absorbing boundary condition of second order which satisfies (6.7) to $O(n^{-5})$, where $n$ is the normal distance from the object surface to the phase front. This second-order ABC is found to be
4\320\272,\342\200\236 T\302\246 x V x E - (/-,, + \320\272\321\202
- f
\342\200\242)
E,}
[\320\271 -^- jl\\n
= 0 F.151
\\
^),En
Km /
The operator on the LHS of (6.15) can be applied repeatedly to obtain ABCs of increasing order; however, higher order basis functions are needed for their implementation.
After some algebraic manipulation, the terms on the LHS of (6.15) reduce to simpler ones. In addition to the wave equation, the following vector identities were utilized to carry out the simplifications and are provided below for the reader's convenience:
V,?,,
nxVx(wxVxE) = Vx x -
kfe,
- {(V x
\320\220\320\272 + (V x
{\302\253(V \320\225)\342\200\236| \320\225)\342\200\236/, E),,/,]
where $\Delta\kappa = \kappa_1 - \kappa_2$. The derivation of these identities is given in an appendix to this chapter. Upon simplification, the second-order ABC can be compactly written as
- (D - x V x E
2\320\272\342\200\236,)\320\277+ {4^, - Kg + D(jkQ - \320\251+
\342\200\242
~K \342\200\242
\320\226 + \320\272,\342\200\236
\320\224*\321\201|}
\320\225,
in which
and
F.11A
Section 6.2 \302\246
Survey of Vector ABCs 189
\321\202\321\201
l -iknn
\321\217V x
\321\205
V,En =
4* ?,=.
(p-\\)Km-L-L
V,(V E,) =
\342\200\242 n x V x V,En
- bcmV,En
(D
- 2icm)n xVxEx
+ V x + 2f =
\342\200\242 0 F.20)
xEU+1
{\302\253(V
(jk0 \320\227\320\272\321\202
-^-- V,(V
\342\200\242) E,)
JK0 \\ Km /
in which $\Delta\kappa = \kappa_1 - \kappa_2$. It can be easily shown that the above boundary condition leads to a symmetric system of equations when incorporated into the finite element
equation (6.20) cannot be incorporated into the finite element equations without modification. As explained in Chapters 3 and 4, the absorbing boundary condition is implemented in the finite element system through the surface integral over the
where $P(\mathbf{E})$ denotes the boundary condition relating the tangential magnetic field to the tangential electric field on the surface.
Let $P_1(\mathbf{E})$ denote the first-order absorbing boundary condition given by (6.10), where the subscript represents the order of the ABC. Therefore, the surface integral contribution for the first-order ABC reduces to
\320\225./>,(\320\225)
= (\320\2240+ *\342\200\236,)[ E-(f-E,)tfS F.22i
f E-E,dS-\\
that
f [ ?l\\ ?? F.23)
which is a readily implementable form of the first-order ABC. However, the second-order ABC does not simplify as easily. If $P_2(\mathbf{E})$ denotes the second-order ABC given by (6.20), we can rewrite it in more compact vector notation as
+ n - D (jk0- -
+ \302\253\"IKm*\302\253) hh
\320\2722) F.25)
m
= 0{'i
{\" + \320\250 'i + hh\\ F.261
jko(D-2Km)
+ E-{f-Vt(V-E,)\\dS
f
= ct\\El +a2E?dS
/i F.28)
J.Vo
then the divergence theorem is employed to eliminate one of the terms. Considering
the integrand of the second integral /2, we note that
F \342\200\242 = V \302\246
V x \320\277\321\204 \321\205 +
\320\233
(\321\204\320\277
\342\200\242
V \321\205
\321\204(\320\277 F)
V \342\200\242
x F)
(\321\204\320\277
= V, \342\200\242
x F)
(\321\204\320\277 + -?- \342\200\242
(n xF))-J
{\321\204\320\277
\302\246x
(\"
[\321\204\320\277 F))
on
= V,
\342\200\242
x F)
(\321\204\320\271 F.29)
/2 = f V(
\342\200\242
x F)
{\321\204\320\277 dS + f V x
\321\204( dS
\320\233\342\200\236
We next apply the surface divergence theorem to the first term on the RHS of this
expression to yield
f V, ,
\302\246 x
F)
(\321\204\320\277 dS - f
\302\246x
(n
\321\204\321\202 F) dl = 0 F.30)
Jin Jc
since the surface $S_0$ is closed. We note that $\hat{m} = \hat{l} \times \hat{n}$, where $\hat{l}$ is the unit vector along the edge of the surface element and $C$ denotes the contour of integration (see Fig. 1.1). On the basis of (6.30) and considering that $\beta$ is a simple scalar, $I_2$ reduces to
where i/r
= V \342\200\242
E,. Next, setting G = \302\246we
E,
\321\203 obtain
G \302\246
Vx/,
- n = V \342\200\242 -
(\342\204\226) V/V G -
\342\200\242
Gn F.32)
j \320\251
^
V \302\246
($G)
- V,
\342\200\242
($G)
- J
(\342\204\226\342\200\236)
(\342\204\226\342\200\236)
+?- on
and as usual Gn
= n-G. Also, since BGjdn = C?-V
V\342\200\242 the LHS
G,4-\320\243\320\241\342\200\236. of
F.32) reduces to
¹The book by van Bladel [13] also contains an extensive list of identities associated with divergence and curl operators.
G \302\246
Vtfr
- ii =
V,
\342\200\242
WG)
-
*v \302\246
G, F.33)
J \320\251
We can thus replace the integrand of $I_3$ with the expression in (6.33) and use the
= f m-(^G)dS- f
s\302\273
where m has been defined earlier and the contour integral vanishes when the princi-
principalcurvatures of the outer boundary are equal to zero, i.e.,for a rectangular ABC
surface. The integral, however, does not vanish for spherical or cylindrical bound-
boundaries, as was pointed out in [38]. In our computations, we have ignored the con-
contribution of the contour integral that results from the non-vanishing portion of the
surface integral. For further discussion on how to include the effect of the contour
integral without destroying symmetry of the finite element matrix, the interested
reader is referred to [38].The integral /3 can finally be rewritten as
F.341
Using F.28), F.31), and F.34), the complete surface integral term incorporat-
the
incorporating conformal second-order ABC reduces to
f
E \342\200\242 dS
/\302\2732(E)
=
[ (a,El
+ a2El) dS + [ /3[( V x E)J2 dS
(V-E,){V.G.E),}rfS F.35)
element system. It will then be possible to generalize our findings to a more general
E \342\200\242
f =jk0
\320\233(\320\225)dS f (El + El)dS F.36)
Js,> J-s,
F.37)
The ABC given in (6.37) is identical to the boundary condition derived in [16] for a spherical mesh termination surface and leads to a symmetric system of equations.
[
E \302\246
P2(E) dS =
\\ \\jk0E>
+
^- [(V x E)J2 - JL (V
\342\200\242
E,J]
dS F.38)
[ f F.39)
)sn Js,,
2
JSa JS,,
F.40)
-L
zi
ABC enclosure
be noted that the resulting unsymmetric system will, in general, have fewer
dimensions, the mesh for such structures can get increasingly complicated and reduce
their viability when compared with ABCs. In the following section, we discuss a
Berenger [17] rests on the concept of splitting each of the electric and magnetic field components into two parts. In the case of the most general medium, Berenger defines an electric conductivity $\sigma$ and a magnetic conductivity $\sigma^*$. However, Berenger goes one step further by introducing additional degrees of freedom in the electric and
9>^ F-42)
degree of freedom was introduced in defining the conductivity of the medium leading to the appearance of the split field components. Similarly for the dual of (6.42) we have
\320\264\321\203
F.43)
\320\224
F.44)
)
in whichal,.,- denote the new electric conductivity parameters. The dual equations
of F.44H6.46) canbe obtained by replacing ?ah with H(lf,, Hat, with 4
\342\200\224?Uh-with \320\224\32
and vice versa, and crjj>v-- with ffJiV.t.. The latter are the new magnetic conductivity
ty-tocos*
In F.48), is
\321\204 the angle the electric field makes with the _y-axis (equal to the angle
between the incident direction
field and the \320\264\320\263-axis)
and a, fi, HzM, H:vo are
unknowns to be determined from the split Maxwell's equations. Substituting the
196 Three-Dimensional Problems: Radiation and Scattering \302\246
Chapter 6
0, a' a*)
\320\240\320\234\320\246\320\236,
a* a*
\320\240\320\234\320\246\320\276*
\\
\320\240\320\270\320\246\320\260\302\273\320\260$,\320\260\302\273\320\260*) oj>)
PML(oS,<yj.O,0) tfj 0, 0)
\320\240\320\234\320\246\320\276\302\273,
values from F.48) into F.44) and F.45) and the relation for \320\257-,after taking the
necessary time derivatives, we obtain
\321\201
sin sin = +
e0E0 \321\204
-j\342\200\224E0 \321\204 0 (H:x0 H2}<>)
oi
= \320\260
e0E0 cos \321\204 (H:xt) +
-j\342\200\224Eq \321\201\320\276\321\212\321\204 \320\235.\320\273)
w
I Mo -j \342\200\224
I Hzxn
= <*?o cos \321\204
=
I$Eq sin
H:vq
\342\200\224_/\342\200\224=-1 \321\204 F.49)
Eliminating $H_{zx0}$ and $H_{zy0}$ from (6.49), we arrive at a relation between $\alpha$ and $\beta$.
F.50)
a
or = \302\246
F.51)
where G \342\200\224
Jwvcos2 + w,,sin2
\321\204 and
\321\204
F.52)
1
-\342\200\224
, F\0253>
W-.n
= Eu wY sin2 \321\204
ZqG
Thus, the impedance of the plane wave in the PML medium is given by
Z = F.54)
f
Assuming that both ($\sigma_x$, $\sigma_x^*$) and ($\sigma_y$, $\sigma_y^*$) satisfy the PML condition (6.47), the variables $G$, $w_x$, and $w_y$ become unity irrespective of frequency or angle of incidence. Consequently, the impedance of the plane wave in the PML medium reduces to the free-space impedance, and hence, the PML medium is perfectly matched to vacuum. Thus, any plane wave traveling from vacuum to a properly matched PML medium will be entirely transmitted. Another interesting thing happens to the magnitude of the propagating plane wave. The electric field of the plane wave in the PML medium can be written as
E = Eair where
/ . cos
\320\273: + .)'sin (\320\220
0z\342\200\224i. \320\223 1 \320\223 1
\320\260'\321\214\321\202\321\204
\321\201\320\263^\321\201\320\276\321\214\321\204
ft, exp I
= z. \342\200\224* z. x \342\200\224i z.
\321\204 -jco exp eXp y F.55)
V \321\201 / L eo^ J L W J
where $c$ denotes the velocity of light in free space. Therefore, the PML acts as an absorbing medium and, in the limit, will eventually absorb all propagating waves of all frequencies and incidence angles. The same results hold true for the TM case with the electric field replaced with a magnetic field. Since an arbitrary plane wave can be considered as a superposition of TE$_z$ and TM$_z$ modes, the above analysis holds true for all plane waves.
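The matching argument can be checked numerically. The sketch below is a hedged illustration under assumptions that are only partly legible in this section: it takes Berenger's forms $w = (1 - j\sigma/\omega\epsilon_0)/(1 - j\sigma^*/\omega\mu_0)$ for each direction, $G = \sqrt{w_x\cos^2\phi + w_y\sin^2\phi}$, and $Z = Z_0/G$, with the matching condition $\sigma/\epsilon_0 = \sigma^*/\mu_0$. The function name `pml_wave_impedance` and the sample values are introduced here for illustration.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # free-space permittivity (F/m)
MU0 = 4e-7 * math.pi      # free-space permeability (H/m)
Z0 = math.sqrt(MU0 / EPS0)

def pml_wave_impedance(freq_hz, phi, sx, sx_star, sy, sy_star):
    """Plane-wave impedance Z0 / G in a split-field PML (assumed Berenger form)."""
    omega = 2 * math.pi * freq_hz

    def w(sigma, sigma_star):
        return (1 - 1j * sigma / (omega * EPS0)) / (1 - 1j * sigma_star / (omega * MU0))

    g = cmath.sqrt(w(sx, sx_star) * math.cos(phi) ** 2
                   + w(sy, sy_star) * math.sin(phi) ** 2)
    return Z0 / g

sigma = 5.0                              # electric conductivity (S/m), arbitrary
sigma_star_matched = sigma * MU0 / EPS0  # enforces sigma/eps0 = sigma*/mu0
phi = math.radians(40.0)

z_matched = pml_wave_impedance(10e9, phi, sigma, sigma_star_matched, 0.0, 0.0)
z_unmatched = pml_wave_impedance(10e9, phi, sigma, 0.0, 0.0, 0.0)

print(abs(z_matched - Z0))    # essentially zero: matched to free space
print(abs(z_unmatched - Z0))  # clearly nonzero for an ordinary lossy medium
```

Changing the frequency or the angle `phi` leaves the matched impedance at the free-space value, which is the property the text relies on.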
The story, however, is not complete. We have merely established the fact that the PML absorbs plane waves incident upon it very effectively. It is still unclear what happens when a plane wave is incident at the interface of two PML media. It has been shown in [17] that for an interface normal to the x- or y-axis and lying between two matched PML media having the same conductivity couplet ($\sigma_y$, $\sigma_y^*$) or ($\sigma_x$, $\sigma_x^*$), a plane wave is transmitted without reflection regardless of angle of incidence and frequency. This is true between vacuum and a PML medium as well because vacuum can be thought of as a zero electric and magnetic conductivity medium. This principle is illustrated in Fig. 6.5. Not considering the corners of the domain, PML media along the x-axis are given conductivity values of (0, 0, $\sigma_y$, $\sigma_y^*$), whereas PML media parallel to the y-axis have conductivity values of ($\sigma_x$, $\sigma_x^*$, 0, 0). The corners have conductivities which are a superposition of the intersecting layers. The corners play a very important role since they absorb the transverse components of the field entering into the PML layer [19]. The above derivation can be easily extended to three dimensions by considering a separate couplet of conductivities in the third orthogonal direction and superimposing the conductivity couplets in the two preferred directions of propagation.
In [17], Berenger terminated the PML boundary with a perfect electric conductor. Therefore, the reflection of the propagating plane wave occurs at the electric wall and is turned back into the medium. The plane wave then passes back through the PML layer, a part of it getting absorbed along the way, and then re-enters the
We will try to answer a few of these questions in the next section. However, this
topic is still an area of active research and the behavior of PML is not very well
understood.
interesting conclusions. They rewrite the set of PML equations (6.44)-(6.46) such that all split fields are eliminated. Assuming $U_{xy} + U_{xz} = U_x$ for all components of the electric and magnetic field, the modified equations in the frequency domain are
- --
at\" erf,
+ oj)
(><?\342\200\236 Ex = -jkyH. +jk:Hy >ejk.Hy
JOlKQ + O
= -JkxHy+jkyHx - x
+
\302\25360 ax) Ez I jkyHx
for the electric field, where $k_{x,y,z}$ denote the $(x, y, z)$ components of the propagation vector $\mathbf{k}$. The magnetic field equations can be obtained from (6.56) as outlined earlier. On imposing the PML condition (6.47), duality is restored between the electric and the magnetic fields. This is necessary so that electric and magnetic field formulations lead to equivalent final results. Thus, (6.56) shows that in addition to the regular terms arising from Maxwell's equations for an anisotropic medium with a 3 × 3 uniaxial conductivity tensor, there are excitation-dependent source terms. These source terms are proportional to the differences in the conductivities. The PML medium is thus active.
Considering z-directed propagation through the PML medium and enforcing continuity and phase matching conditions at the PML interface, it is found that the
F,571
= = = = \320\276
^\320\264- \320\232 \302\260; \302\260i \302\253^
proposed by Berenger [17] and is illustrated in Fig. 6.5. When the unsplit PML equations (6.56) are once again rewritten with the values in (6.57), we obtain
. = -jkxHy +jkyHx
for z-directed propagation. These equations are identical to the ones presented in [9], based on coordinate stretching. In the coordinate stretching technique, the spatial variable $u$ is replaced by the complex spatial variable $\tilde{u}$ given by
(6.59)
assuming that the wave is propagating in the $u$-direction. The spatial variable $u$ can
equations. However, it provides us with valuable insight into the true nature of the
PML medium.
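The effect of a complex spatial variable on a propagating wave can be illustrated with a short sketch. It assumes a constant conductivity and the commonly used stretching factor $s = 1 + \sigma/(j\omega\epsilon_0)$, so that $\tilde{u} = s\,u$; this specific form is an assumption made for illustration, not a restatement of (6.59).

```python
import cmath
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity (F/m)
MU0 = 4e-7 * math.pi         # vacuum permeability (H/m)
ETA0 = math.sqrt(MU0 / EPS0) # free-space wave impedance (ohms)


def stretched_wave(u, freq, sigma):
    """Plane wave exp(-j k0 u~) with the assumed stretched variable u~ = s u."""
    omega = 2 * math.pi * freq
    k0 = omega * math.sqrt(MU0 * EPS0)
    s = 1 + sigma / (1j * omega * EPS0)  # assumed stretching factor
    return cmath.exp(-1j * k0 * s * u)


# With this choice the magnitude decays as exp(-sigma * eta0 * u),
# independent of frequency, while the phase is that of free space.
for freq in (1.0e9, 4.5e9):
    print(freq, abs(stretched_wave(0.01, freq, 1.0)), math.exp(-ETA0 * 0.01))
```

Under these assumptions the stretched wave keeps its free-space phase progression and acquires a frequency-independent decay, which is the behavior one expects from a perfectly matched absorber.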
As was shown in [19], the PML medium can be thought to consist of an
anisotropic material with a conductivity tensor. Along these lines, Sacks et al. [20]
(see also Kingsland et al. [21]) have proposed an anisotropic absorber with perfect
transmission characteristics over all incident angles and frequencies for planar
surfaces. Assuming diagonal tensors, the permeability and permittivity tensors in the
most general case can be written as
\[
\frac{\bar{\mu}}{\mu_0} = \frac{\bar{\epsilon}}{\epsilon_0} =
\begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}
\tag{6.60}
\]
The reflection coefficient for TM incidence is
\[
R^{TM} = \frac{\sqrt{b/a}\,\cos\theta_t - \cos\theta_i}{\cos\theta_i + \sqrt{b/a}\,\cos\theta_t}
\tag{6.61}
\]
with $\theta_i$ and $\theta_t$ as displayed in Fig. 6.6. From the phase matching condition at the
PML medium
\[
\frac{\bar{\epsilon}}{\epsilon_0} = \frac{\bar{\mu}}{\mu_0} =
\begin{bmatrix} \alpha - j\beta & 0 & 0 \\ 0 & \alpha - j\beta & 0 \\ 0 & 0 & \dfrac{1}{\alpha - j\beta} \end{bmatrix}
\tag{6.63}
\]
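The reflectionless property of this tensor can be checked numerically. The sketch below evaluates the TM reflection coefficient for a diagonal tensor diag(a, b, c). It assumes the phase matching condition $\sqrt{bc}\,\sin\theta_t = \sin\theta_i$ (an assumption taken from the usual analysis of such absorbers, since that relation is not legible here); the function name is hypothetical.

```python
import cmath
import math


def r_tm(theta_i, a, b, c):
    """TM reflection coefficient at a planar interface with diag(a, b, c).

    Assumes phase matching sqrt(b*c) * sin(theta_t) = sin(theta_i).
    """
    sin_t = math.sin(theta_i) / cmath.sqrt(b * c)
    cos_t = cmath.sqrt(1 - sin_t ** 2)
    ratio = cmath.sqrt(b / a)
    cos_i = math.cos(theta_i)
    return (ratio * cos_t - cos_i) / (cos_i + ratio * cos_t)


s = 1 - 2j  # alpha - j*beta with alpha = 1, beta = 2 (illustrative values)
for deg in (0, 30, 60, 85):
    theta = math.radians(deg)
    matched = abs(r_tm(theta, s, s, 1 / s))  # a = b = 1/c
    isotropic = abs(r_tm(theta, s, s, s))    # ordinary lossy medium
    print(deg, matched, isotropic)
```

With a = b = 1/c the computed reflection is zero to rounding error at every angle, whereas an isotropic lossy medium with the same a and b is reflectionless only at normal incidence.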
PML medium. In [22], it is shown that the choice of β is critical for the performance
of the absorber. If β is too small, field decay is insufficient to eliminate reflection.
Too large a value of β leads to reflection since the mesh is insufficient to model the
sudden jump in material property. This phenomenon is illustrated clearly in Fig. 6.7.
A finer mesh will improve the situation but will perhaps never remove the problem.
The value of β optimized for normal incidence is given by the relation [22]
\[
\ldots = -0.0106\,|R| + 0.0433
\tag{6.64}
\]
the PML, and N is the sampling density in the PML. Typical good choices for α and
the framework of Maxwell's equations. It is without doubt that the PML is a very
effective absorbing medium. In fact, it is probably the best artificial absorber known
to date since it is reflectionless in the limit for all incident angles, frequencies, and
polarizations. But how effective is PML and how close to the target can we place it? It
has been shown that the PML does not do a good job of absorbing evanescent waves
[24], implying that it still needs to be placed far enough from the target for the
evanescent modes to die out. This fact coupled with the convergence difficulties
[Figure: curves for several values of β (legend begins with β = 0.5); horizontal axis: segment number along waveguide (α = 1.0, f = 4.5 GHz).]
presented in solving for fields in an active medium indicate that further research
needs to be done to determine the viability of the PML when compared to the ABCs.
6.3 FORMULATION
In the following section, the open domain problem is modeled with the ABCs described
earlier and is formulated in terms of the finite element functional. A Rayleigh-Ritz
minimization is then carried out in the usual way to find the stationary point of the
functional.
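provided E is interpreted as the scattered field $\mathbf{E}^{scat}$. On writing the same equation
with the total field as the working variable, we get
\[
\hat{n} \times \nabla \times \mathbf{E} = P(\mathbf{E}) + \mathbf{U}^{inc}
\tag{6.65}
\]
where
\[
\mathbf{U}^{inc} = \hat{n} \times \nabla \times \mathbf{E}^{inc} - P(\mathbf{E}^{inc})
\tag{6.66}
\]
and $\mathbf{E} = \mathbf{E}^{scat} + \mathbf{E}^{inc}$ is the total field with $\mathbf{E}^{inc}$ being the incident electric field; as
usual, $\mathbf{E}^{scat}$ denotes the scattered field. Considering (6.65) to be the boundary
condition employed at $S_0$, we can express the functional for the total electric field as
\[
\begin{aligned}
F(\mathbf{E}) ={}& \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{E}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{E} \right] dV \\
&+ jk_0Z_0 \int_{S_k} \frac{1}{K} (\hat{n} \times \mathbf{E}) \cdot (\hat{n} \times \mathbf{E})\, dS
 + \int_{S_0} \mathbf{E} \cdot \left[ P(\mathbf{E}) + 2\,\mathbf{U}^{inc} \right] dS
\end{aligned}
\tag{6.67}
\]
Writing the volume term as
\[
G(\mathbf{E}, \mathbf{E}) = \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{E}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{E} \right] dV
\tag{6.68}
\]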
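Expressing the above relation in terms of the incident and the scattered fields, we
have
\[
G(\mathbf{E}, \mathbf{E}) = G(\mathbf{E}^{scat}, \mathbf{E}^{scat}) + 2\,G(\mathbf{E}^{scat}, \mathbf{E}^{inc}) + G(\mathbf{E}^{inc}, \mathbf{E}^{inc})
\tag{6.69}
\]
The first and third terms on the RHS of (6.69) cannot be simplified any further than
the form given in (6.68). The second term does, however, lend itself to more
simplification. Making use of a simple vector identity and the divergence theorem, we can
rewrite $G(\mathbf{E}^{scat}, \mathbf{E}^{inc})$ as
\[
\begin{aligned}
G(\mathbf{E}^{scat}, \mathbf{E}^{inc}) ={}& \int_V \mathbf{E}^{scat} \cdot \left[ \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E}^{inc} - k_0^2 \epsilon_r \mathbf{E}^{inc} \right] dV \\
&- \int_S \frac{1}{\mu_r}\, \mathbf{E}^{scat} \cdot (\hat{n} \times \nabla \times \mathbf{E}^{inc})\, dS
\end{aligned}
\tag{6.70}
\]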
since
\[
\int_V \frac{1}{\mu_r} (\nabla \times \mathbf{E}^{scat}) \cdot (\nabla \times \mathbf{E}^{inc})\, dV
= \int_V \mathbf{E}^{scat} \cdot \left[ \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E}^{inc} \right] dV
- \int_S \frac{1}{\mu_r}\, \mathbf{E}^{scat} \cdot (\hat{n} \times \nabla \times \mathbf{E}^{inc})\, dS
\tag{6.71}
\]
and the surface integral cancels out everywhere inside the computational domain
except on the mesh termination boundary $S_0$. If we define $V_d$ to be the volume
occupied by dielectric materials, then the remaining volume ($V_0 = V - V_d$) is the
volume occupied by free space. On incorporating this into (6.70), we have
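\[
\begin{aligned}
G(\mathbf{E}^{scat}, \mathbf{E}^{inc}) ={}& \int_{V_0} \mathbf{E}^{scat} \cdot \left[ \nabla \times \nabla \times \mathbf{E}^{inc} - k_0^2 \mathbf{E}^{inc} \right] dV \\
&+ \int_{V_d} \mathbf{E}^{scat} \cdot \left[ \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E}^{inc} - k_0^2 \epsilon_r \mathbf{E}^{inc} \right] dV \\
&- \int_{S_0} \mathbf{E}^{scat} \cdot (\hat{n} \times \nabla \times \mathbf{E}^{inc})\, dS
\end{aligned}
\tag{6.72}
\]
Since the incident electric field satisfies the wave equation in free space, the first term
of (6.72) is identically zero. The third term cancels exactly with the cross term
$\int_{S_0} \mathbf{E} \cdot \mathbf{U}^{inc}\, dS$ in the total field functional (6.67). The second term can be simplified
by employing the first vector Green's theorem to yield
\[
\begin{aligned}
\int_{V_d} \mathbf{E}^{scat} \cdot \left[ \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E}^{inc} - k_0^2 \epsilon_r \mathbf{E}^{inc} \right] dV
={}& \int_{V_d} \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}^{scat}) \cdot (\nabla \times \mathbf{E}^{inc}) - k_0^2 \epsilon_r \mathbf{E}^{scat} \cdot \mathbf{E}^{inc} \right] dV \\
&+ jk_0Z_0 \int_{S_d} \frac{1}{\mu_r}\, \mathbf{E}^{scat} \cdot (\hat{n} \times \mathbf{H}^{inc})\, dS
\end{aligned}
\tag{6.73}
\]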
where the normal to $S_d$ is directed away from $V_d$. The surface integral over the
dielectric interface $S_d$ occurs since the tangential component of the scattered
electric field is discontinuous over the interface between two dielectrics having
dissimilar permeabilities. It should be noted that (6.73) is valid even when there are
multiple dielectric regions present. If the dielectric regions have the same
permeability ($\mu_{r1} = \mu_{r2} = \cdots = \mu_{rn} = 1$, for example) and different permittivities, the
surface integral contribution over the dielectric interfaces, $S_d$, is zero. If
different permeability values are also present, then the permeability values must
be substituted into the element equations and the direction of the normal for the
two elements on the interface should take care of the respective signs.
Using (6.72) and (6.73), $G(\mathbf{E}^{scat}, \mathbf{E}^{inc})$ reduces to
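\[
\begin{aligned}
G(\mathbf{E}^{scat}, \mathbf{E}^{inc}) ={}& \int_{V_d} \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}^{scat}) \cdot (\nabla \times \mathbf{E}^{inc}) - k_0^2 \epsilon_r \mathbf{E}^{scat} \cdot \mathbf{E}^{inc} \right] dV \\
&+ jk_0Z_0 \int_{S_d} \frac{1}{\mu_r}\, \mathbf{E}^{scat} \cdot (\hat{n} \times \mathbf{H}^{inc})\, dS
 - \int_{S_0} \mathbf{E}^{scat} \cdot (\hat{n} \times \nabla \times \mathbf{E}^{inc})\, dS
\end{aligned}
\tag{6.74}
\]
and the functional becomes
\[
\begin{aligned}
F(\mathbf{E}^{scat}) ={}& \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}^{scat}) \cdot (\nabla \times \mathbf{E}^{scat}) - k_0^2 \epsilon_r \mathbf{E}^{scat} \cdot \mathbf{E}^{scat} \right] dV \\
&+ jk_0Z_0 \int_{S_k} \frac{1}{K} (\hat{n} \times \mathbf{E}^{scat}) \cdot (\hat{n} \times \mathbf{E}^{scat})\, dS
 + \int_{S_0} \mathbf{E}^{scat} \cdot P(\mathbf{E}^{scat})\, dS \\
&+ 2 \int_{V_d} \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}^{scat}) \cdot (\nabla \times \mathbf{E}^{inc}) - k_0^2 \epsilon_r \mathbf{E}^{scat} \cdot \mathbf{E}^{inc} \right] dV \\
&+ 2jk_0Z_0 \int_{S_d} \frac{1}{\mu_r}\, \mathbf{E}^{scat} \cdot (\hat{n} \times \mathbf{H}^{inc})\, dS
 + 2jk_0Z_0 \int_{S_k} \frac{1}{K} (\hat{n} \times \mathbf{E}^{scat}) \cdot (\hat{n} \times \mathbf{E}^{inc})\, dS \\
&+ f(\mathbf{E}^{inc})
\end{aligned}
\tag{6.75}
\]
where $V_d$ is the volume occupied by the dielectric (the portion of V where $\epsilon_r$ or $\mu_r$ are
not unity) and $S_d$ encompasses all dielectric interface surfaces. The function $f(\mathbf{E}^{inc})$
is solely in terms of the incident electric field and vanishes when we take the first
variation of $F(\mathbf{E}^{scat})$.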
6.4 APPLICATIONS
The open domain problem has varied applications in scattering, radiation, and
microwave circuit simulations. The computation of the radar echo area from geo-
packed high frequency integrated circuits. In the following sections, we will present
examples that model some of the phenomena mentioned earlier. The first few
examples demonstrate the validity of the conformal ABCs; the remaining ones
are progressively more complicated both in terms of modeling difficulty and
structural features. They include scattering simulations as well as applications of ABCs
and artificial absorbers in computing radiation from antennas and microwave
circuits. It should also be mentioned that all scattering and circuit computations
were done using $H_0(\mathrm{curl})$ elements (i.e., six unknowns per tetrahedron as
mentioned in Chapter 2). The radiation problems were solved using linear basis
functions on bricks and triangular prisms. An iterative solver was used for solving the
final system of equations in all cases. Storage was, therefore, never a problem in
any of the applications although convergence rate was geometry and excitation
dependent.
6.4.1 Scattering Examples
\[
\sigma_{3D} = \lim_{r \to \infty} 4\pi r^2\, \frac{|\mathbf{E}^{scat}|^2}{|\mathbf{E}^{inc}|^2}
\]
with $\mathbf{E}^{scat}$ as described by the far zone expressions (1.63). The currents J and M now
\[
\mathbf{J} = \hat{n} \times \mathbf{H}, \qquad \mathbf{M} = \mathbf{E} \times \hat{n}
\]
where $\hat{n}$ denotes the outward normal of a surface $S_c$ that encloses the scatterer. This
surface can be arbitrary but for better accuracy it should be placed as close as
possible to the scatterer. In the following, we will mention the $\sigma_{pq}$ echo area
or radar cross section (RCS), as it is commonly referred to. The subscripts in these
quantities simply identify the polarization of the incident and scattered fields used
for the evaluation of the RCS. Specifically,
\[
\sigma_{pq} = \lim_{r \to \infty} 4\pi r^2\, \frac{|E_p^{scat}|^2}{|E_q^{inc}|^2}
\]
implying that $\sigma_{pq}$ is the measured RCS due to the pth component of the scattered
field for a q-polarized incident plane wave. As usual, p and q represent either θ
or φ.
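The echo-area definition translates directly into a small post-processing routine. The following is a minimal sketch; the function names and sample values are illustrative. It relies on the fact that a far-zone field decays as 1/r, so the product of r² and the squared field magnitude is independent of the observation distance.

```python
import cmath
import math


def rcs(e_scat, e_inc, r):
    """Echo area 4*pi*r^2 |E_scat|^2 / |E_inc|^2, evaluated at a large r."""
    return 4 * math.pi * r ** 2 * abs(e_scat) ** 2 / abs(e_inc) ** 2


def rcs_db(e_scat, e_inc, r):
    """Echo area in dB relative to one unit of area."""
    return 10 * math.log10(rcs(e_scat, e_inc, r))


# Hypothetical far-zone field A * exp(-j k r) / r with A = 0.5, unit incident field.
k = 2 * math.pi / 0.03
for r in (1.0e3, 2.0e3):
    e_far = 0.5 * cmath.exp(-1j * k * r) / r
    print(r, rcs(e_far, 1.0, r), rcs_db(e_far, 1.0, r))
```

For a polarization-resolved value such as $\sigma_{\theta\phi}$, the same routine would simply be called with the θ component of the scattered field and the φ component of the incident field.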
FEMATS [25].
Figure 6.8 Bistatic echo-area of a perfectly conducting cube having edge length of
0.755λ. Plane wave incident from θ = 180°; φ = 90°. [Plot: FE-ABC results versus measured data; horizontal axis: observation angle θ_o, deg.]
Figure 6.11 RCS pattern in the xz plane for the composite cube shown in Fig. 6.10.
The lower half of the cube is metallic while the upper half is air-filled
with a resistive card draped over it.
scatterer was enclosed within a cubical outer boundary placed only 0.3λ away from
the scatterer. This resulted in a 30,000 unknown system which converged to the
solution in about 400 iterations when the Sommerfeld radiation condition was
employed to terminate the mesh and in 1600 iterations when the second-order
ABC was used. The increased iteration count for higher order ABCs is a direct result
of the shift in the spectrum of the matrix: higher order ABCs usually cause more
eigenvalues of the coefficient matrix to shift toward the negative real axis. For this
geometry, the second-order ABC did not provide a significant improvement in
accuracy (only about 0.1 dB) over the first-order condition. However, this is not true in all
cases, as will be demonstrated later.
The problem size with the conformal mesh termination is much smaller than
the 40,000 unknown system which results when the same target is enclosed in a
spherical termination boundary. The decrease in the unknown count is even more
dramatic as we go to larger scatterers. The same case was run with a higher
discretization resulting in a system of 50,000 unknowns; however, there was no
significant difference in the far-field values with the earlier case. The geometry for the
backscatter pattern shown in Fig. 6.12 is the same as the geometry drawn in Fig.
6.10 with the air-filled section now occupied by a lossy dielectric having $\epsilon_r = 2 - j$.
The backscatter echo-area pattern for the φφ polarization as computed by the FE
[Figure: backscatter pattern; horizontal axis: observation angle φ_o, deg.]
conducting inlets. The aperture of an inlet usually has a large radar cross section
around normal incidence. Therefore, a good understanding of its scattering
characteristics is critical if measures need to be taken for reducing its echo-area. A different
method for simulating electrically large jet-engine inlets can be found in [27]. An
Figure 6.13 Backscatter pattern of a metallic rectangular inlet (1λ x 1λ x 1.5λ) for
HH polarization. Black dots indicate computed values, and the solid
line represents measured data [28]. Mesh termination surface is
spherical.
mesh can be terminated with a rectangular box placed only 0.35λ away from the
scatterer (see inset of Fig. 6.15). The problem size reduces dramatically to 145,000
unknowns, a 35% reduction over the spherical mesh termination scheme. The
convergence time for each excitation vector is about 220 seconds, less than 4 minutes,
when run on all 56 processors of a KSR1. The computed values are again compared
by the fact that the problem size has been reduced by more than a third and
computing time by about a fourth. Thus in many cases, a conformal ABC makes it possible
to obtain a solution with the resident storage capacity and within a reasonable time
interval. The results for the VV polarization with a rectangular mesh termination are
equally accurate.
Next, consider the scattering from a perfectly conducting cylindrical inlet. Even
though integral equation codes are more efficient for such bodies of revolution, the
goal with this test is to examine the performance of the conformal absorbing
boundary conditions. Moreover, the real strength of finite elements lies in its ease of
handling material inhomogeneities encountered in practical structures. The target
[Figure: backscatter pattern, FE-ABC (HH pol); horizontal axis from 0 to 90 deg.]
third. However, the backscatter echo-area computed for the same geometry by
Shankar [29] using the finite difference-time domain method agrees with the
computed results via the FE-ABC for all incidence angles. In [29], the absorbing
boundary was placed a few wavelengths from the scattering structure.
Next, we employ a conformal termination scheme with a cylindrical surface for
mesh truncation. This example further demonstrates that a truly conformal mesh
boundary is possible with the ABCs derived earlier in the chapter. The cylindrical
outer boundary was placed about 0.45λ from the target, and the computed RCS for
Thus, the problem size was reduced by about 45% and the computation time by a
similar, if not greater, amount. The savings in computational resources is quite
significant even when we compare the rectangular and cylindrical termination
schemes: a 25% reduction in problem size and a similar decrease in computation
time. This phenomenon is only to be expected from the geometrical point of view
and, of course, improves with the problem size.
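The geometrical argument can be made concrete with a rough estimate. The sketch below compares the volume that must be meshed around a cylindrical target for box, cylindrical and spherical termination surfaces at a fixed standoff distance. It assumes that the unknown count scales roughly with the meshed volume at fixed sampling density; the target dimensions are hypothetical and are not those of the inlet discussed in the text.

```python
import math


def meshed_volumes(radius, length, standoff):
    """Meshed volume (enclosure minus target) for three termination shapes.

    All lengths are in wavelengths; the target is a cylinder of the given
    radius and length, and each boundary is placed `standoff` away from it.
    """
    target = math.pi * radius ** 2 * length
    box = (2 * (radius + standoff)) ** 2 * (length + 2 * standoff)
    cylinder = math.pi * (radius + standoff) ** 2 * (length + 2 * standoff)
    r_sphere = math.hypot(radius, length / 2) + standoff
    sphere = 4.0 / 3.0 * math.pi * r_sphere ** 3
    return {
        "box": box - target,
        "cylinder": cylinder - target,
        "sphere": sphere - target,
    }


volumes = meshed_volumes(radius=1.0, length=4.0, standoff=0.45)
for shape, volume in volumes.items():
    print(shape, round(volume, 2), round(volume / volumes["sphere"], 2))
```

The ratio of the cylindrical to the box enclosure is always π/4, and the gap to the spherical enclosure widens as the target becomes more elongated, which is consistent with the trend described above.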
[Figure: BOR reference data versus FE-ABC (HH pol); horizontal axis: observation angle θ_o, deg.]
6.4.1.3 Plate. The motivation for testing the FE-ABC method for perfectly
conducting plates is twofold. First, it is usually very difficult to model the scattering from
the edges of the plate, even using integral equation methods. Therefore, in this section
we present examples to see how the method performs at edge-on incidence. Second,
we examine the performance of termination boundaries of esoteric shapes. The first
choice is to enclose the plate in a rectangular box. The second choice is to use a box
with half cylinders attached to the faces normal to the plane of the plate, the
reasoning being that because the edge of the plate behaves like a line source and
scatters cylindrical waves, a cylindrical mesh termination is most suitable for wave
absorption. Both mesh termination schemes require approximately the same number
of unknowns; the superiority of one over the other is thus decided only on the basis
of accuracy of the computed backscatter values.
The test case is a 3.5λ x 2λ perfectly conducting rectangular plate. In Fig. 6.18,
we plot the backscatter pattern for the φφ polarization in the xz plane, i.e., over the
long side of the plate. Generally, the agreement with reference data is quite good.
However, the backscatter echo-area at edge-on incidence is not calculated accurately.
Thus, we need to check whether other mesh termination shapes will perform better.
In Fig. 6.19, we show the RCS of the conducting plate in the yz plane, i.e., over
its short side, for the φφ polarization. The backscatter echo-area for edge-on
incidence is picked up very well for a rectangular-cylindrical termination, whereas a
rectangular truncation scheme gives completely incorrect results. These two schemes
have approximately the same storage requirement; in fact, the box-cylinder
combination yields a slightly smaller system of equations. This example truly illustrates the
power of a conformal truncation scheme composed of simple shapes; not only are
the results far more accurate but even the storage requirement is slightly less.
In the above simulations, the boundary was terminated at 0.35λ from the flat
face of the plate and 0.5λ from the edges of the plate. To test the accuracy of the
ABC method as a function of mesh termination distance, we consider the backscatter
patterns from the edges of the plate as the mesh termination distance is increased.
Figure 6.20 shows that the backscatter values from the plate edges slowly take the
shape of the reference data as the mesh truncation distance is increased. However,
even though the results are seen to approach reference data as the mesh boundary is
pushed farther away from the plate edge, the experiment also shows us the
limitations of this technique. The FE-ABC method is a true 3D technique; therefore,
although it is possible to use it in solving 2D problems, the associated computational
cost makes it unjustifiably expensive. A surface formulation using integral equations
or a hybrid finite element-boundary integral formulation (see Chapter 7) is more
efficient for such applications.
[Figure 6.20 (caption partly lost): backscatter patterns for the plate. The numbers in the legend indicate mesh termination distance from the plate edges. Legend: Reference; Rect (0.45λ); Mixed (0.45λ); Mixed (0.65λ).]
removed in two ways: (i) by creating a small region near the tip and detaching it from
the surface or (ii) by chopping off a small part near the tip of the cone. The second
option inevitably leads to small inaccuracies for backscatter from the conical tip;
however, this option was chosen since the conical angle in our tested geometry is
extremely small (around 7°) and the mesh generator fails to mesh the first case on
surface is a rectangular box placed 0.4λ from the surface of the conesphere. The
far-field results compare extremely well with computations from a body of revolution
code [32].
In this conesphere example, the choice of a piecewise planar rectangular
boundary might be questioned. A truly conformal boundary would have been t
ABCs earlier in this chapter, ABCs need a piecewise smooth surface where the
scattered or radiated field can be expressed in terms of an infinite series in 1/r.
ABCs have also been found to fail for concave and re-entrant structures. Thus,
using a truly conformal ABC surface in the form of a conesphere is not a good
idea. The second hurdle in using arbitrary conformal surfaces as the mesh
termination boundary is the difficulty in implementation. Surfaces of arbitrary curvature will
usually lead to loss of symmetry in the finite element matrix, thus resulting in a more
complicated solution process for a small reduction in the size of the problem. In
order to address these problems, mesh termination strategies are being investigated
which use artificial absorbers instead of ABCs. Applications of artificial absorbers
similar to the scattering formulation except that there is the additional aspect of
source modeling. This is outlined in detail in the next chapter.
performance of ABCs for this application. Among those studied, we present the
[Figures: patterns versus observation angle θ_o, deg.; geometry labels: Patch, Metal.]
Figure 6.23 Convergence of the FE-ABC method for computing the H-plane
radiation pattern of a cavity-backed axially polarized patch. The reference
data is provided by a rigorous FE-BI formulation for the same cavity-backed
antenna.
scattering, the radiation pattern calculated via the FE-ABC method is seen to be in
excellent agreement with the pattern computed by the more rigorous finite element
boundary integral (FE-BI) [33] approach even when the mesh is terminated only
0.3λ from the aperture.
for such situations when a truly conformal ABC may be much more difficult to
implement. Figure 6.24(a) shows the setup, where a 2 cm x 3 cm rectangular patch
Figure 6.24 Cavity-backed rectangular patch on ogive: (a) setup, (b) antenna
dimensions, (c) comparison with the measurement. [After Ozdemir and
Volakis, © IEEE, 1997 [34].]
[Figure: PEC ogive geometry. Courtesy of T. Ozdemir.]
cavity (ten times the maximum cavity dimension). The φ-polarized radiation does
not interact with the edges of the ogive and can therefore be computed by
localizing the mesh near the cavity on the ogive. Thus, the radiated power pattern
accounts only for the antenna (and the curvature of the platform), but does not
include interactions with the ogive's tips. Figure 6.24(c) shows the computed
φ-polarized radiation as compared to the measurement [35]. The agreement with
measured data is very good for this polarization. However, predicting the
θ-polarized radiation (not shown) requires modeling the entire ogive as this
cause diffraction from the ogive's tips. A way to account for such secondary
diffractions is to interface the finite element-artificial absorber (FE-AA) method
6.4.2.3 Cylindrical Via. For all practical problems in circuit design, a full
wave analysis of the circuit components can be carried out only for small parts of
the circuit. Therefore, in analyzing microwave circuit problems, ABCs may be
required to predict radiation loss from circuit elements, for analyzing circuit
discontinuities or for modeling small critical paths in a large, complicated integrated circuit
design. Figure 6.26 shows a 0.77-mm radius cylindrical via discontinuity connecting
two striplines, each 3.3 mm by 0.33 mm. The metallization in the dielectric serves as
Figure 6.26 (a) Side view of cylindrical via. [Dimension labels: 5.9 mm, 0.33 mm, 3.3 mm.]
Figure 6.27 Comparison of scattering parameters of cylindrical via for open and closed
walls of top microstrip. [After Wang et al., © IEEE, 1994 [37].] [Horizontal axis: frequency (GHz).]
VxE = Vr \342\200\224
+ \320\273\321\205
\321\205\320\225
F.76)
\320\260\320\270
where Vr x E is called the surface curl involving only the tangential derivatives and
is defined as
E = -it x -
V> x VEn +
i2K{Elt t^iE,, + nV \302\246x
(E \320\273) F.77)
As before, and
\320\272\\ denote
\320\2722 the principal curvatures of the surface under considera-
consideration,
Eh, ?,, are the tangential components, and is the normal
\320\225\342\200\236 component of the
vector E on the surface.
We are interested in the evaluation of the three vector identities given earlier in
the chapter. Let us considersimplifying the tangential components of the curl of a
vector, E in this case. Using the definition of the curl given above, we have
x V
\320\273 x E = -(\320\273x n x V)?,, - - K2El2i2+n x \302\273x-
V Bn)
= Vx E,
\320\243,?\342\200\236+\320\273\321\205 F.78)
where x
-{\320\273 n x V) = V,. The first vector identity is, therefore, easily proved.
Next, we will prove the second of the three identities. We start with the term
$\hat{n} \times \nabla \times \nabla_t E_n$ and simplify it using the definition of the curl of a vector given above.
\[
\hat{n} \times \nabla \times \nabla_t E_n = \hat{n} \times \nabla \times \left[ \nabla E_n - \hat{n}\, \frac{\partial E_n}{\partial n} \right]
= -\hat{n} \times \nabla \times \hat{n}\, \frac{\partial E_n}{\partial n}
\tag{6.79}
\]
Since $\nabla \cdot \mathbf{E} = \nabla \cdot \mathbf{E}_t + (\nabla \cdot \hat{n}) E_n + \partial E_n / \partial n$, we can simplify the above relation even
further by substituting the appropriate expression for the normal derivative of the
normal component of the electric field and using the fact that the electric field is
divergence-free in a source-free region.
\[
\begin{aligned}
\hat{n} \times \nabla \times \nabla_t E_n &= \nabla_t (\nabla \cdot \mathbf{E}_t) + (\nabla \cdot \hat{n})\, \nabla_t E_n \\
&= \nabla_t (\nabla \cdot \mathbf{E}_t) + 2\kappa_m \nabla_t E_n
\end{aligned}
\tag{6.80}
\]
where $\kappa_m = (\kappa_1 + \kappa_2)/2$ is the mean curvature.
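The proof of the third identity is more complicated because it involves two curl
operations on the electric field. We first need to switch the positions of the outermost
$\hat{n} \times$ and the $\nabla \times$ operators to arrive at a simplified form of the rather complex
expression. Therefore,
\[
\begin{aligned}
\hat{n} \times \nabla \times (\hat{n} \times \nabla \times \mathbf{E})
={}& \nabla \times \{ \hat{n} \times \hat{n} \times \nabla \times \mathbf{E} \} - \hat{n}\, \nabla_t \cdot (\hat{n} \times \nabla \times \mathbf{E}) \\
&- \Delta\kappa \{ (\nabla \times \mathbf{E})_{t_2} \hat{t}_1 + (\nabla \times \mathbf{E})_{t_1} \hat{t}_2 \} \\
={}& -\nabla \times \{ \nabla \times \mathbf{E} - \hat{n} (\nabla \times \mathbf{E})_n \} - \hat{n}\, \nabla_t \cdot (\hat{n} \times \nabla \times \mathbf{E}) \\
&- \Delta\kappa \{ (\nabla \times \mathbf{E})_{t_2} \hat{t}_1 + (\nabla \times \mathbf{E})_{t_1} \hat{t}_2 \}
\end{aligned}
\tag{6.81}
\]
Now we use the fact that the electric field satisfies the wave equation to reduce the
\[
\hat{n} \times \nabla \times (\hat{n} \times \nabla \times \mathbf{E})
= \nabla \times \{ \hat{n} (\nabla \times \mathbf{E})_n \} - k_0^2 \mathbf{E}_t
- \Delta\kappa \{ (\nabla \times \mathbf{E})_{t_2} \hat{t}_1 + (\nabla \times \mathbf{E})_{t_1} \hat{t}_2 \}
\tag{6.82}
\]
where $\Delta\kappa = \kappa_1 - \kappa_2$.
Thus, we have shown that all three identities hold as long as the vector, E in
this case, is divergenceless and satisfies the vector wave equation.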
REFERENCES
objects by the hybrid moment and finite element method. IEEE Trans. Antennas
[7] L. C. Kempel and J. L. Volakis. Evaluation of new vector ABCs for conformal
printed antennas. 1994 URSI Radio Science Meeting Digest, Seattle, WA.
[8] T. Ozdemir and J. L. Volakis. A comparative study of an absorber boundary
condition and an artificial absorber for truncating finite element meshes. Radio
Science, 29:1255-1263, September-October 1994.
[9] W. C. Chew and W. H. Weedon. A 3D perfectly matched medium from
modified Maxwell's equations with stretched coordinates. Microwave Opt. Tech.
Lett., 7(13), September 1994.
48:163-193, 1992.
[13] J. van Bladel. Electromagnetic Fields. Hemisphere Publishing Corp., New York,
1985.
[14] S. M. Rytov. Computation of the skin effect by the perturbation method.
J. Exp. Theor. Phys., 10:180-189, 1940. Translation by V. Kerdemelidis and
K. M. Mitzner, Northrop Navair, Hawthorne, CA 90250.
[15] A. F. Peterson. Absorbing boundary conditions for the vector wave equation.
Microwave and Opt. Techn. Letters, 1:62-64, April 1988.
element solution of the vector wave equation. Microwave and Opt. Techn.
Letters, 2(10):370-372, October 1989.
[17] J. P. Berenger. A perfectly matched layer for the absorption of electromagnetic
waves. J. Comp. Phys., 114(2):185-200, October 1994.
[18] D. S. Katz, E. T. Thiele, and A. Taflove. Validation and extension to three
[19] R. Mittra and U. Pekel. A new look at the Perfectly Matched Layer (PML)
concept for the reflectionless absorption of electromagnetic waves. IEEE
Microwave and Guided Wave Lett., 5(3):84-86, March 1995.
[20] Z. J. Sacks, D. M. Kingsland, R. Lee, and J.-F. Lee. A perfectly matched
anisotropic absorber for use as an absorbing boundary condition. IEEE
Trans. Antennas Propagat., 43:1460-1463, 1995.
[21] D. M. Kingsland, J. Gong, J. L. Volakis, and J.-F. Lee. Performance of an
anisotropic artificial absorber for truncating finite element meshes. IEEE Trans.
layer for truncating finite element meshes. IEE Electronics Lett., 31(18):1559-1561,
August 1995.
Radar Conference Proceedings, pp. 339-344, Ann Arbor, MI, May 1996.
[26] A. D. Yaghjian and R. V. McGahan. Broadside radar cross-section of a per-
1991.
[33] L. C. Kempel and J. L. Volakis. Scattering by cavity-backed antennas on a
circular cylinder. IEEE Trans. Antennas Propagat., 42:1268-1279, September
1994.
[34] T. Ozdemir and J. L. Volakis. Triangular prisms for edge-based vector finite
element antenna analysis. IEEE Trans. Antennas Propagat., pp. 788-797, May
1997.
[35] R. J. Sliva and H. T. G. Wang. Personal communication.
FE-BI Method
7.1 INTRODUCTION
In this chapter, the finite element-boundary integral (FE-BI) method for full
three-dimensional geometries is presented. This is one of the most powerful computational
electromagnetics (CEM) techniques in use today and represents a hybridization of
the traditional method of moments with the finite element method. Interest in FE-BI
stems from the fact that volume integral equations have difficulty modeling
combined metal and dielectric structures and they lead to more computationally intensive
programs as compared to the finite element method. In the FE-BI method, the
boundary integral (or integral equation) is used to satisfy the following requirements:
1. Bound or terminate the computational domain in which the finite element
method is used.
2. Relate the electric and magnetic fields on the boundary.
The manner in which the FE-BI method satisfies these requirements will be
presented.
valuable since each case individually combines the flexibility of the finite element
method with the efficiency of a special boundary integral mesh closure. For both the
general and special cases, comments regarding computational cost, flexibility, and
accuracy will be addressed. For the case of cavity-backed antennas recessed in a
ground plane, an extremely efficient solution technique that utilizes Fast Fourier
Transforms (e.g., the CG-FFT method) will be presented in detail.
For the most part, the method of weighted residuals and Galerkin's technique
(for the total electric field formulation considered herein). Various dielectric and
magnetic materials are specified by appropriate permittivities and permeabilities
on an element-by-element basis. This is in marked contrast to the surface integral
equation (method of moments) since in that case the material must be homogeneous
within each enclosed domain.
We begin with a derivation of the FE-BI equations using the physical
equivalence principle.
The derivation of the FE-BI equations begins with the vector wave equation. This
second-order partial differential equation is solved by first taking the inner product of
the vector wave equation and a vector sub-domain basis function, $\mathbf{W}_i$, thus forming the
weighted residual (see Chapters 1 and 4). Our goal is to minimize this residual or,
equivalently, to minimize the difference between the solution of the FE-BI discrete
approximation and physical reality. This procedure generates $N_e$ equations, where
$N_e$ is the number of sub-domain basis functions associated with the electric field within
and on the boundary of the volume. The resulting integro-differential equation is
\[
\int_V \nabla \times \left[ \frac{1}{\mu_r} \nabla \times \mathbf{E}^{int} \right] \cdot \mathbf{W}_i\, dV
- k_0^2 \int_V \epsilon_r \mathbf{E}^{int} \cdot \mathbf{W}_i\, dV = f_i^{int}
\tag{7.1}
\]
In this, the left hand side contains the unknown interior electric fields ($\mathbf{E}^{int}$) while the
right hand side has the impressed sources ($\mathbf{J}^i$, $\mathbf{M}^i$). Since the excitation of the system
is not relevant to the derivation of the FE-BI equations, the right hand side can be
expressed as
\[
f_i^{int} = -\int_V \mathbf{W}_i \cdot \left[ \nabla \times \left( \frac{\mathbf{M}^i}{\mu_r} \right) + jk_0Z_0\, \mathbf{J}^i \right] dV
\tag{7.2}
\]
and its evaluation is left for specific applications. In practice, the electric current $\mathbf{J}^i$ in
(7.2) is useful for modeling filamentary current sources such as the ones used to
excite patch antennas, examples of which will be presented later in this chapter.
The magnetic current $\mathbf{M}^i$ can be used to represent aperture feeds within the
computational domain or on its boundary.
The FE-BI equation (7.1) contains second-order derivatives of the unknown
electric field due to the use of the wave equation. It is desirable to transfer one of the
derivatives from the unknown electric field onto the weight function so that linear
weight and expansion functions may be used. This derivative transfer is
accomplished by invoking the first vector Green's theorem [1] (see Chapter 5). Doing so,
(7.1) becomes
$$ \int_V \left[ \frac{(\nabla \times \mathbf{W}_i) \cdot (\nabla \times \mathbf{E}^{\text{int}})}{\mu_r} - k_0^2 \epsilon_r \mathbf{W}_i \cdot \mathbf{E}^{\text{int}} \right] dV - j k_0 Z_0 \oint_S \mathbf{W}_i \cdot (\hat{n} \times \mathbf{H}^{\text{int}}) \, dS = f_i^{\text{int}} \tag{7.3} $$
This is the weak form of the wave equation, and it possesses useful properties
compared to (7.1). It has a symmetric volume contribution since an identical number
of derivatives are required of both the unknown electric field and weight function
and we will be using Galerkin's testing procedure. Hence, one may expect a
symmetric linear system associated with the volume integral provided the material within
the computational volume is reciprocal (i.e., not general anisotropic).
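The derivative transfer can be checked in a few lines. The following is a sketch only, assuming an $e^{j\omega t}$ time convention (so that $\nabla \times \mathbf{E}^{\text{int}} = -j k_0 Z_0 \mu_r \mathbf{H}^{\text{int}}$) and taking $\hat{n}$ as the outward unit normal to $S$; both are assumptions made here for illustration.

```latex
% Vector identity, to be used with A = W_i and B = (curl E^int) / mu_r
\nabla \cdot (\mathbf{A} \times \mathbf{B})
  = \mathbf{B} \cdot (\nabla \times \mathbf{A}) - \mathbf{A} \cdot (\nabla \times \mathbf{B})

% Integrate over V and apply the divergence theorem
\int_V \mathbf{W}_i \cdot \nabla \times \left[ \frac{\nabla \times \mathbf{E}^{\text{int}}}{\mu_r} \right] dV
  = \int_V \frac{(\nabla \times \mathbf{W}_i) \cdot (\nabla \times \mathbf{E}^{\text{int}})}{\mu_r} \, dV
  - \oint_S \left[ \mathbf{W}_i \times \frac{\nabla \times \mathbf{E}^{\text{int}}}{\mu_r} \right] \cdot \hat{n} \, dS

% Replace the curl of E by the magnetic field in the surface term
- \oint_S \left[ \mathbf{W}_i \times \frac{\nabla \times \mathbf{E}^{\text{int}}}{\mu_r} \right] \cdot \hat{n} \, dS
  = j k_0 Z_0 \oint_S (\mathbf{W}_i \times \mathbf{H}^{\text{int}}) \cdot \hat{n} \, dS
  = -j k_0 Z_0 \oint_S \mathbf{W}_i \cdot (\hat{n} \times \mathbf{H}^{\text{int}}) \, dS
```

After the transfer, one curl acts on $\mathbf{W}_i$ and one on $\mathbf{E}^{\text{int}}$, which is why the volume term is symmetric under Galerkin testing.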
Recall that in the beginning of this chapter, we stated that there are two
requirements which should be satisfied on the boundary of the finite element
mesh: (1) mesh closure and (2) relating the tangential electric field to the tangential
magnetic field. The latter requirement is clearly illustrated in (7.3) since the surface
term includes the surface electric field through the testing function $\mathbf{W}_i$ and the
tangential magnetic field $\hat{n} \times \mathbf{H}^{\text{int}}$. In addition, with some foresight, we recognize
that (7.3) represents an underdetermined system since the test functions are only
associated with the electric field while both the electric field and surface magnetic
field are unknown. We must therefore find a means of closing the mesh, relating the
surface electric and magnetic fields, and providing additional equations.
The exterior excitation (e.g., a plane wave) can be introduced into (7.3) by
considering the incident, reflected, and secondary (or scattered) fields as separable.
Specifically, the total exterior magnetic field can be expressed as the sum
$$ \mathbf{H}^{\text{ext}} = \mathbf{H}^i + \mathbf{H}^{\text{refl}} + \mathbf{H}^{\text{scat}} \tag{7.4} $$
Here, the incident field, $\mathbf{H}^i$, and the reflected field, $\mathbf{H}^{\text{refl}}$, are known while the
secondary field ($\mathbf{H}^{\text{scat}}$) is obtained in terms of the interior fields using the surface
equivalence principle. This field decomposition is useful in particular for analyzing
structures that are infinite in extent, such as a conformal antenna recessed in a
metallic ground plane, since the reflected field is already known and hence need
not be computed. For finite structures, such as a scattering body encased within a
surface of revolution domain (see the FE-SOR method discussed later in this
chapter), the reflected field is omitted and only the impressed (incident) and secondary
(scattered) fields are considered, where $\mathbf{H}^{\text{scat}}$ would also have to include any reflected
fields that may be present. For radiation analysis, the impressed and reflected fields
are omitted altogether and the total field is set equal to the secondary field.
A magnetic field integral equation (MFIE) [2], [3] can be formed once surface
equivalent currents are used as illustrated in Fig. 7.2. These currents can then be used
to express $\mathbf{H}^{\text{scat}}$, giving the MFIE [4]¹
-n x [H'(r) + Hrefl(r)] =
-^-(Lx[Vx
5(r, r') \342\200\242
J(r')] dS'
¹The first right hand side term is due to the identity (valid just interior to the surface S)
where (see Chapter 4) the horizontal bar implies the principal value of the integral. However, as pointed
out by Sancer [4], the principal value is not necessary since the numerical evaluation of the integral with r
on S does not produce a singularity. The 1/2 factor is actually obtained without invoking the principal
[Figure 7.2: labels: "Bubble" above ground plane; Aperture in ground plane; Ground plane (z = 0); Cavity.]
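As usual the electric and magnetic currents are associated with the tangential
external fields, e.g., $\mathbf{J} = \hat{n} \times \mathbf{H}^{\text{ext}}$ and $\mathbf{M} = \mathbf{E}^{\text{ext}} \times \hat{n}$, respectively. Also, (7.5) enforces the
identity $\mathbf{J} = \hat{n} \times \mathbf{H}^{\text{int}}$.
An alternative boundary integral equation can be derived by introducing the
electric field integral equation (EFIE). To do so, we decompose the electric fields as
$$ \mathbf{E}^{\text{ext}} = \mathbf{E}^i + \mathbf{E}^{\text{refl}} + \mathbf{E}^{\text{scat}} \tag{7.6} $$
where the secondary field ($\mathbf{E}^{\text{scat}}$) can be written again in terms of $\mathbf{J}$ and $\mathbf{M}$ in Fig. 7.2
by invoking the equivalence principle (see Chapter 1). Doing so results in the EFIE
[2], [5]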
value theorem by placing r slightly off the surface S and then taking the limit as r approaches S.
Alternatively, the $\nabla$ operator can be moved outside the integral and applied after completing the
integration. Note that the identity
and EFIE that may be more suitable for a particular application. However, for
simplicity and illustration, we will only use (7.5), (7.7), or a variation of them.
Neither of these boundary integral equations suffers from spurious resonances
when used to simulate cavities or antennas that are recessed in a ground plane
such as the case shown in Fig. 7.2.
Note that in both (7.5) and (7.7), the dyadic Green's function is left unspecified.
For finite geometries (e.g., no infinite structures such as a metallic plane), the
free-space dyadic Green's function $\bar{\mathbf{G}}_0$ should be used. In contrast, for structures involv-
compromise is to use the CFIE since it has both the MFIE and EFIE in it. However,
for the examples considered in this text, it is convenient to use the MFIE and hence
for the rest of the chapter unless otherwise noted, we shall use the MFIE.
The weak form of the vector wave equation (7.3) involves the fields within the
enclosing boundary while the fields in (7.5) are in the outer region. These fields must
be coupled together to effect a hybridization of (7.3) and (7.5). This is done by
enforcing tangential field continuity across the computational volume's boundary
$$ \hat{n} \times \mathbf{H}^{\text{int}} = \hat{n} \times \mathbf{H}^{\text{ext}} \quad \text{on the surface } S \tag{7.9} $$
$$ \hat{n} \times \mathbf{E}^{\text{int}} = \hat{n} \times \mathbf{E}^{\text{ext}} \quad \text{on the surface } S \tag{7.10} $$
The continuity condition associated with the magnetic fields (7.9) is often termed a
$$ \cdots \left[ \hat{n} \times \bar{\mathbf{G}} \times \hat{n}' \right] \cdot \mathbf{E}^{\text{ext}} \, dS' \, dS = f_i^{\text{ext}} \tag{7.11} $$
where the exterior excitation term $f_i^{\text{ext}}$ is given by
and the testing functions associated with $\mathbf{H}^{\text{ext}}$ are indicated by $\mathbf{Q}_i$. Note that there are
at present three classes of unknown fields in (7.11): $\mathbf{E}^{\text{int}}$, $\mathbf{E}^{\text{ext}}$, and $\mathbf{H}^{\text{int}}$. In (7.11), the
magnetic field continuity condition (7.9) was explicitly enforced; however, the
electric field continuity condition (7.10) must also be enforced to solve (7.11). This can
be accomplished in one of two ways: (1) implicitly by using identical basis functions
for $\mathbf{E}^{\text{int}}$ and $\mathbf{E}^{\text{ext}}$ on the surface S (hence, all occurrences of $\mathbf{E}^{\text{ext}}$ are replaced by $\mathbf{E}^{\text{int}}$ in
(7.11)); or (2) explicitly by enforcing the auxiliary relation
$$ \oint_S \left[ \mathbf{Q}_i \cdot \hat{n} \times (\mathbf{E}^{\text{int}} - \mathbf{E}^{\text{ext}}) \right] dS = 0 \tag{7.13} $$
which satisfies (7.10) in a weak or average sense. Also, the testing functions for the
interior (FE) problem, $\mathbf{W}_i$, are not necessarily the same as the testing functions used
for the exterior (IE) problem, $\mathbf{Q}_i$. In fact, the testing functions $\mathbf{W}_i$ are associated with
the interior electric field while the testing functions $\mathbf{Q}_i$ are associated with the surface
magnetic field. Hence, for the general case of a different expansion for the exterior
and interior electric fields, we have the coupled equation
interior equation:
EcndS'dS=ff
coupling equation:
$$ \oint_S \left[ \mathbf{Q}_i \cdot \hat{n} \times (\mathbf{E}^{\text{int}} - \mathbf{E}^{\text{ext}}) \right] dS = 0 \tag{7.14} $$
The solution of (7.14) proceeds by expanding the volume electric field and
surface tangential electric and magnetic fields in terms of sub-domain basis functions
$$
\begin{aligned}
\mathbf{E}^{\text{int}} &= \sum_{j=1}^{N_v} E_j \mathbf{W}_j && \text{volume electric field} \\
\mathbf{E}^{\text{ext}} &= \sum_{j=N_v+1}^{N_v+N_{es}} E_j \mathbf{V}_j && \text{surface electric field} \\
\mathbf{H}^{\text{int}} &= \sum_{j=N_v+N_{es}+1}^{N_v+N_{es}+N_{hs}} H_j \mathbf{Q}_j && \text{surface magnetic field}
\end{aligned} \tag{7.15}
$$
where $N_v$ is the number of volume electric field unknowns, $N_{es}$ is the number of
surface (exterior) electric field unknowns, $N_{hs}$ is the number of surface magnetic field
unknowns, and the total number of unknowns is given by $N = N_v + N_{es} + N_{hs}$. The
volume electric field is expanded in terms of volumetric basis functions that are based
on the edges of the geometry (see Chapters 2 and 5). The surface electric and
magnetic fields are expanded in terms of separate functions that have support only
over sub-domain surface patches which are often triangular or rectangular in shape.
In both cases, since we are using the Galerkin procedure for converting a continuous
physical problem to a discrete approximation of that original problem, the same
basis functions used for testing in (7.14) are used for field expansions, though $\mathbf{W}_i$ and
including the coupling equation. When the three field expansions (7.15) are
substituted into (7.14), we get
lJs *
j=Ne+\\
- i V x f x
\320\243. H\\-il IQt
\302\246x
<fi Qj)]dS | Q/ [n
x
\342\200\242 \342\200\242
\320\231'1
QjdS'dS
Y. Ej\\l I Qr\\nx:dxn']-\\jdS\"ds)=f
L J
j=n,+\\
<Jtt+\\ US is J
* = 1.\320\233,+
\320\233\320\223,+ 2 N
X>Jl[QrnxW,]rfs|-
1JS >
?
J=\\ ;
!= Ne+\\,Nr + 2 N
where $N_e = N_v + N_{es}$ is the total number of electric field unknowns and
$N = N_v + N_{es} + N_{hs}$ is the total number of unknowns.
A different approach involves the implicit satisfaction of the electric field
continuity requirement by employing identical basis functions. In this case, the surface
basis functions $\mathbf{V}_j$ are chosen to be identical to a surface evaluation of the volume
basis functions $\mathbf{W}_j$, e.g., $\mathbf{V}_j = \mathbf{W}_j$ as $(x, y, z) \to$ surface. Accordingly, we rewrite the
$$ \mathbf{H}^{\text{int}} = \sum_{j=N_e+1}^{N} H_j \mathbf{Q}_j \quad \text{surface magnetic field} \tag{7.17} $$
V X W|' V X E\"\"
dV - \342\200\242 Him dS
\342\200\242
n x
f k\\ f e,W, Eim dV -jk0Za J W, =/}BI
iv Mr Ji- is
J2 Hj\\i W, \342\200\242
n x
Q;ds\\ =/}\302\246\", /=1,2 Ne
Y) Hf\\-\\i[Qr(nxQ,)]dS-i V
iQ/\320\223\320\277\321\205
x S x n'l Q.rfS'(
i=Ne+l,Ne +2 N G.19)
We observe that the number of unknowns is equal to the number of equations, $N$, and
$N_{hs} = N - N_e$. It is also understood that each contribution is nonzero only when both the
test and expansion functions have support, e.g., although all electric field unknowns
are shown in the second equation of (7.19), only those associated with the surface
have support. The linear system represented by (7.19) is solved using either a direct
or iterative matrix solver to determine the unknown electric and magnetic fields (see
Chapter 9).
This latter set of coupled equations (7.19) yields the least number of unknowns
and the simplest formulation to implement. However, this approach limits the
flexibility of the method since the discretization required by the interior fields must also
be used on the exterior electric field. Consequently, the boundary integral portion of
the formulation may need to be oversampled to accommodate the geometrical
requirements of the interior region. The converse is also true where many volume
unknowns are essentially wasted to permit a detailed surface mesh. Regardless, the
result is a potential waste of computational resources and flexibility.
The versatile FE-BI methods are still associated with large demands on
computational resources. The FE portion of (7.11) will permit the specification of a
complex inhomogeneous volume fill while imposing a low memory and compute
cycle burden, principally due to the resulting sparse matrix. However, the two
boundary terms are essentially identical to a surface method of moments
formulation with the resulting fully populated matrix. Figure 7.5 illustrates the fill profile of a
typical FE-BI matrix. The dark region indicates matrix entries that are nonzero while
the white space denotes zeros and hence corresponds to portions of the matrix that
Figure 7.5 Fill profile for a typical finite element-boundary integral matrix. [Courtesy of S. Bindiganavale.]
discretized using edge-based brick elements with edge length a conservative $\lambda/10$.
Each face of the volume will have 180 unknowns per field component. Including the
edges at each face junction, the total number of surface unknowns is 1200 per field
component or 2400 total surface unknowns. The boundary matrices associated with
these unknowns would require approximately 12 MB of RAM if single precision
complex number storage is used. This is obviously not a large burden even for
many personal computers. However, consider the effect of doubling the sample
rate as is often necessary for complex geometries or radiation problems. In this
case, each edge of the mesh will be $\lambda/20$. The number of unknowns per field
component is now 4800 or 9600 total surface unknowns, and these require over 184 MB
of RAM! Although this amount of RAM can be found in high-end engineering
workstations, it is clear that the boundary integral memory demand does not scale
favorably.
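The scaling can be checked with a few lines of arithmetic. The sketch below assumes 8 bytes per entry (single precision complex) and one fully populated $N \times N$ block; the quoted 12 MB and 184 MB figures appear to correspond to $N$ taken as the per-component unknown count (1200 and 4800). The function name is illustrative only.

```python
def dense_matrix_megabytes(n_unknowns, bytes_per_entry=8):
    """Storage of a fully populated n x n matrix in MB (1 MB = 1e6 bytes).

    bytes_per_entry=8 corresponds to single precision complex storage
    (two 4-byte floats per entry).
    """
    return n_unknowns ** 2 * bytes_per_entry / 1.0e6


# Unknown counts per field component quoted in the text for the two samplings.
for label, n in (("lambda/10", 1200), ("lambda/20", 4800)):
    print(f"{label}: {n} unknowns -> {dense_matrix_megabytes(n):.2f} MB")
```

Halving the edge length quadruples the number of surface unknowns, so the dense storage grows by a factor of 16.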
Hence, the FE-BI method is usually implemented for certain special cases that
reduce the BI's demand on resources. In the next several sections, we will present
examples of these special cases beginning with the case of a cavity-backed volume
recessed in a metallic ground plane. However, we first introduce the important topic
of excitation and feed modeling.
aperture feeds. The FE-BI method is particularly important for antenna modeling
since sufficiently accurate and efficient artificial mesh truncation procedures are not
available for accurate near-field calculations. Since the input impedance is required for
antenna analysis and gain calculations, near-field accuracy is critical as well as a good
feed model. In the following section, we present a brief discussion of feed modeling.
One source, useful for determining the radar cross section (RCS), is the plane
wave
$$ \mathbf{H}^{\text{refl}} = Y_0 \left[ \hat{k}^r \times \mathbf{E}^{\text{refl}} \right] \tag{7.21} $$
ff = -J2kaI Q,
\302\246
f x [k' x eV-'*\302\253(i'\"] dS G.22)
Another useful source is an impressed electric current, $\mathbf{J}^i$, which can be used to
simulate a probe feed. This feed has been used to model a patch antenna feed by the
method of moments [13] as well as in the finite element method [10]. Assuming an
infinitesimally thin current filament, the excitation term (7.2) for an arbitrarily
oriented probe feed is given by
where $\hat{l}$ indicates the orientation of the probe, $\mathbf{W}_i$ is the weight or testing function
associated with the $i$th unknown, and $I_0$ is the current flowing through the filament.
For current probes lying along the x-, y-, or z-directions, we have
$$
\begin{aligned}
\mathbf{J}^i \cdot \mathbf{W}_i &= I_0 \, \delta(y - y_p) \, \delta(z - z_p) \, W_i^x(x, y, z) \\
\mathbf{J}^i \cdot \mathbf{W}_i &= I_0 \, \delta(x - x_p) \, \delta(z - z_p) \, W_i^y(x, y, z) \\
\mathbf{J}^i \cdot \mathbf{W}_i &= I_0 \, \delta(x - x_p) \, \delta(y - y_p) \, W_i^z(x, y, z)
\end{aligned} \tag{7.24}
$$
element cell is usually assumed for simplicity (so that integration over that length
will yield the moment $I_0 l$). With this assumption, the excitation vector generating
function becomes
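$$
\begin{aligned}
f_i^{\text{int}} &= -j k_0 Z_0 I_0 \int_V \left[ \delta(y - y_p) \, \delta(z - z_p) \right] \hat{x} \cdot \mathbf{W}_i \, dV = -j k_0 Z_0 I_0 \, \Delta x \, W_i^x(\cdot, y_p, z_p) \\
f_i^{\text{int}} &= -j k_0 Z_0 I_0 \int_V \left[ \delta(x - x_p) \, \delta(z - z_p) \right] \hat{y} \cdot \mathbf{W}_i \, dV = -j k_0 Z_0 I_0 \, \Delta y \, W_i^y(x_p, \cdot, z_p) \\
f_i^{\text{int}} &= -j k_0 Z_0 I_0 \int_V \left[ \delta(x - x_p) \, \delta(y - y_p) \right] \hat{z} \cdot \mathbf{W}_i \, dV = -j k_0 Z_0 I_0 \, \Delta z \, W_i^z(x_p, y_p, \cdot)
\end{aligned} \tag{7.25}
$$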
where the integration volume in (7.25) includes all elements containing the source.
Also, the three expressions in (7.25) correspond to x-, y-, and z-directed filaments,
respectively. If more than one test edge is involved in the feed model (e.g., when the
(7.26)
gap voltage feed is implemented by forcing the electric field (the unknown) to be
equal to $V/d$, where $d$ is the length of the $i$th edge. This can be done during each
iteration of an iterative solver or via an appropriate auxiliary (additional equation)
condition.
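One way to impose such an auxiliary condition can be sketched as follows. This is a minimal illustration, not the authors' implementation: the row of the system associated with the feed edge is replaced by the equation $E_i = V/d$. The function name `impose_edge_value` and the 3 x 3 toy matrix are hypothetical.

```python
import numpy as np


def impose_edge_value(A, b, edge, value):
    """Return copies of (A, b) with row `edge` replaced by x[edge] = value."""
    A = np.array(A, dtype=complex)
    b = np.array(b, dtype=complex)
    A[edge, :] = 0.0
    A[edge, edge] = 1.0
    b[edge] = value
    return A, b


# Toy 3-unknown system standing in for an assembled FE-BI matrix (assumption).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.zeros(3)

V, d = 1.0, 0.002           # gap voltage and feed edge length (illustrative)
A_c, b_c = impose_edge_value(A, b, edge=1, value=V / d)
x = np.linalg.solve(A_c, b_c)
print(x[1])                 # field on the feed edge equals V / d
```

Replacing a row in this way destroys the symmetry of the matrix; a symmetric variant would also move the known column contribution to the right hand side, which is omitted here for brevity.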
7.3.4 Coaxial Cable Feed
As noted above, the probe feed model is of acceptable accuracy for very thin
substrates and for circumstances where the diameter of the probe may be safely
ignored. For thicker substrates or for circumstances where the diameter of the
probe and the size of the coaxial (or coax) aperture need be considered, an improved
feed model is necessary. The coaxial feed geometry is shown in Fig. 7.6. The following
derivation and model are based on the development given by Gong and Volakis [15].
[Figure 7.6: coaxial feed geometry; labels: Cavity, Patch.]
With the presence of the coax cable aperture, the boundary integral (not $f_i^{\text{int}}$)
in (7.3) will include the term
(7.27)
where $\hat{n} = \hat{z}$ and $S_f$ denotes the aperture of the coax cable. Assuming a TEM mode
(7.28)
where
$\cdots (1 + \Gamma)$ (7.29)
In these expressions, $\rho$ is the radial variable from the center of the inner conductor, $\hat{\rho}$
and $\hat{\phi}$ are the usual cylindrical unit vectors in the coax aperture, $I_0$ is the current
flowing through the coax cable, and $\epsilon_{rc}$ is the relative permittivity of the dielectric
between the inner and outer conductors of the coax cable. Also, $\Gamma$ is the reflection
(7.30)
We observe that (7.30) is the constraint at the cable junction in terms of the
new quantities $h_0$ and $e_0$ which can be used as new unknowns in place of the fields $\mathbf{E}$
and $\mathbf{H}$. However, before introducing (7.27) into the system, it is necessary to relate $e_0$
and $h_0$ to the unknowns (edges) lying within the coaxial aperture. Since the actual
=
\320\233\320\240 ~
a) = eQIn -.
?\321\201\320\276\302\273\321\205(\320\233 i= Np (\321\200=\\\320\233 Nc) G.3li
where $\Delta V$ denotes the potential difference between the inner and outer surface of the
cable, and $E^{\text{coax}}$ denotes the field in the coax. Also, $N_p$ denotes the global number for
the edge across the coax cable and $N_c$ is the number of edges in the coax aperture.
When the condition (7.31) is used in the function (7.27), it introduces the excitation
into the finite element system without a need to extend the mesh inside the cable or to
employ a fictitious current probe. Specifically, we have
\342\200\224
CF _ /-coax
with
\320\273
C
Note that $f_i^{\text{coax}}$ is nonzero only for those edges which coincide with the aperture of the
coaxial cable. It is also apparent that $f_i^{\text{coax}}$ is a constant and becomes part of the
excitation function $f_i^{\text{int}}$ when moved to the right hand side of the matrix system.
Basically, the excitation column entries will be zero or equal to $f_i^{\text{coax}}$ for those edges
coinciding with the coaxial cable aperture. Upon solution of the system, the input
admittance at the coax aperture ($z = 0$)
can be obtained using the expression
\342\200\236 f 2/0 1
=J_
where $Z_c$ is the characteristic impedance of the coax cable and the integration path $C$
Figure 7.7 Cross section of an aperture coupled patch antenna, showing the cavity region I and the microstrip line region II for two different FEM computational domains. [Labels: Antenna elements; Truncation plane; Coupling aperture $S_a$.]
bricks are the best candidates since the feed structure is rectangular in shape and the
substrate has a constant thickness.
some connectivity matrix must be introduced to relate the mesh edges across the
aperture. This can be accomplished using the coupling equations (7.9), (7.10), and
(7.13), introduced previously. However, since the aperture is very narrow, a 'static'
field distribution may be assumed at any given frequency. Therefore, the potential
concept may be applied to relate the fields on either side of the aperture. To do so, let
us first classify the slot edges as follows [17]:
El;2
=
^(eJEJ
+ eHlE*+l) G.33)
in which
*,= G.34)
{+]
and in these, $l$ and $d$ are the lengths of the parallel and diagonal edges, respectively.
That is, $l$ is simply the width of the narrow rectangular aperture between the two
meshes. The coefficient $e_j$ is equal to $\pm 1$ depending on the sign conventions
associated with the meshes to either side of the coupling aperture. Essentially, (7.33) is a
potential approximation to the electric field continuity conditions (7.10) and (7.13).
aperture and synthesizing a coax Green's function via a modal series. Modal series
feed models are discussed by Reddy et al. [18]. In this, the authors present as an
appendix the formulas for rectangular, circular, and coaxial feed apertures.
The electric field across the feed aperture can be expressed as a sum of incident
with
= E'(x,y) e-*\021
E'(x, y, z) G.361
given by
\\ ^ G.37|
is,
Across the feed aperture the magnetic field is obtained by Faraday's Law
$$ \mathbf{H} = \frac{j}{k_0 Z_0} \nabla \times \mathbf{E} \tag{7.39} $$
assuming a nonmagnetic material ($\mu_r = 1$). This is used across the feed aperture in
(7.3) by equating it to the interior magnetic field ($\mathbf{H} = \mathbf{H}^{\text{int}}$). Hence, the excitation for
/-mode= 2yme
One of the most successful applications of the FE-BI method involves antennas
situated in a cavity recessed in an infinite, metallic ground plane. The reason this
application is so well-suited for the FE-BI method is that the costly boundary
integral portion of the formulation is minimized while the material flexibility of
the finite element method is retained. FE-BI was first applied to this problem by
Jin and Volakis [10] where they utilized brick elements. The computer program
developed under that effort, FEMA-BRICK, was exceptionally efficient and has
been used by government, industry, and academia for the analysis and design of
very large (hundreds of elements) patch and slot antenna arrays. The secret of this
particular computer program's efficiency is the use of a biconjugate gradient-fast
Fourier transform (BiCG-FFT) matrix solver [16], [19]. This approach will be
discussed below as well as a description of the BiCG-FFT method. The FE-BI method
has since been implemented using tetrahedral [11], [20] and prism [21] elements to
enhance flexibility; however, use of such flexible finite elements generally precludes
the use of a BiCG-FFT matrix solver. An additional enhancement of the basic
recessed cavity formulation involves grafting geometrical theory of diffraction
(GTD) coefficients to the FE-BI results to approximate a finite ground plane [22].
Figure 7.8 illustrates a cavity-backed aperture in a metallic ground plane. The
aperture lies in the xy plane and can have arbitrary shape. The cavity can also have
arbitrary composition, but we assume a metallic boundary on all its sides except for
the aperture. The material within the volume is assumed to be inhomogeneous, and
unless otherwise noted, the exterior region is assumed to be free space. The aperture
can be either open or partially covered with an infinitesimally thin metallic patch.
There can be more than one cavity/aperture, but for the sake of simplicity, we
assume that all apertures lie in the z = 0 plane.
Figure 7.8 Illustration of a cavity recessed in a metallic ground plane. [Labels: Aperture; Ground plane (infinite); Recessed cavity; Base of cavity.]
7.4.1 Formulation
The efficiency of this method lies in the fact that the only exposed (nonmetallic)
surface is the aperture. Since the cavity's other walls are metallic, for a total electric
field formulation, the boundary conditions on those walls require a vanishing
tangential electric field. The assumption of the aperture lying in an infinite metallic
ground plane allows further simplification. The tangential surface electric fields
($\mathbf{E}^{\text{ext}}$) in (7.11) short out over the entire ground plane except for the aperture.
Thus, the magnetic current ($-\hat{n} \times \mathbf{E}^{\text{ext}}$) has support only over the aperture (it also
dyadic Green's function can be used in place of the usual free-space Green's
function, and since this Green's function satisfies the Neumann boundary condition, the
electric currents are no longer required. Hence, the Green's function enforces the
metallic plane boundary condition (without a need to introduce an equivalent
electric current), thereby reducing the number of unknowns. This dyadic Green's
function is an electric dyadic Green's function of the second kind [9], and for the case of
an infinite metallic plane, it can be derived using image theory. Since this Green's
function converts a surface tangential magnetic current to an exterior magnetic field,
image theory states that the Green's function is simply twice the free-space Green's
function
where both the source and test points lie in the xy plane. Note that (7.42) contains a
minus sign, but alternative representations without the minus sign may be used
An important difference between (7.43) and (7.5) is that since the magnetic field in
the aperture can be completely represented by the tangential electric field in the
aperture, e.g., through the use of (7.43), we can substitute the surface magnetic
field integral into (7.3). That is, the FE-BI equation, assuming surface field
expansion terms identical to the volume expansion terms, is given by
f fuv Einl V x W,
\342\200\242 . \302\246 1
Mr
h-\\_ J
- kl I \342\200\242 x ft' -
ft x \320\252\320\273 dS' dS =f'f +/fl
? [\\V, Eml]
G,44)
Hence, rather than having separate but coupled equations (7.19), we have a single
equation (7.44).
where we have assumed that the volume expansion functions reduce to the surface
expansion functions in the aperture, thus ensuring the enforcement of (7.10). Note
that in (7.45), the surface term is nonzero only for test and source edges that lie in the
aperture.
7.4.2 Solution Using Brick Elements
Although the formulation presented above can be and has been implemented
using various different element volumes (e.g., bricks, prisms, or tetrahedrals) [2], [3],
[11], [12], the use of brick elements with uniform discretization allows for a
particularly efficient matrix solver: an iterative, FFT-based method. Bricks are attractive for
discretizing rectangular volumes since they are easy to implement and readily
conform to the cavity's dimensions. Bricks are not suitable for drum- or odd-shaped
cavities since these volumes introduce stair-casing and thereby reduce the accuracy
of the solution.
As stated above, if brick elements are used, then a particularly efficient
implementation can be achieved. Experience has shown that the boundary integral portion
of the FE-BI equation (7.44) dominates the computational cost for many
applications both in terms of memory and compute cycles. This is because the sub-matrix
associated with the boundary integral is fully populated and hence requires $O(N_{BI}^2)$
storage and compute cycles per iteration, where $N_{BI}$ is the number of unknowns
uniform surface discretization, and in this case, the boundary integral terms depend on
the physical distance between unknowns in terms of rows and columns of the surface
grid. Hence, the boundary integral sub-matrix entries can be written as
|r[,| G.46)
where $[m, n]$ is the row and column (in the surface grid) of the test edge and $[m', n']$ is
the corresponding position of the source edge. In these terms, the argument of the
Green's function is the difference between the row $(m - m')$ and column $(n - n')$
of the source and test points. Hence, the sub-matrix that results from (7.46) is Block
Toeplitz. This particular matrix structure lends itself well to efficient solution.
Specific formulae for the boundary integral matrix entries are presented in the
appendix.
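The Block Toeplitz property can be illustrated numerically. The sketch below uses a hypothetical scalar kernel that depends only on the grid offsets $(m - m', n - n')$; it is not the boundary integral entry of the text, only a stand-in with the same index dependence. The function names and the self-term offset are assumptions.

```python
import numpy as np


def kernel(dm, dn, k0=2.0, h=0.1):
    """Hypothetical scalar interaction depending only on grid offsets (dm, dn)."""
    # The 0.25 offset keeps R nonzero for the self term; it is an illustrative
    # regularization, not the treatment used in the text.
    R = h * np.sqrt(dm ** 2 + dn ** 2 + 0.25)
    # The linear factor makes positive and negative lags differ.
    return (1.0 + 0.1 * dm + 0.05 * dn) * np.exp(-1j * k0 * R) / (4.0 * np.pi * R)


def build_bi_matrix(rows, cols):
    """Dense matrix with entry [(m, n), (m', n')] = kernel(m - m', n - n')."""
    grid = [(m, n) for m in range(rows) for n in range(cols)]
    B = np.empty((len(grid), len(grid)), dtype=complex)
    for p, (m, n) in enumerate(grid):
        for q, (ms, ns) in enumerate(grid):
            B[p, q] = kernel(m - ms, n - ns)
    return B


def is_block_toeplitz(B, rows, cols, tol=1e-12):
    """True if blocks depend only on m - m' and each block is itself Toeplitz."""
    blocks = B.reshape(rows, cols, rows, cols)  # indices [m, n, m', n']
    for m in range(1, rows):
        for ms in range(1, rows):
            if not np.allclose(blocks[m, :, ms, :], blocks[m - 1, :, ms - 1, :], atol=tol):
                return False
    for n in range(1, cols):
        for ns in range(1, cols):
            if not np.allclose(blocks[:, n, :, ns], blocks[:, n - 1, :, ns - 1], atol=tol):
                return False
    return True


B = build_bi_matrix(3, 4)
print(B.shape, is_block_toeplitz(B, 3, 4))
```

Because every entry is determined by its pair of lags, at most $(2 \cdot 3 - 1)(2 \cdot 4 - 1) = 35$ distinct values are needed for this 12 x 12 example rather than 144.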
Iterative solution algorithms refine an initial guess at the solution until a preset
error threshold is satisfied. The major computational cost driver in any iterative
algorithm is the matrix-vector multiply. All iterative solvers require at least one
1 U?\302\273 = If
{^ci Lt\302\260] raj J i4c>\"
{\321\204 {/imi
{,/
suggested by (7.47), the matrix-vector multiply required by the BiCG method can
search vector (see Chapter 9 for the BiCG algorithm) and the result retained in a
the search vector would be multiplied by null rows and is therefore omitted). This
result is then added to the FE product, and it is identical to the product vector
obtained if the entire matrix were multiplied by the search vector at one time.
However, since the costly matrix-vector operation has been partitioned into a FE
and BI portion, it can be optimized to exploit the sparseness of the FE matrix and
considered and stored. In fact, since the FE matrix entries for a brick element are
regular, there is no need to store more than a handful of entries. Each brick has 12
for the second volume integral. Many of these potential interactions yield zeros, and
thus significantly fewer interactions need be computed or stored. However, in
practice, the logic required to determine significant and insignificant (null) interactions is
extensive and usually 288 complex memory locations per layer of the mesh are
allocated. Each layer is stored separately since the layer thickness need not be
constant from layer to layer and therefore the matrix entries will be different. However,
if the layers are of constant thickness (e.g., all bricks throughout the mesh are
identical), only a single set of 288 interactions need be computed and retained.
This is still insignificant compared to the storage cost of the actual FE interaction
matrix! Specific formulae for the brick FE entries are provided in the appendix.
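The partitioned product described above can be sketched as follows. This is a minimal illustration under stated assumptions: the FE matrix is held as coordinate (row, column, value) triplets, the BI block is held densely and acts only on the aperture unknowns, and all sizes and values are made up for the example.

```python
import numpy as np


def partitioned_matvec(fe_rows, fe_cols, fe_vals, bi_block, aperture, x):
    """Return y = [A]x + [B]x with [A] as COO triplets and [B] dense on the aperture."""
    y = np.zeros(x.shape, dtype=complex)
    # Sparse FE product: accumulate val * x[col] into y[row] for each stored entry.
    np.add.at(y, fe_rows, fe_vals * x[fe_cols])
    # BI product: only aperture unknowns interact through the boundary integral.
    y[aperture] += bi_block @ x[aperture]
    return y


# Toy system (all sizes and values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 6
aperture = np.array([4, 5])          # indices of unknowns lying in the aperture
fe_rows = np.array([0, 0, 1, 2, 3, 4, 5, 4])
fe_cols = np.array([0, 1, 1, 2, 3, 4, 5, 5])
fe_vals = rng.standard_normal(8) + 1j * rng.standard_normal(8)
bi_block = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

y = partitioned_matvec(fe_rows, fe_cols, fe_vals, bi_block, aperture, x)

# Reference: assemble the full matrix and multiply in one step.
full = np.zeros((n, n), dtype=complex)
np.add.at(full, (fe_rows, fe_cols), fe_vals)
full[np.ix_(aperture, aperture)] += bi_block
print(np.allclose(y, full @ x))
```

Keeping the two products separate lets each part use the storage scheme that suits it; in the text the dense BI product is in turn replaced by an FFT-based evaluation.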
The Block Toeplitz structure of the BI sub-matrix, $[\mathcal{B}]$, can be exploited to yield
impressive memory consumption and run-time efficiency. Careful inspection of
(7.46) indicates that all interactions can be represented by computing and storing
a single row of the interaction matrix and a portion of another row. For example, if
the unknowns were numbered with all x-directed edges first followed by all
y-directed edges (assuming the aperture lies in the z = 0 plane), then all interactions
between two x-directed aperture edges are represented by the first $N_x$ entries of
the matrix's first row. The next $N_y$ entries of the first row represent all interactions
between x-directed test edges and y-directed source edges. Finally, the last $N_y$ entries
computed, although in practice two full rows of the matrix may be computed and
stored to simplify the required logic. (The additional partial row represents all
interactions where the test edge is y-directed and the source edges are x-directed. For the
situation considered herein, these are identical to the interactions involving an x-
represented by a sparse FE matrix, $[\mathcal{A}]$, whether stored or computed on-the-fly, and the
BI interactions may be represented by $[\mathcal{B}]$. During the crucial matrix-vector multiply
(per iteration), a sparse matrix operation is used for the FE portion and an FFT-based
matrix-vector multiply is used for the BI portion. The FE matrix-vector multiply
the entire system and $M_{\max}$ is the maximum number of edges in either the x- or y-
direction. The following section presents a detailed description concerning the
implementation of an FFT-based matrix-vector multiply scheme.
gur[t
- /'. .v - =
\320\273'] -A-o f [ Wv{u\\ </) Wu(u, v)G\"\"(u -u',v- v')du' dv'dudv
Is,.isr,
gm[t-l'.x-s'] =
-k20 f f Wu(u\\v')Wl{u,v)Gin\\u-u',v-v')du'dv'dudv
-
?,\342\200\236[/ /'. s -.?'] = +kl \\ I Wv(u\\ v')W,,(ii, v)G\"\"(u - v-
\302\253', v')du'dv'dudv
Js,. is,,
G.48)
where $g_{uu}$ represents the uu interactions (first $N_u$ entries of the first row of $[\mathcal{B}]$), $g_{vv}$
represents the vv interactions (last $N_v$ entries of the $(N_u + 1)$th row), and so on.
$W_{u,v}(u, v)$ are the edge-based testing/expansion functions, and $e$ refers to the test
element while $e'$ denotes the source element. $N_u$ is the number of grid points in the
u-direction while $N_v$ is the number of grid points in the v-direction, as displayed in Fig.
7.9.
[Figure 7.9: (a) grid of the u-directed edges, rows t = 0, ..., N_v - 3 and columns s = 0, ..., N_u - 2; (b) grid of the v-directed edges, rows t = 0, ..., N_v - 2 and columns s = 0, ..., N_u - 3.]
of the physical surface mesh is indicated by s. For a mesh involving both u-directed and v-directed edges, there are two different collocated meshes: one for u-directed edges and one for v-directed edges, as shown in Fig. 7.9. Figure 7.9(a) corresponds to the u-directed edges while Fig. 7.9(b) refers to the v-directed edges. Notice that there are $N_v - 2$ rows and $N_u - 1$ columns for the unknowns on the u-directed edges. Also, there are $N_v - 1$ rows and $N_u - 2$ columns for the unknowns on the v-directed edges. This example illustrates our comment that there are two collocated mesh schemes with different numbering conventions.
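The row and column counts quoted above can be turned into a small bookkeeping sketch. The row-major ordering used by `flat_index` and the sample values `n_u=6`, `n_v=5` are assumptions made only for illustration; the program described in this chapter may number its unknowns differently.

```python
def edge_grid_shapes(n_u, n_v):
    """Return (rows, cols) for the u-directed and the v-directed edge grids."""
    u_edges = (n_v - 2, n_u - 1)  # N_v - 2 rows, N_u - 1 columns
    v_edges = (n_v - 1, n_u - 2)  # N_v - 1 rows, N_u - 2 columns
    return u_edges, v_edges


def flat_index(t, s, shape):
    """Row-major index of the unknown at row t, column s (assumed ordering)."""
    rows, cols = shape
    if not (0 <= t < rows and 0 <= s < cols):
        raise IndexError("grid position lies outside the edge mesh")
    return t * cols + s


# Hypothetical grid: 6 points along u and 5 points along v.
u_shape, v_shape = edge_grid_shapes(n_u=6, n_v=5)
n_u_unknowns = u_shape[0] * u_shape[1]
n_v_unknowns = v_shape[0] * v_shape[1]
last_u_index = flat_index(u_shape[0] - 1, u_shape[1] - 1, u_shape)
```

Keeping one shape per edge family makes it explicit that the two families need separately sized data arrays.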
These two separate meshes are important in implementing an FFT-based matrix-vector product. Since such a matrix-vector product relies upon the physical distance between the test and source functions rather than their matrix position, understanding the physical layout of the meshes permits proper filling of the data arrays and the correct calculation of the matrix-vector product. All of these comments are based upon the fact that each of the sub-matrices represented in (7.48)
pair
\302\2532-1.
=
\321\203]} ?
.t=0
Af2-I
\320\233/,-1
\302\246\342\200\242
\321\205
/* \320\223/ *\320\223~\"'
i*1 \342\200\224 / /\320\223\320\223\320\273
7 1/, \320\247 -^201^19-, / , G.491
\320\2302
\320\223\320\276
.\302\253=0
where "•" indicates a Hadamard (i.e., term-by-term) product. The order of the relevant DFTs must be $M_1 \ge 2(\text{number of rows}) - 1$ and $M_2 \ge 2(\text{number of columns}) - 1$, where the number of rows and columns of the discretization may vary with each convolution. For example, the first convolution in (7.48) is associated with u-testing and u-source edges and hence the number of rows and columns is $(N_v - 2)$ and $(N_u - 1)$, respectively. The field sequences are loaded into an $M_1 \times M_2$ array in row/column order of the field discretization, and the remaining entries form a zero pad.
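The zero-padding rule can be checked numerically with a minimal sketch. It uses a small recursive radix-2 FFT written in plain Python so that it has no dependencies; the helper names (`fft`, `fft2`, `fft_convolve2d`) and the sample arrays are hypothetical, and a production code would call an optimized FFT library instead.

```python
import cmath


def fft(x, inverse=False):
    """Unscaled recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft2(a, inverse=False):
    """2-D transform: rows first, then columns; the inverse is scaled here."""
    rows = [fft(r, inverse) for r in a]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    out = [list(r) for r in zip(*cols)]
    if inverse:
        scale = len(a) * len(a[0])
        out = [[v / scale for v in r] for r in out]
    return out


def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p


def pad(a, m1, m2):
    """Load a into the top-left corner of an m1 x m2 array of zeros."""
    out = [[0j] * m2 for _ in range(m1)]
    for t, row in enumerate(a):
        for s, v in enumerate(row):
            out[t][s] = v
    return out


def fft_convolve2d(g, x):
    """Linear 2-D convolution via DFTs of order M1 >= 2*rows-1, M2 >= 2*cols-1."""
    rows, cols = len(x), len(x[0])
    m1, m2 = next_pow2(2 * rows - 1), next_pow2(2 * cols - 1)
    G, X = fft2(pad(g, m1, m2)), fft2(pad(x, m1, m2))
    prod = [[G[t][s] * X[t][s] for s in range(m2)] for t in range(m1)]  # Hadamard
    y = fft2(prod, inverse=True)
    return [[y[t][s].real for s in range(2 * cols - 1)] for t in range(2 * rows - 1)]


def direct_convolve2d(g, x):
    """Reference O(N^2) convolution used only to check the FFT result."""
    rows, cols = len(x), len(x[0])
    y = [[0.0] * (2 * cols - 1) for _ in range(2 * rows - 1)]
    for t in range(rows):
        for s in range(cols):
            for tp in range(rows):
                for sp in range(cols):
                    y[t + tp][s + sp] += g[t][s] * x[tp][sp]
    return y


g = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
x = [[1.0, 0.0, 2.0], [4.0, 1.0, 0.0]]
y_fft = fft_convolve2d(g, x)
y_ref = direct_convolve2d(g, x)
max_err = max(abs(a - b) for ra, rb in zip(y_fft, y_ref) for a, b in zip(ra, rb))
```

With the pad in place, the circular convolution implied by the DFT agrees with the linear convolution; a shorter transform would let the sequence wrap around and contaminate the result.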
The Green's function sequence must be loaded into a similar array (in the same manner), and periodic replication must be performed to provide the necessary "negative lags." The data array (matrix) entries represented by (7.48) using the first u-directed edge for testing and all of the u-directed edges for sources (e.g., part of the first row of the matrix) represent all interactions where the source edge is to the right and above the test edge, as shown in Fig. 7.9. The "negative lags" are situations where the source edges are to the left and/or below the test edge.
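A one-dimensional sketch may help show where the negative lags go. It assumes a Toeplitz matrix whose entry (i, j) depends only on the lag i - j, stores the lags in a hypothetical dictionary `lag`, and reuses a small plain-Python FFT; the two-dimensional (block Toeplitz) case applies the same wrap-around along each index.

```python
import cmath


def fft(x, inverse=False):
    """Unscaled recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def toeplitz_matvec_fft(lag, x):
    """Compute y[i] = sum_j lag[i - j] * x[j] with one forward/inverse FFT pair."""
    n = len(x)
    m = 1
    while m < 2 * n - 1:
        m *= 2
    c = [0j] * m
    for k in range(n):
        c[k] = lag[k]          # zero and positive lags fill the front
    for k in range(1, n):
        c[m - k] = lag[-k]     # negative lags wrap to the end (periodic replication)
    xp = list(x) + [0j] * (m - n)  # zero-padded source vector
    C, X = fft(c), fft(xp)
    y = fft([a * b for a, b in zip(C, X)], inverse=True)
    return [v.real / m for v in y[:n]]


n = 4
# Deliberately non-symmetric lags so the negative-lag placement matters.
lag = {k: 1.0 / (1 + abs(k)) + 0.1 * k for k in range(-(n - 1), n)}
x = [1.0, -2.0, 0.5, 3.0]

y_fft = toeplitz_matvec_fft(lag, x)
y_dense = [sum(lag[i - j] * x[j] for j in range(n)) for i in range(n)]
max_err = max(abs(a - b) for a, b in zip(y_fft, y_dense))
```

Only the 2n - 1 distinct lags are stored, which is the memory saving that the FFT-based product relies on.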
If the sequence has the property $g[t - t', s - s'] = g[t' - t, s' - s]$, then this
= g[Ml+2-t.s]
^-<t<Mx-\\ 0<.9<^-l
The first group in (7.51) consists of cases where the source edge is to the right and above the test edge. The second group represents the cases where the source edge is to the right and below the test edge. The third group is for interactions where the source edge is to the left and above the test edge. Finally, the last group represents interactions where the source edge is to the left and below the test edge.
If such symmetry is not present, all possible lags must be computed, which requires a longer matrix build time since more than the first u-directed and v-directed edges need to be used as sources.
Whether even symmetry exists or not depends on the specific
matrix entry with the correct search vector entries. Similar results are obtained for the other convolutions in (7.48).
The interested reader is referred to [16], [19], and [24] for additional details.
7.4.4 Examples
on obtaining a fully functional FE-BI brick program is provided at the end of this chapter. This program, LMBRICK (a.k.a. Low Memory Brick), utilizes the optimization techniques described previously for brick element implementations of the FE-BI
antennas.
This program can compute the Radar Cross Section (RCS) of a cavity-backed patch or slot antenna, the radiation and gain pattern of a probe-fed conformal patch or slot antenna, and the input impedance of such antennas. Due to the efficient implementation, a large, finite array of similar or dissimilar elements may be modeled. Also, since the FE method is used in the cavity volume, the dielectric fill may be inhomogeneous on a brick-by-brick basis.
For example, consider calculating the RCS attributed to a 4 cm x 3 cm patch antenna recessed in an 8 cm x 6 cm x 0.1 cm cavity that is homogeneously filled with dielectric ($\epsilon_r = 2.0$). This antenna is shown in Fig. 7.11. Several different discretizations are used to illustrate the computational scaling associated with the FE-BI method using a BiCG-FFT solver. All calculations are made on a 60-MHz Pentium personal computer running Linux, and the iterative solver tolerance was set at 0.01. The first case involves 0.5 cm x 0.5 cm x 0.1 cm bricks, which resulted in 411 unknowns. A total run time of 0.38 hours was required to compute the RCS for this problem at 201 different frequencies. For the same geometry, using 0.5 cm x 0.5 cm x 0.05 cm bricks, the number of unknowns was 932 and the run time was 1.63 hours. When the grid cell size was 0.5 cm x 0.5 cm x 0.025 cm, the number of unknowns was 1974 and the corresponding run time was 5.86 hours.
height is kept constant at 0.1 cm and the xy grid, which is relevant to the boundary integral, is varied from 0.5 cm x 0.5 cm to 0.25 cm x 0.25 cm and then down to 0.125 cm x 0.125 cm. As noted above, the solve time for the 0.5 cm x 0.5 cm x 0.1 cm grid cell size was 0.38 hours for 201 frequencies. In the case of a 0.25 cm x 0.25 cm x 0.1 cm grid cell size, the corresponding solve time is 2.22 hours (with 1781 unknowns). Finally, for the smaller cell size of 0.125 cm x 0.125 cm x 0.1 cm, the number of unknowns grows to 7401 with a corresponding CPU time of 14.5 hours. Examining the ratio of solve time to number of unknowns, it is clear that boundary integral unknowns scale more favorably than volume unknowns. For example, in each set of runs, the cell grids of 0.5 cm x 0.5 cm x 0.025 cm and 0.25 cm x 0.25 cm x 0.1 cm have roughly the
long! The efficiency associated with the boundary unknowns, even though they lead to a fully populated matrix, is due to the use of an FFT-based solver and the improved convergence of the solver for dense systems. If a more traditional matrix-vector product is used for the boundary integral portion, it would result in a
indicating that the increased surface unknowns are refining the solution, i.e., improving the estimate of the fields within the volume and on the aperture. In the cases where the aperture discretization was held constant but the volume was subdivided into thinner layers, no appreciable change was observed in the radar cross-section computations. Hence, the thin cavity was sufficiently sampled using a single layer of elements, whereas increased aperture discretization improved accuracy.
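The run times quoted in this example can be reduced to rough scaling exponents. The sketch below fits a power law, time proportional to N^p, through the first and last run of each refinement series; the two-point fit and the variable names are assumptions made for illustration and are not part of the original study.

```python
import math

# (number of unknowns, run time in hours), as quoted in the text above.
volume_refined = [(411, 0.38), (932, 1.63), (1974, 5.86)]    # thinner layers
surface_refined = [(411, 0.38), (1781, 2.22), (7401, 14.5)]  # finer aperture grid


def scaling_exponent(runs):
    """Exponent p in time ~ N**p from the first and last entries of a series."""
    (n0, t0), (n1, t1) = runs[0], runs[-1]
    return math.log(t1 / t0) / math.log(n1 / n0)


def hours_per_unknown(runs):
    return [t / n for n, t in runs]


p_volume = scaling_exponent(volume_refined)
p_surface = scaling_exponent(surface_refined)
ratio_volume = hours_per_unknown(volume_refined)
ratio_surface = hours_per_unknown(surface_refined)
```

Under these assumptions the surface-refined series yields the smaller exponent, in line with the statement that boundary integral unknowns scale more favorably than volume unknowns.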
[Plots: horizontal axis Frequency [GHz], 3.0 to 5.0; legend entries include the 0.125 x 0.125 x 0.1 cm grid and FEM and MoM curves.]
Figure 7.13 Comparison between the FE-BI method and the method of moments for computing the radar cross section of a 6 cm x 5 cm x 2 cm cavity with a 3 cm x 2 cm slot aperture. [MoM data are courtesy of James T. Aberle, 1997.]
Another example of the use of a planar FE-BI computer program involves the design and analysis of finite conformal antenna arrays. The FE-BI method described above, where brick elements are used for subdividing the cavity volume and the FFT is used to handle the block Toeplitz matrices, allows for the simulation of rather large antenna arrays on a modest computer. Figure 7.14 illustrates the gain pattern of a 5 x 5 patch antenna array. Each antenna element was similar to the one shown in Fig. 7.11 except that the dielectric constant of the substrate was $\epsilon_r = 13.9$ and the center-to-center spacing was 10 cm in each direction. At 2.45 GHz, the array was allowed to radiate broadside and was steered 30 degrees from broadside using standard beam steering techniques. Note that when the FE-BI method is used to simulate this array, all mutual coupling is included in the solution, thus increasing the fidelity of the model. This example was run on a Silicon Graphics workstation using approximately 14 MB of RAM. It involved 10,275 unknowns and took approximately seven minutes to compute each pattern!
Figure 7.14 Radiation pattern of a 5 x 5 patch antenna array for broadside and 30 degrees off broadside. Horizontal axis: Deg(Theta), -90 to 90. [Courtesy of Jeffrey Tackell, 1997.]
7.15. For this case, the lower aperture results in an additional BI integral resulting in
V x Emt VxW, -
!\302\246
/c5efEinl
\342\200\242
W,
- x z G, \342\200\242
\342\200\242
z x ECTll dS' dS
\\dV k\\ f f [w, J
is* is'+L
=
-kl\\ f rw,xz-G2-zxEcxtlrfS'f/5
J /7'l+/rI G-52)
is-is'-*-
The interested reader is referred to [25] and [26] for further details concerning the application of the FE-BI method to transmission problems.
shown in Fig. 7.17 where the patch elements are printed on a cavity-backed dielectric substrate that is recessed in the cylinder. In this case, (7.44) may be discretized using cylindrical shell elements similar to the brick. Cylindrical shell elements possess edges aligned along each of the three orthogonal directions of the cylindrical coordinate system. Each element is associated with 12 vector shape functions given by
Wi4(/>. 0, r) \302\273 ,2; pft, -. 2,, +). WmO\302\273,0. 2) = . 0.2; a,. -.2,,-)
. 0,2) = .2; zb, -).
\321\200\320\271,., W67(/t>. 0,2) = , 2: p.,,.,
\321\204, +)
\320\2634,
W,j(/>, 0.2) = W.(p, 0.2; ^. 0r... -f). W26(^, 0.2) = W.(/>. 0.2: A. 0f... -)
W4g(p. 0. r)
= W.(/>. 0.2; 0,...
\320\233. -). W3T(/>.0.2) = W.(/>. 0. 2; 0,.
\320\233. \342\200\242,
+)
G.53)
where $W_{lk}$ is associated with the edge which is delimited by local nodes $(l, k)$ as shown in Fig. 7.18 and $(\rho, \phi, z)$ denote the cylindrical coordinates. As can be inferred from (7.53), three fundamental vector weight functions are required for the complete representation of the shell element. They are
p. 0 2:J5.0.2.*) =
^
(p - p)(s- 2) 0 G.54)
W.(p, 0,2; 0,
\320\224, f, s) =
j-
(p - p)@
-
0) \320\263
satisfy both the radiation condition and the Neumann boundary condition at $\rho = a$
1
G.56)
<2*)%f=
\320\247\320\243)
where G-*(a. =
G*'-(a,
\321\204,\320\263) z),
\321\204,
=
\321\203 kpa and kp
= Jk\\
- ft?.
7.5.1 Examples
The FE-BI method has been applied to scattering and radiation by cavities
recessed in an infinite metallic cylinder. Consider the scattering by a cavity-backed
patch antenna recessed in a circular cylinder. The patch is 3cm x 2cm and is placed
Figure 7.19 Radar cross section of a conformal patch antenna for transverse magnetic (TM) polarization. The various curves correspond to different cylinder radii, and the antenna is flush mounted to the surface of the cylinder.
Figure 7.20 Smith chart for a 3.5 cm x 3.5 cm patch antenna for frequencies between 2.4 and 2.7 GHz. The cylinder radius varied between 10 cm and quasi-planar (200 cm).
cylinder surface. One of the strengths of the FE-BI method for singly curved conformal antennas is its ability to investigate the effects of curvature on the RCS of a patch such as the one shown in Fig. 7.17. The RCS for various cylinder radii
The material presented thus far in this chapter was developed during the late 1980s
infinitely periodic structures with unprecedented flexibility. Also, recent papers have introduced nonplanar, noncylindrical boundaries in the implementation of the FE-BI. Further, techniques such as the surface of revolution (SOR) and fast multipole method (FMM) offer increased flexibility with moderate cost. An overview of recent publications follows. The interested reader should consult the cited papers for implementation details.
The flexibility of the finite element method permits the antenna element to be constructed on arbitrary materials and to have arbitrary shape. Rather than using one of the integral equations introduced at the beginning of this chapter, the finite element-periodic moment method (FE-PMM) utilizes a periodic integral equation. Specifically, Floquet modes are used to periodically replicate the boundary and radiation conditions imposed on the unit cell. Note that in this application, periodic boundary conditions are applied not only to the aperture of the antenna, but also to all nonmetallic sides of the unit cell.
An example of the use of the FE-PMM method is illustrated in Fig. 7.21 where a notch radiator is immersed within the unit cell mesh. This is an example of the type of detail readily modeled via the finite element method. Figure 7.22 illustrates the E-
Figure 7.21 Finite element mesh for flared notch antenna: (a) unit cell illustrating coax aperture; (b) substrate surface mesh. [After McGrath [32].]
[Plot legend: 3.0 GHz, 4.0 GHz, 5.0 GHz.]
At the beginning of this chapter, the most general form of the finite element-
wavelength geometries. The major uses of this finite element-surface of revolution (FE-
SOR) method are to compute the scattering by complex objects and the radiation by
axisymmetric antennas.
The FE-SOR method has been considered by Boyse and Seidl [35] for scattering by nearly axisymmetric bodies. Their formulation involved the use of a finite element expansion of the fields within the SOR and an eigenfunction series expansion in the azimuthal direction for the boundary integral. They used node-based tetrahedral elements within the mesh and a Fourier modal-azimuthal expansion utilizing Hermite polynomials.
A group at the Jet Propulsion Laboratory (JPL) has also published a series of articles [36], [37], [38], [39] detailing their hybrid FE-SOR formulation. In their implementation, vector edge finite element expansion functions are used rather than the node-based elements in [35]. Also, dissimilar boundary integral basis functions are used and the finite element and boundary integral regions are coupled using the method presented in [36]. Recently Zuffada, Cwik, and Jamnejad presented an analysis of a circular waveguide antenna with a choke collar [39]. In this, they used a magnetic field finite element formulation
-^ -(VxT-VxH)- \342\200\242
G.57)
\320\275]^-?\320\263.
G.58)
where the integro-differential operators $Z_M$ and $Z_J$ are those used by the body of revolution (BOR) integral equation formulation [40]. Also in [39], the essential boundary conditions are enforced using
(7.59)
and tetrahedral finite elements are used to discretize (7.57). Finally, the SOR basis functions are given by
pit)
G.60)
Pit)
where $T_k(t)$ is a triangle function spanning the kth annulus on the SOR. The variables $t$ and $\phi$ refer to the local SOR coordinates, and $n$ refers to the mode in the Fourier series expansion.
As an example of the implementation in [39], Fig. 7.24 illustrates a circular waveguide antenna with a choke collar. The H-plane pattern for this antenna (see Fig. 7.25) was computed using the FE-SOR method to determine the fields in the shaded region shown in Fig. 7.24.
We note that in [39], a mode matching technique was used to model the feed.
This is in effect a separate integral equation applied across the feed aperture and
Chapter 8.
In the FMM, the boundary integral unknowns are grouped together and the interaction between groups is computed rather than the interaction between individual unknowns. These group interactions are then disaggregated to provide the
a uniform grid so that the $O(N \log N)$ FFT can be used for carrying out the matrix-vector products. Research on fast integral methods is currently very active, and the reader should consult future publications on the development and application of these techniques.
The FE-BI equation for a cavity recessed in a metallic plane is given by (7.45) and is reproduced here for convenience
-*o[ f [WrzxGr2xz-Wy]</SVS}
J5 h' J
{if) T[Q\\
[0]] {?}bi} _
~ {/=\302\253\342\200\242}
-
1 J ( }
(?f) [[0] [0]J (?*} {/\342\200\242?\"}
where [A] represents the FE matrix and [G] denotes the boundary integral sub-matrix. In this appendix, we give explicit formulas for the [A] matrix entries and formulas that permit numerical evaluation of [G]. We begin with [A] for anisotropic media. The corresponding formulae for isotropic media are presented in Section 5.4.
Ev
= G.63)
^\320\225\320\223\320\260\320\235\320\251\321\205,\321\203.=)
where $E^e_{dj}$ are the unknown field expansion coefficients associated with the $j$th local edge which is parallel to the $d$-axis, where $d \in \{x, y, z\}$, and $W_{dj}$ are the expansion functions associated with each local edge. In (7.63), there are four field expansions per brick for each component, corresponding to the 12 edges of the brick. The superscript $e$ denotes the global element number, and the index $j$ denotes the local edge numbers. The local edges are defined as follows:
edge_9 = node_1 → node_5     edge_10 = node_2 → node_6
edge_11 = node_4 → node_8    edge_12 = node_3 → node_7    (7.64)
Figure 7.26 illustrates this brick element, the local node numbers used in (7.64), and the edge lengths, $\{h_x, h_y, h_z\}$.
A second set of 12x 12 data arrays for each layer of the mesh is required for the
second integral in G.61), which for anisotropic media becomes
G.66)
Mxv (J-xy M
G.67)
fi-zy fizz .
G.68)
, \320\251\320\230\321\203\321\203
+
^?[\320\220],+^\320\2251\320\233\320\237\320\267'
/<\302\246?,.-
^\321\202\320\260\320\263
Kfirx
= +
^[/f]4
[ ]\320\267 G.69)
For (7.69), the values for the brick element matrices are given by
2 -2 1 -1 2 -2 _j
2 2 -1 1 1 \302\273
-1 -2
1 -1 2 -2 2 - 1 2 1
1 1 -2 2 1 -2 1 2
1 -1 1 1 1 I 1 -1
1 1 -1 -1 1 -1 1
G.70)
1 -1 1 1 1 - 1 -1
1 1 -1 -1 1 1 -1 1
2 1 -2 -1 1 1 -1 -1
2 -1 2 1 1 1 -1 -1
[4s =
1 2 -1 -2 1 \342\200\224
1 1 1
1 -2 1 2 1 _ 1 1 1
The reader is cautioned that these matrices are not the same as the ones used for
isotropic media (see Chapter 5).
The element matrix terms for anisotropic dielectric media are given by
B) ~ fM B) ~
1\320\251\320\226\320\265\321\205\321\205 hxhyhUx>, l2) _
~ \320\254'\321\205\320\251\320\265\320\245;
\321\202
x y \320\273\320\263
36 24 \320\2424\342\200\224
G.71,
24 36 24
<21 _ \342\200\224
l~\" \302\246\"*'\342\200\242'\302\246 l l Jl
24 24 36
2 12 1'
2 12 1
12 12 G.72)
12 12
/Bl = - t f x W,)
K\302\253
-
BS0)
\342\200\242
(i x W,)] dS' dS
Jsis
/Bl =^ -2 f f [W, \342\200\242
(zx Go x z) \342\200\242
Ws]dS'dS
G.7?)
is is
rBl _.
^Bl(l) ,
where $G_0$ is the free-space dyadic Green's function defined in Chapter 1. From this
'I
/Bhi)
2\321\217
f [W,
\342\200\242
(zx 7 x ^- \320\272 dS'dS
z) \342\200\242
W,] \302\246 G.74)
JSJS
and
G.75)
where $R = \sqrt{(x - x')^2 + (y - y')^2}$.
For convenience, we can express the surface basis functions as
= -
Wv(x, v; x, y. s) -^ (x x) G.76J
' _ rBI(l)
rBI(l) \342\200\224
T
, ,\320\2221(|)
'.v.v *yy G.77)
where
/Bid) _ Slsi
G.78)
dx'dy'dxdy
-ffTf
h', Jv| J..-; J.v;
\342\200\224jj-\\dx'dy'dxdy
*>'** + \320\273'^'\320\273*
-\320\223\320\242\320\223\320\223
Jv; Jr; .!.\302\273\342\200\242;
\302\246!..\302\246! 1T-r\\ J Jv, J J \320\273
G.79)
where the latter integral can now be evaluated analytically. Specifically we have [27]
dx'dy'dxdy
/(.v-\320\273-'
\\\\x lnCv + +
\320\233) .1' ln(.v + R)\\ dx dy
\\x-x')(y~y')
i(.v-.v')ln[Cv-.\302\273')
-
X')(V
(\302\246\320\243
- /)[(.Y- .V') \320\243\320\233]_*
:/5't/5 G.81)
\320\2372)
= +,81B)
/\320\2221|2\302\273 G.82)
where
_ \302\246 *A
fB\\Q)
slss rBI(c)
xy
2khffX)
G.83)
.Bi(c)
Once again, $I^{\mathrm{BI}(c)}$ is evaluated using analytical formulas for the self-cell and standard numerical integration techniques for all other interactions.
For triangular elements, the interested reader is referred to [5], [44], [45], and [46]. These papers provide various coordinate-free evaluations of the self-cell integrals.
To assist the reader in understanding some of the difficult concepts in Chapter 7, Dr. Leo C. Kempel of Mission Research Corporation has made available a fully functional
http://www-personal.engin.umich.edu/~volakis/
Some features of this computer program, LMBRICK (a.k.a. Low Memory Brick), are as follows:
1. Automatic mesh generator for rectangular cavity-backed patch and slot
antennas.
2. Precomputation of only necessary FE interactions and a custom sparse
matrix-vector product that utilizes this minimal data set.
3. FFT-based matrix-vector product for the BI sub-matrix, hence the ability to
REFERENCES
three-dimensional geometries by a curvilinear hybrid finite element-integral equation approach. J. Opt. Soc. Am. A, 11(4):1445-1457, April 1994.
[3] J. M. Jin, J. L. Volakis, and J. D. Collins. A finite element-boundary integral method for scattering and radiation by two- and three-dimensional structures. IEEE Antennas Propagat. Soc. Mag., 33(3):22-32, June 1991.
[4] R. E. Collin. Field Theory of Guided Waves. IEEE Press, New York, 1991.
[5] S. M. Rao, D. R. Wilton, and A. W. Glisson. Electromagnetic scattering by surfaces of arbitrary shape. IEEE Trans. Antennas Propagat., 30:409-418, May 1982.
[6] J. R. Mautz and R. F. Harrington. A combined-source formulation for radiation and scattering from a perfectly conducting body. IEEE Trans. Antennas
[8] A. F. Peterson. The interior resonance problem associated with surface integral equations of electromagnetics: numerical consequences and a survey of remedies. Electromagnetics, 10(3):293-312, July-September 1990.
[9] C. T. Tai. Dyadic Green's Functions in Electromagnetic Theory. IEEE Press, New York, 1994.
[10] J. M. Jin and J. L. Volakis. A hybrid finite element method for scattering and radiation by microstrip patch antennas and arrays residing in a cavity. IEEE Trans. Antennas Propagat., 39(11):1598-1604, November 1991.
[11] J. Gong, J. L. Volakis, A. Woo, and H. Wang. A hybrid finite element-boundary integral method for analysis of cavity-backed antennas of arbitrary shape. IEEE Trans. Antennas Propagat., 42(9):1233-1242, September 1994.
[12] T. Eibert and V. Hansen. Calculation of unbounded field problems in free space by a 3D FEM/BEM-hybrid approach. J. Electromagn. Waves Appl., 10(1):61-78, 1996.
[15] J. Gong and J. L. Volakis. An efficient and accurate model of the coax cable
DS/ENG/93-4.
[35] W. Boyse and A. Seidl. A hybrid finite element method for near bodies of revolution. IEEE Trans. Mag., 27:3833-3836, September 1991.
[36] T. Cwik. Coupling finite element and integral equation solutions using decoupled boundary meshes. IEEE Trans. Antennas Propagat., 40:1496-1504, December 1992.
[37] T. Cwik, C. Zuffada, and V. Jamnejad. Efficient coupling of finite element and
[42] N. Lu and J.-M. Jin. Application of the fast multipole method to finite-element boundary-integral solution of scattering problems. IEEE Trans. Antennas
[45] R. D. Graglia. On the numerical integration of the linear shape functions times the 3D Green's function or its gradient on a plane triangle. IEEE Trans. Antennas Propagat., 41(10):1448-1455, October 1993.
[46] T. F. Eibert and V. Hansen. On the calculation of potential integrals for linear
When iterative methods are used for the solution of hybrid finite element-boundary integral (FE-BI) systems, such as that in (4.133), most of the CPU time is typically
which is repeated here from (4.132). The greater CPU time is due to the fully populated matrices $[G_v]$ and $[G]$. Consequently, the CPU time for carrying out the matrix-vector products is $O(N_b^2)$, whereas the corresponding CPU time for sparse matrices approaches $O(N)$. As usual, $N$ denotes the total number of unknowns in the
surface currents/fields and those due to the delta sources on the new equi-spaced grid. For planar BI surfaces, the delta sources are placed on a rectangular grid, whereas in three dimensions the equi-spaced grid is cubical. Therefore, three-dimensional FFTs must be used in the same manner as done with k-space methods [3], [4].
Integral Method (AIM) and has been implemented for scattering and radiation [2],
[5]. In all these applications of AIM, the FFT is only used to compute the matrix-
vector products associated with the far zone fields, whereas the near zone interac-
interactionsare computed using the original fields/currents on the BI surface. That is. [G]in
(8.1) is decomposed as
where $[G_{\text{near}}]$ is a banded or sparse matrix and $[G_{\text{far}}]$ is Toeplitz in form. With this decomposition, the overall CPU and memory requirements of the BI subsystem are reduced down to $O(N_b^{1.5})$ or less. The constant in front of $N_b^{1.5}$, though, varies with the bandwidth of $[G_{\text{near}}]$. Typically, $[G_{\text{near}}]$ includes those elements which are within a distance of $0.3\lambda$ to $0.5\lambda$ from the testing point to maintain the accuracy of the solution. Clearly, this implies that AIM is more efficient for large-scale simulations involving bodies which span many wavelengths. However, it has been observed [5] that AIM is particularly attractive even for small bodies which include fine details, as is the case with antennas. In some situations with only 1150 BI unknowns, as much as a tenfold reduction in CPU and memory has been observed.
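The near/far split can be illustrated with a short bookkeeping sketch on a uniform one-dimensional grid. This is not an implementation of AIM: it omits the projection onto the auxiliary grid and the FFT, and the `kernel` function is a hypothetical stand-in for the Green's function. It only shows that the near part is a narrow band and that the two partial products add up to the full product.

```python
import cmath


def kernel(r_wl):
    """Hypothetical smooth stand-in for a Green's function (r in wavelengths)."""
    return cmath.exp(-2j * cmath.pi * r_wl) / (1.0 + r_wl)


def split_matvec(x, spacing_wl, near_radius_wl):
    """Return the near and far contributions to y = G x and the near entry count."""
    n = len(x)
    y_near = [0j] * n
    y_far = [0j] * n
    near_entries = 0
    for i in range(n):
        for j in range(n):
            r = abs(i - j) * spacing_wl
            if r <= near_radius_wl:
                y_near[i] += kernel(r) * x[j]   # banded part, kept explicitly
                near_entries += 1
            else:
                y_far[i] += kernel(r) * x[j]    # Toeplitz part, FFT candidate
    return y_near, y_far, near_entries


n = 40
spacing_wl = 0.1        # assumed sampling of ten unknowns per wavelength
near_radius_wl = 0.45   # assumed near-zone radius, inside the quoted 0.3 to 0.5 range
x = [1.0 + 0.1 * i for i in range(n)]

y_near, y_far, near_entries = split_matvec(x, spacing_wl, near_radius_wl)
y_full = [sum(kernel(abs(i - j) * spacing_wl) * x[j] for j in range(n)) for i in range(n)]
max_err = max(abs(a + b - c) for a, b, c in zip(y_near, y_far, y_full))
near_fraction = near_entries / (n * n)
```

In this sketch the band holds 340 of the 1600 matrix entries, and the stored fraction shrinks further as the grid grows while the near radius stays fixed.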
AIM belongs to the category of matrix compression methods. At this time other techniques are also being investigated to speed up the matrix-vector products and reduce memory requirements of large-scale systems which may involve hundreds of thousands of volume and boundary integral unknowns. Among these, the fast multipole method (FMM) is being considered by several research groups and is discussed below.
The fast multipole method (FMM) is an efficient approach for calculating the matrix-vector products associated with dense subsystems such as that in (8.1). One of
the first applications of FMM was given by Barnes and Hut [6] for calculating
interstellar body interactions. More recently, the FMM was used quite successfully
to handle very large-scale interactions [7], [8]. The reader is referred to [9] for an
applications. The reader is cautioned that the speed-up achieved by the various compression schemes can compromise the accuracy of the solution [14].
= 2#!nc -
#.(r) +J- \\ r'\\)dl\\ \320\263'
\320\235(\320\263\320\223\\\320\2720\\\320\263 \320\263,
\320\244(\320\263') \320\241
\320\261 (8.2)
2 ic
where denotes
\320\2571\320\237\320\241 the \320\242\320\225
incident/excitation #u2)(-) is the zeroth-order
field and
Hankel function of the second kind. This integral is a specialization of D.119)
and can be combined with the finite element system D.133) for the solution of H.
and \320\244.
Physically, (8.2) describes the field relation at the aperture of a dielectri-
cally filled groove, as illustrated in Fig. 8.2. It is constructed by enforcing the
condition
jf
(\302\253\320\276./'\32
(Boundary/aperture)
\320\223
-r'\\)dx', \320\263'\320\265\320\241
\320\263.
(8.3)
bHz
_ jk0 +jko p
Lx
Zo
and from D.114)
the factor of 2 in the right-hand side of (8.2). In the next few subsections we examine the discretization and evaluation of the integral (8.3) using various versions of the FMM. This exposition provides a close look at the characteristics of FMM for electromagnetic applications and demonstrates the features which are responsible for the CPU speed-up and memory reduction.
In accordance with the FMM (see Figs. 8.3 and 8.4), the $N_b$ boundary unknowns introduced for the discretization of (8.2) or (8.3) are subdivided into
Figure 8.3 Computation of the boundary integral matrix vector product using exact FMM. [Labels: basis elements, test element, source element, group centers, source group, global origin.]
Figure 8.4 Computation of the boundary integral matrix vector product using exact FMM. [Labels: near groups and far groups relative to the minimum separation distance.]
groups with each group assigned $M_b$ unknowns. Thus, a total of $L_b \approx N_b/M_b$ groups are constructed. The key step in all FMM procedures is to rewrite the integral (8.3) as a product of terms, each being a function of $\mathbf{r}$ (observation point) or $\mathbf{r}'$ (integration point) but not both. In this manner, the evaluation of the integral is carried out by considering the group-to-group interactions separately from the intragroup interactions. Beyond the math, this breakdown of interactions/operations can be viewed in the context of the manager-worker model. Basically, we can view each group as managed by the center element, with the workers comprising the elements of the group. Communication/interaction among the groups takes place through the managers, who in turn interact with the group elements. The decompositions reduce the direct interdependence of each group member with the other elements belonging to different groups, and this is at the heart of the CPU speed-up afforded by FMM. As stated earlier, though, there are inherent approximations as part of the group decomposition process which must be understood in order to assess the accuracy of each FMM algorithm.
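The grouping step can be sketched as follows for unknowns laid out along a straight aperture. The sample numbers (750 unknowns on a 50-wavelength aperture, 28 unknowns per group) and the near/far rule based on 1.5 group widths are assumptions chosen for illustration; the text does not prescribe a particular separation criterion here.

```python
def make_groups(n_b, m_b):
    """Split unknown indices 0..n_b-1 into consecutive groups of at most m_b."""
    return [list(range(start, min(start + m_b, n_b))) for start in range(0, n_b, m_b)]


def classify_pairs(groups, positions, min_separation):
    """Label each ordered pair of groups as near or far by center distance."""
    centers = [sum(positions[i] for i in g) / len(g) for g in groups]
    near, far = [], []
    for a in range(len(groups)):
        for b in range(len(groups)):
            if abs(centers[a] - centers[b]) > min_separation:
                far.append((a, b))    # handled through the group centers
            else:
                near.append((a, b))   # handled element by element
    return centers, near, far


aperture_width = 50.0   # in wavelengths (assumed)
n_b = 750               # boundary unknowns (assumed)
m_b = 28                # unknowns per group (assumed)
h = aperture_width / n_b
positions = [(i + 0.5) * h for i in range(n_b)]

groups = make_groups(n_b, m_b)
l_b = len(groups)       # close to n_b / m_b
centers, near, far = classify_pairs(groups, positions, min_separation=1.5 * m_b * h)
```

With these values each group is near only to itself and its immediate neighbours, so most group pairs fall into the far list and can be treated through their centers.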
To achieve the decomposition of (8.3) into a product of functions in $\mathbf{r}$ and $\mathbf{r}'$, we first invoke the addition theorem to rewrite the Hankel function as
+ r, - \\r,
-
(8.4)
where denotes
\320\263\321\206- the between the centers of the / and /' groups, as illus-
distance
illustrated in Figs. 8.3 and 8.4. Also, \321\204\321\206-
and are the angles between the vectors ru>
\321\204\320\263>\320\263
and \342\200\224
with
\320\263/- \320\223/ the x-axis, respectively. The source and observation points r/ and 17
have their origin at the center of the /' and / groups, respectively, while r' and are
\320\263
is used to truncate the sum (8.4), where $D$ is the diameter of the circle enclosing the groups. This is consistent with the radius of convergence associated with the Hankel function. In general, $Q/2 = M_b$ ensures convergence. (It will be shown that $Q$ is the number of directions in which the radiation of the group is sampled. With $M_b$ being the number of basis elements in the group, $Q = 2M_b$ satisfies the Nyquist criterion for faithful replication of the source group radiation.)
Next we introduce the Fourier integral of the Bessel function
'\302\246\"\"-'tf0 (8.6|
4\320\273\320\263
ikr'^ (8.7)
j2)r
where $\mathbf{k} = k_0(\hat{x}\cos\phi + \hat{y}\sin\phi)$ and $\phi$ is measured from the $x$-axis. In this,
= .(r')e\"\"i'dl'
f (8.8)
J \320\263
0/2
= \320\2351~\\\320\272\320\276\320\2631\320\263)\320\265-\320\235\321\204-\321\204><1+1\022)
\320\223\342\200\236@)
]\320\237 (8.9)
= (8.10)
\320\257\320\223@
-^ \320\224*? 7>(*,) VriQ^\"
which is the radiated field from some location in the source group $l'$ to a point
within the receiving group $l$. Note that $\Delta\phi = 2\pi/Q$ indicates the angular spacing
between the propagation vectors of plane waves emanating from a group. Thus
$\phi_q = q\Delta\phi$, $q = 1 \ldots Q$, whereas $\mathbf{k}_q = k_0(\hat{x}\cos\phi_q + \hat{y}\sin\phi_q)$. As mentioned earlier,
the number of plane wave directions is set equal to twice the number of elements
in the group ($Q = 2M_b$), thus satisfying the Nyquist sampling theorem with respect
to the integration over $\phi$.
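The sampling rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `plane_wave_directions` and the choice $k_0 = 2\pi$ (unit wavelength) are hypothetical and are not taken from the book's code.

```python
import math

def plane_wave_directions(m_b, k0):
    """Return a list of (phi_q, kx_q, ky_q) for Q = 2*m_b plane-wave directions.

    m_b : number of basis elements in a group (M_b in the text)
    k0  : free-space wavenumber
    """
    q_total = 2 * m_b                    # Q = 2 M_b, the Nyquist rule quoted above
    dphi = 2.0 * math.pi / q_total       # angular spacing, delta_phi = 2*pi/Q
    directions = []
    for q in range(1, q_total + 1):      # q = 1 ... Q
        phi = q * dphi
        directions.append((phi, k0 * math.cos(phi), k0 * math.sin(phi)))
    return directions

# Hypothetical example: 27 elements per group, wavelength normalized to 1.
dirs = plane_wave_directions(m_b=27, k0=2.0 * math.pi)
print(len(dirs), "directions, spacing", 360.0 / len(dirs), "degrees")
```

Every sampled vector has magnitude $k_0$; only its direction changes from one sample to the next.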
Given the above steps, the exact FMM procedure for
QLhMh.
From the above we conclude that the total operation count of the above three
8.2.3 Windowed FMM
In the exact FMM, the translation operation between groups assumed isotropic
radiation. However, it is suggestive that the groups would interact strongly
along the line joining them and less so in other directions. Indeed, it was shown in
[13] that the translation operator could be contemplated as composed of a
geometrical optics (GO) term (along the line joining the source and test group) and two
diffraction terms associated with the shadow boundaries of the GO term. To
illustrate the validity of this concept, we plot in Fig. 8.5 the translation operator
for different group separation distances along the groove of width $50\lambda$. For this
groups. As seen, the "lit" region of the translation operator narrows as the
group separation distance is increased, eventually displaying the predictable sinc
function behavior for large group separation distances. The tapering off of the
translation operator from a value oscillating around 2 down to zero for larger
$\phi - \phi_{ll'}$ values is characteristic of the geometrical optics plus diffraction terms in
the context of traditional high frequency methods. We may also comment that this
high frequency model enables the identification of a lit region even for groups
which are not widely separated (for example, see Fig. 8.5 for the translation
operator between groups 1 and 3).

Figure 8.5 The translation operator for different groups on the boundary of a $50\lambda$
wide groove: 750 BI unknowns; 27 groups.
The key characteristic of the windowed FMM is the exploitation of the
diminished value of $T_{ll'}(\phi)$ for large $\phi - \phi_{ll'}$. Basically, in the windowed FMM, the com-
where
larger bandpass window when $r_{ll'}$ is smaller, as dictated by high frequency analysis.
The discretized plane wave expansion can now be written as
=
\321\217?\321\201\302\253, (8.13)
1\320\243\342\200\236.{\321\204\321\207)\320\24211.(\321\204,)\320\243\320\263(\321\20411)\320
\320\220\321\204^2
_^\320\276
\320\266
(*0|r
- r'|) - \302\253-J^'-'-r*.
U
_^ \320\265-\320\234-,\321\202\342\200\236, (8\320\2334)
Figure 8.6 Computation of the boundary integral matrix-vector product using
windowed FMM.
is used. As shown in Fig. 8.7, $r_{ll'}$ is the distance between the center of the test group $l$
and the center of the source group $l'$; $r_{nl'}$ is the distance between the $n$th source
element and its group center; and $r_{lm}$ is the distance between the $m$th test element and
its group center.
ru I >
(8.15)
and since the above aggregation needs to be done for all source groups, the
(8.16)
where in the FAFFA the translation operator simplifies to
(8.17)
This should be compared to the sum (8.9) for the exact FMM. Clearly,
(8.17) needs to be done only at the group level and involves $O(N_b^2/M_b^2)$
operations for all possible test and source group combinations, making it
v
\302\261 Ae-^'i-*\342\204\242
? (8.18)
r=\\
Since this operation involves only the source group instead of the source
element, it needs to be done for each source group, implying $O(N_b/M_b)$
operations to generate a single row of the matrix-vector product. To
generate $M_b$ rows, corresponding to a test group, the operation count would be
$O(N_b)$. With $N_b/M_b$ test groups, the operation count is $O(N_b^2/M_b)$.
Consolidating the above three steps for the FAFFA algorithm, we have
where the first term refers to the operations associated with the near-field terms. As
before, $M_b = \sqrt{N_b}$ and the total operation count is $O(N_b^{1.5})$. While the operation
count for this algorithm could be further reduced to $O(N_b^{4/3})$ by performing
the process of "interpolation" and "anterpolation" as described in [15] for very large
objects, we found that the accuracy deteriorated for the considered applications.
Hence, only the $O(N_b^{1.5})$ version was used.
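The choice $M_b = \sqrt{N_b}$ can be checked with a small numerical sketch. The cost model below is an assumption made purely for illustration: one term that grows as $N_b M_b$ (near-field, aggregation and disaggregation work) and one that grows as $N_b^2/M_b$ (group-to-group work), both with unit constants. The names `total_cost` and `best_group_size` are hypothetical.

```python
def total_cost(n_b, m_b, c_local=1.0, c_far=1.0):
    """Assumed two-term cost model: c_local*N_b*M_b + c_far*N_b**2/M_b."""
    return c_local * n_b * m_b + c_far * n_b ** 2 / m_b

def best_group_size(n_b):
    """Brute-force search for the integer group size that minimizes the model."""
    return min(range(1, n_b + 1), key=lambda m: total_cost(n_b, m))

n_b = 750                                  # boundary unknowns, as in the groove example
m_opt = best_group_size(n_b)
print("best M_b:", m_opt, " sqrt(N_b):", round(n_b ** 0.5, 2))
print("model cost:", round(total_cost(n_b, m_opt)), " 2*N_b^1.5:", round(2 * n_b ** 1.5))
```

With unit constants the two terms balance near $M_b = \sqrt{N_b}$, where the model cost is about $2N_b^{1.5}$. Different constants shift the optimum but not the $N_b^{1.5}$ scaling.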
8.3 LOGIC FLOW
The operation counts described in the previous section for the various algorithms are
illustrated with the help of flow diagrams and sections of code from the computation
of the matrix-vector products for the far groups. Figures 8.8 and 8.11 depict the flow
diagram and code for computing the matrix-vector product in the exact FMM. It is
seen in Fig. 8.11 that each of the aggregation, translation, and disaggregation
operations consists of a single multiplication, as described below.
• The aggregation operation consists of the product of an entry of the trial
vector (represented as Dum(J) in Fig. 8.11) with an aggregation factor,
represented in Fig. 8.11 for the Jth element and Kth direction as
SrcGc(J,K). This is given by SrcGc(J,K) = \320\264,\302\253. **>'r*.>
#\302\253<*\302\253***+\342\200\242'\342
\320\232/2
= -*\302\273*\320\273*+*>2)
TransdGr. JGr. K) \320\265~\321\210> H);\\k{)riarJlir)
J2
u=-K/2
where $\phi_K$ is the Kth radiation direction, and $r_{IGr,JGr}$ and $\phi_{IGr,JGr}$ are the distance
and angle between the IGrth and JGrth groups. The result of the translation
operation yields a term dependent only on the test group and radiation
direction (represented in Fig. 8.11 as GrGr(IGr,K)).
• The disaggregation operation involves the multiplication of the translation
sum, GrGr(IGr,K), with a disaggregation factor, represented in Fig. 8.11
for the Ith test element and the Kth radiation direction by TestGc(I,K).
This is given by

$$\mathrm{TestGc}(I,K) = e^{-j\mathbf{k}_K \cdot \mathbf{r}_{I,IGr}}$$

where $\mathbf{r}_{I,IGr}$ is the direction vector of the Ith element measured from the
center of the group (IGr) it belongs to. The result of the disaggregation
operation yields a term dependent on the test element alone and is the
contribution to the Ith entry of the product vector.
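The three phases listed above can be sketched as follows. This is an illustrative sketch, not the book's code: the factors are random complex numbers standing in for SrcGc, Trans and TestGc, the far-group test `is_far` is a hypothetical stand-in for the distance check, and any constant prefactor is omitted. It only demonstrates that factoring the far interactions through group-level sums gives the same vector as summing the factorized element-to-element terms directly.

```python
import random

random.seed(0)
n_groups, m_b = 4, 3                 # hypothetical: 4 groups of 3 elements each
n_dirs = 2 * m_b                     # Q = 2 M_b plane-wave directions
n_b = n_groups * m_b

def rand_c():
    return complex(random.uniform(-1, 1), random.uniform(-1, 1))

group_of = [e // m_b for e in range(n_b)]                      # element -> group
src_gc = [[rand_c() for _ in range(n_dirs)] for _ in range(n_b)]
test_gc = [[rand_c() for _ in range(n_dirs)] for _ in range(n_b)]
trans = [[[rand_c() for _ in range(n_dirs)] for _ in range(n_groups)]
         for _ in range(n_groups)]
dum = [rand_c() for _ in range(n_b)]                           # trial vector

def is_far(ig, jg):
    """Assumed far-group rule: groups that are not neighbours on a line."""
    return abs(ig - jg) > 1

def far_matvec_three_phase():
    # Aggregation: one sum per source group and direction.
    v = [[0j] * n_dirs for _ in range(n_groups)]
    for j in range(n_b):
        for k in range(n_dirs):
            v[group_of[j]][k] += dum[j] * src_gc[j][k]
    # Translation: group-to-group, one product per far pair and direction.
    gr = [[0j] * n_dirs for _ in range(n_groups)]
    for ig in range(n_groups):
        for jg in range(n_groups):
            if is_far(ig, jg):
                for k in range(n_dirs):
                    gr[ig][k] += trans[ig][jg][k] * v[jg][k]
    # Disaggregation: distribute the group result to each test element.
    ax = [0j] * n_b
    for i in range(n_b):
        for k in range(n_dirs):
            ax[i] += gr[group_of[i]][k] * test_gc[i][k]
    return ax

def far_matvec_direct():
    # Same factorized kernel, summed element by element for comparison.
    ax = [0j] * n_b
    for i in range(n_b):
        for j in range(n_b):
            ig, jg = group_of[i], group_of[j]
            if is_far(ig, jg):
                for k in range(n_dirs):
                    ax[i] += test_gc[i][k] * trans[ig][jg][k] * src_gc[j][k] * dum[j]
    return ax

fast, slow = far_matvec_three_phase(), far_matvec_direct()
print(max(abs(a - b) for a, b in zip(fast, slow)))
```

The saving comes from reuse: the aggregated spectrum of a source group is computed once and shared by every test group, rather than being recomputed for each pair of elements.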
The windowed FMM differs from the exact FMM in the translation phase, and
this is illustrated in Figs. 8.9 and 8.12. These figures illustrate that the windowed
FMM achieves its reduced operation count by eliminating some of the directions in
which plane wave interaction takes place. The innermost loop in the translation
phase has an operation count which is a constant (15-25 in our simulations, 20 in
      IQAngles = 2*NGr
      DPhi = 2*Pi/IQAngles
c     Aggregation - Operation count O(NM)
c     S = '*' selects the conjugated factors
      do JGr = 1, NGr
        do K = 1, IQAngles
          do JEl = 1, NElGr(JGr)
            J = GetGlobal(JGr,JEl)
            if (S.eq.'*') then
              V(JGr,K) = V(JGr,K) + Dum(J)*conjg(SrcGc(J,K))
            else
              V(JGr,K) = V(JGr,K) + Dum(J)*SrcGc(J,K)
            endif
          enddo
        enddo
      enddo
c     Translation - Operation count O(N^2/M)
c     Only groups farther apart than DMin are translated
      do IGr = 1, NGr
        do JGr = 1, NGr
          if (Distance(IGr,JGr).gt.DMin) then
            do K = 1, IQAngles
              if (S.eq.'*') then
                GrGr(IGr,K) = GrGr(IGr,K)
     &                      + conjg(Trans(IGr,JGr,K))*V(JGr,K)
              else
                GrGr(IGr,K) = GrGr(IGr,K)
     &                      + Trans(IGr,JGr,K)*V(JGr,K)
              endif
            enddo
          endif
        enddo
      enddo
c     Translation
c     Directions with a zero (windowed-out) translation
c     operator are skipped
      do IGr = 1, NGr
        do JGr = 1, NGr
          if (Distance(IGr,JGr).gt.DMin) then
            do K = 1, IQAngles
              if (S.eq.'*') then
                if (abs(Trans(IGr,JGr,K)).eq.0) then
                  continue
                else
                  GrGr(IGr,K) = GrGr(IGr,K)
     &                        + conjg(Trans(IGr,JGr,K))*V(JGr,K)
                endif
              else
                if (abs(Trans(IGr,JGr,K)).eq.0) then
                  continue
                else
                  GrGr(IGr,K) = GrGr(IGr,K)
     &                        + Trans(IGr,JGr,K)*V(JGr,K)
                endif
              endif
            enddo
          endif
        enddo
      enddo
Figure 8.12 Code indicating the computation of the matrix-vector product in the
translation phase of the windowed FMM.
[13]) and is a significant reduction from the corresponding operation count in the
exact FMM.
The technique by which the FAFFA achieves its speed-up is depicted in Figs.
8.10 and 8.13. It is seen that the FAFFA "recycles" the plane wave spectra of the
source group. For a given test group, the aggregation and translation operations are
performed only once for each source group, necessitating that only the
disaggregation operation needs to be performed for each individual element of the test group.
Similar to the exact FMM, the aggregation, translation, and disaggregation
processes consist of a single multiplication. However, the factors used in the three
processes and the method by which the reduced operation count is achieved are
different.
• The aggregation operation again consists of the product of an entry of the
trial vector with an aggregation factor, represented in Fig. 8.13 for the Jth
element and IGrth test group as SrcGc(IGr,J). This is given by
SrcGcUGr , J) = ^\320\265~^'\321\210\320\272\"'\320\254\321\202'
where $\Delta_J$ is the length of the Jth
discretization element, $\hat{r}_{IGr,JGr}$ is the unit vector along the line joining the source
and test groups, while $\mathbf{r}_{J,JGr}$ is the vector along the line joining the source
element with its group center. Thus, an aggregation sum is formed for each
combination of source and test groups.
• The translation operation involves the multiplication of the aggregation sum
with a translation factor, represented in Fig. 8.13 for the IGrth test group
      do IGr = 1, NGr
        ITestGrNew = 1
        do IEl = 1, NElGr(IGr)
          I = GetGlobal(IGr,IEl)
          do JGr = 1, NGr
            if (Distance(IGr,JGr).gt.DMin) then
c             Aggregate and translate only once per test group
              if (ITestGrNew.eq.1) then
c               Reset the aggregation sum for this source group
                V = 0
                do JEl = 1, NElGr(JGr)
                  J = GetGlobal(JGr,JEl)
                  if (S.eq.'*') then
                    V = Dum(J)*conjg(SrcGc(IGr,J)) + V
                  else
                    V = Dum(J)*SrcGc(IGr,J) + V
                  endif
                enddo
                if (S.eq.'*') then
                  GrGr(JGr) = conjg(Trans(IGr,JGr))*V
                else
                  GrGr(JGr) = Trans(IGr,JGr)*V
                endif
              endif
c             Disaggregation for every element of the test group
              if (S.eq.'*') then
                AX(I) = AX(I) + GrGr(JGr)*conjg(TestGc(I,JGr))
              else
                AX(I) = AX(I) + GrGr(JGr)*TestGc(I,JGr)
              endif
            endif
          enddo
          ITestGrNew = 0
        enddo
      enddo
Figure 8.13 Code indicating the computation of the matrix-vector product in the
FAFFA.
8.4 RESULTS
The results presented in this section [16], [17] are based on an FMM computer code,
incorporating a conjugate gradient solver, and executed on an HP 9000/750
workstation with a peak flop rate of 23.7 MFLOPS. The geometry considered was the
rectangular groove shown in Fig. 8.2. Table 8.1 compares the execution time and
RMS error [14] of the standard FE-BI to the FE-Exact FMM, FE-FAFFA and the
FE-Windowed FMM (FE-WFMM) for grooves of widths $25\lambda$, $35\lambda$ and $50\lambda$. The
depth of the groove was $0.35\lambda$ with a material filling of $\epsilon_r = 4$ and $\mu_r = 1$, and the
groove was illuminated at normal incidence. The data reveal that the FE-Exact FMM offers
almost a 50 percent savings in execution time with almost no compromise in
accuracy. While the FE-FAFFA is the fastest of the three algorithms, its RMS error was
substantially higher (> 1 dB). If the maximum tolerable RMS error is set at 1 dB [14],
the FE-Windowed FMM is the most attractive option since it meets the error
criterion and is only slightly slower than the FE-FAFFA.
Table 8.2 gives the exact BI operation count rather than merely stating its
order. The knowledge of the constants associated with each exponent of $N_b$ enables
us to compare the requirements of two algorithms which might have the same
order of operation count. In Table 8.2, $N_{NG}$ is the number of near groups (groups
which are treated with the exact moment method procedure owing to their
electrical proximity), which depends on the algorithm and the problem geometry. $N_{NG}$
is smallest for the FE-FMM$^{\mathrm{Exact}}$ and largest for the FE-FAFFA, due to the use of
the far-zone Green's function in the latter. Table 8.2 also gives the number of
multiplications in a single BI matrix-vector product for the $50\lambda$ groove. For the
TABLE 8.1 CPU Times and RMS Error of the Hybrid Algorithms
Operations
(multiplications) Required
Operation count for BI Computation (as for BI Computation for
FE-BI Ni 562500
-
FE-FAFFA (\\m + 2)Nl 4A- 2NNa)N,, NwX* 136890
FE-FMMbl<CI 180356
FE-WFMM D + Nm + QvM)Nl-n - NNaQWisKM 153780
hybridization of the FMM does not have any adverse effect on the condition of the
FE-BI system. The time for each iteration is reduced and the total number of
iterations remains approximately the same, resulting in reduced overall solution
time for the Fast BI algorithms.
The performance of the hybrid algorithms at a more stressing angle of
incidence is depicted in Fig. 8.15. For this example calculation, the width of the groove
was $10\lambda$ and it is seen that the RMS error follows the same trend as for normal
incidence illumination. However, even for this smaller size aperture, the scalability
of the speed-up is maintained. The employed near-group radius was $1\lambda$, implying
that the matrix-vector products for groups separated by a distance less than a
wavelength were computed using the exact method of moments procedure.
Smaller near-group distances can be employed to reduce the CPU time even
further, and near-group distances down to $0.3\lambda$ have been found to yield
sufficiently accurate results.
Figure 8.14 Convergence curves for the hybrid algorithms for the groove of width $25\lambda$
(curves: FE-BI, FE-Exact FMM, FE-Windowed FMM; horizontal axis: iteration number).
Figure 8.15 Scalability of the hybrid techniques to smaller problems: (a) Problem
geometry, (b) Bistatic patterns, (c) Error table. (Panel (b) curves: FE-BI, FE-Exact FMM,
FE-Windowed FMM, FE-FAFFA, with DMin = $1\lambda$; horizontal axis: observation angle (deg).)
REFERENCES
[1] J. Gong, J. L. Volakis, A. Woo, and H. Wang. A hybrid finite element boundary
integral method for analysis of cavity-backed antennas of arbitrary shape.
IEEE Trans. Antennas Propagat., 42(9):1233-1242, September 1994.
[2] E. Bleszynski, M. Bleszynski, and T. Jaroszewicz. AIM: Adaptive integral
method for solving large-scale electromagnetic scattering and radiation
problems. Radio Sci., 31(5):1225-1251, 1996.
[10] V. Rokhlin. Rapid solution of integral equations for scattering theory in two
dimensions. Journal of Computational Physics, 86(2):414-439, 1990.
[11] R. Coifman, V. Rokhlin, and S. Wandzura. The fast multipole method for the
wave equation: A pedestrian prescription. IEEE Antennas and Propagation
Magazine, 35(3):7-12, 1993.
[12] J. M. Song and W. C. Chew. Multilevel fast multipole algorithm for solving
combined field integral equation of electromagnetic scattering. Microwave and
9.1 INTRODUCTION
In the previous chapter, we outlined the formulation of the finite element method as
applied to problems in electromagnetics. In three dimensions, FEM is primarily a
volume formulation and the number of unknowns escalates rapidly as the size of the
problem increases. Therefore, the limiting factor in dealing with three-dimensional
problems is the unknown count and the associated demands on storage and solution
time. Techniques which have $O(N)$ storage and solution times are thus necessary to
tackle three-dimensional problems. This is one of the principal reasons for the
popularity of partial differential equation techniques over integral equation (IE)
approaches, as the latter lead to dense matrices with $O(N^2)$ storage. As the problem
size increases, the IE and hybrid methods, both of which need $O(N^\alpha)$, $1 < \alpha < 2$,
storage, quickly become unmanageable in terms of storage and solution time.
Another concern while solving problems having more than 100,000 unknowns (a
scenario that can be envisioned for most practical problems) is to avoid software
bottlenecks. The algorithmic complexity of any part of the program should increase
at most linearly with the number of unknowns. This is not possible in many cases but
as a rule of thumb, it is generally true that schemes can be devised to manipulate
mentioned to handle anisotropic geometries and situations where the boundary
conditions make the system unsymmetric. We devote an entire section to sparse
eigenanalysis where we focus mainly on solving the generalized eigenvalue problem
using sparse and full matrix methods. To solve large problems, the computationally
intensive portions of the finite element code need to be parallelized on massively
parallel architectures. A parallelization paradigm is discussed in connection with a
distributed memory multiprocessor such as the KSR1 (Kendall Square Research)
machine.
The matrix systems in finite elements and related PDE methods are very sparse and
the percentage of sparsity increases with the number of unknowns. In an average
three-dimensional tetrahedral mesh with edge basis functions, the minimum number
of nonzero elements per row can be 9 and the maximum number of nonzeros per row
is about 30. The total number of nonzeros varies between $15N$ and $16N$, where $N$ is
the number of unknowns. Assuming a square matrix, the matrix is 99.84 percent
sparse for a 10,000-unknown problem whereas for 100,000 unknowns, 99.984
percent of the matrix entries are zero. Clearly, it makes little sense to store these zero
entries, which motivates us to find the best possible scheme for storing such matrices.
As we shall see in the subsequent paragraphs and indeed throughout this chapter, the
definition of best is not unique and is governed by computer architecture.
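The sparsity percentages quoted above follow directly from the nonzero count. The short sketch below reproduces them under the assumption of $16N$ nonzeros in an $N \times N$ matrix; the function name `sparsity_percent` is hypothetical.

```python
def sparsity_percent(n_unknowns, nnz_per_row=16):
    """Percentage of zero entries in an N x N matrix with nnz_per_row*N nonzeros."""
    nnz = nnz_per_row * n_unknowns
    return 100.0 * (1.0 - nnz / n_unknowns ** 2)

for n in (10_000, 100_000):
    print(n, "unknowns:", round(sparsity_percent(n), 4), "percent sparse")
```

Because the nonzero count grows as $N$ while the matrix grows as $N^2$, the fraction of zeros approaches 100 percent as the problem gets larger.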
There are various storage schemes for sparse matrices. In this chapter, we will
discuss the more viable ones: Compressed Sparse Row (CSR) format, ITPACK
format [1], and the jagged diagonal format. Knowledge of the storage formats is
VAL and COL equals the number of nonzero elements in $\mathbf{A}$. Another pointer
array, ROWPNTR, of dimension $N$ is used to store the cumulative number of nonzero
elements up to the end of each row. Thus the position of each element in the sparse
matrix is uniquely defined. For example, if we have the 5 x 5 unsymmetric matrix $\mathbf{A}$

$$\mathbf{A} = \begin{bmatrix} 3 & 0 & 0 & 4 & 5 \\ 7 & 0 & 4 & 0 & 2 \\ 4 & 0 & 7 & 0 & 0 \\ 0 & 0 & 8 & 0 & 0 \\ 9 & 7 & 0 & 0 & 0 \end{bmatrix}$$

then according to the CSR scheme, VAL and ROWPNTR will take the form

VAL = [3 4 5 7 4 2 4 7 8 9 7]
ROWPNTR = [3 6 8 9 11]

Note the first value of ROWPNTR implies that after reading three entries of VAL,
we will then start reading entries that belong to the second row of $\mathbf{A}$. After reading
the sixth entry of VAL we will then begin reading entries of VAL that belong to the
third row of $\mathbf{A}$, and so on. The last entry of ROWPNTR is always equal to the
length of the vector VAL.

¹For convenience the matrices and column vectors will be denoted by bold letters in this chapter
only and the columns will be treated as vectors in the dot/inner product definitions.
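The CSR arrays for the example matrix can be generated and used as sketched below. The sketch follows the conventions of the example (1-based column indices, ROWPNTR holding the running count of nonzeros at the end of each row); the function names `to_csr` and `csr_matvec` and the test vector are hypothetical.

```python
A = [[3, 0, 0, 4, 5],
     [7, 0, 4, 0, 2],
     [4, 0, 7, 0, 0],
     [0, 0, 8, 0, 0],
     [9, 7, 0, 0, 0]]

def to_csr(dense):
    """Return (VAL, COL, ROWPNTR) with 1-based column indices."""
    val, col, rowpntr = [], [], []
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                val.append(a)
                col.append(j + 1)
        rowpntr.append(len(val))     # nonzeros stored up to the end of this row
    return val, col, rowpntr

def csr_matvec(val, col, rowpntr, x):
    """y = A x, touching only the stored nonzeros."""
    y, start = [], 0
    for end in rowpntr:
        y.append(sum(val[p] * x[col[p] - 1] for p in range(start, end)))
        start = end
    return y

VAL, COL, ROWPNTR = to_csr(A)
print(VAL)
print(COL)
print(ROWPNTR)
print(csr_matvec(VAL, COL, ROWPNTR, [1, 2, 3, 4, 5]))
```

The cost of the product is proportional to the number of stored nonzeros rather than to $N^2$.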
In the above example, the matrix entries for each row were stored in ordered
fashion, i.e., in increasing order of column indices, but this is not necessary for
commutative operations like addition and multiplication. A similar data structure
which stores the row indices instead of the column indices is called the Compressed
Sparse Column (CSC) format. The CSC format is sometimes used when the matrix is
to be accessed along the columns and not the rows, e.g., in the multiplication of the
transpose of a sparse matrix with a vector. The CSR/CSC scheme is very convenient
arrays VAL and COL. Then, according to the ITPACK scheme, the rows of the
array VAL will contain the nonzero elements of the corresponding rows of the
original matrix. The number of columns of VAL will be equal to the maximum
number of nonzeros in a row; rows containing fewer nonzero elements will be
zero padded. Again, considering the sparse matrix $\mathbf{A}$, the corresponding ITPACK
array VAL can be represented as

$$\mathrm{VAL} = \begin{bmatrix} 3 & 4 & 5 \\ 7 & 4 & 2 \\ 4 & 7 & 0 \\ 8 & 0 & 0 \\ 9 & 7 & 0 \end{bmatrix}$$
The column indices of the elements in VAL are stored in an integer array COL
defined as

$$\mathrm{COL} = \begin{bmatrix} 1 & 4 & 5 \\ 1 & 3 & 5 \\ 1 & 3 & * \\ 3 & * & * \\ 1 & 2 & * \end{bmatrix}$$
The asterisk denotes that the corresponding elements of COL are zeros. The
ITPACK storage scheme is attractive for generating finite element matrices since
the number of comparisons required while augmenting the matrix depends only on
the locality of the corresponding variable and not on the number of unknowns. This
feature can also be used for implementing fast searches and comparisons, whenever the
matrix is extremely sparse. Moreover, the sparse matrix-vector multiplication
process can be highly vectorized because of large vector lengths when the number of
nonzeros in all rows is nearly equal. This is because the multiplication operation is
carried out by traversing the columns of VAL and COL whose dimensions are $O(N)$.
However, for our application, almost half the space is lost in storing zeros. As a
result, a lot of storage as well as computational effort is wasted in storing and
operating on zeros, respectively.
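A minimal sketch of the ITPACK layout for the same 5 x 5 example is given below. It assumes the padding convention described above (a zero column index marks a padded slot, column indices are 1-based); the function names `to_itpack` and `itpack_matvec` are hypothetical and this is not the ITPACK library's own interface.

```python
A = [[3, 0, 0, 4, 5],
     [7, 0, 4, 0, 2],
     [4, 0, 7, 0, 0],
     [0, 0, 8, 0, 0],
     [9, 7, 0, 0, 0]]

def to_itpack(dense):
    """Return (VAL, COL) as N x W arrays, W = maximum nonzeros in any row."""
    width = max(sum(1 for a in row if a != 0) for row in dense)
    val, col = [], []
    for row in dense:
        entries = [(a, j + 1) for j, a in enumerate(row) if a != 0]
        pad = width - len(entries)
        val.append([a for a, _ in entries] + [0] * pad)
        col.append([j for _, j in entries] + [0] * pad)   # 0 marks padding
    return val, col

def itpack_matvec(val, col, x):
    """y = A x, traversing the columns of VAL and COL (long inner loop)."""
    n = len(val)
    y = [0] * n
    for c in range(len(val[0])):        # short outer loop over stored columns
        for i in range(n):              # inner loop of length N
            if col[i][c] != 0:
                y[i] += val[i][c] * x[col[i][c] - 1]
    return y

VAL, COL = to_itpack(A)
padded = sum(1 for row in COL for j in row if j == 0)
print(VAL)
print(COL)
print("padded slots:", padded, "of", len(VAL) * len(VAL[0]))
print(itpack_matvec(VAL, COL, [1, 2, 3, 4, 5]))
```

The inner loop always runs over all $N$ rows, which is what makes the format attractive on vector hardware. The padded slots are the price paid for that regularity.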
The modified ITPACK scheme [2] does alleviate this problem to a certain
degree by sorting the rows of the matrix by decreasing number of nonzero elements.
However, 30 percent of the allotted space is still lost in zero padding.
The other storage format that has been found to be useful for sparse matrices is
the jagged diagonal storage scheme [3]. On a vector machine, this format gives better
performance in terms of vectorizability. In this scheme, the vector lengths are
approximately equal to the order of the system being solved. The rows are first
sorted by increasing degree of sparsity. The first jagged diagonal is constructed by
taking the first element from each row of the CSR data structure of the ordered
matrix. The rest of the jagged diagonals can be obtained in a similar fashion. The
matrix is thus stored as a collection of subvectors of decreasing length. The number
of jagged diagonals equals the number of nonzeros in the first row of the sorted
matrix. An additional vector is required as before to store the corresponding column
numbers from the original sparse matrix. The inner loop of the matrix-vector
multiplication routine traverses the entire length of a jagged diagonal, the maximum
dimension of which is the same as that of the sparse matrix. This feature enhances
vectorization massively. The storage requirement of the above format can be made
to be the same as the previously mentioned CSR format through careful
programming. Again, taking the earlier sparse matrix example, we see in Fig. 9.1 how the
matrix is stored in the jagged diagonal format. The arrays VAL and COL store the
matrix values and the corresponding column numbers of the sparse matrix,
respectively.
VAL      3 7 4 9 8 | 4 4 7 7 | 5 2
COL      1 1 1 1 3 | 4 3 3 2 | 5 5
PNTR     1 6 10 12
ROWPERM  1 2 3 5 4
Figure 9.1 Jagged diagonal storage format.
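The arrays of Fig. 9.1 can be rebuilt from the example matrix with the sketch below. It assumes 1-based indices, that PNTR holds the starting position of each jagged diagonal plus one final end marker, and that rows with equal nonzero counts keep their original relative order; the function names `to_jagged` and `jagged_matvec` are hypothetical.

```python
A = [[3, 0, 0, 4, 5],
     [7, 0, 4, 0, 2],
     [4, 0, 7, 0, 0],
     [0, 0, 8, 0, 0],
     [9, 7, 0, 0, 0]]

def to_jagged(dense):
    """Return (VAL, COL, PNTR, ROWPERM) in jagged diagonal form (1-based)."""
    rows = [[(a, j + 1) for j, a in enumerate(row) if a != 0] for row in dense]
    # Stable sort: densest rows first, ties keep their original order.
    order = sorted(range(len(rows)), key=lambda i: -len(rows[i]))
    val, col, pntr = [], [], [1]
    for d in range(len(rows[order[0]])):       # one pass per jagged diagonal
        for i in order:
            if len(rows[i]) > d:
                a, j = rows[i][d]
                val.append(a)
                col.append(j)
        pntr.append(len(val) + 1)
    return val, col, pntr, [i + 1 for i in order]

def jagged_matvec(val, col, pntr, rowperm, x):
    """y = A x; each inner loop runs along one whole jagged diagonal."""
    y = [0] * len(rowperm)
    for d in range(len(pntr) - 1):
        start, end = pntr[d] - 1, pntr[d + 1] - 1
        for r in range(end - start):           # r-th row of the sorted matrix
            y[rowperm[r] - 1] += val[start + r] * x[col[start + r] - 1]
    return y

VAL, COL, PNTR, ROWPERM = to_jagged(A)
print(VAL, COL, PNTR, ROWPERM, sep="\n")
print(jagged_matvec(VAL, COL, PNTR, ROWPERM, [1, 2, 3, 4, 5]))
```

No padding is stored, so the storage matches CSR apart from the permutation vector, while the inner loops remain long.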
In this and the two subsequent sections, we will concentrate on the various
techniques for solving the linear equation system

$$\mathbf{A}\mathbf{x} = \mathbf{b} \qquad (9.1)$$

When $\mathbf{A}$ is dense, i.e., most of the elements of $\mathbf{A}$ are nonzero, the decision is
somewhat straightforward. The inversion of the matrix can be carried out in $O(N^3)$
operations using popular methods like LU decomposition or Cholesky factorization.
However, in the case of sparse matrices, a simple application of the traditional
methods can prove catastrophic, as storage and processor demands will far exceed
acceptable levels. Our focus in this section will, therefore, be on sparse factorization
techniques.
$$\mathbf{A} = \mathbf{L}\mathbf{U} \qquad (9.2)$$
The advantage of this representation is that the subsequent systems
$$\mathbf{L}\mathbf{w} = \mathbf{b} \qquad (9.3)$$
$$\mathbf{U}\mathbf{x} = \mathbf{w} \qquad (9.4)$$
$$\mathbf{A} = \mathbf{L}\mathbf{L}^T \qquad (9.5)$$
The Cholesky factorization thus preserves symmetry of the factored matrix and is
also a unique factorization of $\mathbf{A}$. Public domain codes for both Cholesky
factorization and LU decomposition can be found in [4].
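The sequence (9.2) to (9.4) can be sketched for a small dense matrix as follows. This is a plain Doolittle factorization without pivoting, assumed here only for illustration (it requires nonzero pivots and is not one of the public domain codes cited above); the function names and the 3 x 3 test matrix are hypothetical.

```python
def lu_decompose(a):
    """Doolittle LU without pivoting: A = L U with unit diagonal in L."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        lower[i][i] = 1.0
        for j in range(i + 1, n):
            s = sum(lower[j][k] * upper[k][i] for k in range(i))
            lower[j][i] = (a[j][i] - s) / upper[i][i]   # assumes a nonzero pivot
    return lower, upper

def forward_substitute(lower, b):
    """Solve L w = b, as in (9.3)."""
    w = []
    for i, row in enumerate(lower):
        w.append((b[i] - sum(row[k] * w[k] for k in range(i))) / row[i])
    return w

def back_substitute(upper, w):
    """Solve U x = w, as in (9.4)."""
    n = len(w)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(upper[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (w[i] - s) / upper[i][i]
    return x

A = [[4.0, 3.0, 0.0],
     [6.0, 3.0, 1.0],
     [0.0, 2.0, 5.0]]
b = [10.0, 15.0, 19.0]            # equals A times [1, 2, 3]
L, U = lu_decompose(A)
x = back_substitute(U, forward_substitute(L, b))
print(x)
```

Once the factors are available, each additional right-hand side costs only the two triangular solves.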
Due to finite precision arithmetic, floating point errors can creep into matrix
factorization schemes. For a full matrix of order $n$, the error bound can be expressed
as [5]

$$|E_{ij}| \le 3.01\, n\, \epsilon_M a_M \qquad (9.6)$$

where $\epsilon_M$ is the machine precision and $a_M$ is the largest element of the original or
factorized matrix.
In sparse matrix factorizations, the error bound is actually lower since only a
few operations are performed on each nonzero element. The error matrix for the LU
decomposition is expressed as

$$\mathbf{L}\mathbf{U} = \mathbf{A} + \mathbf{E} \qquad (9.7)$$

where $\mathbf{E}$ is bounded by the expression

$$|E_{ij}| \le 3.01\, \epsilon_M a_M n_{ij} \qquad (9.8)$$

$$n_{ij} = \sum_{k=1}^{\min(i,j)} n_{ij}^{(k)}, \qquad n_{ij}^{(k)} = \begin{cases} 1, & \text{if both } L_{ik} \text{ and } U_{kj} \text{ are nonzero} \\ 0, & \text{otherwise} \end{cases} \qquad (9.9)$$
As can be observed from the error bounds, the growth of the error is directly
proportional to the maximum value of any element that occurs in the original matrix
or due to factorization. Thus, a strategy for monitoring element growth and then
reducing it points the way for error control. One of the most popular strategies is
pivoting. Scaling with the largest element in the corresponding row of the submatrix
(partial pivoting) or with the largest element in the entire submatrix (complete
pivoting) usually stabilizes the factorization process and provides accurate answers.
Complete pivoting comes with a severe price tag in computational expense; partial
pivoting is, therefore, a method of choice for most factorization schemes. It should
be mentioned here that if a matrix is positive definite, diagonal elements are chosen
as pivots since diagonal dominance is a natural consequence of positive definiteness.
For sparse factorizations, even partial pivoting can be too expensive and too
rigid. In such cases, threshold pivoting is employed which strives to maintain sparsity
and employs a user-defined threshold parameter to determine the choice of the pivot
[5]. Threshold pivoting is quite popular and is used in production level codes.
Zlatev's strategy [6] is a variation of threshold pivoting: in addition to maintaining
sparsity, it reduces the number of search rows for the pivot to a user-defined value.
uncontrolled, serious storage and performance penalties ensue. Fill is undesirable for
three compelling reasons:
• Additional storage must be allocated for the extra nonzeros. An extreme
example is a matrix with full first row, full first column and main diagonal
and zeros elsewhere, which would be completely filled on factorization.
• Number of operations needed for factorization increases with increasing fill.
• The error bounds defined earlier increase as the matrix becomes filled with
more and more nonzero entries.
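The extreme example in the first item can be reproduced with a symbolic elimination on the graph of the matrix. The sketch below is an illustration under stated assumptions: the matrix is structurally symmetric, eliminating a vertex connects all of its not-yet-eliminated neighbours, and each new edge counts as one pair of fill entries. The function name `symbolic_fill` and the 6 x 6 size are hypothetical.

```python
def symbolic_fill(n, edges, order):
    """Count fill edges created when vertices are eliminated in the given order."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    eliminated, fill = set(), 0
    for v in order:
        nbrs = [u for u in adj[v] if u not in eliminated]
        # Eliminating v couples every pair of its remaining neighbours.
        for p in range(len(nbrs)):
            for q in range(p + 1, len(nbrs)):
                a, b = nbrs[p], nbrs[q]
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill += 1
        eliminated.add(v)
    return fill

n = 6
# Full first row and column plus the diagonal: vertex 0 is joined to all others.
arrow_edges = [(0, k) for k in range(1, n)]

fill_first = symbolic_fill(n, arrow_edges, order=[0, 1, 2, 3, 4, 5])
fill_last = symbolic_fill(n, arrow_edges, order=[1, 2, 3, 4, 5, 0])
print("dense row eliminated first:", fill_first, "fill edges")
print("dense row eliminated last: ", fill_last, "fill edges")
```

Eliminating the dense row first fills the whole remaining submatrix, whereas moving it to the end of the ordering produces no fill at all. This is the effect that the ordering strategies discussed next try to exploit.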
Strategies for the reduction of fill-in have their origins in graph theory. Since
the amount of fill depends on the row/column permutation selected, a convenient
ordering of the matrix will drastically reduce the computation time and storage
requirements of the factorization. However, it is extremely difficult to find an
optimum ordering which will guarantee the smallest possible fill-in or operation count. In
fact, no general algorithm exists to generate an optimal ordering for an arbitrary
graph. Existing strategies attempt to find an ordering for which the fill-in and
operation count are low, without guaranteeing a true minimum. In finite elements, all
matrices are structurally symmetric, i.e., the positions of the nonzeros form a
symmetric pattern, even though the corresponding values may break the matrix
symmetry. Thus we will mention ordering strategies for symmetric matrices only.
A graph consists of a set of vertices together with a set of edges. Thus, a finite
element mesh can be considered to be an undirected graph since the edge pair between
symmetric matrix and its corresponding labeled undirected graph. A graph with $n$
vertices is labeled when there exists a one-to-one correspondence between the vertices
and the integers $1, 2, \ldots, n$. Ordering strategies for symmetric matrices hinge on the
fact that the graph of a symmetric matrix remains invariant under a symmetric
permutation of its rows and columns; what changes is merely the vertex labeling.
Before discussing the various algorithms involved with matrix ordering, we
need to be familiar with a few basic terms in graph theory. Any square matrix $\mathbf{A}$
of order $N$ can be considered to be an undirected graph with $N$ labeled vertices,
$v_1, v_2, \ldots, v_N$. The pair $(v_i, v_j)$ is an edge of the graph if and only if $A_{ij} \neq 0$. The
Chapter 9
X X
2 X X X
3 X X X
4 X X
X X S X X
X X e
X X X 7 X X X
X X 8 X X
X X X X 9 X
Figure 9.2 (a) Corresponding graph; (b)
X X X 10
symmetric sparse matrix structure.
eccentricity of the vertex $e(v_i)$. The vertex with the largest eccentricity is called the
peripheral vertex. Since no efficient algorithms are available for determining a
peripheral vertex, a pseudo-peripheral vertex is used. $v_i$ is a pseudo-peripheral vertex if
$d(v_i, v_j) = e(v_i)$ implies that $e(v_i) = e(v_j)$, thereby guaranteeing that the eccentricity
of the selected vertex is large.
Most matrix reordering algorithms start with a vertex of minimum degree or a
pseudo-peripheral vertex. The bandwidth reduction algorithm mentioned here is due
to Cuthill and McKee [7]. Starting with a pseudo-peripheral vertex, all unlabeled
vertices adjacent to it are labeled successively in order of increasing degree. The
reverse Cuthill-McKee algorithm is used when the matrix profile needs to be
minimized. In this case, the orderings of the Cuthill-McKee algorithm are merely
reversed to arrive at the minimized profile. Figure 9.3 shows a typical profile
reduction algorithm at work. Notice the bandedness of the final system compared with the
arbitrary sparsity pattern of the original matrix. King [8] also proposed a profile
reduction algorithm with similar performance characteristics as the reverse Cuthill-
McKee algorithm. The profile reduction and bandwidth reduction algorithms are
useful since they save both storage and operation count in the triangular
factorization process. However, none of them explicitly minimizes the fill-in of the factors.
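The labeling rule described above can be sketched as a breadth-first traversal. This is a simplified illustration: it starts from a vertex of minimum degree rather than searching for a pseudo-peripheral vertex, breaks ties by vertex number, and uses a hypothetical five-vertex graph. The function names `cuthill_mckee` and `bandwidth` are not from the cited references.

```python
from collections import deque

def cuthill_mckee(adj):
    """Return a Cuthill-McKee ordering of the vertices of an undirected graph.

    adj[v] lists the neighbours of vertex v (0-based).
    """
    n = len(adj)
    degree = [len(adj[v]) for v in range(n)]
    visited = [False] * n
    order = []
    # Start each connected component from its lowest-degree unvisited vertex.
    for start in sorted(range(n), key=lambda v: (degree[v], v)):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Label unvisited neighbours in order of increasing degree.
            for u in sorted(adj[v], key=lambda w: (degree[w], w)):
                if not visited[u]:
                    visited[u] = True
                    queue.append(u)
    return order

def bandwidth(adj, order):
    """Largest |new_i - new_j| over all edges under the given ordering."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[v] - pos[u]) for v in range(len(adj)) for u in adj[v]),
               default=0)

# Hypothetical example: a chain 0-3-1-4-2 whose labels are scrambled.
adj = [[3], [3, 4], [4], [0, 1], [1, 2]]
cm = cuthill_mckee(adj)
rcm = cm[::-1]                          # reverse Cuthill-McKee ordering
print("bandwidth before:", bandwidth(adj, list(range(len(adj)))))
print("CM order:", cm, "bandwidth:", bandwidth(adj, cm))
print("RCM order:", rcm, "bandwidth:", bandwidth(adj, rcm))
```

Reversing the ordering leaves the bandwidth unchanged. Its benefit lies in the profile, which this small chain example is too simple to show.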
Figure 9.3 (a) Original matrix structure; (b) matrix structure after re-ordering using
a profile reduction algorithm (nz = 469151 in both panels).

The algorithm commonly used for reducing fill-in during factorization of a
sparse matrix is called the minimum degree algorithm. The idea behind the algorithm
is simple intuitively and one of the cheapest and the most effective computationally.
Fill-in and operation count are minimized locally by selecting, at each stage of the
elimination process and among all possible diagonal entries, that row and column
which introduces the least number of nonzeros in the resulting factor. It is quite
amazing that such a simple idea works so effectively. One of the problems with the
original implementation of the algorithm was that the total storage could not be
predicted beforehand. To alleviate this problem, George and Liu [9] introduced the
from fill-in to an extent that these large problems cannot be solved at a reasonable
cost even on state-of-the-art parallel machines. It is, therefore, essential to employ
solvers whose memory requirements are a small fraction of the storage demand of
the coefficient matrix. This necessitates the use of iterative algorithms instead of
direct solvers to preserve the sparsity pattern of the finite element matrix.
Especially attractive are iterative methods that involve the coefficient matrices
algorithm of this type is the conjugate gradient algorithm for solving positive definite
linear systems [10]. In this section, we will discuss some algorithms which have been
found effective in solving the sparse matrices that occur in our application. These are
the biconjugate gradient (BiCG) and the quasi-minimal residual (QMR) [11]
algorithms. A version of the generalized minimal residual (GMRES) method is also
presented. These algorithms can also be used for solving unsymmetric matrix
systems, as is the case with anisotropic materials.
The convergence pattern of the CG method for self-adjoint positive definite
(SPD) systems can be described by
where $E_n = r_n^T A^{-1} r_n$, $\kappa_2 = \lambda_{\max}/\lambda_{\min}$ is the
spectral condition number, and $n$ is the number of iterations. Axelsson [12] and
Van der Vorst [13] examine the convergence of the CG algorithm in detail. Its
convergence is shown to be superlinear, with a convergence rate that depends on the
distribution of the (mainly smallest) eigenvalues of $A$, rather than on the spectral
condition number. However, if the matrix $A$ is not too far from being positive
definite, which is the case with the matrix systems emerging from edge element
implementations, the BiCG and CG algorithms should still converge. Some
implementations espouse premultiplication of $A$ by $A^T$ to ensure positive
definiteness of the system. Unless the condition number is known a priori to be
small or the matrix is unitary, this is a very bad idea, since the convergence is
going to be drastically slow, as is evident from (9.10).
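To see why premultiplication is costly, it helps to write out the classical worst-case CG bound. The form below is the standard textbook estimate in the notation used above; it is offered as an assumption about the kind of bound (9.10) expresses, not as a transcription of it.

$$E_n \le 4\,E_0 \left(\frac{\sqrt{\kappa_2}-1}{\sqrt{\kappa_2}+1}\right)^{2n},
\qquad \kappa_2(A^T A) = \kappa_2(A)^2 .$$

Forming $A^T A$ squares the condition number, so the reduction factor per iteration changes from $(\sqrt{\kappa_2}-1)/(\sqrt{\kappa_2}+1)$ to $(\kappa_2-1)/(\kappa_2+1)$. As a rough illustration with $\kappa_2 = 100$, the factor goes from $9/11 \approx 0.82$ to $99/101 \approx 0.98$, and the number of iterations this bound requires for a $10^{-6}$ reduction grows from about 70 to about 700.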
The biconjugate gradient (BiCG) method is a variation of the CG algorithm.
This scheme is useful for solving unsymmetric systems; however, it performs equally
well when applied to symmetric systems of linear equations. For symmetric matrices,
BiCG differs from CG in the way the inner products of the vectors are taken. BiCG
usually converges much faster than CG; however, the convergence is highly erratic.
The BiCG algorithm for unsymmetric matrix problems is given in Fig. 9.4. For
symmetric positive definite matrix systems, the BiCG algorithm needs approximately
half the computational work and only one matrix-vector multiply operation. The
complete algorithm is presented later in the chapter (Fig. 9.18).
The conjugate gradient squared (CGS) algorithm [14] performs best when
applied to unsymmetric systems of linear equations. A big advantage of CGS over
BiCG when solving unsymmetric equation systems is that the matrix-vector product
only involves the matrix $A$ and not $A^T$. Figure 9.5 shows the CGS algorithm for
Initialization
    $x_0$ given
Repeat
    $\rho_n = \tilde{r}_n \cdot r_n$    (1)
    $\beta_n = \rho_n / \rho_{n-1}$    (2)
    $p_n = r_n + \beta_n p_{n-1}$    (3)
    $\tilde{p}_n = \tilde{r}_n + \beta_n \tilde{p}_{n-1}$    (4)
    $\alpha_n = \rho_n / (\tilde{p}_n \cdot A p_n)$    (5)
    $r_{n+1} = r_n - \alpha_n A p_n$    (6)
    $\tilde{r}_{n+1} = \tilde{r}_n - \alpha_n A^T \tilde{p}_n$    (7)
    $x_{n+1} = x_n + \alpha_n p_n$    (8)
EndRepeat
$A$ is a sparse complex unsymmetric matrix.
Initialization
    $x_0$ given; $\tilde{r}$ arbitrary
    $r_0 = b - A x_0$
    $q_0 = p_{-1} = 0$; $\rho_{-1} = 1$; $n = 0$
Repeat
    $\rho_n = \tilde{r} \cdot r_n$; $\beta_n = \rho_n / \rho_{n-1}$
    $u_n = r_n + \beta_n q_n$
    $p_n = u_n + \beta_n (q_n + \beta_n p_{n-1})$
    $v_n = A p_n$
    $\sigma_n = \tilde{r} \cdot v_n$; $\alpha_n = \rho_n / \sigma_n$    (6)
    $q_{n+1} = u_n - \alpha_n v_n$    (7)
    $r_{n+1} = r_n - \alpha_n A (u_n + q_{n+1})$    (8)
    $x_{n+1} = x_n + \alpha_n (u_n + q_{n+1})$    (9)
    $n = n + 1$
EndRepeat
or if
Galerkin condition that a BiCG residual must satisfy. The second type of breakdown
parallels the breakdown in the unsymmetric Lanczos process.
Freund [11] has proposed the quasi-minimal residual (QMR) algorithm with
look-ahead for solving linear equation systems. QMR eliminates the oscillations in
the BiCG residual norm and generates smooth, nearly monotonically converging
iterates. QMR also avoids breakdowns of the first and second kind: the latter is
corrected by using a look-ahead technique. Moreover, transpose-free QMR algorithms
exist for solving unsymmetric matrix systems. The reader is referred to [11] for algorithms
passing that in most cases, breakdowns do not occur; however, for the sake of
robustness, the look-ahead feature should be included in commercial codes. The QMR
algorithm for complex symmetric systems without look-ahead is presented in Fig. 9.6.
In most cases, QMR converges in about the same number of iterations as
BiCG with one significant difference. Since the algorithm minimizes the residual at
Figure 9.6 Quasi-minimal residual (QMR) algorithm for complex symmetric
matrices without look-ahead. ($A$ is a sparse complex symmetric matrix.)
$$\|r_n\| \le \|r_0\| \sqrt{n+1}\; |s_1 s_2 \cdots s_n| \qquad (9.13)$$
which is approximately five times larger than the true residual norm. Usually, the
upper bound is computed until we are very close to the tolerance and then the
Initialization
    x is given
    Define the (m + 1) x m matrix $H_m = [h_{ij},\ 1 \le i \le m+1 \text{ and } 1 \le j \le m]$
    Specify the number of spanning vectors m
START: $r_0 = b - Ax$
    Repeat2 for j = 1, ..., m
        Repeat1 for i = 1, ..., j
        End Repeat1
    End Repeat2
    Compute $y_m$ to minimize $\|\beta e_1 - H_m y\|_2$
    $V_m = \{v_1, v_2, \ldots, v_m\}$
    $x = x + V_m y_m$
    If convergence is achieved, then stop, else go to START
Figure 9.7 Restarted GMRES algorithm ($e_i$ refers to the $i$th column of the
identity matrix).
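The restart structure of Fig. 9.7 can be illustrated with a small, dense, real-valued sketch. Everything below is an assumption made for illustration: the Arnoldi loop uses modified Gram-Schmidt, the small least-squares problem is solved with Givens rotations, and the function names, tolerance, and test matrix are hypothetical rather than taken from the text.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def solve_hessenberg_ls(H, beta, k):
    """Minimise ||beta*e1 - H y||_2 for the leading (k+1) x k block of H."""
    R = [row[:k] for row in H[:k + 1]]
    g = [beta] + [0.0] * k
    for j in range(k):                        # Givens rotation zeroing R[j+1][j]
        d = math.hypot(R[j][j], R[j + 1][j])
        c, s = R[j][j] / d, R[j + 1][j] / d
        for col in range(j, k):
            a, b2 = R[j][col], R[j + 1][col]
            R[j][col], R[j + 1][col] = c * a + s * b2, -s * a + c * b2
        g[j], g[j + 1] = c * g[j], -s * g[j]
    y = [0.0] * k
    for i in range(k - 1, -1, -1):            # back substitution
        y[i] = (g[i] - sum(R[i][j] * y[j] for j in range(i + 1, k))) / R[i][i]
    return y

def gmres_restarted(A, b, m=2, tol=1e-10, max_restarts=1000):
    x = [0.0] * len(b)
    for _ in range(max_restarts):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]   # START: r0 = b - Ax
        beta = math.sqrt(dot(r, r))
        if beta < tol:
            break
        V = [[ri / beta for ri in r]]
        H = [[0.0] * m for _ in range(m + 1)]
        k = m
        for j in range(m):                    # outer loop over spanning vectors
            w = matvec(A, V[j])
            for i in range(j + 1):            # orthogonalise against v_1..v_j
                H[i][j] = dot(w, V[i])
                w = [wi - H[i][j] * vi for wi, vi in zip(w, V[i])]
            H[j + 1][j] = math.sqrt(dot(w, w))
            if H[j + 1][j] < 1e-14:           # Krylov space exhausted early
                k = j + 1
                break
            V.append([wi / H[j + 1][j] for wi in w])
        y = solve_hessenberg_ls(H, beta, k)
        for j in range(k):                    # x = x + V_m y_m
            x = [xi + y[j] * vj for xi, vj in zip(x, V[j])]
    return x

A = [[4.0, 1.0, 0.0, 0.0],
     [2.0, 5.0, 1.0, 0.0],
     [0.0, 1.0, 6.0, 2.0],
     [0.0, 0.0, 1.0, 3.0]]
b = [1.0, 2.0, 3.0, 4.0]
x = gmres_restarted(A, b, m=2)
residual = [bi - ai for bi, ai in zip(b, matvec(A, x))]
```

Only m + 1 basis vectors are stored at any time, which is the storage argument for restarting. The price is that a small m may need many restarts.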
may not converge at all. On the other hand, if m is large, an excessive amount of
work is needed since the CPU time is $O(m^2 N)$. More details about this solver are
simulating an enclosed transmission line. The GMRES solver was applied with only
$m = 12$ and converged after about 75 restarts. Its error history was monotonic.
Figure 9.8 Example convergence history of the BiCG, QMR and GMRES
algorithms. For GMRES, iterations refer to restarts. [Courtesy of Y. Botros.]
9.5 PRECONDITIONING
The condition number of a system of equations usually increases with the number of
unknowns. It is then desirable to precondition the coefficient matrix such that the
modified system is well conditioned and converges in significantly fewer iterations
than the original system. The equivalent preconditioned system is of the form
The preconditioners mentioned in the following section are the diagonal and the ILU
point and block preconditioners. Block preconditioners are usually preferable due to
reduced data movement between memory level hierarchies as well as a decreased
number of iterations required for convergence. Block algorithms are also suited for
high-performance computers with multiple processors since all scalar, vector, and
matrix operations can be performed with a high degree of parallelism.
9.5.1 Diagonal Preconditioner
The simplest preconditioner that can be used in iterative solvers is the point
diagonal preconditioner. The preconditioning matrix $C$ is a diagonal matrix which is
Figure 9.10 Convergence pattern for GMRES with and without diagonal
preconditioning for a standard problem (curves: GMRES without preconditioning,
GMRES with preconditioning; horizontal axis: iterations). [Courtesy of Y. Botros.]
multiplication time, and iteration count for convergence [12]. There is another flavor of
ILU called ILUT [20] which stores matrix elements only if they exceed a certain
threshold value relative to the diagonal. In the cases mentioned below, no attempt
was made to employ higher values of fill-in since the preconditioner already occupied
storage space equal to that of the coefficient matrix.
It is assumed that the data is stored in CSR format and that the column numbers for
each row are sorted in increasing order. The sparse matrix is stored in the vector 2>
and the column numbers in PC. SIQ(i) contains the total number of nonzeros up to the
ith row. The locations of the diagonal entries for each row are stored in the vector
TXLAQ. The preconditioner is stored in a complex vector, LU.
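The storage layout just described can be made concrete with a short sketch. The array names `values`, `col_idx`, `row_ptr`, and `diag_pos` are hypothetical stand-ins for the vectors named in the text, indices are 0-based here, and every row is assumed to have a nonzero diagonal entry.

```python
def csr_from_dense(A):
    """Build CSR arrays with column numbers sorted within each row."""
    values, col_idx, row_ptr = [], [], [0]
    for row in A:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))   # running count of nonzeros through this row
    return values, col_idx, row_ptr

def diagonal_positions(col_idx, row_ptr):
    """Location of each diagonal entry inside `values` (assumes it is stored)."""
    return [next(k for k in range(row_ptr[i], row_ptr[i + 1]) if col_idx[k] == i)
            for i in range(len(row_ptr) - 1)]

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A x using indirect addressing through col_idx."""
    return [sum(values[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(len(row_ptr) - 1)]

A = [[4, 0, 1],
     [0, 3, 0],
     [1, 0, 2]]
values, col_idx, row_ptr = csr_from_dense(A)
diag_pos = diagonal_positions(col_idx, row_ptr)
y = csr_matvec(values, col_idx, row_ptr, [1, 2, 3])
```

Keeping the diagonal locations in a separate vector lets a factorization loop split each row into its lower and upper parts without searching.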
begin
lu(ij)=e=lu(ij)/d
for k=lbeg+l step 1 until lend do
begin
kk=pc(k)
ik=srch(kk,i)
if (ik.ne.0) lu(ik)=lu(ik)-e*lu(k)
end
end
end
end
In comparison with the traditional ILU preconditioner given above, the
modified ILU preconditioner eliminates the inner loop over the integer variable k. The
modified algorithm basically scales the off-diagonal elements in the lower triangular
portion of the matrix by the column diagonal. Since the matrix is symmetric, it
retains the $LDL^T$ form and is also positive definite if the coefficient matrix is
positive definite. For our test cases, the modified ILU was especially helpful since the
traditional ILU preconditioned system may not have been positive definite, as
documented in [21]. The modified ILU preconditioner is also less expensive to
generate and converges in about 1/3 the number of iterations taken by the point
diagonal preconditioner. However, on vector architectures, the time taken by the two
preconditioning strategies is approximately the same since each iteration of the ILU
preconditioned system is about three times more expensive [22]. The forward and
backward substitutions are very difficult to parallelize and prove to be the bottleneck
since they are inherently sequential processes with vector lengths approximately half
that of the sparse matrix-vector multiplication process. The triangular solver is also
extremely difficult to parallelize [23].
One way to improve the parallelization of the ILU preconditioner is to use level
scheduling and self-scheduling [23]. In particular, level scheduling can be used to
increase parallelizability by taking advantage of matrix structure and sparsity. For
solving any lower triangular system $Lx = b$, the $i$th unknown in the forward solution
is given by
$$x_i = \frac{1}{l_{ii}} \Bigl( b_i - \sum_{j<i} l_{ij} x_j \Bigr) \qquad (9.16)$$
Level scheduling is based on this simple observation. The dependencies between the
unknowns can be modeled using a graph in which node $i$ corresponds to the unknown
$x_i$ and an
Thus $x_i$ can be solved at the $k$th step if all the components $x_j$ in (9.17) have been
computed in the earlier steps.
To implement the level scheduling algorithm, it is first necessary to define the
depth of a node and the level of the graph. The depth of a node is defined as the
maximum distance from the root [3]. Therefore, let us place an imaginary root node
with links to the nodes having no predecessors so that the depth of each node will be
defined from the same point. The depth of each node can now be computed with one
pass through the structure of the coefficient matrix $L$ by
$$\mathrm{depth}(i) = \begin{cases} 1 & \text{if } l_{ij} = 0 \text{ for all } j < i \\ 1 + \max_{j<i} \{ \mathrm{depth}(j) : l_{ij} \neq 0 \} & \text{otherwise} \end{cases} \qquad (9.18)$$
The number of levels of the graph, nlev, can be easily determined from the depth
information. To do so, let us define two other integer vectors: ORDER(i), which stores
the ordering of the rows of $L$ in terms of increasing node depth, and LEVEL(i), which
stores the index to the start of each level in ORDER(i).
do k=1,...,nlev
do j=ilevel(k),...,ilevel(k+1)-1 (parallel loop)
i=iorder(j)
execute Equation (9.17)
enddo
enddo
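The depth computation and the level-by-level forward solve can be sketched as follows. The representation of $L$ as one dictionary per row, the 0-based indexing, and the names `node_depths`, `build_levels`, and `forward_solve_by_levels` are all assumptions made for this illustration; the inner loop is written serially but its iterations are independent within a level.

```python
def node_depths(L_rows):
    """depth(i) = 1 if row i has no off-diagonal entries, else 1 + max depth of its predecessors."""
    depth = [0] * len(L_rows)
    for i, row in enumerate(L_rows):          # one pass; rows j < i are already done
        depth[i] = 1 + max((depth[j] for j in row if j < i), default=0)
    return depth

def build_levels(depth):
    """Rows sorted by increasing depth, plus the start index of each level."""
    nlev = max(depth)
    order = sorted(range(len(depth)), key=lambda i: (depth[i], i))
    level_start = [0] * (nlev + 1)
    for d in depth:
        level_start[d] += 1                   # count rows at each depth
    for k in range(1, nlev + 1):
        level_start[k] += level_start[k - 1]  # prefix sums give level boundaries
    return order, level_start

def forward_solve_by_levels(L_rows, b, order, level_start):
    x = [0.0] * len(b)
    for k in range(len(level_start) - 1):     # levels are processed in sequence
        for p in range(level_start[k], level_start[k + 1]):   # independent rows
            i = order[p]
            s = sum(l * x[j] for j, l in L_rows[i].items() if j < i)
            x[i] = (b[i] - s) / L_rows[i][i]
    return x

# Row i is stored as {column: value}, including the diagonal entry.
L_rows = [{0: 2.0}, {1: 1.0}, {0: 1.0, 2: 4.0}, {1: 2.0, 2: 1.0, 3: 1.0}, {4: 5.0}]
b = [2.0, 3.0, 6.0, 10.0, 5.0]
depth = node_depths(L_rows)
order, level_start = build_levels(depth)
x = forward_solve_by_levels(L_rows, b, order, level_start)
```

In this example three of the five unknowns sit in the first level and can be computed at once, while the remaining two must wait for their predecessors.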
However, in our experience, parallelizing the ILU preconditioned system with level
scheduling did not lead to significant speedup, mainly due to the enormous amount of
memory traffic that was generated. This observation was also made in [23], where
the authors estimated that the parallel algorithm generated as much as ten times
more traffic than the sequential code. The block ILU preconditioner considered next
reduces memory traffic and is thus more effective to parallelize.
In implementing the block ILU preconditioner, one block is distributed to each
processor in a multiprocessor architecture, thus achieving load balancing as well as
minimizing fill-in. The modified ILU decomposition outlined earlier is then carried
out on each of these individual blocks. Further, since the blocks are much larger than
in the block diagonal version, the preconditioner is a closer approximation to the
coefficient matrix. Moreover, the triangular solver is fully parallelized since each
processor solves an independent system of equations through forward and backward
substitution. For example, in an equation system with 20,033 unknowns, the number
of iterations was reduced to approximately half the number required by the diagonal
preconditioner. However, since the work done is less than twice that for the diagonal
preconditioner, only marginal savings of CPU time were achieved in this case. Also,
the number of iterations required for convergence is highly sensitive to block size, as
shown in Table 9.1 for the 20,033 unknown system. Table 9.1 clearly shows that a
larger block size (smaller number of blocks) does not guarantee faster convergence.
Nevertheless, there is an approximately 50 percent decrease in the number of
iterations over the point diagonal preconditioner, regardless of block size. The optimum
preconditioner for 28 blocks is given in Table 9.2 for a system having 224,476
unknowns. From the table, it is clearly observed that the block ILU preconditioner
is very effective in reducing the iteration count; however, the CPU time required is
only about 10 percent less than that required by the point diagonal preconditioner
for the best case.
1 127
2 176
4 185
8 172
12 162
16 174
24 223
28 177
Iterations
Angle of Incidence | Point Diagonal (I) | Block ILU (II) | Ratio (II/I)
In a nutshell, the simplest and the most effective preconditioner was found to be the
diagonal preconditioner. It is also amenable to vectorization and parallelization. The
ILU preconditioner should be employed when the matrix system is ill-conditioned
and vectorizability is not an issue. On most high performance PCs and scalar
workstations, the ILU preconditioner performs better than the diagonal one. For parallel
architectures, block ILU is clearly the method of choice. Matrix ordering strategies
that minimize matrix profile can further enhance the block ILU performance since
only a small fraction of resultant nonzeros will lie outside the blocks.
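$$R = I - AM \qquad (9.19)$$
where $M$ is the preconditioning matrix and $I$ refers to the identity matrix.
The Frobenius norm of any matrix $S$ of dimension $(m \times n)$ is given by
$$\|S\|_F = \Bigl( \sum_{i=1}^{m} \sum_{j=1}^{n} |s_{ij}|^2 \Bigr)^{1/2} \qquad (9.20)$$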
Referring to [17], this minimization can be achieved in two different ways. The
first is via the Global Iteration approach, which treats the matrix $M$ as an unknown
sparse matrix and minimizes the objective function in (9.19). One of the well-known
techniques that employs this approach is the Global Steepest Descent Method. Its
implementation is as follows:
Initialize M
Repeat for i = 1 till convergence
Update M
EndRepeat
The drawback of this technique is its high CPU time and memory cost (both of
order $n^2$). This is because the entire matrix is used during the minimization process.
However, the Column-Oriented Algorithm minimizes the individual functions
$$f_j(m) = \|e_j - A m_j\|_F, \qquad j = 1, 2, \ldots, n \qquad (9.21)$$
where $e_j$ and $m_j$ are the $j$th columns of the identity and preconditioner matrices,
respectively. The $F$ subscript implies the Frobenius norm defined in (9.20).
The algorithm for the Approximate Inverse Preconditioner is given in Fig.
9.11. Note that nt in the 'Repeat2' loop of this algorithm can be set as small as
Initialize M
Repeat1 for j = 1, ..., n
    Repeat2 for i = 1, ..., $n_i$
        $r_j = e_j - A m_j$
        $m_j = m_j + \alpha_j r_j$
        Apply numerical dropping to $m_j$
    EndRepeat2
EndRepeat1
Initialization
    x is given
    Specify the number of spanning vectors m
    Define the (m + 1) x m matrix $H_m = [h_{ij},\ 1 \le i \le m+1 \text{ and } 1 \le j \le m]$
START: $r_0 = b - Ax$
    $v_1 = r_0/\beta$
    Repeat1 for j = 1, ..., m
        $z_j = M_j^{-1} v_j$
        $w = A z_j$
        Repeat2 for i = 1, ..., j
            $h_{ij} = w \cdot v_i$
        EndRepeat2
    EndRepeat1
    Compute $y_m$ to minimize $\|\beta e_1 - H_m y\|_2$
    $x = x + Z_m y_m$
    If convergence is achieved, then stop, else go to START
9.6 EIGENANALYSIS
problem is essentially solved, and EISPACK has excellent black box routines to
do the job. However, the sparse eigenproblem is still an area of active research
and will be the main focus of this section.
It should be pointed out that the eigenvalue problem for a general matrix is
usually more difficult to solve than a set of linear equations. Since determination of
eigenvalues requires finding the roots of an $n$th-order polynomial, it is an essentially
iterative process, as polynomial equations cannot be solved algebraically for fifth-
and higher-order polynomials. However, before the onset of iterations, the system is
usually reduced to a convenient form for fast calculation of eigenvalues. The
reduction does not come cheaply and usually takes longer than the actual eigenvalue
calculation process. It is also in this reduction process that dense and sparse
problems are treated differently.
The standard eigenproblem is defined as
$$Ax = \lambda x \qquad (9.22)$$
where $\lambda$ denotes the eigenvalue of the matrix $A$ and $x$ represents the corresponding
eigenvector. The generalized eigenproblem found most commonly in finite element
analysis is given as
$$Ax = \lambda Bx \qquad (9.23)$$
where $\lambda$ is the eigenvalue of the $(A, B)$ pencil. Usually, $A$ is symmetric and $B$ is
Chapter 5, $B$ can be symmetric indefinite, which increases the computational rigor
significantly. The eigenproblem is usually reduced to a simpler form before starting
the iterative solution. The reduction is achieved in dense matrices by means of a
congruence transformation. The resulting eigenproblem amounts to solving for the
eigenvalues and eigenvectors of a symmetric tridiagonal matrix for symmetric
problems or an upper Hessenberg matrix when the original matrix is unsymmetric. A
tridiagonal matrix has nonzeros only in its diagonal and in its first upper and lower
codiagonals. An upper Hessenberg matrix has nonzeros only in its upper triangle,
diagonal and first lower codiagonal. The reduction to tridiagonal or upper
Hessenberg form is achieved by a series of rotations, called Givens rotations, or
Householder reflections. For detailed information on these algorithms, the reader
is referred to [24]. Givens rotations are used when the matrix is sparse and banded
since it is possible to carry out these rotations such that no nonzeros are introduced
outside the band. Once the tridiagonal or the upper Hessenberg form has been
obtained, there exist powerful techniques like the QR and QL algorithms to
determine all the eigenvalues of the system. Such algorithms are readily available in
source code from the EISPACK library through netlib [4]. However, there are two
restrictions in the above approach: (1) the matrix needs to be banded and (2) the
entire spectrum of eigenvalues is computed. The necessity of a banded matrix is a
prohibitive requirement for finite element meshes with arbitrary sparsity patterns.
The computation of the entire spectrum can be avoided using the bisection method
based on Sturm sequences for calculating eigenvalues within a specified interval [24].
A.|, A,2,..., A.,, have been found along with the eigenvectors x2
\320\264\320\2631, then the
\320\273\320\263\",
However, the outlook is not so rosy in practice: round-off errors rapidly lead to loss
of orthogonality, and re-orthogonalization is necessary from time to time. The
criterion for re-orthogonalization is usually when $|\lambda_1/\lambda|$ is larger than a prespecified
tolerance, where $\lambda$ is the current estimate of the eigenvalue and $\lambda_1$ is the dominant
eigenmode. As we will see later, loss of orthogonality is the bane of sparse iterative
eigensolvers and cannot be avoided in finite precision arithmetic, leading to storage
and time constraints if interior eigenvalues are required in addition to the extremal
ones.
Initialization
    Choose any column vector $x_0$, say $e_1$
Repeat until convergence
    $x_{k+1} = A x_k / \|A x_k\|$
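A minimal sketch of the power iteration is given below for a small dense symmetric matrix. The stopping test on successive Rayleigh quotients, the tolerance, and the helper names are assumptions chosen for illustration and are not taken from the figure.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def power_iteration(A, x0, tol=1e-12, max_iter=1000):
    """Estimate the dominant eigenpair of A by repeated multiplication."""
    norm = math.sqrt(dot(x0, x0))
    x = [xi / norm for xi in x0]
    lam_old = 0.0
    lam = 0.0
    for _ in range(max_iter):
        y = matvec(A, x)
        lam = dot(x, y)                      # Rayleigh quotient, since ||x|| = 1
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]          # x_{k+1} = A x_k / ||A x_k||
        if abs(lam - lam_old) < tol:
            break
        lam_old = lam
    return lam, x

A = [[2.0, 1.0],
     [1.0, 2.0]]                             # eigenvalues 3 and 1
lam, x = power_iteration(A, [1.0, 0.0])      # start from e_1
```

The error shrinks by roughly the ratio of the second-largest to the largest eigenvalue magnitude at each step, so closely spaced dominant eigenvalues make the plain method slow.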
strip line is a case in point. The dominant mode is usually the one closest to the
maximum wavenumber supported by the dielectric medium. However, predicting an
eigenvalue close to the desired one is often not an easy task. Algorithms like the
determinant search method use preliminary guesses and the properties of Sturm
sequences to predict eigenvalues near the desired ones.
Once an educated guess can be made regarding the desired eigenvalue, the
eigenproblem becomes somewhat easier to solve. Shifts of origin can be carried
out to improve the performance of the algorithm. This works on the principle that
the same eigenvectors as $A$. In this way, interior eigenvalues can be calculated once
the neighborhood of the eigenvalue can be determined. The method converges lin-
inverse problem
(9.24)
where $A$ is symmetric and nonsingular. The method of inverse iteration with shifts
Fig. 9.14. The shift factor is usually taken close to the desired eigenvalue. The
method requires the solution of a linear system of equations at each step. It can
be accelerated by factorizing the matrix $B$ using sparse techniques at the beginning of
the iterative procedure. In the inverse method with shifts of origin, the system to be
solved at every iteration is
$$B y_{k+1} = (A - \sigma B) x_k$$
Initialization
    Choose any column vector, say $e_1$
Repeat
    Solve $B y_{k+1} = (A - \sigma B) x_k$
    [...] $x_k$ then stop
generalization of the power method described in the previous section. This method
as well as the subsequent Lanczos algorithm rests on the concept of Rayleigh
matrices, Ritz values, and Ritz vectors. Let us consider $(q_1, q_2, \ldots, q_m)$ as an
orthonormal basis of a subspace $S$, invariant under $A$, arranged in the form of an
$n \times m$ matrix $Q$. Then $C = Q^T A Q$ is a square symmetric Rayleigh matrix of order
$m$. It can be shown that the eigenvalues of $C$ are also eigenvalues of $A$ and that the
corresponding eigenvectors of $A$ equal $Qy$, where $y$ is the eigenvector of $C$. The
advantage is that a much smaller matrix $C$ of order $m \ll n$ yields the desired
extremal eigenvalues of $A$. The eigenvalues of $C$ are the best approximations to the
eigenvalues of $A$ and are known as Ritz values, and the corresponding eigenvectors
are known as Ritz vectors. Thus, if $r_i = A x_i - \mu_i x_i$ is the residual vector for the
Ritz pair $(\mu_i, x_i)$, then there is an eigenvalue of $A$ in the interval
$[\mu_i - \|r_i\|, \mu_i + \|r_i\|]$.
subspace.
Initialization
    Choose any orthonormal basis $V_0$ of dimension $n \times k$, where $k \ll n$
    m = 1
Repeat for m = 1, 2, ...
    Orthonormalize $C$ such that $C = QR$, with $Q$ unitary and $R$ upper triangular
    Find eigenvalues of $RR^T$
    Spectral decomposition of $RR^T = PDP^T$,
        where $D$ is diagonal with $D_{ii} = \mu_i$ and $P$ is unitary
    $V_m = QP$
    If residual vector < tolerance, convergence achieved
End Repeat
cases, simultaneous iteration provides a powerful tool for extracting the extremal
eigenvalues. If an educated guess can be made regarding the eigenvalue, the inverse
iteration with shift is used for finding the eigenvalues of large, sparse systems. As
shown earlier, the shifted generalized eigenproblem is defined as
pencil is symmetric indefinite. Note that the sparse system $A - \sigma B$ needs to be solved
at each step for multiple right-hand sides. It is convenient to do this using one of the
sparse direct solving strategies outlined in the earlier section. The solution of the
generalized order of the $(H_A, H_B)$ pencil, is much smaller than $n$, the order of the
$(A, B)$ pencil.
Initialization
    Choose any $U_0$ of dimension $n \times k$, where $k \ll n$. $\lambda_0$ is also available.
    m = 1
    Solve $(A - \sigma B) C = R$ for $C$
    $H_B = C^T B C$
    Solve the generalized eigenproblem:
        $(H_A - \sigma H_B) P = \lambda_m H_B P$,
        where $P$ is $H_B$-orthogonal
    $V_m = CP$
    If residual vector < tolerance, convergence achieved
End Repeat
Figure 9.16 Simultaneous iteration with shift for the generalized eigenproblem.
One of the problems with this method is that $k$, the size of the desired subspace,
is not known a priori. However, $k$ can be modified within the iteration process by
adding new columns to the basis or by deflating the basis from the converged
eigenvectors.
9.6.3 Lanczos Algorithm
The Lanczos algorithm results when the initial guess for the orthonormal basis
is drawn from the Krylov subspace. Therefore, if A is an arbitrary nonzero vector, a
Krylov subspace is defined as
• The computation of the orthonormal Lanczos basis can be done through a
  three-term recurrence relation.
• Convergence to the eigenvalues is very rapid.
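The three-term recurrence can be sketched for a small dense symmetric matrix. The sketch assumes the usual definition of the Krylov subspace, $\mathcal{K}_m(A, q_1) = \mathrm{span}\{q_1, Aq_1, \ldots, A^{m-1} q_1\}$, and the standard recurrence $r = A q_m - \beta_{m-1} q_{m-1}$, $\alpha_m = q_m \cdot r$, $\beta_m q_{m+1} = r - \alpha_m q_m$; the function name and data layout are hypothetical, and no re-orthogonalization is performed.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def lanczos(A, q1, m):
    """Return the diagonal (alphas) and codiagonal (betas) of the Lanczos
    tridiagonal matrix, together with the basis vectors Q."""
    norm = math.sqrt(dot(q1, q1))
    q = [v / norm for v in q1]
    q_prev = [0.0] * len(q1)
    beta_prev = 0.0
    Q, alphas, betas = [q], [], []
    for _ in range(m):
        r = matvec(A, q)
        r = [ri - beta_prev * qp for ri, qp in zip(r, q_prev)]
        alpha = dot(q, r)
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        alphas.append(alpha)
        beta = math.sqrt(dot(r, r))
        if len(alphas) == m or beta < 1e-14:   # done, or invariant subspace found
            break
        betas.append(beta)
        q_prev, q, beta_prev = q, [ri / beta for ri in r], beta
        Q.append(q)
    return alphas, betas, Q

A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
alphas, betas, Q = lanczos(A, [1.0, 0.0, 0.0], 3)
```

Only the two most recent basis vectors enter each step, which is why the method suits large sparse matrices. The eigenvalues of the small tridiagonal matrix built from `alphas` and `betas` are the Ritz values.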
superior convergence properties than the power method. Lanczos converges
approximately as $(2\kappa)^{2(n-1)}$, whereas the power method converges as
$\kappa^{2(n-1)}$, where $n$ is the
$r_m = A q_m - \beta_{m-1} q_{m-1}$
$\alpha_m = q_m \cdot r_m$
is symmetric indefinite, its Cholesky factorization does not exist and consequently,
the product $B^{-1}A$ will usually be unsymmetric. The QZ algorithm is the method of
choice for solving full unsymmetric generalized eigenproblems. The Lanczos tri-
9.7 PARALLELIZATION
As mentioned earlier, there are two problems which limit the vectorizability of a
sparse matrix code: short vector lengths and indirect addressing. There is not much
to be done about the second problem since sparse matrices must have indirect
addressing to exploit the O(N) storage feature. However, the first problem can be
removed by storing the matrix in an optimizable machine-dependent format. The
jagged diagonal method of matrix storage is a case in point. The still slower
execution speed of the matrix-vector multiply compared with the vector update is due
to the indirect addressing in the inner loop, which causes memory contention. On a
distributed memory architecture, the second problem can also be partly removed by
keeping local copies of the desired vector in each processor. The subsequent gather
and scatter operations of the updated vectors then consume the majority of the
processor communication time.
Below we discuss the implementation of a finite element code on two different
types of massively parallel architectures:² the KSR1 and the Intel iPSC/860. The
KSR1 is a parallel machine which implements a shared virtual memory, although the
memory is physically distributed for the sake of scalability. The Intel iPSC/860, on
the other hand, is a distributed memory, Multiple Instruction, Multiple Data
• parallelization of DO loops
  Parallelism is introduced by allowing each processor to execute a portion of
  the DO loop.
• distribution of arrays among the processor set
  Since each processor only has a limited amount of memory, each array is
  divided into smaller units that reside on each node. This also allows array
  accesses from each processor to be serviced by different nodes, thus reducing
  contention for resources on any single node.
On a cache-only memory machine such as the KSR1, only the first step is necessary
since the hardware cache system automatically takes care of data distribution among
the processors. This makes porting codes to the KSR1 quite easy. However, the
increased control of data distribution and communication on the iPSC/860 can
translate into improved performance for some applications. The data distribution on mes-
²The KSR1 is no longer available, and the Intel iPSC/860 is being phased out.
Nevertheless, these parallel platforms represent examples of distributed memory
architectures.
1. KSR1 Port
The most important aspect of parallelization involves optimizing the iterative solver.
For the sake of simplicity, we consider the complex symmetric BiCG solver.
Figure 9.18 shows the symmetric BiCG algorithm; the unsymmetric method
given in Fig. 9.4 contains an additional matrix-vector multiply and a few additional
vector updates. For a system of equations containing N unknowns, all vectors in the
algorithm are of size N and the sparse matrix A is of order N. Table 9.3 shows the
operation counts per iteration for each type of vector operation, where nze denotes
the number of nonzero elements in the sparse matrix. In the finite element code, each
vector operation is implemented as a loop and parallelization is achieved by tiling
these loops. For P processors, the vectors are divided into P sections of N/P
consecutive elements. Each processor is assigned the same section of each vector. This
Initialization:
    $x$ given
    $r = b - Ax$; $p = r$; $tmp = r \cdot r$
Repeat until resd ≤ tol
    $q = Ap$    (1)
    $\alpha = tmp/(q \cdot p)$    (2)
    $x = x + \alpha p$    (3)
    $r = r - \alpha q$    (4)
    $q = C^{-1} r$    (5)    Step 1
    $resd = \sqrt{r \cdot r^*}$    (6)
    $\beta = (r \cdot q)/tmp$    (7)
    $tmp = \beta \times tmp$    (8)
    $p = q + \beta p$    (9)    Step 3
End Repeat
$A$ is a sparse complex symmetric matrix.
$C$ is the preconditioning matrix.
$q$, $p$, $x$, $r$ are complex vectors.
$\alpha$, $\beta$, tmp are complex scalars; resd, tol are real scalars.
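The nine numbered lines of the symmetric BiCG iteration in Fig. 9.18 can be sketched as below. This is an illustrative sketch under several assumptions: the matrix is small and dense, the preconditioner $C$ is taken to be the diagonal of $A$, the initialization applies the preconditioner to the first search direction, and the function and variable names are hypothetical. Note that the inner products are not conjugated, which is what distinguishes the complex symmetric case from CG.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def udot(u, v):
    """Unconjugated inner product, as used for complex symmetric systems."""
    return sum(ui * vi for ui, vi in zip(u, v))

def bicg_symmetric(A, b, diag, tol=1e-10, max_iter=50):
    x = [0j] * len(b)
    r = list(b)                                   # r = b - A x with x = 0
    p = [ri / di for ri, di in zip(r, diag)]      # p = C^{-1} r
    tmp = udot(r, p)
    bnorm = math.sqrt(sum(abs(bi) ** 2 for bi in b))
    resd = bnorm
    for _ in range(max_iter):
        q = matvec(A, p)                                   # line 1
        alpha = tmp / udot(q, p)                           # line 2
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # line 3
        r = [ri - alpha * qi for ri, qi in zip(r, q)]      # line 4
        q = [ri / di for ri, di in zip(r, diag)]           # line 5: q = C^{-1} r
        resd = math.sqrt(sum(abs(ri) ** 2 for ri in r))    # line 6
        if resd <= tol * bnorm:
            break
        beta = udot(r, q) / tmp                            # line 7
        tmp = beta * tmp                                   # line 8
        p = [qi + beta * pi for qi, pi in zip(q, p)]       # line 9
    return x, resd

A = [[4 + 1j, 1, 0],
     [1, 3 + 2j, 1j],
     [0, 1j, 5 - 1j]]                             # complex symmetric, not Hermitian
b = [1, 2j, 3]
x, resd = bicg_symmetric(A, b, diag=[A[i][i] for i in range(3)])
true_res = [bi - ai for bi, ai in zip(b, matvec(A, x))]
```

Each pass through the loop contains one matrix-vector product and three inner products (lines 2, 6, and 7), which matches the operations that the communication analysis in this section is concerned with.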
Operation | Complex × | Complex + | Real × | Real +
Note that the dot products in lines 6 and 7 require only one synchronization. The
the result vector by multiplying the corresponding block of rows of the sparse matrix
with the operand vector. Since the operand vector is distributed among the
processors, data communication is required. The communication pattern is determined
by the sparsity structure of the matrix, which is derived from the unstructured mesh.
Therefore, the communication pattern is unstructured and irregular. Vector updates
and dot products are easily parallelized using the same block distribution as in the
communication pattern. However, the previous scheme was easily and efficiently
implemented on the KSR1 Massively Parallel Processing machine thanks to the global
address space [29]. Table 9.4 shows the execution time of one iteration (in seconds)
and the speedup for different numbers of processors and for two problem sizes.
TABLE 9.4 Execution Time and Speedup for the Iterative Solver
N = 20,033 | N = 224,476
(a) For 1, 8, and 16 processors, only the first 100 iterations were run.
(b) Code run on a 64-node KSR at Cornell University.
Figure 9.19 Speedup curve for the linear equation solver on the KSR1 (curves:
linear, measured (solver), measured (matgen); horizontal axis: number of
processors).
each processor compute the elemental matrix of the elements it owns, and update the
global sparse matrix. Since the global sparse matrix is shared by all processors, the
update needs to be done atomically. On the KSR1 this can be done by using the
hardware lock mechanism.
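The idea of guarding updates to a shared global matrix can be sketched with ordinary threads. This is only an analogy under stated assumptions: a single software lock from Python's `threading` module stands in for the KSR1 hardware lock, the element matrices are supplied rather than computed, and the function name `assemble` and the dictionary storage are hypothetical.

```python
import threading


def assemble(elements, nthreads):
    """Assemble element matrices into a shared global matrix {(i, j): value}.

    elements is a list of (node_ids, local_matrix) pairs. Each thread handles
    the elements it owns and adds their contributions under a lock.
    """
    global_matrix = {}
    lock = threading.Lock()   # stand-in for a hardware lock (assumption)

    def worker(owned):
        for nodes, local in owned:
            # The element matrix is private to this thread, so no lock is
            # needed until the shared global matrix is touched.
            for a, i in enumerate(nodes):
                for b, j in enumerate(nodes):
                    with lock:   # read-modify-write must not interleave
                        global_matrix[(i, j)] = (
                            global_matrix.get((i, j), 0.0) + local[a][b]
                        )

    threads = [
        threading.Thread(target=worker, args=(elements[t::nthreads],))
        for t in range(nthreads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return global_matrix
```

A single global lock serializes every update; locking at a finer granularity (for example per row) would reduce contention at the cost of more bookkeeping.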
The performance for the matrix assembly is given in Table 9.5 and also in Fig.
9.19.
TABLE 9.5 Execution Time and Speedup for the Matrix Generation
and Assembly (20,033 unknowns)
Processors    Execution time    Speedup
 1            24.355             1
 2            13.376             1.8
 4             6.811             3.6
 8             3.744             6.5
16             1.89             12.9
25             1.625            15.0
28             1.276            19.1
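The speedup column can be checked directly from the timings, taking speedup as the one-processor time divided by the P-processor time. The short sketch below does that arithmetic and also reports parallel efficiency (speedup divided by P), which is an added quantity not given in the table. The processor counts are used exactly as printed.

```python
# Execution times from Table 9.5, keyed by number of processors as printed.
times = {1: 24.355, 2: 13.376, 4: 6.811, 8: 3.744, 16: 1.89, 25: 1.625, 28: 1.276}


def speedup_table(times):
    """Return {P: (speedup, efficiency)} relative to the one-processor time."""
    t1 = times[1]
    return {p: (t1 / t, t1 / (t * p)) for p, t in sorted(times.items())}


if __name__ == "__main__":
    for p, (s, e) in speedup_table(times).items():
        print(f"{p:3d} processors: speedup {s:5.1f}, efficiency {e:4.2f}")
```

Rounded to one decimal place, the computed speedups reproduce the tabulated values; on 28 processors the efficiency comes out at roughly 0.68.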
9.7.1 Analysis of Communication
communication at all. The distribution of the nonzero entries in the matrix affects the amount
[Figure: plot against Thread ID (0 to 24), vertical axis up to 350.]
of executing the poststore instruction in Step 3 offsets the reduction in execution time
of Step 1. On a poststore, the processor typically stalls for 32 cycles while the local
cache is busy for 48 cycles. As a result, the net reduction in execution time is only 3
percent.
Line 9. Before proceeding with the updates of the N/P elements of p for
which it is responsible, each processor must acquire exclusive ownership for those
Lines 2, 6, 7. The rest of the communication is due to the three dot products.
Each processor computes the dot product for the vector subsection that it owns.
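A minimal sketch of this pattern follows, with each "processor" represented by a slice of the vectors. It also shows how the two dot products of lines 6 and 7 can share a single reduction: each processor accumulates both partial sums in one pass, and the pair is combined once. The function names are hypothetical, and it is assumed here that line 6 forms the squared 2-norm of `r` while line 7 forms the unconjugated product of `r` and `q`.

```python
import math


def fused_partial_sums(r_block, q_block):
    """Partial sums over one processor's subsection for lines 6 and 7."""
    rr = sum(abs(ri) ** 2 for ri in r_block)               # contributes to resd
    rq = sum(ri * qi for ri, qi in zip(r_block, q_block))  # contributes to r . q
    return rr, rq


def reduce_once(partials):
    """Combine all partial pairs in one step (one synchronization point)."""
    rr = sum(p[0] for p in partials)
    rq = sum(p[1] for p in partials)
    return math.sqrt(rr), rq


def distributed_dots(r, q, nproc):
    """Simulate nproc processors, each owning len(r)/nproc consecutive elements."""
    block = len(r) // nproc   # assumes len(r) is divisible by nproc
    partials = [
        fused_partial_sums(r[k * block:(k + 1) * block], q[k * block:(k + 1) * block])
        for k in range(nproc)
    ]
    return reduce_once(partials)
```

On a real machine the reduction would be a barrier or a collective operation; here it is simply a sum over the list of partial results.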
for parallelizing the DO loops on the iPSC/860 is similar to the KSR1 with each
processor executing a portion of the DO loop. This scheme works fine as long as
there are no dependencies in the body of the loop, as is the case for the vector
updates and the sparse matrix-vector multiply of the linear solver. However, the
main loop in the matrix generation/assembly phase contains a dependency between
loop iterations. As on the KSR1, this problem is solved by using a mechanism in
which each processor locks a row of the matrix while performing an update. Since
the locking of each row is maintained by the processor whose memory holds the
particular row, processors lock and unlock rows by sending messages to the
appropriate row owner.
Even though the parallelization of loops enables programs to run faster on
multiprocessors, the distribution of arrays must be done for the code to run at all.
Arrays are distributed in the code by partitioning the data along one array dimension
among the processors. Thus for a 1000-element array, processor 1 holds the first 100
elements, processor 2 the next 100, and so on. The straightforward method for
accessing this distributed array involves the translation of array references into
subroutine calls. Thus an expression x = a(i) is translated into the call call fetcha(i, x).
The subroutine fetcha then sends a message to the processor that holds element a(i),
which in turn sends a reply message with the value of a(i). Although this scheme
requires the implementation of a new subroutine for each distributed array and the
replacement of each array access with a subroutine call, the process is easy and
mechanical.
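The access scheme just described can be mimicked in a few lines. The sketch below is a single-process simulation, not message-passing code: the class names, the `handle_request` method, and the message counter are illustrative assumptions, while `fetcha`, the 1-based element index, and the 100-elements-per-processor layout follow the text.

```python
class Processor:
    """Holds one consecutive block of a distributed array (simulated)."""

    def __init__(self, rank, first_index, values):
        self.rank = rank
        self.first_index = first_index   # global 1-based index of local[0]
        self.local = list(values)

    def handle_request(self, i):
        """Reply to a request message with the value of a(i)."""
        return self.local[i - self.first_index]


class DistributedArray:
    """Array a(1..n) partitioned along one dimension among nproc processors."""

    def __init__(self, values, nproc):
        if len(values) % nproc != 0:
            raise ValueError("sketch assumes equal-sized blocks")
        self.block = len(values) // nproc
        self.procs = [
            Processor(rank + 1, rank * self.block + 1,
                      values[rank * self.block:(rank + 1) * self.block])
            for rank in range(nproc)
        ]
        self.messages = 0

    def owner(self, i):
        """1-based rank of the processor that holds element a(i)."""
        return (i - 1) // self.block + 1

    def fetcha(self, i):
        """Stand-in for 'call fetcha(i, x)': one request plus one reply."""
        self.messages += 2
        return self.procs[self.owner(i) - 1].handle_request(i)
```

Every element access costs two messages in this scheme, which is what makes it simple to apply and, as discussed next, slow.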
The scheme mentioned above does not, however, result in good performance.
The primary reason for this is that the overhead for sending a message is much
higher than that of sending a single byte. The cost for sending ten or even 100
bytes is usually not much higher than that of sending 1 byte. Thus, messages need
to be "bundled" for fast and efficient operation. However, the simple strategy
mentioned above is in direct contrast to message bundling. One way of overcoming
this conflict is to implement the simple scheme for parts of the code that do not take
up a significant portion of computation time, like the matrix generation/assembly
phase, and a better scheme for accessing the distributed arrays in the equation solver
phase.
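The benefit of bundling can be seen with a simple linear cost model, in which a message costs a fixed startup overhead plus a per-byte transfer cost. The model and the numbers used below (a startup of 1000 units and 1 unit per byte) are purely hypothetical and are not measurements of the iPSC/860.

```python
def message_time(nbytes, startup, per_byte):
    """Cost of one message under a linear model: startup + nbytes * per_byte."""
    return startup + nbytes * per_byte


def compare(n_values, bytes_per_value, startup, per_byte):
    """Return (cost of n separate messages, cost of one bundled message)."""
    one_by_one = n_values * message_time(bytes_per_value, startup, per_byte)
    bundled = message_time(n_values * bytes_per_value, startup, per_byte)
    return one_by_one, bundled


if __name__ == "__main__":
    separate, bundled = compare(100, 1, startup=1000, per_byte=1)
    print("100 one-byte messages:", separate)
    print("one 100-byte message: ", bundled)
```

With these assumed values a 100-byte message costs 1100 units against 1001 for a single byte, while sending the same 100 bytes one at a time costs 100100 units.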
The primary operation in the solver that generates communication is the sparse
matrix-vector product. Since the matrix-vector product involves performing a dot
product of each row with the distributed vector, each processor must obtain the
values for the entire vector from the other processors. The dot product operation
must be carried out in several phases as each processor may not be able to hold the
entire vector in memory. Thus, each processor P begins the matrix-vector multiply
by sending its portion of the vector to other processors, then performs the following
tasks for every other processor P':
• Reads the portion of the vector owned by P'.
• Updates the partial dot product for each row by adding the product of the
  appropriate matrix element with the elements of the partial vector.
After performing the above operations for all the processors, the dot product is
complete. Unfortunately, each phase requires a pass over all the sparse matrix
rows owned by the processor. For better parallel performance, each row of the
matrix must be sorted to allow the phases to pass over the rows in order. It was
found that the problem scaled reasonably well for a small number of processors.
However, as the number of processors increased, more time was spent on
communication and bookkeeping than on true computation.
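The phased matrix-vector product described above can be sketched as a serial simulation. Each simulated processor owns a block of rows, receives the vector one portion at a time, and adds that portion's contribution to its partial dot products. The function name, the row-list storage, and the equal block sizes are assumptions; the processor's own portion is treated here as one more phase.

```python
def phased_matvec(rows, x, nproc):
    """Compute y = A x in phases; rows[i] is a list of (column, value) pairs."""
    n = len(x)
    if n % nproc != 0:
        raise ValueError("sketch assumes equal-sized blocks")
    block = n // nproc
    y = [0] * n
    for rank in range(nproc):                    # each simulated processor P
        lo, hi = rank * block, (rank + 1) * block
        partial = [0] * block                    # partial dot products of owned rows
        for src in range(nproc):                 # one phase per vector portion
            s_lo, s_hi = src * block, (src + 1) * block
            portion = x[s_lo:s_hi]               # portion "read" from processor src
            for k, i in enumerate(range(lo, hi)):
                for j, val in rows[i]:           # full pass over every owned row
                    if s_lo <= j < s_hi:
                        partial[k] += val * portion[j - s_lo]
        y[lo:hi] = partial
    return y
```

The inner scan visits every stored entry of every owned row in every phase, which is the repeated pass noted in the text. If the column indices of each row are sorted, a per-row cursor could resume where the previous phase stopped instead of rescanning.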
REFERENCES
18:719-741, 1992.
[24] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ.
Press, Baltimore, MD, 1983.
[25] C. H. Reinsch. A stable rational QR algorithm for the computation of
eigenvalues of a hermitian, tridiagonal matrix. Numer. Math., 25:591-597, 1971.
[26] J. Cullum and R. A. Willoughby. Lanczos algorithms for large symmetric
eigenvalue computations. Progress in Scientific Computing Series. Birkhauser Boston
Inc., 1983.
Arindam Chatterjee obtained his Ph.D. from the University of Michigan in 1994.
From 1989 to 1994, he served as a research assistant and later as a Research Fellow
in the Radiation Laboratory, University of Michigan, Ann Arbor. His work there
dealt with the development, implementation, and application of the finite element
Leo C. Kempel is a senior research engineer in Mission Research Corporation's
Electromagnetic Observables Sector. He received his Ph.D. from the University of
About the Authors