Finite Element Method For Electromagnetics

This document provides an overview and summary of the book "Finite Element Method for Electromagnetics" by John L. Volakis, Arindam Chatterjee, and Leo C. Kempel. The book covers the theory, development, implementation, and applications of the finite element method for solving electromagnetic problems. It begins with the basic theory and variations of the finite element method, and provides modern applications to both open and closed-domain problems in 2D and 3D. The book is intended for graduate students, engineers, and scientists working with computational electromagnetics.

Uploaded by

Felipe

Finite Element Method for Electromagnetics
Antennas, Microwave Circuits, and Scattering Applications

John L. Volakis
Arindam Chatterjee
Leo C. Kempel

The IEEE/OUP Series on Electromagnetic Wave Theory
Donald G. Dudley, Series Editor
IEEE Press

Employed in a large number of commercial electromagnetic simulation packages, the finite element method is one of the most popular and well-established numerical techniques in engineering. This book covers the theory, development, implementation, and application of the finite element method and its hybrid versions to electromagnetics.

Finite Element Method for Electromagnetics begins with a step-by-step presentation of the finite element method and its variations, and then provides up-to-date coverage of three-dimensional formulations and modern applications to open- and closed-domain problems. Topics covered include:

Galerkin's and Ritz methods
One- and two-dimensional theory and applications
Three-dimensional development of the method using edge elements and applications
Mesh truncation schemes
Integral algorithms for fast implementation of the boundary integral matrix-vector products
MATLAB sample codes
Efficient implementation of the finite element method, sparse matrix storage schemes, popular iterative solvers, eigenvalue solutions
Experiences on code porting to parallel computers

Written by experts who have extensive experience in both teaching and implementing this method for many applications, Finite Element Method for Electromagnetics can be used as a textbook for first-year graduate students, as well as a handy reference for engineers and scientists interested in computational electromagnetics.

John L. Volakis is professor at the Department of Electrical Engineering and Computer Science at the University of Michigan. He has published more than 140 refereed journal articles and more than 140 conference papers on numerical and analytical techniques in electromagnetics. Dr. Volakis is also coauthor of Approximate Boundary Conditions in Electromagnetics (IEE Press, 1995) and several book chapters.

Arindam Chatterjee has developed three-dimensional computer simulation of electromagnetic fields for scattering and microwave circuits, and is currently a member of the finite element development group for the HFSS finite element commercial package at Hewlett-Packard.

Leo C. Kempel developed three-dimensional antenna simulation packages using the finite element-boundary integral method and has extensive experience with all popular numerical techniques in electromagnetics. He is currently at Mission Research Corporation, Florida, conducting research and development on all aspects of electromagnetics.

Formerly the IEEE Press Series on Electromagnetic Waves, this joint series between IEEE Press and Oxford University Press offers outstanding coverage of the field, with new titles as well as reprintings and revisions of recognized classics that maintain long-term archival significance in electromagnetic waves and applications. Designed specifically for graduate students, practicing engineers, and researchers, this series provides affordable volumes that explore electromagnetic waves and applications beyond the undergraduate level.

Published by John Wiley and Sons, Inc.

IEEE Press
445 Hoes Lane
P.O. Box 1331
Piscataway, NJ 08855-1331
1-800-678-IEEE (Toll-Free U.S.A. and Canada)
or 1-732-981-0060

IEEE Order No. PC5698
FINITE ELEMENT METHOD
FOR ELECTROMAGNETICS
IEEE/OUP SERIES ON ELECTROMAGNETIC WAVE THEORY

The IEEE/OUP Series on Electromagnetic Wave Theory consists of new titles as well as reprintings and revisions of recognized classics that maintain long-term archival significance in electromagnetic waves and applications.

Series Editor
Donald G. Dudley, University of Arizona

Advisory Board
Robert E. Collin, Case Western Reserve University
Akira Ishimaru, University of Washington
D. S. Jones, University of Dundee

Associate Editors
Electromagnetic Theory, Scattering, and Diffraction: Ehud Heyman, Tel-Aviv University
Differential Equation Methods: Andreas C. Cangellaris, University of Arizona
Integral Equation Methods: Donald R. Wilton, University of Houston
Antennas, Propagation, and Microwaves: David R. Jackson, University of Houston

BOOKS IN THE IEEE/OUP SERIES ON ELECTROMAGNETIC WAVE THEORY

Christopoulos, C., The Transmission-Line Modeling Method: TLM
Clemmow, P. C., The Plane Wave Spectrum Representation of Electromagnetic Fields
Collin, R. E., Field Theory of Guided Waves, Second Edition
Dudley, D. G., Mathematical Foundations for Electromagnetic Theory
Elliott, R. S., Electromagnetics: History, Theory, and Applications
Felsen, L. B., and Marcuvitz, N., Radiation and Scattering of Waves
Harrington, R. F., Field Computation by Moment Methods
Jones, D. S., Methods in Electromagnetic Wave Propagation, Second Edition
Lindell, I. V., Methods for Electromagnetic Field Analysis
Peterson et al., Computational Methods for Electromagnetics
Tai, C. T., Generalized Vector and Dyadic Analysis: Applied Mathematics in Field Theory
Tai, C. T., Dyadic Green Functions in Electromagnetic Theory, Second Edition
Van Bladel, J., Singular Electromagnetic Fields and Sources
Wait, J., Electromagnetic Waves in Stratified Media
FINITE ELEMENT METHOD
FOR ELECTROMAGNETICS
Antennas, Microwave Circuits, and Scattering Applications

IEEE/OUP Series on
Electromagnetic Wave Theory

John L. Volakis
University of Michigan
Arindam Chatterjee
Hewlett-Packard
Leo C. Kempel
Mission Research Corp.

IEEE Antennas & Propagation Society, Sponsor

IEEE PRESS
The Institute of Electrical and Electronics Engineers, Inc., New York

Oxford University Press
Oxford, Tokyo, Melbourne
This book and other books may be purchased at a discount
from the publisher when ordered in bulk quantities. Contact:

IEEE Press Marketing
Attn: Special Sales
445 Hoes Lane, P.O. Box 1331
Piscataway, NJ 08855-1331
Fax: (732) 981-9334

For more information about IEEE PRESS products,
visit the IEEE Home Page: http://www.ieee.org/

© 1998 by the Institute of Electrical and Electronics Engineers, Inc.
345 East 47th Street, New York, NY 10017-2394

All rights reserved. No part of this book may be reproduced in any form,
nor may it be stored in a retrieval system or transmitted in any form,
without written permission from the publisher.

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

ISBN 0-7803-3425-6
IEEE Order Number: PC5698

OUP ISBN 0 19 850479 9

Library of Congress Cataloging-in-Publication Data

Volakis, John Leonidas, 1956-
Finite element method for electromagnetics : with applications to
antennas, microwave circuits, and scattering / John L. Volakis,
Arindam Chatterjee, Leo C. Kempel.
p. cm.
Includes bibliographical references and index.
ISBN 0-7803-3425-6 (alk. paper)
1. Electromagnetic fields--Mathematical models. 2. Finite element
method. 3. Antennas (Electronics) 4. Microwave circuits.
5. Electrons--Scattering. I. Chatterjee, A. (Arindam) II. Kempel,
Leo C.
TK7867.1.V65 \\\320\250 97-4876\320\232

530.14'1--dc21 CIP


IEEE Press
445 Hoes Lane, P.O. Box 1331
Piscataway, NJ 08855-1331

Editorial Board
Roger F. Hoyt, Editor in Chief

J. B. Anderson    A. H. Haddad    M. Padgett
P. M. Anderson    R. Herrick    W. D. Reeve
M. Eden    S. Kartalopoulos    G. Zobrist
M. E. El-Hawary    D. Kirk
S. Furui    P. Laplante

Kenneth Moore, Director of IEEE Press
John Griffin, Senior Acquisitions Editor
Linda Matarazzo, Assistant Editor

IEEE Antennas & Propagation Society, Sponsor
AP-S Liaison to IEEE Press, Robert Mailloux

Cover design: William T. Donnelly, WT Design

Technical Reviewers

James T. Aberle, Arizona State University
Jin-Fa Lee, Worcester Polytechnic Institute
Andreas Cangellaris, University of Arizona
D. R. Wilton, University of Houston
Daniel T. McGrath, Air Force Research Laboratory

Oxford University Press
Walton Street, Oxford OX2 6DP

Oxford New York
Athens Auckland Bangkok Bombay
Calcutta Cape Town Dar es Salaam Delhi
Florence Hong Kong Istanbul Karachi
Kuala Lumpur Madras Madrid Melbourne
Mexico City Nairobi Paris Singapore
Taipei Tokyo Toronto

and associated companies in
Berlin Ibadan

Oxford is a trademark of Oxford University Press


Contents

PREFACE
ACKNOWLEDGMENTS

CHAPTER 1 FUNDAMENTAL CONCEPTS
1.1 Time-Harmonic Maxwell's Equations
1.2 Wave Equation
1.3 Electrostatics and Magnetostatics
1.3.1 Electrostatics
1.3.2 Magnetostatics
1.4 Surface Equivalence
1.5 Natural Boundary Conditions
1.6 Approximate Boundary Conditions
1.6.1 Impedance Boundary Conditions
1.6.2 Sheet Transition Conditions
1.7 Poynting's Theorem
1.8 Uniqueness Theorem
1.9 Superposition Theorem
1.10 Duality Theorem
1.11 Numerical Techniques
1.11.1 The Ritz Method
1.11.2 Functionals for Anisotropic Media
1.11.3 Method of Weighted Residuals
1.11.4 Vector and Matrix Norms in Linear Space
1.11.5 Some Matrix Definitions
1.11.6 Comparison of Solution Methods and Their Convergence
1.11.7 Field Formulation Issues

CHAPTER 2 SHAPE FUNCTIONS FOR SCALAR AND VECTOR FINITE ELEMENTS
2.1 Introduction
2.2 Features of Finite Element Shape Functions
2.2.1 Spatial Locality
2.2.2 Approximation Order
2.2.3 Continuity
2.3 Node-Based Elements
2.3.1 One-Dimensional Basis Functions
2.3.2 Two-Dimensional Basis Functions
2.3.3 Three-Dimensional Basis Functions
2.4 Edge-Based Elements
2.4.1 Two-Dimensional Basis Functions
2.4.2 Three-Dimensional Basis Functions

CHAPTER 3 OVERVIEW OF THE FINITE ELEMENT METHOD: ONE-DIMENSIONAL EXAMPLES
3.1 Introduction
3.2 Overview of the Finite Element Method
3.3 Examples of One-Dimensional Problems in Electromagnetics
3.4 The Weighted Residual Method
3.5 Discretization of the "Weak" Differential Equation
3.6 Assembly of the Element Equations
3.7 Enforcement of Boundary Conditions
3.7.1 Neumann Boundary Conditions (Homogeneous)
3.7.2 Dirichlet Boundary Conditions (Homogeneous)
3.7.3 Nonzero Boundary Constraints (Inhomogeneous)
3.7.4 Impedance Boundary Conditions
3.8 Examples
Appendix 1: Sample One-Dimensional FEM Analysis Program
Appendix 2: Useful Integration Formulae for One-Dimensional FEM Analysis

CHAPTER 4 TWO-DIMENSIONAL APPLICATIONS
4.1 Introduction
4.2 Two-Dimensional Wave Equations
4.2.1 Transmission Lines
4.2.2 Two-Dimensional Scattering
4.2.3 Waveguide Propagation (Homogeneous Cross Section)
4.2.4 Waveguide Propagation (Inhomogeneous Cross Section)
4.3 Discretization of the Two-Dimensional Wave Equation
4.3.1 Weak Form of the Wave Equation
4.3.2 Discretization of the Weak Wave Equation
4.3.3 Assembly of Element Equations
4.3.4 Assembly Example: Waveguide Eigenvalues
4.4 Two-Dimensional Scattering
4.4.1 Treatment of Metallic Boundaries
4.4.2 Absorbing Boundary Conditions
4.4.3 Scattered Field Computation
4.4.4 Scattering Example Using ABCs
4.4.5 Artificial Absorbers for Mesh Truncation
4.4.6 Boundary Integral Mesh Truncation
4.5 Edge Elements
4.5.1 Example 1: Propagation Constants of a Homogeneously Filled Waveguide
4.5.2 Example 2: Scattering by a Square-Shaped Material Coated Cylinder
Appendix 1: Element Matrix for Node-Based Bilinear Rectangles
Appendix 2: Sample MATLAB Code for Implementing the Matrix Assembly

CHAPTER 5 THREE-DIMENSIONAL PROBLEMS: CLOSED DOMAIN
5.1 Introduction
5.2 Formulation
5.2.1 Field Formulation
5.2.2 Potential Formulation
5.3 Origin of Spurious Solutions
5.4 Matrix Generation and Assembly
5.5 Source Modeling
5.6 Applications
5.6.1 Cavity Resonators
5.6.2 Circuit Applications
Appendix: Edge-Based Right Triangular Prisms


Fundamental Concepts

The material presented in this book is generally considered to be at the level of a graduate student or a practicing engineer. Thus, the reader is assumed to have been exposed to electromagnetics either through a suitable graduate course or practical experience. In this chapter, fundamental electromagnetic concepts and theorems are presented along with the notation used throughout this text so that a common base is available to all readers.

Many good texts on general electromagnetics principles and techniques are currently in print. Some are considered classical treatises such as [1]-[2] while others are of more recent vintage such as [3]-[4]. Although this chapter presents the minimal introductory material necessary for the study of the finite element method as applied to electromagnetics, the interested reader is encouraged to consult these references for a more complete treatment of electromagnetics.

Upon assumption of a harmonic field, the phasor or time-harmonic form of Maxwell's equations is presented along with complex material definitions which permit the incorporation of loss mechanisms. The natural boundary conditions are derived, followed by fictitious, though useful, approximate resistive and impedance conditions. Electrostatic and magnetostatic formulations are discussed for use in later examples of the finite element method. Several useful electromagnetic concepts are presented including the Poynting, uniqueness, superposition, and duality theorems. Time-harmonic Maxwell's equations will be covered first.

1.1 TIME-HARMONIC MAXWELL'S EQUATIONS

Maxwell's equations were originally written as a set of coupled, time-dependent integral equations. However, of primary interest in this text is the study of harmonically varying fields (i.e., frequency domain) with an angular frequency of
Contents

CHAPTER 6 THREE-DIMENSIONAL PROBLEMS: RADIATION AND SCATTERING
6.1 Introduction
6.2 Survey of Vector ABCs
6.2.1 Three-Dimensional Vector ABCs
6.2.2 Artificial Absorbers
6.3 Formulation
6.3.1 Scattered and Total Field Formulations
6.4 Applications
6.4.1 Scattering Examples
6.4.2 Antenna and Circuit Examples
Appendix: Derivation of Some Vector Identities

CHAPTER 7 THREE-DIMENSIONAL FE-BI METHOD
7.1 Introduction
7.2 General Formulation
7.2.1 Derivation of the FE-BI Equations
7.2.2 Solution of the FE-BI Equations
7.2.3 Comments on the General FE-BI Formulation
7.3 Excitation and Feed Modeling
7.3.1 Plane Wave
7.3.2 Probe Feed
7.3.3 Voltage Gap Feed
7.3.4 Coaxial Cable Feed
7.3.5 Aperture-Coupled Microstrip Line
7.3.6 Mode Matched Feed
7.4 Cavity Recessed in a Ground Plane
7.4.1 Formulation
7.4.2 Solution Using Brick Elements
7.4.3 FFT-Based Matrix-Vector Multiply Scheme
7.4.4 Examples
7.4.5 Aperture in a Thick Metallic Plane
7.5 Cavity-Backed Antennas on a Circular Cylinder
7.5.1 Examples
7.6 Recent Advances in the FE-BI Method
7.6.1 Finite Element-Periodic Method of Moments
7.6.2 Finite Element Surface of Revolution Method
7.6.3 Fast Integral Solution Methods
Appendix 1: Explicit Formulas for Brick Elements
Appendix 2: Brick Finite Element-Boundary Integral Computer Program

CHAPTER 8 FAST INTEGRAL METHODS
by S. Bindiganavale and J. L. Volakis
8.1 The Adaptive Integral Method
8.2 Fast Multipole Method
8.2.1 Boundary Integral Equation
8.2.2 Exact FMM
8.2.3 Windowed FMM
8.2.4 Fast Far Field Algorithm
8.3 Logic Flow
8.4 Results

CHAPTER 9 NUMERICAL ISSUES
9.1 Introduction
9.2 Sparse Storage Schemes
9.3 Direct Equation Solver
9.3.1 Factorization Schemes
9.3.2 Error Control
9.3.3 Matrix Ordering Strategies
9.4 Iterative Equation Solvers
9.5 Preconditioning
9.5.1 Diagonal Preconditioner
9.5.2 Incomplete LU (ILU) Preconditioner
9.5.3 Approximate Inverse Preconditioner
9.5.4 Flexible GMRES with Preconditioning
9.6 Eigenanalysis
9.6.1 Direct and Inverse Iteration
9.6.2 Simultaneous Iteration
9.6.3 Lanczos Algorithm
9.7 Parallelization
9.7.1 Analysis of Communication

INDEX

ABOUT THE AUTHORS

Preface

The finite element method (FEM) and its hybrid versions (finite element-boundary integral, finite element-absorbing boundary condition, finite element-mode matching, etc.) is one of the most successful frequency domain computational methods for electromagnetic simulations. It combines geometrical adaptability and material generality for modeling arbitrary geometries and materials of any composition. The latter is particularly important in electromagnetics since nearly all applications dealing with antennas, microwave circuits, scatterers, motor and generator modeling, etc. require the simulation of nonmetallic/composite materials. Also, the hybridization of the finite element method with integral equation techniques leads to fully rigorous approaches which combine the best aspects of volume and surface formulation techniques.
Because of its unique features, the finite element method is becoming the workhorse for electromagnetic modeling and simulations. Many research and development codes are now available from universities and industry, and these have demonstrated the utility and capability of the method. Also, a number of commercial finite element analysis packages are currently available. Typically, these packages do not yet incorporate the more rigorous hybrid versions of the FEM. However, they are rapidly evolving to more sophisticated and capable packages which incorporate new technologies in geometrical modeling, simulation engines, and solvers.
With the increasing importance of electromagnetics simulation packages using the FEM, this book should serve as a valuable text for students, practicing engineers, and researchers in electromagnetics. The original goal of writing the book was to serve as a text for beginning graduate students interested in the application of the finite element method and its hybrid versions to electromagnetics. However, the authors also recognized a need to report (in a coherent manner) the many recent advances in applying the method(s) to traditional and new problems in electromagnetics. The result is a book that can serve both beginning students and more advanced practitioners. The first half of the book has already been used in the classroom as part of a course on numerical electromagnetics at the University of Michigan. The second half of the book covers primarily work on three-dimensional (3D) developments and applications which have primarily appeared in the literature over the past 5 years.
The book assumes that the reader is a first-year graduate student who has likely taken one advanced course in electromagnetics beyond the standard undergraduate courses. For practicing engineers, it is assumed that the reader is familiar with concepts of electromagnetic radiation and has an understanding of Maxwell's equations and their implications. No previous experience in numerical methods is necessary, but such experience will, of course, help the reader in understanding the procedure of casting analytical equations into discrete systems for numerical solution.
For classroom use, it is expected that the first four chapters will be thoroughly covered with the exception of Chapter 2, which describes a variety of basis/expansion functions. At the introductory stage, only the initial sections of Chapter 2 need be covered. The reader may then return to Chapter 2 as needed. Chapters 3 and 4 [one-dimensional (1D) and two-dimensional (2D) formulations and applications] are written in a step-by-step process with the assumption that this is the first exposure of the reader to numerical methods and the finite element method in general. Chapters 5 through 7 introduce the finite element method and its hybrid versions for 3D simulations with applications to microwave circuits, scattering, and conformal antennas. These chapters are written at a more advanced level and cover the latest applications and successes of the method in electromagnetics. Chapter 5 (closed-domain 3D applications) is a straightforward extension of the two-dimensional development in Chapter 4 and can be part of a quarter or semester course which includes Chapters 1-5. FEM implementations with absorbing boundary conditions and the finite element-boundary integral method for 3D applications are described in Chapters 6 and 7, respectively. These are realistic practical simulations and should be of particular interest to practicing engineers and researchers in the field. Their 2D counterparts are described in Chapter 4 at a significant level of detail along with explicit formulas for developing computer codes.


Chapter 8 describes some recent developments on the implementation of the boundary integral methods for mesh truncation in conjunction with fast integral methods. Fast integral methods have shown dramatic reductions in CPU and memory. They are currently the subject of research and will impact the utility and development of the finite element-boundary integral method.

Finally, Chapter 9 presents an overview of storage techniques for sparse systems, iterative solvers, preconditioning, parallelization, and a variety of details pertinent to the development of finite element codes. These items were not mixed with the earlier chapters which discuss the mathematics and applications for the FEM. Thus, the reader can refer to Chapter 9 at different stages, and as needed, when developing finite element codes.

J. L. Volakis
A. Chatterjee
L. C. Kempel
June \342\204\22

Ann Arbor, MI
Acknowledgments

Interest in the finite element method (FEM) at the Radiation Laboratory of the University of Michigan began in 1987 by the first author and his graduate students. The motivation was to model large domains without restrictions in geometry and material composition. At that time, two graduate students, Timothy J. Peters and Kasra Barkeshli, had completed successful implementations of boundary integral solutions using k-space methods. This O(N log N) approach paved the way for a fully O(N) finite element-boundary integral algorithm which combined the rigor of the boundary integral for mesh truncation and the generality of the FEM for volume/domain modeling. The first of these hybrid implementations was developed by Dr. Jian-Ming Jin, a graduate assistant of John Volakis, resulting in a highly successful finite element-boundary integral computer program. Versions of this code are still in use by government, industrial, and academic researchers in the United States. Another graduate student of Professor Volakis, Dr. Jeffery D. Collins, furthered this work to a body-of-revolution with an integral mesh enclosure. His later students (among them the co-authors, Dr. Chatterjee and Dr. Kempel, and Dr. Daniel C. Ross, Dr. Jian Gong, and Dr. Tayfun Özdemir) made significant contributions toward the understanding of 3D problems in antennas, scattering, and microwave circuits.
The authors are indebted to the entire research group of Professor Volakis for graciously helping in the preparation of the manuscript. We would particularly like to mention Hristos Anastassiu, Lars Andersen, Youssry Botros, Arik Brown, and Tayfun Özdemir for proofreading and for providing data and figures. The authors are grateful to Dr. Sunil Bindiganavale for co-authoring the section on fast integral methods in Chapter 8. He also helped in proofreading various sections of the manuscript. The acknowledgments would be incomplete without mentioning Mr. Richard Carnes, whose expertise in LaTeX made the typesetting of the book a much easier task. Ruby Sowards typed some sections of the book, and Patti Wolfe helped in preparing several figures. The authors are thankful to each of them.

A lot of encouragement was received from several people throughout the project. The authors would like to particularly acknowledge Professor Donald G. Dudley, Series Editor of IEEE Press, who was instrumental in publishing this book with the Press and was supportive throughout the preparation of the manuscript. The comments and constructive criticisms of the early chapters and the final manuscript by Professor Andreas Cangellaris, Professor Jin-Fa Lee, Dr. Daniel T. McGrath, and the anonymous reviewers are very much appreciated. The authors are also thankful to the entire IEEE Press staff (John E. Griffin, Linda C. Matarazzo, Christy Coleman, and Savoula Amanatidis) for their help.


On the personal front, our acknowledgments would not be complete without mentioning the support of our families. John Volakis would like to express gratitude to his wife, Maria, and children, Leonithas and Alexandro, for their patience, sacrifice, and understanding during the preparation of the manuscript. Arindam Chatterjee would like to express deep appreciation for the constant encouragement he received from his parents. He is also grateful to his wife for her insightful criticisms. Leo Kempel would like to express his appreciation for the support and patience provided by his wife, Cathy.

John L. Volakis
Arindam Chatterjee
Leo C. Kempel

August 1997
Ann Arbor, MI
$\omega = 2\pi f$ rad/sec since the finite element method for electromagnetics utilizes time-harmonic fields. The interested reader is referred to one of the excellent general electromagnetics texts cited in this chapter's introduction for a discussion of the time-dependent form of Maxwell's equations. For our purposes, we begin with the time-harmonic form of Maxwell's equations.
The time-harmonic electric field is related to the time-dependent electric field (assuming that $j = \sqrt{-1}$) by

$$\boldsymbol{\mathcal{E}}(x,y,z;t) = \mathrm{Re}\left[\mathbf{E}(x,y,z)\,e^{j\omega t}\right] = \hat{x}E_{x0}\cos(\omega t + \phi_x) + \hat{y}E_{y0}\cos(\omega t + \phi_y) + \hat{z}E_{z0}\cos(\omega t + \phi_z) \tag{1.1}$$

where the complex vector

$$\mathbf{E}(x,y,z) = \hat{x}E_{x0}e^{j\phi_x} + \hat{y}E_{y0}e^{j\phi_y} + \hat{z}E_{z0}e^{j\phi_z} \tag{1.2}$$

is referred to as the field phasor, and similar representations can be employed for the other field quantities. Introducing these into the time-dependent Maxwell's equations [5], we obtain a simplified set

$$\nabla \times \mathbf{H} = \mathbf{J} + j\omega\epsilon\mathbf{E} \tag{1.3}$$

$$\nabla \times \mathbf{E} = -\mathbf{M} - j\omega\mu\mathbf{H} \tag{1.4}$$

$$\nabla \cdot (\mu\mathbf{H}) = \rho_m \tag{1.5}$$

$$\nabla \cdot (\epsilon\mathbf{E}) = \rho \tag{1.6}$$
where the corresponding vector field and current phasors are

$\mathbf{E}$ = electric field intensity in volts/meter (V/m)
$\mathbf{H}$ = magnetic field intensity in amperes/meter (A/m)
$\mathbf{J}$ = electric current density in amperes/meter$^2$ (A/m$^2$)
$\mathbf{M}$ = magnetic current density in volts/meter$^2$ (V/m$^2$)

and the two scalar charge phasors are

$\rho$ = electric charge density in coulombs/meter$^3$ (C/m$^3$)
$\rho_m$ = magnetic charge density in webers/meter$^3$ (Wb/m$^3$)

Both the magnetic current density ($\mathbf{M}$) and the magnetic charge density ($\rho_m$) are fictitious quantities introduced for convenience.
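
The phasor relation in (1.1) and (1.2) can be checked numerically for a single Cartesian component. The following is a minimal sketch, not code from the book: the function names `phasor_component` and `instantaneous` and all numerical values are hypothetical, and the $e^{j\omega t}$ time convention is assumed.

```python
import cmath
import math

def phasor_component(amplitude, phase):
    """Complex phasor amplitude*exp(j*phase) for one Cartesian component."""
    return amplitude * cmath.exp(1j * phase)

def instantaneous(phasor, omega, t):
    """Time-domain value Re[phasor * exp(j*omega*t)]."""
    return (phasor * cmath.exp(1j * omega * t)).real

# Hypothetical example values (not taken from the text).
f = 1.0e9                        # frequency in Hz
omega = 2.0 * math.pi * f        # angular frequency in rad/sec
E_x0, phi_x = 2.0, math.pi / 3   # amplitude (V/m) and phase (rad)

E_x = phasor_component(E_x0, phi_x)
t = 0.37e-9                      # an arbitrary instant in seconds

from_phasor = instantaneous(E_x, omega, t)
from_cosine = E_x0 * math.cos(omega * t + phi_x)
print(from_phasor, from_cosine)  # the two values agree to rounding error
```

The same two helper functions apply unchanged to the $y$ and $z$ components, since each component carries its own amplitude and phase.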
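Implied in these time-harmonic equations are constitutive relations for an isotropic medium

$$\mathbf{D} = \epsilon\mathbf{E} = \epsilon_0\epsilon_r\mathbf{E} \tag{1.7}$$

$$\mathbf{B} = \mu\mathbf{H} = \mu_0\mu_r\mathbf{H} \tag{1.8}$$

$$\mathbf{J} = \sigma\mathbf{E} \tag{1.9}$$

$$\mathbf{M} = \sigma_m\mathbf{H} \tag{1.10}$$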

where two additional phasors

$\mathbf{D}$ = electric flux density in coulombs/meter$^2$ (C/m$^2$)
$\mathbf{B}$ = magnetic flux density in webers/meter$^2$ (Wb/m$^2$)

are related to $\mathbf{E}$ and $\mathbf{H}$, respectively. These constitutive equations are an important link between the original time-dependent form of Maxwell's equations and the time-harmonic form used in the finite element method. Similarly, the phasor forms of the continuity equations [5] are given by

$$\nabla \cdot \mathbf{J} + j\omega\rho = 0 \tag{1.11}$$

$$\nabla \cdot \mathbf{M} + j\omega\rho_m = 0 \tag{1.12}$$

In (1.3)-(1.12), the material constants are given by

$\epsilon_0$ = free space permittivity = $8.854 \times 10^{-12}$ farads/meter (F/m)
$\mu_0$ = free space permeability = $4\pi \times 10^{-7}$ henrys/meter (H/m)
$\epsilon_r$ = medium's relative permittivity constant
$\mu_r$ = medium's relative permeability constant
$\sigma$ = electric current conductivity in mhos/m (℧/m)
$\sigma_m$ = magnetic current conductivity in ohms/m (Ω/m)

The first two of these ($\epsilon_0$ and $\mu_0$) are fundamental constants while the others describe the specific material. For example, $\epsilon_r$ is a measure of the material's electric storage capacity while $\sigma$ is a measure of the material's ability to conduct electric currents or, alternatively, of an Ohmic loss mechanism. The relative permeability $\mu_r$ and magnetic conductivity $\sigma_m$ are the magnetic field analogues to $\epsilon_r$ and $\sigma$, respectively. For the purposes of the finite element method, all four of these material quantities may vary spatially (inhomogeneous) and spectrally (dispersive).
The current densities $\mathbf{J}$ and $\mathbf{M}$ appearing in (1.3) and (1.4) do not include the presence of impressed sources. In general, $\mathbf{J}$ and $\mathbf{M}$ can be written as a sum of impressed (or excitation) and induced (or conduction) currents as

$$\mathbf{J} = \mathbf{J}_i + \mathbf{J}_c = \mathbf{J}_i + \sigma\mathbf{E} \tag{1.13}$$

$$\mathbf{M} = \mathbf{M}_i + \mathbf{M}_c = \mathbf{M}_i + \sigma_m\mathbf{H} \tag{1.14}$$

where the subscript "i" denotes impressed currents while the subscript "c" refers to conduction currents. When these are substituted into (1.3) and (1.4), the familiar form of Maxwell's equations is obtained

$$\nabla \times \mathbf{H} = \mathbf{J}_i + j\omega\epsilon_0\dot{\epsilon}_r\mathbf{E} \tag{1.15}$$

$$\nabla \times \mathbf{E} = -\mathbf{M}_i - j\omega\mu_0\dot{\mu}_r\mathbf{H} \tag{1.16}$$

where

$$\dot{\epsilon}_r = \epsilon_r - j\frac{\sigma}{\omega\epsilon_0} = \epsilon' - j\epsilon'' = \epsilon'(1 - j\tan\delta) \tag{1.17}$$

and

$$\dot{\mu}_r = \mu_r - j\frac{\sigma_m}{\omega\mu_0} = \mu' - j\mu'' = \mu'(1 - j\tan\delta_m) \tag{1.18}$$

represent equivalent relative complex permittivity and permeability constants. For notational convenience, the dot over the relative constitutive parameters will be omitted with the understanding that these still represent all possible material losses.

Any one of the representations given in (1.17) and (1.18) is likely to be found in the literature, with the quantities

\tan\delta = \frac{\epsilon''}{\epsilon'}    (1.19)

\tan\delta_m = \frac{\mu''}{\mu'}    (1.20)

referred to as the material's electric and magnetic loss tangents, respectively.
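As a small numerical illustration of the complex relative permittivity and the electric loss tangent, the following sketch evaluates $\epsilon_r - j\sigma/(\omega\epsilon_0)$ and the ratio $\epsilon''/\epsilon'$. The function names and the sample material values are hypothetical and chosen only for illustration.

```python
import math

EPS0 = 8.854e-12  # free space permittivity in F/m

def complex_rel_permittivity(eps_r, sigma, freq_hz):
    """Return eps_r - j*sigma/(omega*eps0) for a conducting dielectric."""
    omega = 2.0 * math.pi * freq_hz
    return complex(eps_r, -sigma / (omega * EPS0))

def loss_tangent(eps_complex):
    """Return eps''/eps' for a permittivity written as eps' - j*eps''."""
    return -eps_complex.imag / eps_complex.real

# Hypothetical material: eps_r = 4, sigma = 0.01 mhos/m, at 1 GHz
eps = complex_rel_permittivity(eps_r=4.0, sigma=0.01, freq_hz=1.0e9)
tan_delta = loss_tangent(eps)
print(eps, tan_delta)
```

The negative imaginary part follows from the $e^{j\omega t}$ time convention implied by the $j\omega$ terms in the phasor equations.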


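To summarize, Maxwell's equations in phasor form for isotropic media are

\nabla \times \mathbf{H} = \mathbf{J}_i + j\omega\epsilon\mathbf{E}    (1.21)

\nabla \times \mathbf{E} = -\mathbf{M}_i - j\omega\mu\mathbf{H}    (1.22)

\nabla \cdot (\epsilon\mathbf{E}) = -\frac{1}{j\omega}\nabla \cdot \mathbf{J}_i    (1.23)

\nabla \cdot (\mu\mathbf{H}) = -\frac{1}{j\omega}\nabla \cdot \mathbf{M}_i    (1.24)

where the phasor form of (1.11) and (1.12) was employed to rewrite (1.5) and (1.6) as given above.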

The corresponding integral representations of (1.21)-(1.24) are

\oint_C \mathbf{H} \cdot d\mathbf{l} = \iint_S (\mathbf{J}_i + j\omega\epsilon\mathbf{E}) \cdot d\mathbf{S}    (1.25)

\oint_C \mathbf{E} \cdot d\mathbf{l} = -\iint_S (\mathbf{M}_i + j\omega\mu\mathbf{H}) \cdot d\mathbf{S}    (1.26)

where C is the contour bounding the open surface S illustrated in Fig. 1.1 and $d\mathbf{S} = \hat{n}\,dS$. The circle through the single integral indicates integration over a closed contour, whereas the same symbol through the surface integral denotes integration over the closed surface $S_c$, which encloses the corresponding volume V. The surface S associated with the integrals (1.25) and (1.26) is completely unrelated to $S_c$, which encloses the volume V.

Expressions (1.21) and (1.22) imply six scalar equations for the solution of the six components associated with E and H. Thus, for time-harmonic fields, (1.21) and (1.22) or (1.25) and (1.26) are sufficient for a solution of the electric and magnetic fields. The divergence conditions (1.23) and (1.24), or their integral counterparts, (1.27) and (1.28), are superfluous. In fact, these two equations follow directly from the first two upon taking their divergence and observing that $\nabla \cdot (\nabla \times \mathbf{A}) = 0$ [6] for any vector A. Equations (1.21)-(1.28) can be easily modified for anisotropic material as well. This requires that $\epsilon\mathbf{E}$ and $\mu\mathbf{H}$ be replaced by $\bar{\epsilon} \cdot \mathbf{E}$ and $\bar{\mu} \cdot \mathbf{H}$, respectively, where $\bar{\epsilon}$ and $\bar{\mu}$ represent 3 x 3 tensors [7].

Figure 1.1 Illustration of the differential element ds and the contour C.

In this text, open scattering and radiation problems will be considered. Consequently, any valid and unique solution of the electric and/or magnetic fields must also satisfy the Sommerfeld radiation condition, which describes the field behavior at infinity

\lim_{r \to \infty} r \left[ \nabla \times \begin{Bmatrix} \mathbf{E} \\ \mathbf{H} \end{Bmatrix} + jk_0\,\hat{r} \times \begin{Bmatrix} \mathbf{E} \\ \mathbf{H} \end{Bmatrix} \right] = 0    (1.29)

where $k_0$ is the free-space wavenumber ($k_0 = \omega\sqrt{\mu_0\epsilon_0} = 2\pi/\lambda_0$) and $\lambda_0$ is the corresponding free space wavelength. This simply states that the field is outgoing and of the form $e^{-jk_0 r}/r$ as $r \to \infty$.

1.2 WAVE EQUATION

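Ampere-Maxwell's Law (1.21) and Faraday's Law (1.22) are independent first-order vector equations, and as noted earlier, they lead to a unique solution subject to the specified boundary conditions. They may be combined together to yield a single second-order vector equation in terms of E or H known as the wave equation. The finite element method is used to numerically approximate the solution of the wave equation.
Specifically, by taking the curl of (1.21) or (1.22) and making use of the other, the following vector wave equations are obtained:

\nabla \times \left( \frac{1}{\mu_r}\nabla \times \mathbf{E} \right) - k_0^2\epsilon_r\mathbf{E} = -jk_0 Z_0\mathbf{J}_i - \nabla \times \left( \frac{\mathbf{M}_i}{\mu_r} \right)
\nabla \times \left( \frac{1}{\epsilon_r}\nabla \times \mathbf{H} \right) - k_0^2\mu_r\mathbf{H} = -jk_0 Y_0\mathbf{M}_i + \nabla \times \left( \frac{\mathbf{J}_i}{\epsilon_r} \right)    (1.30)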

where the upper equation is for solution of the electric field while the lower equation is for solution of the magnetic field. In (1.30), $\epsilon_r$ denotes the relative permittivity of the media and $\mu_r$ indicates the relative permeability of the media. For free space, these two quantities are both unity. Also, $Z_0 = 1/Y_0 = \sqrt{\mu_0/\epsilon_0}$ is the free space wave impedance. In materials other than free space, the wave impedance and wavenumber are given by $Z = 1/Y = \sqrt{\mu/\epsilon}$ and $k = \omega\sqrt{\mu\epsilon}$, respectively.
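The wavenumber and wave impedance relations above can be evaluated directly. The sketch below is only illustrative: the function names are hypothetical, and the principal branch of the complex square root is assumed so that a lossy medium (relative permittivity with negative imaginary part) gives a decaying wave under the $e^{j\omega t}$ convention.

```python
import cmath
import math

EPS0 = 8.854e-12          # free space permittivity in F/m
MU0 = 4.0e-7 * math.pi    # free space permeability in H/m

def free_space_params(freq_hz):
    """Return (k0, Z0, lambda0) for the given frequency."""
    omega = 2.0 * math.pi * freq_hz
    k0 = omega * math.sqrt(MU0 * EPS0)
    z0 = math.sqrt(MU0 / EPS0)
    return k0, z0, 2.0 * math.pi / k0

def medium_params(freq_hz, eps_r, mu_r):
    """Return (k, Z) in a homogeneous medium with relative constants eps_r, mu_r."""
    k0, z0, _ = free_space_params(freq_hz)
    k = k0 * cmath.sqrt(eps_r * mu_r)
    z = z0 * cmath.sqrt(mu_r / eps_r)
    return k, z

k0, z0, lam0 = free_space_params(300.0e6)
k, z = medium_params(300.0e6, eps_r=4.0, mu_r=1.0)
print(k0, z0, lam0, k, z)
```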

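Utilizing the vector identity

\nabla \times (\psi\mathbf{A}) = \nabla\psi \times \mathbf{A} + \psi\nabla \times \mathbf{A}    (1.31)

(1.30) can be written in a more convenient form

\nabla \times \nabla \times \mathbf{E} - k_0^2\mu_r\epsilon_r\mathbf{E} + \left[ \mu_r\nabla\left(\frac{1}{\mu_r}\right) \times \nabla \times \mathbf{E} \right] = -jk_0 Z_0\mu_r\mathbf{J}_i - \mu_r\nabla \times \left( \frac{\mathbf{M}_i}{\mu_r} \right)    (1.32)

for the electric field and

\nabla \times \nabla \times \mathbf{H} - k_0^2\mu_r\epsilon_r\mathbf{H} + \left[ \epsilon_r\nabla\left(\frac{1}{\epsilon_r}\right) \times \nabla \times \mathbf{H} \right] = -jk_0 Y_0\epsilon_r\mathbf{M}_i + \epsilon_r\nabla \times \left( \frac{\mathbf{J}_i}{\epsilon_r} \right)    (1.33)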
for the magnetic field. The significance of this form of the wave equation is that for homogeneous materials, the terms within the bracket are zero. Most implementations of the finite element method assume a homogeneous material within each finite element and hence this bracketed term can be set to zero for those cases.
Another important version of (1.30) for homogeneous media is obtained by utilizing the vector identity

\nabla \times \nabla \times \mathbf{E} = \nabla(\nabla \cdot \mathbf{E}) - \nabla^2\mathbf{E}    (1.34)

in (1.32) to get

-\nabla(\nabla \cdot \mathbf{E}) + \nabla^2\mathbf{E} + k_0^2\epsilon_r\mu_r\mathbf{E} = jk_0 Z_0\mu_r\mathbf{J}_i + \nabla \times \mathbf{M}_i    (1.35)

In a source-free region, where $\nabla \cdot \mathbf{E} = 0$ and $k^2 = k_0^2\epsilon_r\mu_r$, (1.35) simplifies to

\nabla^2\mathbf{E} + k^2\mathbf{E} = 0    (1.36)

This equation represents three vector field components, each of which satisfies the Helmholtz or scalar wave equation

\nabla^2\psi + k^2\psi = 0    (1.37)

where $\psi$ denotes $E_x$, $E_y$, or $E_z$. Similar partial differential equations can be formed for the magnetic field from (1.33).
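As a quick check of the scalar wave equation, the sketch below evaluates the one-dimensional Helmholtz residual of a plane wave $e^{-jkx}$ with a central finite difference. The function names, step size, and sample values are hypothetical; the residual is expected to be small but not exactly zero because of truncation and roundoff error.

```python
import cmath

def helmholtz_residual(psi, x, k, h=1.0e-4):
    """Central-difference estimate of psi'' + k^2 * psi at the point x."""
    second_derivative = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
    return second_derivative + k * k * psi(x)

k = 2.0  # hypothetical wavenumber in rad/m

def plane_wave(x):
    """Outgoing plane wave exp(-j*k*x)."""
    return cmath.exp(-1j * k * x)

residual = helmholtz_residual(plane_wave, 0.3, k)
print(abs(residual))
```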

1.3 ELECTROSTATICS AND MAGNETOSTATICS

Although for the majority of this text we are concerned with dynamic electromagnetics, some examples of static electromagnetics are included to illustrate basic finite element principles. Therefore, we present the basic equations of electrostatics and magnetostatics, namely Poisson's equation and the potential relations.
In this section, we present a basic review of electrostatic and magnetostatic expressions sufficient for this text. It is not a comprehensive review of either electro- or magnetostatics, and the interested reader is encouraged to study [8] or [9] for further information. We begin with electrostatics.

1.3.1 Electrostatics

The fundamental equations of electrostatics are forms of Faraday's equation (1.4) and Gauss' Law (1.6), namely

\nabla \times \mathbf{E} = 0
\nabla \cdot (\epsilon\mathbf{E}) = \rho_v    (1.38)

The electric field can be expressed in terms of a gauge condition involving a scalar quantity, $\phi(\mathbf{r})$:

\mathbf{E} = -\nabla\phi(\mathbf{r})    (1.39)

With this field expression (1.39), (1.38) reduces to Poisson's equation (here given in terms of general scalar fields and sources since Poisson's equation is also used for magnetostatics)

\nabla \cdot [\epsilon\nabla\phi(\mathbf{r})] = f(\mathbf{r})    (1.40)

For electrostatics, the scalar quantity is a voltage ($\phi(\mathbf{r}) = V(\mathbf{r})$) and the sources are volume charges ($f(\mathbf{r}) = -\rho_v(\mathbf{r})$), hence (1.40) reduces to

\nabla \cdot [\epsilon\nabla V(\mathbf{r})] = -\rho_v(\mathbf{r})    (1.41)


Solution of (1.41), subject to the appropriate boundary conditions (Dirichlet, Neumann, and/or impedance conditions), is equivalent to solving the original Maxwell's equations.
Solution of (1.41) is often accomplished either in closed form or numerically. Closed form solutions are available for only a limited number of boundary conditions [9]. Hence, it is usually more practical to employ a numerical method such as the finite element or boundary element methods.
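To make the numerical route concrete, the following is a minimal sketch of a one-dimensional finite element solution of Poisson's equation, $-d/dx(\epsilon\,dV/dx) = \rho_v$, with $V = 0$ at both ends. It assumes uniform linear elements, constant $\epsilon$ and $\rho_v$, and a Thomas (tridiagonal) solver; the function name and the normalized sample values are hypothetical and are not taken from the text.

```python
def solve_poisson_1d(n_elems, length, eps, rho):
    """Linear finite elements for -d/dx(eps dV/dx) = rho, V = 0 at x = 0 and x = length."""
    h = length / n_elems
    n_int = n_elems - 1                  # number of interior (unknown) nodes
    diag = [2.0 * eps / h] * n_int       # assembled stiffness matrix diagonal
    off = [-eps / h] * (n_int - 1)       # assembled off-diagonal entries
    rhs = [rho * h] * n_int              # load: rho*h/2 from each adjacent element
    # Thomas algorithm: forward elimination
    for i in range(1, n_int):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # Back substitution
    v = [0.0] * n_int
    v[-1] = rhs[-1] / diag[-1]
    for i in range(n_int - 2, -1, -1):
        v[i] = (rhs[i] - off[i] * v[i + 1]) / diag[i]
    return [0.0] + v + [0.0]             # append the Dirichlet end values

# Normalized example: eps = 1, rho = 1 on a unit interval with 8 elements
potential = solve_poisson_1d(n_elems=8, length=1.0, eps=1.0, rho=1.0)
print(potential)
```

For this constant-coefficient case the exact solution is $V(x) = \rho_v x(L - x)/(2\epsilon)$, which offers a simple check of the nodal values.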
The potential attributed to a volume charge is given by the integral relation

V^i(\mathbf{r}) = \iiint_V \frac{\rho_v(\mathbf{r}')}{\epsilon} G_{3D}(\mathbf{r}, \mathbf{r}')\,dV'    (1.42)

where the three-dimensional static Green's function is given by

G_{3D}(\mathbf{r}, \mathbf{r}') = \frac{1}{4\pi|\mathbf{r} - \mathbf{r}'|}    (1.43)

and volume charges are denoted by $\rho_v$. This representation is the solution of (1.41) in unbounded space and a pictorial relationship of the primed and unprimed parameters is shown in Fig. 1.2.
For two-dimensional situations (e.g., the sources of excitation run from z = -\infty to z = \infty and are invariant with respect to z), the following potential integral is appropriate

V^i(\mathbf{r}) = \iint_S \frac{\rho_s(\mathbf{r}')}{\epsilon} G_{2D}(\mathbf{r}, \mathbf{r}')\,dS'    (1.44)

where $\rho_s$ denotes surface charges and $G_{2D}$ is the two-dimensional static Green's function

G_{2D}(\mathbf{r}, \mathbf{r}') = -\frac{1}{2\pi}\ln|\mathbf{r} - \mathbf{r}'|    (1.45)
These integral relations are used to determine the potential, V, at some point in space due to an impressed charge distribution. They are used to derive integral equations for the total potential due to an impressed source subject to boundary conditions on surfaces within the domain. Such an integral equation (for surface problems) is given by

V(\mathbf{r}) = V^i(\mathbf{r}) + \oiint_S \left[ V(\mathbf{r}')\frac{\partial G(\mathbf{r}, \mathbf{r}')}{\partial n'} - G(\mathbf{r}, \mathbf{r}')\frac{\partial V(\mathbf{r}')}{\partial n'} \right] dS'    (1.46)

Figure 1.2 Illustration of the geometrical parameters associated with field representations; (a) near zone setup; (b) far zone setup.

where $\partial V(\mathbf{r})/\partial n = \hat{n} \cdot \nabla V(\mathbf{r})$, $\partial G(\mathbf{r}, \mathbf{r}')/\partial n' = \hat{n}' \cdot \nabla' G(\mathbf{r}, \mathbf{r}') = -\hat{n}' \cdot \nabla G(\mathbf{r}, \mathbf{r}')$, and $\hat{n}'$ denotes the outward normal to the integration surface S. Note that $\hat{n}' = \hat{n}(\mathbf{r}')$ implies that the unit normal is a function of the integration variables, whereas $\hat{n} = \hat{n}(\mathbf{r})$ is a function of the observation variables.
For perfect electric conductors, (1.46) can be rewritten in terms of unknown surface charges. Specifically, by making use of (1.39) and the relation $\rho_s/\epsilon = \hat{n} \cdot \mathbf{E} = -\hat{n} \cdot \nabla V(\mathbf{r})$, we obtain the usual expression

V(\mathbf{r}) = V^i(\mathbf{r}) + \oiint_S \frac{\rho_s(\mathbf{r}')}{\epsilon} G(\mathbf{r}, \mathbf{r}')\,dS'    (1.47)

where G represents either (1.43) or (1.45), as appropriate.
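The potential integrals above amount to a superposition of charge contributions weighted by a static Green's function. The sketch below assumes the standard free-space forms $1/(4\pi R)$ in three dimensions and $-\ln(R)/(2\pi)$ in two dimensions, and replaces the integral with a sum over discrete point charges; all names and sample values are hypothetical.

```python
import math

EPS0 = 8.854e-12  # free space permittivity in F/m

def green_3d(r, r_src):
    """Static 3D Green's function 1/(4*pi*|r - r'|)."""
    return 1.0 / (4.0 * math.pi * math.dist(r, r_src))

def green_2d(rho, rho_src):
    """Static 2D Green's function -(1/(2*pi)) * ln|rho - rho'|."""
    return -math.log(math.dist(rho, rho_src)) / (2.0 * math.pi)

def potential_from_charges(r, charges, eps=EPS0):
    """Sum q * G_3D / eps over (position, charge) pairs, a discrete form of the volume integral."""
    return sum(q * green_3d(r, pos) / eps for pos, q in charges)

# Hypothetical example: a single 1 nC charge at the origin, observed 1 m away
v_point = potential_from_charges((1.0, 0.0, 0.0), [((0.0, 0.0, 0.0), 1.0e-9)])
print(v_point)
```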



1.3.2 Magnetostatics

The solution of Maxwell's equations for a stationary magnetic field is similar to the procedure given above for electrostatics. In this case, Ampere's and Gauss' Magnetic Laws are

\nabla \times \mathbf{H} = \mathbf{J}
\nabla \cdot \mathbf{B} = 0    (1.48)

where the static field density is assumed to be related to the magnetic field intensity by the expression

\mathbf{B} = \mu\mathbf{H}    (1.49)

Note that in (1.48), we have not assumed a fictitious magnetic charge density. Rather, the fundamental sources of static magnetic fields are currents, J. We can define a magnetic vector potential in terms of these currents

\mathbf{A}(\mathbf{r}) = \mu\iiint_V \mathbf{J}(\mathbf{r}') G(\mathbf{r}, \mathbf{r}')\,dV'    (1.50)

where the Green's function is the same three-dimensional function used for electrostatics (1.43) or the two-dimensional function (1.45). For the latter case, the integral must be reduced to a two-dimensional one over the domain of J. With the introduction of this vector potential, solution of (1.48) with (1.49) yields the expression

\mathbf{H}(\mathbf{r}) = \iiint_V \mathbf{J}(\mathbf{r}') \times \nabla' G(\mathbf{r}, \mathbf{r}')\,dV'    (1.51)

The derivation of (1.51) can be found in most introductory electromagnetic texts [2, 3, 5], and similar expressions are possible for two-dimensional currents where the integration is taken over a surface rather than a volume.
Integral equations may be formed for magnetostatics in a similar manner to electrostatics and the interested reader is referred to [9].

1.4 SURFACE EQUIVALENCE

Surface equivalent currents are very useful in the formulation and execution of a numerical solution of Maxwell's equations. Their introduction can be readily justified in the context of the surface equivalence principle, e.g., two sources that produce the same field within a region are said to be equivalent within that region.
The surface equivalence principle states that the field exterior (or interior) to a given (possibly fictitious) surface may be exactly represented by equivalent currents placed on that surface and allowed to radiate into the region external (or internal) to that surface. For the exterior case, these equivalent currents are given in terms of the total exterior (E, H) fields while the interior fields are assumed to be zero (this is Love's equivalence principle). The appropriate currents for representing the fields exterior to the surface are given by

\hat{n} \times \mathbf{H} = \mathbf{J}
\mathbf{E} \times \hat{n} = \mathbf{M}    (1.52)

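For the interior fields, the negative of (1.52) is used. The radiated fields due to these equivalent currents are given by the integral expressions

\mathbf{E}(\mathbf{r}) = -\oiint_{S_c} \nabla \times \bar{G}(R) \cdot [\hat{n}' \times \mathbf{E}(\mathbf{r}')]\,dS' + jk_0 Z_0\oiint_{S_c} \bar{G}(R) \cdot [\hat{n}' \times \mathbf{H}(\mathbf{r}')]\,dS'    (1.53)

\mathbf{H}(\mathbf{r}) = -\oiint_{S_c} \nabla \times \bar{G}(R) \cdot [\hat{n}' \times \mathbf{H}(\mathbf{r}')]\,dS' - jk_0 Y_0\oiint_{S_c} \bar{G}(R) \cdot [\hat{n}' \times \mathbf{E}(\mathbf{r}')]\,dS'    (1.54)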

where $R = |\mathbf{r} - \mathbf{r}'|$, r and r' denote the observation and integration point, respectively, and $\hat{n}'$ is the outward directed unit normal at the point r'. The closed surface on which the equivalence theorem is applied is denoted by $S_c$. These geometrical quantities are illustrated in Figs. 1.2 and 1.3. In (1.53) and (1.54), a dyadic Green's function is required which at least satisfies the radiation condition (1.29). When J and M are radiating in free space, the dyadic Green's function is given in closed form by

\bar{G}_0(\mathbf{r}, \mathbf{r}') = -\left( \bar{I} + \frac{\nabla\nabla}{k_0^2} \right) G_0(\mathbf{r}, \mathbf{r}')    (1.55)

where $\bar{I} = \hat{x}\hat{x} + \hat{y}\hat{y} + \hat{z}\hat{z}$ is the unit dyad and the corresponding scalar Green's function is given by

G_0(\mathbf{r}, \mathbf{r}') = \frac{e^{-jk_0 R}}{4\pi R}    (1.56)

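Also,

\nabla \times \bar{G}_0(\mathbf{r}, \mathbf{r}') = -\nabla \times [\bar{I} G_0(\mathbf{r}, \mathbf{r}')] = -\nabla G_0(\mathbf{r}, \mathbf{r}') \times \bar{I}    (1.57)

implying $\nabla \times \bar{G}_0(\mathbf{r}, \mathbf{r}') \cdot \mathbf{M}(\mathbf{r}') = -\nabla G_0(\mathbf{r}, \mathbf{r}') \times \mathbf{M}(\mathbf{r}')$.
When (1.55)-(1.57) are introduced into (1.53) and (1.54), and after the use of common vector and dyadic identities, we obtain the representations

Figure 1.3 Illustration of the application of the surface equivalence principle: (a) original sources and fields (E, H); (b) equivalent surface currents, with $\mathbf{J}_s = \hat{n} \times \mathbf{H}$.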

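\mathbf{E}(\mathbf{r}) = \oiint_{S_c} \left[ \mathbf{M}(\mathbf{r}') \times \nabla G_0(\mathbf{r}, \mathbf{r}') - jk_0 Z_0\mathbf{J}(\mathbf{r}') G_0(\mathbf{r}, \mathbf{r}') + \frac{1}{jk_0 Y_0}\mathbf{J}(\mathbf{r}') \cdot \nabla\nabla G_0(\mathbf{r}, \mathbf{r}') \right] dS'    (1.58)

\mathbf{H}(\mathbf{r}) = \oiint_{S_c} \left[ -\mathbf{J}(\mathbf{r}') \times \nabla G_0(\mathbf{r}, \mathbf{r}') - jk_0 Y_0\mathbf{M}(\mathbf{r}') G_0(\mathbf{r}, \mathbf{r}') + \frac{1}{jk_0 Z_0}\mathbf{M}(\mathbf{r}') \cdot \nabla\nabla G_0(\mathbf{r}, \mathbf{r}') \right] dS'    (1.59)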

More explicit expressionsfor E and H can be obtained by introducing the identities

Grfr. r') V* = \320\2410{


=^~ -(/ft,, +1)

in which R = (r- r'Vlr \342\200\224


r'|. as depicted in Fig. 1.2. Specifically. A.58) becomes

[M(r') x

'
-ji-\342\200\224
lj(r')
\302\253\320\276\320\273
(A-(l\302\253)-J

and a corresponding expression for H can be obtained by invoking duality ($\mathbf{E} \to \mathbf{H}$, $\mathbf{H} \to -\mathbf{E}$, $\mathbf{J} \to \mathbf{M}$, $\mathbf{M} \to -\mathbf{J}$, $\mu \to \epsilon$, $\epsilon \to \mu$, $Z_0 \to Y_0$, $Y_0 \to Z_0$).
We cun rewrite A.58) in a. less singular form by noting the identities.
VG = -V'a,

A.61)

= VJ(r') - + J(r') \342\200\242


V(V
-
J(r'K/0(r.T')l VG',,(r, r') VVG,((r. r')
= J(rVWC,i(r.r') 1.62)

to deduce that

J(r') \342\200\242
WC/,,(r. r')dS' = -vli J(r') -
V'<7,,(r. r')(fS'

V .J(r')VGnU.r')clS'

Introducing these into A.58) yields the expression

E =\320\224 x VC0(r, r') -jk0Z0J(r')G0(r, r')


JM(r')

which is most commonly used for integral equation numerical solutions and is also valid for open surfaces since the normal components of J to the perimeter edges of the surface vanish. The corresponding H field expression is again obtained by duality.
For far zone computations ($r \to \infty$), the Green's function (1.56) can be simplified as

G_0(\mathbf{r}, \mathbf{r}') \approx \frac{e^{-jk_0 r}}{4\pi r} e^{jk_0\hat{r} \cdot \mathbf{r}'}    (1.63)

Using this in (1.58) and (1.59), carrying out the vector derivative operations, and retaining only the terms that decay¹ as O(1/r), we get

\mathbf{E}(\mathbf{r}) \approx jk_0\frac{e^{-jk_0 r}}{4\pi r}\oiint_{S_c} \left[ \hat{r} \times \mathbf{M}(\mathbf{r}') + Z_0\hat{r} \times (\hat{r} \times \mathbf{J}(\mathbf{r}')) \right] e^{jk_0\hat{r} \cdot \mathbf{r}'}\,dS'    (1.64)

\mathbf{H}(\mathbf{r}) \approx jk_0\frac{e^{-jk_0 r}}{4\pi r}\oiint_{S_c} \left[ -\hat{r} \times \mathbf{J}(\mathbf{r}') + Y_0\hat{r} \times (\hat{r} \times \mathbf{M}(\mathbf{r}')) \right] e^{jk_0\hat{r} \cdot \mathbf{r}'}\,dS'    (1.65)

These are referred to as the far zone field expressions and are typically used for the evaluation of antenna radiated fields or for the calculation of the radar cross section (RCS) of a target. An acceptable criterion for using (1.64) and (1.65) is

r > \frac{2D^2}{\lambda_0}    (1.66)

where D is the largest antenna or target dimension. In this case, the phase error in the intervening approximations is maintained at less than $\pi/8$. A typical setup for computing the radiation from volume sources at points in the near and far zones is depicted in Fig. 1.2. Figure 1.4 also shows the spherical angles commonly used in electromagnetics and to be used throughout this text.
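The far zone criterion can be illustrated numerically. The sketch below assumes the common form $r = 2D^2/\lambda_0$ for the minimum distance and interprets the phase error as the excess path length from the edge of an aperture of width D, relative to its center, for a broadside observer; that geometric interpretation and all names are assumptions made for illustration.

```python
import math

def far_zone_distance(d_max, wavelength):
    """Minimum far zone distance r = 2*D^2/lambda."""
    return 2.0 * d_max ** 2 / wavelength

def edge_phase_error(d_max, wavelength, r):
    """Phase difference (radians) between rays from the aperture edge and its center."""
    k0 = 2.0 * math.pi / wavelength
    return k0 * (math.hypot(r, d_max / 2.0) - r)

# Hypothetical example: D = 1 m at a wavelength of 0.1 m
r_min = far_zone_distance(1.0, 0.1)
phase = edge_phase_error(1.0, 0.1, r_min)
print(r_min, phase, math.pi / 8.0)
```

At the limiting distance the computed phase difference comes out just under $\pi/8$, and it shrinks as the observer moves farther away.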

Note that use of (1.55) requires both equivalent currents (1.52). Alternatively, when J and M radiate in the presence of certain bodies such as an infinite metallic plane, a cylinder or a half-plane, a dyadic Green's function can be chosen to satisfy the boundary conditions appropriate for that surface, hence avoiding the need for currents to be placed on that surface. For example, the Dirichlet condition

\hat{n} \times \bar{G}_1 = 0 \quad \text{on } S    (1.67)

can be imposed for relating the electric or magnetic fields in the exterior region to the magnetic or electric currents, respectively. The radiated field expressions, (1.53) and (1.54), now simplify to

\mathbf{E}(\mathbf{r}) = -\oiint_{S_c} \nabla \times \bar{G}_1(R) \cdot [\hat{n}' \times \mathbf{E}(\mathbf{r}')]\,dS'    (1.68)

\mathbf{H}(\mathbf{r}) = -\oiint_{S_c} \nabla \times \bar{G}_1(R) \cdot [\hat{n}' \times \mathbf{H}(\mathbf{r}')]\,dS'    (1.69)

¹ The notation O(1/r) is read as "order of 1/r."

Figure 1.4 Illustration of an infinitesimal source and the associated coordinates and angles.

implying that only a single current is required for the representation of each field quantity.
One use of this Green's function is in the calculation of the scattering by a perfect magnetic conductor (a fictitious material) through the use of equivalent electric currents.
If the Neumann boundary condition is imposed

\hat{n} \times \nabla \times \bar{G}_2 = 0 \quad \text{on } S    (1.70)

the resulting field integral expressions are again given in terms of only one current

\mathbf{E}(\mathbf{r}) = jk_0 Z_0\oiint_{S_c} \bar{G}_2(R) \cdot [\hat{n}' \times \mathbf{H}(\mathbf{r}')]\,dS'    (1.71)

\mathbf{H}(\mathbf{r}) = -jk_0 Y_0\oiint_{S_c} \bar{G}_2(R) \cdot [\hat{n}' \times \mathbf{E}(\mathbf{r}')]\,dS'    (1.72)

This Green's function ($\bar{G}_2$) is useful for calculating the scattering by a perfect electric conductor using electric currents. Also, this Green's function will be used to relate the electric field quantities of an interior finite element formulation to the magnetic field of the bounding surface.
The above dyadic Green's functions, $\bar{G}_1$ and $\bar{G}_2$, are commonly termed the first and the second kind dyadic Green's functions, respectively. A good discussion of dyadic Green's functions used in electromagnetics is given in [10].
The above expressions are for three-dimensional fields. In the case of two-dimensional fields (e.g., one dimension, such as z, is invariant), similar expressions are used. These are scalar and typically written in terms of $TM_z$ and $TE_z$ polarizations. For $TM_z$ ($H_z = E_x = E_y = 0$), the electric field on or outside a contour C due to fields on that surface is given by

E. = i t')]dl' A. 73)
~'*<>Z\302\260I H,(r')[G2n(r,
?-<r')|\"^7 G2\302\260(t-r<)]dV

since dEJ'dn= +jk0Z0H,. The corresponding expression for \320\242\320\225.


(?. = Hx =
JJr = 0) polarization is

H. = l G2D(r,r')\\dl' +jkoYoi A.74)


H:(r')\\?;
In (1.73) and (1.74), the subscript "t" denotes the tangential component of the field along the unit vector $\hat{t}$, where $\hat{n} \times \hat{t} = \hat{z}$. Also, the two-dimensional Green's function is given by

G_{2D}(\mathbf{r}, \mathbf{r}') = -\frac{j}{4} H_0^{(2)}(k_0|\mathbf{r} - \mathbf{r}'|)    (1.75)

In (1.75), $H_0^{(2)}(\cdot)$ denotes the zeroth-order Hankel function of the second kind. We observe that (1.73) and (1.74) can be reduced from (1.46) by replacing the potential V with $E_z$ or $H_z$ and making use of the equivalent current relations (1.52) and Maxwell's equations to relate E and H. For the special cases of the $TM_z$ and $TE_z$ polarizations, the first two of Maxwell's equations imply

\mathbf{H} = \frac{1}{jk_0 Z_0}\hat{z} \times \nabla E_z, \qquad \mathbf{E} = -\frac{1}{jk_0 Y_0}\hat{z} \times \nabla H_z    (1.76)

All of the expressions presented in this section are given for currents radiating in free space. Fields within a homogeneous media can be determined by replacing $k_0$ with k and $Z_0$ with Z in all of these expressions. Throughout this text, k will denote the wavenumber in the homogeneous media ($k = \sqrt{\epsilon_r\mu_r}\,k_0$) and Z is the intrinsic impedance of that material ($Z = \sqrt{\mu/\epsilon}$).

1.5 NATURAL BOUNDARY CONDITIONS

Maxwell's equations cannot be solved without the specification of the required boundary conditions at material interfaces. The pertinent boundary conditions can be derived directly from the integral form of Maxwell's equations. Specifically, (1.25) is applied to the contour C illustrated in Fig. 1.5(a) with S being the area enclosed by C. Assuming $\Delta t$ is small, $\Delta h \to 0$ and $\Delta t \gg \Delta h$, (1.25) gives

(\mathbf{H}_1 - \mathbf{H}_2) \cdot \hat{t} = [\mathbf{J}_i \cdot (\hat{n}_1 \times \hat{t})]\,\Delta h    (1.77)

and in deriving this we set

lim + e2E2] \342\200\242nl=0


^\320\224\320\233[\320\265,\320\225!

which is valid provided $\epsilon\mathbf{E}$ is finite at the interface. When (1.26) is applied to the same contour in Fig. 1.5(a) we find that

(\mathbf{E}_1 - \mathbf{E}_2) \cdot \hat{t} = -[\mathbf{M}_i \cdot (\hat{n}_1 \times \hat{t})]\,\Delta h    (1.78)

Figure 1.5 Geometries for deriving the boundary conditions (a) for tangential components, and (b) for normal components.

where we have again set


lim ^- = \320\236
&h\342\200\224n

implying that $\mu\mathbf{H}$ is finite at the dielectric interface.
The conditions (1.77) and (1.78) can be rewritten in vector form and more compactly by introducing the definitions

\mathbf{J}_{is} = \mathbf{J}_i\,\Delta h    (1.79)
\mathbf{M}_{is} = \mathbf{M}_i\,\Delta h    (1.80)

giving the conditions

\hat{n}_1 \times (\mathbf{H}_1 - \mathbf{H}_2) = \mathbf{J}_{is}    (1.81)
\hat{n}_1 \times (\mathbf{E}_1 - \mathbf{E}_2) = -\mathbf{M}_{is}    (1.82)

The quantities $\mathbf{J}_{is}$ and $\mathbf{M}_{is}$ are referred to as the impressed electric and magnetic surface current densities in A/m and V/m, respectively, at the interface. Note that if $\mathbf{E}_2$ and $\mathbf{H}_2$ are zero, these conditions are identical to (1.52) except that in this case $\mathbf{J}_{is}$ and $\mathbf{M}_{is}$ refer to actual impressed currents rather than equivalent currents.
To generate the boundary conditions corresponding to (1.27) and (1.28), we select $S_c$ to be the surface of a small pill box, shown in Fig. 1.5(b), enclosing the volume V. The pill box is positioned at the dielectric interface so that half of its volume is in medium 1 and the other half in medium 2. It is again assumed that $\Delta h \to 0$ so that only its flat surfaces need be considered in performing the integrations. Through direct integration of (1.27) we obtain the interface conditions

\hat{n}_1 \cdot (\mu_1\mathbf{H}_1 - \mu_2\mathbf{H}_2) = \rho_{ms}    (1.83)
\hat{n}_1 \cdot (\epsilon_1\mathbf{E}_1 - \epsilon_2\mathbf{E}_2) = \rho_s    (1.84)

where $\rho_s$ denotes the unbounded electric surface charge density in C/m^2 at the interface and $\rho_{ms}$ is the corresponding fictitious surface magnetic charge density in Wb/m^2.
The boundary/interface conditions (1.81)-(1.84), although derived for time-harmonic fields, are applicable for instantaneous fields as well. In the time-harmonic case, only (1.81) and (1.82) are required in conjunction with (1.23) and (1.24) for a unique solution of the fields.


If we ignore the fictitious magnetic currents and charges appearing in (1.81)-(1.84), the boundary conditions are

\hat{n}_1 \times (\mathbf{H}_1 - \mathbf{H}_2) = \mathbf{J}_{is}    (1.85)
\hat{n}_1 \times (\mathbf{E}_1 - \mathbf{E}_2) = 0    (1.86)
\hat{n}_1 \cdot (\mu_1\mathbf{H}_1 - \mu_2\mathbf{H}_2) = 0    (1.87)
\hat{n}_1 \cdot (\epsilon_1\mathbf{E}_1 - \epsilon_2\mathbf{E}_2) = \rho_s    (1.88)

The first two of these state that the tangential electric fields are continuous across the interface whereas the tangential magnetic fields are discontinuous at the same location by an amount equal to the impressed electric current. Unless a source (i.e., free charge) is actually placed at the interface, $\mathbf{J}_{is}$ is also zero and in that case, the tangential magnetic fields will be continuous across the media as well.
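The tangential conditions can be checked with a few lines of vector arithmetic. The sketch below evaluates $\hat{n}_1 \times (\mathbf{H}_1 - \mathbf{H}_2)$ and $\hat{n}_1 \times (\mathbf{E}_1 - \mathbf{E}_2)$ for hypothetical field values at a planar interface whose normal is taken along z; the helper names and numbers are illustrative only.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def subtract(a, b):
    """Component-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def tangential_jump(n1, f1, f2):
    """Return n1 x (f1 - f2), the tangential discontinuity across the interface."""
    return cross(n1, subtract(f1, f2))

n1 = (0.0, 0.0, 1.0)  # hypothetical unit normal at the interface

# Magnetic field with a tangential jump: the result is the surface current
j_surface = tangential_jump(n1, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))

# Electric fields differing only in the normal component: no tangential jump
e_jump = tangential_jump(n1, (2.0, 3.0, 5.0), (2.0, 3.0, 1.0))
print(j_surface, e_jump)
```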
When medium 2 is a perfect electric conductor then $\mathbf{E}_2 = \mathbf{H}_2 = 0$. In addition, $\mathbf{M}_{is}$ and $\rho_{ms}$ vanish and (1.81)-(1.84) reduce to

\hat{n}_1 \times \mathbf{H}_1 = \mathbf{J}_{is}    (1.89)
\hat{n}_1 \times \mathbf{E}_1 = 0    (1.90)
\hat{n}_1 \cdot (\mu_1\mathbf{H}_1) = 0    (1.91)
\hat{n}_1 \cdot (\epsilon_1\mathbf{E}_1) = \rho_s    (1.92)

The first two of these now imply that the tangential electric field vanishes on the surface of the perfect electric conductor whereas the tangential magnetic field is equal to the impressed electric surface current on the conductor.


1.6 APPROXIMATE BOUNDARY CONDITIONS

In the previous section, the boundary conditions which must be imposed at the interface of different dielectrics were presented. Sometimes, it is difficult to utilize these conditions since excessive computational cost is required or the resulting formulation is numerically unstable, such as the case of a thin dielectric sheet. In many cases, much simpler approximate boundary conditions that account for the presence of an inhomogeneous medium, coated metallic surface, or a thin dielectric layer can be employed to simulate the actual surface. Below we discuss two types of such approximate conditions: impedance boundary and sheet transition conditions. The interested reader is directed to [11] for a general treatment of approximate conditions.

1.6.1 Impedance Boundary Conditions

The most widely applied approximate conditions are referred to as the Standard Impedance Boundary Conditions (SIBC) or Leontovich boundary conditions. They are derived by considering a plane wave impinging upon a material half-space. Consider a material-air interface which corresponds to the y = 0 plane. The SIBC takes the form

E_z = -\eta Z_0 H_x, \qquad E_x = \eta Z_0 H_z    (1.93)

where the free space impedance is given by $Z_0 = \sqrt{\mu_0/\epsilon_0}$ and the normalized material characteristic impedance is $\eta$. An important concept to understand is that these conditions are applied slightly above the interface (assuming a plane wave originating in the upper half-space) at y = 0+. Combining these two conditions, the vector form of the SIBC is given by

\hat{n} \times (\hat{n} \times \mathbf{E}) = -\eta Z_0\,\hat{n} \times \mathbf{H}    (1.94)

where the outward directed unit normal, $\hat{n}$, is shown in Fig. 1.6. This form of the SIBC is not restricted to a particular interface [as is the case with (1.93)] and is commonly applied to convex surfaces such as a sphere, cylinder, etc.
All of the quantities used in (1.94) are familiar and well defined except for the normalized impedance, $\eta$. One means of deriving this quantity is to demand that the reflected field attributed to (1.94) is identical to that due to the natural boundary conditions. Then,

\eta = \sqrt{\frac{\mu_r}{\epsilon_r}}    (1.95)

This is exact for an infinite planar interface while it is approximate for a curved boundary provided that

|\mathrm{Im}(\sqrt{\epsilon_r\mu_r})|\,k_0 r_i \gg 1    (1.96)

where Im(·) denotes the imaginary part of the complex argument and the principal radii of curvature, $r_i$, are associated with the surface at a point. This condition assures that the material is sufficiently lossy so that the fields which penetrate into the material do not re-emerge at some other point.
Figure 1.6 Simulation of dielectric boundaries and coatings with SIBCs: (a), (b) dielectric boundaries replaced by an impedance surface; (c) dielectric coating on a perfect conductor replaced by an impedance surface.

For a coated conductor, the choice of $\eta$ is typically found by considering a shorted transmission line model with length corresponding to the coating thickness, t:

\eta = j\sqrt{\frac{\mu_r}{\epsilon_r}}\tan\left(k_0\sqrt{\mu_r\epsilon_r}\,t\right)    (1.97)

However, this condition is derived at normal incidence and deteriorates at oblique angles with increasing inaccuracy for thicker coatings.
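The normalized impedances used with the SIBC can be evaluated as in the sketch below. It assumes the half-space value $\sqrt{\mu_r/\epsilon_r}$ and, for the coated conductor, the standard shorted transmission line input impedance $j\sqrt{\mu_r/\epsilon_r}\tan(k_0\sqrt{\mu_r\epsilon_r}\,t)$ at normal incidence; both function names and the sample coating are hypothetical.

```python
import cmath
import math

def eta_half_space(eps_r, mu_r):
    """Normalized impedance sqrt(mu_r/eps_r) of a material half-space."""
    return cmath.sqrt(mu_r / eps_r)

def eta_coated_conductor(eps_r, mu_r, thickness, wavelength0):
    """Normalized impedance of a conductor-backed coating (shorted line model, normal incidence)."""
    k0 = 2.0 * math.pi / wavelength0
    electrical_length = k0 * cmath.sqrt(mu_r * eps_r) * thickness
    return 1j * cmath.sqrt(mu_r / eps_r) * cmath.tan(electrical_length)

# Hypothetical lossless coating: eps_r = 4, mu_r = 1, one eighth of a wavelength
# thick inside the material (free space wavelength of 1 m)
eta_bulk = eta_half_space(4.0, 1.0)
eta_coat = eta_coated_conductor(4.0, 1.0, thickness=1.0 / 16.0, wavelength0=1.0)
print(eta_bulk, eta_coat)
```

A lossless coating yields a purely reactive impedance, consistent with a short circuit seen through a length of lossless line.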



The SIBCs can be applied for modeling surfaces whose material properties vary slowly in the transverse plane. For a planar interface, the coating can have a varying composition in the normal dimension, and Rytov [12] found the following impedance is useful
I \320\257 1

where $N = \sqrt{\epsilon_r\mu_r}$ is the index of refraction and the normal derivative is applied at the surface.
More accurate approximate conditions can be developed by incorporating higher order derivatives in their constructions. These are referred to as Generalized Impedance Boundary Conditions (GIBCs), and these are discussed in [11].

1.6.2 Sheet Transition Conditions

A thin dielectric layer may be replaced with an equivalent sheet model to simplify the analysis. Consider a thin dielectric layer with thickness $\tau$, as shown in Fig. 1.7. This layer has conductivity $\sigma$ and will support a volume current density

\mathbf{J} = \sigma\mathbf{E}    (1.99)

where E is the electric field within the layer. For $\tau \ll \lambda$, this volume current may be replaced with an equivalent sheet current (having units of A/m)

\mathbf{J}_s = \tau\mathbf{J}    (1.100)

and from (1.99)

\mathbf{E} = \frac{\mathbf{J}_s}{\tau\sigma} = Z_0 R_e\mathbf{J}_s    (1.101)

This condition is referred to as a resistive sheet transition condition which only supports a single surface current, $\mathbf{J}_s$. The parameter $R_e$ is the normalized resistivity of the sheet and is measured in Ohms per square.


The electric field was assumed to be tangential to the sheet in the above derivation. A more general expression for the resistive sheet condition is given by

\hat{n} \times (\hat{n} \times \mathbf{E}) = -Z_0 R_e\mathbf{J}_s    (1.102)

Furthermore, it is desirable to utilize fields just outside or inside the sheet surface. Since the tangential electric field is continuous across the sheet

\hat{n} \times [\hat{n} \times (\mathbf{E}^+ + \mathbf{E}^-)] = -2Z_0 R_e\mathbf{J}_s
\hat{n} \times (\mathbf{E}^+ - \mathbf{E}^-) = 0    (1.103)

Figure 1.7 Simulation of thin dielectric layer with a sheet condition.

The superscripts ± denote the fields just above and below the sheet, and the introduction of the second equation in (1.103) is necessary to maintain equivalency with (1.102). The natural boundary conditions may be used to rewrite (1.103) as

\hat{n} \times [\hat{n} \times (\mathbf{E}^+ + \mathbf{E}^-)] = -2Z_0 R_e\,\hat{n} \times (\mathbf{H}^+ - \mathbf{H}^-)
\hat{n} \times (\mathbf{E}^+ - \mathbf{E}^-) = 0    (1.104)
As long as the loss in the layer is sufficient to assure that no multiple field penetrations will occur, these resistive transition conditions may be used for curved layers.
The dual to the resistive sheet condition is the conductive sheet condition which supports only a surface magnetic current. It is given by

\hat{n} \times [\hat{n} \times (\mathbf{H}^+ + \mathbf{H}^-)] = 2Y_0 R_m\,\hat{n} \times (\mathbf{E}^+ - \mathbf{E}^-)
\hat{n} \times (\mathbf{H}^+ - \mathbf{H}^-) = 0    (1.105)

The normalized conductivity of this sheet is denoted by $R_m$ with units Mhos per square. This condition is required for the simulation of materials which have non-trivial permeability. Also, a special combination of coincident resistive and conductive sheets with respective resistivity and conductivity

R_e = \frac{\eta}{2}, \qquad R_m = \frac{1}{2\eta}    (1.106)

yields the same result as an impedance sheet with impedance $\eta$, and (1.106) implies that $4R_e R_m = 1$. This set of sheets is useful in simplifying the analysis of a planar impedance sheet since coplanar resistive and conductive sheets are uncoupled.
The resistivity of a dielectric layer can be determined by considering the equivalent polarization current

\mathbf{J} = jk_0 Y_0(\epsilon_r - 1)\mathbf{E}    (1.107)

and (1.100). It follows that the tangential components of the field are given by

\mathbf{E}_t = Z_0 R_e\mathbf{J}_s    (1.108)

with

R_e = \frac{-j}{k_0\tau(\epsilon_r - 1)}    (1.109)

A dual conductive sheet is given by

R_m = \frac{-j}{k_0\tau(\mu_r - 1)}    (1.110)

More accurate representations can be formed using Generalized Sheet Transition Conditions (GSTCs) [11] which incorporate higher order derivatives in their construction.
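The normalized sheet resistivity of a thin dielectric layer follows from combining the polarization current with the sheet current definition $\mathbf{J}_s = \tau\mathbf{J}$, which gives $R_e = -j/(k_0\tau(\epsilon_r - 1))$. The sketch below evaluates that expression; the function name and the sample layer are hypothetical.

```python
import math

def sheet_resistivity(eps_r, thickness, wavelength0):
    """Normalized resistivity R_e = -j / (k0 * tau * (eps_r - 1)) of a thin dielectric layer."""
    k0 = 2.0 * math.pi / wavelength0
    return -1j / (k0 * thickness * (eps_r - 1.0))

# Hypothetical layer: eps_r = 3, thickness 0.01 m, free space wavelength 1 m
r_e = sheet_resistivity(eps_r=3.0, thickness=0.01, wavelength0=1.0)

# A lossy layer (complex eps_r with negative imaginary part) gives a positive real part
r_e_lossy = sheet_resistivity(eps_r=3.0 - 1.0j, thickness=0.01, wavelength0=1.0)
print(r_e, r_e_lossy)
```

For a lossless dielectric the result is purely imaginary, so the sheet is reactive rather than dissipative; loss in the layer adds a positive resistive part.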

1.7 POYNTING'S THEOREM

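The quantity ("*" indicates complex conjugation)

\mathbf{S} = \frac{1}{2}\mathbf{E} \times \mathbf{H}^*    (1.111)

is known as the complex Poynting vector and has units of Watts/m^2. It represents the complex power density of the wave, and it is therefore important to understand the source and nature of this power. To do so, we refer to (1.21) and (1.22), where by dotting the conjugate of the first with E and the second with H*, we have

\mathbf{E} \cdot \nabla \times \mathbf{H}^* = \mathbf{J}_i^* \cdot \mathbf{E} - j\omega\epsilon^*\mathbf{E}^* \cdot \mathbf{E} = \mathbf{J}_i^* \cdot \mathbf{E} - j\omega\epsilon^*|\mathbf{E}|^2    (1.112)

\mathbf{H}^* \cdot \nabla \times \mathbf{E} = -\mathbf{M}_i \cdot \mathbf{H}^* - j\omega\mu\mathbf{H} \cdot \mathbf{H}^* = -\mathbf{M}_i \cdot \mathbf{H}^* - j\omega\mu|\mathbf{H}|^2    (1.113)

From the vector identity [6]

\nabla \cdot (\mathbf{E} \times \mathbf{H}^*) = \mathbf{H}^* \cdot \nabla \times \mathbf{E} - \mathbf{E} \cdot \nabla \times \mathbf{H}^*    (1.114)

we then obtain

\nabla \cdot (\mathbf{E} \times \mathbf{H}^*) = j\omega\epsilon^*|\mathbf{E}|^2 - j\omega\mu|\mathbf{H}|^2 - \mathbf{J}_i^* \cdot \mathbf{E} - \mathbf{M}_i \cdot \mathbf{H}^*    (1.115)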

which is an identity valid everywhere in space. Integrating both sides of this over a volume V containing all sources, and invoking the divergence theorem yields

\oiint_{S_c} (\mathbf{E} \times \mathbf{H}^*) \cdot d\mathbf{s} = \iiint_V \left[ j\omega\epsilon^*|\mathbf{E}|^2 - j\omega\mu|\mathbf{H}|^2 - \mathbf{J}_i^* \cdot \mathbf{E} - \mathbf{M}_i \cdot \mathbf{H}^* \right] dv    (1.116)

which is commonly referred to as Poynting's theorem. Since $S_c$ is closed, based on energy conservation one deduces that the right hand side of (1.116) must represent the sum of the power stored or radiated, i.e., escaping, out of the volume V. Each term of the volume integral of (1.116) is associated with a specific type of power but before proceeding with their identification, it is instructive that $\epsilon^*$ be first replaced by $\epsilon_0\epsilon_r + j\sigma/\omega$. Equation (1.116) can then be rewritten as

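\[ \frac{1}{2}\,\mathrm{Re}\oint_{S_c} (\mathbf{E}\times\mathbf{H}^*)\cdot d\mathbf{s} = P_{ei} + P_{mi} - P_d \tag{1.117} \]

\[ \frac{1}{2}\,\mathrm{Im}\oint_{S_c} (\mathbf{E}\times\mathbf{H}^*)\cdot d\mathbf{s} = 2\omega\,(W_e - W_m) - \frac{1}{2}\,\mathrm{Im}\iiint_V \left[\mathbf{J}_i^*\cdot\mathbf{E} + \mathbf{M}_i\cdot\mathbf{H}^*\right] dv \tag{1.118} \]

where

\[ P_{ei} = -\frac{1}{2}\iiint_V \mathrm{Re}(\mathbf{J}_i^*\cdot\mathbf{E})\,dv = \text{average outgoing power due to the impressed current } \mathbf{J}_i \tag{1.119} \]

\[ P_{mi} = -\frac{1}{2}\iiint_V \mathrm{Re}(\mathbf{M}_i\cdot\mathbf{H}^*)\,dv = \text{average outgoing power due to the impressed current } \mathbf{M}_i \tag{1.120} \]

\[ P_d = \frac{1}{2}\iiint_V \sigma|\mathbf{E}|^2\,dv = \text{average power dissipated in } V \tag{1.121} \]

\[ W_e = \frac{1}{4}\iiint_V \epsilon_0\epsilon_r|\mathbf{E}|^2\,dv = \text{average electric energy stored in } V \tag{1.122} \]

\[ W_m = \frac{1}{4}\iiint_V \mu_0\mu_r|\mathbf{H}|^2\,dv = \text{average magnetic energy stored in } V \tag{1.123} \]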



The time-averaged power delivered to the electromagnetic field outside $V$ is clearly
the sum of $P_{ei}$ and $P_{mi}$, whereas $P_d$ is that dissipated in $V$ due to
conductor losses. Thus, we may consider

\[ \frac{1}{2}\,\mathrm{Re}\oint_{S_c} (\mathbf{E}\times\mathbf{H}^*)\cdot d\mathbf{s} \tag{1.124} \]

to be the average or radiated power outside $V$ if $\sigma$ is zero in $V$. Expression
(1.118) gives the reactive power, i.e., that which is stored within $V$ and is not
allowed to escape outside the boundary $S_c$.
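The time-averaged power density discussed above can be checked numerically for the simplest case, a uniform plane wave in free space. The following is a minimal sketch, assuming peak-value phasors and a wave travelling along +z. The function names, the amplitude `E0`, and the phase factor are illustrative choices, not quantities taken from the text.

```python
import numpy as np

Z0 = 376.730313668  # free-space wave impedance in ohms

def complex_poynting(E, H):
    """Return E x H* for complex phasor vectors E and H (peak values)."""
    return np.cross(E, np.conj(H))

def time_average_power_density(E, H):
    """Time-averaged power density 0.5 * Re(E x H*) in W/m^2."""
    return 0.5 * np.real(complex_poynting(E, H))

# Plane wave travelling along +z: E along x, H along y, common phase factor.
E0 = 1.0                    # V/m, peak amplitude (illustrative value)
phase = np.exp(-1j * 0.3)   # arbitrary propagation phase, cancels in E x H*
E = np.array([E0, 0.0, 0.0], dtype=complex) * phase
H = np.array([0.0, E0 / Z0, 0.0], dtype=complex) * phase

S = complex_poynting(E, H)
S_avg = time_average_power_density(E, H)
print("complex Poynting vector:", S)
print("time-averaged power density (W/m^2):", S_avg)
```

For this lossless, source-free wave the complex Poynting vector is purely real, so all of the power is radiated and none is reactive.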

1.8 UNIQUENESS THEOREM

Whenever one pursues a solution to a set of equations it is important to know a
priori whether this solution is unique and if not, what are the required conditions for
a unique solution. This is important because depending on the application, different
analytical or numerical methods will likely be used for the solution of Maxwell's
equations. Given that Maxwell's equations (subject to the appropriate boundary
conditions) yield a unique solution, one is then comforted to know that any
convenient method of analysis will yield the correct solution to the problem.
The most common form of the uniqueness theorem is: In a region $V$ completely
occupied with dissipative media, a harmonic field $(\mathbf{E}, \mathbf{H})$ is uniquely
determined by the impressed currents in that region plus the tangential components of
the electric or magnetic fields on the closed surface $S_c$ bounding $V$. This theorem
may be proved by assuming for the moment that two solutions exist, denoted by
$(\mathbf{E}_1, \mathbf{H}_1)$ and $(\mathbf{E}_2, \mathbf{H}_2)$. Both fields must, of
course, satisfy Maxwell's equations (1.21) and (1.22) with the same impressed currents
$(\mathbf{J}_i, \mathbf{M}_i)$. We have

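\[ \nabla\times\mathbf{H}_1 = \mathbf{J}_i + j\omega\epsilon\mathbf{E}_1, \qquad \nabla\times\mathbf{H}_2 = \mathbf{J}_i + j\omega\epsilon\mathbf{E}_2 \]
\[ \nabla\times\mathbf{E}_1 = -\mathbf{M}_i - j\omega\mu\mathbf{H}_1, \qquad \nabla\times\mathbf{E}_2 = -\mathbf{M}_i - j\omega\mu\mathbf{H}_2 \]

and when these are subtracted we obtain

\[ \nabla\times\mathbf{H}' = j\omega\epsilon\mathbf{E}' \tag{1.126} \]
\[ \nabla\times\mathbf{E}' = -j\omega\mu\mathbf{H}' \tag{1.127} \]

where $\mathbf{E}' = \mathbf{E}_1 - \mathbf{E}_2$ and $\mathbf{H}' = \mathbf{H}_1 - \mathbf{H}_2$.
To prove the theorem it is then necessary to show that $(\mathbf{E}', \mathbf{H}')$ are
zero or, equivalently, if no sources are enclosed by a volume $V$, the fields in that
volume are zero for a given set of tangential electric and magnetic fields on $S_c$.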
As a corollary to the uniqueness theorem, it can be shown that if a harmonic
field has a zero tangential electric or magnetic field on a surface enclosing a
source-free region $V$ occupied by dissipative media, the field vanishes everywhere
within $V$.

The usual proof of the uniqueness theorem can be found in many electromagnetics
texts (see, for example, [5]).

1.9 SUPERPOSITION THEOREM

The superposition theorem states that for a linear medium, the total field intensity
due to two or more sources is equal to the sum of the field intensities attributed to
each individual source radiating independent of the others. In particular, let us
consider two electric sources $\mathbf{J}_1$ and $\mathbf{J}_2$. On the basis of the
superposition theorem, to find the total field caused by the simultaneous presence of
both sources, we can consider the field due to each individual source in isolation.
The fields $(\mathbf{E}_1, \mathbf{H}_1)$ due to $\mathbf{J}_1$ satisfy the equations

\[ \nabla\times\mathbf{H}_1 = \mathbf{J}_1 + j\omega\epsilon\mathbf{E}_1 \tag{1.128} \]
\[ \nabla\times\mathbf{E}_1 = -j\omega\mu\mathbf{H}_1 \tag{1.129} \]

and the fields corresponding to $\mathbf{J}_2$ satisfy

\[ \nabla\times\mathbf{H}_2 = \mathbf{J}_2 + j\omega\epsilon\mathbf{E}_2 \tag{1.130} \]
\[ \nabla\times\mathbf{E}_2 = -j\omega\mu\mathbf{H}_2 \tag{1.131} \]

By adding these two sets of equations, it is clear that the total field due to both
sources combined is given by

\[ \mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2, \qquad \mathbf{H} = \mathbf{H}_1 + \mathbf{H}_2 \tag{1.132} \]

where $(\mathbf{E}_1, \mathbf{H}_1)$ and $(\mathbf{E}_2, \mathbf{H}_2)$ are obtained by
solving separately (1.128)-(1.129) and (1.130)-(1.131), respectively.
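In discrete form the superposition theorem is simply the linearity of a matrix solve. The sketch below assumes a small, arbitrary, diagonally dominant complex matrix standing in for a discretized linear operator; the matrix and the two excitation vectors are illustrative and do not come from the text.

```python
import numpy as np

# Hypothetical discretized linear operator (strictly diagonally dominant,
# hence nonsingular) and two independent excitations.
n = 4
A = (4.0 + 0.5j) * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
b1 = np.array([1.0, 0.0, 2.0j, 0.0])    # excitation due to source 1
b2 = np.array([0.0, -1.0, 0.0, 3.0])    # excitation due to source 2

u1 = np.linalg.solve(A, b1)             # field due to source 1 alone
u2 = np.linalg.solve(A, b2)             # field due to source 2 alone
u_total = np.linalg.solve(A, b1 + b2)   # field with both sources present

print("max |u_total - (u1 + u2)| =", np.max(np.abs(u_total - (u1 + u2))))
```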

1.10 DUALITY THEOREM

The duality theorem relates to the interchangeability of the electric and magnetic
fields, currents, charges, or material properties. We observe from (1.3) and (1.4) that
the first can be obtained from the second via the interchanges

$\mathbf{M} \to -\mathbf{J}$

HE:HE

Similarly, (1.4) can be obtained from (1.3) via the interchanges

$\mathbf{J} \to \mathbf{M}$

The duality theorem can reduce formulation and computational effort when

one is able to invoke it for a particular application.



1.11 NUMERICAL TECHNIQUES

For numerical solutions, all governing equations can be written in operator form as

\[ \mathcal{L}u - f = 0 \tag{1.135} \]
subject to appropriate boundary or transition conditions

\[ B(u) = 0 \tag{1.136} \]
within the domain ($\Omega$) and on its boundary ($S_c = \partial\Omega$). In these, the
operator $\mathcal{L}$ is based on one of the following: an integral representation of
the fields such as (1.52)-(1.53), the vector wave equation (1.30), or the Helmholtz
equation (1.36) for scalar fields. It is understood that $u$ must be replaced by a
vector field $\mathbf{u}$ when dealing with the vector wave equations (1.30) or (1.35).
The forcing function $f$ is a known excitation function while $u$ or $\mathbf{u}$ is the
unknown quantity. Throughout this text, $u$ or $\mathbf{u}$ will denote a field or
current density.

Unfortunately, very few analytical solutions for (1.135) are available in
electromagnetics. One such solution, the fields due to a magnetic dipole in the presence
of an infinite metallic plane or cylinder, will be used in Chapter 7 to form the
appropriate dyadic Green's function for those geometries. However, most useful
electromagnetic scattering and radiation problems cannot be solved using analytical
methods. Rather, an approximate numerical solution is sought which in some way
closely resembles the exact solution. Two methods of formulating such an
approximate solution are: the Ritz method and the method of weighted residuals.

1.11.1 The Ritz Method

The Ritz or Rayleigh-Ritz method (footnote 2) [13, pp. 74-78], [14, pp. 13-63] seeks a
stationary point of a variational functional. For operators which are self-adjoint and
positive-definite (see later subsection for definitions), the stationary point of the
following functional

\[ F(u) = \tfrac{1}{2}\langle \mathcal{L}u, u \rangle - \langle f, u \rangle \tag{1.137} \]

is an approximate solution of (1.135). In (1.137), the inner product over the domain
$\Omega$ (volume, surface, or contour) of the two functions is defined as

\[ \langle a, b \rangle = \int_\Omega a\,b\;d\Omega \tag{1.138} \]

or for vector functions

\[ \langle \mathbf{a}, \mathbf{b} \rangle = \int_\Omega \mathbf{a}\cdot\mathbf{b}\;d\Omega \tag{1.139} \]

The choice of this inner product extends the validity of the variational expressions to
vectorial fields. When the operator $\mathcal{L}\mathbf{u}$ and $\mathbf{f}$ in (1.137)
are chosen as

\[ \mathcal{L}\mathbf{u} = \nabla\times\left(\frac{1}{\mu_r}\nabla\times\mathbf{u}\right) - k_0^2\epsilon_r\mathbf{u} \tag{1.140} \]

(1.141)

it can be shown that setting the first variation of $F(\mathbf{u})$ to zero is
equivalent to satisfying the vector wave equation (1.30) over the computational
domain $\Omega$. Similarly, when

\[ \mathcal{L}u = \nabla^2 u + k_0^2 u \tag{1.142} \]

setting the first variation of $F(u)$ to zero is equivalent to satisfying the
inhomogeneous Helmholtz wave equation

\[ \nabla^2 u + k_0^2 u = f \tag{1.143} \]

Footnote 2: The method was originally introduced by Rayleigh in 1877 and was extended
by Ritz in 1909.


To show that setting the first variation of $F(u)$ to zero is equivalent to satisfying
the Helmholtz equation (1.143), we begin by rewriting the functional (we assume a
two-dimensional domain)

\[ F(u) = \frac{1}{2}\iint_\Omega \left[\nabla^2 u + k_0^2 u\right] u\;d\Omega - \iint_\Omega f u\;d\Omega \tag{1.144} \]

as

\[ F(u) = \frac{1}{2}\iint_\Omega \left(-\nabla u\cdot\nabla u + k_0^2 u^2\right) d\Omega + \frac{1}{2}\oint_C u\frac{\partial u}{\partial n}\,dl - \iint_\Omega f u\;d\Omega \tag{1.145} \]

in which $\nabla u\cdot\hat{n} = \partial u/\partial n$, $C$ denotes the contour enclosing
the region $\Omega$ (see Fig. 1.8) and $\hat{n}$ is the unit normal vector to $C$. Note
that in deriving (1.145) we used the identity

\[ u\nabla^2 u = -\nabla u\cdot\nabla u + \nabla\cdot(u\nabla u) \tag{1.146} \]

and the divergence theorem

\[ \iint_\Omega \nabla\cdot(u\nabla u)\;d\Omega = \oint_C u\,\nabla u\cdot\hat{n}\;dl \tag{1.147} \]
Next we proceed to evaluate the first variation of $F(u)$ given by

\[ \delta F = F(u + \Delta u) - F(u) \tag{1.148} \]

Figure 1.8 Illustration of the region $\Omega$ and the enclosing contour $C$.

where $\Delta \to 0$ is a scalar quantity. The evaluation of $\delta F$ involves the
quantities

\[ (u + \Delta u)^2 = u^2 + 2(\Delta u)u + (\Delta u)^2 \tag{1.149} \]

\[ [\nabla u + \nabla(\Delta u)]\cdot[\nabla u + \nabla(\Delta u)] = \nabla u\cdot\nabla u + 2\nabla(\Delta u)\cdot\nabla u + \nabla(\Delta u)\cdot\nabla(\Delta u) \tag{1.150} \]

\[ (u + \Delta u)\frac{\partial (u + \Delta u)}{\partial n} = u\frac{\partial u}{\partial n} + u\frac{\partial(\Delta u)}{\partial n} + (\Delta u)\frac{\partial u}{\partial n} + (\Delta u)\frac{\partial(\Delta u)}{\partial n} \tag{1.151} \]

These can be simplified by neglecting the last term of each expansion, which is of
order $\Delta^2$. Doing so yields the approximation

\[ F(u + \Delta u) \approx \frac{1}{2}\iint_\Omega \left[-\nabla u\cdot\nabla u + k_0^2 u^2\right] d\Omega - \iint_\Omega f u\;d\Omega + \frac{1}{2}\oint_C u\frac{\partial u}{\partial n}\,dl + \Delta\iint_\Omega \left[-\nabla u\cdot\nabla u + k_0^2 u^2 - f u\right] d\Omega + \frac{1}{2}\oint_C \left[u\frac{\partial(\Delta u)}{\partial n} + (\Delta u)\frac{\partial u}{\partial n}\right] dl \tag{1.152} \]

When this expression is compared with that for $F(u)$ in (1.145), we have

\[ F(u + \Delta u) \approx F(u) + \Delta\iint_\Omega u\left[\nabla\cdot\nabla u + k_0^2 u - f\right] d\Omega - \Delta\oint_C u\frac{\partial u}{\partial n}\,dl + \frac{1}{2}\oint_C \left[u\frac{\partial(\Delta u)}{\partial n} + (\Delta u)\frac{\partial u}{\partial n}\right] dl \tag{1.153} \]

where we also used the divergence theorem (1.147) and the identity (1.146) to obtain
the second and third terms. Clearly, the last two terms in (1.153) cancel each other
leading to

\[ \delta F = F(u + \Delta u) - F(u) = \Delta\iint_\Omega u\left[\nabla^2 u + k_0^2 u - f\right] d\Omega \tag{1.154} \]
Thus, setting SF = 0 implies that V2m + klu =f provided \320\224\320\274


is nonzero. That is,
from A.154) we conclude that the extremization of F obtained by setting
8F
SF = 0 or -r- = 0
\320\260\320\270

is equivalent to enforcing the Helmholtz wave equation over the domain In


\320\257.

practice, the condition SF \342\200\224


0 is enforced by setting
-
dF
~
F{u + \320\220\320\270)F(u) = 0 A.155)
\320\264\320\270 \320\220\320\270 \320\224\320\274
\320\273_\320\277 \320\273\342\200\224\320\276

i.e.. by settingto zero the derivative of the functional with respect to u.


Having established the equivalence between $\delta F = 0$ and the Helmholtz
equation we can proceed with the discretization of $F(u)$ and $\delta F$ to obtain a
discrete system of equations. The discretization begins with the trial function,
$\tilde{u}$, expanded in terms of $N$ basis functions

\[ \tilde{u} = \sum_{j=1}^{N} u_j w_j = \{w\}^T\{u\} \tag{1.156} \]

where $w_j$ are the basis functions and $u_j$ are the unknown expansion coefficients. In
(1.156), column data vectors are denoted with $\{\cdot\}$ while row data vectors involve
a transposition $\{\cdot\}^T$. Substituting (1.156) into (1.137), the functional becomes

\[ F = \frac{1}{2}\{u\}^T\left[\iint_\Omega \{w\}\,\mathcal{L}\{w\}^T d\Omega\right]\{u\} - \{u\}^T\iint_\Omega \{w\} f\;d\Omega \tag{1.157} \]

where we used the inner product definition

\[ \langle \{u\}, \{v\} \rangle = \{u\}^T\{v\} \]

for discrete data vectors. This functional is extremized by allowing all partial
derivatives with respect to the coefficients, $\{u\}$, to vanish

\[ \frac{\partial F}{\partial u_i} = 0, \qquad i = 1, 2, \ldots, N \tag{1.158} \]

A single equation is obtained by differentiating with respect to each $u_i$. For
$i = 1, 2, \ldots, N$ we obtain $N$ equations which can be written as a matrix system

\[ [A]\{u\} = \{b\} \tag{1.159} \]

The elements of the matrix $[A]$ and excitation vector $\{b\}$ are given by

\[ A_{ij} = \iint_\Omega w_i\,\mathcal{L}w_j\;d\Omega, \qquad b_i = \iint_\Omega w_i f\;d\Omega \tag{1.160} \]

A word of caution: electromagnetics differs from other branches of engineering in
that no physical significance can be attached to the stationary point of the functional
(1.137). In mechanical systems, for example, minimizing this functional represents
minimization of the total potential energy of the system. However, since
electromagnetics involves complex quantities, such a statement may not be asserted.
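The Ritz discretization described in this subsection can be illustrated on a one-dimensional analogue of the Helmholtz problem, u'' + k0^2 u = f on (0, 1) with u(0) = u(1) = 0. The sketch below is only an illustration under stated assumptions: piecewise-linear "hat" basis functions on a uniform mesh, the integrated-by-parts form of the functional (so the matrix entries are the integral of -w_i' w_j' + k0^2 w_i w_j), and a nodal quadrature for the excitation vector. The function name `ritz_helmholtz_1d` and the manufactured solution sin(pi x) are hypothetical choices, not taken from the text.

```python
import numpy as np

def ritz_helmholtz_1d(n_el, k0, f):
    """Ritz solution of u'' + k0^2 u = f on (0, 1) with u(0) = u(1) = 0.

    Uses n_el uniform linear elements; f is a callable accepting an array.
    Returns the node coordinates and the nodal values of the trial function.
    """
    h = 1.0 / n_el
    x = np.linspace(0.0, 1.0, n_el + 1)
    n = n_el - 1                      # number of interior (unknown) nodes
    A = np.zeros((n, n))
    for i in range(n):
        # -int(w_i' w_i') + k0^2 int(w_i w_i)
        A[i, i] = -2.0 / h + k0**2 * (2.0 * h / 3.0)
        if i + 1 < n:
            # -int(w_i' w_{i+1}') + k0^2 int(w_i w_{i+1}); symmetric entry
            A[i, i + 1] = A[i + 1, i] = 1.0 / h + k0**2 * (h / 6.0)
    # b_i = int(w_i f), approximated by h * f(x_i) (nodal quadrature)
    b = h * f(x[1:-1])
    u = np.zeros(n_el + 1)            # boundary values stay zero
    u[1:-1] = np.linalg.solve(A, b)
    return x, u

# Manufactured test: u_exact = sin(pi x) gives f = (k0^2 - pi^2) sin(pi x).
k0 = 2.0
x, u_ritz = ritz_helmholtz_1d(64, k0, lambda s: (k0**2 - np.pi**2) * np.sin(np.pi * s))
u_exact = np.sin(np.pi * x)
print("max nodal error:", np.max(np.abs(u_ritz - u_exact)))
```

Note that the resulting matrix is symmetric but not positive-definite, which is consistent with the word of caution above about interpreting the stationary point as a minimum.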

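1.11.2 Functionals for Anisotropic Media

In three dimensions with anisotropic media [15, 16] the appropriate operator is
of the form

\[ \mathcal{L}(\mathbf{u}) = \begin{cases} \nabla\times\left(\bar{\bar{\mu}}_r^{-1}\cdot\nabla\times\mathbf{u}\right) - k_0^2\,\bar{\bar{\epsilon}}_r\cdot\mathbf{u}, & \mathbf{u} = \mathbf{E} \\ \nabla\times\left(\bar{\bar{\epsilon}}_r^{-1}\cdot\nabla\times\mathbf{u}\right) - k_0^2\,\bar{\bar{\mu}}_r\cdot\mathbf{u}, & \mathbf{u} = \mathbf{H} \end{cases} \tag{1.161} \]

with the corresponding source function given by

\[ \mathbf{f} = \begin{cases} -jk_0 Z_0\mathbf{J} - \nabla\times\left(\bar{\bar{\mu}}_r^{-1}\cdot\mathbf{M}\right), & \mathbf{u} = \mathbf{E} \\ -jk_0 Y_0\mathbf{M} + \nabla\times\left(\bar{\bar{\epsilon}}_r^{-1}\cdot\mathbf{J}\right), & \mathbf{u} = \mathbf{H} \end{cases} \tag{1.162} \]

The associated functional to be extremized is then of the form (for $\mathbf{u} = \mathbf{H}$)

\[ F(\mathbf{H}) = \frac{1}{2}\iiint_\Omega \left[\nabla\times\mathbf{H}\cdot\bar{\bar{\epsilon}}_r^{-1}\cdot\nabla\times\mathbf{H} - k_0^2\,\mathbf{H}\cdot\bar{\bar{\mu}}_r\cdot\mathbf{H}\right] dV - \iiint_\Omega \mathbf{H}\cdot\mathbf{f}\;dV \tag{1.163} \]

where

\[ \bar{\bar{\epsilon}}_r = \begin{bmatrix} \epsilon_{xx} & \epsilon_{xy} & \epsilon_{xz} \\ \epsilon_{yx} & \epsilon_{yy} & \epsilon_{yz} \\ \epsilon_{zx} & \epsilon_{zy} & \epsilon_{zz} \end{bmatrix}, \qquad \bar{\bar{\mu}}_r = \begin{bmatrix} \mu_{xx} & \mu_{xy} & \mu_{xz} \\ \mu_{yx} & \mu_{yy} & \mu_{yz} \\ \mu_{zx} & \mu_{zy} & \mu_{zz} \end{bmatrix} \tag{1.164} \]

are the permittivity and permeability tensors of the media and $\Omega$ represents a
volume. In general, for arbitrary anisotropy, this functional will lead to an
asymmetric (non-Hermitian) system. One way to obtain a symmetric system is to use the
functional

\[ F(\mathbf{u}) = \langle \mathcal{L}\mathbf{u}, \mathbf{u}_a \rangle - \langle \mathbf{u}, \mathbf{f}_a \rangle - \langle \mathbf{u}_a, \mathbf{f} \rangle \tag{1.165} \]

where $\mathbf{u}_a$ and $\mathbf{f}_a$ satisfy the partial differential equation

\[ \mathcal{L}_a\mathbf{u}_a = \mathbf{f}_a \tag{1.166} \]

in which $\mathcal{L}_a$ is the adjoint operator to $\mathcal{L}$. That is,

\[ \langle \mathcal{L}\mathbf{v}, \mathbf{u} \rangle = \langle \mathbf{v}, \mathcal{L}_a\mathbf{u} \rangle \tag{1.167} \]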

1.11.3 Method of Weighted Residuals

The method of weighted residuals [17], [18] begins with the residual

\[ R = \mathcal{L}\tilde{u} - f \tag{1.168} \]

and seeks a solution for $\tilde{u} \approx u$ by satisfying the condition $R = 0$ within
the domain $\Omega$. In general, such a solution cannot be achieved at all points in
$\Omega$. Instead, it is more practical to find a solution which satisfies the residual
condition in some average or weighted sense over $N$ subdomains of $\Omega$, viz.

\[ \int_\Omega \left[t_i\,\mathcal{L}\{w\}^T\{u\} - t_i f\right] d\Omega = 0, \qquad i = 1, 2, 3, \ldots, N \tag{1.169} \]

In general any testing function, $t_i$ (also referred to as trial or weighting
functions), may be used; however, since these functions modify the enforcement of the
boundary conditions throughout the domain, the choice of testing functions affects the
quality of the solution of (1.169). One popular testing procedure is called collocation
or point matching. In this, the testing or weighting function is a Dirac delta function,
$t_i = \delta(x - x_i)$, which implies enforcement of the boundary conditions only at
discrete points (e.g., $x_i$, $i = 1, 2, 3, \ldots, N$). Another popular choice is
termed Galerkin's procedure. When employing Galerkin's testing procedure, the testing
function is identical to the expansion function used in (1.156), e.g., $t_i = w_i$, and
the weighted residual equation is given by

\[ \left[\int_\Omega w_i\,\mathcal{L}w_j\;d\Omega\right]\{u\} = \int_\Omega w_i f\;d\Omega \tag{1.170} \]

which is identical to the Ritz procedure given above. Thus, Galerkin's method leads
to the same linear system (1.159) as the Ritz method.
As a generalization, when $F(u)$ is chosen as

\[ F(u) = \tfrac{1}{2}\langle \mathcal{L}u, u^* \rangle - \langle f, u^* \rangle \tag{1.171} \]

where the "*" indicates complex conjugation, the extremization of $F(u)$ leads to a
linear system that is identical to that obtained from the weighted residual method
with $t_i = w_i^*$.
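The two testing procedures described above (point matching and Galerkin) can be compared on a small one-dimensional problem. The sketch below is an illustration under stated assumptions: it solves u'' + k0^2 u = x on (0, 1) with u(0) = u(1) = 0, uses the entire-domain basis w_j = sin(j pi x) (for which the operator acts as a simple multiplication by k0^2 - (j pi)^2), and matches at N equally spaced interior points for collocation. All function names and the choice f(x) = x are hypothetical.

```python
import numpy as np

def exact_solution(x, k0):
    """Closed-form solution of u'' + k0^2 u = x with u(0) = u(1) = 0."""
    return x / k0**2 - np.sin(k0 * x) / (k0**2 * np.sin(k0))

def expand(coeffs, x):
    """Evaluate sum_j coeffs[j-1] * sin(j pi x) at the points x."""
    j = np.arange(1, coeffs.size + 1)
    return np.sin(np.pi * np.outer(x, j)) @ coeffs

def galerkin_coefficients(n_basis, k0):
    """Testing functions equal to the basis functions: t_i = w_i."""
    j = np.arange(1, n_basis + 1)
    lam = k0**2 - (j * np.pi)**2           # L w_j = lam_j w_j
    A = np.diag(lam / 2.0)                 # int_0^1 w_i L w_j dx
    b = (-1.0)**(j + 1) / (j * np.pi)      # int_0^1 x sin(j pi x) dx
    return np.linalg.solve(A, b)

def collocation_coefficients(n_basis, k0):
    """Testing functions are Dirac deltas at N interior match points."""
    j = np.arange(1, n_basis + 1)
    lam = k0**2 - (j * np.pi)**2
    x_match = j / (n_basis + 1.0)
    A = np.sin(np.pi * np.outer(x_match, j)) * lam   # (L w_j)(x_i)
    b = x_match                                      # f(x_i) = x_i
    return np.linalg.solve(A, b)

k0, n_basis = 2.0, 20
x = np.linspace(0.0, 1.0, 201)
u_ref = exact_solution(x, k0)
err_galerkin = np.max(np.abs(expand(galerkin_coefficients(n_basis, k0), x) - u_ref))
err_collocation = np.max(np.abs(expand(collocation_coefficients(n_basis, k0), x) - u_ref))
print("Galerkin max error:   ", err_galerkin)
print("collocation max error:", err_collocation)
```

Both procedures produce an N-by-N linear system for the same unknown coefficients; only the way the residual is weighted differs.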

1.11.4 Vector and Matrix Norms in Linear Space

A norm is a real valued function that provides a measure of the size or "length"
of a multicomponent mathematical quantity such as a vector or a matrix. It is used in
numerical analysis to provide a measure of how well a given vector approximates the
exact solution. For matrices, norms provide a single value to quantify the "size" of
the matrix $[A]$. They are often used in evaluating the numerical system's condition,
which in turn affects the stability of the solution. That is, of interest is how a small
change in the excitation or right hand side of the matrix system (1.159) affects the
solution data vector.

1.11.4.1 Vector Norms


Euclidean Norm. The most popular form for a given discrete data vector
$\{u\} = \{u_1, u_2, u_3, \ldots, u_N\}^T$ is the Euclidean norm. It is defined by
(footnote 3)

\[ \|u\|_2 = \sqrt{\sum_{i=1}^{N} u_i^2} = \sqrt{\langle \{u\}, \{u\} \rangle} = \sqrt{\{u\}^T\{u\}} \tag{1.172} \]

for real valued $\{u\}$ and by

\[ \|u\|_2 = \sqrt{\sum_{i=1}^{N} |u_i|^2} = \sqrt{\{u\}^T\{u^*\}} \tag{1.173} \]

for complex vector $\{u\}$. Here $|u_i|$ implies the absolute value of the quantity.
Throughout the book, the notation $\|u\|$ will imply the Euclidean norm of a vector
or data column unless otherwise noted.
Infinity Norm. The infinity norm of a data vector is defined by

\[ \|u\|_\infty = \max |u_i|, \qquad 1 \le i \le N \tag{1.174} \]

This norm is also referred to as the uniform vector norm or maximum magnitude norm.
Hölder Norm. The Hölder or p-norm is a generalization of the Euclidean
norm and is defined as

\[ \|u\|_p = \left(\sum_{i=1}^{N} |u_i|^p\right)^{1/p} \tag{1.175} \]

where $|u_i|^p$ denotes the $p$th power of the quantity $|u_i|$.

Footnote 3: Here $u$ is a simpler notation for $\{u\}$.
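The three vector norms defined above can be evaluated directly from their definitions. The following is a small sketch; the helper names and the sample vectors are illustrative, and `numpy.linalg.norm` is used only as an independent cross-check.

```python
import numpy as np

def euclidean_norm(u):
    """Square root of the sum of |u_i|^2 (valid for real or complex data)."""
    return np.sqrt(np.sum(np.abs(u) ** 2))

def infinity_norm(u):
    """Largest magnitude among the entries."""
    return np.max(np.abs(u))

def holder_norm(u, p):
    """p-norm: (sum |u_i|^p)^(1/p); p = 2 recovers the Euclidean norm."""
    return np.sum(np.abs(u) ** p) ** (1.0 / p)

u_real = np.array([3.0, -4.0])
u_cplx = np.array([3.0 + 4.0j, 0.0, 12.0j])

print(euclidean_norm(u_real))    # Euclidean norm of a real vector
print(euclidean_norm(u_cplx))    # Euclidean norm of a complex vector
print(infinity_norm(u_cplx))     # maximum magnitude norm
print(holder_norm(u_real, 1))    # p = 1 sums the magnitudes
```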



1.11.4.2 Matrix Norms


Frobenius Norm. The Frobenius matrix norm is a generalization of the
Euclidean vector norm. For a square matrix $[A]$, it is given by

\[ \|A\|_F = \sqrt{\sum_{i=1}^{N}\sum_{j=1}^{N} |A_{ij}|^2} \tag{1.176} \]

where $A_{ij}$ denotes the $(i,j)$ entry of the $[A]$ matrix (footnote 4). Note that
(1.176) can be generalized to nonsquare matrices and to the p-matrix norms.

Matrix Infinity Norm. The infinity or uniform matrix norm is defined by

\[ \|A\|_\infty = \max_{i}\left(\sum_{j=1}^{N} |A_{ij}|\right) \tag{1.177} \]

This specific norm is also referred to as the row-sum norm. Similarly the column-sum
matrix norm is given by

\[ \|A\|_1 = \max_{j}\left(\sum_{i=1}^{N} |A_{ij}|\right) \tag{1.178} \]

The infinity norm is referred to as the natural norm of $[A]$. It can be shown that

\[ \|A\|_\infty = \max_{\|u\|_\infty = 1}\left(\|[A]\{u\}\|_\infty\right) = \max_{i}\left(\sum_{j=1}^{N} |A_{ij}|\right) \tag{1.179} \]

For Hermitian matrices (1.178) is identical to (1.177).


Matrix Condition Number. The condition of a matrix is related to the
natural norm of $[A]$ as

\[ \mathrm{Cond}(A) = \|A\|\,\|A^{-1}\| \ge \frac{|\lambda_{\max}|}{|\lambda_{\min}|} \tag{1.180} \]

where $\|A^{-1}\|$ refers to the natural norm of the inverse of $[A]$ (the Frobenius
norm is not a natural norm). Here $\lambda_{\max}$ and $\lambda_{\min}$ denote the
maximum and minimum eigenvalues of the matrix, respectively. Since $|\lambda_{\max}|$
is a lower bound for the natural norm of $[A]$, the ratio
$|\lambda_{\max}|/|\lambda_{\min}|$ gives a conservative estimate for the condition of
$[A]$.
As an example of the importance of $\mathrm{Cond}(A)$, let us assume that due to
truncation or arithmetic errors, $[A]$ is instead approximated by $[A_\Delta]$.
A computer with $t$ decimal digits of accuracy gives

\[ \frac{\|A - A_\Delta\|}{\|A\|} \approx 10^{-t} \]

Footnote 4: Here we use the notation $\|\mathcal{A}\| = \|[A]\|$, i.e., the calligraphic
capital letter implies a matrix. Such a notation will be used later in Chapter 9.

which is a measure of the normalized error in approximating $[A]$. If the condition of
the matrix is $c = \mathrm{Cond}(A)$, the corresponding (normalized) error in the
computed solution $\{u\}$ will then be [19, 20]

\[ \frac{\|u - u_\Delta\|}{\|u\|} \approx 10^{-s} \]

where

\[ s = t - \log_{10} c \]

That is, $s$ is always smaller than $t$, implying a larger error for the final solution
vector. More specifically, an error in the seventh decimal place for the norm of $[A]$
(i.e., $t = 7$) translates to an error in the third decimal place for (the norm of) the
solution vector when the matrix condition number is $10^4$. Alternatively, if
$\mathrm{Cond}(A) = 1$, the solution vector error is of the same order (i.e., to the
same decimal place) as the matrix error itself.
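The matrix norms and the condition number of this subsection can be evaluated from their definitions for a small example. The sketch below makes two assumptions that should be read as such: the 2-by-2 matrix is arbitrary, and `solution_digits` encodes the rule of thumb that roughly log10 of the condition number decimal digits are lost, which matches the worked example in the text (seven digits, condition 10^4, three digits). The helper names are hypothetical.

```python
import numpy as np

def frobenius_norm(A):
    """Square root of the sum of |A_ij|^2 over all entries."""
    return np.sqrt(np.sum(np.abs(A) ** 2))

def row_sum_norm(A):
    """Infinity (row-sum) norm: largest sum of magnitudes along a row."""
    return np.max(np.sum(np.abs(A), axis=1))

def column_sum_norm(A):
    """Column-sum norm: largest sum of magnitudes down a column."""
    return np.max(np.sum(np.abs(A), axis=0))

def condition_number(A):
    """Cond(A) = ||A|| ||A^-1|| evaluated with the row-sum norm."""
    return row_sum_norm(A) * row_sum_norm(np.linalg.inv(A))

def solution_digits(t, cond):
    """Assumed rule of thumb: digits kept in the solution, t - log10(cond)."""
    return t - np.log10(cond)

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print("Frobenius norm :", frobenius_norm(A))
print("row-sum norm   :", row_sum_norm(A))
print("column-sum norm:", column_sum_norm(A))
print("condition      :", condition_number(A))
print("digits kept for t = 7, cond = 1e4:", solution_digits(7, 1.0e4))
```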

1.11.5 Some Matrix Definitions

In Table 1.1 we give the mathematical and descriptive definitions of matrices
often encountered in numerical analysis.

Additional definitions can be found in several numerical mathematics books
(e.g., see [19] and [20]). As can be understood, the operators which generated the
matrices in Table 1.1 also carry the same definition. That is, an operator is referred
to as Hermitian or self-adjoint if the resulting matrix is also Hermitian.

TABLE 1.1 Definitions of Matrices Often Encountered in Numerical Analysis

Mathematical Statement                              Descriptive Statement

$[A]^T = [A]$                                       Symmetric
$[A] = [A^*]^T$                                     Hermitian (self-adjoint)
$[A] = -[A^*]^T$                                    Skew Hermitian
$[A]^{-1} = [A^*]^T$                                Unitary
$[A^T][A] = [I]$, $[I]$ = identity matrix           Orthogonal
$[A^*]^T[A] = [A][A^*]^T$                           Normal
$A_{ij} > 0$                                        Positive-definite
$A_{ij} \ge 0$                                      Nonnegative or positive semidefinite
$A_{ij} = 0$ for $i \ne j$                          Diagonal
$A_{ij} = 0$ for $i > j$                            Upper triangular
$A_{ij} = 0$ for $i \ge j$                          Strictly upper triangular
$|A_{ii}| > \sum_{j \ne i} |A_{ij}|$ for all $i$    Diagonally dominant
$\{u\}^T[A]\{u\} > 0$, $\{u\} \ne 0$                Positive-definite
$\{u\}^T[A]\{u\} \ge 0$, $\{u\} \ne 0$              Nonnegative
$\{u\}^T[A]\{u\} > 0$ for some $\{u\}$,
  $< 0$ for others                                  Indefinite

TABLE 1.2 Some Common Relationships Between Operators and Eigenvalues

Operator Type            Eigenvalue ($\lambda$) Properties

Hermitian                Real eigenvalues and > 0
Unitary                  Eigenvalues on unit circle
Skew Hermitian           Eigenvalues on imaginary axis
Positive semidefinite    Eigenvalues $\ge 0$
Positive-definite        Eigenvalues > 0
Indefinite               Some eigenvalues are > 0 and some are < 0

Given the nature of the matrix or operator, we can immediately make a
statement about the eigenvalues of that operator. Some of the most common
relationships between operators and eigenvalues are given in Table 1.2.

1.11.6 Comparison of Solution Methods and Their Convergence

The Rayleigh-Ritz and Galerkin's methods are standard solution approaches
for solving differential equations arising in practical engineering problems. Both
methods project a continuous space onto a finite separable Hilbert space (footnote 5).
The mathematical problem is then rephrased to seek a discrete solution set whose entries
are the coefficients of the expansion. The premise of each method is summarized in
Table 1.3. The third entry in the table is referred to as the Least Squares Method and
is generally more robust than the other two, as will be noted later.
It was shown in Section 1.11.3 that Galerkin's method is equivalent to the
Rayleigh-Ritz approach when the first variation is set to zero. At the heart of this
statement is the assumption that the operator $\mathcal{L}$ (or the resultant discrete
matrix $[A]$) is positive-definite. Unfortunately, in most practical problems in
electromagnetics, particularly as $k_0$ becomes larger, the operators
$\mathcal{L}u = -\nabla^2 u - k_0^2 u$ and
$\mathcal{L}\mathbf{u} = \nabla\times(\mu_r^{-1}\nabla\times\mathbf{u}) - k_0^2\epsilon_r\mathbf{u}$
do not guarantee positive-definiteness. That is, if the operator
is not positive-definite, the Rayleigh-Ritz method fails to ensure minimization of the
functional since a global stationary point may not exist. However, the application of
Galerkin's method to yield a discrete system does not require that the operator is
positive-definite or even symmetric (footnote 6). The resultant solution is simply a
stationary point which is not guaranteed to be a minimum.

TABLE 1.3 Mathematical Statement of Solution Methods

Method           Mathematical Statement (subject to boundary conditions)

Rayleigh-Ritz    minimize $\tfrac{1}{2}\langle \mathcal{L}u, u \rangle - \langle f, u \rangle$
Galerkin         solve $\langle \mathcal{L}u, w_j \rangle - \langle f, w_j \rangle = 0$
Least Squares    minimize $Q(u) = \langle \mathcal{L}u - f, \mathcal{L}u - f \rangle$

Footnote 5: Hilbert space refers to a linear space where a given inner product has been
defined and which is complete with respect to this inner product.
Footnote 6: Because of the complex $\epsilon_r$ and $\mu_r$ in electromagnetics, the
operators may be symmetric but not Hermitian (i.e., self-adjoint). From Section 1.11.5,
Hermitian operators have positive and real eigenvalues and are therefore positive.

In some cases, the problem statement is that of minimizing a functional and
consequently the Rayleigh-Ritz procedure is the natural method of choice. Examples
include problems associated with system energy minimizations and resonance
computations. However, in solving the Helmholtz or vector wave equations (subject
to given boundary conditions), the minimization of the functional must
simultaneously imply a solution of those equations. For those cases, Galerkin's method
is the appropriate approach for constructing the linear system. However,
introduction of an appropriate variational functional can simplify the problem statement
when dealing with boundary conditions other than Neumann-type (also referred
to as natural conditions). As seen from the derivation given in Section 1.11.1, the
presence of the boundary integral term provides a direct means for imposing the
Dirichlet, impedance, or other type of boundary conditions. In the case of Galerkin's
method, the boundary terms are introduced after application of the divergence
theorem. Finally, we note that in some cases a variational (minimization) statement
of the boundary value problem may not be possible. In those cases, Galerkin's
method is the only approach for constructing a linear system of equations.
When dealing with operators that are not positive-definite, apart from the
breakdown of the functional minimization process, most iterative linear system
solvers also break down. Specifically, convergence of the approximate solution
$\tilde{u}$ to the exact solution $u$ cannot be proven [21], [22] without invoking
positive-definiteness. When in doubt, a positive-definite operator (and a corresponding
positive-definite matrix) can be generated by instead solving the differential equation

\[ \mathcal{L}^a\mathcal{L}u = \mathcal{L}^a f \tag{1.182} \]

where $\mathcal{L}^a$ satisfies

\[ \langle \mathcal{L}u, v \rangle = \langle u, \mathcal{L}^a v \rangle \tag{1.183} \]
and is referred to as the adjoint operator. Clearly, (1.182) is obtained by multiplying
the right and left hand sides of $\mathcal{L}u = f$ by $\mathcal{L}^a$. The new operator
$\mathcal{D} = \mathcal{L}^a\mathcal{L}$, where

\[ \mathcal{D}u = g \tag{1.184} \]
($g = \mathcal{L}^a f$), is now positive-definite and self-adjoint. The corresponding
matrix system resulting from (1.184) is of the form

\[ [A^*]^T[A]\{u\} = [A^*]^T\{b\} \tag{1.185} \]
or

\[ [B]\{u\} = \{g\} \]
where $[B] = [A^*]^T[A]$. It is thus seen that the desired property of
positive-definiteness comes at the price of squaring the matrix condition number. As is
well known, large matrix conditions lead to less accurate solutions and slower
convergence when an iterative solver is used.
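The trade-off just described, positive-definiteness in exchange for a squared condition number, is easy to observe numerically. The sketch below assumes a small, arbitrary, non-Hermitian complex matrix standing in for the discretized operator; the matrix, the excitation vector, and the variable names are illustrative only. The condition numbers are measured in the 2-norm (NumPy's default), for which the squaring is exact.

```python
import numpy as np

# Hypothetical non-Hermitian system matrix (row diagonally dominant, so nonsingular).
A = np.array([[4.0, 1.0 + 2.0j, 0.0],
              [1.0, 3.0,        1.0j],
              [0.0, 2.0,        5.0]])
b = np.array([1.0, 2.0j, -1.0])

# Normal equations: [A*]^T [A] {u} = [A*]^T {b}
B = A.conj().T @ A
g = A.conj().T @ b

u_direct = np.linalg.solve(A, b)
u_normal = np.linalg.solve(B, g)

cond_A = np.linalg.cond(A)          # 2-norm condition number of [A]
cond_B = np.linalg.cond(B)          # 2-norm condition number of [B]
eig_B = np.linalg.eigvalsh(B)       # real eigenvalues of the Hermitian [B]

print("Cond(A)   =", cond_A)
print("Cond(B)   =", cond_B, " (compare Cond(A)^2 =", cond_A**2, ")")
print("eigenvalues of B:", eig_B)
```

Both systems give the same solution here because the example is small and well conditioned; for large, poorly conditioned systems the squared condition number is what degrades accuracy and iterative convergence.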
It should be remarked that minimization of the functional for the Least
Squares Method is equivalent to solving the differential equation (1.182).
Consequently, the Least Squares Method leads to positive-definite systems at the
expense of squaring the matrix condition. Also, the Least Squares Method minimizes
the square of the norm (as $\tilde{u} \to u$), viz.

\[ \lim_{N\to\infty} \|\mathcal{L}\tilde{u} - f\|^2 \to 0 \tag{1.186} \]

whereas Galerkin's method minimizes the norm

\[ \lim_{N\to\infty} \|\mathcal{L}\tilde{u} - f\| \to 0 \tag{1.187} \]

Again, it is important to note that nothing can be said about convergence unless the
operator is positive-definite.

1.11.7 Field Formulation Issues

The finite element method can assume various forms depending on the desired
field quantity. Many applications prefer either a total or secondary electric field
formulation. Other applications desire a result in terms of either the total or
secondary magnetic field. Some applications can utilize a potential formulation. Thus,
even though Maxwell's equations relate these various quantities, an accurate field
computation often demands a particular formulation. The advantages and disadvantages
of each of these formulations are discussed below.
The total electric field formulation is a very popular choice. This is because
enforcement of the boundary conditions associated with perfect electric conductors
(pec) is particularly easy. Since the tangential electric fields on a pec surface must
vanish, the edges of the mesh associated with those surfaces are a priori set to zero.
Three methods are commonly used in practice to enforce this condition. The first is
accomplished by forcing a null field condition to zero out all entries of the matrix
associated with that edge (except for the self-term which is set to unity), and by also
setting the excitation entry to zero. Thus, as the unknown fields are solved, the edges
lying on pec surfaces are forced to zero. The second method involves a preprocessing
step where the edges associated with a pec surface are removed from the list of
unknowns. Thus, the number of edges is greater than the number of unknowns
and matrix entries for these pec edges are never computed. This approach has the
advantage of reducing the order of the matrix and therefore reducing memory and
compute cycle demands. The third method, useful when an iterative matrix solver is
employed, involves forcing the unknowns associated with the pec edges to zero
during each iteration.
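The first two ways of enforcing the pec condition can be sketched on a small stand-in system. The example below is a minimal illustration, assuming a generic 6-by-6 tridiagonal matrix in place of an assembled finite element matrix and an arbitrary choice of which unknowns lie on pec surfaces. The function names and the data are hypothetical; no mesh or edge elements are involved.

```python
import numpy as np

def enforce_pec_by_zeroing(A, b, pec_edges):
    """Method 1: zero the rows and columns of the pec edges, set the
    self-terms to unity, and zero the corresponding excitation entries."""
    A = A.copy()
    b = b.copy()
    A[pec_edges, :] = 0.0
    A[:, pec_edges] = 0.0
    A[pec_edges, pec_edges] = 1.0   # paired indexing: only the diagonal terms
    b[pec_edges] = 0.0
    return A, b

def enforce_pec_by_elimination(A, b, pec_edges):
    """Method 2: remove the pec edges from the list of unknowns."""
    free = np.setdiff1d(np.arange(A.shape[0]), pec_edges)
    return A[np.ix_(free, free)], b[free], free

# Stand-in system: 6 "edges", of which edges 0 and 3 lie on a pec surface.
n = 6
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.arange(1.0, n + 1.0)
pec_edges = np.array([0, 3])

A1, b1 = enforce_pec_by_zeroing(A, b, pec_edges)
u_zeroing = np.linalg.solve(A1, b1)

A2, b2, free = enforce_pec_by_elimination(A, b, pec_edges)
u_elimination = np.zeros(n)
u_elimination[free] = np.linalg.solve(A2, b2)

print("method 1 (order", A1.shape[0], "):", u_zeroing)
print("method 2 (order", A2.shape[0], "):", u_elimination)
```

Both methods return the same field values; the second solves a smaller system, which is the memory and compute saving mentioned above.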


Thus, use of a total electric field formulation and edge-based elements (see
Chapter 2 for a discussion on edge-elements) reduces the order of the matrix and
computational burden. However, this is not the only valid formulation and in certain
circumstances, a scattered field or a magnetic field formulation may be preferred.

Scattered (or secondary) field formulations are used to simplify the use of absorbing
boundary conditions (see Chapters 4 and 6). However, they also have an added
advantage when a boundary integral is used for mesh closure. Experience has
shown that phase errors in the computed interior field tend to increase within the
mesh at locations distant from boundaries on which boundary conditions are
imposed. This is due to unavoidable numerical inaccuracies that increase as the effect
of the boundary conditions propagates throughout the mesh. That is, previous errors
in the adjoining field are incorporated and magnified as the field is evaluated at a
more distant field point. Since boundary conditions are always enforced with total
fields, the total field formulation enforces such conditions only on the boundaries of
the mesh and pec surfaces (for E-field formulations). For very large computational
domains, significant distance can lie between a field point within the interior of the
mesh and the mesh boundary. Hence, there is potential for error propagation throughout
the mesh. A scattered field formulation enforces the boundary conditions on the
mesh boundary, pec surfaces, and dissimilar material interfaces. Therefore, the
distance between a boundary condition and any interior field point is reduced and
accordingly the phase error throughout the mesh may also be reduced. The scattered
field formulation has the disadvantage of higher matrix order (i.e., more unknowns
and equations) and explicit enforcement of the boundary conditions associated with
pec surfaces and material discontinuities.

Magnetic field formulations are also possible and can be obtained by applying
duality to the corresponding electric field equations. A total magnetic field
formulation has the same advantages for a fictitious perfect magnetic conductor (pmc)
as the total electric field has for pec surfaces. A scattered magnetic field formulation
also can reduce phase error propagation in the same manner as the scattered electric
field formulation.

Magnetic field formulations are preferred for applications where the desired
result is the magnetic field within the computational domain. This is due to the fact
that although Maxwell's equations relate the electric and magnetic fields, in practice
one quantity cannot be accurately obtained from the other by numerical
differentiation. This is due to the inherent error occurring when continuous derivatives
are replaced with discrete differences. Rather, a computationally expensive integral
expression is necessary for accurate field differentiation provided a suitable
Green's function is available. Hence, an accurate solution demands a formulation
consistent with the desired result.
Finally, some finite element practitioners utilize a potential formulation which
employs the scalar or vector potentials as the unknown quantities (see Chapter 5).
The use of this approach is related to the hybrid finite element-boundary integral
method where the singularity of the integral equation associated with the boundary
can be reduced with a potential formulation. However, if this reduction is present,
we note that a numerical differentiation operation may be required to obtain the
desired field quantity and this operation may lead to inaccuracy.

REFERENCES

[1] J. A. Stratton. Electromagnetic Theory. McGraw-Hill, New York, 1941.
[2] R. E. Collin. Field Theory of Guided Waves. IEEE Press, New York, 1991.
[3] C. A. Balanis. Advanced Engineering Electromagnetics. McGraw-Hill, New York, 1989.
[4] D. S. Jones. Methods in Electromagnetic Wave Propagation. IEEE Press, New York, 1994.
[5] R. F. Harrington. Time-Harmonic Electromagnetic Fields. McGraw-Hill, New York, 1961.
[6] C. T. Tai. Generalized Vector and Dyadic Analysis. IEEE Press, New York, 1992.
[7] J. A. Kong. Theory of Electromagnetic Waves. Wiley InterScience, New York, 1975.
[8] J. D. Jackson. Classical Electrodynamics. Wiley InterScience, New York, 1975.
[9] J. van Bladel. Electromagnetic Fields. Hemisphere Publishing Corp., New York, 1985.
[10] C. T. Tai. Dyadic Green's Functions in Electromagnetic Theory. IEEE Press,
New York, 1994.
[11] T. B. A. Senior and J. L. Volakis. Approximate Boundary Conditions in
Electromagnetics. IEE Press, London, 1995.
[12] S. M. Rytov. Computation of the skin effect by the perturbation method.
J. Exp. Theor. Phys., 10:180-189, 1940. Translation by V. Kerdemelidis and
K. M. Mitzner, Northrop Navair, Hawthorne, CA 90250.
[13] S. G. Mikhlin. Variational Methods in Mathematical Physics. Macmillan, New
York, 1964.
[14] J. N. Reddy. An Introduction to the Finite Element Method. McGraw-Hill, New
York, 1984.
[15] A. Konrad. Vector variational formulation of electromagnetic fields in
anisotropic media. IEEE Trans. Microwave Theory Tech., MTT-24:553-559,
September 1976.
[16] C. H. Chen and C. D. Lien. The variational principle for non-self-adjoint
electromagnetic problems. IEEE Trans. Microwave Theory Tech.,
MTT-28:878-886, August 1980.
[17] R. F. Harrington. Field Computation by Moment Methods. Macmillan, New
York, 1968.
[18] J. J. H. Wang. Generalized Moment Methods in Electromagnetics. John Wiley &
Sons, New York, 1991.
[19] R. L. Burden and J. D. Faires. Numerical Analysis. PWS Pub. Co., Boston, fifth
edition, 1993.
[20] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ.
Press, Baltimore, MD, 1983.
[21] G. Strang and G. J. Fix. An Analysis of the Finite Element Method. Prentice
Hall, Inc., Englewood Cliffs, NJ, 1973.
[22] D. G. Dudley. Mathematical Foundations for Electromagnetic Theory. IEEE
Press, New York, 1994.

Chapter 2
Shape Functions for Scalar and Vector Finite Elements

2.1 INTRODUCTION

The finite element method is used for modeling a wide class of problems by breaking
up the computational domain into elements of simple shapes. Suitable interpolation
polynomials (commonly referred to as shape or basis functions) are used to
approximate the unknown function within each element. Once the shape functions are
chosen, it is possible to program the computer to solve complicated geometries by
solely specifying the basis functions. The element choice, however, needs human
intervention and intelligence to ensure a reliable solution of the problem at hand.
As will be shown later in this chapter, the development of a special class of elements
which mimic the character of electric/magnetic fields has proved to be the key in
obtaining robust solutions to three-dimensional problems in electromagnetics.

In this chapter, we will discuss the derivation of node-based and edge-based
shape functions for one-, two- and three-dimensional finite elements. Node-based
shape functions have been used extensively in civil and mechanical engineering
applications as well as in scalar electromagnetic field problems. However, a full
three-dimensional vector formulation brings out numerous deficiencies in these
traditional element shape functions [1], [2]. Edge-based vector basis functions with
unknowns associated with element edges have thus been derived to overcome the
problems related to nodal bases, and these are now extensively used for solving
three-dimensional electromagnetics problems. We will also describe the hierarchical
nature of the edge-based functions and their possible applicability in p-based
refinement techniques.


2.2 FEATURES OF FINITE ELEMENT SHAPE FUNCTIONS

The polynomials used to interpolate finite element solutions on specific element
shapes have some distinct features compared with the wide variety of basis functions
used in other partial differential equation (PDE) or integral equation (IE) techniques.

2.2.1 Spatial Locality

Finite element shape functions have compact support within each element, i.e.,
their scope of influence is limited only to the immediate neighboring elements. This
feature plays a pivotal role in the viability of finite elements over integral equation
(IE) methods. The limited scope of influence for the basis functions is a
distinguishing property of PDE techniques and leads to very sparse matrices in finite
elements, whereas IE techniques give rise to full, dense matrices resulting in poor
scalability as problem size increases.
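
The sparsity that follows from compact support can be illustrated with a minimal sketch. It assumes a one-dimensional mesh of linear two-node elements and assembles the matrix of the integrals of N_i' N_j'. The function name `assemble_1d_stiffness` and the choice of 50 elements are illustrative assumptions, and the matrix is stored densely only so that its zero entries can be counted.

```python
import numpy as np

def assemble_1d_stiffness(num_elements, length=1.0):
    """Assemble the matrix of integral(N_i' N_j') for linear 1D elements."""
    h = length / num_elements
    n_nodes = num_elements + 1
    K = np.zeros((n_nodes, n_nodes))
    # Element matrix of a linear two-node element of size h.
    k_local = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    for e in range(num_elements):
        nodes = (e, e + 1)  # only the two nodes of element e interact
        for a in range(2):
            for b in range(2):
                K[nodes[a], nodes[b]] += k_local[a, b]
    return K

K = assemble_1d_stiffness(num_elements=50)
nonzeros = np.count_nonzero(K)       # tridiagonal: 3 * n_nodes - 2 entries
fill_fraction = nonzeros / K.size    # fraction of entries that are non-zero
```

Each row has at most three non-zero entries regardless of the mesh size, so the fill fraction shrinks as the mesh is refined, whereas an IE matrix of the same dimension would be full.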

2.2.2 Approximation Order

The order of the approximation depends on the completeness of the polynomials
making up the finite element basis functions. Moreover, the form of the
polynomial function must remain unchanged under a linear transformation from
one Cartesian coordinate system to another. This requirement is satisfied if the
polynomials are complete to a specific order such as

$$u(x, y) = c_1 + c_2 x + c_3 y + c_4 x^2 + c_5 xy + c_6 y^2 \tag{2.1}$$

or when the extra terms are symmetric with respect to one another, as in the
following incomplete third-order polynomial

$$u(x, y) = c_1 + c_2 x + c_3 y + c_4 x^2 + c_5 xy + c_6 y^2 + c_7 x^2 y + c_8 x y^2 \tag{2.2}$$

Such approximation functions have the characteristic that, for fixed x or y, they are
always complete polynomials in the other variable. In general, we seek expansion
polynomials that will yield the highest order of approximation for a minimum
number of unknowns associated with that element shape. The two examples shown above
apply to two dimensions, but their extension to three-dimensional elements is
straightforward. Typically, the higher the order of the approximating polynomial,
the lower the error in the final solution if element size remains constant. As usual,
there is a trade-off here between the desired accuracy and the degrees of freedom
required to solve the problem.

2.2.3 Continuity

The order of the differential equation to be solved determines the order of
shape function to be employed. Functions with continuous derivatives up to the
nth order are said to be $C^n$ continuous. For elliptic PDEs of order $2k$ ($k = 1, 2$),
the continuity requirement is $C^{k-1}$ for Galerkin methods. In most electromagnetic
problems, functions which exhibit $C^0$ continuity (i.e., function continuity) are used
since the discontinuous first derivatives are integrable. However, it is difficult to
impose continuity of order 1 and higher since the determination of suitable shape
functions is very complicated. For example, ninth-order polynomials are required to
obtain $C^1$ continuity for tetrahedral elements. In electromagnetics, Wong and
Cendes [3] used $C^1$ node-based triangles to avoid the problem of spurious modes
in the determination of cavity resonances. All shape functions derived in the
following sections impose function continuity or $C^0$ continuity (not derivative
continuity) between elements.

2.3 NODE-BASED ELEMENTS

In node-based finite elements, the form of the sought function in the element is
controlled by the function values at its nodes. The approximating function can
then be expressed as a linear combination of basis functions weighted by the
nodal coefficients. If the function values $u_i^e$ at the nodes are taken as nodal
variables, then the approximating function for a two-dimensional element e with p
nodes has the form

$$u^e(x, y) = \sum_{i=1}^{p} u_i^e N_i^e(x, y) \tag{2.3}$$

Since the expression (2.3) must be valid for any nodal variable $u_i^e$, the basis function
$N_i^e(x, y)$ must be unity at node i and zero for all remaining nodes within the element.

Shape functions can be derived either by inspection (Serendipity family) or
through simple products of appropriate polynomials (Lagrange family). It is easier
and more systematic to construct higher order bases in the Lagrange family while
progression to higher orders is difficult in the Serendipity family. However, Lagrange
shape functions have undesirable interior nodes and more unknowns than
Serendipity shape functions of the same order.

2.3.1 One-Dimensional Basis Functions

One-dimensional finite elements are employed for solving problems where the
discretization domain involves a curve or a contour around a two-dimensional
structure; for example, the bounding curve of the cross section of an infinite cylinder.
These basis functions can also be used in conjunction with higher-dimensional finite
elements when the modeled structure can be decomposed into a single dimension
without loss of accuracy. It is most convenient to derive one-dimensional basis
functions in terms of Lagrange polynomials [4]. Let us consider a straight line
with endpoints $x_1^e$ and $x_2^e$. The basis functions for element e are then defined as

$$N_1^e(x) = \frac{x_2^e - x}{x_2^e - x_1^e}, \qquad N_2^e(x) = \frac{x - x_1^e}{x_2^e - x_1^e}$$

The basis functions have unit magnitude at one node and vanish at all others with
linear variation between the nodes. Higher order basis functions can be constructed
by inserting nodes between the endpoints of the finite element. If $x_i^e$,
$i = 1, 2, \ldots, n$, are the nodes of the one-dimensional element, and we are interested
in finding the basis function for the jth node $x_j^e$, then the corresponding Lagrange
polynomial describing this basis function is given by [4]

$$N_j^e(x) = \frac{(x - x_1^e)(x - x_2^e) \cdots (x - x_{j-1}^e)(x - x_{j+1}^e) \cdots (x - x_n^e)}{(x_j^e - x_1^e)(x_j^e - x_2^e) \cdots (x_j^e - x_{j-1}^e)(x_j^e - x_{j+1}^e) \cdots (x_j^e - x_n^e)}$$

The basis function defined above is of $(n - 1)$th order and correspondingly passes
through zero $n - 1$ times.
2.3.2 Two-Dimensional Basis Functions

Two-dimensional finite elements have found widespread use in modeling
structures whose third dimension is significantly larger or smaller than the cross section,
thus ensuring little variation in the unknown parameters in this third direction.
Two-dimensional finite elements have also been used to obtain reliable estimates of
three-dimensional problems since the computational cost of obtaining two-dimensional
solutions is far lower than that for three dimensions.

2.3.2.1 Rectangular and Quadrilateral Elements. The simple shape of the
rectangular element permits its shape functions to be written down merely by
inspection. On examining the element shape given in Fig. 2.1, the shape functions can be
cast in the form

Figure 2.1 Rectangular element.

where $x_c^e$ and $y_c^e$ denote the coordinates of the midpoints of the element edges,
$h_x^e$ and $h_y^e$ represent the edge lengths and $A^e$ denotes the area of the element. Each
basis function $N_i^e$ has unit magnitude at the ith node, vanishes at the remaining three
nodes, and varies linearly. Hence it can be used in (2.3) to represent $u^e$. Higher order
rectangular elements include the eight node (three equispaced nodes per edge) and
the twelve node (four equispaced nodes per edge) quadrilateral element discussed in
[4]. However, these elements can model only regular geometries and decline in
accuracy with excessive shape distortion. Thus, often they are not very useful in practice.
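
The defining properties stated above (unity at one node, zero at the other three, linear variation along each edge) can be checked with a short sketch. The explicit products below are the standard bilinear functions and are an assumption here, as is the node ordering (counterclockwise starting from the lower-left corner) and the use of the element center for (xc, yc). The function name `rect_shape_functions` is hypothetical.

```python
import numpy as np

def rect_shape_functions(x, y, xc, yc, hx, hy):
    """Bilinear shape functions of a rectangle centred at (xc, yc).

    Assumed node order: 1 lower-left, 2 lower-right, 3 upper-right, 4 upper-left.
    """
    area = hx * hy
    from_right = xc + hx / 2 - x   # vanishes on the right edge
    from_left = x - xc + hx / 2    # vanishes on the left edge
    from_top = yc + hy / 2 - y     # vanishes on the top edge
    from_bottom = y - yc + hy / 2  # vanishes on the bottom edge
    return np.array([
        from_right * from_top,     # node 1
        from_left * from_top,      # node 2
        from_left * from_bottom,   # node 3
        from_right * from_bottom,  # node 4
    ]) / area

# Evaluate all four functions at the four corners of a 2 x 1 rectangle.
corners = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
values = np.array([rect_shape_functions(cx, cy, 1.0, 0.5, 2.0, 1.0)
                   for cx, cy in corners])
```

Under these assumptions `values` is the identity matrix, and the four functions sum to one at every point of the element.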

Irregular geometries can be modeled by using quadrilateral elements which can
also be viewed as distorted rectangles. To construct basis functions for a
quadrilateral element, we need to use a transformation that maps a quadrilateral element in
the xy plane to a square element in the $\xi\eta$ plane (Fig. 2.2). Such a transformation can
be found by satisfying the following relation at the four nodes of the quadrilateral
element

$$x = a + b\xi + c\eta + d\xi\eta, \qquad y = a' + b'\xi + c'\eta + d'\xi\eta \tag{2.5}$$

The unknown coefficients ($a, b, c, d$ and $a', b', c', d'$) are solved by mapping the
four corners of the quadrilateral in the xy plane to the corner points of the unit
square in the $\xi\eta$ plane. The eight equations thus obtained are

$$
\begin{aligned}
x_1 &= a - b - c + d, & y_1 &= a' - b' - c' + d' \\
x_2 &= a + b - c - d, & y_2 &= a' + b' - c' - d' \\
x_3 &= a + b + c + d, & y_3 &= a' + b' + c' + d' \\
x_4 &= a - b + c - d, & y_4 &= a' - b' + c' - d'
\end{aligned}
\tag{2.6}
$$

On solving for the unknown coefficients in the above equation, the basis functions
can be cast in the following form

$$N_i^e = \frac{1}{4}(1 + \xi_0)(1 + \eta_0), \qquad i = 1, \ldots, 4 \tag{2.7}$$

where $\xi_0 = \xi\xi_i$ and $\eta_0 = \eta\eta_i$, and

$$x = \sum_{i=1}^{4} N_i^e x_i^e, \qquad y = \sum_{i=1}^{4} N_i^e y_i^e \tag{2.8}$$

The variables $(\xi_i, \eta_i)$ denote the coordinates of the ith node in the $(\xi, \eta)$ coordinate
system. The linear quadrilateral is also known as an isoparametric element since the
shape functions defining the geometry and the nodal values are the same.

Figure 2.2 Transformation of a quadrilateral element in the xy plane to a unit
square in the $\xi\eta$ plane.

Higher order quadrilateral elements include the eight-node element (four
corner nodes and four midside nodes) and the twelve-node element (four equispaced
nodes per edge). The basis functions for such elements can be found in [5].

Due to the irregular shape of the quadrilateral element, it is not easy to
integrate the basis functions in the xy plane. To facilitate and generalize the integration
process, the concept of the Jacobian is introduced. The Jacobian matrix, or more
specifically its determinant, transforms the infinitesimal area or volume element from
one coordinate system to another. If we consider the above example, we are
essentially transforming between the global $(x, y)$ coordinates and the local $(\xi, \eta)$
coordinates. By the chain rule of partial differentiation, we can express the $\xi$ derivative of
$N_i^e$ as

$$\frac{\partial N_i^e}{\partial \xi} = \frac{\partial N_i^e}{\partial x}\frac{\partial x}{\partial \xi} + \frac{\partial N_i^e}{\partial y}\frac{\partial y}{\partial \xi}$$

Similarly, taking the derivative with respect to $\eta$ and combining the expressions, we
can write the result in matrix form as

$$
\begin{Bmatrix} \partial N_i^e/\partial \xi \\ \partial N_i^e/\partial \eta \end{Bmatrix}
=
\begin{bmatrix} \partial x/\partial \xi & \partial y/\partial \xi \\ \partial x/\partial \eta & \partial y/\partial \eta \end{bmatrix}
\begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \end{Bmatrix}
= [J]
\begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \end{Bmatrix}
$$

Since $(x, y)$ are known explicitly in terms of the local coordinates $(\xi, \eta)$, the Jacobian
matrix can be found explicitly in terms of local coordinates. Care must be taken in
the choice of the local coordinate system such that the Jacobian matrix is
non-singular. To find the derivatives with respect to x and y, we merely need to invert
the Jacobian matrix to yield

$$
\begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \end{Bmatrix}
= [J]^{-1}
\begin{Bmatrix} \partial N_i^e/\partial \xi \\ \partial N_i^e/\partial \eta \end{Bmatrix}
$$

The technique can easily be generalized for n-dimensional transformations if
necessary. The infinitesimal area element $dA$ can now be written as

$$dx\,dy = \det[J]\,d\xi\,d\eta \tag{2.11}$$

Thus the integration over the irregular quadrilateral element is simplified considerably
by performing it over the local coordinate system instead of the global Cartesian
coordinate system.
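
A minimal sketch of how the Jacobian is used in practice: the area of an arbitrary convex quadrilateral is obtained by integrating det[J] over the local square with a 2 x 2 Gauss rule. The sketch assumes the bilinear shape functions (1/4)(1 + xi*xi_i)(1 + eta*eta_i) with nodes numbered counterclockwise from (-1, -1). The function names and the sample quadrilateral are illustrative assumptions.

```python
import numpy as np

# Local coordinates of the four nodes, counterclockwise from (-1, -1).
XI_NODES = np.array([-1.0, 1.0, 1.0, -1.0])
ETA_NODES = np.array([-1.0, -1.0, 1.0, 1.0])

def shape_derivatives(xi, eta):
    """Derivatives of N_i = (1 + xi*xi_i)(1 + eta*eta_i)/4 w.r.t. xi and eta."""
    dN_dxi = 0.25 * XI_NODES * (1.0 + eta * ETA_NODES)
    dN_deta = 0.25 * ETA_NODES * (1.0 + xi * XI_NODES)
    return dN_dxi, dN_deta

def jacobian(xy, xi, eta):
    """Jacobian [[dx/dxi, dy/dxi], [dx/deta, dy/deta]] for node coordinates xy (4x2)."""
    dN_dxi, dN_deta = shape_derivatives(xi, eta)
    return np.array([dN_dxi @ xy, dN_deta @ xy])

def quad_area(xy):
    """Integrate det[J] over the local square with a 2x2 Gauss rule (unit weights)."""
    g = 1.0 / np.sqrt(3.0)
    return sum(np.linalg.det(jacobian(xy, xi, eta))
               for xi in (-g, g) for eta in (-g, g))

quad = np.array([[0.0, 0.0], [2.0, 0.0], [3.0, 2.0], [0.0, 1.0]])
area = quad_area(quad)
```

For this bilinear map det[J] is linear in the local coordinates, so the 2 x 2 rule is exact and the result can be compared against the shoelace formula for the same polygon.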

2.3.2.2 Triangular Elements. Triangular elements are popular because they
can model arbitrary geometries. We will determine the shape functions of triangular
elements by using Lagrange interpolation polynomials. In their final expression, the
shape functions will be expressed in terms of the so-called area coordinates. Let us
consider a point P within a triangular element (Fig. 2.3) located at $(x, y)$, where
$(x_i^e, y_i^e)$ denote the coordinates of the ith triangle node. The area of the smaller
triangle formed by points P, 2, and 3 is given by

$$
\Delta_1 = \frac{1}{2}
\begin{vmatrix} 1 & x & y \\ 1 & x_2^e & y_2^e \\ 1 & x_3^e & y_3^e \end{vmatrix}
\tag{2.12}
$$
The area coordinate $L_1^e$ is then given by

$$L_1^e = \frac{\Delta_1}{\Delta} = \frac{\text{Area } P23}{\text{Area } 123}$$

where $\Delta$ is the area of the whole triangle and can be found from (2.12) by replacing x
and y in the first row with $x_1^e$ and $y_1^e$. Similarly, the two remaining area coordinates
$L_2^e$ and $L_3^e$ are given by

$$L_2^e = \frac{\Delta_2}{\Delta} = \frac{\text{Area } P31}{\text{Area } 123}, \qquad
L_3^e = \frac{\Delta_3}{\Delta} = \frac{\text{Area } P12}{\text{Area } 123}$$

Figure 2.3 Triangular element.
The values for x and y inside the triangular element reduce to

$$x = \sum_{i=1}^{3} L_i^e x_i^e, \qquad y = \sum_{i=1}^{3} L_i^e y_i^e, \qquad \sum_{i=1}^{3} L_i^e = 1 \tag{2.14}$$

where the latter condition $\sum_{i=1}^{3} L_i^e = 1$ is a result of the area identity
$\Delta_1 + \Delta_2 + \Delta_3 = \Delta$. Alternatively, $L_1^e$, $L_2^e$, and $L_3^e$ can be obtained in terms of x, y, and
the vertex coordinates by solving the system of equations (2.14).
The coordinate $L_1^e$ is zero on the edge opposite to vertex 1 and unity at vertex 1.
Its variation along the height of the triangle is displayed in Fig. 2.4. The remaining
two area coordinates associated with the other two vertices behave similarly,
vanishing on the edge opposite to the corresponding vertex and having unit magnitude at
the vertex to which they belong. This feature combined with spatial locality and $C^0$
continuity qualifies the area coordinates as suitable basis functions $N_i^e$ for a triangle
when the interpolation order is linear. That is,

$$N_i^e = L_i^e \tag{2.15}$$
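
The area coordinates can be computed directly from the determinant in (2.12). The sketch below is one way to do so; the function names and the sample triangle are illustrative assumptions, and signed areas are used so that the ratios are independent of the vertex orientation.

```python
import numpy as np

def signed_area(p1, p2, p3):
    """Half of the determinant with rows [1, x, y], as in (2.12)."""
    return 0.5 * np.linalg.det(np.array([[1.0, *p1], [1.0, *p2], [1.0, *p3]]))

def area_coordinates(p, v1, v2, v3):
    """Return (L1, L2, L3) of point p in the triangle with vertices v1, v2, v3."""
    total = signed_area(v1, v2, v3)
    L1 = signed_area(p, v2, v3) / total  # Area P23 / Area 123
    L2 = signed_area(p, v3, v1) / total  # Area P31 / Area 123
    L3 = signed_area(p, v1, v2) / total  # Area P12 / Area 123
    return np.array([L1, L2, L3])

v1, v2, v3 = (0.0, 0.0), (2.0, 0.0), (0.0, 1.0)
centroid = (2.0 / 3.0, 1.0 / 3.0)
L_centroid = area_coordinates(centroid, v1, v2, v3)
```

At the centroid all three coordinates equal 1/3, and at each vertex the corresponding coordinate is unity while the other two vanish, which is the behavior required of linear nodal basis functions.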
Higher order basis functions for triangles can be derived using the procedure
given in [4] and [6]. In general, the shape function $N_i^e$ for node i, labeled as $(I, J, K)$,
is given by

$$N_i^e = P_I^n(L_1^e)\,P_J^n(L_2^e)\,P_K^n(L_3^e), \qquad I + J + K = n \tag{2.16}$$

where $L_i^e$, $i = 1, 2, 3$, are the area coordinates defined previously and $P_I^n(L_1^e)$ is the
polynomial

$$P_I^n(L_1^e) = \prod_{k=0}^{I-1} \frac{n L_1^e - k}{I - k}, \quad I > 0; \qquad P_0^n = 1 \tag{2.17}$$

Similar definitions apply for $P_J^n(L_2^e)$ and $P_K^n(L_3^e)$.

Figure 2.4 Area coordinates of a triangle.

Figure 2.5 Six-noded triangular element supporting quadratic bases.
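
The product construction of higher order triangular shape functions can be sketched as follows. The code assumes the polynomial P_I^n(L) is the product over k = 0, ..., I - 1 of (nL - k)/(I - k), with P_0^n = 1. The function names and the sample point are illustrative.

```python
from math import isclose

def silvester(n, index, L):
    """P_index^n(L) = prod_{k=0}^{index-1} (n*L - k)/(index - k); equals 1 for index 0."""
    value = 1.0
    for k in range(index):
        value *= (n * L - k) / (index - k)
    return value

def triangle_shape(n, label, L):
    """Shape function of the node labeled (I, J, K), I + J + K = n, at area coordinates L."""
    I, J, K = label
    return silvester(n, I, L[0]) * silvester(n, J, L[1]) * silvester(n, K, L[2])

# Quadratic triangle (n = 2): three corner nodes and three midside nodes.
labels = [(2, 0, 0), (0, 2, 0), (0, 0, 2), (1, 1, 0), (0, 1, 1), (1, 0, 1)]
L = (0.2, 0.3, 0.5)
values = [triangle_shape(2, label, L) for label in labels]
```

For n = 2 the corner label (2, 0, 0) reduces to L1(2 L1 - 1) and the midside label (1, 1, 0) to 4 L1 L2, and the six values sum to one at any point of the element.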

Using the formulae given above, the basis functions for a quadratic triangle
(see Fig. 2.5) can be conveniently defined as

$$
\begin{aligned}
N_i^e &= L_i^e (2 L_i^e - 1), & i &= 1, 2, 3 & &\text{(corner nodes)} \\
N_i^e &= 4 L_j^e L_k^e, & i &= 4, 5, 6 & &\text{(midside nodes; } j \text{ and } k \text{ are the endpoints of the edge)}
\end{aligned}
\tag{2.18}
$$

Besides simplifying the treatment of basis functions for a triangle, simplex
coordinates greatly facilitate the integration of an arbitrary function over a
triangular region. A very useful integration formula in terms of the area coordinates
$L_1^e$, $L_2^e$, and $L_3^e$ over a triangular domain is given by

$$\iint_{\Delta} (L_1^e)^a (L_2^e)^b (L_3^e)^c \, dx\,dy = 2\Delta \, \frac{a!\,b!\,c!}{(a + b + c + 2)!} \tag{2.19}$$

where a, b, and c are integers and $\Delta$ is the area of the triangular region.
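
A quick numerical check of the factorial integration formula for triangles, 2 * area * a! b! c! / (a + b + c + 2)!, is sketched below. It compares the closed form with the three-point edge-midpoint quadrature rule, which is exact for polynomials up to second degree. Both function names are illustrative assumptions.

```python
from math import factorial, isclose

def triangle_monomial_integral(a, b, c, area):
    """Closed form for the integral of L1^a L2^b L3^c over a triangle of given area."""
    return (2.0 * area * factorial(a) * factorial(b) * factorial(c)
            / factorial(a + b + c + 2))

def midpoint_rule(f, area):
    """Edge-midpoint quadrature in area coordinates, exact up to degree 2."""
    midpoints = [(0.5, 0.5, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5)]
    return area / 3.0 * sum(f(*L) for L in midpoints)

area = 3.0
exact = triangle_monomial_integral(1, 1, 0, area)            # integral of L1*L2
approx = midpoint_rule(lambda L1, L2, L3: L1 * L2, area)     # same value by quadrature
```

Both evaluations give area/12 for the integrand L1 L2, and setting a = b = c = 0 recovers the area itself.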

2.3.3 Three-Dimensional Basis Functions

Shape functions for three-dimensional elements can be described in a precisely
analogous way to their two-dimensional counterpart. However, the simple rules for
inter-element continuity given previously must be modified. The nodal field values
should now interpolate to give continuous fields across the face of each element.

2.3.3.1 Rectangular Bricks. The simplest polynomial approximation to a
rectangular brick element is the trilinear function

$$u^e(x, y, z) = a^e + b^e x + c^e y + d^e z + e^e xy + f^e yz + g^e zx + h^e xyz \tag{2.20}$$

whose eight parameters are uniquely determined by matching $u^e(x, y, z)$ to the field
values $u_i^e$ at the eight corners of the brick. This results in eight equations that are
solved to determine the coefficients $a^e, b^e, \ldots, h^e$. The final expression is in the form

$$u^e(x, y, z) = \sum_{i=1}^{8} u_i^e N_i^e(x, y, z) \tag{2.21}$$

However, this approach of formulating (2.21) is cumbersome and can be easily
avoided by writing down the required basis functions by mere inspection. Since
the basis function $N_i^e$ must be unity at node i and zero at the remaining nodes,
the eight interpolation functions can be written down as

where $x_c^e$, $y_c^e$, and $z_c^e$ denote the coordinates of the center of the element, $h_x^e$, $h_y^e$, and $h_z^e$
represent the edge lengths of the element and $V^e$ is the element volume.
Brick elements of the Serendipity family are derived in [4]. To obtain higher
order basis functions, a progressively larger number of nodes is uniformly placed on
the element edges. Bricks with nth-order interpolation functions (with $n + 1$ nodes per
edge) require $8 + 12(n - 1)$ degrees of freedom. Higher order bricks are rarely
used in electromagnetics applications since the regular shape requirement and
decline of accuracy with excessive shape distortion place severe limitations on the
generality of the geometry to be modeled.


Shape functions for hexahedral elements or distorted bricks can be derived by
mapping the element in the xyz coordinate system onto a standard cube in a new
$\xi\eta\zeta$ coordinate system. We proceed along lines similar to the derivation of bases for the
quadrilateral element in Section 2.3.2. We express the Cartesian coordinates $(x, y, z)$
in terms of $(\xi, \eta, \zeta)$ as follows:

$$
\begin{aligned}
x &= a_1 + b_1\xi + c_1\eta + d_1\zeta + e_1\xi\eta + f_1\eta\zeta + g_1\zeta\xi + h_1\xi\eta\zeta \\
y &= a_2 + b_2\xi + c_2\eta + d_2\zeta + e_2\xi\eta + f_2\eta\zeta + g_2\zeta\xi + h_2\xi\eta\zeta \\
z &= a_3 + b_3\xi + c_3\eta + d_3\zeta + e_3\xi\eta + f_3\eta\zeta + g_3\zeta\xi + h_3\xi\eta\zeta
\end{aligned}
\tag{2.22}
$$

The unknown coefficients ($a_i, b_i, c_i, d_i, \ldots$, $i = 1, 2, 3$) can be obtained by a one-to-one
mapping of the corners of the hexahedral element to the corner points of the unit
cube. The desired transformation thus yields the basis functions $N_i^e$ of the hexahedral
finite element

$$N_i^e = \frac{1}{8}(1 + \xi\xi_i)(1 + \eta\eta_i)(1 + \zeta\zeta_i) \tag{2.23}$$

with $(\xi_i, \eta_i, \zeta_i)$ denoting the coordinates of the ith node in the $\xi\eta\zeta$ coordinate system.
As before, the relationship between the $(x, y, z)$ and $(\xi, \eta, \zeta)$ coordinates is given by

$$x = \sum_{i=1}^{8} N_i^e x_i^e, \qquad y = \sum_{i=1}^{8} N_i^e y_i^e, \qquad z = \sum_{i=1}^{8} N_i^e z_i^e$$

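As in the case of the two-dimensional quadrilateral element, the Jacobian
comes to our aid for calculating the volume integral over the arbitrary hexahedral
domain. The Jacobian matrix $[J]$ is of order three and is given by

$$
\begin{Bmatrix} \partial N_i^e/\partial \xi \\ \partial N_i^e/\partial \eta \\ \partial N_i^e/\partial \zeta \end{Bmatrix}
=
\begin{bmatrix}
\partial x/\partial \xi & \partial y/\partial \xi & \partial z/\partial \xi \\
\partial x/\partial \eta & \partial y/\partial \eta & \partial z/\partial \eta \\
\partial x/\partial \zeta & \partial y/\partial \zeta & \partial z/\partial \zeta
\end{bmatrix}
\begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \\ \partial N_i^e/\partial z \end{Bmatrix}
= [J]
\begin{Bmatrix} \partial N_i^e/\partial x \\ \partial N_i^e/\partial y \\ \partial N_i^e/\partial z \end{Bmatrix}
$$

The volume element transformation from the global xyz to the local $\xi\eta\zeta$ coordinate
system is expressed as

$$dx\,dy\,dz = \det[J]\,d\xi\,d\eta\,d\zeta \tag{2.25}$$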

2.3.3.2 Tetrahedral Elements. The three-dimensional analogue of a
two-dimensional triangle is a tetrahedron (four-faced element). Once again, we can
introduce special coordinates, called volume coordinates or simplex coordinates, to
simplify the derivation of shape functions. If P is a point within the eth tetrahedron
shown in Fig. 2.6, the four volume coordinates are given by

$$
\begin{aligned}
L_1^e &= \frac{\text{Volume } P234}{\text{Volume } 1234}, &
L_2^e &= \frac{\text{Volume } P341}{\text{Volume } 1234}, \\
L_3^e &= \frac{\text{Volume } P412}{\text{Volume } 1234}, &
L_4^e &= \frac{\text{Volume } P123}{\text{Volume } 1234}
\end{aligned}
\tag{2.26}
$$

Figure 2.6 Tetrahedral element.

and any position within the element is specified by

$$x = \sum_{i=1}^{4} L_i^e x_i^e, \qquad y = \sum_{i=1}^{4} L_i^e y_i^e, \qquad z = \sum_{i=1}^{4} L_i^e z_i^e$$

with $(x_i^e, y_i^e, z_i^e)$ being the coordinates of node i. As for the two-dimensional case
(triangular elements), the basis functions $N_i^e$ are equal to the volume coordinates,
i.e.,

$$N_i^e = L_i^e, \qquad i = 1, \ldots, 4 \tag{2.27}$$

Quadratic shape functions for a tetrahedron necessitate the use of ten node
points: the four corner nodes and the remaining six at the midpoints of the edges.
The shape functions for the quadratic tetrahedron are given by

$$
\begin{aligned}
N_i^e &= L_i^e (2 L_i^e - 1), & i &= 1, \ldots, 4 & &\text{(corner nodes)} \\
N_i^e &= 4 L_j^e L_k^e, & i &= 5, \ldots, 10 & &\text{(midside nodes; } j \text{ and } k \text{ are the endpoints of each edge)}
\end{aligned}
\tag{2.28}
$$

Similarly to triangular elements, volume coordinates greatly simplify integration
over tetrahedral elements. A useful formula for integrating over the volume of a
tetrahedron is

$$\iiint_{\text{volume}} (L_1^e)^a (L_2^e)^b (L_3^e)^c (L_4^e)^d \, dx\,dy\,dz = 6V \, \frac{a!\,b!\,c!\,d!}{(a + b + c + d + 3)!} \tag{2.29}$$

where a, b, c, and d are integers and V is the volume of the tetrahedron.
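
The volume coordinates and the quadratic tetrahedral shape functions can be sketched in a few lines. The code solves the linear system formed by x = sum L_i x_i, y = sum L_i y_i, z = sum L_i z_i, and sum L_i = 1, rather than evaluating sub-volumes. The ordering of the six edges for the midside nodes is an assumption, and all function names are illustrative.

```python
import numpy as np
from math import factorial

def volume_coordinates(p, vertices):
    """Solve sum L_i = 1 and sum L_i r_i = p for the four volume coordinates."""
    A = np.vstack([np.ones(4), np.asarray(vertices, dtype=float).T])  # 4 x 4 system
    rhs = np.concatenate([[1.0], np.asarray(p, dtype=float)])
    return np.linalg.solve(A, rhs)

def quadratic_tet_shape(L):
    """Ten quadratic shape functions: four corner nodes, then six midside nodes."""
    corners = [Li * (2.0 * Li - 1.0) for Li in L]
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]  # assumed edge order
    midsides = [4.0 * L[j] * L[k] for j, k in edges]
    return np.array(corners + midsides)

def tet_monomial_integral(a, b, c, d, volume):
    """6V a! b! c! d! / (a + b + c + d + 3)!"""
    return (6.0 * volume * factorial(a) * factorial(b) * factorial(c) * factorial(d)
            / factorial(a + b + c + d + 3))

vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
L = volume_coordinates((0.1, 0.2, 0.3), vertices)
N = quadratic_tet_shape(L)
```

The ten quadratic functions sum to one at any interior point, and the integration formula with a = b = c = d = 0 returns the volume itself.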

2.3.3.3 Triangular Prism Elements. Other three-dimensional elements that
have simple shapes include the triangular prism and isoparametric elements. To
ensure that a small number of elements can model a relatively complex region,
distorted prisms can be used in conjunction with rectangular bricks. The shape
functions for the first-order prism element (shown in Fig. 2.7) are given by

$$
\begin{aligned}
N_i^e &= \tfrac{1}{2} L_1^e (1 + \zeta_i \zeta), & i &= 1, 4 \\
N_i^e &= \tfrac{1}{2} L_2^e (1 + \zeta_i \zeta), & i &= 2, 5 \\
N_i^e &= \tfrac{1}{2} L_3^e (1 + \zeta_i \zeta), & i &= 3, 6
\end{aligned}
\tag{2.30}
$$

Figure 2.7 Linear triangular prism element.

where $\zeta = 2(z - z_c)/\text{height}$ varies linearly from $-1$ to $+1$ over the height of the
prism and is zero at the midpoint $z_c$ of the vertical edge (joining nodes 1-4, 2-5,
or 3-6 in Fig. 2.7), and $\zeta_i = \pm 1$ is the value of $\zeta$ at node i. Here $L_i^e$ refer to the area
coordinates of the triangle that forms the cross section of the prism (i.e., the triangle
formed by nodes 123 or 456).
The quadratic node-based triangular prism has 15 nodes: one each at the
corners and at the midpoints of each of the nine edges. Shape functions for the
quadratic and cubic triangular prisms can be found in [4]. For a more efficient
discretization using the fewest unknowns, prisms and bricks can be combined.
This is easily done because prisms and bricks can be readily connected by sharing
the same nodes and edges at their boundaries.

2.4 EDGE-BASED ELEMENTS

In electromagnetics, we encounter serious problems when node-based elements are
employed to represent vector electric or magnetic fields. First, spurious modes are
observed when modeling cavity problems using node-based elements [7]. Nodal basis
functions impose continuity in all three spatial components whereas edge bases
guarantee continuity only along the tangential component. This feature mimics the
behavior of field components along discontinuous material boundaries, and its
importance in resolving the problem of spurious solutions will be discussed in detail
in Chapter 5. Second, nodal bases require special care for enforcing boundary
conditions at material interfaces, conducting surfaces, and geometry corners [8]. The
first limitation can also jeopardize the near-field results of a scattering problem; the
far field typically escapes contamination since spurious modes do not radiate.
Edge-based finite elements, whose degrees of freedom are associated with the
edges and the faces of the finite element mesh, have been shown to be free of the
above shortcomings. The details of how edge bases avoid the pitfalls of nodal basis
functions will be discussed in Chapter 5. Edge basis functions were described by
Whitney [9] over 35 years ago and have been revived by Bossavit and Verite [1],
Nedelec [10], and Hano [11] in the recent past. It was Nedelec's landmark paper [10]
that laid down the guidelines for constructing finite element basis functions that span
his curl conforming space with degrees of freedom associated with the edges, faces,
and elements of a finite element mesh. Mur and de Hoop [12]; van Welij [13]; Barton
and Cendes [14]; Jin and Volakis [15], [16]; and Lee et al. [17], among several others,
have extended their applicability to various two- and three-dimensional shapes and
even constructed higher order elements for a more accurate approximation of the
field values. More recently, Monk [18] provided error estimates and convergence
proofs for edge bases. The derivations leading up to the convergence proof are beyond
the scope of this book but the final result is stated below. If we denote by $H^s(\Omega)$ the
standard Sobolev space of functions of order $s$ in $\Omega$ and by $\|\cdot\|_s$ the norm on this
space, we can then define the space

$$H(\mathrm{curl}; \Omega) = \{\mathbf{u} \in (L^2(\Omega))^3 \mid \nabla \times \mathbf{u} \in (L^2(\Omega))^3\}$$

and its corresponding norm

$$\|\mathbf{u}\|_{H(\mathrm{curl})} = \left(\|\mathbf{u}\|_0^2 + \|\nabla \times \mathbf{u}\|_0^2\right)^{1/2}$$

If $\mathbf{E}$ is the exact solution of Maxwell's equations in $\Omega$ and $\mathbf{E}^h$ is the finite-dimensional
approximation to it (in essence, the edge basis functions), then assuming that
$\mathbf{E} \in (H^{k+1}(\Omega))^3$, the following result holds

$$\|\mathbf{E} - \mathbf{E}^h\|_{H(\mathrm{curl})} \le C h^k \|\mathbf{E}\|_{k+1} \tag{2.31}$$

provided $\omega$ is not an interior eigenvalue and $h$ is sufficiently small. In (2.31), $h$ is the
discretization parameter, $k$ is the order of the basis function and $C$ is a constant
independent of $h$. Thus higher order bases lead to lower errors in the solution when
the sampling size is sufficiently small. The convergence is also optimal in $h$.
2.4.1 Two-Dimensional Basis Functions

2.4.1.1 Rectangular Elements. We consider the rectangular element first since
its vector basis function is usually the easiest to formulate. For the element shown in
Fig. 2.1, we can find its edge-based finite element basis function merely by inspection.

TABLE 2.1 Edge Numbering for Rectangular Element

Edge No.    $i_1$    $i_2$
1           1        2
2           4        3
3           1        4
4           2        3

If the edges are numbered according to Table 2.1, the vector basis functions can be
written as

where $\hat{x}$, $\hat{y}$, and $\hat{z}$ are the unit vectors in the Cartesian coordinate system. The above
basis functions have unity value along one edge and zero over all others, i.e.,

$$\mathbf{W}_i^e \cdot \hat{e}_j = \delta_{ij}$$

where $\delta_{ij}$ is the Kronecker delta and $\hat{e}_j$ is the unit vector along the jth edge.
A graphical illustration of the $\mathbf{W}_1^e$ vector basis function is given in Fig. 2.8. In
this figure, the largest value of $|\mathbf{W}_1^e|$ corresponds to the largest vector length.
Using the above $\mathbf{W}_i^e$, the electric field within the finite element can be
represented as

$$\mathbf{E}^e = \sum_{i=1}^{4} E_i^e \mathbf{W}_i^e \tag{2.32}$$

where now $E_i^e$ denotes the average tangential field along the ith edge. The basis
functions $\mathbf{W}_i^e$ guarantee tangential continuity across inter-element boundaries since
they have a tangential component only along the ith edge and none along the other
edges. They are also divergenceless within the element and possess a constant
non-zero curl. It should be noted that by taking the cross-product of $\hat{z}$ with $\mathbf{W}_i^e$, we obtain
basis functions which possess normal continuity across element boundaries, have zero
curl and non-zero divergence. The latter are ideal for representing surface current
densities and are known as rooftop basis functions in electromagnetics. They have
found extensive use in the solution of integral equations [19] and hybrid finite
element-boundary integral implementations.

Figure 2.8 Illustration of $\mathbf{W}_1^e$ for rectangular element.
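
The stated properties of the rectangular edge bases (unit tangential value along one edge, no tangential component along the others) can be verified with a small sketch. The explicit linear forms used below are the standard ones for a rectangle and are an assumption here, as are the node layout (counterclockwise from the lower-left corner, so that edges 1 and 2 of Table 2.1 run along x and edges 3 and 4 along y) and the function name `rect_edge_bases`.

```python
import numpy as np

def rect_edge_bases(x, y, xc, yc, hx, hy):
    """Rows are the assumed edge basis vectors W1..W4 evaluated at (x, y).

    Assumed layout: edge 1 bottom (nodes 1-2), edge 2 top (nodes 4-3),
    edge 3 left (nodes 1-4), edge 4 right (nodes 2-3).
    """
    W1 = [(yc + hy / 2 - y) / hy, 0.0]  # x-directed, unity on the bottom edge
    W2 = [(y - yc + hy / 2) / hy, 0.0]  # x-directed, unity on the top edge
    W3 = [0.0, (xc + hx / 2 - x) / hx]  # y-directed, unity on the left edge
    W4 = [0.0, (x - xc + hx / 2) / hx]  # y-directed, unity on the right edge
    return np.array([W1, W2, W3, W4])

xc, yc, hx, hy = 1.0, 0.5, 2.0, 1.0
edge_midpoints = [(xc, yc - hy / 2), (xc, yc + hy / 2),
                  (xc - hx / 2, yc), (xc + hx / 2, yc)]
edge_tangents = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

# tangential[i, j] = tangential component of W_i along edge j.
tangential = np.array([
    [rect_edge_bases(px, py, xc, yc, hx, hy)[i] @ edge_tangents[j]
     for j, (px, py) in enumerate(edge_midpoints)]
    for i in range(4)
])
```

Under these assumptions `tangential` is the identity matrix, which is the Kronecker delta property. Each function depends only on the coordinate transverse to its own direction, so its divergence is zero and its curl is constant.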
Edge-based vector bases for quadrilateral elements can be derived by carrying
out the transformation detailed in the derivation of nodal basis for quadrilaterals in
the previous section and then taking the gradient of the resulting expression for each
edge. These bases have two shortcomings. First, the integrals associated with
edge-based quadrilateral elements do not lend themselves to easy evaluation. Second, they
may not be divergence free. However, their ability to model complicated shapes with
fewer unknowns than tetrahedra and the inherent property of enforcing
tangential continuity across elements make them attractive for use in
two-dimensional vector formulations.
2.4.1.2 Triangular Elements. We consider again the triangular element
depicted in Fig. 2.3. Since the edges of an arbitrary triangular element are not
parallel to the x- or y-axis, it is not easy to guess the form of the vector basis function
by inspection. Therefore, the vector basis for a triangular element will be expressed
in terms of its area coordinates, $L_1^e$, $L_2^e$, and $L_3^e$. These are the Whitney elements. If
the local edge numbers are defined according to Table 2.2, then edge bases for a
triangular element are defined as

$$\mathbf{W}_k^e = \mathbf{N}_{ij}^e = \ell_{ij}^e \left(L_i^e \nabla L_j^e - L_j^e \nabla L_i^e\right), \qquad i, j = 1, 2, 3 \tag{2.33}$$

where $\mathbf{W}_k^e$ denotes the basis function for the kth edge of the eth element and $\ell_{ij}^e = \ell_k^e$
is the length of the edge formed by nodes i and j of the triangle. The vector field
inside the triangular element can, therefore, be expanded as

$$\mathbf{E}^e = \sum_{k=1}^{3} E_k^e \mathbf{W}_k^e \tag{2.34}$$

where ?? denotesthe tangential field along the /cth edge. It can be easily shown that

the edge-based functions defined in B.33) have the following properties within the

element

VxJVj
= x
VLJ
\320\230\321\206\320\247\320\246

TABLE 2.2 Edge Numbering for Triangular Element

Edge No.   Node $i_1$   Node $i_2$
1          1            2
2          2            3
3          3            1

If $\hat{e}_1$ is the unit vector pointing from node 1 to node 2 in Fig. 2.3, then $\hat{e}_1 \cdot \nabla L_1^e = -1/\ell_1^e$ and $\hat{e}_1 \cdot \nabla L_2^e = 1/\ell_1^e$. Since $L_1^e$ is a linear function that varies from unity at node 1 to zero at node 2 and $L_2^e$ is unity at node 2 and zero at node 1, we have

\[ \hat{e}_1 \cdot \mathbf{N}_{12}^e = L_1^e + L_2^e = 1 \tag{2.35} \]

along the entire length of edge 1. This implies that $\mathbf{N}_{12}^e$ has a constant tangential component along edge 1. Moreover, since $L_1^e$ vanishes along edge 2, $\nabla L_1^e$ is normal to edge 2; likewise, $L_2^e$ vanishes along edge 3 and $\nabla L_2^e$ is normal to edge 3. Consequently, $\mathbf{N}_{12}^e$ has no tangential component along these edges. Similar observations apply to $\mathbf{N}_{23}^e$ and $\mathbf{N}_{31}^e$. Thus, tangential continuity is preserved across inter-element boundaries but normal continuity is not. Fig. 2.9 shows the actual variation of the basis function for the edge of a right triangle that is opposite to the node associated with the right angle. A different method of constructing edge bases for triangular elements is given in [20], [21].
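The tangential behavior described above can be checked numerically. The following is a minimal sketch, assuming NumPy and an arbitrarily chosen triangle; the function names (`area_coordinate_gradients`, `whitney_edge_basis`, `tangential_samples`) are hypothetical helpers introduced for illustration and do not come from the text. It evaluates the edge basis of (2.33) and samples its tangential component along each of the three edges of Table 2.2.

```python
import numpy as np

def area_coordinate_gradients(verts):
    """Return (c, grads) such that L_i(r) = c[i] + grads[i] . r on a triangle."""
    m = np.hstack([np.ones((3, 1)), verts])   # rows: [1, x_i, y_i]
    coeff = np.linalg.inv(m)                  # column i holds (c_i, dL_i/dx, dL_i/dy)
    return coeff[0], coeff[1:].T

def whitney_edge_basis(verts, i, j, r):
    """Evaluate N_ij = l_ij (L_i grad L_j - L_j grad L_i) at point r (0-based i, j)."""
    c, grads = area_coordinate_gradients(verts)
    L = c + grads @ r
    length = np.linalg.norm(verts[j] - verts[i])
    return length * (L[i] * grads[j] - L[j] * grads[i])

# Arbitrary (assumed) triangle; edges follow Table 2.2 with 0-based node numbers.
verts = np.array([[0.0, 0.0], [2.0, 0.5], [0.4, 1.5]])
edges = [(0, 1), (1, 2), (2, 0)]

def tangential_samples(basis_edge, test_edge, n=5):
    """Tangential component of the basis for basis_edge sampled along test_edge."""
    a, b = verts[test_edge[0]], verts[test_edge[1]]
    t_hat = (b - a) / np.linalg.norm(b - a)
    return [float(whitney_edge_basis(verts, *basis_edge, a + s * (b - a)) @ t_hat)
            for s in np.linspace(0.0, 1.0, n)]

# Basis of edge 1 sampled on edges 1, 2, 3: constant on its own edge, zero on the others.
for k, test_edge in enumerate(edges, start=1):
    print(k, np.round(tangential_samples(edges[0], test_edge), 12))
```

Because the check only uses the area coordinates, it should hold for any non-degenerate triangle, not just the one chosen here.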
Higher order vector basis functions involve adding a node at the midpoint of each edge and including the contribution of facet elements to the approximating function. Unknowns in the triangular element are assigned as shown in Fig. 2.10 [17].

Figure 2.9 Variation of edge basis function for the edge opposite to the right angle.

Figure 2.10 Triangular edge element with two unknowns each per edge and per face.

The tangential projection of the vector field along edge {i, j} is determined by two unknowns, $E_i^j$ and $E_j^i$, and two facet unknowns, $F_1$ and $F_2$, are provided to allow a quadratic approximation of the normal component along two of the three edges. Only two facet unknowns are required to make the range space of the curl operator complete to first order. Therefore, there are eight degrees of freedom for each triangular element. Since the edge variables provide common unknowns across element boundaries, tangential continuity of the field over the boundary is assured. However, an obvious disadvantage of these elements is that the two facet variables cannot be symmetrically assigned. This disadvantage can be avoided by employing third-order edge bases [22]. The higher order approximation to the vector field within the element is given by

(2.36)

where we have arbitrarily chosen the facet variables to lie on edges 1 and 2. These variables are local unknowns associated with each separate triangular element and are included to provide a linear approximation for $\nabla_t \times \mathbf{E}_t$, where the subscript t denotes the tangential components of the operator. This property turns out to be very important in the selection of the order of the basis function to be used in the modeling process. The basis described by (2.36) can be classified as belonging to the $H^1(\mathrm{curl})$ space. The $H^k(\mathrm{curl})$ space consists of those vectors whose inner products are square integrable and whose curl consists of complete polynomials of order k. The basis given in (2.33) thus belongs to the $H^0(\mathrm{curl})$ space, since its curl is merely constant within the finite element. The basis generated by excluding the facial contribution would result in six unknowns, two per edge, but the order of the approximation would still be $H^0(\mathrm{curl})$; it would double the unknown count without adding to the accuracy of modeling the H-field. It should also be noted that the form of the facet bases in (2.36) is different from that in the original paper [17]. This is due to recent analysis [22] which shows that the Nedelec constraints [10] are met by (2.36), resulting in smaller dispersion error and better conditioned matrices.

2.4.2 Three-Dimensional Basis Functions

Edge-based elements have facilitated to a great degree the finite element analysis of three-dimensional structures in electromagnetics. Linear nodal bases, with their problem of spurious modes and difficulty in maintaining only tangential continuity across material interfaces, are not as convenient for electromagnetic field simulations in three dimensions. On the other hand, the introduction of edge-based shape functions provides a robust way of treating general three-dimensional problems having material inhomogeneities and structural irregularities like sharp edges and corners.
In the following section, we will consider first the simple rectangular bricks and will proceed to present edge-based shape functions for more complicated finite elements such as tetrahedrals and curvilinear hexahedrals. The chapter is concluded with a brief discussion on hierarchical edge elements.

2.4.2.1 Rectangular Bricks and Hexahedrals. As in the two-dimensional case, we derive the edge-based shape function for a rectangular brick (see Fig. 2.11) by simple inspection. Since a constant tangential field component must be assigned to each edge of the element, we can express the shape function along each edge of the element as [15]

Figure 2.11 Rectangular brick element. The numbers denote the local node numbering scheme.

where $h_x^e$, $h_y^e$, $h_z^e$ denote the edge lengths in the x, y, and z directions, respectively, and the center coordinates of the brick are given by $(x_c^e, y_c^e, z_c^e)$. If the local edge numbers are defined as in Table 2.3, the vector field within the element can be expressed as

\[ \mathbf{E}^e = \sum_{k=1}^{12} E_k^e \mathbf{W}_k^e \tag{2.37} \]

where $E_k^e$ represents the value of the electric field along the kth edge of the eth element. The vector bases $\mathbf{W}_k^e$ defined for the rectangular brick element have zero divergence and a nonzero curl. Furthermore, the expansion (2.37) guarantees tangential continuity of the electric field across the surfaces of the elements.
TABLE 2.3 Edge Definition for Rectangular Brick

Edge No.   Node $i_1$   Node $i_2$
1          1            2
2          4            3
3          5            6
4          8            7
5          1            4
6          5            8
7          2            3
8          6            7
9          1            5
10         2            6
11         4            8
12         3            7

A rectangular brick element has limitations in the sense that it is unable to model irregular geometries. For this reason, the analog of the two-dimensional quadrilateral (the hexahedral element) is more attractive for modeling practical three-dimensional problems. As in the case of the quadrilateral element in two dimensions, a hexahedral element in Cartesian coordinates can be seen as the image of a unit cube under a trilinear mapping to the $\xi\eta\zeta$ coordinate system (see Fig. 2.12).

Figure 2.12 Mapping of a hexahedral element to a unit cube.

Let us consider those faces for which $\xi$ = constant. $\nabla\xi$ must then possess only a normal component on such a face. Since $\xi$ varies linearly along the edges that are parallel to the $\xi$-axis, the vector function $\nabla\xi$ has nonzero tangential components only along those edges that are parallel to the $\xi$-axis. Using the node-based expression for the shape function in a hexahedral element given in (2.23), we may write the corresponding edge bases as

\[ \mathbf{W}_k^e = \frac{h_k^e}{8} (1 + \eta_k \eta)(1 + \zeta_k \zeta)\, \nabla\xi, \qquad \text{edges} \parallel \xi\text{-axis} \tag{2.38} \]

\[ \mathbf{W}_k^e = \frac{h_k^e}{8} (1 + \zeta_k \zeta)(1 + \xi_k \xi)\, \nabla\eta, \qquad \text{edges} \parallel \eta\text{-axis} \tag{2.39} \]

\[ \mathbf{W}_k^e = \frac{h_k^e}{8} (1 + \xi_k \xi)(1 + \eta_k \eta)\, \nabla\zeta, \qquad \text{edges} \parallel \zeta\text{-axis} \tag{2.40} \]

where $(\xi_k, \eta_k, \zeta_k)$ denote the coordinates at the kth edge and $h_k^e$ is the length of the kth edge belonging to the eth element.
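As a quick sanity check of the form of (2.38), the sketch below evaluates the basis for the four edges parallel to the $\xi$-axis. It assumes the simplest case of an undistorted rectangular brick whose $\xi$-directed edges have length h, with local coordinates running from -1 to +1, so that $\nabla\xi = (2/h)\hat{x}$; the function name `brick_edge_basis_xi` is hypothetical.

```python
import numpy as np

def brick_edge_basis_xi(h, eta_k, zeta_k, eta, zeta):
    """W_k = (h/8)(1 + eta_k*eta)(1 + zeta_k*zeta) grad(xi) for an edge parallel to xi.

    Assumes a rectangular brick whose xi-directed edges have length h and whose
    local coordinates span [-1, 1], so that grad(xi) = (2/h) x_hat.
    """
    grad_xi = np.array([2.0 / h, 0.0, 0.0])
    return (h / 8.0) * (1.0 + eta_k * eta) * (1.0 + zeta_k * zeta) * grad_xi

h = 0.3
# (eta_k, zeta_k) locations of the four xi-parallel edges of the cube
xi_edges = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
x_hat = np.array([1.0, 0.0, 0.0])

# table[a][b] = tangential component of basis a sampled on edge b
table = np.array([[brick_edge_basis_xi(h, *ka, *kb) @ x_hat for kb in xi_edges]
                  for ka in xi_edges])
print(table)
```

Under these assumptions each basis has a unit tangential component on its own edge and none on the other three parallel edges, which is the property the factor $h_k^e/8$ is there to provide.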


The vector bases derived above possess all the desired continuity properties of edge elements and generally result in about half the number of unknowns generated by tetrahedral gridding. The difficulty in generating a finite element mesh of an arbitrary structure using hexahedra can be a serious limitation. In practice, a combination of hexahedra and tetrahedra is often used in a finite element mesh, with continuity imposed across the different element interfaces. However, this may result in ill-conditioned matrices.

2.4.2.2 Tetrahedral Elements. Tetrahedra are, by far, the most popular element shapes employed for three-dimensional applications. This is because the tetrahedral element is the simplest tessellation shape capable of modeling arbitrary three-dimensional geometries and is also well suited for automatic mesh generation. The derivation of shape functions for these elements follows the same pattern as that for triangular vector basis functions. If we consider the tetrahedron shown in Fig. 2.6 and define the edge numbers according to Table 2.4, we have

\[ \mathbf{W}_k^e = \mathbf{N}_{ij}^e = \ell_{ij}^e \left( L_i^e \nabla L_j^e - L_j^e \nabla L_i^e \right), \qquad i, j = 1, \ldots, 4 \tag{2.41} \]

where again $\ell_{ij}^e = \ell_k^e$ denotes the length of the edge between nodes i and j, which in turn define the kth edge. The vector field within the element can then be expanded as

\[ \mathbf{E}^e = \sum_{k=1}^{6} E_k^e \mathbf{W}_k^e \tag{2.42} \]

where the coefficients $E_k^e$ represent the average value of the field along the kth edge of the eth element.

TABLE 2.4 Edge Definition for Tetrahedron

Edge No.   Node $i_1$   Node $i_2$
1          1            2
2          1            3
3          1            4
4          2            3
5          4            2
6          3            4
An elegant explanation of the physical character of the edge-based interpolation function is given by Bossavit [23]. Let us consider edge number 1 connecting nodes 1 and 2 in Fig. 2.6. Since $\nabla L_2^e$ is orthogonal to facet {1,3,4} and $\nabla L_1^e$ is orthogonal to facet {2,3,4}, the field turns around the axis 3-4 and is normal to planes containing nodes 3 and 4. The field thus has only tangential continuity across element faces. Edge elements can also be described as Whitney elements of degree one and can be broadly classified as belonging to the $H^0(\mathrm{curl})$ space.
Whitney elements of the second degree are called facet elements because they are constant over the face of the tetrahedron. The vector function for the facet element can be written as

\[ \mathbf{N}_{ijk}^e = 2 \left( L_i^e \nabla L_j^e \times \nabla L_k^e + L_j^e \nabla L_k^e \times \nabla L_i^e + L_k^e \nabla L_i^e \times \nabla L_j^e \right), \qquad i, j, k = 1, \ldots, 4 \tag{2.43} \]

As explained in [23], we now have a central field (as if emanating from node 4 in Fig. 2.6) on each of the two tetrahedra that share the face {1,2,3}. The field can be imagined as coming from the 'source' 4, growing, crossing the facet, and vanishing into the 'well' 4', the fourth vertex of the other tetrahedron. Thus, this field has normal continuity and the flux across the facet forms the degree of freedom for the element.
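Alternative expressions for the linear basis inside a tetrahedron have been derived in [14]. They are given by

\[ \mathbf{W}_{7-i}^e = \begin{cases} \mathbf{f}_{7-i} + \mathbf{g}_{7-i} \times \mathbf{r}, & \mathbf{r} \text{ in the tetrahedron} \\ 0, & \text{otherwise} \end{cases} \tag{2.44} \]

with

\[ \mathbf{f}_{7-i} = \frac{\ell_{7-i}}{6 V_e} \, \mathbf{r}_{i_1} \times \mathbf{r}_{i_2} \tag{2.45} \]

\[ \mathbf{g}_{7-i} = \frac{\ell_i \ell_{7-i}}{6 V_e} \, \hat{e}_i \tag{2.46} \]

in which $i = 1, 2, \ldots, 6$, $V_e$ is the volume of the tetrahedral element, $\hat{e}_i = (\mathbf{r}_{i_2} - \mathbf{r}_{i_1})/\ell_i$ is the unit vector of the ith edge, and $\ell_i = |\mathbf{r}_{i_2} - \mathbf{r}_{i_1}|$ is the length of the ith edge, with $\mathbf{r}_{i_1}$ and $\mathbf{r}_{i_2}$ denoting the position vectors of the $i_1$ and $i_2$ nodes. It can be shown that (2.41) is identically equal to (2.44) when simplified. Therefore,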


g7_,
=

where $i_1$ and $i_2$ are given in Table 2.4. The basis functions given in (2.44) have zero divergence and constant curl ($\nabla \times \mathbf{W}_i = 2\mathbf{g}_i$). The form of the basis functions given in (2.44) is similar to the zeroth-order edge elements postulated by Nedelec [10].
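The stated equivalence between (2.41) and (2.44) can be spot-checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' derivation: it assumes NumPy, an arbitrarily chosen tetrahedron with positive signed volume, the edge orientation of Table 2.4 (0-based here), and hypothetical helper names `whitney` and `barton_cendes`.

```python
import numpy as np

# Arbitrary (assumed) tetrahedron, ordered so that the signed volume is positive.
verts = np.array([[0.1, 0.2, 0.0], [1.3, 0.1, 0.2], [0.2, 1.1, -0.1], [0.3, 0.4, 0.9]])
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 1), (2, 3)]   # Table 2.4, 0-based nodes

m = np.hstack([np.ones((4, 1)), verts])    # rows: [1, x_i, y_i, z_i]
coeff = np.linalg.inv(m)
c, grads = coeff[0], coeff[1:].T           # L_i(r) = c[i] + grads[i] . r
volume = np.linalg.det(m) / 6.0            # signed volume of the tetrahedron

def edge_length(k):
    a, b = edges[k]
    return np.linalg.norm(verts[b] - verts[a])

def whitney(k, r):
    """Edge basis in the form of (2.41): l_ij (L_i grad L_j - L_j grad L_i)."""
    i, j = edges[k]
    L = c + grads @ r
    return edge_length(k) * (L[i] * grads[j] - L[j] * grads[i])

def barton_cendes(k, r):
    """Edge basis in the form of (2.44): f + g x r, built from the opposite edge."""
    i = 5 - k                               # 0-based version of the pairing k = 7 - i
    i1, i2 = edges[i]
    scale = edge_length(k) / (6.0 * volume)
    f = scale * np.cross(verts[i1], verts[i2])
    g = scale * (verts[i2] - verts[i1])     # equals l_i * l_{7-i} * e_i / (6 V)
    return f + np.cross(g, r)

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(20, 3))
max_diff = max(np.abs(whitney(k, r) - barton_cendes(k, r)).max()
               for k in range(6) for r in points)
print("largest difference between the two forms:", max_diff)
```

Both forms are affine in r, so agreement at a handful of points is enough to indicate that they coincide as functions; the sample points need not lie inside the element.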
The order of the polynomial approximation for the first-order edge element given in (2.41) or (2.44) can be taken as 0.5. This is because the value of the basis function is constant, i.e., $O(1)$, along the edge it supports and is linear everywhere else within the element. Mur and de Hoop [12] presented edge elements which are consistently linear, yielding a linear approximation of the field both inside each tetrahedron and along its edges and faces. However, the curl of the basis is still $O(1)$. Since this requires two unknowns per edge, there are twelve degrees of freedom per element. The basis functions in [12] are derived by first defining the outwardly directed vectorial areas of the faces as

where $\mathbf{r}_i$, $i = 1, \ldots, 4$, denote the position vectors of the vertices of the tetrahedron and $i, j, k, l$ are cyclic. Then the edge-based vectorial expansion function is defined as

(2.48)

where $V$ is the volume of the tetrahedron and $\phi_i(\mathbf{r})$ is a linear scalar function of position given by

in which $\mathbf{r}_c$ is the position vector of the centroid of the tetrahedron. We observe that $\phi_i(\mathbf{r})$ equals unity when $\mathbf{r} = \mathbf{r}_i$ and zero for the remaining vertices of the tetrahedral element. In that sense, they are very similar to the simplex or volume coordinates mentioned earlier. They also satisfy the following equalities:

1=1 1=1

The edge basis function $\mathbf{N}_{ij}$ is a linear vector function of position inside the tetrahedral element, and its tangential component vanishes on all edges of the element except the one joining vertices i and j. $\mathbf{N}_{ij}$ varies linearly along the edge formed by nodes i and j.

These basis functions have nonzero values of divergence and curl.
An inspection of the expressions for the vectorial areas $\mathbf{A}_i$ reveals that the form is identical to that obtained by taking the gradient of one of the simplex or volume coordinates mentioned earlier; in other words, the three components of the vector $\mathbf{A}_i$ have the same functional dependence as that obtained by $\nabla L_i^e$:

V 1
~
X-> 1 \\h
\"
x det -y det V 1
+ 2 det x3 1 \320\233
V I *\"
\320\2434 1 \024 1
\342\200\242V4 >'4

where the volume coordinate $L_i^e$ for a tetrahedron is defined in (2.26), "det" indicates the value of the determinant of the matrix, and $(x_i, y_i, z_i)$ denote the coordinates of the ith vertex. This is only to be expected since the gradient of the shape function is normal to its corresponding edge in two dimensions and normal to its corresponding face in three dimensions. The basis functions with consistently linear interpolation in the tetrahedron can thus be rewritten in a more convenient notation as

\[ \mathbf{N}_{ij}^e = \ell_{ij}^e L_i^e \nabla L_j^e, \qquad i, j = 1, \ldots, 4 \tag{2.49} \]

where the normalization factor $\ell_{ij}^e$ was introduced. Note how the first-order edge basis is similar in form to the zeroth-order edge basis in (2.41).
Still higher order basis functions are sometimes necessary for rapidly varying fields. The second-order edge basis ($O(r^{1.5})$) for a tetrahedral element was first presented by Lee, Sun, and Cendes [24]. We need 20 degrees of freedom to achieve a quadratic approximation of the vector field inside a tetrahedron (see Fig. 2.13).

Figure 2.13 Tetrahedral element.

Accordingly, the field within a tetrahedron can be written as

\[ \mathbf{E}^e = \sum_{i=1}^{4} \sum_{\substack{j=1 \\ j \neq i}}^{4} E_i^j L_i^e \nabla L_j^e + \sum_{\text{faces}} \left[ F_1 L_i^e \left( L_j^e \nabla L_k^e - L_k^e \nabla L_j^e \right) + F_2 L_j^e \left( L_k^e \nabla L_i^e - L_i^e \nabla L_k^e \right) \right] \tag{2.50} \]

where i, j, k form cyclic indices. The facet variables $F_1$ and $F_2$ are common unknowns for two tetrahedra that share the same face. Even higher order edge-based elements complete up to polynomial order two can be constructed. Each tetrahedral element now has 30 unknowns: three along each edge and three on each face.

2.4.2.3 Triangular Prism Elements. The primary attraction of triangular prisms lies in the fact that they yield fewer unknowns than tetrahedrals while retaining the ability to mesh arbitrary geometries, unlike hexahedrals. Moreover, it is sometimes possible to extrude the volume mesh out of an existing surface mesh using triangular prisms. This feature, however, may not always lead to good quality elements, especially when the geometry is non-planar with sharp corners. Finally, it is not easy to construct edge basis functions for such elements. Ozdemir and Volakis [25] proposed edge-based shape functions for right-angled and distorted triangular prisms. The vector basis functions derived in [25] are a combination of edge basis over the triangular cross section and a linear variation over the height of the prism. A sketch of the basis function over the triangular and quadrilateral faces is presented in Fig. 2.14. One of the shortcomings of these bases is the lack of tangential continuity across element faces when the prisms are distorted, i.e., the vertical arms are not at right angles to the plane of the triangular faces.
Figure 2.14 Sketch of edge basis function over triangular and quadrilateral faces of prism element. [Courtesy of T. Ozdemir.]

Figure 2.15 Distorted triangular prism.

The geometry of an arbitrary triangular prism is shown in Fig. 2.15. The edge basis functions for the top triangle edges are given by

\[ \mathbf{W}_k^e = \mathbf{N}_{ij}^e = \ell_{ij}^e \left( L_i^e \nabla L_j^e - L_j^e \nabla L_i^e \right) s, \qquad i, j = 1, 2, 3; \quad k = 1, 2, 3 \tag{2.51} \]

and those for the bottom edges are

\[ \mathbf{W}_k^e = \mathbf{N}_{ij}^e = \ell_{ij}^e \left( L_i^e \nabla L_j^e - L_j^e \nabla L_i^e \right) (1 - s), \qquad i, j = 4, 5, 6; \quad k = 4, 5, 6 \tag{2.52} \]

and the vertical edges are

i = 1, 2, 3; k = 7, 8, 9 (2.53)
In the above equations, $L_i^e$ are the node-based shape functions (area coordinates of the triangle) defined earlier and $s$ is a normalized parameter which is zero at the bottom face and unity at the top face of the prism. It should be noted that the basis functions for the top and the bottom edges are identical in form to a triangular basis scaled by a dimensionless parameter. For the vertical edges, the vector $\hat{v}$ is a linear weighting of the unit vectors $\hat{v}_1$, $\hat{v}_2$, $\hat{v}_3$ associated with the vertical arms and is defined as

(2.54)

This particular choice of $\hat{v}$ minimizes tangential discontinuity across inter-element faces.
It is worthwhile to note that the edge basis functions for the triangular prism stated above are not Nedelec-type [10] elements since the first-order bases are not of the form

\[ \mathbf{W}_k = \boldsymbol{\alpha}_k + \boldsymbol{\beta}_k \times \mathbf{r}, \qquad k = 1, \ldots, \text{number of edges} \tag{2.55} \]

where $\boldsymbol{\alpha}_k$, $\boldsymbol{\beta}_k$ are constants and $\mathbf{r}$ is the position vector inside the finite element. The edge bases for the top and bottom faces fall into the Nedelec form but the bases for the vertical arms do not, hence the loss of tangential continuity across element faces.

2.4.2.4 Curvilinear Elements. Wang and Ida proposed a systematic method for the construction of curvilinear elements in [26]. The vector shape function is expressed in the following form:

\[ \mathbf{W}_k(\mathbf{r}) = \phi_k(\xi, \eta, \zeta)\, \mathbf{v}_k(\mathbf{r}), \qquad k = 1, \ldots, M \tag{2.56} \]

where $\phi_k(\xi, \eta, \zeta)$ are completely defined in the local coordinate system, $\mathbf{v}_k$ contains the edge and facet information, and $M$ denotes the number of degrees of freedom in the element. These basis functions differ from the bases described earlier in the chapter in that they are constructed in the local coordinate system. Since the direction vectors are defined in global coordinates, the bases are uniquely defined. The joint effect of $\phi_k$ and $\mathbf{v}_k$ ensures that $\mathbf{W}_k$ is unity at node k and zero elsewhere. These basis functions usually lead to a symmetric system of equations. Antilla and Alexopoulos also proposed a curved brick superparametric element with 8, 27, and 64 nodes in [27] for solving scattering problems. Superparametric elements attach unknowns to fewer points than are required to define the geometry. The advantage of curvilinear elements lies in the fact that they can model curved surfaces with more accuracy and fewer unknowns than rectilinear elements. Analytical surfaces and even complicated non-planar surface features can thus be modeled exactly at low computational cost. However, many mesh generation packages cannot construct curvilinear elements for arbitrary geometries.

2.4.2.5 Hierarchical Vector Elements. Finite elements are said to be hierarchical when the basis functions for an element are a subset of the basis functions for any element of higher order [4]. Hierarchical elements find use in a class of adaptive finite elements (called p-refinement) where the order of approximation is improved by refining the order of the polynomial basis functions instead of refining the mesh density. However, there is usually a trade-off, since higher order basis functions exact a heavy price in terms of computer resources. A major problem with going to higher order bases is the increased density of the finite element matrix and the likely worsening of the matrix condition number. Moreover, some structures may have features for which a lower order approximation is sufficient to model the field variations. This is especially the case where the field is uniform or constant, and a higher order of interpolation may actually degrade the solution. It turns out that the vector finite elements defined by Nedelec [10] and subsequently derived by Barton and Cendes [14] and Lee et al. [17] have a hierarchical structure. Vector elements complete up to polynomial order two are available, and basis functions of a given order are fully compatible with basis functions of lower or higher orders. Thus elements of different orders could be used in the same mesh. Specifically, lower order elements could be used in regions where the field varies slowly and higher order elements in regions where the field varies rapidly.
The implementation of hierarchical vector elements can be difficult, especially at the transition boundaries where elements of one order merge into the elements of higher or lower order. If several vector elements share an edge, the field tangent to the edge must be made identical in each of the tetrahedra. This is done by carefully matching the coefficients of the vector basis function corresponding to that edge. For tangential continuity across a face, the same equality must be enforced between the coefficients of all the edge and facet functions associated with the face. Table 2.5, given in [28], shows the basis functions for hierarchical vector finite elements. It should be mentioned that for the zeroth-order edge element, describing the polynomial approximation as being of order 0.5 is somewhat of a misnomer. It should be taken to mean that the field variation along the edge is constant, i.e., $O(r^0)$, and the variation normal to the edge is $O(r^1)$. Averaging the orders, albeit a mathematically dubious procedure, yields the described polynomial order. On the plus side, the table offers a concise view of the hierarchical nature of these edge elements. Higher order basis functions are constructed by systematically adding the extra terms up to the desired order. It should be noted that the bases for the tetrahedron with six and 20 unknowns shown in Table 2.5 are identical to the $H^0(\mathrm{curl})$ and $H^1(\mathrm{curl})$ edge bases given in (2.41) and (2.49), respectively.

TABLE 2.5 Hierarchical Basis Functions for a Tetrahedral Element

Element Type   Polynomial Order   Unknowns per Element   Basis Function
Edge           0.5                6                      $L_i \nabla L_j - L_j \nabla L_i$
Edge           1                  12                     $\nabla (L_i L_j)$
Face           1.5                20                     $L_i (L_j \nabla L_k - L_k \nabla L_j)$
Face                                                     $L_j (L_k \nabla L_i - L_i \nabla L_k)$
Edge           2                  30                     $\nabla [L_i L_j (L_i - L_j)]$
Face                                                     $\nabla (L_i L_j L_k)$
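The unknown counts in Table 2.5 follow from simple bookkeeping over the six edges and four faces of a tetrahedron. The short sketch below reproduces that arithmetic; the per-edge and per-face counts are read off the descriptions in this section (two unknowns per edge for the consistently linear element, two facet unknowns per face for the 20-unknown element, three per edge and three per face for the 30-unknown element), and the variable names are illustrative only.

```python
EDGES, FACES = 6, 4   # a tetrahedron has six edges and four faces

# (polynomial order, unknowns per edge, unknowns per face), cumulative per level
levels = [(0.5, 1, 0), (1.0, 2, 0), (1.5, 2, 2), (2.0, 3, 3)]

counts = {order: EDGES * per_edge + FACES * per_face
          for order, per_edge, per_face in levels}
print(counts)   # {0.5: 6, 1.0: 12, 1.5: 20, 2.0: 30}
```

Each level keeps the unknowns of the level below it and adds new ones, which is the hierarchical property the table is meant to convey.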
REFERENCES

[1] A. Bossavit and J. C. Verite. A mixed FEM-BIEM method to solve 3D eddy current problems. IEEE Trans. Magnetics, 18:431-5, March 1982.
[2] J. P. Webb. Edge elements and what they can do for you. IEEE Trans. Magnetics, 29:1460-1465, 1993.
[3] S. H. Wong and Z. J. Cendes. Combined finite element-modal solution of three-dimensional eddy current problems. IEEE Trans. Magnetics, 24(6), November 1988.
[4] O. C. Zienkiewicz. The Finite Element Method. McGraw-Hill, New York, Third edition, 1979.
[5] K. Tuncer, D. Norrie, and F. Brezzi. Finite Element Handbook. McGraw-Hill, New York, 1987.
[6] J. M. Jin. The Finite Element Method in Electromagnetics. John Wiley & Sons, New York, 1993.
[7] Z. J. Cendes and P. Silvester. Numerical solution of dielectric loaded waveguides: I-Finite element analysis. IEEE Trans. Microwave Theory Tech., 18:1124-1131, 1970.
[8] X. Yuan, D. R. Lynch, and K. Paulsen. Importance of normal field continuity in inhomogeneous scattering calculations. IEEE Trans. Microwave Theory Tech., 39:638-642, April 1991.
[9] H. Whitney. Geometric Integration Theory. Princeton Univ. Press, NJ, 1957.
[10] J. C. Nedelec. Mixed finite elements in $R^3$. Numer. Math., 35:315-41, 1980.
[11] M. Hano. Finite element analysis of dielectric-loaded waveguides. IEEE Trans. Microwave Theory Tech., 32:1275-1279, October 1984.
[12] G. Mur and A. T. de Hoop. A finite element method for computing three-dimensional electromagnetic fields in inhomogeneous media. IEEE Trans. Magnetics, 21:2188-2191, November 1985.
[13] J. S. van Welij. Calculation of eddy currents in terms of H on hexahedra. IEEE Trans. Magnetics, 21:2239-2241, November 1985.
[14] M. L. Barton and Z. J. Cendes. New vector finite elements for three-dimensional magnetic field computation. J. Appl. Phys., 61(8):3919-3921, April 1987.
[15] J. M. Jin and J. L. Volakis. Electromagnetic scattering by and transmission through a three-dimensional slot in a thick conducting plane. IEEE Trans. Antennas Propagat., 39(4):543-550, April 1991.
[16] J. M. Jin and J. L. Volakis. Scattering and radiation from microstrip patch antennas and arrays residing in a cavity. IEEE Trans. Antennas Propagat., 39:1598-1604, November 1991.
[17] J. F. Lee, D. K. Sun, and Z. J. Cendes. Full-wave analysis of dielectric waveguides using tangential vector finite elements. IEEE Trans. Microwave Theory Tech., MTT-39(8):1262-1271, August 1991.
[18] P. Monk. A finite element method for approximating the time-harmonic Maxwell equations. Numer. Math., 63:243-261, 1992.
[19] D. H. Schaubert, D. R. Wilton, and A. W. Glisson. A tetrahedral modeling method for electromagnetic scattering by arbitrarily shaped inhomogeneous dielectric bodies. IEEE Trans. Antennas Propagat., pp. 77-85, January 1984.
[20] D. R. Tanner and A. F. Peterson. Vector expansion functions for the numerical solution of Maxwell's equations. Microwave Opt. Tech. Lett., 2(2):331-334, 1989.
[21] R. D. Graglia, D. R. Wilton, and A. F. Peterson. Higher order interpolatory vector bases for computational electromagnetics. IEEE Trans. Antennas Propagat., pp. 329-342, March 1997.
[22] J. S. Savage and A. F. Peterson. Higher-order vector finite elements for tetrahedral cells. IEEE Trans. Microwave Theory Tech., 44(6):874-879, June 1996.
[23] A. Bossavit. Whitney forms: A class of finite elements for three-dimensional computations in electromagnetism. IEE Proceedings, 135, pt. A(8), November 1988.
[24] J. F. Lee, D. K. Sun, and Z. J. Cendes. Tangential vector finite elements for electromagnetic field computation. IEEE Trans. Magnetics, 27(5):4032-4035, September 1991.
[25] T. Ozdemir and J. L. Volakis. Triangular prisms for edge-based vector finite element antenna analysis. IEEE Trans. Antennas Propagat., pp. 788-797, May 1997.
[26] J. S. Wang and N. Ida. Curvilinear and higher order 'edge' finite elements in electromagnetic field computation. IEEE Trans. Magnetics, 29(2):1491-1494, March 1993.
[27] G. E. Antilla and N. G. Alexopoulos. Scattering from complex three-dimensional geometries by a curvilinear hybrid finite element-integral equation approach. J. Opt. Soc. Am. A, 11(4):1445-1457, April 1994.
[28] J. P. Webb and B. Forghani. Hierarchal scalar and vector tetrahedra. IEEE Trans. Magnetics, 29(2):1495-1498, March 1993.
Overview of the Finite Element Method: One-Dimensional Examples

3.1 INTRODUCTION

The finite element method (FEM) belongs to the class of partial differential equation (PDE) methods. Its origin is frequently traced to Courant [1] who in the 1940s first discussed piecewise approximations in the appendix of his paper. In the 1950s, Argyris [2] began putting together the many mathematical ideas (domain partitioning, assembly, boundary conditions, etc.) that comprise the FEM for aircraft structural analysis. The introduction of FEM to the engineering community occurred in the 1960s, and some feel that the conferences on finite elements held in 1965, 1968, and 1970 at the Wright Patterson Air Force Base in Dayton, Ohio, U.S. played an important role in advancing the method. Finite element activity in electrical engineering also began in the late 1960s with the papers by Silvester [3] (see also the reprints volume [4] and Arlett, Bahrani and Zienkiewicz [5]) addressing applications to waveguide and cavity analysis. Later developments on absorbing boundary conditions, perfectly matched absorbers and hybridizations with boundary integral methods have led to the successful application of the FEM to open domain problems in scattering, microwave circuits, and antennas. The method's main advantage is its capability to treat any type of geometry and material inhomogeneity without a need to alter the formulation or the computer code. That is, it provides geometrical fidelity and unrestricted material treatment. Moreover, the application of the FEM leads to sparse matrix systems which can be stored with low memory requirements when iterative solvers are employed for the solution of these systems. We typically state that the FEM systems have $O(N)$ storage requirements, implying that the memory needed for a solution of an FEM system is proportional to the number of unknowns N. For most cases these memory requirements may range from 10N to 40N depending on the type of problem considered and the employed basis or expansion functions approximating the field within the computation domain. This is in contrast to boundary integral solutions which lead to fully populated systems having $O(N^2)$ storage and $O(N^3)$ CPU requirements. However, it should be pointed out that the number of unknowns for boundary integral equations is generally much smaller than that of the FEM for the same problem. Nevertheless, when dealing with nonmetallic structures, the FEM and its hybrid versions are the most attractive choice.
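To give a feel for the difference between the two storage scalings, the following sketch compares entry counts and memory for a few problem sizes. The factor of 40 entries per unknown is the upper end of the range quoted above; the 16 bytes per entry (a double-precision complex number) and the function names are assumptions made for illustration only.

```python
BYTES_PER_ENTRY = 16   # assumed: one double-precision complex number per entry

def fem_entries(n, per_unknown=40):
    """Sparse FEM storage: a fixed number of entries per unknown, i.e. O(N)."""
    return per_unknown * n

def boundary_integral_entries(n):
    """Fully populated boundary integral matrix: N * N entries, i.e. O(N^2)."""
    return n * n

for n in (10_000, 100_000, 1_000_000):
    fem_mb = fem_entries(n) * BYTES_PER_ENTRY / 1e6
    bi_mb = boundary_integral_entries(n) * BYTES_PER_ENTRY / 1e6
    print(f"N = {n:>9,d}: FEM ~ {fem_mb:12,.0f} MB, dense ~ {bi_mb:16,.0f} MB")
```

The comparison is at equal N; as noted above, a boundary integral formulation of the same problem generally needs far fewer unknowns, so the practical gap is smaller than these figures suggest.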

3.2 OVERVIEW OF THE FINITE ELEMENT METHOD

The geometrical adaptability and low memory requirements of the FEM have made it one of the most popular numerical methods in all branches of engineering. Its application to boundary value problems [6] involves the subdivision of the computational domain (region where the fields are to be determined) into smaller elements [7], [8]. For two-dimensional problems, these elements are typically triangles or quadrilaterals as discussed in Chapter 2 and illustrated in Fig. 3.1. Additional example meshes are given in Figs. 3.2 and 3.3, with the latter referring to a three-dimensional mesh around a sphere.
The subdivision of the domain into small elements is referred to as meshing or discretization of the geometry and is an important part of the FEM solution procedure. By keeping the elements small enough (typically less than 1/10 of a wavelength per side), the field interior to the element can be safely approximated by some linear or, if necessary, higher order expansion. The collection of these elements and their associated expansion or shape function is therefore capable of modeling arbitrary and rather complex fields in terms of unknown coefficients which may represent the field values at the nodes (node-based basis) or the average field values over the edges (edge-based basis).
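The one-tenth-of-a-wavelength guideline translates directly into a maximum element edge length for a given frequency. The sketch below is a rough illustration: it assumes that the relevant wavelength is the one inside the material (free-space wavelength divided by the square root of the relative permittivity times the relative permeability), and the function name `max_edge_length` and its parameters are hypothetical.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s

def max_edge_length(freq_hz, eps_r=1.0, mu_r=1.0, per_wavelength=10):
    """Largest element side (m) under the 'wavelength / per_wavelength' rule of thumb.

    Assumes the wavelength inside the material is the one that matters.
    """
    wavelength = C0 / (freq_hz * math.sqrt(eps_r * mu_r))
    return wavelength / per_wavelength

for f, eps in ((300e6, 1.0), (3e9, 1.0), (3e9, 4.0)):
    print(f"f = {f / 1e9:4.1f} GHz, eps_r = {eps}: h_max ~ {max_edge_length(f, eps) * 1e3:.2f} mm")
```

Since the element count in a three-dimensional mesh grows roughly with the cube of one over the edge length, doubling the frequency or quadrupling the permittivity raises the number of elements by about a factor of eight.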
In the context of the FEM, the equations for the unknown coefficients of the expansions are constructed by enforcing the wave equation in a weighted (average) sense over each element. A subsequent step involves the application of the boundary conditions leading to a matrix system of the form

\[ [A]\{x\} = \{b\} \tag{3.1} \]

where $\{b\}$ is a column matrix and is determined on the basis of the boundary conditions
or the forced excitation (current source, incident field, etc.). The matrix [A] is
square of size N x N, very sparse and typically symmetric unless nonreciprocal
material exists in the computational domain. Its nonzero entries provide the relationship
among field or voltage of adjacent elements within the computational domain,
and its specific form is a characteristic of the problem geometry and discretization of
the domain. Once the system (3.1) has been constructed, its solution proceeds with
the application of an iterative or direct solver. Iterative solvers are primarily used for
large systems (i.e., large numbers of unknowns, N) since these solvers avoid explicit
storage of the entire matrix. That is, only the nonzero entries need be stored by

Figure 3.1 Example illustrations of finite element meshes: (a) shielded strip-
conductor transmission line problem; (b) shielded circular conductor
transmission line problem.

Figure 3.2 Finite element mesh around an airfoil for scattering computations.
[Courtesy of Daniel C. Ross.]

Figure 3.3 Structured tetrahedral mesh around a metallic sphere.

employing established storage schemes such as the compressed row and ITPACK
formats discussed in Chapter 9. Direct solvers such as LU decomposition are still
better suited for smaller size systems since they require storage of the entire matrix
including its zero entries.
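
The idea of storing only the nonzero entries can be sketched as follows. This is a minimal pure-Python illustration of a generic compressed-row layout (values, column indices, row pointers) for a tridiagonal matrix; it is not necessarily the exact scheme of Chapter 9, and the function names are made up for this example.

```python
def tridiagonal_to_csr(lower, diag, upper):
    """Store a tridiagonal matrix row by row, keeping only its nonzeros.

    Returns (values, col_index, row_ptr); row i occupies the slice
    values[row_ptr[i]:row_ptr[i + 1]].
    """
    n = len(diag)
    values, col_index, row_ptr = [], [], [0]
    for i in range(n):
        if i > 0:
            values.append(lower[i - 1])
            col_index.append(i - 1)
        values.append(diag[i])
        col_index.append(i)
        if i < n - 1:
            values.append(upper[i])
            col_index.append(i + 1)
        row_ptr.append(len(values))
    return values, col_index, row_ptr

def csr_matvec(values, col_index, row_ptr, x):
    """Matrix-vector product using only the stored nonzero entries."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_index[k]]
    return y

n = 5
values, col_index, row_ptr = tridiagonal_to_csr(
    [-1.0] * (n - 1), [2.0] * n, [-1.0] * (n - 1))
y = csr_matvec(values, col_index, row_ptr, [1.0] * n)
print(len(values), "stored entries instead of", n * n)
```

An iterative solver only needs such matrix-vector products, which is why it can work from the compressed arrays alone.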
The steps involved in the generation and solution of an FEM system can be
summarized as follows:

• Define the problem's computational domain
• Choose mesh truncation schemes (in the case of open domain problems)
• Choose discrete elements and shape functions
• Generate mesh (preprocessing)
• Enforce the wave equation over each element (or Laplace's/Poisson's equation
  for statics) to generate the element matrices
• Apply boundary conditions and assemble element matrices to form the overall
  sparse system (3.1)
• Ensure matrix symmetry (for domains with reciprocal materials)
• Choose solver and solve matrix system
• Postprocess field data to extract parameters of interest (such as eigenvalues,
  capacitance, impedance, insertion loss, scattering matrix, radar cross section
  and so on)

In this chapter we will present these steps for one-dimensional problems before
discussing them for two-dimensional applications in Chapter 4. Although many of
the two-dimensional problems are approximations of three-dimensional ones, they
are nevertheless attractive because of their simplicity. Consequently, they can be used
to illustrate the solution procedure without burdens due to geometrical and formulation
complexity. Of course, one-dimensional problems provide the simplest way to
illustrate the computational approach and below we begin with a brief illustration of
the FEM for solving the classic Sturm-Liouville differential equation.

3.3 EXAMPLES OF ONE-DIMENSIONAL PROBLEMS IN ELECTROMAGNETICS

Consider the solution of the differential equation (one-dimensional Sturm-Liouville
problem) [6]

$$-\frac{d}{dx}\left(p(x)\frac{dU}{dx}\right) + q(x)\,U(x) = f(x), \qquad 0 < x < x_a \tag{3.2}$$

where $p(x)$, $q(x)$, and $f(x)$ are known functions and $U(x)$ is the unknown field or
voltage quantity. Depending on the interpretation of $U(x)$, this equation can
represent any of the following problems illustrated in Fig. 3.4.

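Parallel Plate Capacitor

$U(x) = V(x)$: potential between plates

Boundary Conditions: $V(0) = 0$, $V(x_a) = V_a$,
$p(x) = -1$, $q(x) = 0$, $f(x) = -\rho/\epsilon$

Differential equ.:
$$\frac{d^2 V(x)}{dx^2} = -\frac{\rho}{\epsilon} \quad (\text{or } \nabla^2 V = -\rho/\epsilon) \tag{3.3}$$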

Wave Between Parallel Plates

$U(x) = E_y(x)$: electric field between plates

Boundary Conditions: $E_y(0) = E_y(x_a) = 0$,
$p(x) = -1/\mu_r$, $q(x) = k_0^2\epsilon_r$, $f(x)$ = source function

Differential equ.:
$$\frac{d}{dx}\left(\frac{1}{\mu_r}\frac{dE_y}{dx}\right) + k_0^2\epsilon_r E_y = f(x) \tag{3.4}$$

Reflection From a Coated Metallic Conductor

$U(x) = E_z(x)$: perpendicular polarization, or $U(x) = H_z(x)$: parallel polarization

Boundary Conditions:

$$E_z(x = 0) = 0, \qquad \left[\frac{dE_z}{dx} + jk_0E_z\right]_{x=x_a} = 2jk_0\,e^{jk_0x_a}$$

or

$$\frac{\partial H_z}{\partial x}\bigg|_{x=0} = 0, \qquad \left[\frac{dH_z}{dx} + jk_0H_z\right]_{x=x_a} = 2jk_0\,e^{jk_0x_a}$$

Differential equ.:
$$\frac{d}{dx}\left(\frac{1}{\mu_r}\frac{dE_z}{dx}\right) + k_0^2\epsilon_rE_z = 0 \quad \text{or} \quad \frac{d}{dx}\left(\frac{1}{\epsilon_r}\frac{dH_z}{dx}\right) + k_0^2\mu_rH_z = 0 \tag{3.5}$$

Figure 3.4 Problems represented by the one-dimensional differential equation:
(a) infinite capacitor problem; (b) parallel plate; (c) reflection from a coated
metallic conductor. Here $x_a = a$ in (a) and (b).

In this latter case, the boundary condition at $x = x_a$ is a consequence of the fact that
the reflected field must be of the form

$$E_z^{r} \ (\text{or } H_z^{r}) = R\,e^{-jk_0x} \tag{3.6}$$

where R is the reflection coefficient of the coated ground plane and is not known
until the FEM solution is completed. Thus, the total field

$$E_z = E_z^{inc} + E_z^{r} \quad \text{or} \quad H_z = H_z^{inc} + H_z^{r} \tag{3.7}$$

satisfies the stated boundary condition. As indicated in Fig. 3.4, $E_z^{inc}$ and $H_z^{inc}$
represent the z components of the plane wave incident upon the metal backed
dielectric slab.
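
As a brief check of how such a boundary condition arises, the following sketch assumes a unit-amplitude incident wave $e^{jk_0x}$ travelling toward the coated conductor and a reflected wave $R\,e^{-jk_0x}$ (an $e^{j\omega t}$ time convention is assumed; neither assumption is stated explicitly in this section).

```latex
\begin{aligned}
E_z(x) &= e^{jk_0x} + R\,e^{-jk_0x}, \qquad x \ge x_a,\\
\frac{dE_z}{dx} &= jk_0\,e^{jk_0x} - jk_0R\,e^{-jk_0x},\\
\frac{dE_z}{dx} + jk_0E_z &= 2jk_0\,e^{jk_0x}.
\end{aligned}
```

The unknown R drops out of the last line, so evaluating it at $x = x_a$ gives a condition with the known right-hand side $2jk_0e^{jk_0x_a}$ that can be imposed before R has been computed. The same steps apply to $H_z$.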

3.4 THE WEIGHTED RESIDUAL METHOD

We proceed now with the solution of (3.2) on the basis of the finite element method.
As a first step we introduce the residual

$$R(x) = -\frac{d}{dx}\left(p(x)\frac{dU}{dx}\right) + q(x)\,U(x) - f(x) \tag{3.8}$$

which must of course be zero in accordance with the stated problem. However, it is
impractical to enforce $R(x) = 0$ at every point in the domain from $x = 0$ to
$x = x_a = a$. Since $U(x)$ is not expected to vary substantially over a small distance,
say $\Delta x$, we subdivide the domain into small segments (as shown in Fig. 3.5) and
instead enforce the condition

$$\int_{\text{Domain of } W_m} W_m(x)\,R(x)\,dx = 0 \tag{3.9}$$

over each of the segments. Remarking that $W_m(x)$ is some weighting function to be
defined later, (3.9) enforces the differential equation in an average sense over the mth
segment. By changing the integration or testing interval (and weighting function $W_m$)
from $m = 1$ up to $m = N$, we can construct N equations for the solution of the
discretized potential or field values.

Figure 3.5 Tessellation of the line segment $0 < x < x_a$.

Before proceeding to do so, we make the following observations about $W_m(x)$:

• If $W_m(x) = \delta(x - x_m)$ or $W_m(x) = \delta[x - (x_{m+1} + x_m)/2]$, the resulting
  weighted residual procedure is referred to as point matching and leads to
  a form of the finite difference method.
• If $W_m(x)$ is set equal to the basis functions used for the representation of
  $U(x)$, the procedure is referred to as Galerkin's method. This is the most
  popular testing/weighting method for casting the differential equation to a
  linear system.
• The choice of $W_m(x)$ is not completely arbitrary. For the mathematical steps
  in the FEM procedure to hold rigorously, $W_m(x)$ and its derivative must be
  at least square integrable over the domain. Specifically, for the problem at
  hand, it must satisfy the condition

  $$\int_0^{x_a}\left[\,|W_m(x)|^2 + \left|\frac{dW_m}{dx}\right|^2\right]dx < \infty$$

  In addition, $W_m(x)$ must satisfy conditions at the boundary nodes (endpoints
  at $x = 0$ and $x = x_a$ for the one-dimensional problem) which are
  compatible with the imposed boundary conditions. Certain smoothness
  conditions on $W_m(x)$ and $dW_m(x)/dx$ may also need to be imposed.
Before generating a linear system of equations from (3.9) subject to the boundary
conditions, it is first necessary to cast it in a more suitable form by following the
steps:

Step 1 Take advantage of the weighting function $W_m(x)$ to reduce the order
of the derivatives contained in $R(x)$. To do so we employ integration by parts, giving

$$\int_0^{x_a} W_m(x)\frac{d}{dx}\left(p(x)\frac{dU}{dx}\right)dx = \left[W_m(x)\,p(x)\frac{dU}{dx}\right]_{x=0}^{x=x_a} - \int_0^{x_a} p(x)\frac{dW_m}{dx}\frac{dU}{dx}\,dx \tag{3.10}$$

The first right hand side (RHS) term can be evaluated by enforcing the known
boundary conditions at the endpoints. Its effect on the overall system will be
considered later.
Step 2 Derive the weak form of the differential equation. The weak form of
the differential equation is most appropriate for numerical solution and is obtained
by substituting (3.10) into (3.9). We have

$$\int_0^{x_a}\left[p(x)\frac{dW_m}{dx}\frac{dU}{dx} + q(x)\,W_m(x)\,U(x) - W_m(x)\,f(x)\right]dx - \left[p(x)\,W_m(x)\frac{dU}{dx}\right]_{x=0}^{x=x_a} = 0 \tag{3.11}$$

which holds provided (3.2) is valid. However, because of the integral, the weak form
(3.11) enforces the differential equation in an average (and therefore weaker) sense.
Equation (3.11) is often referred to as a variational statement of the problem. What is
remarkable about (3.11) is that it incorporates in a single mathematical statement the
requirements imposed by the differential equation and the boundary conditions at the
endpoints. That is, upon substitution of the boundary conditions on $U(x)$ and
$dU(x)/dx$, (3.11) is not only an alternative statement of the differential equation
(3.2), but also includes information about the boundary conditions which are
essential for the uniqueness of the solution. This is at the heart of the FEM, and later in
this chapter we will observe how the boundary conditions impact the discrete form of
(3.11).
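
A quick numerical check of the weak form may help make the role of the endpoint term concrete. The sketch below assumes the data p = 1, q = π², f = 2π² sin πx with exact solution U = sin πx on 0 < x < 1, and an arbitrarily chosen weighting function W(x) = x that does not vanish at x = 1; the Simpson integrator and all names are illustrative only.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3.0

p = lambda x: 1.0
q = lambda x: math.pi ** 2
f = lambda x: 2.0 * math.pi ** 2 * math.sin(math.pi * x)
U = lambda x: math.sin(math.pi * x)            # exact solution
dU = lambda x: math.pi * math.cos(math.pi * x)
W = lambda x: x                                # assumed test function
dW = lambda x: 1.0

# Volume integral of p W' U' + q W U - W f over the domain
integrand = lambda x: p(x) * dW(x) * dU(x) + q(x) * W(x) * U(x) - W(x) * f(x)
volume_term = simpson(integrand, 0.0, 1.0)

# Endpoint term [p W dU/dx] evaluated between x = 0 and x = 1
boundary_term = p(1.0) * W(1.0) * dU(1.0) - p(0.0) * W(0.0) * dU(0.0)

weak_residual = volume_term - boundary_term
print(volume_term, boundary_term, weak_residual)
```

Neither term is zero on its own here; only their difference vanishes, which is the sense in which the weak form carries the boundary information together with the differential equation.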

3.5 DISCRETIZATION OF THE "WEAK" DIFFERENTIAL EQUATION

The discretization of (3.11) to a linear set of equations is done by introducing an
expansion for $U(x)$ and then making appropriate choices for the weighting functions
$W_m(x)$. This is an essential step in all numerical solution procedures, and the
previous chapter served to introduce the various classes of basis functions used in a
discrete representation of the unknown function. We choose the linear representation

$$U(x) = \sum_{e=1}^{N_e}\sum_{i=1}^{2} U_i^e\,N_i^e(x) \tag{3.12}$$

where $U_i^e$ are the unknown coefficients of the expansion and (see Fig. 3.6)

$$N_1^e(x) = \begin{cases}\dfrac{x_2^e - x}{x_2^e - x_1^e} & x_1^e < x < x_2^e\\[2mm] 0 & \text{otherwise}\end{cases} \qquad N_2^e(x) = \begin{cases}\dfrac{x - x_1^e}{x_2^e - x_1^e} & x_1^e < x < x_2^e\\[2mm] 0 & \text{otherwise}\end{cases} \tag{3.13}$$

are the shape functions discussed in Chapter 2. That is,

$$U(x) = U_1^e\,N_1^e(x) + U_2^e\,N_2^e(x), \qquad x_1^e < x < x_2^e \tag{3.14}$$

where $U_1^e$ and $U_2^e$ are the unknown values at each node. On the basis of this
expansion, the field or potential over the domain $0 < x < x_a$ is determined by a linear
combination of the field or potential values at each node. Although not necessary for
this one-dimensional example, two types of node-numbering schemes are typically
used to facilitate the programming and implementation of the finite element solution.

Figure 3.6 Illustration of the eth segment or element and the linear shape functions:
(a) the nodal expansion functions; (b) overlay of the nodal expansion functions.
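
The linear expansion can be sketched in a few lines. The code below assumes two-node elements with the usual linear shape functions (equal to one at their own node and zero at the other); the function names are illustrative and not taken from the text.

```python
def shape_functions(x, x1, x2):
    """Linear shape functions N1, N2 of an element spanning [x1, x2];
    both are zero outside the element."""
    if not (x1 <= x <= x2):
        return 0.0, 0.0
    length = x2 - x1
    return (x2 - x) / length, (x - x1) / length

def interpolate(x, nodes, values):
    """Evaluate the piecewise-linear expansion at x from the nodal values."""
    for e in range(len(nodes) - 1):
        if nodes[e] <= x <= nodes[e + 1]:
            n1, n2 = shape_functions(x, nodes[e], nodes[e + 1])
            return values[e] * n1 + values[e + 1] * n2
    raise ValueError("x lies outside the meshed segment")

nodes = [0.0, 0.5, 1.0]       # two elements
values = [0.0, 2.0, 1.0]      # nodal field values
u_a = interpolate(0.25, nodes, values)   # midpoint of the first element
u_b = interpolate(0.75, nodes, values)   # midpoint of the second element
print(u_a, u_b)
```

At a node the expansion returns the nodal value itself, which is why the unknown coefficients can be read directly as node fields or potentials.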

Local Node Numbers. These are assigned node numbers unique only within a
single element. For the line segments in Fig. 3.5 each segment is formed by two nodes
having the local node numbers 1 and 2, as illustrated in Fig. 3.6. That is, the notation
$x_1^e$ refers to the location of the local node 1 of the eth element. Similarly, $U_1^e$ refers to
the field or potential at node 1 of the eth element. Using this type of notation, we can
develop formulations and equations for a single element which can then be
incorporated into the overall solution by attaching the superscript e to all local or element
variables. In this manner, the uniqueness of the equations is maintained when
combined with those from the other elements. The elements (two-node segments for
one-dimensional problems) in the computational domain are assumed to have a unique
number from $e = 1$ to $e = N_e$.

Global Node Numbers. Each node of the discretized domain is also given a
unique number from 1 to N, as shown for example in Fig. 3.5. The assignment of
these global numbers is necessary since eventually all unknowns from each element
must be collected (a process referred to as element assembly) into a matrix system
such as that in (3.1). At that stage, it is necessary to maintain a single subscript or
array dimension. This necessitates the correspondence between the notation $x_i^e$ and
$x_n$, where the first refers to the local numbering notation and hereon the single
subscript will be reserved for the global numbering notation. As can be realized,
since every interior node in Fig. 3.5 belongs to two elements, multiple local notations can
refer to the same global node or field value. As an example (see Fig. 3.6), the node
location $x_n$ is identical to the locations implied by the notations $x_1^e$ and $x_2^{e-1}$.
Likewise, for the field values we can state that $U_n = U_1^e = U_2^{e-1}$ and so on.
When the expansion (3.14) is substituted into (3.11) we get

$$\sum_{e=1}^{N_e}\left\{\sum_{j=1}^{2} U_j^e\int_{x_1^e}^{x_2^e}\left[p(x)\frac{dW_m}{dx}\frac{dN_j^e}{dx} + q(x)\,W_m(x)\,N_j^e(x)\right]dx - \int_{x_1^e}^{x_2^e} W_m(x)\,f(x)\,dx\right\} - \left[p(x)\,W_m(x)\frac{dU}{dx}\right]_{x=0}^{x=x_a} = 0 \tag{3.15}$$

The latter terms in the brackets are due to contributions from the endpoints of the
domain and their evaluation is subject to the specific boundary conditions. This
equation now explicitly shows how the boundary conditions enter into the construction
of the linear system. Hereon, we will refer to their contributions as [endpoints]
since we have not yet specified the type of boundary condition to be imposed.
We are now ready to make different choices for the weighting function to
generate a system of linear equations for the solution of $\{U_n\}$. As stated earlier,
this step is also referred to as testing and Galerkin's method is usually employed
in the finite element method. Specifically, we choose $W_m(x) = N_i^e(x)$ and for each of
these testing or weighting functions a single linear equation is generated. From (3.15)
we have

$$\sum_{j=1}^{2} U_j^e\int_{x_1^e}^{x_2^e}\left[p(x)\frac{dN_i^e}{dx}\frac{dN_j^e}{dx} + q(x)\,N_i^e(x)\,N_j^e(x)\right]dx - \int_{x_1^e}^{x_2^e} N_i^e(x)\,f(x)\,dx + [\text{endpoints}] = 0 \tag{3.16a}$$

where

$$[\text{endpoints}] = \text{endpoints}_1 + \text{endpoints}_2 = \left[p(x)\,N_i^e(x)\frac{dU}{dx}\right]_{x=0} - \left[p(x)\,N_i^e(x)\frac{dU}{dx}\right]_{x=x_a} \tag{3.16b}$$

Since $N_i^e(x)$ is nonzero only over the eth element, the summation over the elements
can be eliminated at this stage. In other words, integration is carried out only over
the nonzero portion of the integrand. We can rewrite (3.16a) in matrix form as

$$[A_{ij}^e]\{U_j^e\} + [\text{endpoints}] = \{b_i^e\} \tag{3.17a}$$
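which can be considered as the weighted discrete form of the differential equation.
This matrix system provides a relationship only between the two nodes forming the
eth element and is therefore a localized relationship among the node fields/potentials.
The endpoint contributions appear only when $e = 1$ or $e = N_e$ and vanish when the
Neumann boundary conditions $\frac{dU}{dx}\big|_{x=0} = \frac{dU}{dx}\big|_{x=x_a} = 0$ are imposed. In the
absence of endpoint contributions, (3.17a) becomes

$$\begin{bmatrix}A_{11}^e & A_{12}^e\\ A_{21}^e & A_{22}^e\end{bmatrix}\begin{Bmatrix}U_1^e\\ U_2^e\end{Bmatrix} = \begin{Bmatrix}b_1^e\\ b_2^e\end{Bmatrix} \tag{3.17b}$$

This is commonly referred to as the elemental matrix system. Also, the matrix

$$[A^e] = \begin{bmatrix}A_{11}^e & A_{12}^e\\ A_{21}^e & A_{22}^e\end{bmatrix} \tag{3.18}$$

is referred to as the element matrix and its entries are given by

$$A_{ij}^e = \int_{x_1^e}^{x_2^e}\left[p(x)\frac{dN_i^e}{dx}\frac{dN_j^e}{dx} + q(x)\,N_i^e(x)\,N_j^e(x)\right]dx \tag{3.19}$$

The excitation vector entries are specified by

$$b_i^e = \int_{x_1^e}^{x_2^e} N_i^e(x)\,f(x)\,dx \tag{3.20}$$

Since $N_i^e(x)$ are linear functions, the evaluation of $A_{ij}^e$ can be carried out in closed
form provided $p(x)$ and $q(x)$ are taken as constant over the integration of the eth
element. Specifically, setting $p(x) \approx p^e$ and $q(x) \approx q^e$ for $x_1^e < x < x_2^e$, we find that

$$A_{11}^e = A_{22}^e = \frac{p^e}{l^e} + \frac{q^e\,l^e}{3} \tag{3.21}$$

$$A_{12}^e = A_{21}^e = -\frac{p^e}{l^e} + \frac{q^e\,l^e}{6} \tag{3.22}$$

where $l^e = x_2^e - x_1^e$ is the element length. Also, on approximating $f(x)$ by
$f(x) \approx f^e$ over the eth element, we obtain

$$b_1^e = b_2^e = \frac{f^e\,l^e}{2} \tag{3.23}$$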
The above testing procedure will result in $2N_e$ equations obtained by letting
$e = 1, 2, \ldots, N_e$ in (3.17). Since only $N_e + 1$ unique unknowns exist, it is necessary to
condense or combine the $2N_e$ equations down to $N_e + 1$. The additional set of
equations is a result of testing at the same nth node from the left using the testing
function $N_2^{n-1}(x)$ and from the right using the testing function $N_1^n(x)$. Their reduction
to $N_e + 1$ equations is referred to as assembly of the element equations and is a
standard step in all finite element solutions.

3.6 ASSEMBLY OF THE ELEMENT EQUATIONS

The essence of the assembly procedure is to take the average of the test equations
from the left and right of the mth node. That is, we consider the weighted average

$$\left[\int_{x_{m-1}}^{x_m} N_2^{m-1}(x)\,R(x)\,dx + \int_{x_m}^{x_{m+1}} N_1^m(x)\,R(x)\,dx\right] = 0$$

or

$$\int_{x_{m-1}}^{x_{m+1}} T_m(x)\,R(x)\,dx = 0, \qquad m = 2, 3, \ldots, N-1 \tag{3.24}$$

where $T_m(x)$ is illustrated in Fig. 3.6(b) and is given by

$$T_m(x) = \begin{cases}\dfrac{x - x_{m-1}}{x_m - x_{m-1}} & x_{m-1} < x < x_m\\[2mm] \dfrac{x_{m+1} - x}{x_{m+1} - x_m} & x_m < x < x_{m+1}\\[2mm] 0 & \text{otherwise}\end{cases} \tag{3.25}$$

From (3.24), the test equation at the mth node has the explicit form

$$\sum_{n=m-1}^{m+1} A_{mn}\,U_n = b_m, \qquad m = 2, 3, \ldots, N-1 \tag{3.26}$$

where

$$A_{mn} = \int_{x_{m-1}}^{x_{m+1}}\left[p(x)\frac{dT_m}{dx}\frac{dT_n}{dx} + q(x)\,T_m(x)\,T_n(x)\right]dx = \begin{cases}A_{22}^{m-1} + A_{11}^{m} & n = m\\ A_{21}^{m-1} & n = m-1\\ A_{12}^{m} & n = m+1\end{cases} \tag{3.27}$$

$$b_m = \int_{x_{m-1}}^{x_{m+1}} T_m(x)\,f(x)\,dx = b_2^{m-1} + b_1^{m} \tag{3.28}$$

We have temporarily excluded the cases of $m = 1$ and $m = N$ since testing at the
boundary nodes 1 and N requires the inclusion of the endpoint contributions. These
must be dealt with after the specification of the boundary conditions. Thus, (3.24)
gives a set of $N - 2$ equations with the other two equations to be supplied later.

When the Dirichlet boundary conditions $U(x = 0) = U(x = x_a) = 0$ are imposed at
the endpoints of the domain, the implication is that $U_1$ (or $U_1^1$) $= U_N$ (or $U_2^{N_e}$) $= 0$
and thus the number of unknowns is reduced from N down to N - 2. Consequently,
when $U(x)$ satisfies the Dirichlet boundary conditions on both ends of the domain,
the system (3.24) is sufficient for the solution of the unknown vector $\{U_n\}$ provided
the node fields or potentials are set to zero a priori.
In practice (i.e., in a computer implementation of the FEM), the construction
of the linear system is done by manipulating the element matrix system (3.17). This
procedure appears rather tedious on paper because of its repetitiveness but is most
suitable for computer implementation. We will illustrate it below for the one-
dimensional formulation and in connection with the three-element segment given
in Fig. 3.7.
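For convenience, let us assume that the Neumann boundary condition is satisfied
at the end nodes 1 and 4, i.e., $\frac{dU}{dx}\big|_{x=0} = \frac{dU}{dx}\big|_{x=x_a} = 0$. Thus, the element matrix
equations (3.17b) are applicable even when testing at elements 1 and 3. More
specifically, for $e = 1$, (3.17b) becomes

$$\begin{bmatrix}A_{11}^1 & A_{12}^1\\ A_{21}^1 & A_{22}^1\end{bmatrix}\begin{Bmatrix}U_1^1\\ U_2^1\end{Bmatrix} = \begin{Bmatrix}b_1^1\\ b_2^1\end{Bmatrix} \tag{3.29}$$

Also, for $e = 2$ and $e = 3$, we have

$$\begin{bmatrix}A_{11}^2 & A_{12}^2\\ A_{21}^2 & A_{22}^2\end{bmatrix}\begin{Bmatrix}U_1^2\\ U_2^2\end{Bmatrix} = \begin{Bmatrix}b_1^2\\ b_2^2\end{Bmatrix} \tag{3.30}$$

and

$$\begin{bmatrix}A_{11}^3 & A_{12}^3\\ A_{21}^3 & A_{22}^3\end{bmatrix}\begin{Bmatrix}U_1^3\\ U_2^3\end{Bmatrix} = \begin{Bmatrix}b_1^3\\ b_2^3\end{Bmatrix} \tag{3.31}$$

Figure 3.7 Three-element tessellation of a line segment (nodes 1 to 4, elements
e = 1, 2, 3, from $x = 0$ to $x = x_a$).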

These are six equations for the four unknown coefficients $U_1$, $U_2$, $U_3$, and $U_4$. To
reduce (3.29)-(3.31) down to four equations, we simply add those which correspond
to testing from the left and right of the node. This addition of equations amounts to
simply performing the first sum in (3.15) and is not an arbitrary decision in the
process. For nodes 1 and 4, there is only testing from one side and therefore the
first equation of (3.29) and the second equation of (3.31) are left unchanged. For
node 2, we add the second equation of (3.29) and the first equation of (3.30).
Likewise for node 3, we add the second equation of (3.30) with the first equation
of (3.31). The resulting (i.e., assembled) system of four equations is

$$\begin{bmatrix}A_{11}^1 & A_{12}^1 & 0 & 0\\ A_{21}^1 & A_{22}^1 + A_{11}^2 & A_{12}^2 & 0\\ 0 & A_{21}^2 & A_{22}^2 + A_{11}^3 & A_{12}^3\\ 0 & 0 & A_{21}^3 & A_{22}^3\end{bmatrix}\begin{Bmatrix}U_1\\ U_2\\ U_3\\ U_4\end{Bmatrix} = \begin{Bmatrix}b_1^1\\ b_2^1 + b_1^2\\ b_2^2 + b_1^3\\ b_2^3\end{Bmatrix} \tag{3.32}$$
where we employed the global node notation for the $\{U\}$ vector. In placing this into
the compact notation

$$[A]\{U\} = \{b\} \tag{3.33}$$

we note that [A] is a tridiagonal matrix regardless of the number of elements used for
the tessellation of the line segment $0 < x < x_a$. That is, a maximum of three nonzero
entries appear in each row of [A] and except for the top and bottom row, the
diagonal entry and its adjacent elements are the only nonzero entries. We can
state that the bandwidth of the matrix is three regardless of the number of elements.
Simply put, as the number of nodes/elements increases, the matrix takes the generic
sparse form

$$\begin{bmatrix}
x & x & 0 & 0 & \cdots & & \\
x & x & x & 0 & 0 & \cdots & \\
0 & x & x & x & 0 & 0 & \cdots\\
0 & 0 & x & x & x & 0 & 0\\
\cdots & 0 & 0 & x & x & x & 0\\
 & \cdots & 0 & 0 & x & x & x\\
 & & \cdots & 0 & 0 & x & x
\end{bmatrix} \tag{3.34}$$

where the "x" symbols imply a nonzero entry and the "..." denote a continuation
of the zero entries. In applied mechanics [A] is referred to as the stiffness matrix
because of the similarity of (3.33) to the equation $Kx = f$ for the deflection x of a
linear spring with stiffness K under an applied force f. In electromagnetics, [A] can
be interpreted in several ways depending on the physical quantity represented by
$U(x)$. For example, if $U(x)$ = voltage or electric field, then [A] can represent an
admittance matrix with $\{b\}$ being the electric current excitation. Alternatively, if
U = electric current or magnetic field, then [A] may be compared to an impedance
matrix.

Because of the sparsity of [A], only its nonzero entries are stored when solving
the matrix system (3.33). Also, global numbering as used in (3.26)-(3.28) must be
employed in defining the assembled matrix system. Clearly, the matrix system in
(3.32) is identical to that in (3.26)-(3.28) except for the first and last row of (3.32)
which correspond to the omitted $m = 1$ and $m = N$ equations in (3.26). In practice,
the assembly of [A] is done by employing (3.27) and (3.28) directly or by implementing
a double loop and keeping track of the correspondence between the local and
global numbering. A possible double loop which generates the nonzero entries of
[A] is
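C     NNODE = number of nodes N (renamed because Fortran is not
C     case sensitive and n is used as a loop index); P(e), Q(e), H(e)
C     hold p, q and the length of element e (array names assumed here)
      INTEGER e
C     Initialize the [A] matrix to zero
      DO 10 m = 1, NNODE
      DO 10 n = 1, NNODE
 10   A(m,n) = 0.0
C     Loop through all elements and construct [A]
      DO 20 e = 1, NNODE-1
C     Compute element matrix [Ae] from equations (3.21) and (3.22)
      DO 30 i = 1, 2
      DO 30 j = 1, 2
 30   AE(i,j) = (-1)**(i+j)*P(e)/H(e) + Q(e)*H(e)*(2-ABS(i-j))/6.0
C     Assemble [Ae] into global matrix
      A(e,e) = A(e,e) + AE(1,1)
      A(e+1,e+1) = A(e+1,e+1) + AE(2,2)
      A(e,e+1) = AE(1,2)
 20   A(e+1,e) = AE(2,1)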
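
The same assembly loop can be sketched in Python for readers who prefer to experiment interactively. The code assumes linear two-node elements with p and q constant per element and the closed-form element entries p/l + q l/3 (diagonal) and -p/l + q l/6 (off-diagonal); a dense list-of-lists matrix is used only for clarity, and all names are illustrative.

```python
def element_matrix(p_e, q_e, length):
    """Closed-form 2 x 2 element matrix for a linear two-node element."""
    diag = p_e / length + q_e * length / 3.0
    off = -p_e / length + q_e * length / 6.0
    return [[diag, off], [off, diag]]

def assemble(nodes, p, q):
    """Assemble the global matrix; p and q hold one value per element."""
    n = len(nodes)
    A = [[0.0] * n for _ in range(n)]
    for e in range(n - 1):                      # 0-based element index
        A_e = element_matrix(p[e], q[e], nodes[e + 1] - nodes[e])
        for i in range(2):
            for j in range(2):
                # local entry (i, j) is added to global entry (e+i, e+j)
                A[e + i][e + j] += A_e[i][j]
    return A

nodes = [0.0, 1.0, 2.0, 3.0]                    # three unit-length elements
A = assemble(nodes, p=[1.0] * 3, q=[0.0] * 3)
for row in A:
    print(row)
```

Adding every local entry to its global position handles the shared diagonal entries and the off-diagonal entries with one rule, which is the form that carries over to two- and three-dimensional meshes.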

3.7 ENFORCEMENT OF BOUNDARY CONDITIONS

So far we have postponed a comprehensive discussion on the imposition of boundary
conditions. As is the case with any differential equation, a unique solution can be
obtained only after the specification of boundary conditions which constrain the
values of the field/potential at the boundaries (endpoints for the one-dimensional
case) of the domain. These boundary conditions (also referred to as boundary
constraints) come in various forms. In their most typical form they provide a
specification of the field at the end nodes 1 and N or a specification on the value of the
normal derivative of the field. However, the boundary condition may simply provide
a relationship between the normal derivative and the field at the boundary node(s).
The derivative of $U(x)$ at the boundary nodes should not be approximated by using
the expansion (3.12). This representation is an interpolation function between the
boundary nodes, but the fields/potentials and their derivatives at the boundary nodes
must be independently specified for a unique solution of the differential equation. As
a reference, we remark that if the spatial variable x in (3.2) was replaced by the time
variable t, the boundary conditions would become the initial conditions of the
temporal response $U(t)$.

The various boundary conditions to be encountered in the solution of differential
equations are associated with a well-established nomenclature. Below is a list
of the boundary conditions typically imposed at the boundary nodes. The type of
boundary condition to be used will depend on the physics of the problem, as
discussed in Section 3.2.

3.7.1 Neumann Boundary Conditions (Homogeneous)

For one-dimensional problems, this boundary condition states that

$$\frac{dU}{dx} = 0 \tag{3.35}$$

at the left or right endpoint of the domain. In two- and three-dimensional problems,
it can be stated as

$$\hat{n}\cdot\nabla U = \frac{\partial U}{\partial n} = 0 \quad \text{on } S \text{ or } C \tag{3.36}$$

where $\hat{n}$ denotes the outgoing unit normal vector of the domain boundary, as
illustrated in Fig. 3.8. In acoustics this is referred to as the hard boundary
condition, and in electromagnetics the magnetic field obeys this condition on metallic
boundaries.

The Neumann boundary condition is the easiest to be numerically enforced in
FEM solutions. In this case, the [endpoint] contributions (3.16b) vanish and the
elemental equations (3.17b) lead to the global system (3.32) without any special
considerations.

3.7.2 Dirichlet Boundary Conditions (Homogeneous)

The Dirichlet boundary condition specifies a vanishing field or potential at the
endpoints of the computational domain, i.e.,

$$U(x) = 0 \quad \text{at endpoints} \tag{3.37}$$

In acoustics this is referred to as the soft boundary condition. For two- and three-
dimensional electromagnetic problems, the Dirichlet boundary condition is satisfied
by the tangential electric fields on all metallic surfaces.

When the Dirichlet boundary condition must be satisfied in connection with
the elemental equations (3.17a), the [endpoints] are not zero and must be considered
in the assembly of the final system. Since the [endpoints] contribute only to those
element equations resulting from testing at nodes 1 and N, the global system (3.32)
for the three-element tessellation (see Fig. 3.7) takes the form

$$\begin{bmatrix}A_{11}^1 & A_{12}^1 & 0 & 0\\ A_{21}^1 & A_{22}^1 + A_{11}^2 & A_{12}^2 & 0\\ 0 & A_{21}^2 & A_{22}^2 + A_{11}^3 & A_{12}^3\\ 0 & 0 & A_{21}^3 & A_{22}^3\end{bmatrix}\begin{Bmatrix}U_1\\ U_2\\ U_3\\ U_4\end{Bmatrix} + \begin{Bmatrix}\text{endpoint}_1\\ 0\\ 0\\ \text{endpoint}_2\end{Bmatrix} = \begin{Bmatrix}b_1^1\\ b_2^1 + b_1^2\\ b_2^2 + b_1^3\\ b_2^3\end{Bmatrix} \tag{3.38}$$

However, $U_1 = U_4 = 0$, as dictated by (3.37), and thus $N - 2 = 2$ unknowns ($U_2$ and
$U_3$) remain to be determined from the solution of (3.38). This allows us to discard the
first and last equations of (3.38) provided we set $U_1 = U_4 = 0$ wherever they occur.
In doing so, we have

$$\begin{bmatrix}A_{22}^1 + A_{11}^2 & A_{12}^2\\ A_{21}^2 & A_{22}^2 + A_{11}^3\end{bmatrix}\begin{Bmatrix}U_2\\ U_3\end{Bmatrix} = \begin{Bmatrix}b_2^1 + b_1^2\\ b_2^2 + b_1^3\end{Bmatrix} \tag{3.39}$$

which can be inverted to yield

$$\begin{Bmatrix}U_2\\ U_3\end{Bmatrix} = \frac{1}{\Delta}\begin{bmatrix}A_{22}^2 + A_{11}^3 & -A_{12}^2\\ -A_{21}^2 & A_{22}^1 + A_{11}^2\end{bmatrix}\begin{Bmatrix}b_2^1 + b_1^2\\ b_2^2 + b_1^3\end{Bmatrix}, \qquad \Delta = (A_{22}^1 + A_{11}^2)(A_{22}^2 + A_{11}^3) - A_{12}^2A_{21}^2$$

Thus, even though the [endpoint] contributions are not computable, we can still
solve for the node fields or potentials. We remark that if (3.38) was a system of N
equations, the reduced system (3.39) will consist of $N - 2$ equations after the
enforcement of the Dirichlet boundary conditions.

Figure 3.8 Illustration of the enclosures for two- and three-dimensional domains:
C (enclosing contour), S (enclosing surface).
3.7.3 Nonzero Boundary Constraints (Inhomogeneous)

We may also encounter situations when the field or potential at the end node is
assigned a specified value. An example of this situation is the parallel plate capacitor
problem where the upper plate has a potential equal to $V_a$. As a more general
example, let us consider the situation where in reference to the three-element example
in Fig. 3.7, we set

$$U_1 = Q_0, \qquad \frac{dU}{dx}\bigg|_{x=x_a} = Q_a \tag{3.40}$$

which are typically referred to as inhomogeneous Dirichlet and Neumann boundary
conditions, respectively.

Substituting these values into the system (3.38) gives

$$\begin{bmatrix}A_{11}^1 & A_{12}^1 & 0 & 0\\ A_{21}^1 & A_{22}^1 + A_{11}^2 & A_{12}^2 & 0\\ 0 & A_{21}^2 & A_{22}^2 + A_{11}^3 & A_{12}^3\\ 0 & 0 & A_{21}^3 & A_{22}^3\end{bmatrix}\begin{Bmatrix}Q_0\\ U_2\\ U_3\\ U_4\end{Bmatrix} + \begin{Bmatrix}\text{endpoint}_1\\ 0\\ 0\\ \text{endpoint}_2\end{Bmatrix} = \begin{Bmatrix}b_1^1\\ b_2^1 + b_1^2\\ b_2^2 + b_1^3\\ b_2^3\end{Bmatrix}$$

where $\text{endpoint}_2 = -Q_a\,p(x_a)\,N_2^3(x_a) = -Q_a\,p(x_a)$. Clearly, since $U_1$ is already
specified, the first of the equations can be discarded. After doing so and rearranging,
we obtain the system

$$\begin{bmatrix}A_{22}^1 + A_{11}^2 & A_{12}^2 & 0\\ A_{21}^2 & A_{22}^2 + A_{11}^3 & A_{12}^3\\ 0 & A_{21}^3 & A_{22}^3\end{bmatrix}\begin{Bmatrix}U_2\\ U_3\\ U_4\end{Bmatrix} = \begin{Bmatrix}b_2^1 + b_1^2 - A_{21}^1Q_0\\ b_2^2 + b_1^3\\ b_2^3 + Q_a\,p(x_a)\end{Bmatrix} \tag{3.41}$$

which can be solved for $U_2$, $U_3$ and $U_4$. We again point out that, when dealing with
N nodes, (3.41) would be a system of $N - 1$ equations and with the exception of the
first and last equations, the rest will have three nonzero elements as illustrated by the
matrix (3.34).

The procedure of reducing the system (3.38) to (3.39) or (3.41) is often referred
to as condensation of boundary conditions. This reduction is typically performed
during the assembly process by eliminating for example the rows which test at a
boundary node assigned a specified value. Finally, we remark that the condensation
process modifies the excitation column implying that the specification of a potential
or a field value on a boundary is equivalent to a source excitation. Thus, the
excitation of the domain can either be specified through a nonzero $f(x)$ or through the
enforcement of nontrivial boundary constraints.
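
The way a specified end value and a specified end derivative move into the excitation column can be checked on a simple case. The sketch below assumes U'' = 0 on 0 < x < 1 (p = 1, q = 0, f = 0) with U(0) = Q0 = 2 and dU/dx = Qa = 3 at x = 1, whose exact solution is U = 2 + 3x; three equal linear elements are used, and the tridiagonal solver and variable names are illustrative.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower and upper have one entry fewer than diag."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for k in range(1, n):
        w = lower[k - 1] / d[k - 1]
        d[k] -= w * upper[k - 1]
        r[k] -= w * r[k - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (r[k] - upper[k] * x[k + 1]) / d[k]
    return x

h, p, Q0, Qa = 1.0 / 3.0, 1.0, 2.0, 3.0
a_diag, a_off = p / h, -p / h      # element matrix entries for q = 0
b = [0.0, 0.0, 0.0, 0.0]           # assembled source vector for f = 0

# Reduced system for U2, U3, U4 after discarding the first equation
diag = [2 * a_diag, 2 * a_diag, a_diag]
lower = [a_off, a_off]
upper = [a_off, a_off]
rhs = [b[1] - a_off * Q0,          # known U1 = Q0 moved to the right side
       b[2],
       b[3] + Qa * p]              # Neumann data enters as an excitation
U = [Q0] + solve_tridiagonal(lower, diag, upper, rhs)
print(U)
```

Even though f = 0 everywhere, the right-hand side is nonzero, which illustrates the remark that boundary constraints act as a source excitation.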

3.7.4 Impedance Boundary Conditions

The impedance boundary condition provides a relationship between the field
and its normal derivative. Referring to Fig. 3.8, it is typically stated as [9]

$$\frac{\partial U}{\partial n} + \alpha U = 0 \quad \text{on } S \text{ or } C \tag{3.42}$$

where $\alpha$ is a constant. This boundary condition has been found very useful in
modeling the presence of thin dielectric coatings without a need to tessellate the
region interior to the dielectric. In finite element simulations, (3.42) also plays the
role of the radiation condition or a first order absorbing condition (to be discussed
later). These boundary conditions will be discussed extensively in later chapters and
are used for truncating the computational domain of open domain problems as in
the case of scattering by an airfoil (see Fig. 3.2). They basically provide a statement
on the field behavior at the boundary nodes. The need to mesh the region beyond the
boundary enclosure is therefore eliminated provided the absorbing boundary
condition gives the proper field behavior beyond the boundary enclosure.

A generalization of (3.42) is

$$\frac{\partial U}{\partial n} + \alpha U = \beta \quad \text{on } S \text{ or } C \tag{3.43}$$

where $\alpha$ and $\beta$ are constants and we can refer to this as the inhomogeneous boundary
condition. The treatment of (3.43) is no different than that for (3.42). For the one-
dimensional problem considered here, (3.43) reduces to

$$\frac{dU}{dx} + \alpha_0U = \beta_0, \quad x = 0; \qquad \frac{dU}{dx} + \alpha_aU = \beta_a, \quad x = x_a = a \tag{3.44}$$

When these conditions are used in the finite element solution of the three-element
segment example (see Fig. 3.7), the resulting system is again of the same form as
(3.38). From (3.16b), we have

$$[\text{endpoint}]_1 = p(0)\,(\beta_0 - \alpha_0U_1)$$
$$[\text{endpoint}]_2 = -p(x_a)\,(\beta_a - \alpha_aU_4)$$

and when these results are incorporated into (3.38), after rearranging, we obtain the
system

$$\begin{bmatrix}A_{11}^1 - \alpha_0\,p(0) & A_{12}^1 & 0 & 0\\ A_{21}^1 & A_{22}^1 + A_{11}^2 & A_{12}^2 & 0\\ 0 & A_{21}^2 & A_{22}^2 + A_{11}^3 & A_{12}^3\\ 0 & 0 & A_{21}^3 & A_{22}^3 + \alpha_a\,p(x_a)\end{bmatrix}\begin{Bmatrix}U_1\\ U_2\\ U_3\\ U_4\end{Bmatrix} = \begin{Bmatrix}b_1^1 - \beta_0\,p(0)\\ b_2^1 + b_1^2\\ b_2^2 + b_1^3\\ b_2^3 + \beta_a\,p(x_a)\end{Bmatrix} \tag{3.45}$$

This system can now be solved for $\{U\}$ and we remark that the middle two equations
are unchanged by the imposition of the impedance boundary conditions. It is clear
that when N equations are involved, the imposition of the impedance boundary
condition will only alter the first and last equations of the overall system.
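
The modification of the first and last equations can be sketched as follows. The code assumes the one-dimensional conditions dU/dx + alpha U = beta at both x = 0 and x = x_a and the endpoint term -[p W dU/dx] evaluated between x = 0 and x = x_a in the weak form; under those assumptions alpha and beta enter the first row with a minus sign and the last row with a plus sign. The function name and the test data (exact solution U = 1 + 2x for U'' = 0) are illustrative.

```python
def apply_impedance_bc(A, b, p0, pa, alpha0, beta0, alphaa, betaa):
    """Return copies of A and b with only the first and last equations
    modified by the impedance conditions at x = 0 and x = x_a."""
    A = [row[:] for row in A]
    b = b[:]
    A[0][0] -= alpha0 * p0
    b[0] -= beta0 * p0
    A[-1][-1] += alphaa * pa
    b[-1] += betaa * pa
    return A, b

# Three equal elements on 0 < x < 1 with p = 1, q = 0, f = 0
h = 1.0 / 3.0
A = [[1 / h, -1 / h, 0.0, 0.0],
     [-1 / h, 2 / h, -1 / h, 0.0],
     [0.0, -1 / h, 2 / h, -1 / h],
     [0.0, 0.0, -1 / h, 1 / h]]
b = [0.0, 0.0, 0.0, 0.0]

# U = 1 + 2x satisfies dU/dx - U = 1 at x = 0 and dU/dx + U = 5 at x = 1
A_bc, b_bc = apply_impedance_bc(A, b, 1.0, 1.0, -1.0, 1.0, 1.0, 5.0)

U_exact = [1.0 + 2.0 * k * h for k in range(4)]
residual = [sum(A_bc[i][j] * U_exact[j] for j in range(4)) - b_bc[i]
            for i in range(4)]
print(residual)
```

Substituting the exact nodal values leaves a residual at round-off level, which serves as a check on the signs of the added terms; the two middle equations are untouched by the function.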

3.8 EXAMPLES

To illustrate the application of the various steps in constructing and assembling an
FEM system, let us consider two simple examples.
EXAMPLE 3.1

Solve numerically the differential equation

$$-\frac{d^2E_y(x)}{dx^2} + \pi^2E_y(x) = 2\pi^2\sin\pi x, \qquad 0 < x < 1 \tag{3.46}$$

using ten segments and subject to the boundary conditions

$$E_y(0) = E_y(1) = 0 \tag{3.47}$$

Compare the numerical results with the exact solution [10, Chapter 11]

$$E_y(x) = \sin\pi x \tag{3.48}$$
Solution
This problem corresponds to a source in a parallel plate waveguide as illustrated in Fig. 3.4(b).
By comparison to the general form of the differential equation in (3.2), we note that

$$p(x) = 1, \qquad q(x) = \pi^2, \qquad f(x) = 2\pi^2\sin\pi x$$

Also, from Fig. 3.9 and the formulae (3.21) and (3.22), we find that

$$A_{11}^e = A_{22}^e = \frac{1}{\Delta x} + \frac{\pi^2\Delta x}{3} = 10.328987$$

and

$$A_{12}^e = A_{21}^e = -\frac{1}{\Delta x} + \frac{\pi^2\Delta x}{6} = -9.835507$$

Figure 3.9 Tessellation of a line segment into ten equal length elements
($\Delta x = 0.1$, N = # of nodes = 11, $N_e$ = # of elements = 10).

Thus, the elemental equations are given by

$$\begin{bmatrix}10.328987 & -9.835507\\ -9.835507 & 10.328987\end{bmatrix}\begin{Bmatrix}E_{y1}^e\\ E_{y2}^e\end{Bmatrix} = \begin{Bmatrix}b_1^e\\ b_2^e\end{Bmatrix} \tag{3.49}$$

where from (3.20)

$$b_i^e = 2\pi^2\int_{x_1^e}^{x_2^e} N_i^e(x)\sin\pi x\,dx$$

with $x_1^e = (e - 1)\Delta x$ and $x_2^e = e\,\Delta x$ for $e = 1, 2, \ldots, 10$.
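After assembling the elemental equations (3.49) subject to the Dirichlet boundary
conditions, we get the 9 x 9 system

$$\begin{bmatrix} 20.6580 & -9.8355 & 0 & \cdots & 0 \\ -9.8355 & 20.6580 & -9.8355 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & -9.8355 & 20.6580 & -9.8355 \\ 0 & \cdots & 0 & -9.8355 & 20.6580 \end{bmatrix} \begin{Bmatrix} E_{y2} \\ E_{y3} \\ \vdots \\ E_{y9} \\ E_{y10} \end{Bmatrix} = \begin{Bmatrix} b_2 \\ b_3 \\ \vdots \\ b_9 \\ b_{10} \end{Bmatrix}$$

with

$$\{b\}^T = \{0.605,\ 1.1507,\ 1.5838,\ 1.8619,\ 1.9577,\ 1.8619,\ 1.5838,\ 1.1507,\ 0.605\}$$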
This system is identical in form to that in (3.26). The node fields $E_{y1}$ and $E_{y11}$ were not
included since they are zero as dictated by the boundary conditions. Also, we remark that
the symmetry of the system (i.e., $A_{ij} = A_{ji}$) is not unexpected. Electromagnetic problems with
reciprocal permittivity and permeability tensors are inherently reciprocal, and this property is
exhibited in the symmetric form of the matrix.

Solving the above system via matrix inversion, for example, gives the following results:

Node #   Ey (FEM)   Ey (exact)   Error = |Ey (FEM) - Ey (exact)|

  1      0          0            0
  2      0.3103     0.3090       0.0013
  3      0.5902     0.5878       0.0024
  4      0.8123     0.8090       0.0033
  5      0.9550     0.9511       0.0039
  6      1.0041     1.0000       0.0041
  7      0.9550     0.9511       0.0039
  8      0.8123     0.8090       0.0033
  9      0.5902     0.5878       0.0024
 10      0.3103     0.3090       0.0013
 11      0          0            0
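
The assembly and solution steps of Example 3.1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' program: the function name `solve_example_3_1` is illustrative, the assembled load integrals are evaluated in closed form for f(x) = 2π² sin πx, and the tridiagonal system is solved with the Thomas algorithm instead of matrix inversion.

```python
import math

def solve_example_3_1(ne=10):
    """Linear-element FEM for -E'' + pi^2 E = 2 pi^2 sin(pi x), E(0) = E(1) = 0."""
    h = 1.0 / ne
    q = math.pi ** 2
    diag = 2.0 * (1.0 / h + q * h / 3.0)   # A22 of element e plus A11 of element e+1
    off = -1.0 / h + q * h / 6.0           # A12 = A21 of every element
    n = ne - 1                             # number of interior (unknown) nodes
    # Assembled load: integral of the hat function at node m times f(x), closed form.
    b = [4.0 * (1.0 - math.cos(math.pi * h)) / h * math.sin(math.pi * m * h)
         for m in range(1, n + 1)]
    # Thomas algorithm for the symmetric tridiagonal system.
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = off / diag
    d[0] = b[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (b[i] - off * d[i - 1]) / denom
    e = [0.0] * n
    e[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        e[i] = d[i] - c[i] * e[i + 1]
    return [0.0] + e + [0.0]               # prepend/append the two Dirichlet nodes

fields = solve_example_3_1()
for node, value in enumerate(fields, start=1):
    exact = math.sin(math.pi * (node - 1) * 0.1)
    print(node, round(value, 4), round(exact, 4))
```

Calling `solve_example_3_1(20)` or larger values is a quick way to see how the nodal error shrinks as the mesh is refined.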

As seen, the difference between the exact and numerical solution is in the third decimal place,
indicating that the employed number of elements is sufficient for an accurate representation
of the field distribution. To reduce the solution error, more elements can be used for the
tessellation of the line segment. However, as $\Delta x \to 0$, the system condition number
$\|A\|\,\|A^{-1}\|$ increases, where $\|A\|$ is a natural norm of the matrix [A]. Eventually the error
would not reduce with increasing N, unless the machine precision is also increased with the
inclusion of additional decimal places in the calculation of the matrix entries and in carrying
out the system solution. Numerical precision is particularly important for solving problems
with many thousands of unknowns (the case with many practical problems).
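
The growth of the condition number with mesh refinement can be illustrated for the matrix of Example 3.1. The sketch below is an illustration rather than part of the original text: it assumes the 2-norm and uses the known closed-form eigenvalues of a symmetric tridiagonal Toeplitz matrix, $\lambda_k = d + 2o\cos[k\pi/(n+1)]$, where d and o are the diagonal and off-diagonal entries. The function name `condition_number` is hypothetical.

```python
import math

def condition_number(ne):
    """2-norm condition number of the assembled matrix of Example 3.1 with ne elements."""
    h = 1.0 / ne
    q = math.pi ** 2
    diag = 2.0 * (1.0 / h + q * h / 3.0)
    off = -1.0 / h + q * h / 6.0
    n = ne - 1
    # Eigenvalues of a symmetric tridiagonal Toeplitz matrix (all positive here).
    eig = [diag + 2.0 * off * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]
    return max(eig) / min(eig)

for ne in (10, 100, 1000):
    print(ne, condition_number(ne))
```

The printed values grow roughly as $1/\Delta x^2$, which is why round-off eventually limits the accuracy gained from further refinement.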
Having the node fields, the field is found from (3.12) or (3.14). Specifically, since
$x^e_1 = 0.1(e-1)$ and $x^e_2 = 0.1e$, we have

$$E_y(x) = 10 \sum_{e=1}^{10} \left[ E^e_{y1}(0.1e - x) + E^e_{y2}(x - 0.1e + 0.1) \right] P_{0.1}(x - 0.1e + 0.05)$$

where

$$P_{\Delta x}(x) = \begin{cases} 1 & |x| < \Delta x/2 \\ 0 & \text{otherwise} \end{cases}$$

is a pulse function.
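
As an illustration of this piecewise-linear reconstruction, the following sketch evaluates the field between nodes from the tabulated node values of Example 3.1. The function name `interpolate` and the way the element is located are assumptions made for the example.

```python
def interpolate(node_fields, x, dx=0.1):
    """Piecewise-linear field at x from node values on a uniform mesh starting at x = 0."""
    ne = len(node_fields) - 1
    e = min(int(x / dx) + 1, ne)            # element number (1..ne) containing x
    x1, x2 = (e - 1) * dx, e * dx           # end points of element e
    left, right = node_fields[e - 1], node_fields[e]
    return (left * (x2 - x) + right * (x - x1)) / dx

# Node fields from the table of Example 3.1 (nodes 1 to 11).
nodes = [0.0, 0.3103, 0.5902, 0.8123, 0.9550, 1.0041,
         0.9550, 0.8123, 0.5902, 0.3103, 0.0]
print(interpolate(nodes, 0.25))   # midway between nodes 3 and 4
```

At x = 0.25 the result is the average of the two neighboring node values, which can be compared with the exact value sin(0.25π) ≈ 0.7071.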
EXAMPLE 3.2

The field reflected by a metal-backed dielectric slab due to a plane wave excitation
$E_z^{inc} = e^{jk_0 x}$ is given by

$$E_z^r = R\,e^{-jk_0 x}$$

where R is the unknown reflection coefficient to be determined.

The dielectric slab is of thickness t, as shown in Fig. 3.4(c). Consider the case of
$t = 0.25\lambda_0$, $\epsilon_r = 4 - j\beta$, $\mu_r = 1$ and employ the finite element method to compute the
reflection coefficient R as a function of the loss parameter β. Compare the result with the
analytical reflection coefficient, whose magnitude (the quantity evaluated as R_exact in
Appendix 1) is

$$|R| = \left| \frac{Z_d \tanh(jk_d t) - Z_0}{Z_d \tanh(jk_d t) + Z_0} \right|$$

where $k_d = k_0\sqrt{\epsilon_r \mu_r}$ and

$$Z_0 = \sqrt{\mu_0/\epsilon_0}$$

and

$$Z_d = Z_0\sqrt{\mu_r/\epsilon_r}$$

are the wave impedances in free space and in the dielectric medium, respectively.

Solution
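As discussed at the beginning of Section 3.2, the pertinent differential equation is

$$\frac{d^2 E_z}{dx^2} + k_0^2 \epsilon_r E_z = 0$$

subject to the boundary conditions

$$E_z(x = 0) = 0, \qquad \left[ \frac{dE_z}{dx} + jk_0 E_z \right]_{x = x_a} = 2jk_0 e^{jk_0 x_a}$$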
We will set $x_a = t + \Delta t$, where Δt is the chosen element length, as illustrated in Fig. 3.10. The
choice of Δt (the element size) should be somewhere between $\lambda_d/10$ and $\lambda_d/20$, where
$\lambda_d = 2\pi/\mathrm{Re}(k_d)$ is the wavelength in the material. For our case, $k_d = (2\pi/\lambda_0)\sqrt{\epsilon_r \mu_r}$ and
$\lambda_d = \lambda_0/\sqrt{\mathrm{Re}(\epsilon_r \mu_r)}$, and we will choose $\Delta t = 0.025\lambda_0$ to ensure that this criterion is satisfied.
Since $t = 0.25\lambda_0$, this choice of Δt leads to $N_e = 11$ layers or elements, including the air-filled
element occupying the region $t < x < t + \Delta t$. We also observe that for $\Delta t = 0.025\lambda_0$ the
sampling rate is 20 elements per wavelength in the dielectric since $\mathrm{Re}(\epsilon_r) = 4$. From (3.21)
and (3.22) the entries of the elemental equation are given by (with $p = 1$, $q^e = -k_0^2 \epsilon_{re}$)

$$A^e_{11} = A^e_{22} = \frac{1}{\Delta t} - \frac{k_0^2 \epsilon_{re} \Delta t}{3} = 40 - 0.32899\,\epsilon_{re}$$

$$A^e_{12} = A^e_{21} = -\frac{1}{\Delta t} - \frac{k_0^2 \epsilon_{re} \Delta t}{6} = -40 - 0.164493\,\epsilon_{re}$$

$$b^e_1 = b^e_2 = 0$$

Figure 3.10 Tessellation of the dielectric slab using $N_e = N - 1$ layers (elements).
where $\epsilon_{re}$ denotes the relative permittivity of the eth element and we have suppressed the
presence of the factor $\lambda_0$ since the free-space wavelength cancels out in the final result.
After assembly, we will obtain 11 equations since $E_{z1} = E_z(x = 0) = 0$. Except for the last
and first, all other equations of the assembled system will be of the form

$$[\,0 \ \cdots \ 0 \quad A_{m(m-1)} \quad A_{mm} \quad A_{m(m+1)} \quad 0 \ \cdots \ 0\,]\,\{E_z\} = 0$$

where $\{E_z\}^T = \{E_{z2}, E_{z3}, \ldots, E_{z12}\}$ is the unknown node fields vector,
$A_{mm} = A^{m-1}_{22} + A^{m}_{11}$, $A_{m(m-1)} = A^{m-1}_{21}$, and $A_{m(m+1)} = A^{m}_{12}$. The last equation of the system is
obtained by recognizing that the boundary condition at $x = t + \Delta t$ is of the form (3.44) with
$\alpha_a = jk_0$ and $\beta_a = 2jk_0 e^{jk_0(t + \Delta t)}$. Thus,

$$[\text{endpoint}]_{x_a} = +jk_0 E_{z12} - 2jk_0 e^{jk_0(t + \Delta t)}$$

and the last (eleventh) equation of the assembled system will be

$$A^{11}_{21} E_{z11} + \left( A^{11}_{22} + jk_0 \right) E_{z12} = 2jk_0 e^{jk_0(t + \Delta t)}$$

That is, $b_{12} = 2jk_0 e^{jk_0(t + \Delta t)}$ is the only nonzero entry of the excitation column and is a result of
the assumed plane wave incidence.
Upon solution of the assembled FEM system, the reflection coefficient is extracted from
the node field values as

$$R^{FEM} = \frac{E_{z12} - E_z^{inc}(x = t + \Delta t)}{E_z^{inc}(x = t + \Delta t)}$$

A plot of R as a function of the loss parameter $\beta = -\mathrm{Im}(\epsilon_r)$ is given in Fig. 3.11 (courtesy of
S. Legault).
It is seen that the computed values of $R^{FEM}$ are in good agreement with the exact result
even when the discrete layers in the dielectric are reduced from ten down to only $N_e - 1 = 2$
(corresponding to a sampling rate of eight elements per free-space wavelength, i.e., four per
$\lambda_d$). The agreement deteriorates as β increases. In this case the decay of the field as it
propagates in the dielectric is very rapid and the employed discretization of the dielectric slab is
not sufficiently fine to pick up the large changes in the field values from one discrete slab
(element) to the next. This is a good example of the fundamental assumptions made when
discretizing the computational domain. For small β, the field decrease is slower and thus fewer
samples are needed to approximate the field within the region while maintaining the same level
of accuracy.
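
The procedure of Example 3.2 can be sketched compactly in Python. The sketch below is an illustration under stated assumptions and is not the program of Appendix 1: it uses p = 1 and $q^e = -k_0^2\epsilon_{re}$ as in the text, sets $\lambda_0 = 1$, drops the metal-backed node instead of overwriting its row, and solves the small complex system with a hand-written Gaussian elimination (`solve_dense`, a hypothetical helper). The exact magnitude is the slab formula evaluated in Appendix 1.

```python
import cmath
import math

def solve_dense(a, b):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[piv] = a[piv], a[k]
        b[k], b[piv] = b[piv], b[k]
        for r in range(k + 1, n):
            m = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= m * a[k][c]
            b[r] -= m * b[k]
    x = [0j] * n
    for k in range(n - 1, -1, -1):
        s = sum(a[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (b[k] - s) / a[k][k]
    return x

def slab_reflection(beta, n_diel=10, t=0.25):
    """Return (|R_FEM|, |R_exact|) for eps_r = 4 - j*beta, mu_r = 1, lambda0 = 1."""
    k0 = 2.0 * math.pi
    eps_r = 4.0 - 1j * beta
    dt = t / n_diel
    n_el = n_diel + 1                  # dielectric elements plus one air element
    xa = t + dt
    n = n_el                           # unknowns are nodes 2..n_el+1 (node 1 is on the metal)
    a = [[0j] * n for _ in range(n)]
    b = [0j] * n
    for e in range(1, n_el + 1):
        eps = eps_r if e <= n_diel else 1.0
        a11 = 1.0 / dt - k0 ** 2 * eps * dt / 3.0
        a12 = -1.0 / dt - k0 ** 2 * eps * dt / 6.0
        i, j = e - 2, e - 1            # unknown indices of the element's two nodes
        if i >= 0:                     # i < 0 is the Dirichlet node, which is dropped
            a[i][i] += a11
            a[i][j] += a12
            a[j][i] += a12
        a[j][j] += a11
    a[-1][-1] += 1j * k0                               # alpha_a = j k0
    b[-1] = 2j * k0 * cmath.exp(1j * k0 * xa)          # beta_a
    ez = solve_dense(a, b)
    r_fem = abs((ez[-1] - cmath.exp(1j * k0 * xa)) / cmath.exp(-1j * k0 * xa))
    kd = k0 * cmath.sqrt(eps_r)
    zin = cmath.tanh(1j * kd * t) / cmath.sqrt(eps_r)  # Z_d tanh(j k_d t) / Z_0
    r_exact = abs((zin - 1.0) / (zin + 1.0))
    return r_fem, r_exact

for beta in (0.0, 1.0, 3.0, 5.0):
    print(beta, slab_reflection(beta, n_diel=10), slab_reflection(beta, n_diel=2))
```

Comparing the `n_diel=10` and `n_diel=2` columns as β grows gives a feel for the sampling requirement discussed above.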

Figure 3.11 Plot of the reflection coefficient (at normal incidence) from a grounded dielectric
slab of thickness $t = 0.25\lambda_0$ ($\epsilon_r = 4 - j\beta$, $\mu_r = 1$): comparison of the numerically computed
$R^{FEM}$ with the exact result $R^{exact}$. Horizontal axis: β from 0 to 5. Curves: exact; 10, 5, and 2
elements in the dielectric.
EXAMPLE 3.3

Repeat Example 3.2 assuming that $\epsilon_r = \mu_r = 4 - j\beta$ and with all other parameters left
unchanged.
Solution

By selecting $\mu_r = \epsilon_r$, the impedance of the wave inside and outside the dielectric is

$$Z = \sqrt{\frac{\mu_0 \mu_r}{\epsilon_0 \epsilon_r}} = \sqrt{\frac{\mu_0}{\epsilon_0}} = 120\pi$$

and thus the wave does not exhibit any reflection at x = t (i.e., at the dielectric interface).
The interface is then referred to in the literature as reflectionless. Also, as it enters the
dielectric, the wave is absorbed due to the nonzero imaginary components of the constitutive
parameters. If β is chosen so that the wave has decayed to negligible levels by the time it
exits into the air medium, the dielectric layer can be considered as "perfectly absorbing"
since no reflected field is returned. Such layers have been proposed recently [11], and it has
been shown that certain anisotropic layers can lead to perfectly matched interfaces for
incidences away from normal. These types of layers are important in finite element
simulations because they can be used for simulating a nonreflecting surface. The latter is essential
in solving open domain problems, as is the case with scattering and antenna radiation
problems considered later.


For the one-dimensional problem considered here, we will restrict ourselves to a
simple isotropic dielectric layer having $\epsilon_r = \mu_r = 4 - j\beta$. The numerical results for a layer
of thickness $t = 0.25\lambda_0$ are given in Fig. 3.12 (results are courtesy of S. Legault) as a
function of β and for three different sampling rates ($N_e = 6, 11, 21$). It is again seen that
as β increases, the numerically computed reflection coefficient begins to deviate from the
analytical/exact result, which for this matched layer is $R = -e^{-2jk_0 \epsilon_r t}$ when referenced to the
interface x = t [12]. This is expected and is due to the rapid decay of the field within the
absorber as β is increased. Thus, higher sampling is needed to better model the field values
from one discrete layer to the other.
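
For orientation, the exact curve for the matched layer can be evaluated directly. This is a minimal sketch that assumes the magnitude $|R| = e^{-2k_0\beta t}$, which follows from a round trip through a medium with wavenumber $k_d = k_0\sqrt{\epsilon_r\mu_r} = k_0(4 - j\beta)$ and no reflection at the interface. The function name and the conversion to decibels are choices made for the illustration.

```python
import math

def absorber_reflection_db(beta, t=0.25, k0=2.0 * math.pi):
    """|R| in dB for a metal-backed layer with eps_r = mu_r = 4 - j*beta (lambda0 = 1)."""
    magnitude = math.exp(-2.0 * k0 * beta * t)   # round-trip attenuation
    return 20.0 * math.log10(magnitude)

for beta in (0.0, 0.5, 1.0, 2.0):
    print(beta, round(absorber_reflection_db(beta), 2))
```

Each unit increase in β lowers the exact reflection by about 27.3 dB for a quarter-wavelength layer, which is why a coarse mesh quickly fails to track it.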

Figure 3.12 Plot of the reflection coefficient (at normal incidence) from a grounded dielectric
slab of thickness $t = 0.25\lambda_0$. Curves: exact solution; 5, 10, and 20 elements in the dielectric.
APPENDIX 1: SAMPLE ONE-DIMENSIONAL MATLAB FEM ANALYSIS PROGRAM

%This MATLAB code can be used to reproduce the data in Fig. 3.11
% Courtesy of LARS ANDERSEN

%t=thickness of slab
%k0=free space propagation constant
%xa=location of the left computational domain endpoint
%alphaa=alpha coeff. to be used in the boundary condition at xa; see (3.43)
%betaa=beta coeff. to be used in the boundary condition at xa; see (3.43)
%p=p coefficient appearing in the differential equation (3.2)
%epsr=relative permittivity of the slab
%beta=as defined in Fig. 3.11
%q=coefficient appearing in the differential equation (3.2)
%N=number of nodes (N-1=number of layers)
%R_FEM=computed reflection coefficient (to be plotted)

% Initialization

clear;
N=7;
t=0.25;
Dx=t/(N-2);
xa=t+Dx;

p=-1;
k0=2*pi;
f=0;
alphaa=j*k0;
betaa=2*j*k0*exp(j*k0*xa);
%values to generate plot in Fig. 3.11
Z=26;

%epsr=4-j*beta;
for z=1:Z,

beta=(z-1)/(Z-1)*5;
epsr=4-j*beta;

% Initialize global matrix and vector

for m=1:N,

b(m,1)=0;
for n=1:N,
A(m,n)=0;
end

end

% Compute element matrices and assemble global matrix

for n=1:N-1,

xe1=(n-1)*xa/(N-1);
xe2=n*xa/(N-1);
if n==N-1,

eps=1;        % last element is air filled

else
eps=epsr;
end
q=eps*k0^2;
Ael(1,1)=p/abs(xe2-xe1)+q*abs(xe2-xe1)/3;
Ael(2,2)=Ael(1,1);
Ael(1,2)=-p/abs(xe2-xe1)+q*abs(xe2-xe1)/6;
Ael(2,1)=Ael(1,2);
bel(1)=f*abs(xe2-xe1)/2;   % element excitation (zero here since f=0)
bel(2)=bel(1);
A(n,n)=A(n,n)+Ael(1,1);

A(n,n+1)=A(n,n+1)+Ael(1,2);
A(n+1,n)=A(n+1,n)+Ael(2,1);
A(n+1,n+1)=A(n+1,n+1)+Ael(2,2);
b(n,1)=b(n,1)+bel(1);
b(n+1,1)=b(n+1,1)+bel(2);

end

% Enforce boundary conditions

A(1,1)=1;     % Dirichlet condition at the metal backing (node 1)
for n=2:N,
A(n,1)=0;
A(1,n)=0;
end

A(N,N)=A(N,N)+alphaa*p;
b(N)=b(N)+betaa*p;

% Solve the resulting NxN system

x=A\b;

% Compute reflection coefficient

R_FEM=abs((x(N)-exp(j*k0*xa))/exp(j*k0*xa));

zeta0=377;
zeta1=sqrt(1/epsr)*zeta0;
k1=k0*sqrt(epsr);   % propagation constant in the slab

R_exact=abs((zeta1*tanh(j*k1*t)-zeta0)/(zeta1*tanh(j*k1*t)+zeta0));
Rf(z,1)=beta;
Rf(z,2)=R_FEM;
Re(z,1)=beta;
Re(z,2)=R_exact;

end

%Plot Reflection Coefficient

clf;
plot(Rf(:,1),Rf(:,2),'r');
hold on;
plot(Re(:,1),Re(:,2));

xlabel('beta');
ylabel('|R|');

% End of program

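APPENDIX 2: USEFUL INTEGRATION FORMULAE FOR ONE-DIMENSIONAL
FEM ANALYSIS

$$I_1 = \frac{1}{h_n^2} \int_{x_n}^{x_{n+1}} (x_{n+1} - x)(x - x_n)\, q(x)\, dx \approx \frac{h_n}{12} \left[ q(x_n) + q(x_{n+1}) \right]$$

$$I_2 = \frac{1}{h_{n-1}^2} \int_{x_{n-1}}^{x_n} (x - x_{n-1})^2\, q(x)\, dx \approx \frac{h_{n-1}}{12} \left[ 3q(x_n) + q(x_{n-1}) \right]$$

$$I_3 = \frac{1}{h_n^2} \int_{x_n}^{x_{n+1}} (x_{n+1} - x)^2\, q(x)\, dx \approx \frac{h_n}{12} \left[ 3q(x_n) + q(x_{n+1}) \right]$$

$$I_4 = \frac{1}{h_{n-1}} \int_{x_{n-1}}^{x_n} p(x)\, dx \approx \frac{1}{2} \left[ p(x_n) + p(x_{n-1}) \right]$$

$$I_5 = \frac{1}{h_{n-1}} \int_{x_{n-1}}^{x_n} (x - x_{n-1})\, f(x)\, dx \approx \frac{h_{n-1}}{6} \left[ 2f(x_n) + f(x_{n-1}) \right]$$

$$I_6 = \frac{1}{h_n} \int_{x_n}^{x_{n+1}} (x_{n+1} - x)\, f(x)\, dx \approx \frac{h_n}{6} \left[ 2f(x_n) + f(x_{n+1}) \right]$$

In all cases, $h_n = |x_{n+1} - x_n|$ and $h_{n-1} = |x_n - x_{n-1}|$ are assumed to be small. These
are derived by introducing a linear approximation for the functions q(x), f(x), and
p(x). For example, in deriving the approximation for $I_4$, p(x) was approximated as

$$p(x) \approx p(x_{n-1})\, \frac{x_n - x}{h_{n-1}} + p(x_n)\, \frac{x - x_{n-1}}{h_{n-1}}$$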

REFERENCES

[1] R. Courant. Variational methods for a solution of problems of equilibrium and
vibrations. Bull. Amer. Math. Soc., 49:1-23, 1943.
[2] J. H. Argyris. Energy theorems and structural analysis. Aircraft Engineering,
26:347-356, 1954.
[3] P. Silvester. Finite element solution of homogeneous waveguide problems. Alta
Frequenza, 38:313-317, 1969.
[4] P. Silvester and G. Pelosi, editors. Finite Elements for Wave Electromagnetics:
Methods and Techniques. IEEE Press, New York, 1994.
[5] P. L. Arlett, A. K. Bahrani, and O. C. Zienkiewicz. Application of finite elements
to the solution of Helmholtz's equation. Proc. IEE, 115:1762-1766, 1968.
[6] I. Stakgold. Boundary-Value Problems of Mathematical Physics. Macmillan,
New York, 1968. Volumes I and II.
[7] S. C. Chapra and R. P. Canale. Numerical Methods for Engineers. McGraw-Hill,
New York, second edition, 1988.
[8] R. L. Burden and J. D. Faires. Numerical Analysis. PWS Pub. Co., Boston, fifth
edition, 1993.
[9] T. B. A. Senior and J. L. Volakis. Approximate Boundary Conditions in
Electromagnetics. IEE Press, London, 1995.
[10] E. Kreyszig. Advanced Engineering Mathematics. John Wiley & Sons, New
York, fifth edition, 1983.
[11] D. M. Kingsland, J. Gong, J. L. Volakis, and J.-F. Lee. Performance of an
anisotropic artificial absorber for truncating finite element meshes. IEEE Trans.
Antennas Propagat., 44:975-982, July 1996.
[12] S. R. Legault, T. B. A. Senior, and J. L. Volakis. Design of planar absorbing
layers for domain truncation in FEM applications. Electromagnetics, 16(4):451-464,
July-August 1996.
Two-Dimensional

Applications

4.1 INTRODUCTION

Having discussed one-dimensional examples, we now proceed with the application of
the FEM to two-dimensional problems. Although these represent simplifications of
the real-world three-dimensional situations, they are much simpler to formulate and
solve. Thus, they are appropriate for illustrating the FEM procedure and all aspects
of the three-dimensional analysis can be discussed in the context of two dimensions.
Specifically, matrix assembly, absorbing boundary conditions, and various hybrid
formulations of the FEM can be discussed in two dimensions without loss of
generality. This chapter is therefore important in understanding the machinery needed to
carry out FEM analysis and in illustrating its capabilities.
Two-dimensional models can also be used to generate useful results for a
number of practical problems in electromagnetics. For example, TE and TM
mode analysis for straight waveguides of arbitrary cross section can be carried out
using a two-dimensional formulation [1]-[2] or the capacitance of transmission lines,
such as those in Fig. 3.1, can be found by solving Laplace's equation in two
dimensions. Similarly, the inductance can be computed by carrying out a two-dimensional
analysis of Maxwell's equations. Also, an understanding of three-dimensional
problems can be obtained by performing an analysis of closely related two-dimensional
problems. Since two-dimensional analysis is much simpler, it is often a quick way to
obtain results for many practical open domain problems. The latter are among the
most difficult to solve and include those of radiation and radar scattering. Scattering
involves the computation of fields returned from a given structure due to plane wave
excitation or, more generally, the impinging radar wave. Two-dimensional analysis
has been extensively used in scattering and has provided engineers with an
understanding of scattering phenomena in the absence of three-dimensional simulations,
which may be prohibitive due to their greater CPU and storage requirements.

In this chapter we cover many aspects of the finite element method (and hybrid
versions) at sufficient detail to provide the reader with a comfortable level of
understanding its implementation. Such an understanding is essential for a three-dimensional
analysis where many of the steps must be discussed symbolically due to the
size of the matrices even for very small problem sizes. We begin by first performing a
reduction of Maxwell's equations to two-dimensional wave equations and proceed
with the solution of the latter by following the same FEM steps discussed in the
previous chapter. That is, we generate the weak form of the wave equation and carry
out its discretization with the introduction of linear shape functions. Matrix assembly
and boundary conditions are then discussed for determining the propagation
constants in waveguides, and we give examples of this type of analysis. In proceeding
with the solution of open domain problems, we first discuss absorbing boundary
conditions and material absorbers for truncating the finite element mesh. The necessary
modifications of the FEM system are presented and example calculations are
given for scattering by cylindrical structures. The use of the boundary integral for
truncating the FEM mesh is presented in Section 4.4, and an example application to
scattering by two-dimensional recessed cavities is given. Finally, we discuss the
implementation of two-dimensional solutions using edge-based basis functions
since these are essential in three dimensions and are used extensively in subsequent
chapters.

4.2 TWO-DIMENSIONAL WAVE EQUATIONS

Throughout this chapter we shall assume that the fields or potentials are either
independent of the z coordinate or have a known z dependence as is the case for
propagation in waveguides. That is, it is assumed that the geometry's cross section at
any xy cut remains invariant for all z values. The appropriate assumption depends on
the type of physical problem being considered as discussed below.

4.2.1 Transmission Lines

The characteristic impedance $Z_c$ and phase velocity $v_p$ of the transmission line
can be computed by solving Laplace's equation

$$\nabla_t^2 V = 0 \tag{4.1}$$

where $\nabla_t^2$ denotes the Laplacian in two dimensions. The characteristic impedance
and capacitance of the transmission line are found by following these steps (see Fig.
4.1):

1. Carry out the FEM solution and find V(x, y) in the absence of dielectrics.

2. Determine the charge per unit length of one of the conductors by carrying
out the integration

$$Q = -\oint_C \epsilon\, \nabla V \cdot \hat{n}\, dl \tag{4.2}$$
Figure 4.1 Shielded microstrip line, showing the finite element mesh and the integration
contour for evaluating the charge on the enclosed conductor.

where $\hat{n}$ denotes the outward directed unit normal and the contour is shown
in Fig. 4.1.
3. Evaluate the capacitance per unit length of the free space filled transmission
line from

$$C_0 = \frac{Q_0}{\Delta V} \tag{4.3}$$

where $Q_0$ is the charge per unit length found in step 2 for the free space filled
line and ΔV is the potential difference between the two conductors.
4. Repeat steps 1 to 3 for the original dielectrically filled transmission line to
obtain the per unit length capacitance of the line $C = Q/\Delta V$.
5. Determine the line's characteristic impedance using

$$Z_c = \frac{1}{v_0 \sqrt{C\,C_0}} \tag{4.4}$$

where $v_0$ denotes the wave velocity of the air-filled transmission line. We
also note that $v_p = v_0/\sqrt{\epsilon_e}$ where $\epsilon_e$ is referred to as the effective dielectric
constant of the substrate and, from (4.4),

$$\epsilon_e = \frac{C}{C_0} \tag{4.5}$$
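4.2.2 Two-Dimensional Scattering

In scattering analysis, the excitation is a plane wave of the form

$$\mathbf{H}^i = \hat{z}\, e^{jk_0(x\cos\phi_0 + y\sin\phi_0)} \tag{4.6}$$

for TE incidence (also referred to as $H_z$ polarization) or

$$\mathbf{E}^i = \hat{z}\, e^{jk_0(x\cos\phi_0 + y\sin\phi_0)} \tag{4.7}$$

for TM incidence (also referred to as $E_z$ polarization), as illustrated in Fig. 4.2. Both
of these plane waves are impinging upon the scatterer at an angle $\phi_0$. They may be
alternatively written as

$$\mathbf{H}^i = \hat{z}\, e^{-jk_0 \hat{k}^i \cdot \mathbf{r}}, \qquad \mathbf{E}^i = \hat{z}\, e^{-jk_0 \hat{k}^i \cdot \mathbf{r}}$$

where $\hat{k}^i = -(\hat{x}\cos\phi_0 + \hat{y}\sin\phi_0)$ is the direction of the incident wave and
$\mathbf{r} = x\hat{x} + y\hat{y}$ is the position vector at which the field is observed. More generally, a
magnetic current source $\hat{z} M_{iz}(x, y)$ for TE incidence or an electric current source
$\hat{z} J_{iz}(x, y)$ for TM incidence can be assumed as the excitation. For these types of
excitation, the fields scattered by the cylinder will also be z directed and thus for
the TE case the vector wave equation becomes

$$\nabla_t \times \left[ \frac{1}{\epsilon_r} \nabla_t \times (\hat{z} H_z) \right] - \hat{z}\, k_0^2 \mu_r H_z = -\hat{z}\, j\omega\epsilon_0 M_{iz} \tag{4.8}$$

where we included the possibility of a magnetic source as the excitation
($k_0 = 2\pi/\lambda_0 = \omega\sqrt{\mu_0\epsilon_0}$, $k = 2\pi/\lambda = \omega\sqrt{\mu\epsilon}$). Using the identity¹

$$\nabla_t \times \left[ \frac{1}{\epsilon_r} \nabla_t \times (\hat{z} H_z) \right] = \nabla_t \times \left[ \frac{1}{\epsilon_r} (-\hat{z} \times \nabla_t H_z) \right] = -\hat{z}\, \nabla_t \cdot \left( \frac{1}{\epsilon_r} \nabla_t H_z \right)$$

the vector wave equation (4.8) is now reduced to the scalar wave equation

$$\nabla_t \cdot \left( \frac{1}{\epsilon_r} \nabla_t H_z \right) + k_0^2 \mu_r H_z = j\omega\epsilon_0 M_{iz} \tag{4.9}$$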

Figure 4.2 Two-dimensional scattering configuration, showing the scattering cylinder, the
scattered field, and the contour C of the artificial mesh truncation boundary.

¹Derived from the vector identity [3]
$\nabla \times (\mathbf{A} \times \mathbf{B}) = \mathbf{A}\,\nabla \cdot \mathbf{B} - \mathbf{B}\,\nabla \cdot \mathbf{A} + (\mathbf{B} \cdot \nabla)\mathbf{A} - (\mathbf{A} \cdot \nabla)\mathbf{B}$.

for inhomogeneous media. Here, $H_z$ denotes the z component of the total magnetic
field. In scattering, it is typically decomposed as

$$H_z = H_z^{scat} + H_z^i \tag{4.10}$$

where $H_z^i$ is the component of the magnetic field generated by $\hat{z} M_{iz}$ in isolation or is
simply given by (4.6). The component $H_z^{scat}$ is referred to as the scattered magnetic
field (z component only) and is the unknown quantity of interest. As will be
discussed later, there are advantages in solving for $H_z^{scat}$ directly and in this case the
pertinent wave equation is

$$\nabla_t \cdot \left( \frac{1}{\epsilon_r} \nabla_t H_z^{scat} \right) + k_0^2 \mu_r H_z^{scat} = -\nabla_t \cdot \left( \frac{1}{\epsilon_r} \nabla_t H_z^i \right) - k_0^2 \mu_r H_z^i + j\omega\epsilon_0 M_{iz} = f(x, y) \tag{4.11}$$

obtained by substituting (4.10) into (4.9). From duality, the corresponding wave
equation for TM incidence is

$$\nabla_t \cdot \left( \frac{1}{\mu_r} \nabla_t E_z^{scat} \right) + k_0^2 \epsilon_r E_z^{scat} = -\nabla_t \cdot \left( \frac{1}{\mu_r} \nabla_t E_z^i \right) - k_0^2 \epsilon_r E_z^i + j\omega\mu_0 J_{iz} \tag{4.12}$$

where $E_z^{scat}$ denotes the z component of the scattered electric field and $E_z^i$ is the z
component of the incident electric field generated by $\hat{z} J_{iz}$ in isolation or is simply
given by (4.7).

4.2.3 Waveguide Propagation (Homogeneous Cross Section)

In waveguide propagation, the field is assumed to be of the form

$$\mathbf{U}(x, y, z) = \mathbf{U}(x, y)\, e^{-j\beta z} \tag{4.13}$$

where β is the propagation constant along z and $\mathbf{U}(x, y)$ is the field value over the
waveguide's cross section. For a waveguide whose nonmetallic region Ω is filled with
a homogeneous material certain standard simplifications are afforded. In these
waveguides, the configuration as well as the field behavior is independent of the third
variable. This would be the z direction in a Cartesian coordinate system. Thus, in the
absence of a source, the identity

$$\nabla \times \left( \frac{1}{\mu_r} \nabla \times \mathbf{U} \right) = \frac{1}{\mu_r} \nabla \times \nabla \times \mathbf{U} = -\frac{1}{\mu_r} \nabla^2 \mathbf{U} \tag{4.14}$$

holds. In (4.14), $\mathbf{U}$ denotes the electric or magnetic field vector.
The identity (4.14) can be used to rewrite the vector wave equations as (see
Section 1.2)

$$\nabla^2 \mathbf{H} + k^2 \mathbf{H} = 0 \tag{4.15}$$

$$\nabla^2 \mathbf{E} + k^2 \mathbf{E} = 0 \tag{4.16}$$

with $k = k_0\sqrt{\mu_r \epsilon_r}$. Given that $\nabla^2 \mathbf{H} = \hat{x}\nabla^2 H_x + \hat{y}\nabla^2 H_y + \hat{z}\nabla^2 H_z$, these equations
permit a decoupling of the differential equation among the field components. Thus for
TE modes ($E_z = 0$, $H_z \neq 0$), we only need to solve the scalar wave equation

$$\nabla_t^2 H_z + (k^2 - \beta^2) H_z = 0 \tag{4.17}$$

where $\nabla_t$ denotes the surface gradient and we set $\partial^2 H_z/\partial z^2 = -\beta^2 H_z$. This is also
called the Helmholtz equation and is basically the scalar equivalent of the curl-curl
vector formulation. We note that once $H_z$ is found from (4.17), Maxwell's equations
can be used to obtain the other field components using the expressions
$E_x = (-j\omega\mu/\gamma^2)(\partial H_z/\partial y)$, $E_y = (j\omega\mu/\gamma^2)(\partial H_z/\partial x)$, $H_x = (-j\beta/\gamma^2)(\partial H_z/\partial x)$,
$H_y = (-j\beta/\gamma^2)(\partial H_z/\partial y)$ (see also Section 1.4). The quantities

$$\gamma^2 = k^2 - \beta^2, \qquad Z = \frac{\omega\mu}{\beta} \tag{4.18}$$
denote the wavenumber and characteristic mode impedance, respectively.
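
As a numerical illustration of these relations, the sketch below computes β and the TE mode impedance ωμ/β for a given transverse wavenumber γ. It is a minimal example under stated assumptions: the function name `te_mode` is hypothetical, and the test case uses the well-known closed-form value γ = π/a for the dominant mode of an air-filled rectangular guide of width a, which is not derived in this section.

```python
import math

C0 = 299792458.0            # speed of light in vacuum, m/s
MU0 = 4.0e-7 * math.pi      # permeability of free space, H/m

def te_mode(freq_hz, gamma, eps_r=1.0, mu_r=1.0):
    """Return (beta, Z) for a TE mode, or None when the mode is cut off."""
    omega = 2.0 * math.pi * freq_hz
    k = omega * math.sqrt(eps_r * mu_r) / C0
    beta_sq = k * k - gamma * gamma      # from gamma^2 = k^2 - beta^2
    if beta_sq <= 0.0:
        return None
    beta = math.sqrt(beta_sq)
    return beta, omega * MU0 * mu_r / beta

a = 22.86e-3                              # guide width in meters (sample value)
print(te_mode(10.0e9, math.pi / a))       # propagating
print(te_mode(5.0e9, math.pi / a))        # below cutoff
```

In an FEM calculation γ² would come from the eigenvalues of the discretized form of (4.17) instead of from a closed-form expression.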
Alternatively, we could opt to solve the vector wave equation

$$\nabla_t \times \left( \frac{1}{\mu_r} \nabla_t \times \mathbf{E}_t \right) - \left( k_0^2 \epsilon_r - \frac{\beta^2}{\mu_r} \right) \mathbf{E}_t = 0 \tag{4.19}$$

in which

$$\mathbf{E}_t = \hat{x} E_x + \hat{y} E_y \tag{4.20}$$

is the total transverse electric field in the guide. Again, (4.19) is valid only for cases
where the field variation is independent of the third dimension.
Alternatively, for the TM modes ($H_z = 0$, $E_z \neq 0$), the appropriate scalar and
vector wave equations are simply the duals of (4.17)-(4.19). However, the boundary
conditions are of the Dirichlet type when solving for $E_z$ and of the Neumann type
when solving for $H_z$. As was shown in Chapter 3, the relation $\partial H_z/\partial n = 0$ serves as
the boundary condition for the TE case.

4.2.4 Waveguide Propagation (Inhomogeneous Cross Section)

Let us again consider the waveguide in Fig. 4.3 consisting of a center conductor
enclosed by an outer metallic shell. The region between the outer boundary of the
inner conductor and inner boundary of the outer conductor is the pertinent
computational domain. When the waveguide's material cross section is inhomogeneous, the
fields can exhibit variation in the third dimension. Therefore, the normal component
of the field is required for an accurate representation of waveguide propagation.
Also, the resulting differential equation is not easily separable, which leads to a
coupled differential equation in the tangential and normal components of the field.
To simplify the vector wave equation, we begin by rewriting the del operator
using (4.13) as

$$\nabla = \nabla_t + \hat{z} \frac{\partial}{\partial z} = \nabla_t - j\beta\hat{z}, \qquad \nabla_t = \hat{x} \frac{\partial}{\partial x} + \hat{y} \frac{\partial}{\partial y} \tag{4.21}$$

implying that

$$\begin{aligned} \nabla \times \mathbf{E} &= (\nabla_t - j\beta\hat{z}) \times (\mathbf{E}_t + \hat{z} E_z) \\ &= \nabla_t \times \mathbf{E}_t + \nabla_t E_z \times \hat{z} - j\beta\hat{z} \times \mathbf{E}_t \\ &= \nabla_t \times \mathbf{E}_t + (\nabla_t E_z + j\beta \mathbf{E}_t) \times \hat{z} \end{aligned}$$

in which $\mathbf{E} = \mathbf{E}_t + \hat{z} E_z$ as before. Thus,

Figure 4.3 Waveguide configuration: (a) cross section of waveguide, showing the finite
element mesh and the metallic boundary; (b) three-dimensional view.

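$$\begin{aligned} \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E} &= (\nabla_t - j\beta\hat{z}) \times \frac{1}{\mu_r} \left[ \nabla_t \times \mathbf{E}_t + (\nabla_t E_z + j\beta \mathbf{E}_t) \times \hat{z} \right] \\ &= \nabla_t \times \frac{1}{\mu_r} \nabla_t \times \mathbf{E}_t - j\beta\hat{z} \times \frac{1}{\mu_r} \left[ (\nabla_t E_z + j\beta \mathbf{E}_t) \times \hat{z} \right] \\ &\quad + \nabla_t \times \left[ \frac{1}{\mu_r} (\nabla_t E_z + j\beta \mathbf{E}_t) \times \hat{z} \right] \end{aligned} \tag{4.22}$$

Next we introduce (4.22) into the vector wave equation $\nabla \times (1/\mu_r) \nabla \times \mathbf{E} - k_0^2 \epsilon_r \mathbf{E} = 0$
and set each of the vector components to zero. This permits the decomposition of the
original wave equation into a pair of differential equations [4, 5, 6]

$$\nabla_t \times \frac{1}{\mu_r} \nabla_t \times \mathbf{E}_t - \frac{j\beta}{\mu_r} (\nabla_t E_z + j\beta \mathbf{E}_t) - k_0^2 \epsilon_r \mathbf{E}_t = 0 \tag{4.23}$$

$$\nabla_t \times \left[ \frac{1}{\mu_r} (\nabla_t E_z + j\beta \mathbf{E}_t) \times \hat{z} \right] - k_0^2 \epsilon_r E_z \hat{z} = 0 \tag{4.24}$$

for the transverse and z components of the wave equation. Clearly, (4.23) and (4.24)
represent a coupled pair of differential equations which either needs to be decoupled
or solved as is. Decoupling them using the divergence property yields a nonsymmetric
generalized eigenvalue problem, the solution of which is numerically inefficient.
However, using simple variable transformations [2],

$$\mathbf{e}_t = \beta \mathbf{E}_t, \qquad e_z = -jE_z \tag{4.25}$$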

the coupled pair of differential equations (4.23) and (4.24) can be expressed as

$$\begin{aligned} \nabla_t \times \frac{1}{\mu_r} \nabla_t \times \mathbf{e}_t + \frac{\beta^2}{\mu_r} (\nabla_t e_z + \mathbf{e}_t) &= k_0^2 \epsilon_r \mathbf{e}_t \\ \beta^2\, \nabla_t \times \left[ \frac{1}{\mu_r} (\nabla_t e_z + \mathbf{e}_t) \times \hat{z} \right] &= \beta^2 k_0^2 \epsilon_r e_z \hat{z} \end{aligned} \tag{4.26}$$

The coupled pair of differential equations (4.26) can now be solved for $\beta^2$ (the
square of the propagation constant of the inhomogeneous waveguide) subject to
the following boundary conditions:

$$\hat{n} \times \mathbf{e}_t = 0, \qquad e_z = 0 \tag{4.27}$$

on PEC surfaces and

$$(\nabla_t e_z + \mathbf{e}_t) \cdot \hat{n} = 0, \qquad \nabla_t \times \mathbf{e}_t = 0 \tag{4.28}$$

on PMC surfaces. The differential equation to be discretized is obtained by adding
the two coupled differential equations and weighting them with the necessary
weighting functions. This formulation leads to a symmetric generalized eigenvalue problem
which is a direct result of the transformation (4.25).
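
The effect of the variable transformation can be checked by a short substitution. The worked step below is a sketch that assumes the transformation has the form $\mathbf{e}_t = \beta\mathbf{E}_t$, $e_z = -jE_z$, which is the form consistent with (4.23), (4.24), and (4.26). It shows how multiplying (4.23) by β removes the explicit factors of j and leaves only $\beta^2$.

```latex
% Assumed transformation: e_t = \beta E_t, e_z = -j E_z, so that
% E_z = j e_z and j\beta E_t = j e_t.
\begin{align*}
\nabla_t E_z + j\beta\,\mathbf{E}_t
  &= j\left(\nabla_t e_z + \mathbf{e}_t\right) \\[4pt]
\beta \times (4.23):\quad
\nabla_t \times \frac{1}{\mu_r}\nabla_t \times (\beta\mathbf{E}_t)
  - \frac{j\beta^2}{\mu_r}\, j\left(\nabla_t e_z + \mathbf{e}_t\right)
  - k_0^2\epsilon_r\,(\beta\mathbf{E}_t) &= 0 \\[4pt]
\Longrightarrow\quad
\nabla_t \times \frac{1}{\mu_r}\nabla_t \times \mathbf{e}_t
  + \frac{\beta^2}{\mu_r}\left(\nabla_t e_z + \mathbf{e}_t\right)
  &= k_0^2\epsilon_r\,\mathbf{e}_t
\end{align*}
```

The same substitution in (4.24), followed by multiplication with $\beta^2/j$, gives the second equation of the pair, so that $\beta^2$ appears linearly as the eigenvalue.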

4.3 DISCRETIZATION OF THE TWO-DIMENSIONAL WAVE EQUATION

From the above presentation, a general form of the two-dimensional wave
equation is

$$\nabla \cdot \left[ p(x, y) \nabla U(x, y) \right] + k_0^2\, q(x, y)\, U(x, y) = f(x, y) \tag{4.29}$$

and it is understood that here $\nabla = \nabla_t = \hat{x}\, \partial/\partial x + \hat{y}\, \partial/\partial y$. This can be specialized to the
problems of scattering and waveguide propagation by choosing p(x, y) and q(x, y)
appropriately. When we consider TE (or $H_z$ polarization) we must choose

$$U(x, y) = H_z(x, y), \qquad p(x, y) = \frac{1}{\epsilon_r}, \qquad q(x, y) = \mu_r$$

while for TM or $E_z$ polarization

$$U(x, y) = E_z(x, y), \qquad p(x, y) = \frac{1}{\mu_r}, \qquad q(x, y) = \epsilon_r$$

The steps to be followed for the solution of (4.29) via the FEM parallel those
given in Sections 3.4 to 3.7 for the solution of the corresponding one-dimensional
(ordinary) differential equation. They involve

• casting of the original wave equation to its weak form to obtain a single
functional incorporating the conditions imposed by the wave equation and
the boundary conditions.
• tessellation of the computational domain allowing for a discretization of the
weak form to a linear system of equations element by element.
• assembly of the element equations and imposition of the boundary conditions
to obtain the final linear system of equations.

In this section we carry out the first two steps, and in the subsequent section we
consider the assembly of elemental equations to solve for the fields and eigenvalues
associated with a metallic waveguide (closed domain problem).

4.3.1 Weak Form of the Wave Equation

The pertinent residual of (4.29) is

$$R(x, y) = \nabla \cdot \left[ p(\mathbf{r}) \nabla U(\mathbf{r}) \right] + k_0^2\, q(\mathbf{r})\, U(\mathbf{r}) - f(\mathbf{r}) \tag{4.30}$$

where as usual $\mathbf{r} = x\hat{x} + y\hat{y}$ denotes the position vector. To derive the weak form, we
multiply (4.30) by a weighting or testing function W(x, y) and enforce $R(\mathbf{r}) = 0$ over
the domain of each element rather than point by point. This gives

$$\iint_{\Omega^e} W(\mathbf{r})\, R(\mathbf{r})\, dx\, dy = 0 \tag{4.31}$$

which is the generalization of (3.9) for the one-dimensional case. As discussed in
Section 3.4, the weighting function must again be compatible with the boundary
conditions and be square integrable over the domain. Its derivative must also be
square integrable.
To reduce the order of the derivatives in the residual and introduce the boundary
terms, we must make use of the identities

$$W\, \nabla \cdot (p \nabla U) = \nabla \cdot (p W \nabla U) - p\, \nabla W \cdot \nabla U \tag{4.32}$$

$$\iint_{\Omega} \nabla \cdot (p W \nabla U)\, ds = \oint_C p W (\nabla U \cdot \hat{n})\, dl \tag{4.33}$$

in which Ω denotes the pertinent computational domain, C is the contour enclosing
Ω as illustrated in Fig. 4.4, and $\hat{n}$ refers to the outward directed unit normal vector to
the contour C. We note that the latter is the divergence theorem, which can be
considered as the generalization of integration by parts to two dimensions.

Figure 4.4 Parameters for the application of the divergence theorem.
Making use of (4.32) and (4.33) in the weighted residual equation (4.31)
yields

$$\iint_{\Omega} \left[ -p(\mathbf{r}) \nabla W(\mathbf{r}) \cdot \nabla U(\mathbf{r}) + k_0^2\, q(\mathbf{r}) W(\mathbf{r}) U(\mathbf{r}) - W(\mathbf{r}) f(\mathbf{r}) \right] ds + \oint_C p(\mathbf{r}) W(\mathbf{r}) \left[ \hat{n} \cdot \nabla U(\mathbf{r}) \right] dl = 0 \tag{4.34}$$

This is the weak form of the two-dimensional scalar wave equation and should be
compared to the one-dimensional weak form in (3.11). Again, we note the presence
of the boundary integral over C which allows for the imposition of the boundary
conditions. Thus, (4.34) provides a single statement incorporating the conditions
implied by the wave equation and the pertinent boundary conditions. As noted
for the one-dimensional case, this is at the heart of the finite element method.

4.3.2 Discretization of the Weak Wave Equation

To discretize (4.34) we proceed by introducing a discrete representation of the
field $U(x,y)$ and making appropriate choices for the weighting function $W(x,y)$. As
a first step, we consider a tessellation of the computational domain $\Omega$ in small
triangular elements (see Fig. 4.1 to Fig. 4.3). We next choose to approximate the
fields in each triangle as a linear function and in so doing, we can choose the
expansion or basis functions $N_i^e(x,y)$ given in Section 2.3.2 to represent $U(x,y)$
over $\Omega$. Specifically, we expand $U(x,y)$ as

\[ U(x,y) = \sum_{e=1}^{N_e} \sum_{i=1}^{3} U_i^e N_i^e(x,y) \tag{4.35} \]

where $U_i^e$ are the unknown coefficients of the expansion and represent the field or
potential values at the nodes of each triangle. This representation is therefore
referred to as a node-based expansion. As usual, $N_e$ denotes the number of elements
used for tessellating the domain. The procedure of tessellation is referred to as
meshing and typically each side of the triangle is chosen to be less than 1/10 of a
wavelength.
From Chapter 2, the explicit form of the shape functions $N_i^e(x,y)$ is

\[ N_i^e(x,y) = \begin{cases} \dfrac{1}{2A^e}\left( a_i^e + b_i^e x + c_i^e y \right), & \mathbf{r} \in \Omega^e \\ 0, & \text{otherwise} \end{cases} \tag{4.36} \]

where

\[ A^e = \frac{1}{2} \det \begin{bmatrix} 1 & x_1^e & y_1^e \\ 1 & x_2^e & y_2^e \\ 1 & x_3^e & y_3^e \end{bmatrix} = \frac{1}{2}\left[ (x_2^e - x_1^e)(y_3^e - y_1^e) - (x_3^e - x_1^e)(y_2^e - y_1^e) \right] \tag{4.37} \]

is the triangle area. The coefficients in (4.36) are given by

\[ a_i^e = x_j^e y_k^e - x_k^e y_j^e, \qquad b_i^e = y_j^e - y_k^e, \qquad c_i^e = x_k^e - x_j^e \tag{4.38} \]

with the indices (i, j, k) following the cyclical rule. That is, (i, j, k) = (1,2,3), (2,3,1),
or (3,1,2) for the first, second, and third local nodes, respectively. The shape
functions $N_i^e(x,y)$ are pictorially illustrated in Fig. 4.5. They are equal to unity at the ith
node of the eth element and taper linearly to zero at the other two nodes.
Consequently, the expansion $\sum_{i=1}^{3} U_i^e N_i^e(x,y)$ over the eth element scales the shape functions
depending on the value of the field or potential at the nodes of the eth element.

Figure 4.5 Node coordinates for the eth triangle and illustration of the node-based
expansion functions $N_i^e(x,y)$ (linear field approximation over the eth triangle).
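The node-based shape functions can be checked numerically. The following is a minimal sketch, assuming the standard linear-triangle coefficients $a_i = x_j y_k - x_k y_j$, $b_i = y_j - y_k$, $c_i = x_k - x_j$ with cyclic (i, j, k) and counterclockwise node ordering; the function names are illustrative and not taken from the book's appendix. The sample triangle uses the coordinates of global nodes 6, 11, and 10 from Fig. 4.8.

```python
def shape_coefficients(nodes):
    """Return (area, [(a_i, b_i, c_i), ...]) for a linear triangle.

    `nodes` holds the three (x, y) vertex pairs in counterclockwise order,
    so that the signed area is positive.
    """
    x = [pt[0] for pt in nodes]
    y = [pt[1] for pt in nodes]
    area = 0.5 * ((x[1] - x[0]) * (y[2] - y[0]) - (x[2] - x[0]) * (y[1] - y[0]))
    coeffs = []
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3  # cyclic rule (i, j, k)
        a = x[j] * y[k] - x[k] * y[j]
        b = y[j] - y[k]
        c = x[k] - x[j]
        coeffs.append((a, b, c))
    return area, coeffs


def shape_function(i, x, y, area, coeffs):
    """Evaluate the ith linear shape function at the point (x, y)."""
    a, b, c = coeffs[i]
    return (a + b * x + c * y) / (2.0 * area)


# Triangle with the coordinates of global nodes 6, 11, 10 in Fig. 4.8.
TRIANGLE = [(0.5, 0.25), (1.0, 0.5), (0.5, 0.5)]
AREA, COEFFS = shape_coefficients(TRIANGLE)

if __name__ == "__main__":
    # Each row should be 1 at its own node and 0 at the other two.
    for i in range(3):
        row = [shape_function(i, px, py, AREA, COEFFS) for px, py in TRIANGLE]
        print([round(v, 12) for v in row])
```

Because the three functions are affine and interpolate 1, 0, 0 at the vertices, they also sum to unity at any point of the triangle, which is a convenient sanity check on the coefficients.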
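Substituting the expansion (4.35) for $U(x,y)$ into the weak wave equation
(4.34) yields

\[ \sum_{e=1}^{N_e} \sum_{i=1}^{3} U_i^e \iint_{\Omega^e} \left[ -p(\mathbf{r})\,\nabla W(\mathbf{r})\cdot\nabla N_i^e(\mathbf{r}) + k_0^2\, q(\mathbf{r})\, W(\mathbf{r})\, N_i^e(\mathbf{r}) \right] dx\,dy + \oint_{C} p(\mathbf{r})\, W(\mathbf{r})\, \hat{n}\cdot\nabla U(\mathbf{r})\, dl = \iint_{\Omega} W(\mathbf{r}) f(\mathbf{r})\, dx\,dy \tag{4.39} \]

Notice that we have not used the expansion (4.35) to approximate $\Psi = \hat{n}\cdot\nabla U$ on the
boundary $C$ since the behavior of $\Psi$ on $C$ must be provided through the boundary
conditions.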
A linear set of equations can now be obtained by employing Galerkin's method
where we choose the weighting function $W(x,y)$ equal to the expansion basis
$N_j^e(x,y)$, j = 1, 2, 3. Doing so yields

\[ \sum_{i=1}^{3} U_i^e \iint_{\Omega^e} \left[ -p(\mathbf{r})\,\nabla N_i^e(\mathbf{r})\cdot\nabla N_j^e(\mathbf{r}) + k_0^2\, q(\mathbf{r})\, N_i^e(\mathbf{r}) N_j^e(\mathbf{r}) \right] dx\,dy + \int_{C_s} p(\mathbf{r})\, N_j^e(\mathbf{r})\, \hat{n}\cdot\nabla U(\mathbf{r})\, dl = \iint_{\Omega^e} N_j^e(\mathbf{r}) f(\mathbf{r})\, dx\,dy, \quad j = 1, 2, 3 \tag{4.40} \]

where we have temporarily dropped the sum over the elements since $N_j^e(x,y)$ is
nonzero only over the eth element. We will later perform the summation over all
elements during the assembly process of the matrix system. Also, the presence of the
boundary integral is required only if the eth element has an edge bordering the
contour $C$. The contour segment $C_s$ refers to the edge of the eth triangle which is
part of $C$.

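From (4.40), we can obtain a 3 x 3 system of equations by running through all
choices of j = 1, 2, 3. If we assume that on $C$ the field satisfies the Neumann
boundary condition $\hat{n}\cdot\nabla U = \Psi = 0$, we have

\[ \begin{bmatrix} A_{11}^e & A_{12}^e & A_{13}^e \\ A_{21}^e & A_{22}^e & A_{23}^e \\ A_{31}^e & A_{32}^e & A_{33}^e \end{bmatrix} \begin{Bmatrix} U_1^e \\ U_2^e \\ U_3^e \end{Bmatrix} = \begin{Bmatrix} b_1^e \\ b_2^e \\ b_3^e \end{Bmatrix} \tag{4.41} \]

or

\[ [A^e]\{U^e\} = \{b^e\} \]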

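which is the element matrix system. The explicit form of the matrix entries is

\[ A_{ij}^e = -\iint_{\Omega^e} p^e\, \nabla N_i^e(\mathbf{r})\cdot\nabla N_j^e(\mathbf{r})\, dx\,dy + k_0^2 \iint_{\Omega^e} q^e\, N_i^e(\mathbf{r}) N_j^e(\mathbf{r})\, dx\,dy, \quad i = 1, 2, 3, \; j = 1, 2, 3 \tag{4.42} \]

\[ b_j^e = \iint_{\Omega^e} N_j^e(\mathbf{r}) f(\mathbf{r})\, dx\,dy, \quad j = 1, 2, 3 \tag{4.43} \]

where we assumed $p(\mathbf{r}) \approx p^e$ and $q(\mathbf{r}) \approx q^e$ are constant over each element to permit a
closed form evaluation of the integrals. Since $N_i^e(\mathbf{r})$ are linear functions, the
evaluation of the $[A^e]$ matrix entries can be done in closed form. We have

\[ [A^e] = -[K_\nabla^e] + k_0^2 [K^e] \]

where ($\{\cdot\}$ imply column vectors)

\[ [K_\nabla^e] = p^e \iint_{\Omega^e} \{\nabla N^e\}\cdot\{\nabla N^e\}^T dx\,dy, \qquad [K^e] = q^e \iint_{\Omega^e} \{N^e\}\{N^e\}^T dx\,dy \]

Evaluating the entries of the submatrices $[K_\nabla^e]$ and $[K^e]$ yields [note that the constants
$b_i^e$ below are those in (4.38) and are not related to the excitation column in (4.41)]

\[ K_{\nabla ij}^e = \frac{p^e}{4A^e}\left( b_i^e b_j^e + c_i^e c_j^e \right) \tag{4.44} \]

\[ [K^e] = \frac{q^e A^e}{12} \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix} \tag{4.45} \]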

The latter is independent of the triangle coordinates (except for the area multiplier)
and is referred to as a universal element matrix [7]. By making use of (4.44) and
(4.45) in (4.42), the $[A^e]$ matrix entries can be more compactly written as

\[ A_{ij}^e = -\frac{p^e}{4A^e}\left( b_i^e b_j^e + c_i^e c_j^e \right) + k_0^2\, q^e\, \frac{A^e}{12}\left( 1 + \delta_{ij} \right) \tag{4.46} \]

where

\[ \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & \text{otherwise} \end{cases} \]

Thus, for the chosen linear triangular tessellation, the matrix entries are given in
closed form and can be easily evaluated. Depending on the form of the function
$f(x,y)$, the excitation column entries $b_j^e$ may need to be computed numerically.
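The closed-form entries lend themselves to a few lines of code. The sketch below is an illustration only, assuming the element matrix has the form $A_{ij}^e = -K_{\nabla ij}^e + k_0^2 K_{ij}^e$ with $K_{\nabla ij}^e = p^e (b_i b_j + c_i c_j)/(4A^e)$ and $K_{ij}^e = q^e A^e (1 + \delta_{ij})/12$, and counterclockwise node ordering; the function names are hypothetical and this is not the appendix routine.

```python
import math


def element_matrices(nodes, p=1.0, q=1.0):
    """Return ([K_grad], [K_mass]) for one linear triangle.

    nodes: three (x, y) pairs in counterclockwise order.
    p, q:  material constants, assumed uniform over the element.
    """
    (x1, y1), (x2, y2), (x3, y3) = nodes
    area = 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    b = [y2 - y3, y3 - y1, y1 - y2]  # b_i = y_j - y_k
    c = [x3 - x2, x1 - x3, x2 - x1]  # c_i = x_k - x_j
    k_grad = [[p * (b[i] * b[j] + c[i] * c[j]) / (4.0 * area) for j in range(3)]
              for i in range(3)]
    k_mass = [[q * area / 12.0 * (2.0 if i == j else 1.0) for j in range(3)]
              for i in range(3)]
    return k_grad, k_mass


def element_a_matrix(nodes, k0, p=1.0, q=1.0):
    """Combine the two submatrices as A = -K_grad + k0^2 K_mass."""
    k_grad, k_mass = element_matrices(nodes, p, q)
    return [[-k_grad[i][j] + k0 ** 2 * k_mass[i][j] for j in range(3)]
            for i in range(3)]


# Triangle with the coordinates of global nodes 6, 11, 10 in Fig. 4.8.
SAMPLE = [(0.5, 0.25), (1.0, 0.5), (0.5, 0.5)]

if __name__ == "__main__":
    for row in element_a_matrix(SAMPLE, 2.0 * math.pi):
        print(["%.4f" % v for v in row])
```

Each row of the gradient submatrix sums to zero because the $b_i$ and the $c_i$ each sum to zero, which is a quick check that a constant field has no gradient energy.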

4.3.3 Assembly of Element Equations

As was the case for the one-dimensional solutions, the next step in the finite
element procedure is the assembly of the element equations (4.41). This refers to the
procedure of carrying out the sum

\[ \sum_{e=1}^{N_e} [A^e]\{U^e\} = \sum_{e=1}^{N_e} \{b^e\} \tag{4.47} \]

as implied by the original discrete form of the "weak" wave equation (4.39). Since
several of the elements may share the same node, the sum or assembly (4.47)
consolidates the surrounding elements to yield a single equation. This is simply the sum
of the element equations from all of the surrounding elements. Specifically, let us
consider node 1 which is shared by five elements, as shown in Fig. 4.6.
Figure 4.6 Illustration of a node shared by five elements (triangles), numbered
from the e'th to the (e'+4)th element.

For this case, the sum (4.47) produces the equation

\[ \sum_{i=0}^{4} \left( A_{11}^{e'+i} U_1^{e'+i} + A_{12}^{e'+i} U_2^{e'+i} + A_{13}^{e'+i} U_3^{e'+i} \right) = \sum_{i=0}^{4} b_1^{e'+i} \tag{4.48} \]

for node 1 with the matrix entries as given in (4.46). The corresponding equations for
the other nodes are very similar, except for differences in the superscripts/subscripts
and the order of the sum. We should remark that the assembled equation (4.48) for
node 1 is the same regardless of the number of elements/nodes contained in the
computational domain. That is, even if the entire domain contains thousands of
nodes, (4.48) will still involve only six nodal fields, implying that the corresponding
row of the assembled matrix system $[A]\{U\} = \{b\}$ will contain only six nonzero
entries even though the rank of $[A]$ is in the thousands. Thus, the assembled finite
element matrix is always very sparse and this is a major advantage and characteristic
of the FEM. The bandwidth and structure of $[A]$ is determined by the connectivity of
the nodes, a result of the tessellation scheme. As can be understood, the bandwidth
of the matrix $[A]$ is strongly dependent on the node numbering scheme. We can
reduce the bandwidth by numbering the adjacent nodes using consecutive numbers.
This is difficult to achieve, but sparse matrix storage schemes such as those presented in
Chapter 8 can be used to maintain their efficiency for matrices of the same sparsity
but different bandwidths. Parallel computing architectures take advantage of
sparsity but in the case of vector processors, narrow matrix bandwidths must also be
maintained for substantial efficiency improvements [8].
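The dependence of sparsity and bandwidth on connectivity can be illustrated with a short sketch. It assumes the 18-element connectivity table of Fig. 4.8 (Section 4.3.4) and derives the structural nonzero pattern of $[A]$ from it: entry (m, n) can be nonzero only if nodes m and n share an element. The helper names are hypothetical.

```python
# Triangle connectivity of Fig. 4.8: global nodes of local nodes 1, 2, 3.
TRIANGLES = [
    (1, 2, 5), (2, 6, 5), (2, 3, 7), (2, 7, 6), (3, 4, 8), (3, 8, 7),
    (5, 6, 10), (5, 10, 9), (6, 7, 11), (6, 11, 10), (7, 8, 11), (8, 12, 11),
    (9, 10, 14), (9, 14, 13), (10, 11, 15), (10, 15, 14), (11, 12, 15),
    (12, 16, 15),
]


def sparsity_pattern(triangles):
    """Map each global node (row) to the set of columns that may be nonzero."""
    pattern = {}
    for tri in triangles:
        for m in tri:
            pattern.setdefault(m, set()).update(tri)
    return pattern


def half_bandwidth(pattern):
    """Largest |m - n| over all structurally nonzero entries (m, n)."""
    return max(abs(m - n) for m, cols in pattern.items() for n in cols)


PATTERN = sparsity_pattern(TRIANGLES)

if __name__ == "__main__":
    print("row 6 columns:", sorted(PATTERN[6]))
    print("nonzeros:", sum(len(cols) for cols in PATTERN.values()), "of", 16 * 16)
    print("half bandwidth:", half_bandwidth(PATTERN))
```

Renumbering the nodes changes only the half bandwidth, not the number of nonzeros, which is why bandwidth-independent sparse storage is attractive.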
A key issue in performing the assembly as dictated by (4.47) is the
transformation from local to global nodes. This was discussed for the one-dimensional analysis.
However, because of the easily predictable connectivity of the elements (i.e., each
element was sequentially numbered and each node was shared by the two adjacent
segments) the local to global transformation was not an issue for the
one-dimensional case. The issue of node numbering becomes apparent when we look at the
assembled equation (4.48).
Since the unknowns $U_i^e$ must be eventually put into a single column (with one
subscript), it is necessary to have a readily available mapping between the local and
global nodes which are associated with the eth element. Thus, in addition to the node
geometry data provided to the finite element program, we must also provide
information about the local and global node numbering schemes. Four tables may be
required before carrying out the matrix assembly routine:
■ Node Location Table
A listing of all mesh nodes (interior and boundary nodes) using global
numbers and their corresponding (x, y) coordinates. This table specifies
the geometry of the input configuration.
■ Triangle Connectivity Table
The global nodes comprising each triangle are given by this table. For
example, by referring to Fig. 4.7 we observe that element #3 (e = 3) is
formed by nodes 3, 5, and 2, as given in line 3 of the table. Basically, the
table defines three arrays: n(1,e), n(2,e), and n(3,e). The first of these
provides the correspondence between local node 1 of the eth element and the
global nodes. The other two arrays provide the same correspondence
information for the other two nodes of the triangle. Let's say for example that we
are working with the local nodes of element e = 4 and want to get the
corresponding global nodes of that element. Then the value of n(1,4) will
give the global number of local node 1, n(2,4) will provide the global
number of local node 2, and so on.

[Figure 4.7: mesh sketch showing the global node numbers and the outer surface of the mesh, accompanied by the data tables below.]

Node location table
Global node #   x     y
1               x1    y1
2               x2    y2
3               x3    y3
...

Triangle connectivity table
Element e   n(1,e)   n(2,e)   n(3,e)
1           2        4        1
2           5        4        2
3           3        5        2
4           5        6        4
...

Boundary element connectivity table
Surface edge s   ns(1,s)   ns(2,s)
1                6         4
2                4         1
3                1         2

Figure 4.7 Geometry and connectivity data tables required for matrix assembly.

■ Boundary Element Table
To impose the boundary conditions it is necessary to identify the surface
edges (line elements) on the outer boundary of the mesh and their associated
nodes. This table can be generated using the data in the previous two tables.
The data manipulations required to generate the surface node and element
information are typically part of the data preprocessor and are an important
step before assembling the final system. An example of such a boundary
element table is given in Fig. 4.7. For the two-dimensional case, it suffices to
identify all segments which bound the mesh and the nodes which form those
elements. Again, the listed arrays provide the correspondence between the
local surface element nodes and the global nodes.
■ Material Group Table
This is a look-up table for the material specification of each element. In
practice, the same material covers blocks or sections of the domain and it is
therefore not necessary to specify the material parameters for each
individual element. Instead, one may choose to attach a material code column to
the element connectivity table. That is, the material of each element is
specified through the "code" which is in turn associated with specific values
of $\epsilon_r$ and $\mu_r$.
The first two of the above tables are always required but the latter two may or
may not be needed depending on the application at hand. Also, it may be convenient
to introduce other tables when different elements are used or more complex
geometries are modeled.
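The four tables above can be sketched as plain data structures. The example below is a hedged illustration, not the book's preprocessor: it assumes the node coordinates and connectivity of the Fig. 4.8 mesh, a single hypothetical material code for all elements, and it generates the boundary element table from the first two tables by keeping every edge that belongs to exactly one triangle.

```python
from collections import Counter

# Node location table: global node number -> (x, y), Fig. 4.8 layout.
NODES = {4 * row + col + 1: (0.5 * col, 0.25 * row)
         for row in range(4) for col in range(4)}

# Triangle connectivity table: (n(1,e), n(2,e), n(3,e)) for e = 1..18.
TRIANGLES = [
    (1, 2, 5), (2, 6, 5), (2, 3, 7), (2, 7, 6), (3, 4, 8), (3, 8, 7),
    (5, 6, 10), (5, 10, 9), (6, 7, 11), (6, 11, 10), (7, 8, 11), (8, 12, 11),
    (9, 10, 14), (9, 14, 13), (10, 11, 15), (10, 15, 14), (11, 12, 15),
    (12, 16, 15),
]

# Material group table (hypothetical): one code column plus a look-up table.
MATERIAL_CODE = [1] * len(TRIANGLES)
MATERIALS = {1: {"eps_r": 1.0, "mu_r": 1.0}}


def boundary_edges(triangles):
    """Return the edges that belong to exactly one triangle."""
    count = Counter()
    for tri in triangles:
        for i in range(3):
            count[frozenset((tri[i], tri[(i + 1) % 3]))] += 1
    return sorted(tuple(sorted(edge)) for edge, n in count.items() if n == 1)


BOUNDARY = boundary_edges(TRIANGLES)
BOUNDARY_NODES = sorted({node for edge in BOUNDARY for node in edge})

if __name__ == "__main__":
    print(len(BOUNDARY), "boundary edges")
    print("boundary nodes:", BOUNDARY_NODES)
```

An interior edge is shared by two triangles while a boundary edge is seen only once, so counting edge occurrences is enough to recover the boundary table without any geometric test.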

4.3.4 Assembly Example: Waveguide Eigenvalues

To show how the assembly is performed, let us do a specific example. For
illustrative purposes, consider the 18-element rectangular waveguide cross section
shown in Fig. 4.8. The node location and element connectivities are given in the
accompanying tables. If we assume that $U = H_z$, then $\Psi = \hat{n}\cdot\nabla U = 0$ at the
boundary nodes of the rectangular metallic waveguide. Thus, the boundary integral in
(4.40) vanishes and the FEM system for the node fields is obtained by performing
the assembly

\[ \sum_{e=1}^{18} [A^e]\{U^e\} = \{0\} \tag{4.49} \]

where the dimension of $\{U\}$ is 16 and $\{b^e\}$ was set to zero since no excitation is
assumed. A procedure for carrying out the assembly is as follows:
Note the correspondence between the local and global nodes. For example,
$U_1^{e=10} = U_6$, where, as usual, the single subscript refers to the global node 6. Thus,
the element matrix for e = 10 is

\[ \begin{bmatrix} A_{11}^{10} & A_{12}^{10} & A_{13}^{10} \\ A_{21}^{10} & A_{22}^{10} & A_{23}^{10} \\ A_{31}^{10} & A_{32}^{10} & A_{33}^{10} \end{bmatrix} \begin{Bmatrix} U_6 \\ U_{11} \\ U_{10} \end{Bmatrix} = 0 \tag{4.50} \]

If we are interested in obtaining the assembled equation for global node 6, we
must also consider all of the elements sharing this node. In addition to element 10,
node 6 is shared by elements 2, 4, 9, and 7. The corresponding element equations for
each of these are

\[ \begin{bmatrix} A_{11}^{2} & A_{12}^{2} & A_{13}^{2} \\ A_{21}^{2} & A_{22}^{2} & A_{23}^{2} \\ A_{31}^{2} & A_{32}^{2} & A_{33}^{2} \end{bmatrix} \begin{Bmatrix} U_2 \\ U_6 \\ U_5 \end{Bmatrix} = 0 \tag{4.51} \]

[Figure 4.8: 18-element triangular mesh of the rectangular waveguide cross section, showing the local node numbers, global node numbers, and element numbers.]

Triangle connectivity table
Element e   n(1,e)   n(2,e)   n(3,e)
1           1        2        5
2           2        6        5
3           2        3        7
4           2        7        6
5           3        4        8
6           3        8        7
7           5        6        10
8           5        10       9
9           6        7        11
10          6        11       10
11          7        8        11
12          8        12       11
13          9        10       14
14          9        14       13
15          10       11       15
16          10       15       14
17          11       12       15
18          12       16       15

Node location table
Global node   x     y
1             0     0
2             0.5   0
3             1.0   0
4             1.5   0
5             0     0.25
6             0.5   0.25
7             1.0   0.25
8             1.5   0.25
9             0     0.5
10            0.5   0.5
11            1.0   0.5
12            1.5   0.5
13            0     0.75
14            0.5   0.75
15            1.0   0.75
16            1.5   0.75

Figure 4.8 Geometry, node location, connectivity, and boundary node data tables
required for matrix assembly.

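\[ \begin{bmatrix} A_{11}^{4} & A_{12}^{4} & A_{13}^{4} \\ A_{21}^{4} & A_{22}^{4} & A_{23}^{4} \\ A_{31}^{4} & A_{32}^{4} & A_{33}^{4} \end{bmatrix} \begin{Bmatrix} U_2 \\ U_7 \\ U_6 \end{Bmatrix} = 0 \tag{4.52} \]

\[ \begin{bmatrix} A_{11}^{9} & A_{12}^{9} & A_{13}^{9} \\ A_{21}^{9} & A_{22}^{9} & A_{23}^{9} \\ A_{31}^{9} & A_{32}^{9} & A_{33}^{9} \end{bmatrix} \begin{Bmatrix} U_6 \\ U_7 \\ U_{11} \end{Bmatrix} = 0 \tag{4.53} \]

\[ \begin{bmatrix} A_{11}^{7} & A_{12}^{7} & A_{13}^{7} \\ A_{21}^{7} & A_{22}^{7} & A_{23}^{7} \\ A_{31}^{7} & A_{32}^{7} & A_{33}^{7} \end{bmatrix} \begin{Bmatrix} U_5 \\ U_6 \\ U_{10} \end{Bmatrix} = 0 \tag{4.54} \]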

Adding the element equations in (4.50) to (4.54) which refer to testing at a common
node, yields the system

\[ \begin{bmatrix}
A_{11}^{2} + A_{11}^{4} & A_{13}^{2} & A_{12}^{2} + A_{13}^{4} & A_{12}^{4} & 0 & 0 \\
A_{31}^{2} & A_{33}^{2} + A_{11}^{7} & A_{32}^{2} + A_{12}^{7} & 0 & A_{13}^{7} & 0 \\
A_{21}^{2} + A_{31}^{4} & A_{23}^{2} + A_{21}^{7} & A_{22}^{2} + A_{33}^{4} + A_{22}^{7} + A_{11}^{9} + A_{11}^{10} & A_{32}^{4} + A_{12}^{9} & A_{23}^{7} + A_{13}^{10} & A_{13}^{9} + A_{12}^{10} \\
A_{21}^{4} & 0 & A_{23}^{4} + A_{21}^{9} & A_{22}^{4} + A_{22}^{9} & 0 & A_{23}^{9} \\
0 & A_{31}^{7} & A_{32}^{7} + A_{31}^{10} & 0 & A_{33}^{7} + A_{33}^{10} & A_{32}^{10} \\
0 & 0 & A_{31}^{9} + A_{21}^{10} & A_{32}^{9} & A_{23}^{10} & A_{33}^{9} + A_{22}^{10}
\end{bmatrix}
\begin{Bmatrix} U_2 \\ U_5 \\ U_6 \\ U_7 \\ U_{10} \\ U_{11} \end{Bmatrix} = 0 \tag{4.55} \]

We can continue assembling element equations onto this system until all elements
have been accounted for. However, the third equation in (4.55), referring to testing
at node 6, will not change through the remaining assembly process. Clearly, this
equation is no different than the generic equation (4.48). Thus, we have established a
pattern for assembling the final system of equations. Specifically, note that in the
final assembled system, the entries will be given by the sum

\[ A_{n(i,e_1),\,n(j,e_1)} = \sum_{\text{all } e_1} A_{ij}^{e_1} \tag{4.56} \]

The index $e_1$ of the sum (see Fig. 4.8) must be kept identical in $n(i,e_1)$ and
$n(j,e_1)$ when carrying out the sum over $e_1$, i, and j. For example,
$A_{6,6} = A_{11}^{10} + A_{11}^{9} + A_{22}^{7} + A_{33}^{4} + A_{22}^{2}$, and so on. A computer (MATLAB) routine for carrying out the
assembly is given in the appendix where the coefficients are adjusted for the chosen
polarization. For a wavenumber of $k_0 = 2\pi$ (i.e., $\lambda = 1$), the numerical values of the
assembled $[A]$ matrix are given below and can be used in conjunction with
a given excitation vector $\{b\}$ to obtain the waveguide fields across its cross section.
[Assembled 16 x 16 matrix $[A]$ for the mesh of Fig. 4.8 with $k_0 = 2\pi$; the numerical entries are not legible in this copy.]
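The assembly pattern described above can be sketched in a few lines. This is not the appendix routine; it is a minimal illustration that assumes $p^e = q^e = 1$, $k_0 = 2\pi$, the Fig. 4.8 tables, and element matrices of the form $-K_\nabla^e + k_0^2 K^e$ for linear triangles. Each element entry $A_{ij}^e$ is added to the global entry whose row and column are the global numbers of local nodes i and j.

```python
import math

NODES = {4 * row + col + 1: (0.5 * col, 0.25 * row)
         for row in range(4) for col in range(4)}
TRIANGLES = [
    (1, 2, 5), (2, 6, 5), (2, 3, 7), (2, 7, 6), (3, 4, 8), (3, 8, 7),
    (5, 6, 10), (5, 10, 9), (6, 7, 11), (6, 11, 10), (7, 8, 11), (8, 12, 11),
    (9, 10, 14), (9, 14, 13), (10, 11, 15), (10, 15, 14), (11, 12, 15),
    (12, 16, 15),
]


def element_a_matrix(xy, k0, p=1.0, q=1.0):
    """3 x 3 element matrix -K_grad + k0^2 K_mass for a linear triangle."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    area = 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    return [[-p * (b[i] * b[j] + c[i] * c[j]) / (4.0 * area)
             + k0 ** 2 * q * area / 12.0 * (2.0 if i == j else 1.0)
             for j in range(3)] for i in range(3)]


def assemble(nodes, triangles, k0):
    """Scatter every element entry into the global matrix (1-based nodes)."""
    n = len(nodes)
    a_global = [[0.0] * n for _ in range(n)]
    for tri in triangles:
        a_e = element_a_matrix([nodes[g] for g in tri], k0)
        for i, gi in enumerate(tri):
            for j, gj in enumerate(tri):
                a_global[gi - 1][gj - 1] += a_e[i][j]
    return a_global


A = assemble(NODES, TRIANGLES, 2.0 * math.pi)

if __name__ == "__main__":
    interior = [6, 7, 10, 11]
    for m in interior:
        print(["%8.4f" % A[m - 1][n - 1] for n in interior])
```

Printing the rows and columns of nodes 6, 7, 10, and 11 gives a convenient comparison against the interior-node submatrix quoted later for the TM case.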

4.3.4.1 TE Modes (Neumann Boundary Conditions). One way to verify the
correct evaluation and implementation of the FEM matrix is to use it in computing
the known capacitance of a given transmission line such as the shielded microstrip
line in Fig. 4.1. In this case, the wavenumber $k_0$ is set to zero and thus only the $[K_\nabla]$
matrix is used to evaluate the capacitance $C_0$, as in Fig. 4.1. A situation where both
submatrices $[K_\nabla]$ and $[K]$ [obtained from the assembly of $[K_\nabla^e]$ and $[K^e]$ given in
(4.46)] are used is that of computing the waveguide cutoff wavenumbers. Upon
comparing the wave equations (4.17) and (4.29), we observe that the cutoff
wavenumbers are obtained by solving the generalized eigenvalue problem

\[ [K_\nabla]\{H_z\} = \gamma^2 [K]\{H_z\} \tag{4.57} \]

where $\gamma^2$ are the eigenvalues and $\gamma$ refer to the cutoff wavenumbers. Considering a
rectangular waveguide (see Fig. 4.8) with dimensions a/b = 2, the computed
eigenvalues are given in Table 4.1 and the corresponding mode field distributions
(eigenvectors) are shown in Fig. 4.9 for TE10 and TE11. These calculations were carried out
by Reddy et al. [9] using 400 triangular elements over the waveguide cross section. As
seen, the lower order values are within 0.4% of the exact [10] eigenvalues given by

\[ \gamma_{mn}^2 = \left( \frac{m\pi}{a} \right)^2 + \left( \frac{n\pi}{b} \right)^2 \]

for the TEmn ($m \neq 0$ or $n \neq 0$) and the TMmn ($m \neq 0$ and $n \neq 0$) modes. We remark
that the accuracy of the calculated eigenvalues deteriorates for the higher order
modes since the latter require a finer tessellation due to their more complex mode
structure.
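The comparison in Table 4.1 can be reproduced for the analytical column with a short script. This sketch assumes the standard rectangular-waveguide result $\gamma_{mn}^2 = (m\pi/a)^2 + (n\pi/b)^2$, so that $\gamma a = \pi\sqrt{m^2 + (n\,a/b)^2}$; the FEM values are simply copied from Table 4.1 and the function names are illustrative.

```python
import math


def gamma_a_exact(m, n, a_over_b=2.0):
    """Normalized cutoff wavenumber (gamma * a) of the (m, n) mode."""
    return math.pi * math.sqrt(m ** 2 + (n * a_over_b) ** 2)


# FEM column of Table 4.1, keyed by the mode indices (m, n).
FEM_TABLE = {(1, 0): 3.144, (2, 0): 6.308, (0, 1): 6.308,
             (1, 1): 7.027, (1, 2): 13.201, (2, 1): 8.993}


def percent_errors(table, a_over_b=2.0):
    """Relative deviation (in percent) of each tabulated value."""
    errors = {}
    for (m, n), value in table.items():
        exact = gamma_a_exact(m, n, a_over_b)
        errors[(m, n)] = 100.0 * abs(value - exact) / exact
    return errors


ERRORS = percent_errors(FEM_TABLE)

if __name__ == "__main__":
    for mode in sorted(FEM_TABLE):
        print(mode, "exact %.3f" % gamma_a_exact(*mode),
              "FEM %.3f" % FEM_TABLE[mode], "error %.2f%%" % ERRORS[mode])
```

Running the comparison makes the trend noted in the text visible: the deviation grows with the mode order for a fixed mesh.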
TABLE 4.1 Cutoff Wavenumbers for a Rectangular Waveguide

Values of $\gamma a$ (a/b = 2)
TE    TM    Analytical [10]   FEM Calculation
10          3.142             3.144
20          6.285             6.308
01          6.285             6.308
11    11    7.027             7.027
12    12    12.958            13.201
21    21    8.889             8.993

Figure 4.9 Calculated TE10 and TE11 mode electric fields in a rectangular
waveguide with a/b = 2. [Courtesy of Reddy et al. [9].]

We can solve the eigenvalue problem (4.57) with equal ease for a coaxial cable
of inner radius $r_1$ and outer radius $r_2$, as shown in Fig. 4.10. The typical triangular
mesh for this type of cross section is illustrated in Fig. 4.3. Calculations were carried
out by Reddy et al. [9] using 340 elements to model the cross section between the
inner and outer conductor for $r_2/r_1 = 4$. The first few eigenvalues (cutoff
wavenumbers) and eigenvectors are given in Table 4.2 and Fig. 4.11, respectively. For
the shown TE mode values, the agreement between the FEM calculated and
analytical [11] wavenumbers is within 0.8%.

Figure 4.10 Coaxial waveguide geometry.

TABLE 4.2 Cutoff Wavenumbers for the Three Lowest Order TE Modes
and Two Lowest Order TM Modes of the Coaxial Cable (the TEM mode
is excluded)

($r_2/r_1 = 4$)
Mode    Analytical [11]   FEM Calculation
TE11    0.411             0.412
TE21    0.752             0.754
TE31    1.048             1.055
TM01    1.024             1.030
TM11    1.112             1.120

Figure 4.11 Calculated TE11 and TE21 mode electric fields in a coaxial line with
$r_2/r_1 = 4$. [Courtesy of Reddy et al. [9].]

4.3.4.2 TM Modes (Dirichlet Boundary Conditions). For the TM modes,
$U = E_z$ and $E_z = 0$ on the boundary. However, $\partial U/\partial n = \hat{n}\cdot\nabla E_z = \partial E_z/\partial n$ is
nonzero on the boundary and it is therefore necessary to treat $\Psi = \partial E_z/\partial n$ as an
additional unknown on the boundary. The contribution of the boundary integral then
plays the same role as the endpoints for one-dimensional problems (see Chapter 3).
As discussed in Section 3.7, since the field values at the boundary nodes are zero, it is
not necessary to test at these nodes. Instead, we set the boundary node fields to zero
whenever they appear in the system. By avoiding testing (or weighting) at the
boundary nodes (or elements), the integrals over $C_s$ do not enter in the construction of the
final system and can thus be neglected altogether. We also remark that the choice of
omitting testing at the boundary nodes is equivalent to setting the weighting
functions to zero when testing at these nodes.
Although the above arguments are sufficient to proceed with the FEM matrix
assembly while neglecting the presence of the boundary integral, they are
nevertheless difficult to visualize without going through some of the details. Therefore, below
we will (for a moment) proceed with the assumption that the boundary integral
contribution is needed. We begin with the discretization of the boundary integral
by introducing the expansion

\[ \Psi(\mathbf{r}) = \sum_{s=1}^{N_s} \sum_{i=1}^{2} \Psi_i^s L_i^s(\mathbf{r}) \tag{4.58} \]

where $N_s$ denotes the number of boundary edges (12, for example, in Fig. 4.8) and $\Psi_i^s$
carries the usual notation. That is, $\Psi_i^s$ refers to the value of $\partial E_z/\partial n$ at the ith local
node of the sth segment of the boundary (see Fig. 4.7). The expansion bases are
linear interpolation functions between the node values of $\Psi(\mathbf{r})$. They are of the same
form as those discussed in Chapter 2 (see Section 2.3.1) and Chapter 3. For a
constant value of x or y, the two-dimensional shape functions reduce to the same
linear one-dimensional expansion functions given in (2.3). Thus, we can write

\[ L_1^s(\mathbf{r}) = \begin{cases} \dfrac{x_2^s - x}{x_2^s - x_1^s}, & \text{on } y = \text{constant boundaries} \\ \dfrac{y_2^s - y}{y_2^s - y_1^s}, & \text{on } x = \text{constant boundaries} \\ 0, & \text{outside the } s\text{th segment} \end{cases} \qquad L_2^s(\mathbf{r}) = 1 - L_1^s(\mathbf{r}), \quad \mathbf{r} \in C_s \tag{4.59} \]

These expansion bases are applicable to piecewise rectangular boundaries, as in Fig.
4.8. In the case of circular boundaries, as in Fig. 4.10, an appropriate set of linear
basis functions is

\[ L_1^s(\mathbf{r}) = \begin{cases} \dfrac{\phi_2^s - \phi}{\phi_2^s - \phi_1^s}, & \phi_1^s \le \phi \le \phi_2^s \\ 0, & \text{otherwise} \end{cases} \qquad L_2^s(\mathbf{r}) = 1 - L_1^s(\mathbf{r}), \quad \mathbf{r} \in C_s \tag{4.60} \]

where $\phi$ is the angular variable ranging from $\phi = 0$ to $\phi = 2\pi$.

To obtain the element equations for the TM excitation, we substitute (4.58)
along with (4.35) into (4.39) and proceed as done for the TE case. The resulting
element system is

\[ [A^e]\{U^e\} + [B^e]\{\Psi^s\} = \{b^e\} \tag{4.61} \]

where the entries of $[A^e]$ and $\{b^e\}$ are given by (4.42) and (4.43). Those of $[B^e]$ are
computed from

\[ B_{ij}^e = p^e \int_{C_s} N_i^e(\mathbf{r})\, L_j^s(\mathbf{r})\, dl \tag{4.62} \]

or

\[ B_{ij}^s = p^e \int_{C_s} L_i^s(\mathbf{r})\, L_j^s(\mathbf{r})\, dl \tag{4.63} \]

(dl = dx or dy for rectangular boundaries) and are associated with the last left hand
side term of (4.39). The latter expression (4.63) for $B_{ij}$ results from the identification that
the sth surface or boundary segment must be an edge of the eth element for a
nonzero value of $B_{ij}$. It is also understood that the boundary matrix $[B^e]$ will be
nonzero only when the eth element is associated with a pair of nodes on the outer
surface/boundary of the computational domain.
The assembly of the elemental equation (4.61) again amounts to summing the
equations from each element weighting at the same global node. For the specific
rectangular waveguide example shown in Fig. 4.8, the resulting assembled system
will be of the form

\[ [A]\{U\} + [B]\{\Psi\} = \{b\} \tag{4.64} \]

The expanded version of this matrix system is

\[ \begin{bmatrix} A_{1,1} & \cdots & A_{1,16} \\ \vdots & \ddots & \vdots \\ A_{16,1} & \cdots & A_{16,16} \end{bmatrix} \begin{Bmatrix} U_1 \\ \vdots \\ U_{16} \end{Bmatrix} + \begin{bmatrix} B_{1,1} & \cdots & B_{1,16} \\ \vdots & \ddots & \vdots \\ B_{16,1} & \cdots & B_{16,16} \end{bmatrix} \begin{Bmatrix} \Psi_1 \\ \vdots \\ \Psi_{16} \end{Bmatrix} = \begin{Bmatrix} b_1 \\ \vdots \\ b_{16} \end{Bmatrix} \]

in which $\Psi_n$ denotes the outward directed normal derivative of $E_z$ at the nth node.
The values of $\Psi$ at the interior nodes are irrelevant because they are associated with
all zero rows, included for the proper addition of the $[A]$ and $[B]$ matrices. It is
understood that $[A]$ is a very sparse matrix, as discussed earlier in the chapter.
We observe that the above system involves 12 nontrivial unknowns from the
$\{\Psi\}$ column plus 16 unknowns from the $\{U\}$ column for a total of 28 unknowns.
Clearly, this number of unknowns is much greater than the available 16 equations.
For a solution for $\{U\}$ and $\{\Psi\}$ we must add 12 more equations or conditions on the
values of $\{U\}$ and $\{\Psi\}$. This is done through the introduction of the Dirichlet
boundary conditions satisfied by $\{U\} = \{E_z\}$ on the boundary nodes. The procedure is
referred to as condensation of boundary conditions and was discussed in Chapter 3.
To see how it can be done in a consistent manner, let us divide the $\{U\}$ column into
two parts, one for the interior nodes and another for the boundary or surface nodes.
That is, we write $\{U\}$ as

\[ \{U\} = \begin{Bmatrix} \{U^I\} \\ \{U^S\} \end{Bmatrix} \]

where in the case of Fig. 4.8, $\{U^I\} = \{U_6, U_7, U_{10}, U_{11}\}^T$ are the interior node fields
and $\{U^S\}$ contains the boundary node variables. Also, we formally define the column
$\{\Psi\}$ as $\{\Psi\} = \{\Psi_1, \Psi_2, \Psi_3, \Psi_4, \Psi_5, \Psi_8, \Psi_9, \Psi_{12}, \Psi_{13}, \Psi_{14}, \Psi_{15}, \Psi_{16}\}^T$ which excludes all
interior node values of $\Psi$.
Using this notation, we can rewrite the system (4.64) as

\[ \begin{bmatrix} [A^{II}] & [A^{IS}] \\ [A^{SI}] & [A^{SS}] \end{bmatrix} \begin{Bmatrix} \{U^I\} \\ \{U^S\} \end{Bmatrix} + \begin{bmatrix} [0] \\ [B] \end{bmatrix} \{\Psi\} = \begin{Bmatrix} \{b^I\} \\ \{b^S\} \end{Bmatrix} \tag{4.65} \]

in which $[A^{II}]$ refers to the submatrix of $[A]$ containing the interactions among the
interior nodes, $[A^{IS}]$ and $[A^{SI}]$ are associated with interactions among exterior and
interior nodes, and $[A^{SS}]$ refers to the interactions among the boundary or outer
surface nodes. Similarly to $\{U^I\}$ and $\{U^S\}$, the excitation subvectors $\{b^I\}$ and $\{b^S\}$ are
associated with the interior and boundary nodes, respectively.
When we invoke the Dirichlet boundary condition

\[ \{U^S\} = \{0\} \]

the system (4.65) can be decomposed into two independent subsystems

\[ [A^{II}]\{U^I\} = \{b^I\} \tag{4.66} \]

and

\[ [A^{SI}]\{U^I\} + [B]\{\Psi\} = \{b^S\} \tag{4.67} \]

The interior node fields are now decoupled from $\{\Psi\}$ and we can therefore proceed to
solve (4.66) without a need to consider the solution of $\{\Psi\}$. Thus, the boundary
integral can be neglected when assembling the FEM system provided U or its normal
derivative are zero on the boundary C. After $\{U^I\}$ is found, we can return to (4.67)
and determine $\{\Psi\}$, if necessary, by solving the system

\[ [B]\{\Psi\} = \{\tilde{b}\} \tag{4.68} \]

where

\[ \{\tilde{b}\} = \{b^S\} - [A^{SI}]\{U^I\} \]
In the case of an eigenvalue problem, the excitation column $\{b\}$ is set to zero
and (4.66) is written as

\[ [K_\nabla^{II}]\{E_z^I\} = \gamma^2 [K^{II}]\{E_z^I\} \tag{4.69} \]

where $[K_\nabla^{II}]$ and $[K^{II}]$ are identical to those in (4.57) except that only the entries
associated with the interior nodes are kept and that $p^e$, $q^e$ are defined for the TM
polarization case. Some values of $\gamma$ for the TM modes, obtained from (4.69), are
given in Tables 4.1 and 4.2 for the rectangular and coaxial waveguides. Also, Fig.
4.12 displays the fields of the lowest order TM mode for each of these empty
waveguides. The analysis can be carried out using the MATLAB program in the appendix.
For the mesh in Fig. 4.8, the corresponding $[A^{II}]$, $[K^{II}]$ and $[K_\nabla^{II}]$ are

\[ [A^{II}] = \begin{bmatrix} -2.9438 & 0.9112 & 2.4112 & 0.4112 \\ 0.9112 & -2.9438 & 0 & 2.4112 \\ 2.4112 & 0 & -2.5326 & 0.9112 \\ 0.4112 & 2.4112 & 0.9112 & -2.5326 \end{bmatrix} \]

\[ [K^{II}] = \begin{bmatrix} 0.0521 & 0.0104 & 0.0104 & 0.0104 \\ 0.0104 & 0.0521 & 0 & 0.0104 \\ 0.0104 & 0 & 0.0625 & 0.0104 \\ 0.0104 & 0.0104 & 0.0104 & 0.0625 \end{bmatrix} \]

\[ [K_\nabla^{II}] = \begin{bmatrix} 5.0000 & -0.5000 & -2.0000 & 0 \\ -0.5000 & 5.0000 & 0 & -2.0000 \\ -2.0000 & 0 & 5.0000 & -0.5000 \\ 0 & -2.0000 & -0.5000 & 5.0000 \end{bmatrix} \]

Figure 4.12 Calculated fields for the lowest TM modes of the rectangular (a/b = 2)
and coaxial ($r_2/r_1 = 4$) waveguides: TM11 mode (rectangular) and TM01 mode (coaxial).
[Courtesy of Reddy et al. [9].]
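Once the boundary rows and columns are condensed out, the TM cutoff problem for the mesh of Fig. 4.8 reduces to a 4 x 4 generalized eigenvalue problem. The sketch below is an illustration under stated assumptions: it takes the problem in the form $[K_\nabla^{II}]\{x\} = \gamma^2 [K^{II}]\{x\}$, uses the unrounded interior matrices for $p^e = q^e = 1$ (entries 10/192, 12/192, and 2/192 in place of 0.0521, 0.0625, and 0.0104), and finds the lowest eigenpair by inverse iteration with a small hand-written Gaussian elimination. It stands in for a library eigensolver and is not the appendix program.

```python
import math

# Interior-node matrices, node order (6, 7, 10, 11).
K_GRAD = [[5.0, -0.5, -2.0, 0.0],
          [-0.5, 5.0, 0.0, -2.0],
          [-2.0, 0.0, 5.0, -0.5],
          [0.0, -2.0, -0.5, 5.0]]
K_MASS = [[v / 192.0 for v in row] for row in
          [[10, 2, 2, 2], [2, 10, 0, 2], [2, 0, 12, 2], [2, 2, 2, 12]]]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def solve(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def lowest_eigenpair(k_grad, k_mass, iterations=200):
    """Inverse iteration: repeatedly solve K_grad y = K_mass x."""
    x = [1.0] * len(k_grad)
    for _ in range(iterations):
        y = solve(k_grad, matvec(k_mass, x))
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    kx, mx = matvec(k_grad, x), matvec(k_mass, x)
    lam = sum(a * b for a, b in zip(x, kx)) / sum(a * b for a, b in zip(x, mx))
    return lam, x


GAMMA_SQ, MODE = lowest_eigenpair(K_GRAD, K_MASS)

if __name__ == "__main__":
    print("gamma^2 =", GAMMA_SQ, " gamma*a =", 1.5 * math.sqrt(GAMMA_SQ))
```

With only four interior unknowns the estimate is necessarily coarse compared with the 400-element result in Table 4.1; the point of the sketch is the structure of the condensed problem, not its accuracy.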

The above TM mode analysis confirms that the boundary integral in the weak form
could be neglected from the start when dealing with metallic domain enclosures. The
final system for the TE and TM analysis is then obtained by enforcing the boundary
conditions on $U(\mathbf{r})$ only. For the TE case, testing is imposed on the boundary and
interior nodes without any specification for the boundary values of $U(\mathbf{r})$. However,
for TM analysis, the boundary values of $U(\mathbf{r})$ are set to zero a priori. Thus testing at
the boundary nodes is avoided and the final TM system is smaller than that
corresponding to the TE case.

4.4 TWO-DIMENSIONAL SCATTERING

As another example application of solving the two-dimensional wave equation, we
consider the scattering by a cylinder shown in Fig. 4.2. We shall assume $H_z$ incidence
as given by (4.6) and, based on the discussion in Section 4.2.2, it is best to work
with the scattered rather than the total field. From (4.11), (4.29), and (4.34), the
pertinent weak form of the wave equation is

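\[ \iint_{\Omega} \left[ -\frac{1}{\epsilon_r}\,\nabla W(\mathbf{r})\cdot\nabla H_z^{scat}(\mathbf{r}) + k_0^2\, \mu_r\, W(\mathbf{r})\, H_z^{scat}(\mathbf{r}) \right] ds + \oint_{C^{outer} + C^{inner}} \frac{1}{\epsilon_r}\, W(\mathbf{r}) \left[ \hat{n}\cdot\nabla H_z^{scat}(\mathbf{r}) \right] dl = \iint_{\Omega} W(\mathbf{r}) f(\mathbf{r})\, ds \tag{4.70} \]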
Fr J(.\302\253w+ J

where

/(r) = -V \342\200\242
\320\224 1 +ja>E0Mi:

is the excitation function and $C^{inner}$, $C^{outer}$ represent the closure boundaries of the
computation domain, as depicted in Fig. 4.13. For $H_z^i = e^{jk_0(x\cos\phi_0 + y\sin\phi_0)}$ and
$M_i = 0$, it follows that

\321\217; D.71)

in the eth element, where we approximated $\epsilon_r$ and $\mu_r$ by their average values $\epsilon_r^e$ and
$\mu_r^e$, respectively, over the eth element. Also, we note the presence of the boundary
integral over $C = C^{outer} + C^{inner}$ and from (4.33) it is necessary that $C$ encloses the
entire computational domain (see Fig. 4.13 for the directions of the normals on each
contour).

To discretize (4.70), we expand the scattered field as

\[ H_z^{scat}(x,y) = \sum_{e=1}^{N_e} \sum_{i=1}^{3} H_{zi}^{e,scat}\, N_i^e(x,y) \tag{4.72} \]

where $H_{zi}^{e,scat}$ denotes the unknown scattered field values at the nodes and $N_i^e(x,y)$ is
given by (4.36). Subsequently, on choosing Galerkin's testing (i.e., $W = N_j^e$), we
obtain the element equations
r -1
f f \320\2231 1
\320\257*' \302\246 + tiVrNiNJ dx
E E VNi
\342\200\224L
e<-
VN> dy
/=l
\342\204\226| J-)S2,L J

i \342\200\242 - \302\246
e
' Vtff\302\260l]dl
\320\233'\320\224\320\275 j '6
N}[n V//f\"]dl

D.73)
,.=i

It has been assumed that the interior domain (dielectric and free space)
bounded by $C^{outer}$ and $C^{inner}$ has been subdivided into triangles, whereas the
contours $C^{outer}$ and $C^{inner}$ have been subdivided into line segments.
Note also that the excitation function $f^e$ will be nonzero only when $\epsilon_r$ and $\mu_r$
are not equal to unity, i.e., only if the element is within the material region $\Omega_d$
illustrated in Fig. 4.13.

4.4.1 Treatment of Metallic Boundaries

In the previous section, we eliminated the boundary integral over the metallic boundary since
$\hat{n}\cdot\nabla H_z^{total}$ was zero on the metal boundary. However, this is not the case here because
$H_z^{scat}$ represents the scattered and not the total magnetic field. Since $H_z = H_z^i + H_z^{scat}$,
it follows that

\[ \hat{n}\cdot\left( \nabla H_z^i + \nabla H_z^{scat} \right) = \hat{n}\cdot\nabla H_z = 0, \quad \mathbf{r} \in C^{inner} \tag{4.74} \]

and thus

\[ \hat{n}\cdot\nabla H_z^{scat} = -\hat{n}\cdot\nabla H_z^i, \quad \mathbf{r} \in C^{inner} \tag{4.75} \]

which is an inhomogeneous Neumann boundary condition, where $\hat{n}\cdot\nabla H_z^i$ is a
known quantity everywhere on $C^{inner}$. An alternative form of (4.75) can be obtained
by noting that

Figure 4.13 Illustration of the contours $C^{inner}$ and $C^{outer}$ as well as boundary
vectors for the scattering geometry.

(see also A.76)) or

where EUin refers to the total tangential electric field on cmn\". Similarly, for the

scattered field

\".? D.77,
Z,, \320\262\320\263

and since Ewn = /


\302\246
(E' 4- Escal) = 0 on conducting surfaces, it follows that

- VtfP\" n =
\302\246
E'ttn r e Cinntir D.7h
+\320\244

with $E_{\tan}^i = Z_0\,\hat t\cdot(\hat x\sin\phi_0 - \hat y\cos\phi_0)\,e^{jk_0(x\cos\phi_0 + y\sin\phi_0)}$. Thus, the boundary integral over $C^{\text{inner}}$ can be moved to the right-hand side of (4.73) to be included as part of the excitation column. This type of detail in treating a boundary condition (or constraint) demonstrates how a knowledge of the field over a boundary becomes an equivalent source excitation. It was discussed in Section 3.7 for one-dimensional applications and is referred to as condensation of boundary conditions.
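As an illustration of how this known boundary excitation could be tabulated, the sketch below evaluates $E_{\tan}^i$ at the midpoints of equal arcs of a circular contour. It assumes a unit-amplitude incident field $H_z^i = e^{jk_0(x\cos\phi_0 + y\sin\phi_0)}$ and a counterclockwise unit tangent; the function names and the circular contour are hypothetical choices made only for this example.

```python
import cmath
import math

Z0 = 376.730313668  # free-space wave impedance in ohms


def h_inc(x, y, k0, phi0):
    """Assumed unit-amplitude incident field H_z^i."""
    return cmath.exp(1j * k0 * (x * math.cos(phi0) + y * math.sin(phi0)))


def e_inc(x, y, k0, phi0):
    """Incident (Ex, Ey) = Z0 (x^ sin(phi0) - y^ cos(phi0)) H_z^i."""
    h = h_inc(x, y, k0, phi0)
    return Z0 * math.sin(phi0) * h, -Z0 * math.cos(phi0) * h


def e_tan_on_circle(radius, n_seg, k0, phi0):
    """E_tan^i at the midpoints of n_seg equal arcs of a circle.

    Assumption: the unit tangent runs counterclockwise,
    t^ = (-sin(phi), cos(phi)); flip the sign for the other orientation.
    """
    values = []
    for s in range(n_seg):
        phi = 2.0 * math.pi * (s + 0.5) / n_seg
        x, y = radius * math.cos(phi), radius * math.sin(phi)
        ex, ey = e_inc(x, y, k0, phi0)
        values.append(-math.sin(phi) * ex + math.cos(phi) * ey)
    return values
```

Each returned value would multiply the testing function on the corresponding boundary segment when the excitation column is filled.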

4.4.2 Absorbing Boundary Conditions

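Before casting (4.73) onto a matrix system, we must also consider the boundary conditions on $C^{\text{outer}}$. So far, no information has been given with regard to the boundary condition that must be satisfied by $H_z^{\text{scat}}$ on $C^{\text{outer}}$. For scattering problems, the field continues to propagate to infinity and at large distances from the two-dimensional scatterer it has the form $A e^{-jk_0 r}/\sqrt{r}$, where $r$ denotes the radial distance from the origin and $A$ is a constant. Thus, since

$$\frac{\partial}{\partial r}\left(\frac{e^{-jk_0 r}}{\sqrt r}\right) = -\left(jk_0 + \frac{1}{2r}\right)\frac{e^{-jk_0 r}}{\sqrt r}$$

$H_z^{\text{scat}}$ satisfies the condition

$$\frac{\partial H_z^{\text{scat}}}{\partial r} + \left(jk_0 + \frac{1}{2r}\right)H_z^{\text{scat}} = 0, \quad r \to \infty \tag{4.79}$$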
This is the well-known first-order Bayliss-Turkel [12] absorbing boundary condition (ABC) and, by its nature, it can only be enforced on circular boundaries such as that shown in Fig. 4.14. In this case, $\hat r$ coincides with the normal $\hat n$, and we then rewrite (4.79) as

$$\frac{\partial H_z^{\text{scat}}}{\partial n} + \left(jk_0 + \frac{1}{2r_0}\right)H_z^{\text{scat}} = 0 \tag{4.80}$$

where $r_0$ is the radius of the circular boundary $C^{\text{outer}}$. We observe that (4.80) is identical in form to the impedance boundary condition given by (3.42) and its treatment in the FEM solution is therefore straightforward.
Figure 4.14 Illustration of a circular ABC surface and how to measure its effectiveness. (Labels: nonreflecting or ABC surface; incident plane wave; reflected wave, to be suppressed.)

The purpose of the ABC is to create a boundary which does not perturb the field that is incident upon it; in effect, it simulates a surface which is actually not there. A measure of how well an ABC simulates a nonreflecting boundary can be obtained by examining the reflected field due to a plane wave impinging upon the ABC surface (see Fig. 4.14). For the Bayliss-Turkel ABC (4.80), it will be seen that its approximation of a nonreflecting surface may be poor unless it is placed far (1 to 2 wavelengths) from the scatterer. The latter is highly undesirable because the computational domain is enlarged substantially, leading to unnecessarily large matrix systems. In the 1980s (see Senior and Volakis [13] for a review of ABCs) much work was carried out aimed at deriving ABCs which provide a better simulation of nonreflecting surfaces even when placed at a fraction of a wavelength from the scatterer [12], [14], [15], [16]. These improved ABCs are associated with higher order tangential derivatives. For example, (4.79) is referred to as a first-order ABC because it involves a single derivative (with respect to the tangent) of the field. The general form of the second-order ABC is

$$\frac{\partial H_z^{\text{scat}}}{\partial n} = \alpha H_z^{\text{scat}} + \beta\frac{\partial^2 H_z^{\text{scat}}}{\partial t^2} \tag{4.81}$$

where $t$ and $n$ are shown in Fig. 4.14. For the second-order Bayliss-Turkel ABC [12], the coefficients $\alpha$ and $\beta$ are given by

$$\alpha = -jk_0 - \frac{1}{2r_0} + \frac{1}{8r_0(1 + jk_0r_0)}, \qquad \beta = \frac{1}{2(jk_0 + 1/r_0)} \tag{4.82}$$

where again $r_0$ is interpreted as the radius of the boundary contour $C^{\text{outer}}$. Based on their derivation, $C^{\text{outer}}$ must actually be circular for the application of the second-order Bayliss-Turkel ABC. However, it can be specialized to the case of planar boundaries by letting $r_0 \to \infty$. On constant $x$ boundaries, as in Fig. 4.15, (4.81) becomes

$$\frac{\partial H_z^{\text{scat}}}{\partial x} = \alpha H_z^{\text{scat}} + \beta\frac{\partial^2 H_z^{\text{scat}}}{\partial y^2} \tag{4.83}$$

with

$$\alpha = -jk_0, \qquad \beta = \frac{1}{2jk_0}$$

and a similar expression is obtained on constant $y$ boundaries. This simplified second-order ABC (4.83) is the Engquist and Majda [14] ABC and is extensively used in connection with Finite Difference-Time Domain solutions.
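The reflection test described above can be sketched numerically for the planar limit. The example assumes an ABC of the form $\partial u/\partial n = \alpha u + \beta\,\partial^2 u/\partial t^2$ on the plane $x = 0$ (outward normal $+\hat x$), an $e^{j\omega t}$ time convention, and an incident wave $e^{-jk_0(x\cos\theta + y\sin\theta)}$; substituting incident plus reflected waves gives $R = (\gamma + jk_0\cos\theta)/(jk_0\cos\theta - \gamma)$ with $\gamma = \alpha - \beta k_0^2\sin^2\theta$. The function name and the sample coefficient sets are illustrative only.

```python
import math


def abc_reflection(theta_deg, alpha, beta, k0):
    """Plane-wave reflection coefficient of the planar ABC
    du/dn = alpha*u + beta*d2u/dt2 (assumed form, e^{jwt} convention)."""
    theta = math.radians(theta_deg)
    gamma = alpha - beta * (k0 * math.sin(theta)) ** 2
    jkc = 1j * k0 * math.cos(theta)
    return (gamma + jkc) / (jkc - gamma)


k0 = 2.0 * math.pi  # unit wavelength
first_order = dict(alpha=-1j * k0, beta=0.0)
second_order = dict(alpha=-1j * k0, beta=-1j / (2.0 * k0))

for theta in (0, 30, 60, 80):
    r1 = abs(abc_reflection(theta, k0=k0, **first_order))
    r2 = abs(abc_reflection(theta, k0=k0, **second_order))
    print(f"{theta:2d} deg  |R| first order = {r1:.4f}  second order = {r2:.4f}")
```

Under these assumptions the first-order condition gives $R = (\cos\theta - 1)/(\cos\theta + 1)$ and the second-order one gives $R = -[(1 - \cos\theta)/(1 + \cos\theta)]^2$, so both are reflectionless only at normal incidence and the second-order condition degrades more slowly with angle.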

Figure 4.15 Piecewise planar boundary for mesh truncation.

Regardless of the type of ABC being used to simulate a nonreflecting surface at $C^{\text{outer}}$, mathematically the ABC replaces the normal field derivatives with tangential ones. In contrast to normal derivatives, tangential derivatives can be carried out using the approximate field expansion since there is no requirement for information outside the computational domain. Specifically, we have

$$-\oint_{C^{\text{outer}}}\frac{1}{\epsilon_r}N_j^e\,[\hat n\cdot\nabla H_z^{\text{scat}}]\,dt = -\oint_{C^{\text{outer}}}\frac{1}{\epsilon_r}N_j^e\left(\alpha H_z^{\text{scat}} + \beta\frac{\partial^2 H_z^{\text{scat}}}{\partial t^2}\right)dt$$

$$= -\oint_{C^{\text{outer}}}\frac{\alpha}{\epsilon_r}N_j^e H_z^{\text{scat}}\,dt + \oint_{C^{\text{outer}}}\frac{\beta}{\epsilon_r}\frac{\partial N_j^e}{\partial t}\frac{\partial H_z^{\text{scat}}}{\partial t}\,dt \tag{4.84}$$

where we used integration by parts to transfer one of the derivatives from the field to the testing function, as was done in obtaining the weak form of the wave equation. To proceed with the discretization of (4.84) we must introduce an expansion for the field on the boundary $C^{\text{outer}}$. Choosing the linear expansion basis (4.58) or (4.59), we have

[expansion of the boundary field in the linear basis functions]

and upon substitution into (4.84) we obtain

[Equation (4.85)]

As noted before, $L_i^{s_1}(\mathbf r) = N_i^e(\mathbf r)$ when $\mathbf r \in C^{\text{outer}}$ and $s_1$ is an edge on $C^{\text{outer}}$ belonging to the $e$th element, as depicted in Fig. 4.7.
With the evaluation of the boundary integral over $C^{\text{inner}}$ as given by (4.78) and the discretization of the other over $C^{\text{outer}}$ as given by (4.85), we are ready to cast (4.73) into a linear system of equations. The element equations are


[Equation (4.86): element equations in matrix form]

The entries of the $[A^e]$ matrix are again given by [see (4.46)]

$$A_{ij}^e = -\frac{1}{4\Omega^e\epsilon_r}\left(b_i^e b_j^e + c_i^e c_j^e\right) + k_0^2\mu_r\frac{\Omega^e}{12}\left(1 + \delta_{ij}\right) \tag{4.87}$$

and

[Equation (4.88): boundary matrix entries on $C^{\text{outer}}$]

in which the angular cylindrical variable $\phi$ has been used in the integral to emphasize that the employed ABC is only applicable to circular enclosures $C^{\text{outer}}$ (note: $dt = r_0\,d\phi$).
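A minimal sketch of how a triangle's contribution to $[A^e]$ could be computed is given below. It assumes the standard closed forms for linear triangles, $\iint\nabla N_i\cdot\nabla N_j\,dx\,dy = (b_ib_j + c_ic_j)/(4\Omega^e)$ and $\iint N_iN_j\,dx\,dy = \Omega^e(1+\delta_{ij})/12$, combined with a negative sign on the gradient term and a positive sign on the $k_0^2\mu_r$ term. That sign convention, the function name, and the vertex ordering are assumptions of this example rather than something taken verbatim from the text.

```python
def element_matrix(xy, eps_r, mu_r, k0):
    """3x3 element matrix for a linear triangle.

    xy is [(x1, y1), (x2, y2), (x3, y3)], listed counterclockwise.
    Assumed entries: -(b_i b_j + c_i c_j)/(4*area*eps_r)
                     + k0^2 * mu_r * area/12 * (1 + delta_ij)
    """
    (x1, y1), (x2, y2), (x3, y3) = xy
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    area = 0.5 * (b[0] * c[1] - b[1] * c[0])
    if area <= 0.0:
        raise ValueError("vertices must be listed counterclockwise")
    a = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            stiff = (b[i] * b[j] + c[i] * c[j]) / (4.0 * area * eps_r)
            mass = k0 ** 2 * mu_r * area / 12.0 * (1.0 + (i == j))
            a[i][j] = -stiff + mass
    return a


# Usage: unit right triangle in free space at k0 = 1
A = element_matrix([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], 1.0, 1.0, 1.0)
```

Two quick sanity checks follow from the closed forms: with $k_0 = 0$ every row sums to zero, and the sum of all mass entries equals $k_0^2\mu_r\Omega^e$.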

The excitation column entries consist of two components: one from the excitation function $f(\mathbf r)$ and another from the boundary condition on $C^{\text{inner}}$. Specifically [again, these entries are not related to the $b_i^e$ in (4.87)],

[Equation (4.89)]

where $s_2$ is a segment on $C^{\text{inner}}$ belonging to the $e$th element. Substituting for $f^e$ and $E_{\tan}^i$, a more explicit expression for $b_j^e$ is

[Equation (4.90)]

which can be evaluated in closed form for each of the line segments and triangles.
The assembly of the elemental equations (4.86) is carried out in the same manner as done for the TM waveguide mode analysis. The resulting global system will be of the form

[Equation (4.91)]

where we used the superscripts "interior" and "boundary" to indicate the separation of the node field column as done in (4.65).
This system is similar in all respects to (4.65) for the TM mode analysis. However, there is a major difference in that $\{\Psi\}$ has now been replaced with the field itself on the boundary. Thus, the convenient decomposition to a pair of smaller systems is no longer possible, nor is it needed. Since the ABC permitted the elimination of $\{\Psi\}$, no additional equations are required for the solution of $\{H_z^{\text{scat}}\}$. Addition of the two left hand matrices gives

[Equation (4.92)]

where the entire matrix is sparse and the system can be solved using an iterative solver (see Chapter 9).

4.4.3 Scattered Field Computation

Once $H_z^{\text{scat}}$ or $E_z^{\text{scat}}$ is computed from a solution of (4.92), the scattered field outside the computational domain and echowidth of the scatterer are two observables of interest. The echowidth is obtained from the far zone scattered field using the relation

$$\sigma = \lim_{r\to\infty} 2\pi r\,\frac{|U^{\text{scat}}|^2}{|U^i|^2} \tag{4.93}$$

with $U$ standing for $E_z$ or $H_z$.
The field outside the computational domain is obtained by application of the surface equivalence principle or Kirchhoff's integral equation. In either case, the pertinent expression is (see Chapter 1, Section 1.4)

$$U^{\text{scat}}(\mathbf r) = -\oint_{C_K}\left\{[\hat n'\cdot\nabla'U(\mathbf r')]\,G_{2D}(\mathbf r,\mathbf r') - [\hat n'\cdot\nabla'G_{2D}(\mathbf r,\mathbf r')]\,U(\mathbf r')\right\}dl' \tag{4.94}$$

where $U$ is equal to $E_z$ or $H_z$, depending on the excitation. The contour $C_K$ can be of any shape or form and can be located at any distance from the scatterer. Also, $G_{2D}(\mathbf r,\mathbf r')$ is the two-dimensional Green's function

$$G_{2D}(\mathbf r,\mathbf r') = -\frac{j}{4}H_0^{(2)}(k_0|\mathbf r-\mathbf r'|) \tag{4.95}$$

where $H_0^{(2)}$ denotes the zeroth-order Hankel function of the second kind. This Green's function can be interpreted as the field generated by a line source at $\mathbf r'$ since it satisfies the differential equation

$$\nabla^2 G_{2D} + k_0^2 G_{2D} = -\delta(\mathbf r-\mathbf r') \tag{4.96}$$

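The scattered field representation (4.94) is identical to that given in Chapter 1. Also, (4.94) can be related to the radiated fields due to equivalent current sources. For example, assuming $U = E_z$, from Maxwell's equations [see (1.76)]

$$\mathbf J = \hat n\times\mathbf H = -\frac{j}{k_0Z_0}\,\hat n\times[\hat z\times\nabla E_z] \tag{4.97}$$

and since $\nabla E_z = \hat n(\partial E_z/\partial n) + \hat t(\partial E_z/\partial t)$, it follows that ($\hat n\times\hat t = \hat z$)

$$\mathbf J = -\hat z\,\frac{j}{k_0Z_0}\frac{\partial E_z}{\partial n} = \hat z J_z \tag{4.98}$$

Also, the equivalent magnetic current is given by

$$\mathbf M = \mathbf E\times\hat n = \hat t E_z = \hat t M_t \tag{4.99}$$

and therefore (4.94) can be rewritten as

$$E_z^{\text{scat}}(\mathbf r) = \oint_{C_K}\left\{-jk_0Z_0J_z(\mathbf r')\,G_{2D}(\mathbf r,\mathbf r') + M_t\,[\hat n'\cdot\nabla'G_{2D}(\mathbf r,\mathbf r')]\right\}dl' \tag{4.100}$$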

Although $C_K$ can be arbitrary, it is best to choose it so that the integrated fields are most accurate. For purely metallic scatterers, the computed field and its derivative are most accurate near the metallic surface. Thus, it is appropriate to choose $C_K$ to be near to or coincide with the metallic surface of the scatterer. For TM incidence, when $C_K$ is coincident with the metallic surface, $U(\mathbf r) = 0$ for $\mathbf r \in C_K$, and thus

$$E_z^{\text{scat}}(\mathbf r) = \frac{j}{4}\oint_{C_K}[\hat n'\cdot\nabla'E_z(\mathbf r')]\,H_0^{(2)}(k_0|\mathbf r-\mathbf r'|)\,dl' \tag{4.101}$$

Likewise, for TE incidence, $\hat n\cdot\nabla U(\mathbf r) = 0$ on $C_K$ and (4.94) reduces to

$$H_z^{\text{scat}}(\mathbf r) = -\frac{j}{4}\oint_{C_K}H_z(\mathbf r')\,[\hat n'\cdot\nabla'H_0^{(2)}(k_0|\mathbf r-\mathbf r'|)]\,dl' \tag{4.102}$$

However, when dealing with coated metallic or purely dielectric scatterers, (4.101) and (4.102) cannot be used. In this case, the contour $C_K$ is placed above the outer surface of the dielectric (see Fig. 4.16) and the scattered field must be computed from (4.100) or its dual.

To evaluate the scattered field using (4.94) or (4.100), we proceed by first discretizing $C_K$ into $N_K$ segments each less than 1/10 of a wavelength or as dictated by the original FEM tessellation of the domain. For simplicity, let us choose $C_K$ to be a circle of radius $|\mathbf r'| = r_K$. Then

[Equation (4.103)]

with

[Equations (4.104) and (4.105), in which $\Psi(\mathbf r) = \hat n\cdot\nabla U(\mathbf r)$]

and in (4.105) we used the simplifications²

[Equation (4.106)]

in which $H_1^{(2)}$ denotes the first-order Hankel function of the second kind. Also, $\phi'$ refers to the angle between $\mathbf r'$ and the $x$-axis, as shown in Fig. 4.16.
For far zone computations (i.e., $r \to \infty$), we can simplify the above integrands by introducing the large argument approximations of the Hankel functions. Specifically,

$$H_0^{(2)}(k_0R) \approx \sqrt{\frac{2}{\pi k_0R}}\,e^{-j(k_0R-\pi/4)}, \qquad H_1^{(2)}(k_0R) \approx j\sqrt{\frac{2}{\pi k_0R}}\,e^{-j(k_0R-\pi/4)} \tag{4.107}$$

where $R = |\mathbf r-\mathbf r'|$, and we approximate the radical as

$$\sqrt{r^2 + r_K^2 - 2rr_K\cos(\phi-\phi')} \approx \begin{cases} r & \text{for amplitude terms} \\ r - r_K\cos(\phi-\phi') & \text{for phase terms} \end{cases} \tag{4.108}$$

Figure 4.16 Illustration of the integration contour $C_K$ for computing the scattered field.

²Note: $\partial G_{2D}/\partial n = \hat n\cdot\nabla G_{2D} = -\frac{j}{4}(\hat n\cdot\nabla R)\,\frac{d}{dR}H_0^{(2)}(k_0R) = \frac{jk_0}{4}(\hat n\cdot\hat R)\,H_1^{(2)}(k_0R)$, where $\nabla R = \hat R$.
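To get a feel for when the phase approximation of the radical is adequate, the sketch below compares the exact distance $\sqrt{r^2 + r_K^2 - 2rr_K\cos(\phi-\phi')}$ with $r - r_K\cos(\phi-\phi')$ and reports the worst phase error in radians. The function names and sample values are illustrative; the leading neglected term is $r_K^2\sin^2(\phi-\phi')/(2r)$, so the phase error is bounded by roughly $k_0r_K^2/(2r)$.

```python
import math


def exact_distance(r, r_k, dphi):
    """Distance between an observer at radius r and a source at radius r_k."""
    return math.sqrt(r * r + r_k * r_k - 2.0 * r * r_k * math.cos(dphi))


def phase_distance(r, r_k, dphi):
    """Far-zone approximation used in phase terms."""
    return r - r_k * math.cos(dphi)


def worst_phase_error(r, r_k, k0, samples=360):
    """Largest phase error k0*|exact - approximate| over all angles."""
    worst = 0.0
    for n in range(samples):
        dphi = 2.0 * math.pi * n / samples
        err = abs(exact_distance(r, r_k, dphi) - phase_distance(r, r_k, dphi))
        worst = max(worst, k0 * err)
    return worst


# Usage: contour radius of one wavelength, observer 1000 wavelengths away
print(worst_phase_error(r=1000.0, r_k=1.0, k0=2.0 * math.pi))
```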
Substituting (4.107) and (4.108) into (4.104) and (4.105) and discretizing the integral yields the far zone approximations

[Equations (4.109) and (4.110)]

In these, $\Delta\phi = 2\pi/N_K$ denotes the angular extent of each discrete arc segment and $\phi_n$ is the value of $\phi$ at the midpoint of the $n$th element. For this case, $U_n$ and $\Psi_n$ refer to the average value of the field and its normal derivative at the midpoint of the $n$th segment.
From (4.93), the echowidth can be computed from

[Equation (4.111)]

Another expression for the far zone field using equivalent surface electric and magnetic currents is

[Equation (4.112)]

for TM incidence and

[Equation (4.113)]

for TE incidence. These are obtained from (4.100) once (4.107) and (4.108) are used. For this case $\mathbf M = \hat t E_z$ and $\mathbf J = -\hat z\,j\Psi/(k_0Z_0)$.

4.4.4 Scattering Example Using ABCs

A classic problem is that of scattering from a circular cylinder. As an example application of the finite element method let us consider the scattering by the dielectrically coated circular cylinder shown in Fig. 4.17. The metallic cylinder is coated with a $0.05\lambda$ dielectric layer having ($\epsilon_r = 4$, $\mu_r = 1$). As a first step, our goal is to determine a suitable ABC radius $r_0$ leading to a sufficiently accurate solution. This specific example was considered by Peterson and Castillo [17] and was used to assess the accuracy of the second-order ABC (4.81) by comparison with the exact eigenfunction solution [18]. In Fig. 4.18 we show the near zone field $E_z^{\text{scat}}$ due to a plane TM incidence for different values of the ABC radius $r_0$. As expected, the computed $E_z^{\text{scat}}$ field is quite inaccurate when $r_0 = 0.3\lambda_0$ since the ABC boundary is then coincident with the outer boundary of the scattering geometry. By its derivation, the second-order ABC is valid for large $r_0$ since it neglects [13] terms beyond $O(r^{-9/2})$ as well as nonradial waves. Clearly, the choice of $r_0 = 0.3\lambda_0$ violates this assumption. However, the accuracy of the solution improves substantially when $r_0$ is increased to $0.4\lambda_0$, and continues to improve as $r_0$ is increased. Typically, it may be necessary to increase $r_0$ as much as $2\lambda_0$ to obtain very accurate results. This is especially true for the TE incidence where nonspecular fields caused by traveling and surface waves are of importance.

Figure 4.17 Geometry of the coated circular cylinder and illustration of the ABC boundary. (Labels: metallic cylinder radius $a = 0.25\lambda$; coating outer radius $0.3\lambda_0$ with $\epsilon_r = 4$; ABC boundary at $r_0 = 0.3\lambda_0$, $0.4\lambda_0$, $0.5\lambda_0$, $0.6\lambda_0$.)

Figure 4.18 Finite element solution of the near-zone TM incidence scattered field measured at $r = 0.275\lambda_0$. The geometry is shown in Fig. 4.17 and the four curves refer to the radii at which the ABC was placed ($r_0 = 0.3\lambda_0$, $0.4\lambda_0$, $0.5\lambda_0$, and $0.6\lambda_0$). (Plot title: PEC backed dielectric ring (TM); horizontal axis: angle in degrees in the $\phi$ direction; curves: eigensolution and the four ABC radii.) [After Peterson and Castillo [17], © IEEE, 1989.]
Another example application of the finite element method with ABCs for a noncircular cylinder is shown in Fig. 4.19. The geometry is a metallic triangular cylinder whose maximum length is $1.7071\lambda_0$. The second-order Bayliss-Turkel ABC was used again for mesh truncation with the ABC boundary being a circle of radius $r_0 = 2\lambda_0$ centered at the origin. The resulting mesh consisted of 1599 nodes and 2952 triangular elements [17]. For a TM plane wave impinging from the negative $x$-axis, the echowidth is shown in Fig. 4.19 as a function of the angle $\phi$ measured from the $x$-axis. For this bistatic pattern, $\phi = 0°$ corresponds to the direction of forward scattering and $\phi = 180°$ refers to the backscatter direction (because of symmetry, only the pattern from $\phi = 0°$ to $\phi = 180°$ is shown). Two peaks (lobes) are characteristic of this pattern, one occurring at $\phi = 0°$ and the other at $\phi = 60°$. The latter is due to the specular scattering from the upper (nonvertical) side of the triangular cylinder and the $\phi = 0°$ peak is due to the coherent addition of the diffracted fields from the two back edges of the triangle. The echowidth is the lowest (20 dB below the peak value) along the backscatter direction where the contributions are due to diffraction from the three visible edges of the triangle. Because of the lower level of these diffraction contributions and their nonspherical wave character, the simulation is least accurate there. Higher accuracy may be achieved by placing the ABC surface further from the scatterer. This would, of course, result in higher computational demands. Eventually, numerical errors caused by the larger mesh would limit the accuracy of the solution unless the machine precision is increased.

Figure 4.19 Geometry and TM incidence bistatic echowidth for a triangular metallic cylinder. The ABC was placed on a circle of radius $2\lambda_0$ and centered at the origin. The solid line gives the reference data based on the rigorous integral equation solution and the circles correspond to data from the finite element-ABC method (labeled as VDE). (Horizontal axis: $\phi$ in degrees.) [After Peterson and Castillo [17], © IEEE, 1989.]

4.4.5 Artificial Absorbers for Mesh Truncation

In the above section we introduced ABCs for finite element mesh truncation (this approach is referred to as the finite element-ABC method). As discussed, the purpose of the ABC is to prevent wave reflections from entering back into the computational domain. That is, the ABC serves as an absorber of the scattered waves (see Fig. 4.14) and therefore a true absorber can instead be used to absorb these waves. Indeed, material absorbers are used in anechoic chambers to eliminate wall reflections and simulate an otherwise open space, free of nearby obstacles which can interfere with antenna or scattering measurements. In practice, material absorbers are shaped as cones and are loaded with carbon or other particles to simulate a lossy environment (see, for example, Fig. 4.20). The tapered shape provides a better impedance matching, whereas the loss in the material causes absorption of the entering waves.
In a finite element analysis, we can also use material absorbers for mesh truncation and this approach will be referred to as the finite element-artificial absorber (FE-AA) method. For numerical simulations, it is not necessary to make use of material parameters or profiles which are physical. Instead, we can use any fictitious (i.e., artificial) material profile and employ it for mesh truncation, as illustrated in Fig. 4.21. The shown absorber can be curved, if necessary, to minimize the computational volume, depending on the scatterer's or radiator's shape.
Figure 4.20 Practical configuration of a material absorber. (Labels: metal backing; very small reflection.)

Figure 4.21 Use of a homogeneous absorber for mesh truncation. (Labels: absorber material parameter $1 - j2.7$; thickness $t = 0.15\lambda_0$; metal backing; free space $(\epsilon_0, \mu_0)$.) [The shown parameters are based on a design recommended by Özdemir and Volakis [19].]

Not being restricted by the material choices, an optimizer such as the simplex method can be used to determine the absorber's permittivity, permeability, and thickness $t$ to minimize reflections over all visible incidence angles. This approach was used by Özdemir and Volakis [19] to obtain the parameters given in Fig. 4.21. A homogeneous absorber cross section was assumed in this design for simplicity. However, improved artificial absorbing layers have been recently developed on the assumption of inhomogeneity and even anisotropy in the material. In the latter case, it is possible to theoretically design layers which exhibit no reflection at the dielectric interface over all incidence angles. These are referred to as perfectly matched layers (PMLs) and their material parameters are given by Sacks et al. [20]

$$\bar{\bar\epsilon}_r = \bar{\bar\mu}_r = \begin{bmatrix} c & 0 & 0 \\ 0 & c & 0 \\ 0 & 0 & 1/c \end{bmatrix}$$

in which $c = \alpha - j\beta$ is a constant. A reasonable choice for $c$ is $1 - j1$, but other choices can be employed depending on the thickness of the layer and the discretization in the numerical implementation. A study of the PML parameters for optimal absorption is given by Legault et al. [21].
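The angular behavior of a metal-backed homogeneous absorber can be estimated with a transmission-line model, as in the sketch below. It assumes a plane wave with the electric field parallel to the interface, an $e^{j\omega t}$ convention (so loss appears as a negative imaginary part), and a layer characterized by relative parameters `eps_r` and `mu_r`; the function name and the choice $\epsilon_r = \mu_r = 1 - j2.7$ with $t = 0.15\lambda_0$ are illustrative readings of the design quoted above, not a statement of how that design was obtained.

```python
import cmath
import math


def absorber_reflection(theta_deg, eps_r, mu_r, thickness_wl):
    """Reflection coefficient of a metal-backed homogeneous layer.

    Assumptions: electric field parallel to the interface, e^{jwt}
    convention, thickness given in free-space wavelengths.
    """
    theta = math.radians(theta_deg)
    k0 = 2.0 * math.pi  # free-space wavenumber for unit wavelength
    kz1 = k0 * cmath.sqrt(eps_r * mu_r - math.sin(theta) ** 2)
    # Input impedance of the shorted layer, normalized to Z0
    z_in = 1j * mu_r * k0 / kz1 * cmath.tan(kz1 * thickness_wl)
    # The normalized wave impedance of free space is 1/cos(theta)
    cos_t = math.cos(theta)
    return (z_in * cos_t - 1.0) / (z_in * cos_t + 1.0)


c = 1.0 - 2.7j
for theta in (0, 30, 60, 85):
    r = absorber_reflection(theta, c, c, 0.15)
    print(f"{theta:2d} deg  |R| = {abs(r):.4f}")
```

At normal incidence the layer is impedance matched to free space, so the only reflection is the wave that travels to the metal backing and returns, attenuated by $e^{-2 \cdot 2.7\,k_0 t}$; the reflection grows toward grazing incidence.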

An example of using the artificial absorber for mesh truncation is shown in Fig. 4.22. This configuration is a rectangular groove situated in an otherwise flat metallic plane (ground plane). We are interested in computing the scattered field due to a TE plane wave excitation. For our formulation, the excitation is simulated by setting

$E_{\tan}^{\text{scat}} = 0$ on the absorber metal backing
$E_{\tan}^{\text{scat}} = -E_{\tan}^{i}$ on cavity and ground plane metallic surfaces

and the resulting matrix system is identical to that in (4.91) except that $[B]$ is set to zero and

[expression for the excitation column entries]

in which $f(\mathbf r)$ is the source function given in (4.11) with $M_{iz} = 0$. Upon completion of the solution, the scattered field is obtained by integrating equivalent currents over the cavity's aperture using (4.100) or (4.112).
To compute the scattered field we invoke surface equivalence theory. A convenient surface on which to place the equivalent (electric and magnetic) currents is that coincident with the ground plane ($y = 0$). Since we are interested in the fields in the $y > 0$ region, we can arbitrarily set those below the aperture to zero for the application of the surface equivalence principle. The surface magnetic currents are then given by

$$\mathbf M = \begin{cases} \mathbf E\times\hat y & 0 < x < w,\ y = 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.114}$$

(see Fig. 4.23) and $\mathbf J = \hat n\times\mathbf H = \hat y\times\mathbf H$ everywhere on the $y = 0$ plane. However, because of the ground plane's presence, image theory can be invoked. This leads to the introduction of new equivalent currents which provide us with the same radiated field as the original ones which radiated in the presence of the ground plane. The new equivalent currents are

$$\mathbf M_{\text{equ}} = \begin{cases} 2\mathbf E\times\hat n = 2\mathbf E\times\hat y & 0 < x < w,\ y = 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.115}$$

and $\mathbf J = 0$, and these radiate in free space. The fact that the electric current vanishes is a substantial simplification since the integration of the radiation integral is limited over the aperture of the groove. Specifically, from (4.113)
[Equation (4.116)]

and from the finite element solution, the magnetic current is given by

[Equation (4.117): $\mathbf M_{\text{equ}} = 2(\mathbf E\times\hat y)$ expressed through the incident field and the nodal values $H_i^e$]

Figure 4.22 Illustration of a groove recessed in a perfectly conducting ground plane. The artificial absorber in Fig. 4.21 is used for mesh truncation.

Figure 4.23 Illustration of the surface equivalence principle for computing the scattered fields from a groove: (a) original geometry; (b) setup for surface equivalence; (c) equivalent currents after application of image theory.

and we should emphasize that $H_i^e$ refers to the scattered field at the $i$th node of the $e$th element. The latter sum is over the three nodes of the element bordering the aperture at the computation point for $\mathbf M_{\text{equ}}$. From (4.93), the corresponding echowidth is

[Equation (4.118)]

Bistatic echowidth calculations for the rectangular groove depicted in Fig. 4.22 are given in Fig. 4.24. The curve corresponds to a groove width of $w = 2.5\lambda_0$ and a depth of $d = 0.2\lambda_0$. The incident plane wave was incoming at an angle of 70° from the face of the ground plane and the absorber was placed $0.15\lambda_0$ from the top of the groove. As seen, the echowidth computations using the FE-AA method are in good agreement with those based on the rigorous FE-BI method discussed next (see [22], [23]). However, care must be taken when using artificial absorbers for mesh truncation. The accuracy of the results is not assured, and this is more so for the near zone fields. Also, the convergence rate of the iterative solver may deteriorate for certain absorber parameters.
Figure 4.24 Bistatic echowidth (TE plane wave incidence at an angle of $\phi_0 = 70°$ from the face of the ground plane) for a $2.5\lambda_0$ wide and $0.2\lambda_0$ deep groove recessed in a ground plane, as illustrated in Fig. 4.22. Comparison of the finite element-artificial absorber and finite element-boundary integral [22] methods. The absorber was placed $0.15\lambda_0$ from the groove's aperture. (Horizontal axis: observation angle in degrees.) [Courtesy of S. Bindiganavale.]

4.4.6 Boundary Integral Mesh Truncation

Absorbing boundary conditions provide an approximation of the relationship between the normal and tangential field derivatives on the mesh truncation boundary $C^{\text{outer}}$ (see Fig. 4.13). Instead, we could use the exact integral equation (4.94)

[Equation (4.119)]

where $H_z = H_z^i + H_z^{\text{scat}}$. Here, the quantities $\partial H_z/\partial n$ and $H_z$ are not related through an explicit expression as is the case with the ABC (4.80) or (4.81). Because of the integral and the nonvanishing Green's function (4.95), all values of $H_z$ and $\partial H_z/\partial n$ on the boundary $C^{\text{outer}}$ are now interrelated. The relationship between $H_z$ and $\partial H_z/\partial n$ becomes clearer when we carry out the discretization of (4.119). To do so, we may employ the linear expansions (4.58)-(4.60) for $\Psi = \partial H_z/\partial n$ and a similar one for the total field $H_z$. Galerkin's method may then be used to generate a linear system which relates the nodal values of $\Psi$ and $H_z$ on $C^{\text{outer}}$.
Because of the nonpolynomial form of $G_{2D}(\mathbf r,\mathbf r')$ in (4.119), application of Galerkin's method with linear basis to discretize (4.119) leads to rather complex integrals for the matrix entries, albeit more rigorous [24]. To illustrate the procedure, we will instead employ the piecewise constant expansion

[Equations (4.120) and (4.121)]

in which

[Equation (4.122)]

is the pulse function and we have assumed a circular contour $C^{\text{outer}}$, as shown in Fig. 4.25.

Figure 4.25 Discretization of the mesh truncation circular contour.
Basically, the above expansions (4.120) and (4.121) approximate the values of $H_z$ and $\partial H_z/\partial n$ over each boundary segment by a constant value equal to the average of their values at the bordering nodes. They are less accurate than the linear expansions but provide substantial simplification in discretizing Kirchhoff's boundary integral (4.119). Substituting (4.120) and (4.121) into (4.119) yields

[Equation (4.123)]

in which $R(\mathbf r)$ is again the residual. To generate a linear system of equations, we proceed by applying the weighted residual method, viz.,

$$\oint_{C^{\text{outer}}} R(\mathbf r)\,W(\mathbf r)\,dl = 0$$

Choosing $W(\mathbf r) = \delta(\phi - \phi_{q_1})$, $q_1 = 1, 2, \ldots, N_{s_1}$, implying point matching, yields the system

[Equation (4.124)]

in which $\mathbf r_{q_1}$ denotes the location of the midpoint in the $q_1$th testing segment. The entries $G_{s_1q_1}$ and $G'_{s_1q_1}$ are the integrals

[Equations (4.125) and (4.126)]

in which $r_0$ denotes the radius of $C^{\text{outer}}$. For $q_1 \neq s_1$, the evaluation of these integrals can be carried out via the simple midpoint method since $r_0\Delta\phi$ is typically very small.
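However, when $q_1 = s_1$, the argument of the Hankel function vanishes. Noting that

$$H_0^{(2)}(z) \approx A_1 - j\frac{2}{\pi}\ln z + A_2z^2 + j\frac{1}{2\pi}z^2\ln z + O(z^4, z^4\ln z) \tag{4.127}$$

with $A_1 = 1 - j\frac{2}{\pi}(\ln\gamma - \ln 2)$, $A_2 = -\frac{1}{4} + j\frac{1}{2\pi}(\ln\gamma - 1 - \ln 2)$ and $\gamma = 1.781072418$, it is clear that the integrands in (4.125) and (4.126) become singular when $q_1 = s_1$.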
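The quality of this small-argument expansion can be checked numerically. The sketch below evaluates $H_0^{(2)}(z) = J_0(z) - jY_0(z)$ for real $z > 0$ from the ascending power series of $J_0$ and $Y_0$, and compares it with the four-term expansion $A_1 - j\frac{2}{\pi}\ln z + A_2z^2 + j\frac{1}{2\pi}z^2\ln z$, taking $A_1 = 1 - j\frac{2}{\pi}\ln(\gamma/2)$, $A_2 = -\frac14 + j\frac{1}{2\pi}(\ln\gamma - 1 - \ln 2)$ and $\gamma = e^{0.5772\ldots}$. The function names are hypothetical and only the standard library is used.

```python
import math

EULER_GAMMA = 0.5772156649015329
GAMMA = math.exp(EULER_GAMMA)  # approximately 1.781072418


def hankel2_0_series(z, terms=30):
    """H_0^(2)(z) = J_0(z) - j*Y_0(z) from the ascending series, real z > 0."""
    j0 = 0.0
    y_sum = 0.0
    harmonic = 0.0
    term = 1.0  # (-1)^k (z/2)^(2k) / (k!)^2, starting with k = 0
    for k in range(terms):
        if k > 0:
            term *= -((z / 2.0) ** 2) / (k * k)
            harmonic += 1.0 / k
            y_sum -= harmonic * term
        j0 += term
    y0 = (2.0 / math.pi) * ((math.log(z / 2.0) + EULER_GAMMA) * j0 + y_sum)
    return complex(j0, -y0)


def hankel2_0_small(z):
    """Four-term small-argument expansion of H_0^(2)(z)."""
    a1 = 1.0 - 1j * (2.0 / math.pi) * (math.log(GAMMA) - math.log(2.0))
    a2 = -0.25 + 1j / (2.0 * math.pi) * (math.log(GAMMA) - 1.0 - math.log(2.0))
    lnz = math.log(z)
    return a1 - 1j * (2.0 / math.pi) * lnz + a2 * z * z + 1j / (2.0 * math.pi) * z * z * lnz


for z in (0.01, 0.1, 0.5):
    err = abs(hankel2_0_small(z) - hankel2_0_series(z))
    print(f"z = {z:4.2f}  |error| = {err:.2e}")
```

The error shrinks roughly as $z^4\ln z$, which supports using the expansion only over the self segment, where $k_0$ times the distance from the testing point is small.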

However, the evaluation of $G_{s_1s_1}$ can be readily carried out by substituting (4.127) into (4.125) to get

[Equation (4.128)]

In the case of $G'_{s_1s_1}$, the integral is improper and, to evaluate it, it is necessary to work with the rightmost expression in (4.126). That is, we first integrate the small argument expansion (4.127) and then perform the differentiation before setting $\mathbf r = \mathbf r_{q_1}$. Doing so yields

[Equation (4.129)]

In the above, the 1/2 term can be also viewed as a result of the identity

[identity relating the limiting boundary integral to its principal value]

where the bar through the integral implies principal value and $\mathbf r^{\pm}$ denotes that the evaluation point may be just inside (+) or just outside (−) the contour $C$. However, as pointed out by M. Sancer (see Chapter 7), it is not appropriate to introduce the principal value concept for evaluating these integrals since the 1/2 term can be extracted in the limit as the testing point approaches the integration surface or as done above where the differentiation is carried out after the integration.
We can now approximate $G_{s_1q_1}$ and $G'_{s_1q_1}$ as
[Equations (4.130) and (4.131)]

To solve for the nodal values of $H_z$ and $\Psi$, it is necessary to cast (4.124) into a matrix system of the form

[Equation (4.132)]

Unlike the earlier systems, here the matrices $[G']$ and $[G]$ are fully populated. Their entries are easily extracted from (4.124) in terms of $G'_{s_1q_1}$ and $G_{s_1q_1}$, respectively, and $V_{q_1} = H_z^i(\mathbf r_{q_1})$, $q_1 = 1, 2, \ldots, N_{s_1}$. When this boundary element system (4.132) is combined with the finite element equations

[Equation (4.133)]

we obtain a total of $N + N_s$ equations for the $N$ node field values and $N_s$ values of $\Psi$ on $C^{\text{outer}}$. The entries of $[A]$ and $[B]$ are identical to those given in reference to (4.65). Although more cumbersome, this approach is rigorous and would be exact apart from the numerical approximation required in obtaining the linear systems (4.132) and (4.133). It is commonly referred to as the finite element-boundary integral (FE-BI) method, and scattering results based on it were used for reference in Fig. 4.24.
It should be noted that the boundary integral subsystem (4.132) is associated with possible fictitious resonances [25, 26] and the solution fails when the resonances are excited. To suppress them, one can simply introduce a small imaginary part in $k_0$ [27] or employ the combined field formulation [28].

4.5 EDGE ELEMENTS

In previous sections, FEM solutions were carried out using node-based scalar basis functions based on expanding the unknown quantity in terms of its nodal values, i.e., its values at element nodes. Such an expansion is suitable for modeling a scalar quantity. This is indeed the case for static potentials or a single field component as is the case for homogeneous waveguides and 2D scattering. For inhomogeneous waveguides, it is necessary to work with the vector wave equations (4.23)-(4.26) requiring an expansion of the transverse vector field component $\mathbf E_t$ or $\mathbf H_t$. However, it has been found that node-based expansions are not ideal for representing the vector nature of an electromagnetic field. Node-based expansions require specification of field values at element nodes where the field may not be defined (corners). Also, the implementation of boundary conditions occurring in electromagnetics (tangential field continuity) is a challenging task. Further, as noted in Chapter 2, false solutions (also referred to as 'spurious modes') to eigenvalue problems can be generated. An example of an erroneous mode field solution is shown in Fig. 4.26 [29] for a metallic cylinder enclosing two dissimilar dielectric media. The cylinder has a radius of $R = 25$ cm and one of the dielectric regions is an offset cylinder of radius $a = 10$ cm. Mode field solutions are given when the wavenumber of the inner offset cylinder is $k^2_{\text{inner}} = 2000$ and that of the remaining region is $k^2_{\text{outer}} = (196.1, 39.22)$. As seen, the mode field solution using the node-based elements is substantially in error and corrective approaches have been extensively studied [29], [30], [31]. Initially, an

Figure 4.26 Failure of the node-based FEM implementation to predict the correct
mode fields inside a coaxial guide with an offset center conductor. Top:
mesh interior to the conducting cylinder; the offset inner cylinder material
wavenumber is $k^2_{inner} = 2000$ and that of the remaining region is
$k^2_{outer} = (196.1, 39.22)$; bottom left: mode solution using node-based
elements; bottom right: reference mode fields. [After Paulsen and
Lynch, © IEEE, 1991.]

approach referred to as the penalty method [32] was employed to reformulate the
weak wave equation (or variational functional) in conjunction with the node-based
elements. However, in recent years it has been recognized that edge-based elements,
or Tangential Vector Finite Elements (TVFEs), remove the shortcomings of node-based
elements.
As discussed in Chapter 2, TVFEs are based on expanding the unknown
quantity in terms of its average values along element edges. The corresponding
basis functions are vector basis functions, as opposed to the scalar basis functions
(a separate expansion for each component) used when node-based finite elements are applied.
TVFEs enforce tangential field continuity along element boundaries but allow for
normal field discontinuities, and have been shown to be free of the shortcomings of
node-based elements [2, 33, 34, 35]. By using TVFEs, field values are not specified
where the field is not defined, spurious modes can automatically be eliminated, and
Dirichlet boundary conditions are easily imposed.
To describe the Whitney element in greater detail (see also Chapter 2), let
us consider the rectangular (x, y) coordinate system shown in Fig. 4.27. As usual, we
denote the coordinates of the first, second, and third node of a triangular element by
$(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$, respectively. Also, we denote the edge from node 1 to
node 2 as edge #1, the edge from node 2 to node 3 as edge #2, and the edge from
node 3 to node 1 as edge #3. The length of the kth edge will be denoted $l_k$, and the
unit vector directed from node i toward node j will be referred to as $\hat{e}_{ij}$ or $\hat{e}_k$. We will
assume that a point P internal to the triangle has the coordinates (x, y). The geometry
is illustrated in Fig. 4.27.
Next, we define $A$ as the area of $\Delta 123$, $A_1$ as the area of $\Delta P23$, $A_2$ as the area of
$\Delta P31$, and $A_3$ as the area of $\Delta P12$. Using the simplex or area coordinates $L_1$, $L_2$, and
$L_3$ (given by $L_1 = N_1 = A_1/A$, $L_2 = N_2 = A_2/A$, and $L_3 = N_3 = A_3/A$) defined in
Chapter 2, we introduce the vector basis functions $\mathbf{W}_1^e$, $\mathbf{W}_2^e$, and $\mathbf{W}_3^e$ by

$$\mathbf{W}_1^e = l_1 (L_1 \nabla L_2 - L_2 \nabla L_1) \tag{4.134}$$

$$\mathbf{W}_2^e = l_2 (L_2 \nabla L_3 - L_3 \nabla L_2) \tag{4.135}$$

$$\mathbf{W}_3^e = l_3 (L_3 \nabla L_1 - L_1 \nabla L_3) \tag{4.136}$$

Figure 4.27 Geometry of triangular element and definition of parameters for TVFEs.

Following the usual notation, the basis function $\mathbf{W}_k^e$ is associated with the kth edge
of the eth element. As noted in Chapter 2, the $l_k$ factor serves as a normalization
parameter. Each basis function can be shown to be divergence free (i.e., $\nabla \cdot \mathbf{W}_k^e = 0$),
to ensure tangential field continuity across element boundaries, and to allow normal
field variation across element boundaries. The field $\mathbf{F}^e$ (either an electric field $\mathbf{E}^e$ or a
magnetic field $\mathbf{H}^e$) in the eth element is expanded as

$$\mathbf{F}^e = \sum_{k=1}^{3} F_k^e \mathbf{W}_k^e \tag{4.137}$$

where the unknown coefficient $F_k^e$ represents the average field value along edge #k of
the eth element. That is, the field along the kth edge is expanded in the direction of
the unit vector introduced for the kth edge. In the case of an electromagnetic scattering
problem, the scattered field can be expanded via (4.137) and the known incident
field can then be added to form the total field.
For the triangle in Fig. 4.28, for which $(x_1, y_1) = (0, 0)$, $(x_2, y_2) = (1, 0)$, and
$(x_3, y_3) = (0, 1)$, the three vector basis functions $\mathbf{W}_1^e$, $\mathbf{W}_2^e$, and $\mathbf{W}_3^e$ associated with
edges #1, #2, and #3 are plotted in Figs. 4.29 to 4.31.
The vector basis function $\mathbf{W}_k^e$ provides a constant tangential component (with
unit magnitude) along edge #k and zero tangential component along the other two
edges. The normal component of $\mathbf{W}_k^e$, however, varies linearly along all three
edges.
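The tangential-continuity property stated above can be checked numerically. The following is a minimal Python sketch, not code from the book: it assumes the standard Whitney form $\mathbf{W}_k = l_k (L_i \nabla L_j - L_j \nabla L_i)$ with edge k running from local node k to node k+1 (cyclic), and the names `whitney_basis` and `tangential_component` are hypothetical helpers introduced only for this illustration.

```python
import math

def whitney_basis(nodes, p):
    """Return [W1, W2, W3] at point p = (x, y) for the triangle `nodes`.

    Edge k runs from local node k to local node k+1 (cyclic), so
    W_k = l_k * (L_i * grad L_j - L_j * grad L_i) with (i, j) = (k, k+1).
    """
    (x1, y1), (x2, y2), (x3, y3) = nodes
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # signed 2A
    # Linear area coordinates L_i = (a_i + b_i*x + c_i*y) / (2A)
    a = [x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1]
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    L = [(a[i] + b[i] * p[0] + c[i] * p[1]) / two_area for i in range(3)]
    grad = [(b[i] / two_area, c[i] / two_area) for i in range(3)]
    basis = []
    for k in range(3):
        i, j = k, (k + 1) % 3
        length = math.hypot(nodes[j][0] - nodes[i][0], nodes[j][1] - nodes[i][1])
        basis.append((length * (L[i] * grad[j][0] - L[j] * grad[i][0]),
                      length * (L[i] * grad[j][1] - L[j] * grad[i][1])))
    return basis

def tangential_component(nodes, k, m, s):
    """Tangential component of W_k at fraction s (0..1) along edge m (0-based)."""
    i, j = m, (m + 1) % 3
    dx, dy = nodes[j][0] - nodes[i][0], nodes[j][1] - nodes[i][1]
    length = math.hypot(dx, dy)
    p = (nodes[i][0] + s * dx, nodes[i][1] + s * dy)
    wx, wy = whitney_basis(nodes, p)[k]
    return (wx * dx + wy * dy) / length

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # the triangle of Fig. 4.28
for k in range(3):
    row = [round(tangential_component(triangle, k, m, 0.3), 12) for m in range(3)]
    print(f"W{k + 1} tangential components on edges 1..3: {row}")
```

Evaluating at several positions along an edge gives the same tangential value, which is what allows a single coefficient per edge to enforce tangential continuity between neighboring elements.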

Previously, a FEM analysis using node-based triangular finite elements was
carried out and we ended up with element matrix entries involving the integrals

$$\iint_{\Omega^e} \nabla N_i^e \cdot \nabla N_j^e \, dx\, dy, \qquad \iint_{\Omega^e} N_i^e N_j^e \, dx\, dy$$

with $N_i^e$ (i = 1, 2, 3) and $N_j^e$ (j = 1, 2, 3) being shape functions for the triangular
element and $\Omega^e$ denoting the surface of the eth element. For the standard linear shape
functions $N_i^e = L_i$, closed-form expressions for these integrals were given in (4.44)
and (4.45). Alternatively, they could be evaluated numerically.

Figure 4.28 Triangle for which vector basis functions are plotted.

Figure 4.29 Vector basis function $\mathbf{W}_1$.

Figure 4.30 Vector basis function $\mathbf{W}_2$.

Figure 4.31 Vector basis function $\mathbf{W}_3$.

For TVFEs, a similar analysis can be carried out. When using TVFEs, we can
proceed with the construction of the weak vector wave equation by working directly
with the vector wave equation (4.19) for homogeneous waveguides or 2D scattering
problems (since $E_z = H_z = 0$). For inhomogeneous waveguides, it is necessary to
work with variations of the PDEs (4.23), (4.24), and (4.26) or their duals. Since
our goal here is to introduce TVFEs, we will work with the simpler vector wave
equation (4.19). To construct the weak form of (4.19),
we must employ the identities

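$$\mathbf{A} \cdot \nabla_t \times (\nabla_t \times \mathbf{B}) = (\nabla_t \times \mathbf{A}) \cdot (\nabla_t \times \mathbf{B}) - \nabla_t \cdot [\mathbf{A} \times (\nabla_t \times \mathbf{B})] \tag{4.138}$$

$$\iint_{\Omega} \nabla_t \cdot (\mathbf{A} \times \mathbf{B})\, ds = \oint_C (\mathbf{A} \times \mathbf{B}) \cdot \hat{n}\, dl \tag{4.139}$$

$$-\iint_{\Omega} \mathbf{A} \cdot \nabla_t (\nabla_t \cdot \mathbf{B})\, ds = \iint_{\Omega} (\nabla_t \cdot \mathbf{A})(\nabla_t \cdot \mathbf{B})\, ds - \oint_C (\nabla_t \cdot \mathbf{B})(\mathbf{A} \cdot \hat{n})\, dl \tag{4.140}$$

(see Fig. 4.4 for a definition of $\hat{n}$, $\Omega$, and $C$) in conjunction with the weighted residual
equation

$$\iint_{\Omega} \mathbf{T}_t \cdot [\nabla_t \times (\nabla_t \times \mathbf{E}_t) - \gamma^2 \mathbf{E}_t]\, dx\, dy = 0 \tag{4.141}$$

where $\mathbf{T}_t$ is the weighting function (we did not use the usual notation to avoid
confusion with the TVFE basis notation). The resulting weak equation becomes

$$\iint_{\Omega} [(\nabla_t \times \mathbf{T}_t) \cdot (\nabla_t \times \mathbf{E}_t) - \gamma^2 \mathbf{T}_t \cdot \mathbf{E}_t]\, dx\, dy - \oint_C \mathbf{T}_t \cdot [(\nabla_t \times \mathbf{E}_t) \times \hat{n}]\, dl = 0 \tag{4.142}$$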

As before, the boundary conditions must be introduced to evaluate the line integral
over C. In the case of metallic enclosures, the boundary integral is eliminated since
$\mathbf{E}_t \times \hat{n}$ vanishes on the enclosures, and thus these edges/unknowns need not be
considered.
We can proceed to discretize (4.142) by introducing the expansion (4.137) and
choosing $\mathbf{T}_t = \mathbf{W}_m^e$, i.e., Galerkin testing. This gives the element equations

$$[E^e]\{E^e\} = \gamma^2 [F^e]\{E^e\} \tag{4.143}$$

for the eigenvalue problem. The column $\{E^e\}$ now refers to the edge field values of
the eth element, and from (4.142)

$$E_{mn}^e = \iint_{\Omega^e} (\nabla_t \times \mathbf{W}_m^e) \cdot (\nabla_t \times \mathbf{W}_n^e)\, dx\, dy \tag{4.144}$$

$$F_{mn}^e = \iint_{\Omega^e} \mathbf{W}_m^e \cdot \mathbf{W}_n^e\, dx\, dy \tag{4.145}$$

These can be evaluated in closed form. Specifically [9],

$$E_{mn}^e = \frac{l_m l_n D_m D_n}{4 (A^e)^3} \tag{4.146}$$

$$F_{mn}^e = \frac{l_m l_n}{16 (A^e)^3} (I_1 + I_2 + I_3 + I_4 + I_5) \tag{4.147}$$

$$I_1 = A_m A_n + C_m C_n \tag{4.148}$$

$$I_2 = \frac{1}{A^e} [C_m D_n + C_n D_m] \iint_{\Omega^e} x\, dx\, dy \tag{4.149}$$

$$I_3 = \frac{1}{A^e} (A_m B_n + A_n B_m) \iint_{\Omega^e} y\, dx\, dy \tag{4.150}$$

$$I_4 = \frac{1}{A^e} B_m B_n \iint_{\Omega^e} y^2\, dx\, dy \tag{4.151}$$

$$I_5 = \frac{1}{A^e} D_m D_n \iint_{\Omega^e} x^2\, dx\, dy \tag{4.152}$$

in which

$$A_m = a_i b_j - a_j b_i \tag{4.153}$$

$$B_m = c_i b_j - c_j b_i \tag{4.154}$$

$$C_m = a_i c_j - a_j c_i \tag{4.155}$$

$$D_m = -B_m \tag{4.156}$$

and the remaining parameters are given by (4.38). We note that the subscripts m and
n refer to edge numbers, whereas the subscripts i and j are associated with the node
numbers as specified in Fig. 4.27. The above closed-form expressions are not necessarily
less expensive to evaluate than a direct numerical evaluation of the integrals
(4.144) and (4.145). Thus, in practice, one may simply opt to use Gaussian
integration for matrix element evaluation.
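As an illustration of the numerical route mentioned above, the Python sketch below evaluates both element matrices for a single triangle. It rests on stated assumptions rather than on the book's own code: the basis is written as $\mathbf{W}_m = \frac{l_m}{4A^2}[(A_m + B_m y)\hat{x} + (C_m + D_m x)\hat{y}]$ with $a_i$, $b_i$, $c_i$ the usual linear shape-function coefficients, the triangle is numbered counterclockwise, and the mass-type matrix is integrated with the three-point edge-midpoint rule (exact for the quadratic integrand). The function name `element_matrices` is hypothetical.

```python
import math

def element_matrices(nodes):
    """Return (E, F) for one triangle with counterclockwise vertices `nodes`.

    E[m][n] = integral of (curl W_m)(curl W_n), F[m][n] = integral of W_m . W_n.
    """
    (x1, y1), (x2, y2), (x3, y3) = nodes
    a = [x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1]
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    area = 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))

    coeffs = []  # (l_m, A_m, B_m, C_m, D_m) for edge m from node i to node j
    for m in range(3):
        i, j = m, (m + 1) % 3
        length = math.hypot(nodes[j][0] - nodes[i][0], nodes[j][1] - nodes[i][1])
        A_m = a[i] * b[j] - a[j] * b[i]
        B_m = c[i] * b[j] - c[j] * b[i]
        C_m = a[i] * c[j] - a[j] * c[i]
        coeffs.append((length, A_m, B_m, C_m, -B_m))

    def basis(m, x, y):
        length, A_m, B_m, C_m, D_m = coeffs[m]
        scale = length / (4.0 * area ** 2)
        return scale * (A_m + B_m * y), scale * (C_m + D_m * x)

    # Edge midpoints: quadrature points of a rule exact for quadratics
    midpoints = [((nodes[i][0] + nodes[(i + 1) % 3][0]) / 2.0,
                  (nodes[i][1] + nodes[(i + 1) % 3][1]) / 2.0) for i in range(3)]

    E = [[0.0] * 3 for _ in range(3)]
    F = [[0.0] * 3 for _ in range(3)]
    for m in range(3):
        for n in range(3):
            l_m, D_m = coeffs[m][0], coeffs[m][4]
            l_n, D_n = coeffs[n][0], coeffs[n][4]
            # The curl of each basis function is constant over the element
            E[m][n] = l_m * l_n * D_m * D_n / (4.0 * area ** 3)
            for (x, y) in midpoints:
                wm, wn = basis(m, x, y), basis(n, x, y)
                F[m][n] += area / 3.0 * (wm[0] * wn[0] + wm[1] * wn[1])
    return E, F

E, F = element_matrices([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
print("E =", E)
print("F =", F)
```

Because $D_m = 2A$ for a counterclockwise triangle, the curl-curl entries reduce to $l_m l_n / A$, which is a convenient sanity check on any implementation.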
The assembly of the element equations (4.143) is carried out in the usual
manner. For TVFEs, each edge (or unknown) is shared by two elements only (unless
the edge is on C), and thus the resulting assembled global matrix has greater sparsity
than the corresponding global matrix for node-based elements. This makes up for
the larger number of unknowns associated with TVFEs, and typically the storage
requirements are about the same for the two types of elements. However, the preprocessing
stage of writing a FEM code based on TVFEs is more involved. In
addition to the geometry and connectivity tables needed for node-based elements
(see Fig. 4.7), tables describing edge to node correspondence, groups of edges on
conducting and dielectric boundaries, and unique directions for all edges (unknowns)
must be generated, since these are used during the element matrix construction and
assembly process.
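The extra preprocessing tables can be generated directly from the node connectivity table. The sketch below is one possible way to do it in Python and is an assumption of this illustration, not the book's procedure: each global edge is given the unique direction "from the lower to the higher node number", and `build_edge_table` is a hypothetical helper name.

```python
def build_edge_table(triangles):
    """Build edge tables from a list of (n1, n2, n3) node-number triples.

    Returns:
      edges       - list of (low_node, high_node) pairs, one per global edge
      elem_edges  - per element, the global edge index of local edges 1..3
      elem_signs  - per element, +1 if the local edge direction agrees with
                    the global direction (low -> high node number), else -1
      boundary    - indices of edges that belong to a single element only
    """
    edge_index = {}
    edges, use_count = [], []
    elem_edges, elem_signs = [], []
    for tri in triangles:
        ids, signs = [], []
        for k in range(3):
            i, j = tri[k], tri[(k + 1) % 3]  # local edge k: node k -> node k+1
            key = (min(i, j), max(i, j))
            if key not in edge_index:
                edge_index[key] = len(edges)
                edges.append(key)
                use_count.append(0)
            g = edge_index[key]
            use_count[g] += 1
            ids.append(g)
            signs.append(1 if i < j else -1)
        elem_edges.append(ids)
        elem_signs.append(signs)
    boundary = [g for g, count in enumerate(use_count) if count == 1]
    return edges, elem_edges, elem_signs, boundary

# Two triangles forming a square; the diagonal (1, 3) is the shared edge.
edges, elem_edges, elem_signs, boundary = build_edge_table([(1, 2, 3), (1, 3, 4)])
print(edges)
print(elem_edges, elem_signs)
print(boundary)
```

The sign table is what keeps the shared unknown consistent: when two elements traverse the same edge in opposite directions, one of them multiplies its local basis function by -1 during assembly.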

4.5.1 Example 1: Propagation Constants of a Homogeneously Filled Waveguide

As an example application of TVFEs, we again consider a rectangular waveguide
with PEC walls. The waveguide is assumed to be homogeneously filled with a
material of permittivity $\epsilon = \epsilon_r \epsilon_0$ and permeability $\mu = \mu_r \mu_0$. For TE polarization, $\mathbf{F}_t$
in (4.137) is an electric field $\mathbf{E}_t$, whereas for TM polarization $\mathbf{F}_t$ is the magnetic field
$\mathbf{H}_t$. In either case, we are interested in determining the eigenvalues $\gamma^2$, from which we
can then obtain the transverse propagation constants $\beta$ of the guide (in terms of $\gamma^2$ and $\epsilon_r \mu_r$).
Assembly of the element equations yields the global matrix equation system

[Equation (4.157): not legible in the extracted text]

which is identical (in form) to (4.57) but where $\{F\}$ now represents the field values
along element edges rather than at element nodes. For TM polarization the number
of unknowns equals the number of global edges, whereas for TE polarization the
number of unknowns equals the number of global edges minus the number of
boundary edges (since the field here is known a priori to be zero).
A FEM computer code was written to again model the rectangular waveguide
shown in Fig. 4.8. The geometry was discretized using 261 nodes forming 720 edges
and 440 elements. Of these, 60 nodes and 60 edges were on the PEC boundary.
Analytical and numerical values for $\gamma a$ are compared in Tables 4.3 and 4.4 for TE
and TM polarization, and we observe that the agreement between analytical and
numerical values is good. In fact, the TVFE simulation yields slightly more accurate
results for comparable discretization than the simulation for node-based elements.

TABLE 4.3 Analytical and Numerical γa Values for TE Modes Using TVFEs

Mode     Analytical Result     FEM Result
TE10     3.141593              3.141520
TE20     6.283185              6.282485
TE01     6.283185              6.283132
TE11     7.024818              7.024096
TE21     8.885766              8.884424
TE30     9.424778              9.42??45

TABLE 4.4 Analytical and Numerical γa Values for TM Modes Using TVFEs

Mode     Analytical Result     FEM Result
TM11     7.024818              7.023634
TM21     8.885766              8.884080
TM31     11.327173             11.326886
TM12     12.953118             12.947988
TM22     14.049629             14.041209
TM32     15.707963             15.698376
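The analytical column of these tables can be reproduced from the familiar rectangular-waveguide result $\gamma a = \pi \sqrt{m^2 + (n a/b)^2}$. The Python sketch below does so under one assumption that is inferred from the tabulated values rather than stated in this section: the guide has aspect ratio a/b = 2 (for example, 3.141593 and 6.283185 for the two lowest single-index modes). The names `gamma_a` and `relative_error` are hypothetical.

```python
import math

def gamma_a(m, n, aspect_ratio=2.0):
    """Analytical gamma*a of the (m, n) mode for an a-by-b guide, a/b = aspect_ratio."""
    return math.pi * math.hypot(m, n * aspect_ratio)

def relative_error(analytical, numerical):
    """Relative deviation of a computed eigenvalue from its analytical value."""
    return abs(numerical - analytical) / analytical

for label, (m, n) in [("10", (1, 0)), ("20", (2, 0)), ("01", (0, 1)),
                      ("21", (2, 1)), ("31", (3, 1)), ("12", (1, 2))]:
    print(f"mode {label}: gamma*a = {gamma_a(m, n):.6f}")

# Relative error of the lowest TE entry, using the two values listed in Table 4.3
print(relative_error(3.141593, 3.141520))
```

With these numbers the lowest TE eigenvalue is reproduced by the TVFE solution to a few parts in 10^5, and the error grows slowly for the higher-order modes, as expected for a fixed mesh.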

4.5.2 Example 2: Scattering by a Square-Shaped Material Coated Cylinder

As another application, we consider a square metallic cylinder situated in free
space and coated with a dielectric shell of thickness d, as shown in Fig. 4.32. The
coating is a material layer of relative permittivity $\epsilon_r$ and
relative permeability $\mu_r$. The cylinder is illuminated by an incident TE or TM polarized plane
wave as defined by (4.6) and (4.7).
As in the case of node-based elements, we will work with the scattered field
as the unknown quantity for this application. For TM polarization, the resulting
element equations will be of the form

[Equation (4.158): not legible in the extracted text]

with the column vector $\{H^{scat}\}$ representing the unknown scattered magnetic field
components along the edges of the triangular mesh. Following the analysis in Section
4.4, we readily find

[Equations (4.159) and (4.160): not legible in the extracted text]

Figure 4.32 Coated square cylinder illuminated by a TM polarized plane wave ($d = 0.15\lambda_0$, $\epsilon_r = 1$, $\mu_r = 3 - j3$).
in which $\Omega^d$ refers to the region occupied by the dielectric coating and $C^{diel}$ is the
contour on the outer boundary of the dielectric as depicted in Fig. 4.32 (see also Fig.
4.16). The computational domain will be terminated by a metal-backed artificial
absorber, and therefore we can neglect the presence of any boundary integrals. The
absorber was $0.5\lambda_0$ thick and placed at a distance $0.5\lambda_0$ from the boundary between
the coating and free space. The absorber's relative permittivity and permeability were
set to $\epsilon_{abs}/\epsilon_0 = \mu_{abs}/\mu_0 = 2 - j1$.
For our scattering calculations, we choose a PEC cylinder of side length
$a = \lambda_0$. The coating has the uniform thickness $d = 0.15\lambda_0$, a relative permittivity
$\epsilon_r = 1$, and a relative permeability $\mu_r = 3 - j3$. The FEM simulation of this domain
used 1264 nodes forming 3620 edges and 2356 elements. Of these, 40 nodes and edges
were on the PEC boundary $C^{inner}$, 52 nodes and edges were on the material/free
space boundary $C^{diel}$, and 132 nodes and edges were on $C^{outer}$, which coincides
with the metal boundary backing the absorber. The final number of unknowns was
thus 3620 for TM polarization.
Once the FEM solution is carried out and the fields are found everywhere in
the domain, the next step is to compute the far zone scattered field. An intermediate
part of this process involves the application of the surface equivalence principle (see
Chapter 1) to determine equivalent surface currents on a surface enclosing the scatterer
(such as $C^{diel}$ or any other contour of choice). For TM polarization, the
equivalent electric and magnetic surface currents are given by

$$\mathbf{J}^e = \hat{n} \times \mathbf{H} = (n_x \hat{x} + n_y \hat{y}) \times (H_x \hat{x} + H_y \hat{y}) = (n_x H_y - n_y H_x)\hat{z} = J_z \hat{z} \tag{4.161}$$

$$\mathbf{M}^e = \mathbf{E} \times \hat{n} = (E_z \hat{z}) \times (n_x \hat{x} + n_y \hat{y}) = -n_y E_z \hat{x} + n_x E_z \hat{y} = M_x \hat{x} + M_y \hat{y} \tag{4.162}$$

in which $\mathbf{E}$ is found from $\mathbf{E} = \nabla \times \mathbf{H} / (j\omega\epsilon)$. These currents can be integrated using the
radiation integrals to give the far zone scattered field. For TM incidence, we find

[Equation (4.163): not legible in the extracted text]

where the reader is referred to Fig. 4.16 for a definition of the primed parameters.
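The cross products that define the equivalent currents, $J_z = n_x H_y - n_y H_x$ and $\mathbf{M} = E_z \hat{z} \times \hat{n}$, are simple to evaluate at each sample point of the integration contour. The Python sketch below shows one such evaluation; it assumes the field samples and unit normals are already available from the FEM solution, and `equivalent_currents` is a hypothetical helper name. Complex phasor values work unchanged.

```python
def equivalent_currents(normal, h_field, e_z):
    """Equivalent surface currents at one contour point for TM polarization.

    normal  - (n_x, n_y), outward unit normal of the contour
    h_field - (H_x, H_y), transverse magnetic field (real or complex)
    e_z     - axial electric field component E_z (real or complex)
    Returns (J_z, (M_x, M_y)).
    """
    n_x, n_y = normal
    h_x, h_y = h_field
    j_z = n_x * h_y - n_y * h_x          # z-component of n x H
    m_x, m_y = -n_y * e_z, n_x * e_z     # (E_z z) x n
    return j_z, (m_x, m_y)

# Example: a point on a contour side whose outward normal is +x
print(equivalent_currents((1.0, 0.0), (0.0, 2.0), 3.0))
# Example with complex phasors on a side whose outward normal is +y
print(equivalent_currents((0.0, 1.0), (1.0 + 0.5j, 0.0), 1j))
```

Only the tangential magnetic field and the axial electric field enter the result, which is consistent with the edge-based unknowns carrying exactly the tangential field information.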
In Figs. 4.33 to 4.36, the computed equivalent currents $\mathbf{J}^e$ and $\mathbf{M}^e$ on $C^{diel}$ are
compared with those obtained using a moment method (denoted as MoM on the
figures) analysis [36] with the incident wave impinging at an angle of 0°. As seen, the two
analyses give nearly identical results (for magnitude as well as phase). The corresponding
bistatic echo width as computed from (4.163) and the dual of (4.93) is given
in Fig. 4.37. Again, the FEM results are in agreement with the moment method data.
The scattering patterns show distinctively that the largest scattering occurs in the
forward ($\phi = 180°$) and backscatter ($\phi = 0°$) directions. The beamwidth of these
forward and backscatter lobes decreases inversely with the size of the cylinder,
whereas the number of side lobes increases with increasing cylinder size.
Figure 4.33 $|J^e|$ on $C^{diel}$ in Fig. 4.32 versus normalized distance (MoM and FEM).

Figure 4.34 Phase of $J^e$ on $C^{diel}$ in Fig. 4.32 versus normalized distance (MoM and FEM).

Figure 4.35 $|M^e|$ on $C^{diel}$ in Fig. 4.32 versus normalized distance (MoM and FEM).

Figure 4.36 Phase of $M^e$ on $C^{diel}$ in Fig. 4.32 versus normalized distance (MoM and FEM).

Figure 4.37 Bistatic RCS of cylinder in Fig. 4.32 versus observation angle $\phi$. (Results in Figs. 4.33 to 4.37 are
courtesy of L. S. Andersen.) The RCS was computed by integrating
equivalent electric and magnetic currents ($\mathbf{J}^e$, $\mathbf{M}^e$) on a contour a
small distance from $C^{diel}$.

APPENDIX 1: ELEMENT MATRIX FOR NODE-BASED BILINEAR RECTANGLES

The bilinear node-based expansion for the rectangle (right-angled quadrilateral) is
given by (2.7) in Chapter 2 and has four degrees of freedom. Thus, the element
matrix will have 4 × 4 entries, viz.

$$[A^e] = \begin{bmatrix} A_{11}^e & A_{12}^e & A_{13}^e & A_{14}^e \\ A_{21}^e & A_{22}^e & A_{23}^e & A_{24}^e \\ A_{31}^e & A_{32}^e & A_{33}^e & A_{34}^e \\ A_{41}^e & A_{42}^e & A_{43}^e & A_{44}^e \end{bmatrix}$$

where the expression for $A_{ij}^e$ is given in (4.159). For the rectangle in Fig. 4.38 (see also
Fig. 2.1), the resulting entries are

[Entries of the element matrix: not legible in the extracted text]

Figure 4.38 Node numbering and geometry of the rectangle.

APPENDIX 2: SAMPLE MATLAB CODE FOR IMPLEMENTING THE MATRIX ASSEMBLY

function [K_TE,Kdel_TE,A_TE,A,K,Kdel,K_TM,Kdel_TM,A_TM,x,y,xnodes,ynodes] = fem_2d(lambda)

% Generation of the Node Connectivity Table:
% This MATLAB code is courtesy of Y. Botros

N_elements=18;
N_nodes=16;    % 4 x 4 grid of nodes; the global matrices are indexed by node

% Connectivity for a 3 x 3 grid of cells with two triangles per cell
% (entries restored so that every element is a valid triangle of the grid)
n_1st_local_node=[1 1 2 2 3 3 5 5 6 6 7 7 9 9 10 10 11 11];
n_2nd_local_node=[2 6 3 7 4 8 6 10 7 11 8 12 10 14 11 15 12 16];
n_3rd_local_node=[6 5 7 6 8 7 10 9 11 10 12 11 14 13 15 14 16 15];
mu=ones(1,N_elements);
eps=ones(1,N_elements);
ko=2*pi/lambda;

for e=1:N_elements
    n(1,e)=n_1st_local_node(e);
    n(2,e)=n_2nd_local_node(e);
    n(3,e)=n_3rd_local_node(e);
end

% Interior (non-conducting) nodes
non_cond=[6 7 10 11];

% Nodes locations table:
xnodes=[0 .5 1 1.5 0 .5 1 1.5 0 .5 1 1.5 0 .5 1 1.5];
ynodes=[0 0 0 0 .25 .25 .25 .25 .5 .5 .5 .5 .75 .75 .75 .75];

% Initialization Process:
A=zeros(N_nodes);
Kdel=zeros(N_nodes);
K=zeros(N_nodes);
Ae=zeros(3);
Ke=zeros(3);
Kedel=zeros(3);

% Loop through all elements:
for e=1:N_elements

    % E_z/TM polarization:
    pe=1/mu(e);
    qe=eps(e);

    % H_z/TE polarization (overrides the values above; both are equal here
    % because mu = eps = 1 for every element):
    pe=1/eps(e);
    qe=mu(e);

    % Coordinates of the element nodes:
    for i=1:3
        x(i)=xnodes(n(i,e));
        y(i)=ynodes(n(i,e));
    end

    % Compute the element matrix entries:
    Area=.5*abs((x(2)-x(1))*(y(3)-y(1))-(x(3)-x(1))*(y(2)-y(1)));

    for i=1:3
        % Cyclic successors ip1, ip2 of local node i
        i1=0;
        if i==3
            i1=3;
        end
        ip1=(i+1)-i1;
        i2=0;
        if ip1==3
            i2=3;
        end
        ip2=ip1+1-i2;
        bi=y(ip1)-y(ip2);
        ci=x(ip2)-x(ip1);

        for j=1:3
            j1=0;
            if j==3
                j1=3;
            end
            jp1=(j+1)-j1;
            j2=0;
            if jp1==3
                j2=3;
            end
            jp2=(jp1+1)-j2;
            bj=y(jp1)-y(jp2);
            cj=x(jp2)-x(jp1);

            Ae(i,j)=-(pe*(bi*bj+ci*cj))/(4*Area);
            Kedel(i,j)=-Ae(i,j);

            if i==j
                Ke(i,j)=qe*(Area/6);
                Ae(i,j)=Ae(i,j)+(ko^2)*qe*(Area/6);
            else
                Ke(i,j)=qe*(Area/12);
                Ae(i,j)=Ae(i,j)+(ko^2)*qe*(Area/12);
            end

            % Assemble the element matrices into the global FEM system:
            A(n(i,e),n(j,e))=A(n(i,e),n(j,e))+Ae(i,j);
            Kdel(n(i,e),n(j,e))=Kdel(n(i,e),n(j,e))+Kedel(i,j);
            K(n(i,e),n(j,e))=K(n(i,e),n(j,e))+Ke(i,j);
        end
    end
end

K_TE=K;
Kdel_TE=Kdel;
A_TE=A;

K_TM=K(non_cond,non_cond);
Kdel_TM=Kdel(non_cond,non_cond);
A_TM=A(non_cond,non_cond);

eig_squares_TE=eig(Kdel_TE,K_TE);
eig_values_indices_TE=find(eig_squares_TE >= 0);
eig_values_TE=sqrt(eig_squares_TE(eig_values_indices_TE));
eig_values_TE=sort(eig_values_TE);

figure(1)
plot(eig_values_TE,'*');
xlabel('The Modes (TE case)');
ylabel('The eigenvalues');
title('Eigen values for TE modes in a rectangular waveguide');
grid;
legend('a=1.5 cm and b=.75 cm');

figure(2)
eig_squares_TM=eig(Kdel_TM,K_TM);
eig_values_indices_TM=find(eig_squares_TM >= 0);
% Take the square roots of the TM (not the TE) eigenvalues
eig_values_TM=sqrt(eig_squares_TM(eig_values_indices_TM));
eig_values_TM=sort(eig_values_TM);
plot(eig_values_TM,'d');
xlabel('The Modes (TM case)');
ylabel('The eigenvalues');
title('Eigen values for TM modes in a rectangular waveguide');
grid;
legend('a=1.5 cm and b=.75 cm');

e_no=1:N_nodes;
table=[e_no' xnodes' ynodes'];

% Log the matrices to the file data.data
diary data.data

% The General A matrix
A
% The General K matrix
K
% The General Kdel matrix
Kdel

% The TE A matrix
A_TE
% The TE K matrix
K_TE
% The TE Kdel matrix
Kdel_TE

% The TM A matrix
A_TM
% The TM K matrix
K_TM
% The TM Kdel matrix
Kdel_TM

% The next table indicates the node coordinates.
% First column  => Node Numbers.
% Second column => x coord.
% Third column  => y coord.
table
diary

REFERENCES

[1] B. M. A. Rahman and J. B. Davies. Finite element analysis of optical and
microwave waveguide problems. IEEE Trans. Microwave Theory Tech.,
MTT-32:20-28, January 1984.
[2] J. F. Lee, D. K. Sun, and Z. J. Cendes. Full-wave analysis of dielectric waveguides
using tangential vector finite elements. IEEE Trans. Microwave Theory
Tech., MTT-39(8):1262-1271, August 1991.
[3] J. van Bladel. Electromagnetic Fields. Hemisphere Publishing Corp., New York,
1985.
[4] Y. Lu and F. A. Fernandez. An efficient finite element solution of inhomogeneous
anisotropic and lossy dielectric waveguides. IEEE Trans. Microwave
Theory Tech., 41(6/7):1215-1223, June/July 1993.
[5] J.-F. Lee. Finite element analysis for lossy dielectric waveguides. IEEE Trans.
Microwave Theory Tech., 42(6):1025-1031, June 1994.
[6] J. B. Davies. Complete modes in uniform waveguides. In T. Itoh, G. Pelosi, and
P. P. Silvester, editors, Finite Element Software for Microwave Engineering,
Chapter 1. Wiley, New York, 1996.
[7] P. Silvester. Construction of triangular finite element universal matrices. Int. J.
Numer. Methods Eng., 12(2):237-244, 1978.
[8] A. Chatterjee, J. L. Volakis, and L. C. Kempel. Optimization issues in finite
element codes for solving open 3D electromagnetic scattering and conformal
antenna problems. Int. J. Num. Modeling: Electr. Net. Dev. and Fields, 9:335-344,
1996.
[9] C. J. Reddy, M. D. Deshpande, C. R. Cockrell, and F. B. Beck. Finite element
method for eigenvalue problems. Technical report, NASA Technical Paper
3485, NASA Langley Research Center, December 1994.
[10] R. F. Harrington. Time-Harmonic Electromagnetic Fields. McGraw-Hill, New
York, 1961.
[11] N. Marcuvitz. Waveguide Handbook. Peter Peregrinus, London, 1986.
Originally published by McGraw-Hill Co., 1951.
[12] A. Bayliss and E. Turkel. Radiation boundary conditions for wave-like equations.
Comm. Pure Appl. Math., 33:707-725, 1980.
[13] T. B. A. Senior and J. L. Volakis. Approximate Boundary Conditions in
Electromagnetics. IEE Press, London, 1995.
[14] B. Engquist and A. Majda. Absorbing boundary conditions for the numerical
simulation of waves. Math. Comp., 31:629-651, 1977.
[15] L. N. Trefethen and L. Halpern. Wide-angle one-way wave equations.
J. Acoust. Soc. Amer., 84:1397-1404, October 1988.
[16] R. L. Higdon. Absorbing boundary conditions for acoustic and elastic waves in
stratified media. J. Comp. Physics, 101:386-418, 1992.
[17] A. F. Peterson and S. P. Castillo. A frequency-domain differential equation
formulation for electromagnetic scattering from inhomogeneous cylinders.
IEEE Trans. Antennas Propagat., 37(5):601-607, May 1989.
[18] G. T. Ruck, D. E. Barrick, W. D. Stuart, and C. K. Krichbaum. Radar Cross
Section Handbook. Plenum Press, New York, 1970.
[19] T. Ozdemir and J. L. Volakis. A comparative study of an absorber boundary
condition and an artificial absorber for truncating finite element meshes. Radio
Science, 29:1255-1263, September-October 1994.
[20] Z. J. Sacks, D. M. Kingsland, R. Lee, and J.-F. Lee. A perfectly matched
anisotropic absorber for use as an absorbing boundary condition. IEEE
Trans. Antennas Propagat., 43:1460-1463, 1995.
[21] S. R. Legault, T. B. A. Senior, and J. L. Volakis. Design of planar absorbing
layers for domain truncation in FEM applications. Electromagnetics, 16(4):451-464,
July-August 1996.
[22] J. M. Jin and J. L. Volakis. TE scattering by an inhomogeneously filled aperture
in a thick conducting plane. IEEE Trans. Antennas Propagat., 38:1280-1286,
August 1990.
[23] J. M. Jin and J. L. Volakis. TM scattering by an inhomogeneously filled aperture
in a thick ground plane. IEE Proc. H, 137(3):153-159, June 1990.
[24] J. M. Jin, J. L. Volakis, and J. D. Collins. A finite element-boundary integral
method for scattering and radiation by two- and three-dimensional structures.
IEEE Antennas Propagat. Soc. Mag., 33(3):22-32, June 1991.
[25] J. R. Mautz and R. F. Harrington. H-field, E-field and combined-field solutions
for conducting bodies of revolution. Arch. Elek. Ubertragung., 32:157-163, 1978.
[26] D. R. Wilton and J. E. Wheeler III. Comparison of convergence rates of the
conjugate gradient method applied to various integral equation formulations.
In T. K. Sarkar, editor, Application of Conjugate Gradient Method to
Electromagnetics and Signal Analysis, Chapter 5. Elsevier, New York, 1990.
[27] J. D. Collins, J. M. Jin, and J. L. Volakis. Eliminating interior cavity resonances
in FE-BI methods for scattering. IEEE Trans. Antennas Propagat., 40:1583-1585,
December 1992.
[28] P. L. Huddleston, L. N. Medgyesi-Mitschang, and J. M. Putnam. Combined
field integral formulation for scattering by dielectrically coated bodies. IEEE
Trans. Antennas Propagat., AP-34:510-520, 1986.
[29] K. D. Paulsen and D. R. Lynch. Elimination of vector parasites in finite element
Maxwell solutions. IEEE Trans. Microwave Theory Tech., 39(3):395-404,
March 1991.
[30] D. Sun, J. Manges, X. Yuan, and Z. Cendes. Spurious modes in finite-element
methods. IEEE Trans. Antennas Propagat., 37(5):12-24, October 1995.
[31] J. R. Winkler and J. B. Davies. Elimination of spurious modes in finite element
analysis. J. Comput. Physics, 56:1-14, 1984.
[32] B. M. A. Rahman and J. B. Davies. Penalty function improvement of waveguide
solution by finite elements. IEEE Trans. Microwave Theory Tech., 32:922-928,
August 1984.
[33] J. M. Jin and J. L. Volakis. Electromagnetic scattering by and transmission
through a three-dimensional slot in a thick conducting plane. IEEE Trans.
Antennas Propagat., 39(4):543-550, April 1991.
[34] J. P. Webb. Edge elements and what they can do for you. IEEE Trans.
Magnetics, 29:1460-1465, 1993.
[35] I. V. Yioultsis and T. D. Tsiboukis. Vector finite element analysis of waveguide
discontinuities involving anisotropic media. IEEE Trans. Magnetics, MAG-31,
pp. 1550-1553, May 1995.
[36] L. S. Andersen. Scattering by non-perfectly conducting structures. Master's
thesis, Electromagnetics Institute, Technical University of Denmark, Lyngby,
No. E544, August 1995.
Three-Dimensional Problems: Closed Domain

5.1 INTRODUCTION

Finite elements have been used extensively to model open- and closed-domain
electromagnetic problems in scalar form in two and three dimensions [1], [2], [3].
However, the true power of the finite element method is revealed in three-dimensional
volume formulations, since surface-based integral equation methods have great
difficulty in dealing with material and structural inhomogeneities. As explained in
earlier chapters, finite elements do not suffer from these shortcomings. But for a long
time [4], a reliable full vector formulation proved to be extremely difficult to
implement. Discretization of the curl-curl version of the wave equation (1.30) usually
resulted in the appearance of nonphysical or spurious modes. The cause of the
problem is the traditional nodal basis functions that are used to discretize the
unknown field variable.¹ The reasons for the failure of node-based elements in
modeling the vector wave equation will be discussed in a later section.
Fortunately, a novel remedy was found by assigning the degrees of freedom to the
edges rather than to the nodes of elements. These types of elements had been
described by Whitney [5] in terms of geometrical forms about 35 years back and
were revived by Nedelec [6] in 1980. In recent years, Bossavit [7] and others [8], [9],
[10], [11] applied these edge-based finite elements successfully to model full
three-dimensional vector problems. In all these works, edge elements were seen to be
devoid of the shortcomings commonly experienced with node-based elements. In
this chapter, we examine the application of edge basis functions to closed domain
problems as found in packaged microwave circuits and cavity resonators.

¹See Chapter 2 for a presentation of the node-based and edge-based elements. Also, see the last
section of Chapter 4 for a two-dimensional formulation using edge-based elements along with a discussion
on the shortcomings of node-based elements.

The first part of this chapter describes the variational formulation for the
closed domain problem in terms of field intensity. As noted in Chapter 1 (Section
1.11.1), the variational formulation leads to the same system of equations as the
weighted residual method employed in Chapters 3 and 4. We also formulate the
problem in terms of vector potentials. The field formulation and the potential
formulation are equivalent; however, each has its pros and cons. After the formulation
to obtain the linear system of equations, we briefly describe the problem of spurious
solutions encountered with node-based elements. Generation and assembly of the
finite element coefficient matrix using tetrahedrals and bricks are given, and issues
related to modeling sources for circuit problems are discussed. This is very critical for
the computation of the scattering parameters (S parameters) in a circuit. We end the
chapter by presenting a few applications of the finite element method pertaining to
cavity resonators and packaged circuit configurations.

5.2 FORMULATION

The geometry of the problem is illustrated in Fig. 5.1. The structure of interest is a
three-dimensional inhomogeneous body occupying the volume V that may include
embedded perfect magnetic and resistive sheet surfaces as well as metallic sheets. We
shall assume that V is bounded by perfect electric walls.

Figure 5.1 Inhomogeneous structure enclosed by a mesh termination surface $S_o$ assumed to
be a perfect electric conductor (PEC). [Figure labels: outer unit normal; resistive or impedance sheet.]

5.2.1 Field Formulation

As done in the previous chapters, the problem statement is to satisfy the vector
wave equation

$$ \nabla \times \left( \frac{1}{\mu_r} \nabla \times \mathbf{E} \right) - k_0^2 \epsilon_r \mathbf{E} = -jk_0 Z_0 \mathbf{J}^i - \nabla \times \left( \frac{1}{\mu_r} \mathbf{M}^i \right) \qquad (5.1) $$

throughout the volume V subject to a given set of boundary conditions on the
surface $S_o$ enclosing the volume V. Here, $\mathbf{J}^i$ and $\mathbf{M}^i$ are the electric and magnetic
current sources, respectively, contained within the volume. These current sources are
usually known a priori and form the excitation for the problem. In this chapter, we
take the variational approach to formulating the finite element solution. This
approach is often employed in the literature to construct the linear system, but as
discussed in Chapter 1, it typically leads to a system that is identical to that obtained
from the weighted residual method. The variational formulation is therefore another
method for finding the solution of a given boundary value problem.
The calculus of variations originates from a generalization of the elementary
theory of maxima and minima of functions. In the variational technique, we strive to
find the extrema of functionals (see Chapter 1). The functional can loosely be taken
to mean a function which depends on the entire course of one or more functions
within the domain of interest. For the wave equation, we can express the functional
for the total electric field as

$$ F(\mathbf{E}) = \frac{1}{2} \int_V \mathbf{E} \cdot \left[ \nabla \times \left( \frac{1}{\mu_r} \nabla \times \mathbf{E} \right) - k_0^2 \epsilon_r \mathbf{E} \right] dV + \int_V \mathbf{E} \cdot \left[ jk_0 Z_0 \mathbf{J}^i + \nabla \times \left( \frac{1}{\mu_r} \mathbf{M}^i \right) \right] dV \qquad (5.2) $$

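where V is the entire computational domain. For the sake of simplicity, we will
assume a source-free volume domain and consider that the only sources used to
excite the circuit are coupled through the ports of the geometry. Using the vector
identity

$$ \mathbf{A} \cdot \nabla \times \mathbf{B} = \mathbf{B} \cdot \nabla \times \mathbf{A} - \nabla \cdot (\mathbf{A} \times \mathbf{B}) $$

and the divergence theorem

$$ \int_V \nabla \cdot \mathbf{U} \, dV = \oint_{S_o} \mathbf{U} \cdot \hat{n} \, dS \qquad (5.3) $$

on the double curl in the right hand side of (5.2), we get

$$ F(\mathbf{E}) = \frac{1}{2} \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{E}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{E} \right] dV + \frac{1}{2} \oint_{S_o} \left[ \mathbf{E} \cdot (\hat{n} \times \nabla \times \mathbf{E}) \right] dS + \int_V \mathbf{E} \cdot \mathbf{f}^i \, dV \qquad (5.4) $$

In (5.4), $\mathbf{f}^i$ is the source function given by

$$ \mathbf{f}^i = jk_0 Z_0 \mathbf{J}^i + \nabla \times \left( \frac{1}{\mu_r} \mathbf{M}^i \right) \qquad (5.5) $$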
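The step from (5.2) to (5.4) can be written out explicitly. The sketch below applies the vector identity with A = E and B = (1/μ_r)∇ × E (a choice of substitution that the text does not spell out), then uses the divergence theorem and the scalar triple product to move the unit normal inside the bracket:

```latex
\begin{aligned}
\mathbf{E}\cdot\nabla\times\Big(\tfrac{1}{\mu_r}\nabla\times\mathbf{E}\Big)
 &= \tfrac{1}{\mu_r}(\nabla\times\mathbf{E})\cdot(\nabla\times\mathbf{E})
  - \nabla\cdot\Big(\mathbf{E}\times\tfrac{1}{\mu_r}\nabla\times\mathbf{E}\Big), \\
-\int_V \nabla\cdot\Big(\mathbf{E}\times\tfrac{1}{\mu_r}\nabla\times\mathbf{E}\Big)\,dV
 &= -\oint_{S_o}\Big(\mathbf{E}\times\tfrac{1}{\mu_r}\nabla\times\mathbf{E}\Big)\cdot\hat{n}\,dS
  = \oint_{S_o}\mathbf{E}\cdot\Big(\hat{n}\times\tfrac{1}{\mu_r}\nabla\times\mathbf{E}\Big)\,dS .
\end{aligned}
```

Multiplying by the factor 1/2 of the functional gives the volume and surface terms of (5.4). The surface term is printed there without the 1/μ_r factor; the two agree when μ_r = 1 on the surface, which is an assumption on our part rather than a statement from the text.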

and $S_o$ denotes the surfaces within V for which the tangential component of E and/or
H is discontinuous. We remark that the corresponding formulation based on the
weighted residual method would lead to the weak form of the vector wave equation

$$ \langle \mathbf{R}, \mathbf{W} \rangle = \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{W}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{W} \right] dV + \oint_{S_o} \left[ \mathbf{W} \cdot (\hat{n} \times \nabla \times \mathbf{E}) \right] dS + \int_V \mathbf{W} \cdot \mathbf{f}^i \, dV \qquad (5.6) $$

where R is the residual vector resulting from the finite-dimensional approximation to

the exact solution and W is the weighting function.


The above functional can be generalized to account for the presence of
impedance and resistive sheets or other discontinuous boundaries. In the case of a
resistive card, the transition condition [12] (see Section 1.6)

$$ \hat{n}_K \times (\hat{n}_K \times \mathbf{E}) = -R \, \hat{n}_K \times (\mathbf{H}^+ - \mathbf{H}^-) \qquad (5.7) $$

must be enforced, where $\mathbf{H}^{\pm}$ denotes the total magnetic field above and below the
sheet, R is the resistivity in Ohms per square meter and $\hat{n}_K$ is the unit normal vector
to the sheet pointing in the upward direction (+ side). For an impenetrable
impedance surface, the appropriate boundary condition is (see Section 1.6)

$$ \hat{n}_K \times (\hat{n}_K \times \mathbf{E}) = -\eta \, \hat{n}_K \times \mathbf{H} \qquad (5.8) $$

where $\hat{n}_K = -\hat{n}$ is the unit normal vector to the surface and $\eta$ is the surface
impedance ($\hat{n}_K$ points into the computational domain and as usual $\hat{n}$ points away
from the computational domain). Taking into consideration these boundary/transition
conditions, the functional for the total electric field can be more explicitly written as

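$$ F(\mathbf{E}) = \frac{1}{2} \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{E}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{E} \right] dV + jk_0 Z_0 \int_{S_K} \frac{1}{K} (\hat{n} \times \mathbf{E}) \cdot (\hat{n} \times \mathbf{E}) \, dS + \oint_{S_o} \mathbf{E} \cdot \mathbf{P}(\mathbf{E}) \, dS \qquad (5.9) $$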

where K is the surface resistivity of a resistive card and equals the surface impedance
for an impedance sheet. It has also been assumed that the relation between tangential
H and tangential E on the surface $S_o$ can be expressed in terms of the differential
equation

$$ \hat{n} \times \nabla \times \mathbf{E} = \mathbf{P}(\mathbf{E}) \qquad (5.10) $$

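Note that the factor of 1/2 in (5.9) was dropped on the assumption that in carrying out
the extremization of F(E), the differentiation will be applied to P(E) as well. Again,
we note that the corresponding weighted residual equation which would lead to the
same linear system is [13]

$$ \langle \mathbf{R}, \mathbf{W} \rangle = \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{W}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{W} \right] dV + jk_0 Z_0 \int_{S_K} \frac{1}{K} (\hat{n} \times \mathbf{E}) \cdot (\hat{n} \times \mathbf{W}) \, dS + \oint_{S_o} \mathbf{W} \cdot \mathbf{P}(\mathbf{E}) \, dS \qquad (5.11) $$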

because of the following identity:

$$ \mathbf{E} \cdot (\hat{n} \times \nabla \times \mathbf{E}) = -jk_0 Z_0 \, \mathbf{E} \cdot (\hat{n} \times \mathbf{H}) = -jk_0 Z_0 \, \mathbf{H} \cdot (\mathbf{E} \times \hat{n}) $$

Therefore, the integral over $S_o$ in (5.9) vanishes for PEC and PMC surfaces comprising
the inner boundaries of the volume V. Thus, the surface integral over $S_o$ reduces
to the integral over the outer boundary or the mesh termination surface. However,
for packaged structures which are bounded with electric walls (PEC surfaces), the
integral expression over the outer boundary vanishes too. For open structures, this is
not the case and leads to the use of absorbing boundary conditions (ABCs), which are
explained in detail in Chapters 4 and 6.


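To deal with anisotropic structures, the functional (5.9) undergoes a slight
modification since the material properties of the object (permeability and
permittivity) are now second-rank tensors rather than scalars. Equation (5.9) can
therefore be written as

$$ F(\mathbf{E}) = \frac{1}{2} \int_V \left[ (\nabla \times \mathbf{E}) \cdot \left\{ \bar{\bar{\mu}}_r^{-1} \cdot (\nabla \times \mathbf{E}) \right\} - k_0^2 \, \mathbf{E} \cdot \left\{ \bar{\bar{\epsilon}}_r \cdot \mathbf{E} \right\} \right] dV + jk_0 Z_0 \int_{S_K} \frac{1}{K} (\hat{n} \times \mathbf{E}) \cdot (\hat{n} \times \mathbf{E}) \, dS + \oint_{S_o} \mathbf{E} \cdot \mathbf{P}(\mathbf{E}) \, dS \qquad (5.12) $$

where

$$ \bar{\bar{\mu}}_r = \begin{bmatrix} \mu_{xx} & \mu_{xy} & \mu_{xz} \\ \mu_{yx} & \mu_{yy} & \mu_{yz} \\ \mu_{zx} & \mu_{zy} & \mu_{zz} \end{bmatrix} \qquad (5.13) $$

and

$$ \bar{\bar{\epsilon}}_r = \begin{bmatrix} \epsilon_{xx} & \epsilon_{xy} & \epsilon_{xz} \\ \epsilon_{yx} & \epsilon_{yy} & \epsilon_{yz} \\ \epsilon_{zx} & \epsilon_{zy} & \epsilon_{zz} \end{bmatrix} \qquad (5.14) $$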

The symmetry of the final system of equations now depends on the symmetry of $\bar{\bar{\mu}}_r$
and $\bar{\bar{\epsilon}}_r$. For packaged structures, when the surface integral over the outer mesh
boundary $S_o$ vanishes, we are left with the functional

$$ F(\mathbf{E}) = \frac{1}{2} \int_V \left[ (\nabla \times \mathbf{E}) \cdot \left\{ \bar{\bar{\mu}}_r^{-1} \cdot (\nabla \times \mathbf{E}) \right\} - k_0^2 \, \mathbf{E} \cdot \left\{ \bar{\bar{\epsilon}}_r \cdot \mathbf{E} \right\} \right] dV \qquad (5.15) $$

The functional representation for open boundaries is the subject of the next chapter.
The extremum of the functional can be found by using the Rayleigh-Ritz
minimization procedure, which amounts to differentiating F with respect to E and
then setting it to zero, as explained in Chapter 1. In practice, the differentiation is
done after introducing the expansion for E. By setting the derivative with respect to each
coefficient of the expansion to zero, a set of equations is obtained which is said to
be stationary with respect to the first variation in F.

5.2.2 Potential Formulation

The finite element problem can also be formulated in terms of vector and scalar
potentials. The potential formulation is attractive for obtaining reliable solutions
throughout the frequency spectrum. It is well known that the field formulation is
unstable for very low frequencies where the system of equations becomes nearly
singular. In circuit and waveguide applications, characterization of low frequency
design behavior could have significant importance. The potentials used as the
solution variables are the magnetic vector potential A and the electric scalar potential $\Phi$.
The potentials are defined in terms of electric and magnetic fields as [14]

$$ \nabla \times \mathbf{A} = \mathbf{B} \qquad (5.16) $$

$$ \mathbf{E} = -j\omega (\mathbf{A} + \nabla \Phi) \qquad (5.17) $$

where B is the magnetic flux density and $\omega$ is the angular frequency. Again, assuming
a source-free domain and using Maxwell's equations, the curl-curl wave equation
(5.1) can be rewritten as

$$ \nabla \times \left( \frac{1}{\mu} \nabla \times \mathbf{A} \right) - \omega^2 \epsilon (\mathbf{A} + \nabla \Phi) = 0 \qquad (5.18) $$
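As a check on the signs, the potential form of the wave equation follows from the two source-free Maxwell curl equations once the potentials are substituted. The sketch assumes the time convention exp(jωt) and the definition E = -jω(A + ∇Φ); both are our reading of the formulation rather than something stated explicitly at this point.

```latex
\begin{aligned}
\nabla\times\mathbf{E} &= -j\omega\,\nabla\times(\mathbf{A}+\nabla\Phi)
  = -j\omega\,\nabla\times\mathbf{A} = -j\omega\mathbf{B}
  \quad\text{(Faraday's law holds identically)}, \\
\nabla\times\Big(\tfrac{1}{\mu}\nabla\times\mathbf{A}\Big) &= \nabla\times\mathbf{H}
  = j\omega\epsilon\,\mathbf{E} = \omega^2\epsilon\,(\mathbf{A}+\nabla\Phi).
\end{aligned}
```

Because the curl of a gradient vanishes, Φ drops out of the first line, which is why the scalar potential adds unknowns without changing the fields.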

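The boundary conditions for perfect electric and perfect magnetic materials are a
little more complicated than in the field formulation. The perfect electric boundary
conditions in terms of potentials are

$$ \mathbf{A} \times \hat{n} = \frac{j}{\omega} (\mathbf{E} \times \hat{n}) = 0, \qquad \Phi = 0 \qquad (5.19) $$

while the perfect magnetic condition reduces to

$$ \frac{1}{\mu} (\nabla \times \mathbf{A}) \times \hat{n} = \mathbf{H} \times \hat{n} = 0 \qquad (5.20) $$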

The corresponding functional can then be formulated as before to yield

1
F(A , 4\320\233
~ \320\223--VxA-VxA-
\320\223'
ty2e(A + - A I
\320\243\320\244) dV
\320\244) -=\\
2jy\\_fj, ']\302\246

E.21)
2J

where $\alpha$ is a scalar parameter used to provide a gauge. The formulation is ungauged
for $\alpha^2 = 1$. The gauge varies between 0 and 1 and is guaranteed to provide the same
field values as the field formulation. However, it may be used to improve the
numerical stability of the algorithm.
As in the field formulation, the surface integral over $S_o$ vanishes for PEC and
PMC boundaries as well as for interior surfaces without transition conditions. The
sheet boundary condition can also be applied rather easily as in the field formulation
by setting

x V
\320\273 x A

E.22)
where K is defined as the surface impedance. The functional given by (5.21) is
discretized by using edge basis functions for representing the vector potential and
nodal basis functions for the scalar potential. Use of the edge-based vector potential
eliminates spurious modes (to be discussed in the next section) in the solution
spectrum and also helps in the enforcement of boundary conditions on edges and corners
of the desired structure. It should be noted that the potential formulation results in a
larger solution space for an identical mesh topology than the field formulation. This
is due to the extra nodal unknowns required to discretize the scalar potential $\Phi$.
However, as noted earlier in this section, the formulation allows robust analysis even
for the lower frequencies in the analysis spectrum [15]. The solution for (5.21) is
obtained by extremizing $F(\mathbf{A}, \Phi)$ with respect to A and $\Phi$.
5.3 ORIGIN OF SPURIOUS SOLUTIONS

Conventional finite element basis functions give rise to spurious solutions when
(5.31) is solved. As Wong and Cendes point out in [16], the origin of these spurious
solutions lies in the infinitely degenerate eigenvalue k = 0 in the eigenvalue spectrum.
Given the eigenvalue matrix system along with the PEC condition $\hat{n} \times \mathbf{E} = 0$ on the
boundary, there exists an infinite number of scalar functions $\Phi$ such that $\Phi = 0$ on
the boundary. Then $\mathbf{E} = -\nabla\Phi$ is a permitted eigenfunction corresponding to the
eigenvalue k = 0. If the discretization scheme fails to model this infinite dimensional
nullspace of the curl operator properly, spurious solutions to the eigenvalue problem
will appear. This is also the reason why spurious solutions do not occur when the
scalar wave equation is discretized since the nullspace of the Helmholtz operator is
the trivial eigensolution E = 0.
One way to get rid of spurious modes is to formulate the eigenvalue problem
such that k = 0 is no longer a permissible eigenvalue. This is achieved by enforcing

$$ \nabla \cdot \mathbf{E} = 0 \qquad (5.23) $$

exactly everywhere in the source-free solution region. Then the only solution
corresponding to the k = 0 eigenvalue is the trivial one, E = 0. In finite elements, solving
an eigenproblem along with a constraint (5.23) is well known [17]. Researchers have
mostly tried the penalty function approach of constrained minimization [18], [19]
since it is simple to implement. However, the penalty approach is a mere fix and not
a cure for the problem. Since the spurious eigenmodes are now shifted far into the
visible spectrum, they are not completely eliminated and are dependent on a user-
defined parameter which specifies how strongly the divergenceless condition (5.23) is
imposed. The spurious modes do not really go away since the nullspace is still being
modeled with very few variables when nodal elements are used.
Other than the penalty method, derivative continuous finite elements ($C^1$
elements) have also been proposed [16] to alleviate this problem. In this method,
an auxiliary vector field C is introduced such that

$$ \mathbf{E} = \nabla \times \mathbf{C} \qquad (5.24) $$

Since substitution of (5.24) for E into (5.15) results in second derivatives, we need to
construct first derivative continuous elements or $C^1$ elements. As shown in [16],
discretization of E using node-based $C^1$ elements eliminates the problem of spurious
solutions since the nullspace of the curl operator is modeled exactly. The added
constraint of normal derivative continuity across inter-element boundaries provides
the excess degrees of freedom to increase the size of the $\nabla\Phi$ subspace. However, $C^1$
elements are not commonly found in finite elements and may need to be explicitly
derived for the problem at hand.
Another method of eliminating spurious modes, without getting rid of the
eigenvalue k = 0, is by using edge elements [20]. Webb in [21] provides an elegant
rationale as to why spurious modes do not appear with edge-based elements and why
they are likely to be present with node-based vectorial elements.
Let us consider a single tetrahedral element. With a vector first-order node-
based formulation, the element has 12 degrees of freedom (three each for the four
vertices), i.e., the bases span a 12-dimensional space V. Since we are solving the curl-
curl equation, the space of $\nabla\Phi$ must contain all vectors of polynomial order less than
or equal to one. Therefore, $\Phi$ must be a second-order polynomial which needs ten
degrees of freedom to be complete. However, discounting the constant term which
has zero gradient, the dimension of the $\nabla\Phi$ subspace is nine. In a general mesh, the
degrees of freedom belonging to the $\nabla\Phi$ subspace must satisfy both tangential
continuity in $\Phi$ and normal continuity in $\nabla\Phi$. This requirement is usually not met
unless the mesh is a specially constructed $C^1$ mesh. Thus, the elements
belonging to the $\nabla\Phi$ subspace are relatively few.
In edge elements, however, the picture is different. The space V has dimension
6. More importantly, the dimension of the $\nabla\Phi$ subspace is only 3 since $\Phi$ needs to be
only a first-order polynomial. Hence, the elements of the subspace of $\nabla\Phi$ must now
satisfy only tangential continuity, which is achieved by making $\Phi$ continuous across
element boundaries. This is automatically satisfied by edge elements and, therefore,
the $\nabla\Phi$ subspace is fully represented by the mesh entities. In fact, the dimension of
the $\nabla\Phi$ subspace is one less than the total number of internal nodes in the mesh.
Thus, the null space of the curl operator, $\nabla\Phi$, is modeled more completely using edge
elements rather than using node-based elements.
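The claim that edge elements represent the gradient subspace exactly can be checked on a single tetrahedron. The sketch below is a minimal illustration, not the book's implementation: it assumes the unnormalized Whitney form W_ij = λ_i ∇λ_j - λ_j ∇λ_i (the Chapter 2 basis may carry an extra scaling), and the node coordinates, nodal values and function names are hypothetical. It verifies numerically that the gradient of a linear nodal interpolant Φ equals the edge expansion with coefficients (Φ_j - Φ_i).

```python
import itertools

def sub(a, b):
    return [a[k] - b[k] for k in range(3)]

def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# The six edges of a tetrahedron as local node pairs (i, j) with i < j.
EDGES = list(itertools.combinations(range(4), 2))

def barycentric_gradients(p):
    """Constant gradients of the linear nodal functions lambda_0..lambda_3."""
    e1, e2, e3 = sub(p[1], p[0]), sub(p[2], p[0]), sub(p[3], p[0])
    jac = dot(e1, cross(e2, e3))  # six times the signed volume
    g1 = [c / jac for c in cross(e2, e3)]
    g2 = [c / jac for c in cross(e3, e1)]
    g3 = [c / jac for c in cross(e1, e2)]
    g0 = [-(g1[k] + g2[k] + g3[k]) for k in range(3)]
    return [g0, g1, g2, g3]

def barycentric(p, grads, x):
    """Values lambda_i(x); lambda_i is 1 at node i and 0 at the other nodes."""
    lam = [dot(g, sub(x, p[0])) for g in grads]
    lam[0] += 1.0
    return lam

def whitney(p, grads, x, i, j):
    """Edge function W_ij = lambda_i grad(lambda_j) - lambda_j grad(lambda_i)."""
    lam = barycentric(p, grads, x)
    return [lam[i] * grads[j][k] - lam[j] * grads[i][k] for k in range(3)]

def gradient_from_edges(p, grads, x, phi):
    """Evaluate the sum over edges of (phi_j - phi_i) W_ij at the point x."""
    total = [0.0, 0.0, 0.0]
    for i, j in EDGES:
        w = whitney(p, grads, x, i, j)
        for k in range(3):
            total[k] += (phi[j] - phi[i]) * w[k]
    return total

nodes = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.2, 1.1, 0.0], [0.3, 0.4, 0.9]]
grads = barycentric_gradients(nodes)
phi = [0.5, -1.0, 2.0, 0.25]   # arbitrary nodal values of Phi
x = [0.35, 0.3, 0.2]           # a point inside the element
exact = [sum(phi[n] * grads[n][k] for n in range(4)) for k in range(3)]
approx = gradient_from_edges(nodes, grads, x, phi)
print(max(abs(exact[k] - approx[k]) for k in range(3)))  # round-off level
```

Since only differences of the four nodal values enter, the gradients span a three-dimensional subspace of the six edge unknowns, in line with the count given in the text.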
The above discussion conclusively proves that the root cause of spurious modes
is the improper modeling of the nullspace of the curl operator. Any basis function
which approximates it fully will be stable and free of spurious modes. As it turns out,
conventional Lagrangian finite elements are unsuitable; either $C^1$ node-based
elements or edge-based elements of any order can be used to obtain the true
solutions.

5.4 MATRIX GENERATION AND ASSEMBLY

To discretize the electric field E within this volume, we subdivide the volume into
small tetrahedra, rectangular bricks or any suitable three-dimensional element (see
Fig. 3.3), each occupying the volume $V^e$ (e = 1, 2, ..., M), where M is the total
number of elements. As in the previous chapters, we expand the discretization
variable within each finite element in terms of polynomial basis functions. In this section,
we will use the electric field as the unknown variable. Other electromagnetic
quantities like currents and potentials can be expanded similarly.
Let us introduce the expansion (valid within the eth volume element)

$$ \mathbf{E}^e = \sum_{j=1}^{m} E_j^e \mathbf{W}_j^e \qquad (5.25) $$

where $\mathbf{W}_j^e$ are the edge-based vector basis functions and $E_j^e$ denotes the expansion
coefficients of the basis $\mathbf{W}_j^e$. The upper summation index m represents the number of
edges comprising the element, and the superscript e stands for the element number.
On substituting the expansion (5.25) into (5.15) and setting $\partial F(\mathbf{E})/\partial E_i^e = 0$, we obtain
the system of equations

\320\274 \320\274\320\274
- -
? {Ae}{Ee} ki J218\"^&)
=
J2\342\204\226\342\204\226)
10} E.26)

where

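$$ A_{ij}^e = \int_{V^e} \frac{1}{\mu_r} (\nabla \times \mathbf{W}_i^e) \cdot (\nabla \times \mathbf{W}_j^e) \, dV \qquad (5.27) $$

$$ B_{ij}^e = \int_{V^e} \epsilon_r \mathbf{W}_i^e \cdot \mathbf{W}_j^e \, dV \qquad (5.28) $$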

Cf =jk0Z0 1 Wf
-
(n x U)Js ~\\ -(fix Wf)
- x
\321\204 Wj)dS E.29)
|_Jy Jsj k J
In the preceding equations, all matrices and vectors following the summation sign
have been augmented using global numbers.
The surface $S^e$ indicates the boundary of element e. As mentioned earlier,
all free faces, i.e., faces on the boundary, have zero contribution to the surface
integral since we consider only packaged structures in this chapter. This reduces
the original unknown count and eliminates the need to generate equations for
those edges/unknowns which would otherwise have to be included in the solution.
We can further simplify the surface integral contribution to [C] by taking advantage
of the inter-element continuity afforded by finite element basis functions. Due to the
continuity of tangential H at the interface between two elements, an element face
lying inside the body does not contribute to the integral over $S^e$ in the final assembly
of the element equations. As a result, the last term of (5.26) merely reduces to the
integral contribution over imperfectly conducting or impedance surfaces ($S_K$).
For simplicity, let us assume that no impedance surfaces exist in the structure.
As will become clear in the derivation below, it is difficult to solve the problem in the
presence of impedance structures. With all surface integrals reduced to zero, we are
left with the equation

$$ [A]\{E\} - k_0^2 [B]\{E\} = 0 \qquad (5.30) $$
which corresponds to the generalized eigenvalue problem for finding cavity
resonances. The matrices [A] and [B] are N x N symmetric, sparse matrices with N being
the total number of edges resulting from the subdivision of the structure excluding
the edges on the boundary. Their entries are given by (5.27) and (5.28). As usual, {E}
is a column vector of order N denoting the coefficients of edge bases and $k = k_0^2$ is
the eigenvalue of the system.
For mathematical treatment, we rewrite (5.30) as

$$ [A]\{E\} = k [B]\{E\} \qquad (5.31) $$

whose solution will yield the resonant field distribution {E} and the corresponding
wavenumber $k_0$. The inclusion of impedance structures in (5.31) would require us to
solve a nonlinear eigenproblem given by

[A)\\E] = E.32)

where $k_0$ is the desired eigenvalue and [R] is the contribution from the surface
impedance terms. This problem is usually solved by obtaining an initial guess
from the solution of the linear eigenproblem and subsequently using this solution
to find the true solution iteratively. For more discussion on such a solution process,
the interested reader is referred to [22].
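For the linear case, the generalized eigenproblem [A]{E} = k[B]{E} with k = k_0^2 can be reduced to a standard symmetric eigenproblem. The sketch below is a small dense illustration using only the Python standard library: a Cholesky factorization [B] = L L^T followed by Jacobi rotations on L^-1 [A] L^-T. The 3 x 3 matrices are hypothetical stand-ins (not finite element matrices), chosen so that [A] has a one-dimensional nullspace, and the function names are ours. A production code would use a sparse eigensolver instead.

```python
import math

def cholesky(B):
    """Lower-triangular L with B = L L^T (B symmetric positive definite)."""
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = B[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, M):
    """Solve L X = M for the matrix X by forward substitution."""
    n, m = len(L), len(M[0])
    X = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for c in range(m):
            X[i][c] = (M[i][c] - sum(L[i][k] * X[k][c] for k in range(i))) / L[i][i]
    return X

def transpose(M):
    return [list(row) for row in zip(*M)]

def jacobi_eigenvalues(C, sweeps=100, tol=1e-24):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(C)
    a = [row[:] for row in C]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):          # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):          # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def generalized_eigenvalues(A, B):
    """Eigenvalues k of A x = k B x for symmetric A and positive definite B."""
    L = cholesky(B)
    X = solve_lower(L, A)                 # X = L^-1 A
    C = solve_lower(L, transpose(X))      # C = L^-1 A L^-T since A is symmetric
    return jacobi_eigenvalues(C)

A = [[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]
B = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
ks = generalized_eigenvalues(A, B)
# Discard the k = 0 (nullspace) solutions and convert k = k0^2 to wavenumbers.
k0 = [math.sqrt(k) for k in ks if k > 1e-9]
print(ks, k0)
```

With edge elements the k = 0 eigenvalues correspond to the gradient fields discussed in Section 5.3, so filtering them out by a small threshold, as done here, is one simple way to keep only the resonant solutions.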

As mentioned in Chapters 3 and 4, the finite element matrices are sparse and
can be filled very quickly. If the element shapes used for discretization are simple,
then the local element matrices $[A^e]$ and $[B^e]$ can be derived analytically and used to fill
the global system after taking the proper numbering into account. For tetrahedra
using $H_0$(curl) elements (see Chapter 2), the element matrices are given by

(b6,b6) (b6lb4)
\342\200\224(\320\254\320\261,
\320\252$) (b6, b3) -(b6, b2) 0\302\2736.
bl)

-(bs.be) (bs,b5) -(bs, b4) -(bs, b3) 0>5. \320\2542) -(b5, bi)

1 1 (\320\2544,\320\2545)
-(\320\2544,\320\2545) (b4.b4) (b4. b3) -(b4,b2) (b4, bi)

-(b3lb5)
(\320\254\320\267.\320\254\320\261) (b3<b4) (b3.b.o -(b3. b2) (b3, b,)

-(\320\2542.\320\2546) (b2.bs) -(b2,b4) -(b2. b3) (b2. b2) -(b2. b.)


. (b,,b6) -(b,,bs) (b,, b4) (hi. \321\2143) \"(b,,b2) (hi, b|)
E.33)

where $(\mathbf{b}_i, \mathbf{b}_j)$ denotes the dot product of the edge vectors $\mathbf{b}_i$ and $\mathbf{b}_j$ of the tetrahedral
element. Referring to Fig. 2.6 and Table 2.4 as an example, we note that

b, = (x2 - - + (z2
\320\243\\)\321\203
- zx)z

b2
= (x-i --

The element matrix $[B^e]$ for a linear edge basis expansion on a tetrahedron is given by
[23]

\342\200\224
J

2(\320\233,-/.2 \320\2331-\320\2332- \320\233.-\320\2332- /12-/22- /12- /22- \320\233\320\227-/23

+/22) \320\233\320\267+2/23 /14 + 2/24 + /23


2\320\233\320\267 2/,4 + /24 -\320\2334+ /24
-
/\320\277- /.2- 2(\320\233,-\320\233\320\267 2 -/23
/\321\206-\320\233\320\267-2\320\233 /.2 -/23 \320\233\320\267
/\320\267\320\267-

t- 2/23 /|4
+/\320\267\320\267) + 2/34 + /\320\267\320\267
-\320\233\320\267 -/.4 + /34 2/,4 + /34
-
/\320\277- /12- /\321\206- 2(\320\233,-/.4 /J
\342\200\224
/24 2/,2-/24 2\320\233\320\267/34
1-2/24 1-2/34 +/44) + /34
-\320\233\320\267 -/|4 + /44 -/14+ /44
/22- 2/,2 -/23 /.2 -/24 2(/22-/23 /22-/23- /23-/33-
+ /23
2\320\233\320\267 + /\320\267\320\267
-\320\233\320\267 + /34
-\320\233\320\267 /24
+/\320\267\320\267) 4 \302\2462/34 2/24+ /34
-
/22- /23- 2/,2 -/24 /22 /23- ^^ -/24 2/23-/34

2/,4 + /24 /.4 + /34 + /44 /24 + 2/\320\2674 +/44)


\342\200\224
/24 + /44
\302\246 -
/23-
\320\233\320\267- \320\233\320\267-
/\320\267\320\267- -/34
2/.\320\267 /23-/33- 2/2\320\267 \"/34 /34
2(/\320\267\320\267

2/,4 + /34 -\320\2334+ /44 2/24 + /34 -/24 + /44 +/44)

E.34)

with $f_{ij} = \mathbf{F}_i \cdot \mathbf{F}_j$. Here, $\mathbf{F}_i$ is the inward oriented normal vector to the tetrahedron's
face opposite to node i and has an amplitude equal to the area of the triangular
face.
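The analytical element matrices can be cross-checked numerically. The sketch below builds the 6 x 6 curl-curl and mass matrices of one tetrahedron under stated assumptions: it uses the unnormalized Whitney functions W_ij = λ_i ∇λ_j - λ_j ∇λ_i with edges ordered as local node pairs (a hypothetical ordering and scaling that may differ from Table 2.4), the fact that ∇ × W_ij = 2 ∇λ_i × ∇λ_j is constant, and the exact integral of λ_a λ_b over the element, V(1 + δ_ab)/20. Note that ∇λ_i = F_i/(3V), so the dot products of the gradients are the f_ij of the text divided by 9V^2. Function and variable names are illustrative only.

```python
import itertools

def sub(a, b):
    return [a[k] - b[k] for k in range(3)]

def dot(a, b):
    return sum(a[k] * b[k] for k in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Assumed local edge ordering: node pairs (i, j) with i < j.
EDGES = list(itertools.combinations(range(4), 2))

def tet_element_matrices(p, mu_r=1.0, eps_r=1.0):
    """Return (A, B, volume) for the six Whitney edge functions of one tetrahedron."""
    e1, e2, e3 = sub(p[1], p[0]), sub(p[2], p[0]), sub(p[3], p[0])
    jac = dot(e1, cross(e2, e3))          # six times the signed volume
    vol = abs(jac) / 6.0
    g = [None,
         [c / jac for c in cross(e2, e3)],
         [c / jac for c in cross(e3, e1)],
         [c / jac for c in cross(e1, e2)]]
    g[0] = [-(g[1][k] + g[2][k] + g[3][k]) for k in range(3)]
    # curl W_ij = 2 grad(lambda_i) x grad(lambda_j), constant in the element.
    curls = [[2.0 * c for c in cross(g[i], g[j])] for i, j in EDGES]
    A = [[vol * dot(cm, cn) / mu_r for cn in curls] for cm in curls]

    def lam_int(a, b):
        # Exact integral of lambda_a * lambda_b over the tetrahedron.
        return vol * (2.0 if a == b else 1.0) / 20.0

    B = [[0.0] * 6 for _ in range(6)]
    for r, (i, j) in enumerate(EDGES):
        for c, (k, l) in enumerate(EDGES):
            B[r][c] = eps_r * (lam_int(i, k) * dot(g[j], g[l])
                               - lam_int(i, l) * dot(g[j], g[k])
                               - lam_int(j, k) * dot(g[i], g[l])
                               + lam_int(j, l) * dot(g[i], g[k]))
    return A, B, vol

nodes = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.2, 1.1, 0.0], [0.3, 0.4, 0.9]]
A, B, vol = tet_element_matrices(nodes)
E0 = [0.7, -0.2, 1.5]                      # a constant field, E0 = grad(E0 . r)
phi = [dot(E0, n) for n in nodes]
e = [phi[j] - phi[i] for i, j in EDGES]    # edge coefficients of grad(Phi)
Ae = [sum(A[r][c] * e[c] for c in range(6)) for r in range(6)]
eBe = sum(e[r] * B[r][c] * e[c] for r in range(6) for c in range(6))
print(max(abs(v) for v in Ae))             # gradients lie in the nullspace of [A]
print(eBe, dot(E0, E0) * vol)              # energy of a constant field, two ways
```

The two checks at the end are useful sanity tests for any hand-coded element matrix: a gradient field must give zero curl-curl energy, and a constant field must give a mass-matrix energy equal to its squared magnitude times the element volume.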

The vector edge-based expansion functions for rectangular bricks were
presented in [24], [25]. These basis functions were reviewed in Chapter 2 and consist of
12 unknowns, one for each of the 12 edges in the rectangular brick element. They are
rather simple to derive analytically and are presented here for the sake of
completeness. The $[A^e]$ elemental matrix is given by
V
E.35)

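where the square matrices $[K_1]$, $[K_2]$, and $[K_3]$ are

$$ [K_1] = \begin{bmatrix} 2 & -2 & 1 & -1 \\ -2 & 2 & -1 & 1 \\ 1 & -1 & 2 & -2 \\ -1 & 1 & -2 & 2 \end{bmatrix} \qquad (5.36) $$

$$ [K_2] = \begin{bmatrix} 2 & 1 & -2 & -1 \\ 1 & 2 & -1 & -2 \\ -2 & -1 & 2 & 1 \\ -1 & -2 & 1 & 2 \end{bmatrix} \qquad (5.37) $$

$$ [K_3] = \begin{bmatrix} 2 & 1 & -2 & -1 \\ -2 & -1 & 2 & 1 \\ 1 & 2 & -1 & -2 \\ -1 & -2 & 1 & 2 \end{bmatrix} \qquad (5.38) $$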
and the notation $[\ ]^T$ denotes the matrix transpose (see Fig. 2.11 for the definition of
parameters and node specification).² Also, $h_{x,y,z}$ denotes the length of the edges as
shown in Fig. 2.11. Substituting the values of $[K_1]$, $[K_2]$, and $[K_3]$ in (5.35) yields
a (12 x 12) system for the $[A^e]$ element matrix. The values for the matrix $[B^e]$ are
given by
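$$ [B^e] = \frac{\epsilon_r h_x h_y h_z}{36} \begin{bmatrix} [L_1] & 0 & 0 \\ 0 & [L_1] & 0 \\ 0 & 0 & [L_1] \end{bmatrix} \qquad (5.39) $$

where

$$ [L_1] = \begin{bmatrix} 4 & 2 & 2 & 1 \\ 2 & 4 & 1 & 2 \\ 2 & 1 & 4 & 2 \\ 1 & 2 & 2 & 4 \end{bmatrix} \qquad (5.40) $$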
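The integer patterns of the brick matrices can be recovered from one-dimensional integrals of linear shape functions. The sketch below is an illustration under assumptions of our own: each of the four parallel edges of one direction carries the function s_a(η) s_b(ζ) with s_0(t) = 1 - t and s_1(t) = t on the unit interval, and the edges are ordered by the corners (0,0), (1,0), (0,1), (1,1) of the transverse face (the ordering of Fig. 2.11 may differ). Exact rational arithmetic reproduces the mass pattern with a 1/36 factor and the two symmetric stiffness patterns with a 1/6 factor.

```python
from fractions import Fraction as Fr

# Integrals over [0, 1] of s_a * s_c for s_0(t) = 1 - t and s_1(t) = t.
MASS_1D = [[Fr(1, 3), Fr(1, 6)], [Fr(1, 6), Fr(1, 3)]]
# Integrals over [0, 1] of s_a' * s_c' (the derivatives are -1 and +1).
STIFF_1D = [[Fr(1), Fr(-1)], [Fr(-1), Fr(1)]]

# Assumed ordering of the four parallel edges by transverse corner (eta, zeta).
CORNERS = [(0, 0), (1, 0), (0, 1), (1, 1)]

def pattern(mat_eta, mat_zeta, scale):
    """scale * (integral in eta) * (integral in zeta) for every pair of edges."""
    return [[int(scale * mat_eta[a][c] * mat_zeta[b][d]) for (c, d) in CORNERS]
            for (a, b) in CORNERS]

L1 = pattern(MASS_1D, MASS_1D, 36)    # mass pattern
K1 = pattern(STIFF_1D, MASS_1D, 6)    # derivative taken along eta
K2 = pattern(MASS_1D, STIFF_1D, 6)    # derivative taken along zeta

for name, mat in (("L1", L1), ("K1", K1), ("K2", K2)):
    print(name, mat)
```

The 1/36 factor multiplies the brick volume in physical coordinates, which is consistent with a mass matrix prefactor proportional to h_x h_y h_z / 36. The third, nonsymmetric pattern couples edges of different directions and is not reproduced by this sketch.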

In general, the implementation of the finite element discretization involves two
numbering systems, and thus some unique global edge direction must be defined to
ensure the continuity of $\hat{n} \times \mathbf{E}$ across all edges [26]. Here, we choose this direction to
be coincident with the edge vector pointing from the smaller to the larger global
node number. This convention ensures that the edge directions are consistent with
the local and global numbering of the unknowns.
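The edge direction convention can be sketched in a few lines. The example below assumes a hypothetical local edge table for a tetrahedron (local node pairs, start to end) and represents the mesh as a list of 4-tuples of global node numbers; the function name and data layout are ours. Each global edge is keyed by its sorted node pair, and an element edge gets the sign +1 when its local direction already runs from the smaller to the larger global node number, and -1 otherwise.

```python
# Hypothetical local edge table: (start, end) local node indices of each edge.
LOCAL_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def number_edges(tets):
    """Assign global edge numbers and orientation signs to every element edge.

    tets: list of 4-tuples of global node numbers.
    Returns (edge_ids, elem_edges, elem_signs), where edge_ids maps a sorted
    node pair to a global edge number, and the two lists hold, per element,
    the global edge number and the sign of each local edge.
    """
    edge_ids = {}
    elem_edges, elem_signs = [], []
    for nodes in tets:
        ids, signs = [], []
        for a, b in LOCAL_EDGES:
            na, nb = nodes[a], nodes[b]
            key = (min(na, nb), max(na, nb))
            # A new edge receives the next free number; a known edge is reused.
            ids.append(edge_ids.setdefault(key, len(edge_ids)))
            # +1 if the local direction points from smaller to larger node number.
            signs.append(1 if na < nb else -1)
        elem_edges.append(ids)
        elem_signs.append(signs)
    return edge_ids, elem_edges, elem_signs

# Two tetrahedra sharing the face with global nodes 2, 3 and 4.
tets = [(1, 2, 3, 4), (3, 2, 5, 4)]
edge_ids, elem_edges, elem_signs = number_edges(tets)
print(len(edge_ids), elem_edges[1], elem_signs[1])
```

During assembly, a local entry (i, j) of an element matrix would then be added to the global position (elem_edges[e][i], elem_edges[e][j]) after multiplication by the product of the two signs, so that both elements sharing an edge refer to the same tangential field direction.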

5.5 SOURCE MODELING

To model microwave circuits, we usually need sources at the input ports to excite the
circuit. The source modeling issue is critical since it provides the input boundary
condition for the problem. A small error in the source model could lead to large
errors in the 3D simulation of circuit parameters. In providing the source model for a
microwave circuit, it is assumed that the field is already accurately solved (stabilized)
throughout the volume of the configuration. A rather simplistic way of establishing
the stabilized mode pattern would be to use a voltage source excitation at the input

² The reader is cautioned that the matrix given in [25] contains misprints which have been corrected
in (5.35).

port and use a feeder transmission line of suitable length to excite the actual circuit.
The disadvantage of this method is that the feeder structure increases problem size
and wastes valuable computer resources. Moreover, complicated transmission line
configurations cannot be adequately modeled and evanescent modes cannot be
included. In this section, we discuss the transfinite element method
for modeling the source excitation in circuit applications [27], [28].
The transfinite element method matches the projection of the desired modal
fields at the port with the fields obtained from the three-dimensional solution inside
the structure. Thus, it allows the specification of the desired modal excitations,
propagating as well as evanescent, for obtaining the circuit parameters in a
systematic manner. The modal patterns are obtained from an eigensolution of the 2.5D
problem, as discussed in Chapter 4. Since the dominant eigenvalues for the ports
can be obtained relatively cheaply and the feed line of the source model mentioned
earlier can be eliminated completely, the transfinite element method is very efficient
numerically. The following derivation for modeling the source excitation borrows
heavily from [28].
The problem at hand is to characterize the scattering parameters of an N-port
circuit. At first, we assume for simplicity that port 1 is excited with the dominant
mode. We also assume that the normal direction of propagation is the z-axis. The
electric field at each port surface can then be described as

\321\206\320\234\321\206\320\265-\342\204\242 E.41)

ciijMtje-w, for / = 2 N E.42)


7=1

where

= \\jg \320\243\320\274\302\273 / \320\241


\320\220\320
\\
\320\2251\320\237\320\241
\320\234\321\206\320\265\021 (j.4j)

since only the dominant mode excitation is being considered for now. In (5.42), $\mathbf{M}_{ij}$
is the modal pattern (eigenvector) for mode j at port i, $\gamma_{ij}$ is the propagation constant
of the jth mode in the ith port and $a_{ij}$ are the scattering coefficients to be determined
in the analysis. As noted earlier, the modal information is obtained from a 2.5D
analysis of the port eigenvalue problem.
It remains to match the fields at the ports to the edge-based volume finite
elements in the interior of the domain. Using the continuity condition for tangential
electric fields at material interfaces, we set the projection of the modal approximation
equal to projections on the port plane from the finite element basis functions inside
the computational volume. Tangential continuity of edge-based finite elements
across inter-element boundaries simplifies the field matching condition. Moreover,
continuity of the tangential magnetic field at the port surface is automatically
satisfied since it is a natural boundary condition of the electric field formulation. The field
matching condition is imposed through the surface integrals on the port planes.
Assuming unit power at each port, we can normalize the surface integrals for each
mode on port surfaces such that

-!-E(> x V x
E,v E.44)
-I .-=o
where $S_i$ is the aperture surface at the ith port and

for propagating modes


= V-T for evanescent modes (-
{1y

On substituting (5.44) into the surface integral of the electric field functional, we
obtain

. No. of Modes
\342\200\224
(Ex V x E) \342\200\242
\320\2535 =->/xo
Yl
i s, Mr /=l
. . / No.o
.ofModes
\342\200\224
(ExVxE).?rfS = /e**0 \320\234- E.47)
2)

Thus the boundary integrals over the port surfaces are obtained analytically. The
obvious advantage of this technique is that the scattering parameters, $a_{ij}$, are
obtained as a result of the solution process and are, therefore, variational.
Moreover, all the unknowns on the port collapse to the desired scattering parameters
for the 3D solution. Of course, the modal solution needs to be computed initially
from a 2.5D analysis of the port eigenproblem.
The functional obtained by enforcing the mode matching condition at the ports
is outlined in [28]. The resulting system of equations is derived by separating the
interior domain from the exterior and coupling the known modal solutions at the
ports through the surface integrals shown in (5.47). The final system of equations is
given by extremizing the functional by the Rayleigh-Ritz technique and is expressed
in matrix form as

Uml] [AisJiM,) [AlS,]{M2\\

{Ml\\[ASlS,){M2)
[M2)[AS!Sl)[M2}+Ja>n[S]2_

E.48)

In defining the matrix entries, the subscript i denotes an interior edge and the
subscripts $S_1$ and $S_2$ denote edges on ports 1 and 2, respectively. The matrix entries of
$[A_{kl}]$ are the usual volume integral representation given by

$$ A_{kl} = \int_V \left( \frac{1}{\mu_r} \nabla \times \mathbf{W}_k \cdot \nabla \times \mathbf{W}_l - k_0^2 \epsilon_r \mathbf{W}_k \cdot \mathbf{W}_l \right) dV $$

The column {E} represents the interior volume field coefficients as in (5.31), whereas
$\{a_1\}$ and $\{a_2\}$ stand for the unknown modal coefficients of the expansions (5.41) and
(5.42). Thus, the transfinite method gives rise to a partly sparse and a partly dense
matrix. However, the dimension of the dense part is limited to the sum of the number
of modes per port and the number of ports in the structure. It is, therefore, always
much smaller than the sparse portion of the matrix which represents the volume
unknowns in the geometry. Other circuit excitations, like a voltage or a current
source, can be modeled in the usual way by placing a voltage or current element
and integrating over the volume of the element or elements occupied by the source.

More details on source modeling are given in Chapter 7 for antenna and scattering
applications. The integrals over the sources are given in (5.4) and (5.5).

5.6 APPLICATIONS

Finite elements in closed domains have a variety of applications, especially for
analyzing cavity resonances, modeling packaging discontinuities and transitions,
characterizing shielded circuit configurations, and analyzing inhomogeneous
waveguides with arbitrary material filling. In the subsequent sections, a few examples
are mentioned to illustrate the highlights of the technique.

5.6.1 Cavity Resonators

Solving Maxwell's equations for the resonances of a closed cavity is important
in understanding and controlling the operation of many devices, including particle
accelerators, microwave filters, and microwave ovens.
The rectangular cavity is a simple structure, but is widely used for feeding
complex microwave devices. In Table 5.1, we present a comparison of the percentage
error in the computation of eigenvalues for a 1 cm x 0.5 cm x 0.75 cm rectangular
cavity using edge-based rectangular bricks and tetrahedra. The edge-based
formulation using tetrahedral elements predicts the first six distinct non-trivial eigenvalues
with less than four percent error and is seen to provide better accuracy than
rectangular brick elements. Both the tetrahedral and the brick elements used in the
computation are $H_0$(curl) elements. The maximum edge length for the rectangular brick
elements is 0.15 cm whereas that for the tetrahedral elements is 0.2 cm. Figure 5.2
shows that the tetrahedral elements have slightly less error when the same number of
unknowns as with bricks is used for modeling the rectangular cavity. However, it
cannot be categorically stated that the brick elements are better than tetrahedrals
for modeling rectangular structures. The primary advantage of tetrahedra lies in
their generality in being able to automatically mesh arbitrary structures. This is

TABLE 5.1 Eigenvalues ($k_0$, cm$^{-1}$) for an Empty 1 cm x 0.5 cm x 0.75 cm Rectangular Cavity

                      Computed        Computed
                      (bricks)        (tetra.)        Error (%)   Error (%)
Mode     Analytical   270 Unknowns    260 Unknowns    (bricks)    (tetra.)

TE101    5.236        5.307           5.213           -1.36        0.44
TM110    7.025        7.182           6.977           -2.23        0.70
TE011    7.531        7.725           7.474           -2.58        1.00
TE201    7.531        7.767           7.573           -3.13       -0.56
TM111    8.179        8.350           7.991           -2.09        2.29
TE111    8.179        8.350           8.122           -2.09        0.70
TM210    8.886        9.151           8.572           -2.98        3.53
TE102    8.947        9.428           8.795           -5.38        1.70

Source: After Chatterjee et al. [29], © IEEE, 1992.

Figure 5.2 Performance comparison of rectangular bricks and tetrahedrals (curves labeled Tetrahedron and Brick; horizontal axis: number of unknowns). [After Chatterjee et al. [29], © IEEE, 1992.]

not true for rectangular bricks unless staircasing is permitted to model curved surfaces, as is done in the finite difference method. Bricks are used primarily for convenience in meshing and can lead to a reduction in unknowns in solids with rectangular cross sections.
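
The analytical column of Table 5.1 follows from the closed-form resonant wavenumbers of an empty rectangular cavity with perfectly conducting walls. The short sketch below is only an illustration: the function names are ours, and the percentage error is assumed to be defined as (analytical - computed)/analytical, which is consistent with the signs appearing in the table.

```python
import math


def cavity_wavenumber(m, n, p, a=1.0, b=0.5, c=0.75):
    """Resonant wavenumber k0 (cm^-1) of mode (m, n, p) in an empty PEC box.

    a, b, c are the cavity dimensions in cm along x, y, z.
    """
    return math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (p / c) ** 2)


def percent_error(analytical, computed):
    """Signed percentage error, assumed to be (analytical - computed)/analytical."""
    return 100.0 * (analytical - computed) / analytical


# A few of the modes listed in Table 5.1.
for label, indices in [("TE101", (1, 0, 1)), ("TM110", (1, 1, 0)), ("TE102", (1, 0, 2))]:
    print(label, round(cavity_wavenumber(*indices), 3))

# Error of the brick solution for the dominant mode (tabulated values).
print(round(percent_error(5.236, 5.307), 2))
```

The same two functions can be used to tabulate the error of any computed eigenvalue against its analytical counterpart.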


For our next example, we compute the eigenvalues of a rectangular cavity, half of which is filled with a dielectric with permittivity $\epsilon_r = 2$. In Table 5.2 we compare the exact eigenvalues with those computed using edge-based tetrahedral finite elements. The exact eigenvalues of the half-filled cavity as described in Table 5.2 are computed by solving the transcendental equation obtained upon matching the tangential electric and magnetic fields at the air-dielectric interface. As seen, these results agree with those predicted by the finite element solution to within one percent.
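
As a rough illustration of how such a transcendental equation can be solved, the sketch below treats the $y$-independent modes of the half-filled cavity, assuming the only electric field component is $E_y = \sin(m\pi x/a) f(z)$ with $f$ vanishing on the walls at $z = 0$ and $z = c$. Matching $f$ and $df/dz$ at the interface gives the characteristic function coded here. The function names, the bisection solver, and the bracketing intervals are our own choices and not part of the original formulation.

```python
import cmath
import math


def sin_over_k(k, length):
    """sin(k*length)/k, which is real for real or purely imaginary k."""
    if abs(k) < 1e-12:
        return length
    return (cmath.sin(k * length) / k).real


def characteristic(k0, m, a=1.0, d=0.5, h=0.5, eps_r=2.0):
    """Interface-matching function; its zeros are the resonant wavenumbers.

    d is the thickness of the air region, h that of the dielectric (cm).
    """
    kx = m * math.pi / a
    kz_air = cmath.sqrt(k0 ** 2 - kx ** 2)          # may be imaginary
    kz_die = cmath.sqrt(eps_r * k0 ** 2 - kx ** 2)
    return (sin_over_k(kz_air, d) * cmath.cos(kz_die * h).real
            + cmath.cos(kz_air * d).real * sin_over_k(kz_die, h))


def bisect_root(f, lo, hi, tol=1e-10):
    """Plain bisection on a bracket [lo, hi] with a sign change."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Brackets chosen by inspection for the lowest m = 1 and m = 2 modes.
k_101 = bisect_root(lambda k: characteristic(k, m=1), 3.2, 4.5)
k_201 = bisect_root(lambda k: characteristic(k, m=2), 4.6, 5.8)
print(f"{k_101:.3f} {k_201:.3f}")
```

Dividing both terms by the respective longitudinal wavenumber keeps the function real even when the field is evanescent in the air region, which is the case for the lowest $m = 2$ mode.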

TABLE 5.2 Eigenvalues ($k_0$, cm$^{-1}$) for a Half-Filled 1 cm x 0.1 cm x 1 cm Rectangular Cavity Having a Dielectric Filling of $\epsilon_r = 2$ Extending from $z$ = 0.5 cm to $z$ = 1.0 cm

Mode      Analytical   Computed (192 Unknowns)   Error (%)

TEz101    3.538        3.534                     0.11
TEz201    5.445        5.440                     0.10
TEz102    5.935        5.916                     0.32
TEz301    7.503        7.501                     0.04
TEz202    7.633        7.560                     0.97
TEz103    8.096        8.056                     0.50

Finally, Table 5.3 presents the eigenvalues of the geometry illustrated in Fig. 5.3. This is a closed metallic cavity with a ridge along one of its faces. Note that even with a relatively coarse initial mesh (267 unknowns), the dominant eigenvalues are recovered with less than two percent error. However, a much finer mesh is needed to obtain a reasonable approximation to the modal field pattern or the eigenvector of the geometry.
As the degeneracy of the eigenvalues increases, the eigenvalue problem
becomes increasingly ill-conditioned and the numerical solution is correspondingly

TABLE 5.3 Ten Lowest Nontrivial Eigenvalues ($k_0$, cm$^{-1}$) for the Ridged Waveguide Geometry: (a) 267 Unknowns; (b) 671 Unknowns

No.    (a)      (b)

 1     4.941    4.999
 2     7.284    7.354
 3     7.691    7.832
 4     7.855    7.942
 5     8.016    7.959
 6     8.593    8.650
 7     8.906    8.916
 8     9.163    9.103
 9     9.679    9.757
10     9.837    9.927

Figure 5.3 Geometry of ridged cavity. (Dimension labels in the figure: 1.0 cm, 0.5 cm, 0.4 cm, 0.4 cm.)

less accurate [30]. Therefore, for the partially filled rectangular cavity, the absence of degenerate modes gives results which are accurate to within one percent of the exact eigensolutions. As expected, the solution yields a number of eigenvalues equal to the number of degrees of freedom (unknowns). Of these, there is an inherent presence of zero eigenvalues, the number of which equals the number of internal nodes. These eigenvalues correspond to the dimension of the nullspace of the curl-curl operator and were explained in an earlier section. The zero eigenvalues are easily identifiable, and because they do not correspond to physical modes, they are always discarded.
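
A minimal sketch of this post-processing step is given below. It assumes the eigensolver returns the values $k_0^2$ and that the null-space eigenvalues appear as numbers that are tiny relative to the largest one; the function name, the relative tolerance, and the sample data are hypothetical.

```python
import math


def discard_null_modes(k_squared, rel_tol=1e-6):
    """Drop numerically zero eigenvalues and return sorted wavenumbers k0."""
    scale = max(abs(value) for value in k_squared)
    physical = [value for value in k_squared if value > rel_tol * scale]
    return sorted(math.sqrt(value) for value in physical)


# Made-up solver output: three null-space values and two physical modes.
raw = [1e-13, -3e-14, 5.236 ** 2, 7.025 ** 2, 2e-12]
modes = discard_null_modes(raw)
print(len(raw) - len(modes), "null-space eigenvalues removed")
print([round(k, 3) for k in modes])
```

In practice the count of removed eigenvalues can be checked against the number of internal nodes of the mesh, as noted above.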

5.6.2 Circuit Applications

In this section, we present two examples of using the closed domain formulation for modeling circuit and packaging transitions in microwave circuits. In Fig. 5.4, a 50 Ω coplanar waveguide (CPW) to a 50 Ω microstrip line transition is modeled in a packaged environment. The transition uses both surfaces of the GaAs substrate to print the microstrip and CPW metallizations. This is done to achieve better packing density in high frequency design. To reduce cross-talk, one of the substrate surfaces hosts CPW components and the other surface contains microstrip elements.

Figure 5.4 Geometry of two-layer microstrip-CPW transition on GaAs substrate with a rectangular via hole. Panels: top view; side view (cut along the center). Labels: CPW on top surface; microstrip on bottom surface; CPW/MS ground; rectangular via. Dimensions are as follows: $W_t$ = 50 μm; $W_c$ = [illegible] = 75 μm; $W_v$ = 200 μm; $W$ = 1000 μm; $H_1$ = 100 μm; $H_3$ = [illegible] = 400 μm. [After Yook et al. [31], © IEEE.]

This configuration serves two purposes: first, the CPW ground serves as the microstrip ground and second, the CPW aperture and the top conductor in the microstrip run in parallel with the separation to reduce cross-talk. Without the rectangular via hole, the transition geometry has a significant cross-talk of about -15 dB at 20 GHz. However, as the frequency increases, the transmission coefficient ($S_{21}$) increases to about -4 dB due to radiation effects from the open ends of the microstrip and the CPW. On drilling a via hole connecting the CPW and the microstrip, the crosstalk levels stay below 20 dB for a wider frequency range (from 20 GHz to 50 GHz), as shown in Fig. 5.5. This is just one of a wide range of applications where a full-wave three-dimensional solution can be used to meet design specifications for critical parts of a complicated circuit.
Another type of common inter-chip feed-through is the hermetic bead transition. In Fig. 5.6, a coax-to-microstrip transition is modeled by approximating the circular coax cross section with a rectangular stripline configuration. Dimensions are chosen such that the low-frequency stripline impedance is approximately 50 Ω. The dielectric filling of the bead has the same permittivity ($\epsilon_r = 10.8$) as the substrate. The dielectric bead is hermetically enclosed within a PEC wall of thickness 1.5 mm and the air gap spacing is chosen to be 0.4 mm. Predictably, the air gap thickness governs the insertion loss at lower frequencies, with the loss increasing as the gap spacing is enlarged. At higher frequencies, impedance mismatch will further degrade interconnect performance.
The reflection and insertion losses for the hermetic transition were computed by Yook et al. [32] using the finite element method with first order edge-based

Figure 5.5 Scattering parameters of the CPW to microstrip transition with rectangular via connection, plotted versus frequency [GHz]. Symbols are reference data computed using the Finite Difference-Time Domain (FDTD) method. [After Yook et al. [31], © IEEE.]

Figure 5.6 Hermetic bead transition in electronic packaging (top view and cross-sectional view A-A'). Dimensions are as follows: $W_1$ = 0.55 mm; $W_2$ = 4 mm; $W_4$ = 2.225 mm; $\epsilon_r$ = 10.8; further values whose labels are illegible: 0.21 mm, 1.27 mm, 5 mm, 0.635 mm, 1.5 mm, 0.4 mm. [After Yook et al. [32], © IEEE.]

tetrahedral elements. The results are given in Fig. 5.7 and as expected the insertion
loss increases from close to 0 dB at 10 GHz to nearly -3 dB at 25 GHz. To lower the
insertion loss, the air gap spacing can be decreased and the geometrical parameters
can be optimized to reduce the impedance mismatch at higher frequencies.
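
To put the quoted decibel levels in perspective, the small sketch below converts between a scattering-parameter magnitude, its level in dB, and the corresponding fraction of transmitted power. The helper names are ours; only the standard definitions $20\log_{10}|S|$ and $10^{\mathrm{dB}/10}$ are used.

```python
import math


def magnitude_to_db(s_magnitude):
    """Level in dB of a scattering-parameter magnitude |S|."""
    return 20.0 * math.log10(s_magnitude)


def db_to_power_fraction(level_db):
    """Fraction of incident power corresponding to a level in dB."""
    return 10.0 ** (level_db / 10.0)


# An insertion loss of about -3 dB means roughly half the power gets through.
print(round(db_to_power_fraction(-3.0), 3))
print(round(magnitude_to_db(1.0 / math.sqrt(2.0)), 2))
```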

Figure 5.7 Scattering parameters of the hermetic bead transition shown in Fig. 5.6, plotted versus frequency [GHz] (legend: $S_{11}$ and $S_{21}$, FEM and FDTD). [After Yook et al. [32], © IEEE.]

APPENDIX: EDGE-BASED RIGHT TRIANGULAR PRISMS

In this appendix we present the matrix entries needed for finite element analysis using right triangular prisms (see Chapter 2 for their basis functions). Figure 5.8 shows a right triangular prism as an edge-based vector finite element [33], [34]. The top and bottom surfaces are identical and parallel to each other, while the vertical arms are perpendicular to the base of the prism. The vector electric field inside the element is

Figure 5.8 Right prism with edge-based unknowns: (a) perspective view; (b) top view ($d_i$: side lengths). [Courtesy of T. Özdemir.]

an interpolation among the nine vector unknowns, each parallel to and constant along a particular edge of the prism.
The prism is specified by its height, the lengths of, and the angle between, two sides of its triangular surface, namely, $c$, $d_2$, $d_3$, and $\alpha_1$, respectively. First, we need to compute some scalar and vector quantities that will be used later in computing the matrix elements. These are illustrated in Fig. 5.8 and given by

E.50)

$$h_1 = d_2 \sin\alpha_3 \tag{5.53}$$
$$h_2 = d_3 \sin\alpha_1 \tag{5.54}$$
$$h_3 = d_1 \sin\alpha_2 \tag{5.55}$$
= -I
hi E.56)
u2
= cosarjl - sina3^ E.57)
=
cosofjl
\320\270\320\267 + sinai'? E.58)
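
The scalar quantities entering (5.53)-(5.55) can be generated from the three specified triangle parameters as sketched below. Two assumptions are made here and are not taken from the text: side $d_i$ is taken to lie opposite the vertex with interior angle $\alpha_i$ (which makes $h_1$, $h_2$, $h_3$ the three altitudes of the triangle), and the third side and remaining angles are obtained from the law of cosines. The function name is hypothetical.

```python
import math


def prism_triangle_geometry(d2, d3, alpha1):
    """Return (d1, alpha2, alpha3, h1, h2, h3) for the triangular face.

    d2, d3 : lengths of the two specified sides
    alpha1 : angle between them, in radians
    Assumes side d_i is opposite the vertex with angle alpha_i.
    """
    d1 = math.sqrt(d2 ** 2 + d3 ** 2 - 2.0 * d2 * d3 * math.cos(alpha1))
    alpha2 = math.acos((d1 ** 2 + d3 ** 2 - d2 ** 2) / (2.0 * d1 * d3))
    alpha3 = math.pi - alpha1 - alpha2
    h1 = d2 * math.sin(alpha3)   # (5.53)
    h2 = d3 * math.sin(alpha1)   # (5.54)
    h3 = d1 * math.sin(alpha2)   # (5.55)
    return d1, alpha2, alpha3, h1, h2, h3


# Right triangle with legs 4 and 3: the altitudes follow from the area.
d1, alpha2, alpha3, h1, h2, h3 = prism_triangle_geometry(4.0, 3.0, math.pi / 2)
print(round(d1, 6), round(h1, 6), round(h2, 6), round(h3, 6))
```

A quick consistency check is that $d_i h_i$ is the same for all three sides, since each product equals twice the area of the triangle.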

The edge-based basis functions in their final forms are given by (see also Chapter 2)

E-59)

E.60)

E.61)

Z-^-L^Jd-.- E.62)

$
= c/2
- \320\246 A
-
z/c) E.63)
(^Z.,1 Ji ^

M3' = dy z/c) E.64)


(L\\j\302\261-Lifyo-

$$\mathbf{K}_1 = \hat{z} L_1 \tag{5.65}$$
$$\mathbf{K}_2 = \hat{z} L_2 \tag{5.66}$$
$$\mathbf{K}_3 = \hat{z} L_3 \tag{5.67}$$

where $L_i$ ($i$ = 1, 2, 3) are the usual nodal shape functions for a triangle in the ($\xi$, $\eta$) coordinate frame as defined in Fig. 5.8(b).


The relevant quantities for constructing the element matrix for the prism are

EWWCU = f (V x (V x WJ) dV
W?) \342\200\242
f [
cos ft,
_\342\200\224
d,dt /cos/?b , cosftm
\"\320\263
\320\233\320\220\320\257 Kkll
cosfty,,,
I I L I, \320\220\320\272\321\202
, I I \"l I, ;. Kjn
t V \320\234\320\270 \320\251\"\321\202 \"\320\272\"\321\202 \\"\320\277

sin
+ i /, \302\253n
\321\214\320\247ft* ft,,,,) E.\320\231)

(VxWl')-(VxMj)(/K

44 /~ cos ft,, \342\200\236 cosftm , cosfty,,, cosftn


\320\245\321\214 X/\"
~~^ ~\320\234\320\223\320\245-\"\"
\320\220\320\233, Mm *\320\233

(\320\250
+TATX7Tsin^sinA\"n)
3 \320\233\321\203\320\230\320\220\320\233,\342\200\236\302\253\342\200\236
/

?^A'Q = f f [ (V x WO \342\200\242
(V x K?) t/K

E.7Di

=
\302\246\342\200\236 (vxM;v(VxM';)rf^
[[[
= ?^^C,Y E.7!l

= (vxm;v(VxK?)^
\\\\\\
= -EWKCU E.7:

= Kf)
\320\265\320\272\320\272\321\201\342\200\236 (v x (v x
\342\200\242
K)dv
[ [J
cos ft,
s
2 h,h,
\342\200\242
dv
w^ w;

d,d, (emPh, ., cos^' .. cos ft,,,


ft,,, cosftB N
= c ,

\342\200\242
dv
w;- ivr;

= i EWWDit E.75i

= Wf-KJrfK
EWKDlt
JJJ
= 0 E.76)
EMMDie=\\\\\\ M',-M',(IV
J J J i\"
= EWWDlf E.77)

=
?\320\233\320\250>\342\200\236 Mf.K?rfK
MI-K'idF
jj|J
J .1 rr
=0 E.78)

j j j \321\203

= cx\302\253 E.79)

where V is the volume of the prism. Also

ifr = *
@
[ ar + or, otherwise

Air/, f /?i
=
\320\245\342\200\236 +
\320\270>\320\270*, [(colofj -cota2)(^nv + i]rws) + 2(&wr + $rws)]
-\321\203- | j

+ \342\200\224
12
[3(cota3
\342\200\224
cot^K^vtr ... + 1r%s) + 2rjrnt(cot2 o--.\342\200\224
cotor2cota3 + cot2aj)

in which r,.? = 1, 2. 3 and are


ff, \302\273?r
\320\270\320\263, given by

w, = 1,
&=-?-. 4i=0
cosa3

In all the above formulae the indices $i$, $j$, $k$ and $l$, $m$, $n$ follow the cyclic rule given by the following table:

i   j   k   l   m   n

1   2   3   1   2   3
2   3   1   2   3   1
3   1   2   3   1   2
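
In an implementation, the cyclic rule of the table is usually generated rather than stored. A minimal sketch, with a function name of our choosing:

```python
def cyclic_triple(first):
    """Return (i, j, k) for i in {1, 2, 3}, following the cyclic rule."""
    second = first % 3 + 1
    third = second % 3 + 1
    return first, second, third


# Rows of the table; the same rule applies to the triple (l, m, n).
for i in (1, 2, 3):
    print(cyclic_triple(i))
```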

As a test, we compute the eigenvalues of the rectangular cavity shown in Fig. 5.9. The cavity was first discretized using bricks, and the prisms were then formed by slicing each brick diagonally. Note that the bricks at two corners were sliced along different diagonals from the bricks at the other two corners.

Figure 5.9 Eigenvalues of a rectangular cavity: (a) discretization of the rectangular cavity (1 cm x 0.5 cm x 0.75 cm, actual mesh); (b) comparison of eigenvalues using bricks and tetrahedrals. [After Özdemir and Volakis, © IEEE, 1997.]

         k, cm^-1     % Error
Mode     (Exact)      Prism    Brick    Tetra

TE101    5.236        0.73     -1.36     0.44
TM110    7.025        2.32     -2.23     0.70
TE011    7.531        0.53     -2.58     1.00
TE201    7.531        0.64     -3.13    -0.56
TM111    8.179        0.22     -2.09     2.29

The percent error for triangular prisms along with the results for bricks and tetrahedrals [29] is given in the table next to the cavity mesh in Fig. 5.9 and should be compared with Table 5.1. The number of segments along the x-, y-, and z-directed edges were seven, four, and five, respectively, for both the triangular prism and the brick discretizations, resulting in 382 edge unknowns in the triangular prism case and 270 edge unknowns in the brick case. The tetrahedral discretization, on the other hand, resulted in 260 unknowns. As seen, the performance of the triangular prisms is comparable to that of bricks and tetrahedrals.
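
The quoted unknown counts can be checked by counting edges that do not lie on the metallic walls, since tangential edges on a perfect conductor carry no unknown. The sketch below does this for a structured mesh; it assumes that the prism axis is along $z$, so that each brick contributes one diagonal edge per interior $z$-level. That orientation is our assumption, chosen because it reproduces the stated totals. The function names are hypothetical.

```python
def brick_edge_unknowns(nx, ny, nz):
    """Interior edges of an nx x ny x nz brick mesh inside a PEC box."""
    x_edges = nx * (ny - 1) * (nz - 1)
    y_edges = ny * (nx - 1) * (nz - 1)
    z_edges = nz * (nx - 1) * (ny - 1)
    return x_edges + y_edges + z_edges


def prism_edge_unknowns(nx, ny, nz):
    """Brick edges plus one diagonal per brick on each interior z-level."""
    diagonals = nx * ny * (nz - 1)
    return brick_edge_unknowns(nx, ny, nz) + diagonals


# Seven, four, and five segments along x, y, and z.
print(brick_edge_unknowns(7, 4, 5), prism_edge_unknowns(7, 4, 5))
```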
As a second test, we consider the eigenvalues of a cylindrical-circular cavity with metal walls, as shown in Fig. 5.10. The table accompanying the figure shows the percentage error in calculating the first five eigenvalues. Note that the prism modeling is quite good given that the discretization results in only four edges along the radius as well as the axis of the cylinder. This example shows the advantage of the triangular elements over rectangular ones in being able to model cavities with arbitrary cross section.
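
For reference, the exact wavenumbers of a circular cylindrical cavity with conducting walls follow from the zeros of the Bessel functions (TM modes) or of their derivatives (TE modes). The sketch below hard-codes the four zeros needed for the first five modes of a cavity of 1 cm radius and height; the dictionary and function names are ours, and the zeros are standard tabulated values rounded to four decimals.

```python
import math

# First zero of J_n (TM) or of J_n' (TE), keyed by (mode type, n).
BESSEL_ROOTS = {
    ("TM", 0): 2.4048,
    ("TM", 1): 3.8317,
    ("TE", 1): 1.8412,
    ("TE", 2): 3.0542,
}


def cylinder_wavenumber(kind, n, p, radius=1.0, height=1.0):
    """Wavenumber (cm^-1) of mode (n, 1, p) in a PEC circular cavity."""
    chi = BESSEL_ROOTS[(kind, n)]
    return math.hypot(chi / radius, p * math.pi / height)


for label, args in [("TM010", ("TM", 0, 0)), ("TE111", ("TE", 1, 1)),
                    ("TM110", ("TM", 1, 0)), ("TM011", ("TM", 0, 1)),
                    ("TE211", ("TE", 2, 1))]:
    print(label, round(cylinder_wavenumber(*args), 3))
```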

Figure 5.10 Accuracy of triangular prisms in calculating the eigenvalues of a cylindrical-circular cavity of 1 cm radius and height: (a) actual mesh; (b) computed error. [Courtesy of T. Özdemir.]

         k, cm^-1     % Error
Mode     (Exact)      (Computed)

TM010    2.405         1.59
TE111    3.640         2.17
TM110    3.830        -2.90
TM011    3.955         0.81
TE211    4.380        -8.97

REFERENCES

[1] P. P. Silvester and R. L. Ferrari. Finite Elements for Electrical Engineers. Cambridge Univ. Press, second edition, 1990.
[2] A. F. Peterson and S. P. Castillo. A frequency-domain differential equation formulation for electromagnetic scattering from inhomogeneous cylinders. IEEE Trans. Antennas Propagat., 37(5):601-607, May 1989.
[3] R. Mittra and O. Ramahi. Absorbing boundary conditions for the direct solution of partial differential equations arising in electromagnetic scattering problems. In M. A. Morgan, editor, Finite Element and Finite Difference Method in Electromagnetic Scattering, Chapter 4. Elsevier, New York, 1990.
[4] Z. J. Cendes and P. Silvester. Numerical solution of dielectric loaded waveguides: I Finite element analysis. IEEE Trans. Microwave Theory Tech., 18:1124-1131, 1970.
[5] H. Whitney. Geometric Integration Theory. Princeton Univ. Press, NJ, 1957.
[6] J. C. Nedelec. Mixed finite elements in R^3. Numer. Math., 35:315-341, 1980.
[7] A. Bossavit and J. C. Verite. A mixed FEM-BIEM method to solve 3D eddy current problems. IEEE Trans. Magnetics, 18:431-435, March 1982.
[8] M. Hano. Finite element analysis of dielectric-loaded waveguides. IEEE Trans. Microwave Theory Tech., 32:1275-1279, October 1984.
[9] G. Mur and A. T. de Hoop. A finite element method for computing three-dimensional electromagnetic fields in inhomogeneous media. IEEE Trans. Magnetics, 21:2188-2191, November 1985.
[10] J. S. van Welij. Calculation of eddy currents in terms of H on hexahedra. IEEE Trans. Magnetics, 21:2239-2241, November 1985.
[11] M. L. Barton and Z. J. Cendes. New vector finite elements for three-dimensional magnetic field computation. J. Appl. Phys., 61(8):3919-3921, April 1987.
[12] T. B. A. Senior. Combined resistive and conductive sheets. IEEE Trans. Antennas Propagat., 33:577-579, 1985.
[13] J. L. Volakis, A. Chatterjee, and L. Kempel. Review of the finite element method for three-dimensional electromagnetic scattering. J. Opt. Soc. Am. A, 11(4):1422-1432, April 1994.
[14] R. F. Harrington. Time Harmonic Electromagnetic Fields. McGraw-Hill, New York, 1987.
[15] R. Dyczij-Edlinger and O. Biro. A joint vector and scalar potential formulation for driven high frequency problems using hybrid edge and nodal finite elements. IEEE Trans. Microwave Theory Tech., 44(1):15-23, January 1996.
[16] S. H. Wong and Z. J. Cendes. Combined finite element-modal solution of three-dimensional eddy current problems. IEEE Trans. Magnetics, 24(6), November 1988.
[17] O. C. Zienkiewicz. The Finite Element Method. McGraw-Hill, New York, third edition, 1979.
[18] B. M. A. Rahman and J. B. Davies. Penalty function improvement of waveguide solution by finite elements. IEEE Trans. Microwave Theory Tech., 32:922-928, August 1984.

[19] J. P. Webb. Finite element analysis of dispersion in waveguides with sharp metal edges. IEEE Trans. Microwave Theory Tech., 36(12):1819-1824, December 1988.
[20] A. Bossavit. Solving Maxwell's equations in a closed cavity, and the question of spurious modes. IEEE Trans. Magnetics, 26(2):702-705, March 1990.
[21] J. P. Webb. Edge elements and what they can do for you. IEEE Trans. Magnetics, 29:1460-1465, 1993.
[22] B. R. Crain and A. F. Peterson. Analysis of propagation on open microstrip lines using mixed-order covariant projection vector finite elements. Int. J. Microwave and Millimeter Wave CAE, 5(2):59-67, March 1995.
[23] Jin-Fa Lee and R. Mittra. A note on the application of edge elements for modeling three-dimensional inhomogeneously filled cavities. IEEE Trans. Microwave Theory Tech., 40:1767-1773, 1992.
[24] J. M. Jin and J. L. Volakis. Electromagnetic scattering by and transmission through a three-dimensional slot in a thick conducting plane. IEEE Trans. Antennas Propagat., 39(4):543-550, April 1991.
[25] J. L. Volakis, J. Gong, and A. Alexanian. A finite element boundary integral method for antenna RCS analysis. Electromagnetics, 14(1):63-85, 1994.
[26] X. Yuan. Three-dimensional electromagnetic scattering from inhomogeneous objects by the hybrid moment and finite element method. IEEE Trans. Antennas Propagat., 38:1053-1058, 1990.
[27] Z. J. Cendes and J. F. Lee. The transfinite method for modeling MMIC devices. IEEE Trans. Microwave Theory Tech., 36:1639-1649, 1988.
[28] J. F. Lee. Analysis of passive microwave devices by using three-dimensional tangential vector finite elements. Int. J. Num. Modeling, 3:235-246, 1990.
[29] A. Chatterjee, J. M. Jin, and J. L. Volakis. Computation of cavity resonances using edge-based finite elements. IEEE Trans. Microwave Theory Tech., 40(11):2106-2108, November 1992.
[30] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ. Press, Baltimore, MD, 1983.
[31] J. G. Yook, N. I. Dib, and L. P. B. Katehi. Characterization of high frequency interconnects using finite difference time domain and finite element methods. IEEE Trans. Microwave Theory Tech., 42(9):1727-1736, September 1994.
[32] J. G. Yook, N. I. Dib, E. Yasan, and L. P. B. Katehi. A study of hermetic transitions for microwave packages. In IEEE MTT-S Int. Microwave Symp., pages 1579-1582, 1995.
[33] T. Ozdemir and J. L. Volakis. Triangular prisms for edge-based vector finite element antenna analysis. IEEE Trans. Antennas Propagat., pages 788-797, May 1997.
[34] Z. S. Sacks and J. F. Lee. A finite element time domain method using prism elements for microwave cavities. IEEE Trans. Electromagnetic Compatibility, November 1995.
Chapter 6

Three-Dimensional Problems: Radiation and Scattering

6.1 INTRODUCTION

In the previous chapter, we outlined methods to deal with packaged three-dimensional (3D) structures in electromagnetics. However, the solution of packaged or bounded structures is only one of the problems that finite elements can be applied to. Open problems, like radiation and scattering, present a unique challenge to finite domain methods. Since the mesh of the computational domain cannot be extended to infinity, boundary conditions must be applied to simulate the effect of the infinite domain. This was discussed in detail in Chapter 4 for two-dimensional (2D) geometries. In this chapter, we concentrate on solving the radiation and scattering problem for 3D structures using vector absorbing boundary conditions (ABCs).
As described in Chapter 4, absorbing boundary conditions were derived for finite element solutions of 2D scattering problems. However, the method's implementation and performance for scattering by three-dimensional geometries has become acceptable only recently [1]. The 3D implementations of the FEM for radiation and scattering have been limited primarily to a hybrid solution using the boundary integral (BI) technique [2], [3], [4] and those incorporating ABCs [1], [5], [6], [7]. The finite element-boundary integral (FE-BI) method will be discussed in the next chapter. This chapter focuses on ABCs and artificial absorbers to truncate the finite element mesh for applications to radiation and scattering.


The motivation for using ABCs comes from the localized nature of their effect, which preserves the O(N) storage advantage of the finite element method. This feature permits scalability to large 3D problems. The boundary integral is equivalent to employing a global boundary condition for terminating the mesh and consequently leads to a full submatrix, restricting the method's utility to small or regular geometries. Recently, artificial absorbers [8] and coordinate stretching methods [9] have been used to absorb the outgoing waves and minimize non-physical reflections back into the computational domain. These techniques maintain sparsity but may worsen convergence properties for iterative equation solvers. Even for direct solvers, the ill conditioning of the matrix could lead to unstable or incorrect solutions.
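
The storage argument can be made concrete with a back-of-the-envelope estimate. The sketch below counts stored matrix entries for a mesh terminated by a local ABC and for one terminated by a boundary integral, which adds a full block over the boundary unknowns. The function names, the assumed average of 20 nonzeros per row, and the problem sizes are purely illustrative and are not taken from the text.

```python
def abc_storage(n_unknowns, nnz_per_row=20):
    """Stored entries with a local ABC: the matrix stays sparse, O(N)."""
    return n_unknowns * nnz_per_row


def fe_bi_storage(n_unknowns, n_boundary, nnz_per_row=20):
    """Stored entries with a boundary integral: sparse part plus a full block."""
    return n_unknowns * nnz_per_row + n_boundary ** 2


# Hypothetical problem: 100,000 unknowns, 10,000 of them on the boundary.
sparse_only = abc_storage(100_000)
with_dense_block = fe_bi_storage(100_000, 10_000)
print(sparse_only, with_dense_block, with_dense_block / sparse_only)
```

Even for this modest size, the dense boundary block dominates the storage, which is the scalability limitation referred to above.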
In the first part of this chapter, we present a survey of the more popular vector ABCs and artificial absorbers. Detailed derivations for the ABCs along with recent advances made in understanding their behavior are included. In the following sections, we formulate the open domain problem in terms of the linear functional and incorporate the ABC into the finite element system. The scattered and total field functionals are then presented. In the last section of the chapter, we include examples of various applications solved using finite elements and absorbing boundary conditions or artificial absorbers.

6.2 SURVEY OF VECTOR ABCs

The motivation for applying ABCs to simulate open domain problems was discussed in detail in Chapter 4. In three dimensions, the advantages of locality and subsequent scalability are even clearer. All 3D finite element formulations rely on a vector representation of the underlying variable to maintain generality over a wide class of problems. ABCs for 3D problems must, therefore, be expressed in terms of vectors. Three-dimensional ABCs usually start with the Wilcox representation [10] of the radiating field variable, which is the electric or the magnetic field. The tangential component of the curl of the radiating function is then expressed in terms of the tangential components of the function and its higher-order derivatives on the ABC surface. The order of the derivative usually determines the order of the ABC and the degree of its accuracy. The higher the order, the better the accuracy of the boundary condition. Of course, this accuracy is achieved at the expense of increased complexity in modeling the boundary condition. In the next section, we outline the derivation of the ABCs for vector formulations.

6.2.1 Three-Dimensional Vector ABCs

Consider the scattering volume $V_d$ enclosed by a fictitious surface $S_0$, as shown in Fig. 6.1. We shall assume that the immediate surrounding of the surface $S_0$ is free space and the fields near $S_0$ are thus governed by the vector wave equation

$$\nabla \times \nabla \times \mathbf{E} - k_0^2\, \mathbf{E} = 0 \tag{6.1}$$

where $k_0$ is the free-space wave number. We also assume that the field has a well-defined phase front in the region under consideration. This is a valid assumption when the ABC boundary is placed far enough from the radiator or the scatterer. Since we are concerned only with local behavior, we can assume that the phase fronts can be treated as parallel regions. Consequently, the surface describing the phase fronts can be specified by a net of coordinate curves orthogonal to each other, denoted by $t_1$ and $t_2$, and a third variable $n$ denotes the coordinate along the normal to the phase front (see Fig. 6.2).
The point of observation in the Dupin coordinate system [11] can now be defined as

Figure 6.1 Illustration of scattering structure $V_d$ enclosed by an artificial mesh termination surface, $S_0$, on which the absorbing boundary condition is imposed.

Figure 6.2 Dupin coordinate system and related parameters.

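$$\mathbf{x} = n\hat{n} + \mathbf{x}_0(t_1, t_2) \tag{6.2}$$

where $\hat{n}$ is the unit normal and $\mathbf{x}_0(t_1, t_2)$ denotes the surface of the reference phase front. The curl of a vector in the above coordinate system is given by

$$\nabla \times \mathbf{E} = \nabla_t \times \mathbf{E} + \hat{n} \times \frac{\partial \mathbf{E}}{\partial n} \tag{6.3}$$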

where $\nabla_t \times \mathbf{E}$ is called the surface curl involving only the tangential derivatives and is defined in [12], [13] as

$$\nabla_t \times \mathbf{E} = -\hat{n} \times \nabla E_n + \hat{t}_2 \kappa_1 E_{t_1} - \hat{t}_1 \kappa_2 E_{t_2} + \hat{n}\, \nabla \cdot (\mathbf{E} \times \hat{n}) \tag{6.4}$$

In (6.4), $\kappa_1$ and $\kappa_2$ denote the principal curvatures of the surface under consideration, $E_{t_1}$, $E_{t_2}$ are the tangential components, and $E_n$ is the normal component of the electric field on the surface. The principal curvatures are associated with the principal directions $\hat{t}_1$, $\hat{t}_2$ of a surface and are given by [11]
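$$\kappa_1 = \frac{1}{R_1} = \frac{1}{h_1} \frac{\partial h_1}{\partial n} \tag{6.5}$$

$$\kappa_2 = \frac{1}{R_2} = \frac{1}{h_2} \frac{\partial h_2}{\partial n} \tag{6.6}$$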

where $h_1$, $h_2$ are the metric coefficients and $R_1$, $R_2$ are the principal radii of curvature. Using the aforementioned coordinates, the Wilcox expansion for a vector radiating function can now be generalized to read

where $R_i = \rho_i + n$, $i = 1, 2$ and $\rho_i$ is the principal radius of curvature associated with the outgoing wavefront at the target. The lowest-order term in (6.7) represents the geometrical optics spread factor for a doubly curved wavefront and reduces to the standard Wilcox expansion [10] for a spherical wave. Moreover, (6.7) can be differentiated term by term any number of times and the resulting series converges absolutely and uniformly [10].

6.2.1.1 Unsymmetric ABCs. In the 3D finite element implementation using vector basis functions and the electric field as the working variable, we need to relate the tangential component of the magnetic field to the electric field at any surface discontinuity. Therefore, our next task is to derive a relation between $\hat{n} \times \nabla \times \mathbf{E}$ (i.e., $\hat{n} \times \mathbf{H}$ where $\mathbf{H}$ is the magnetic field) and the tangential components of the electric field on the surface. Taking the curl of the electric field expansion given by (6.7) and crossing it with the normal vector, we have

^pt
Ayr
\320\247\320\233
p=Q

where = \321\203/\320\251\320\251
\320\270 and

Considering that $E_{0n}$ is zero due to the divergenceless condition [10] and simplifying, we obtain the first-order absorbing boundary condition

(\320\243,

orJxVxE- (Jka +
-
~K) E,
\320\272,\342\200\236
= 0+ 0(n~3) F.101

for a conformal outer boundary. An example of a geometry enclosed in a conformal ABC boundary is shown in Fig. 6.3.
The first-order conformal ABC derived in (6.10) is identical to the impedance boundary condition for curved surfaces as derived by Rylov [14]. It should be noted that in the above equation, $\nabla_t E_n$ and $\kappa_m$ are each proportional to $n^{-1}$. Therefore, the leading order behavior of (6.10) is $O(n^{-3})$, i.e., only the first two terms of (6.7) are exactly satisfied by (6.10). If the scattered field contains higher order terms, application of (6.10) will give rise to nonphysical reflections back into the computational domain. To reduce these spurious reflections, we need to either shift the mesh truncation boundary farther away from the scatterer or employ higher order boundary conditions which satisfy higher order terms of (6.7).

Figure 6.3 Composite structure enclosed in a conformal mesh termination boundary. The geometry of the mesh termination allows smaller problem sizes than with spherical terminations.
components of the curl of F.9). This yields

nx V x [n x V x E - (/7@ + -
~K-)
*-,\342\200\236 E,]

\302\246
\320\225
\321\200\320\272,\342\200\236\320\232
\342\200\242\321\200>
F.11)
\320\270\"+|

where = *jk-2
\320\272\320\272
is the Gaussian curvature. Using the result derived in F.9) and
simplifying F.11) reduces lo

nx V x [ft x V x E - {jl<o + -
\320\232)
\320\272\321\202 \320\225,]

\\

F.12)

If we take a closer look at the term in the square brackets on the RHS of (6.12), we find that it can be written as

- - - -
+ + 3*7,,
(\320\224\320\276 A
\320\232 V \321\205
\321\205 \320\225 (jk{) + \320\272\321\202
*^- {w \320\233\"-)
\320\225(}

where we have substituted

=\" x E -
7,?\321\200\342\200\236+\321\200*-)\342\200\236\320\225/\342\200\236
F.13)
L -\320\232-)\320\225,

using the relation derived in F.9).



Now the dominant terms on the RHS of (6.12) can be eliminated by considering the higher order operator

x
\320\223n V x -(jkjk0 + \320\252\320\272\321\202 x V x E - (jk0+-\302\253\302\246\342\200\236,
-f -)E,)
-^--^-J |(\302\253

The residual of (6.14) can be reduced further to yield the absorbing boundary condition of second order which satisfies (6.7) to $O(n^{-5})$, where $n$ is the normal distance from the object surface to the phase front. This second-order ABC is found to be

x V x -(jkn + -
4\320\272,\342\200\236 T\302\246 x V x E - (/-,, + \320\272\321\202
- f
\342\200\242)
E,}
[\320\271 -^- jl\\n

= 0 F.151
\\
^),En
Km /

and the residual is equal to

The operator on the LHS of (6.15) can be applied repeatedly to obtain ABCs of
increasing order; however, higher order basis functions are needed for their
implementation.
After some algebraic manipulation, the terms on the LHS of (6.15) reduce to
simpler ones. In addition to the wave equation, the following vector identities were
utilized to carry out the simplifications and are provided below for the reader's
convenience:

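$$\hat{n} \times \nabla \times \mathbf{E}_t = \hat{n} \times \nabla \times \mathbf{E} - \nabla_t E_n$$

$$\hat{n} \times \nabla \times \nabla_t E_n = \nabla_t(\nabla \cdot \mathbf{E}_t) + 2\kappa_m \nabla_t E_n$$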

nxVx(wxVxE) = Vx x -
kfe,
- {(V x
\320\220\320\272 + (V x
{\302\253(V \320\225)\342\200\236| \320\225)\342\200\236/, E),,/,]
where $\Delta\kappa = \kappa_1 - \kappa_2$. The derivation of these identities is given in an appendix to this
chapter. Upon simplification, the second-order ABC can be compactly written as

- (D - x V x E
2\320\272\342\200\236,)\320\277+ {4^, - Kg + D(jkQ - \320\251+
\342\200\242
~K \342\200\242
\320\226 + \320\272,\342\200\236
\320\224*\321\201|}
\320\225,

+ V x {/1(V x E)J + (flc + - ^L - 2f


\320\252\320\272\321\202 V,?n = 0 F.17)
\302\246)

in which

and

F.11A
Section 6.2 \302\246
Survey of Vector ABCs 189

The second-order ABC derived in [15] is recovered by setting $\kappa_1 = \kappa_2 = 1/r$.

6.2.1.2 Symmetric Approximation. It has been shown by Peterson [15] that
the LHS of (6.17), when incorporated into the finite element equations, gives rise to an
unsymmetric matrix system in spherical coordinates. To alleviate this problem,
Kanellopoulos and Webb [16] suggested an alternative derivation involving an
arbitrary parameter which would lead to a symmetric matrix while sacrificing some
accuracy. Below, we discuss a different approach which leads to a symmetric ABC
without the introduction of an arbitrary parameter.
On considering the series expansion of the term $\hat{n} \times \nabla \times \nabla_t E_n$, we have

\321\202\321\201
l -iknn
\321\217V x
\321\205
V,En =
4* ?,=.

(p-\\)Km-L-L

$$= jk_0 \nabla_t E_n + 2\kappa_m \nabla_t E_n + O(n^{-5})$$


and on making use of the vector identity

$$\nabla_t(\nabla \cdot \mathbf{E}_t) = \hat{n} \times \nabla \times \nabla_t E_n - 2\kappa_m \nabla_t E_n$$

given earlier, we arrive at the following result

$$\nabla_t(\nabla \cdot \mathbf{E}_t) = jk_0 \nabla_t E_n + O(n^{-5}) \qquad (6.19)$$

Since the ABC (6.17) was derived to have a residual error of $O(n^{-5})$, we can replace
$jk_0 \nabla_t E_n$ with $\nabla_t(\nabla \cdot \mathbf{E}_t)$ without affecting the order of the approximation. Doing so,
the second-order ABC with a symmetric operator can be rewritten as

(D
- 2icm)n xVxEx

+ V x + 2f =
\342\200\242 0 F.20)
xEU+1
{\302\253(V
(jk0 \320\227\320\272\321\202
-^-- V,(V
\342\200\242) E,)
JK0 \\ Km /
in which $\Delta\kappa = \kappa_1 - \kappa_2$. It can be easily shown that the above boundary condition
leads to a symmetric system of equations when incorporated into the finite element
functional for surfaces having $\kappa_1 = \kappa_2$. Equations (6.10) and (6.20) reduce to the
boundary conditions derived in [16] on setting $\kappa_1 = \kappa_2 = 1/r$, which have been
found to work well for spherical and flat boundaries [1]. Symmetry cannot be
guaranteed when $\kappa_1 \neq \kappa_2$, as explained in the next section.

6.2.1.3 Finite Element Implementation. The boundary condition outlined in
equation (6.20) cannot be incorporated into the finite element equations without
modification. As explained in Chapters 3 and 4, the absorbing boundary condition
is implemented in the finite element system through the surface integral over the
mesh termination surface $S_0$,

$$\int_{S_0} \mathbf{E} \cdot (\hat{n} \times \nabla \times \mathbf{E}) \, dS = \int_{S_0} \mathbf{E} \cdot P(\mathbf{E}) \, dS \qquad (6.21)$$

where P{E) denotes the boundary condition relating the tangential magnetic field to
the tangential electric field on the surface.

Let $P_1(\mathbf{E})$ denote the first-order absorbing boundary condition given by (6.10),
where the subscript represents the order of the ABC. Therefore, the surface integral
contribution for the first-order ABC reduces to

$$\int_{S_0} \mathbf{E} \cdot P_1(\mathbf{E}) \, dS = (jk_0 + \kappa_m) \int_{S_0} \mathbf{E} \cdot \mathbf{E}_t \, dS - \int_{S_0} \mathbf{E} \cdot (\bar{\kappa} \cdot \mathbf{E}_t) \, dS \qquad (6.22)$$

Using some basic vector identities and considering that $\mathbf{E}_t = -\hat{n} \times \hat{n} \times \mathbf{E}$, we deduce
that

f [ ?l\\ ?? F.23)

which is a readily implementable form of the first-order ABC. However, the
second-order ABC does not simplify as easily. If $P_2(\mathbf{E})$ denotes the second-order ABC given
by (6.20), we can rewrite it in more compact vector notation as

I-[Vx |/5(V x E),,}] + 7 \342\200\242 \342\200\242


{V,(V E,)} (Oi

where the tensors $\bar{\alpha}$, $\bar{\beta}$, and $\bar{\gamma}$ are given by

[4I + D (jk0 - K|) + \320\272\\


+ \320\272\321\202\320\220\320\272]
f, ?,
Kg
U jJC,

+ n - D (jk0- -
+ \302\253\"IKm*\302\253) hh
\320\2722) F.25)
m

= 0{'i
{\" + \320\250 'i + hh\\ F.261

jko(D-2Km)

Substituting the second-order absorbing boundary condition in the surface
integral given in (6.24), we have

EP2(E)dS = f E-(EE,)dS+ E- V x {w(V x E)n)\\dS


f [ ()\302\247\302\246

+ E-{f-Vt(V-E,)\\dS
f

Let us examine the integral $I_1$. Again, since $\mathbf{E}_t = -\hat{n} \times \hat{n} \times \mathbf{E}$, we have

$$I_1 = \int_{S_0} \left( \alpha_1 E_{t_1}^2 + \alpha_2 E_{t_2}^2 \right) dS \qquad (6.28)$$

after employing some simple vector identities.

The other two integrals ($I_2$ and $I_3$) do not reduce as easily to simple,
implementable forms. They are first simplified using basic vector and tensor identities, and
then the divergence theorem is employed to eliminate one of the terms. Considering
the integrand of the second integral $I_2$, we note that

where we have set $\phi = (\nabla \times \mathbf{E}) \cdot \hat{n} = (\nabla \times \mathbf{E})_n$. Using some additional vector
identities and letting $\bar{\beta} \cdot \mathbf{E} = \mathbf{F}$, we get

$$\mathbf{F} \cdot \nabla \times (\hat{n}\phi) = \nabla \cdot (\phi \, \hat{n} \times \mathbf{F}) + \phi \, (\hat{n} \cdot \nabla \times \mathbf{F})$$

Using the results from [11], the first term in the above identity can be further
simplified to read¹

$$\nabla \cdot (\phi \, \hat{n} \times \mathbf{F}) = \nabla_t \cdot (\phi \, \hat{n} \times \mathbf{F}) + \frac{\partial}{\partial n} \{ \hat{n} \cdot (\phi \, \hat{n} \times \mathbf{F}) \} - J \{ \hat{n} \cdot (\phi \, \hat{n} \times \mathbf{F}) \} = \nabla_t \cdot (\phi \, \hat{n} \times \mathbf{F}) \qquad (6.29)$$

where $\nabla_t$ denotes the surface gradient operator and $J = \kappa_1 + \kappa_2$. The integral $I_2$ can
now be written as

$$I_2 = \int_{S_0} \nabla_t \cdot (\phi \, \hat{n} \times \mathbf{F}) \, dS + \int_{S_0} \phi \, (\nabla \times \mathbf{F})_n \, dS$$

We next apply the surface divergence theorem to the first term on the RHS of this

expression to yield

$$\int_{S_0} \nabla_t \cdot (\phi \, \hat{n} \times \mathbf{F}) \, dS = \oint_C \phi \, \hat{m} \cdot (\hat{n} \times \mathbf{F}) \, dl = 0 \qquad (6.30)$$

since the surface $S_0$ is closed. We note that $\hat{m} = \hat{l} \times \hat{n}$ and $\hat{l}$ is the unit vector along
the edge of the surface element and $C$ denotes the contour of integration (see Fig.
1.1). On the basis of (6.30) and considering that $\beta$ is a simple scalar, $I_2$ reduces to

$$I_2 = \int_{S_0} \beta \, [(\nabla \times \mathbf{E})_n]^2 \, dS \qquad (6.31)$$

We now turn our attention to simplifying $I_3$ for implementation in the finite
element equations. Considering the integrand of $I_3$, we have

where $\psi = \nabla \cdot \mathbf{E}_t$. Next, setting $\mathbf{G} = \bar{\gamma} \cdot \mathbf{E}_t$, we obtain

G \302\246
Vx/,
- n = V \342\200\242 -
(\342\204\226) V/V G -
\342\200\242
Gn F.32)
j \320\251
^

The first term in the above identity can be written as

V \302\246
($G)
- V,
\342\200\242
($G)
- J
(\342\204\226\342\200\236)
(\342\204\226\342\200\236)
+?- on

and as usual Gn
= n-G. Also, since BGjdn = C?-V
V\342\200\242 the LHS
G,4-\320\243\320\241\342\200\236. of
F.32) reduces to

¹The book by van Bladel [13] also contains an extensive list of identities associated with divergence
and curl operators.

G \302\246
Vtfr
- ii =
V,
\342\200\242
WG)
-
*v \302\246
G, F.33)
J \320\251

We can thus replace the integrand of $I_3$ with the expression in (6.33) and use the
divergence theorem to eliminate the first term of (6.33). Specifically,

= f m-(^G)dS- f
s\302\273

where $\hat{m}$ has been defined earlier and the contour integral vanishes when the
principal curvatures of the outer boundary are equal to zero, i.e., for a rectangular ABC
surface. The integral, however, does not vanish for spherical or cylindrical
boundaries, as was pointed out in [38]. In our computations, we have ignored the
contribution of the contour integral that results from the non-vanishing portion of the
surface integral. For further discussion on how to include the effect of the contour
integral without destroying symmetry of the finite element matrix, the interested
reader is referred to [38]. The integral $I_3$ can finally be rewritten as

F.341

Using (6.28), (6.31), and (6.34), the complete surface integral term
incorporating the conformal second-order ABC reduces to

f
E \342\200\242 dS
/\302\2732(E)
=
[ (a,El
+ a2El) dS + [ /3[( V x E)J2 dS

(V-E,){V.G.E),}rfS F.35)

It remains to be seen whether the integrals in F.35) lead to a symmetric system


when incorporated into the finite element equations. With this in mind, we will
examine three simple shapes and check whether they preserve symmetry of the finite

element system. It will then be possible to generalize our findings to a more general

mesh truncation boundary.


Let us consider the case of a sphere of radius $r$. Since the two principal
curvatures of the sphere are identical ($\kappa_1 = \kappa_2 = 1/r$), the first-order boundary condition
reduces to the simple Sommerfeld radiation condition

$$\int_{S_0} \mathbf{E} \cdot P_1(\mathbf{E}) \, dS = jk_0 \int_{S_0} \left( E_{t_1}^2 + E_{t_2}^2 \right) dS \qquad (6.36)$$

On a spherical boundary, the second-order ABC also reduces to the comparatively


simple form

F.37)

The ABC given in (6.37) is identical to the boundary condition derived in [16] for a
spherical mesh termination surface and leads to a symmetric system of equations.

Next, we consider a piecewise planar termination boundary, in which case
$\kappa_1 = \kappa_2 = 0$ (see Fig. 6.4). The first-order ABC then reduces to the Sommerfeld
radiation condition, and the second-order ABC for a planar boundary simplifies to

[
E \302\246
P2(E) dS =
\\ \\jk0E>
+
^- [(V x E)J2 - JL (V
\342\200\242
E,J]
dS F.38)

Since the planar boundary is a special case of a spherical boundary, (6.38) again
reduces to a symmetric system of equations.
Now we examine the situation when the mesh termination boundary is
cylindrical in shape and of radius $\rho$. The principal curvatures of the cylindrical surface are
then $\kappa_1 = 1/\rho$ and $\kappa_2 = 0$. Since the principal curvatures are no longer identical, the
tensors $\bar{\alpha}$ and $\bar{\gamma}$ do not reduce to simple scalars. The first-order ABC for a cylindrical
outer boundary is given by
[ f F.39)
)sn Js,,

and the second-order ABC gives

2
JSa JS,,

F.40)
-L

where $\alpha_{c1}$, $\alpha_{c2}$, $\beta_c$, and $\bar{\gamma}_c$ are obtained by substituting $\kappa_1 = 1/\rho$ and $\kappa_2 = 0$ in the
original expressions for $\bar{\alpha}$, $\beta$, and $\bar{\gamma}$. It is seen that the first-order ABC given by (6.39)
leads to a symmetric matrix for a cylindrical boundary. On the other hand, the
second-order ABC does not yield a symmetric matrix for an arbitrary choice of
basis functions. However, the boundary condition outlined in (6.40) preserves
symmetry on using linear edge-based elements for discretization.
Figure 6.4 Scatterer enclosed in a piecewise planar ABC surface.

The above discussion enables us to conclude that the first-order boundary
condition leads to a symmetric system for surfaces having arbitrary principal
curvatures. However, symmetry is guaranteed for the second-order ABC only when the
two principal curvatures of the mesh termination boundary are identical, i.e., only
when the outer boundary is limited to a planar or a spherical surface. Thus, if we
want to enclose a scatterer having arbitrary shape within a conformal outer
boundary, an unsymmetric system of equations will have to be solved. It should, however,
be noted that the resulting unsymmetric system will, in general, have fewer
unknowns than its symmetric counterpart.
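The dependence of the first-order ABC on the two principal curvatures can be illustrated with a short numerical sketch. The fragment below is only an illustration under stated assumptions: the function name `first_order_abc_weights` is hypothetical, the first-order operator is assumed to have the form $(jk_0 + \kappa_m - \bar{\kappa}\cdot)\mathbf{E}_t$ with $\kappa_m = (\kappa_1 + \kappa_2)/2$, and $\bar{\kappa}$ is assumed diagonal in the principal directions, so that the surface integrand is a weighted sum of the two squared tangential components.

```python
def first_order_abc_weights(k0, kappa1, kappa2):
    """Weights multiplying the two squared tangential field components.

    Assumption: the first-order operator is (j*k0 + kappa_m - kappa_bar) E_t,
    with kappa_m the mean of the two principal curvatures and kappa_bar
    diagonal in the principal directions.
    """
    kappa_m = 0.5 * (kappa1 + kappa2)
    return (1j * k0 + kappa_m - kappa1, 1j * k0 + kappa_m - kappa2)


k0 = 2.0      # free-space wavenumber (rad/m), arbitrary illustrative value
radius = 4.0  # radius of the sphere or cylinder (m), also arbitrary

sphere = first_order_abc_weights(k0, 1.0 / radius, 1.0 / radius)
plane = first_order_abc_weights(k0, 0.0, 0.0)
cylinder = first_order_abc_weights(k0, 1.0 / radius, 0.0)

print("sphere  :", sphere)
print("plane   :", plane)
print("cylinder:", cylinder)
```

For the sphere and the plane both weights collapse to $jk_0$, which is the Sommerfeld condition quoted above. For the cylinder the two weights differ, but each remains a plain scalar, which is consistent with the first-order ABC staying symmetric for arbitrary curvatures.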

6.2.2 Artificial Absorbers

The purpose of the absorbing boundary condition is to absorb the outgoing
wave so that there are no reflections back into the computational domain. An
alternative to these ABCs is to use lossy layers to absorb the outgoing field [8]. The
absorber proposed in [8] was a three-layer absorber with the relative permeability
($\mu_r$) equal to the relative permittivity ($\epsilon_r$) to preserve duality. The constitutive
parameters of the medium were obtained by minimizing the reflection coefficient over a
wide range of incidence angles, using a multidimensional simplex optimization
algorithm. These absorbers usually consist of nonphysical materials to obtain the high
level of absorption over the entire spectrum. The advantage of such a technique is
evident: the user can specify the number of layers and their thickness and can,
therefore, customize the absorber over a wide range of incidence angles and even
frequencies.
However, in finite element analysis, it is not yet clear whether they can be used to
terminate the mesh closer than ABCs in a wide class of problems. Moreover, in three
dimensions, the mesh for such structures can get increasingly complicated and reduce
their viability when compared with ABCs. In the following section, we discuss a
particular version of artificial absorbers called PML (Perfectly Matched Layer)
that has shown considerable promise in dealing with open boundary problems.

6.2.2.1 Perfectly Matched Layer (PML). The PML concept as envisioned by
Berenger [17] rests on the concept of splitting each of the electric and magnetic field
components into two parts. In the case of the most general medium, Berenger defines
an electric conductivity $\sigma$ and a magnetic conductivity $\sigma^*$. However, Berenger goes
one step further by introducing additional degrees of freedom in the electric and
magnetic conductivity of the medium. The resulting medium thus becomes an
artificial absorber that may no longer support Maxwellian waves. Nevertheless,
the resulting solution of the modified Maxwell's equations permits the mathematical
introduction of media interfaces which are completely nonreflecting for all incidence
angles and material layers which are highly absorptive.

As an example, Berenger replaced the x component

BE. \321\202. \320\250,

of Maxwell's time domain equation by the split equations

9>^ F-42)

With this replacement, the field component $E_x$ was split into $E_{xy}$ and $E_{xz}$ with the
corresponding introduction of the new conductivities $\sigma_y$ and $\sigma_z$. Thus an additional
degree of freedom was introduced in defining the conductivity of the medium, leading
to the appearance of the split field components. Similarly, for the dual of (6.42) we
have

\320\264\321\203
F.43)

where now $\sigma^*$ was generalized to $\sigma_y^*$ and $\sigma_z^*$, and $H_x$ was split into $H_{xy}$ and $H_{xz}$.

In generalizing F.42) and F.43), Maxwell'sequation VxW =


becomes [18]

\320\224
F.44)
)

in which $\sigma_{x,y,z}$ denote the new electric conductivity parameters. The dual equations
of (6.44)-(6.46) can be obtained by replacing $E_{ab}$ with $H_{ab}$, $H_{ab}$ with $-E_{ab}$, $\epsilon_0$ with $\mu_0$
and vice versa, and $\sigma_{x,y,z}$ with $\sigma^*_{x,y,z}$. The latter are the new magnetic conductivity
components. If the condition

$$\frac{\sigma}{\epsilon_0} = \frac{\sigma^*}{\mu_0} \qquad (6.47)$$

is satisfied, it can be shown that the medium is perfectly matched to vacuum (hence,
the acronym PML) and plane waves propagating normally into a PML half-space
should be completely absorbed [17]. The most powerful attribute of PMLs is that
zero reflection can also be obtained irrespective of incidence angles under special
conditions.
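The matching condition can be checked numerically for the simplest case of a homogeneous lossy medium at normal incidence. The sketch below is an illustration, not part of the original derivation: the function name `lossy_medium_impedance` is hypothetical, an $e^{j\omega t}$ time convention is assumed, and the conductivities are folded into the complex constitutive parameters $\epsilon_0 - j\sigma/\omega$ and $\mu_0 - j\sigma^*/\omega$.

```python
import cmath
import math

EPS0 = 8.8541878128e-12    # free-space permittivity (F/m)
MU0 = 4.0e-7 * math.pi     # free-space permeability (H/m), classical value
Z0 = math.sqrt(MU0 / EPS0)  # free-space wave impedance (ohms)


def lossy_medium_impedance(sigma, sigma_star, omega):
    """Wave impedance of a medium with electric and magnetic conductivity.

    Assumes exp(+j*omega*t) time dependence, so each conductivity adds a
    negative imaginary part to the corresponding constitutive parameter.
    """
    eps_c = EPS0 - 1j * sigma / omega
    mu_c = MU0 - 1j * sigma_star / omega
    return cmath.sqrt(mu_c / eps_c)


omega = 2.0 * math.pi * 1.0e9  # 1 GHz, arbitrary illustrative frequency
sigma = 0.5                    # electric conductivity (S/m), arbitrary

# Magnetic conductivity chosen so that sigma/eps0 equals sigma_star/mu0.
matched = lossy_medium_impedance(sigma, sigma * MU0 / EPS0, omega)
# Same electric loss but no magnetic loss: the matching condition is violated.
unmatched = lossy_medium_impedance(sigma, 0.0, omega)

print("matched   :", matched)
print("unmatched :", unmatched)
print("free space:", Z0)
```

When the condition holds, the loss factors cancel in the ratio and the impedance equals that of free space at every frequency, so a normally incident wave sees no discontinuity even though the medium is lossy.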
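As an example application of the PML concept, let us consider the propagation
of a plane wave through the PML medium (Fig. 6.5). The following derivation
borrows heavily from [17]. We assume a TE plane wave of magnitude $E_0$ traveling
in the xy plane. The electric field and the split components of the magnetic field, $H_{zx}$
and $H_{zy}$, can be expressed as

$$E_x = -E_0 \sin\phi \, e^{j\omega(t - \alpha x - \beta y)}, \qquad E_y = E_0 \cos\phi \, e^{j\omega(t - \alpha x - \beta y)}$$

$$H_{zx} = H_{zx0} \, e^{j\omega(t - \alpha x - \beta y)}, \qquad H_{zy} = H_{zy0} \, e^{j\omega(t - \alpha x - \beta y)} \qquad (6.48)$$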

Figure 6.5 Cross section of 3D geometry with PML enclosing the outer boundary of the
mesh. PML($\sigma_x$, $\sigma_x^*$, $\sigma_y$, $\sigma_y^*$) implies that the medium has a conductivity of ($\sigma_x$, $\sigma_x^*$) in the
x direction and a conductivity of ($\sigma_y$, $\sigma_y^*$) along the y-axis.

In (6.48), $\phi$ is the angle the electric field makes with the y-axis (equal to the angle
between the incident field direction and the x-axis) and $\alpha$, $\beta$, $H_{zx0}$, $H_{zy0}$ are
unknowns to be determined from the split Maxwell's equations. Substituting the
values from (6.48) into (6.44) and (6.45) and the relation for $H_z$, after taking the
necessary time derivatives, we obtain

$$\epsilon_0 E_0 \sin\phi - j\frac{\sigma_y}{\omega} E_0 \sin\phi = \beta \, (H_{zx0} + H_{zy0})$$

$$\epsilon_0 E_0 \cos\phi - j\frac{\sigma_x}{\omega} E_0 \cos\phi = \alpha \, (H_{zx0} + H_{zy0})$$

$$\mu_0 H_{zx0} - j\frac{\sigma_x^*}{\omega} H_{zx0} = \alpha E_0 \cos\phi$$

$$\mu_0 H_{zy0} - j\frac{\sigma_y^*}{\omega} H_{zy0} = \beta E_0 \sin\phi \qquad (6.49)$$

Eliminating $H_{zx0}$ and $H_{zy0}$ from (6.49), we arrive at a relation between $\alpha$ and $\beta$.

F.50)
a

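Solving for $\alpha$ and $\beta$ and considering a positive direction of propagation, we have

$$\alpha = \frac{\sqrt{\mu_0 \epsilon_0}}{G} \left( 1 - j\frac{\sigma_x}{\omega \epsilon_0} \right) \cos\phi, \qquad \beta = \frac{\sqrt{\mu_0 \epsilon_0}}{G} \left( 1 - j\frac{\sigma_y}{\omega \epsilon_0} \right) \sin\phi \qquad (6.51)$$

where $G = \sqrt{w_x \cos^2\phi + w_y \sin^2\phi}$ and

$$w_x = \frac{1 - j\sigma_x/(\omega \epsilon_0)}{1 - j\sigma_x^*/(\omega \mu_0)}, \qquad w_y = \frac{1 - j\sigma_y/(\omega \epsilon_0)}{1 - j\sigma_y^*/(\omega \mu_0)} \qquad (6.52)$$

The remaining unknowns $H_{zx0}$ and $H_{zy0}$ can now be obtained as

$$H_{zx0} = E_0 \frac{1}{Z_0 G} w_x \cos^2\phi, \qquad H_{zy0} = E_0 \frac{1}{Z_0 G} w_y \sin^2\phi \qquad (6.53)$$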

Thus, the impedance of the plane wave in the PML medium is given by

$$Z = \frac{Z_0}{G} \qquad (6.54)$$

Assuming that both $(\sigma_x, \sigma_x^*)$ and $(\sigma_y, \sigma_y^*)$ satisfy the PML condition (6.47), the variables $G$, $w_x$, and
$w_y$ become unity irrespective of frequency or angle of incidence. Consequently, the
impedance of the plane wave in the PML medium reduces to the free-space
impedance, and hence, the PML medium is perfectly matched to vacuum. Thus, any plane
wave traveling from vacuum to a properly matched PML medium will be entirely
transmitted. Another interesting thing happens to the magnitude of the propagating
plane wave. The electric field of the plane wave in the PML medium can be written as
$E = E_0 \psi$, where

$$\psi = \exp\left[ -j\omega \frac{x \cos\phi + y \sin\phi}{c} \right] \exp\left[ -\frac{\sigma_x \cos\phi}{\epsilon_0 c} x \right] \exp\left[ -\frac{\sigma_y \sin\phi}{\epsilon_0 c} y \right] \qquad (6.55)$$

where $c$ denotes the velocity of light in free space. Therefore, the PML acts as an
absorbing medium and, in the limit, will eventually absorb all propagating waves of
all frequencies and incidence angles. The same results hold true for the TM case with
the electric field replaced with a magnetic field. Since an arbitrary plane wave can be
considered as a superposition of TE$_z$ and TM$_z$ modes, the above analysis holds true
for all plane waves.
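The statements about the matched PML can be verified with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions that should be read with care: it uses the TE plane-wave solution of Berenger [17] in the $e^{j\omega t}$ convention, with $w_x$ and $w_y$ defined as ratios of the electric and magnetic loss factors, and the function name `berenger_te_parameters` is hypothetical.

```python
import cmath
import math

EPS0 = 8.8541878128e-12     # free-space permittivity (F/m)
MU0 = 4.0e-7 * math.pi      # free-space permeability (H/m), classical value
C0 = 1.0 / math.sqrt(EPS0 * MU0)
Z0 = math.sqrt(MU0 / EPS0)


def berenger_te_parameters(sig_x, sig_x_star, sig_y, sig_y_star, omega, phi):
    """Return (G, Z, alpha, beta) for a TE plane wave in a split-field PML.

    Assumption: fields vary as exp(j*omega*(t - alpha*x - beta*y)).
    """
    ex = 1.0 - 1j * sig_x / (omega * EPS0)        # electric loss factor, x
    ey = 1.0 - 1j * sig_y / (omega * EPS0)        # electric loss factor, y
    mx = 1.0 - 1j * sig_x_star / (omega * MU0)    # magnetic loss factor, x
    my = 1.0 - 1j * sig_y_star / (omega * MU0)    # magnetic loss factor, y
    wx, wy = ex / mx, ey / my
    g = cmath.sqrt(wx * math.cos(phi) ** 2 + wy * math.sin(phi) ** 2)
    alpha = ex * math.cos(phi) / (C0 * g)
    beta = ey * math.sin(phi) / (C0 * g)
    return g, Z0 / g, alpha, beta


omega = 2.0 * math.pi * 1.0e9   # arbitrary illustrative frequency
phi = math.radians(30.0)        # arbitrary illustrative angle
sig_x, sig_y = 0.3, 0.1         # electric conductivities (S/m), arbitrary

# Magnetic conductivities chosen to satisfy the matching condition.
G, Z, alpha, beta = berenger_te_parameters(
    sig_x, sig_x * MU0 / EPS0, sig_y, sig_y * MU0 / EPS0, omega, phi
)

# The amplitude decays as exp(-decay_x * x) along x.
decay_x = -omega * alpha.imag
expected_decay_x = sig_x * math.cos(phi) / (EPS0 * C0)

print("G =", G)
print("Z =", Z)
print("decay rate along x (Np/m):", decay_x)
```

With matched conductivities the computed $G$ is unity and $Z$ equals the free-space impedance, while the decay rate along x equals $\sigma_x \cos\phi / (\epsilon_0 c)$, independent of frequency.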
The story, however, is not complete. We have merely established the fact that
the PML absorbs plane waves incident upon it very effectively. It is still unclear what
happens when a plane wave is incident at the interface of two PML media. It has
been shown in [17] that for an interface normal to the x- or y-axis and lying between
two matched PML media having the same conductivity couplet $(\sigma_y, \sigma_y^*)$ or $(\sigma_x, \sigma_x^*)$, respectively, a
plane wave is transmitted without reflection regardless of angle of incidence and
frequency. This is true between vacuum and a PML medium as well because vacuum
can be thought of as a zero electric and magnetic conductivity medium. This
principle is illustrated in Fig. 6.5. Not considering the corners of the domain, PML media
along the x-axis are given conductivity values of $(0, 0, \sigma_y, \sigma_y^*)$, whereas PML media
parallel to the y-axis have conductivity values of $(\sigma_x, \sigma_x^*, 0, 0)$. The corners have
conductivities which are a superposition of the intersecting layers. The corners
play a very important role since they absorb the transverse components of the
field entering into the PML layer [19]. The above derivation can be easily extended
to three dimensions by considering a separate couplet of conductivities in the third
orthogonal direction and superimposing the conductivity couplets in the two
preferred directions of propagation.
In [17], Berenger terminated the PML boundary with a perfect electric
conductor. Therefore, the reflection of the propagating plane wave occurs at the electric
wall and is turned back into the medium. The plane wave then passes back through
the PML layer, a part of it getting absorbed along the way, and then re-enters the
domain. A refinement of this approach is to use a second-order ABC termination on
the outer wall rather than the perfect electric condition to reduce spurious reflections.
There are some important issues raised by the PML concept.
• The PML concept essentially falls out of the split-field formulation of
Maxwell's equations. Unfortunately, this also renders the PML medium
non-Maxwellian with active sources embedded inside the medium [19].
• It is not very clear how to include the PML formulation for frequency
domain problems since the conventional curl-curl formulation may not be
valid.

We will try to answer a few of these questions in the next section. However, this
topic is still an area of active research and the behavior of PML is not very well
understood.

6.2.2.2 Various Interpretations of PML. In the previous section, we outlined
the split-field formulation of PML. What if the fields were not split into two
constituent parts? Pekel and Mittra [19] explore this concept and arrive at some
interesting conclusions. They rewrite the set of PML equations (6.44)-(6.46) such that all
split fields are eliminated. Assuming $U_{xy} + U_{xz} = U_x$ for all components of the
electric and magnetic field, the modified equations in the frequency domain are

- --
at\" erf,
+ oj)
(><?\342\200\236 Ex = -jkyH. +jk:Hy >ejk.Hy
JOlKQ + O

+ aV) = -Jk.Hx +jkxH. F,56)


Ey
-j^Z^L._flcxHt
\320\241 f

= -JkxHy+jkyHx - x
+
\302\25360 ax) Ez I jkyHx

for the electric field, where $k_{x,y,z}$ denote the (x, y, z) components of the propagation
vector $\mathbf{k}$. The magnetic field equations can be obtained from (6.56) as outlined
earlier. On imposing the PML condition (6.47), duality is restored between the
electric and the magnetic fields. This is necessary so that electric and magnetic
field formulations lead to equivalent final results. Thus, (6.56) shows that in addition
to the regular terms arising from Maxwell's equations for an anisotropic medium
with a 3 × 3 uniaxial conductivity tensor, there are excitation-dependent source
terms. These source terms are proportional to the differences in the conductivities.
The PML medium is thus active.
Considering z-directed propagation through the PML medium and enforcing
continuity and phase matching conditions at the PML interface, it is found that the
following constraints must be satisfied [19]

F,571

= = = = \320\276
^\320\264- \320\232 \302\260; \302\260i \302\253^

where kz = ?0A cos


\342\200\224jkri) The
\320\262. cr'' values need to be permuted as the direction of

propagation changes from z to x or y. This result is identical to the PML medium
proposed by Berenger [17] and is illustrated in Fig. 6.5. When the unsplit PML
equations (6.56) are once again rewritten with the values in (6.57), we obtain

. = -jkxHy +jkyHx

for z-directed propagation. These equations are identical to the ones presented in [9],
based on coordinate stretching. In the coordinate stretching technique, the spatial
variable $u$ is replaced by a complex spatial variable given by

F.59)

assuming that the wave is propagating in the u-direction. The spatial variable $u$
can be either x, y, or z. The problem with this formulation is that the curl-curl equation
must be reformulated in the PML medium to allow the fields to satisfy Maxwell's
equations. However, it provides us with valuable insight into the true nature of the
PML medium.
As was shown in [19], the PML medium can be thought to consist of an
anisotropic material with a conductivity tensor. Along these lines, Sacks et al. [20]
(see also Kingsland et al. [21]) have proposed an anisotropic absorber with perfect
transmission characteristics over all incident angles and frequencies for planar
surfaces. Assuming diagonal tensors, the permeability and permittivity tensors in the
most general case can be written as

$$\frac{\bar{\mu}}{\mu_0} = \frac{\bar{\epsilon}}{\epsilon_0} = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} \qquad (6.60)$$

The corresponding reflection coefficients for the TE and TM cases are given by

$$R^{TE} = \frac{\cos\theta_i - \sqrt{b/a}\,\cos\theta_t}{\cos\theta_i + \sqrt{b/a}\,\cos\theta_t}, \qquad R^{TM} = \frac{\sqrt{b/a}\,\cos\theta_t - \cos\theta_i}{\cos\theta_i + \sqrt{b/a}\,\cos\theta_t} \qquad (6.61)$$

with $\theta_i$ and $\theta_t$ as displayed in Fig. 6.6. From the phase matching condition at the
interface, we also have

$$\sqrt{bc}\,\sin\theta_t = \sin\theta_i \qquad (6.62)$$

Figure 6.6 Plane wave incidence on a PML interface.

In order to make the reflection coefficient independent of incidence angle, we choose
$\sqrt{bc} = 1$. The zero reflection condition is achieved by setting $a = b$. Thus, we can set
$a$ to be an arbitrary complex number $\alpha - j\beta$ (for example) and rewrite the
anisotropic material tensors as

$$\frac{\bar{\mu}}{\mu_0} = \frac{\bar{\epsilon}}{\epsilon_0} = \begin{bmatrix} \alpha - j\beta & 0 & 0 \\ 0 & \alpha - j\beta & 0 \\ 0 & 0 & \dfrac{1}{\alpha - j\beta} \end{bmatrix} \qquad (6.63)$$
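The two design conditions for the anisotropic absorber can be checked directly from the reflection coefficients. The following sketch is a hedged illustration: the function name `uniaxial_reflection` is hypothetical, and it assumes the TE and TM reflection formulas and the phase matching relation $\sqrt{bc}\,\sin\theta_t = \sin\theta_i$ for an interface between free space and a medium with diagonal relative tensors diag(a, b, c).

```python
import cmath
import math


def uniaxial_reflection(a, b, c, theta_i):
    """Return (R_TE, R_TM) at a free-space / diag(a, b, c) interface.

    Assumption: sqrt(b*c) * sin(theta_t) = sin(theta_i) fixes the
    (generally complex) transmission angle.
    """
    sin_t = math.sin(theta_i) / cmath.sqrt(b * c)
    cos_t = cmath.sqrt(1.0 - sin_t ** 2)
    cos_i = math.cos(theta_i)
    ratio = cmath.sqrt(b / a)
    r_te = (cos_i - ratio * cos_t) / (cos_i + ratio * cos_t)
    r_tm = (ratio * cos_t - cos_i) / (cos_i + ratio * cos_t)
    return r_te, r_tm


a = 1.0 - 1.0j  # alpha - j*beta with alpha = beta = 1, illustrative choice

# Matched choice: a = b and c = 1/a, swept over several incidence angles.
matched = [
    uniaxial_reflection(a, a, 1.0 / a, math.radians(deg))
    for deg in (0.0, 30.0, 60.0, 85.0)
]

# Mismatched choice: b*c = 1 still holds, but a differs from b.
mismatched = uniaxial_reflection(2.0, 1.0, 1.0, math.radians(30.0))

for r_te, r_tm in matched:
    print(abs(r_te), abs(r_tm))
print("mismatched:", abs(mismatched[0]), abs(mismatched[1]))
```

For the matched tensor both reflection coefficients vanish to rounding error at every angle tried, whereas breaking the condition $a = b$ produces a finite reflection even though $\sqrt{bc} = 1$ is still satisfied.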

The design of the anisotropic absorber, therefore, reduces to determining the values
of $\alpha$ and $\beta$. The parameter $\beta$ is more crucial since it controls the absorptivity in the
PML medium. In [22], it is shown that the choice of $\beta$ is critical for the performance
of the absorber. If $\beta$ is too small, field decay is insufficient to eliminate reflection.
Too large a value of $\beta$ leads to reflection since the mesh is insufficient to model the
sudden jump in material property. This phenomenon is illustrated clearly in Fig. 6.7.
A finer mesh will improve the situation but will perhaps never remove the problem.
The value of $\beta$ optimized for normal incidence is given by the relation [22]

^-= -0.0106|/?|+0.0433
F.64)

where \\R\\ is the desired reflection coefficient in dB, kg


=
k0/ cos0, t is the thickness of

the PML, and $N$ is the sampling density in the PML. Typical good choices for $\alpha$ and
$\beta$ are $\alpha \approx \beta \approx 1$.
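The trade-off governing the choice of $\beta$ can be made tangible with a rough estimate of the reflection from a metal-backed layer. The sketch below is not the optimized relation of [22]; it is a simplified model under explicit assumptions: the layer is terminated by a perfect electric conductor, the wave decays as $\exp(-\beta k_0 z \cos\theta)$ inside it, and discretization error is ignored entirely. The function name `pml_roundtrip_reflection_db` is hypothetical.

```python
import math


def pml_roundtrip_reflection_db(beta, thickness_wavelengths, theta=0.0):
    """Ideal round-trip reflection (dB) of a metal-backed absorbing layer.

    Assumptions: continuous (unmeshed) medium, decay exp(-beta*k0*z*cos(theta))
    on each pass, total reflection at the backing wall.
    """
    k0_t = 2.0 * math.pi * thickness_wavelengths
    magnitude = math.exp(-2.0 * beta * k0_t * math.cos(theta))
    return 20.0 * math.log10(magnitude)


quarter_wave = pml_roundtrip_reflection_db(1.0, 0.25)
weak_layer = pml_roundtrip_reflection_db(0.25, 0.25)

print("beta = 1.00, t = 0.25 wavelengths:", quarter_wave, "dB")
print("beta = 0.25, t = 0.25 wavelengths:", weak_layer, "dB")
```

In this idealized model a larger $\beta$ always lowers the reflection. In a discretized layer that trend eventually reverses, because the mesh cannot resolve the rapid decay, which is why an optimum value exists in practice.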

The use of anisotropic absorbers thus simplifies the implementation of the
PML to a certain degree since the curl-curl equation can now be solved within
the framework of Maxwell's equations. It is without doubt that the PML is a very
effective absorbing medium. In fact, it is probably the best artificial absorber known
to date since it is reflectionless in the limit for all incident angles, frequencies, and
polarizations. But how effective is PML and how close to the target can we place it? It
has been shown that the PML does not do a good job of absorbing evanescent waves
[24], implying that it still needs to be placed far enough from the target for the
evanescent modes to die out. This fact, coupled with the convergence difficulties
presented in solving for fields in an active medium, indicates that further research
needs to be done to determine the viability of the PML when compared to the ABCs.

Figure 6.7 Field decay in the PML medium for different values of $\beta$ (0.5, 1, 2, 3, 4, and 10) and $\alpha = 1$,
plotted against segment number along the waveguide (f = 4.5 GHz).
Curves refer to the decay of a wave in a substrate guided by a stripline.
[After Gong and Volakis, © IEE, 1995 [22].]

6.3 FORMULATION

In the following section, the open domain problem is modeled with ABCs described
earlier and is formulated in terms of the finite element functional. A Rayleigh-Ritz
minimization is then carried out in the usual way to find the stationary point of the
functional.

6.3.1 Scattered and Total Field Formulations

Let us consider the problem of scattering by an inhomogeneous target having
possible material discontinuities. As outlined in Chapter 4 for the 2D case, it is
necessary to enclose the radiator or the scatterer, embedded inside the volume $V$,
by an artificial surface $S_0$ (contour in the case of 2D domains) on which the ABC is
enforced (see Fig. 6.1). The general form of the ABC is as given in (6.21) and (6.37),
provided $\mathbf{E}$ is interpreted as the scattered field $\mathbf{E}^{scat}$. On writing the same equation
with the total field as the working variable, we get

$$\hat{n} \times \nabla \times \mathbf{E} = P(\mathbf{E}) + \mathbf{U}^{inc} \qquad (6.65)$$

where

$$\mathbf{U}^{inc} = \hat{n} \times \nabla \times \mathbf{E}^{inc} - P(\mathbf{E}^{inc}) \qquad (6.66)$$

and $\mathbf{E} = \mathbf{E}^{scat} + \mathbf{E}^{inc}$ is the total field with $\mathbf{E}^{inc}$ being the incident electric field. As
usual, $\mathbf{E}^{scat}$ denotes the scattered field. Considering (6.65) to be the boundary
condition employed at $S_0$, we can express the functional for the total electric field as

F(E)= f (V x
\320\223\342\200\224 E) \342\200\242
(V x E)
- k%erE
\342\200\242
eI dV
J^LMr J
+Jk0Z0 f \\z
x E)
(\302\253
\302\246x
(\320\270 E) dS

F.671

where the scalar $K$ is the surface resistivity ($R$) when integrating over a resistive card
and equals the surface impedance ($\eta$) for an impedance sheet (see Chapters 1 and 5).
The linear functional (6.67) is formulated in terms of the total field, but we can
easily revert to a scattered field formulation by setting $\mathbf{E}^{scat} = \mathbf{E} - \mathbf{E}^{inc}$ and noting
that the scattered field must satisfy the wave equation inside the domain of interest.
Let us consider the case where the computational volume $V$ is occupied by a
dielectric structure and is bounded internally by the surface of a perfect conductor
and externally by the mesh termination boundary. On examining the terms inside the
volume integral in (6.67), we can define

$$G(\mathbf{E}, \mathbf{E}) = \int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}) \cdot (\nabla \times \mathbf{E}) - k_0^2 \epsilon_r \mathbf{E} \cdot \mathbf{E} \right] dV \qquad (6.68)$$

Expressing the above relation in terms of the incident and the scattered fields, we
have

$$G(\mathbf{E}, \mathbf{E}) = G(\mathbf{E}^{scat}, \mathbf{E}^{scat}) + 2G(\mathbf{E}^{scat}, \mathbf{E}^{inc}) + G(\mathbf{E}^{inc}, \mathbf{E}^{inc}) \qquad (6.69)$$
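The splitting into three terms relies only on the form being bilinear and symmetric in its two arguments. A discrete analogue makes this explicit. In the sketch below, which is purely illustrative, a small complex symmetric matrix stands in for the discretized volume integral, the two vectors stand in for the scattered and incident field coefficients, and the helper name `bilinear` is hypothetical. No complex conjugation is applied, in keeping with an unconjugated functional.

```python
def bilinear(matrix, u, v):
    """Unconjugated bilinear form u^T A v for small dense lists."""
    n = len(u)
    return sum(u[i] * matrix[i][j] * v[j] for i in range(n) for j in range(n))


# Complex symmetric (not Hermitian) stand-in for the discretized operator.
A = [
    [4, 1 - 2j, 0],
    [1 - 2j, 3, 2j],
    [0, 2j, 5],
]

e_scat = [1 + 1j, 2, -1j]  # stand-in scattered field coefficients
e_inc = [0, 1 - 1j, 3]     # stand-in incident field coefficients
e_total = [s + i for s, i in zip(e_scat, e_inc)]

lhs = bilinear(A, e_total, e_total)
rhs = (
    bilinear(A, e_scat, e_scat)
    + 2 * bilinear(A, e_scat, e_inc)
    + bilinear(A, e_inc, e_inc)
)

print("G(E, E)          =", lhs)
print("three-term split =", rhs)
```

The factor of two on the cross term is valid only because the stand-in matrix is symmetric. With an unsymmetric matrix the two cross terms would differ and would have to be kept separately.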

The first and third terms on the RHS of (6.69) cannot be simplified any further than
the form given in (6.68). The second term does, however, lend itself to more
simplification. Making use of a simple vector identity and the divergence theorem, we can
rewrite $G(\mathbf{E}^{scat}, \mathbf{E}^{inc})$ as

$$G(\mathbf{E}^{scat}, \mathbf{E}^{inc}) = \int_V \mathbf{E}^{scat} \cdot \left[ \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E}^{inc} - k_0^2 \epsilon_r \mathbf{E}^{inc} \right] dV - \int_{S_0} \mathbf{E}^{scat} \cdot (\hat{n} \times \nabla \times \mathbf{E}^{inc}) \, dS \qquad (6.70)$$

since

$$\int_V \left[ \frac{1}{\mu_r} (\nabla \times \mathbf{E}^{scat}) \cdot (\nabla \times \mathbf{E}^{inc}) \right] dV = \int_V \mathbf{E}^{scat} \cdot \left[ \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E}^{inc} \right] dV - \int_S \frac{1}{\mu_r} \mathbf{E}^{scat} \cdot (\hat{n} \times \nabla \times \mathbf{E}^{inc}) \, dS \qquad (6.71)$$
iv l \320\224, J JsM,

and the surface integral cancels out everywhere inside the computational domain
except on the mesh termination boundary $S_0$. If we define $V_d$ to be the volume
occupied by dielectric materials, then the remaining volume ($V_0 = V - V_d$) is the
volume occupied by free space. On incorporating this into (6.70), we have

$$G(\mathbf{E}^{scat}, \mathbf{E}^{inc}) = \int_{V_0} \mathbf{E}^{scat} \cdot \left[ \nabla \times \nabla \times \mathbf{E}^{inc} - k_0^2 \mathbf{E}^{inc} \right] dV + \int_{V_d} \mathbf{E}^{scat} \cdot \left[ \nabla \times \frac{1}{\mu_r} \nabla \times \mathbf{E}^{inc} - k_0^2 \epsilon_r \mathbf{E}^{inc} \right] dV - \int_{S_0} \mathbf{E}^{scat} \cdot (\hat{n} \times \nabla \times \mathbf{E}^{inc}) \, dS \qquad (6.72)$$

Since the incident electric field satisfies the wave equation in free space, the first term of (6.72) is identically zero. The third term cancels exactly with the cross term $\int_{S_o} \mathbf{E}\cdot\mathbf{U}^{inc}\, dS$ in the total field functional (6.67). The second term can be simplified by employing the first vector Green's theorem to yield

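$$ \int_{V_d} \mathbf{E}^{scat}\cdot\left[ \nabla\times\frac{1}{\mu_r}\nabla\times\mathbf{E}^{inc} - k_0^2\epsilon_r\mathbf{E}^{inc} \right] dV = \int_{V_d} \left[ \frac{1}{\mu_r}(\nabla\times\mathbf{E}^{scat})\cdot(\nabla\times\mathbf{E}^{inc}) - k_0^2\epsilon_r\mathbf{E}^{scat}\cdot\mathbf{E}^{inc} \right] dV + jk_0Z_0 \oint_{S_d} \frac{1}{\mu_r}\mathbf{E}^{scat}\cdot(\hat{n}\times\mathbf{H}^{inc})\, dS \tag{6.73} $$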

where the normal to $S_d$ is directed away from $V_d$. The surface integral over the dielectric interface $S_d$ occurs since the tangential component of the scattered electric field is discontinuous over the interface between two dielectrics having dissimilar permeabilities. It should be noted that (6.73) is valid even when there are multiple dielectric regions present. If the dielectric regions have the same permeability ($\mu_{r1} = \mu_{r2} = \cdots = 1$, for example) and different permittivities, the surface integral contribution over the dielectric interfaces, $S_d$, is zero. If different permeability values are also present, then the permeability values must be substituted into the element equations and the direction of the normal for the two elements on the interface should take care of the respective signs.
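The statement that the free-space volume term of (6.72) vanishes can be checked for a single plane wave. The sketch below assumes an incident field of the form $\mathbf{E}^{inc} = \mathbf{e}_0 e^{-j\mathbf{k}\cdot\mathbf{r}}$ with $|\mathbf{k}| = k_0$ and $\mathbf{e}_0 \perp \mathbf{k}$; the direction angles and the helper names are illustrative choices, not values from the text.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curl_curl_amplitude(k_vec, e0):
    # For E = e0 * exp(-j k.r) each nabla acts as multiplication by -j*k, so
    # curl(curl E) has amplitude (-j k) x ((-j k) x e0) = -k x (k x e0).
    return tuple(-c for c in cross(k_vec, cross(k_vec, e0)))

k0 = 2.0 * math.pi                      # free-space wavenumber for unit wavelength
theta, phi = math.radians(40.0), math.radians(25.0)   # illustrative direction
k_hat = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
k_vec = tuple(k0 * c for c in k_hat)
# theta-hat is transverse to k_hat, so it is a valid plane-wave polarization
e0 = (math.cos(theta) * math.cos(phi),
      math.cos(theta) * math.sin(phi),
      -math.sin(theta))

# Amplitude of curl curl E_inc - k0^2 E_inc (the common phase factor is dropped)
residual = tuple(cc - k0 ** 2 * e
                 for cc, e in zip(curl_curl_amplitude(k_vec, e0), e0))
max_residual = max(abs(c) for c in residual)
print(max_residual)
```

The residual is at the level of floating-point round-off, which is the discrete counterpart of the bracket in the $V_o$ integral being identically zero.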
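Using (6.72) and (6.73), $G(\mathbf{E}^{scat},\mathbf{E}^{inc})$ reduces to

$$ G(\mathbf{E}^{scat},\mathbf{E}^{inc}) = \int_{V_d} \left[ \frac{1}{\mu_r}(\nabla\times\mathbf{E}^{scat})\cdot(\nabla\times\mathbf{E}^{inc}) - k_0^2\epsilon_r\mathbf{E}^{scat}\cdot\mathbf{E}^{inc} \right] dV + jk_0Z_0 \oint_{S_d} \frac{1}{\mu_r}\mathbf{E}^{scat}\cdot(\hat{n}\times\mathbf{H}^{inc})\, dS - \oint_{S_o} \mathbf{E}^{scat}\cdot(\hat{n}\times\nabla\times\mathbf{E}^{inc})\, dS \tag{6.74} $$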

The impedance and resistive sheet boundary conditions can be incorporated in a similar way into the scattered field functional. After simplification, the functional for the scattered field is given by

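$$ \begin{aligned} F(\mathbf{E}^{scat}) = {} & \int_V \left[ \frac{1}{\mu_r}(\nabla\times\mathbf{E}^{scat})\cdot(\nabla\times\mathbf{E}^{scat}) - k_0^2\epsilon_r\mathbf{E}^{scat}\cdot\mathbf{E}^{scat} \right] dV \\ & + jk_0Z_0 \int_{S_k} \frac{1}{K}(\hat{n}\times\mathbf{E}^{scat})\cdot(\hat{n}\times\mathbf{E}^{scat})\, dS \\ & + \int_{S_o} \mathbf{E}^{scat}\cdot P(\mathbf{E}^{scat})\, dS \\ & + 2jk_0Z_0 \oint_{S_d} \frac{1}{\mu_r}\mathbf{E}^{scat}\cdot(\hat{n}\times\mathbf{H}^{inc})\, dS \\ & + 2\int_{V_d} \left[ \frac{1}{\mu_r}(\nabla\times\mathbf{E}^{scat})\cdot(\nabla\times\mathbf{E}^{inc}) - k_0^2\epsilon_r\mathbf{E}^{scat}\cdot\mathbf{E}^{inc} \right] dV \\ & + 2jk_0Z_0 \int_{S_k} \frac{1}{K}(\hat{n}\times\mathbf{E}^{scat})\cdot(\hat{n}\times\mathbf{E}^{inc})\, dS \\ & + f(\mathbf{E}^{inc}) \end{aligned} \tag{6.75} $$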

where $V_d$ is the volume occupied by the dielectric (portion of $V$ where $\epsilon_r$ or $\mu_r$ are not unity) and $S_d$ encompasses all dielectric interface surfaces. The function $f(\mathbf{E}^{inc})$ is solely in terms of the incident electric field and vanishes when we take the first variation of $F(\mathbf{E}^{scat})$.

6.4 APPLICATIONS

The open domain problem has varied applications in scattering, radiation, and microwave circuit simulations. The computation of the radar echo area from geometrically complex 3D structures is of primary interest for object detection and identification. Antenna applications have been gaining widespread popularity in recent times, primarily to model radiation characteristics in the presence of complicated objects. Modeling the radiation from patch antennas mounted on airborne structures or the human tissue penetration of radiation from cellular phones are cases in point. In microwave circuits, calculation of power lost through radiation is often critical for meeting EMI/EMC specifications or satisfying design goals in densely packed high-frequency integrated circuits. In the following sections, we will present examples that model some of the phenomena mentioned earlier. The first few examples demonstrate the validity of the conformal ABCs; the remaining ones are progressively more complicated both in terms of modeling difficulty and structural features. They include scattering simulations as well as applications of ABCs and artificial absorbers in computing radiation from antennas and microwave circuits. It should also be mentioned that all scattering and circuit computations were done using $H_0(\mathrm{curl})$ elements (i.e., six unknowns per tetrahedron as mentioned in Chapter 2). The radiation problems were solved using linear basis functions on bricks and triangular prisms. An iterative solver was used for solving the final system of equations in all cases. Storage was, therefore, never a problem in any of the applications, although the convergence rate was geometry and excitation dependent.

6.4.1 Scattering Examples

It was discussed in Chapter 4 that of particular interest in scattering is the evaluation of radar echo width (for 2D) or echo area for three dimensions. The latter is given by

$$ \sigma_{3D} = \lim_{r\to\infty} 4\pi r^2 \frac{|\mathbf{E}^{scat}|^2}{|\mathbf{E}^{inc}|^2} $$

with $\mathbf{E}^{scat}$ as described by the far zone expressions (1.63). The currents $\mathbf{J}$ and $\mathbf{M}$ now represent equivalent currents computed from (see Fig. 1.3)

$$ \mathbf{J} = \hat{n}\times\mathbf{H}, \qquad \mathbf{M} = \mathbf{E}\times\hat{n} $$

where $\hat{n}$ denotes the outward normal of a surface $S_c$ that encloses the scatterer. This surface can be arbitrary but for better accuracy it should be placed as close as possible to the scatterer. In the following, we will mention the $\sigma_{\theta\theta}$ and $\sigma_{\phi\phi}$ echo area or radar cross section (RCS), as it is commonly referred to. The subscripts in these quantities simply identify the polarization of the incident and scattered fields used for the evaluation of the RCS. Specifically,

$$ \sigma_{pq} = \lim_{r\to\infty} 4\pi r^2 \frac{|E_p^{scat}|^2}{|E_q^{inc}|^2} $$

implying that $\sigma_{pq}$ is the measured RCS due to the $p$th component of the scattered field for a $q$-polarized incident plane wave. As usual, $p$ and $q$ represent either $\theta$ or $\phi$ spherical components, with $\theta$ measured from the $z$-axis and $\phi$ measured in the $xy$ plane from the $x$-axis.
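The polarization-resolved echo area defined above can be sketched numerically as follows. This is only an illustration under stated assumptions: the scattered field is taken to have the far-zone form $\mathbf{F}\,e^{-jk_0 r}/r$ with a made-up vector amplitude `far_amplitude`, the limit is replaced by evaluation at a large but finite `r`, and all function names are hypothetical rather than taken from any particular code.

```python
import math

def spherical_unit_vectors(theta, phi):
    """Return (theta_hat, phi_hat); theta from the z-axis, phi from the x-axis in the xy plane."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    theta_hat = (ct * cp, ct * sp, -st)
    phi_hat = (-sp, cp, 0.0)
    return theta_hat, phi_hat

def dot(a, b):
    """Unconjugated dot product (used to project a complex field on a real unit vector)."""
    return sum(x * y for x, y in zip(a, b))

def echo_area(e_scat_p, e_inc_q, r):
    """sigma_pq evaluated at a finite distance r: 4*pi*r^2*|E_p^scat|^2 / |E_q^inc|^2."""
    return 4.0 * math.pi * r ** 2 * abs(e_scat_p) ** 2 / abs(e_inc_q) ** 2

def to_db(sigma):
    """Echo area in dB relative to one unit of area (unit set by the unit of r)."""
    return 10.0 * math.log10(sigma)

theta, phi = math.radians(60.0), math.radians(30.0)
theta_hat, phi_hat = spherical_unit_vectors(theta, phi)
far_amplitude = (0.1, 0.2, 0.1j)   # hypothetical F; phase factor exp(-j k r) omitted

def sigma_phi_phi(r, e_inc_phi=1.0):
    """phi-polarized scattered component for a unit phi-polarized incident field."""
    e_scat = tuple(c / r for c in far_amplitude)   # field decays as 1/r
    return echo_area(dot(e_scat, phi_hat), e_inc_phi, r)

print(to_db(sigma_phi_phi(1.0e3)))
```

Because the far-zone field decays as $1/r$, the factor $r^2$ removes the distance dependence, so the value returned by `sigma_phi_phi` does not change with `r`.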
Below we present RCS calculations, using the presented finite element-ABC formulation, of cubes, cavities (inlets), plates, and conesphere composite and metallic targets. These calculations were carried out using the University of Michigan code FEMATS [25].

6.4.1.1 Cube. We chose the cube as a validation example because the sharp edges and corners of this geometry make the problem very difficult to solve using node-based elements. However, by their very definition, edge elements pose no such problem in enforcing boundary conditions on the edges and corners of this structure. Figure 6.8 compares the measured [26] bistatic cross section ($\theta^{inc} = 180^\circ$, $\phi^{inc} = 90^\circ$) of a metallic cube having an edge length of $0.755\lambda$ with the corresponding pattern computed by the 3D finite element-ABC (FE-ABC) code. The results agree quite well with reference data. The second-order vector ABC derived earlier in (6.37) on a spherical mesh truncation boundary was employed and placed only $0.1\lambda$ from the edge of the cube. About 33,000 unknowns were used for the discretization of the computational domain, and the finite element matrix contained a total of 264,000 distinct nonzero entries. This compares very favorably with the approximate storage requirement of 2 million nonzero entries for a matrix obtained from the method of moments (assuming the same sampling rate as the FEM of 14 points/$\lambda$).
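The storage figures quoted for the cube can be put side by side with a few lines of arithmetic. The sketch below uses only the three numbers given in the text; the interpretation of the 2 million moment method entries as a dense square matrix is an assumption made here for illustration.

```python
import math

fem_unknowns = 33_000        # quoted number of FE-ABC unknowns
fem_nonzeros = 264_000       # quoted distinct nonzero entries in the FE matrix
mom_entries = 2_000_000      # quoted approximate moment method storage

# Average number of stored nonzeros per unknown in the sparse FE matrix
nonzeros_per_unknown = fem_nonzeros / fem_unknowns

# How many times more entries the moment method matrix would need
storage_ratio = mom_entries / fem_nonzeros

# Assumption: the moment method matrix is dense and square (N x N),
# so N is roughly the integer square root of the entry count.
mom_unknowns_if_dense = math.isqrt(mom_entries)

print(nonzeros_per_unknown, round(storage_ratio, 2), mom_unknowns_if_dense)
```

Under that assumption the moment method would use far fewer unknowns than the finite element mesh, yet several times more matrix storage, which is the point made in the text.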

Figure 6.8 Bistatic echo area of a perfectly conducting cube having edge length of $0.755\lambda$. Plane wave incident from $\theta = 180^\circ$, $\phi = 90^\circ$. [Plot: echo area versus observation angle $\theta_o$ (deg.); FE-ABC results compared with measured data.]

Figure 6.9 shows the normal incidence backscatter RCS of a perfectly conducting cube as a function of its edge length. The meshes constructed for this experiment were terminated on conformal boundaries, i.e., on another cube placed a small distance (more than $0.15\lambda$) from the scatterer. As seen, the agreement with measured data [26] is remarkably good over a 50-dB dynamic range. The interesting thing about this example is that terminating the mesh conformally did not have any effect on the accuracy of the final solution. Moreover, generalization of the Wilcox expansion to conformal structures (6.7) seems to work well, at least for planar structures.

Consider next the scattering from a variation of the cube solved earlier. The target shown in Fig. 6.10 consists of an air-filled resistive card block ($0.5\lambda \times 0.5\lambda \times 0.25\lambda$) attached to a metallic block ($0.5\lambda \times 0.5\lambda \times 0.25\lambda$). This

Figure 6.9 Backscatter RCS of a perfectly conducting cube at normal incidence as a function of edge length. [Plot: RCS versus edge length in wavelengths.]

Figure 6.10 Geometry of cube ($a = b = 0.5\lambda$) consisting of a metallic section and a dielectric section ($\epsilon_r$ = 2 - jl), where the latter is bounded by a resistive surface having $R = Z_0$. [Drawing labels: resistive sheet, metal, $a/2$, $a/2$.]

example is used to validate the implementation of composite dielectric structures in the FE-ABC technique. In Fig. 6.11, we compare a principal plane backscatter pattern (both polarizations) obtained from the 3D FE-ABC implementation with data computed using a traditional moment method code.² The computed curve is again seen to follow the reference data closely. For the FE-ABC solution, the

²Courtesy of Northrop Corp., B2 Division, Pico Rivera, CA.

Figure 6.11 RCS pattern in the $xz$ plane for the composite cube shown in Fig. 6.10. The lower half of the cube is metallic while the upper half is air-filled with a resistive card draped over it. [Plot: RCS versus observation angle $\theta_o$ (deg.).]

scatterer was enclosed within a cubical outer boundary placed only $0.3\lambda$ away from the scatterer. This resulted in a 30,000 unknown system which converged to the solution in about 400 iterations when the Sommerfeld radiation condition was employed to terminate the mesh and in 1600 iterations when the second-order ABC was used. The increased iteration count for higher order ABCs is a direct result of the shift in the spectrum of the matrix. Higher order ABCs usually cause more eigenvalues of the coefficient matrix to shift toward the negative real axis. For this geometry, the second-order ABC did not provide a significant improvement in accuracy (only about 0.1 dB) over the first-order condition. However, this is not true in all cases as will be demonstrated later.
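The dependence of iteration count on the matrix spectrum can be illustrated with a deliberately simplified analogy. The sketch below is not the solver or the matrix used for the results above: it swaps in the plain conjugate gradient method applied to a real, diagonal, positive definite matrix, and only shows that spreading the eigenvalues over a wider range raises the number of iterations needed for the same residual tolerance. The function names, sizes, and eigenvalue ranges are all hypothetical.

```python
def cg_iterations(eigs, tol=1e-8, max_iter=5000):
    """Run conjugate gradients on diag(eigs) x = b with b = ones; return iterations used."""
    n = len(eigs)
    x = [0.0] * n
    r = [1.0] * n                    # residual b - A x for the zero initial guess
    p = r[:]
    rs = sum(v * v for v in r)
    b_norm = rs ** 0.5
    for it in range(1, max_iter + 1):
        ap = [e * v for e, v in zip(eigs, p)]          # A p for a diagonal A
        alpha = rs / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(v * v for v in r)
        if rs_new ** 0.5 <= tol * b_norm:              # relative residual test
            return it
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return max_iter

def spread(lo, hi, n):
    """n eigenvalues spaced uniformly between lo and hi."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

tight = cg_iterations(spread(1.0, 10.0, 200))      # clustered spectrum
wide = cg_iterations(spread(1.0, 1000.0, 200))     # widely spread spectrum
print(tight, wide)
```

The finite element systems in this chapter are complex symmetric and generally indefinite, so the mechanism described in the text (eigenvalues moving toward the negative real axis) is harsher than a mere increase in spread; the analogy is meant only to show that the spectrum, not the matrix size alone, sets the iteration count.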
The problem size with the conformal mesh termination is much smaller than the 40,000 unknown system which results when the same target is enclosed in a spherical termination boundary. The decrease in the unknown count is even more dramatic as we go to larger scatterers. The same case was run with a higher discretization resulting in a system of 50,000 unknowns; however, there was no significant difference in the far-field values with the earlier case. The geometry for the backscatter pattern shown in Fig. 6.12 is the same as the geometry drawn in Fig. 6.10 with the air-filled section now occupied by a lossy dielectric having $\epsilon_r$ = 2 -\320\224 The backscatter echo-area pattern for the $\phi\phi$ polarization as computed by the FE-ABC computer program is again seen to be in good agreement with corresponding moment method data.³

³Courtesy of Northrop Corp., B2 Division, Pico Rivera, CA.

Figure 6.12 RCS pattern in the $xz$ plane for the composite cube shown in Fig. 6.10. The composition of the cube is the same as in Fig. 6.11, except that the air-filled portion now consists of dielectric. The solid curve is the FE-ABC pattern, and the black dots are moment method (MoM) data for the $\phi\phi$ polarization. [Plot: RCS versus observation angle $\phi_o$ (deg.).]

6.4.1.2 Inlets. As another example, we compute the scattering from perfectly conducting inlets. The aperture of an inlet usually has a large radar cross section around normal incidence. Therefore, a good understanding of its scattering characteristics is critical if measures need to be taken for reducing its echo-area. A different method for simulating electrically large jet-engine inlets can be found in [27]. An accurate computer simulation of such a geometry provides a cost-effective and ready way of allowing the designer to experiment with complex material fillings to meet design specifications. All the examples shown here are for empty inlets due to lack of reference data for more complicated structures. However, it is just as easy for finite elements to model empty inlets as arbitrarily filled ones.

Let us consider the perfectly conducting rectangular inlet with dimensions $1\lambda \times 1\lambda \times 1.5\lambda$. For the plots shown in Figs. 6.13 and 6.14, the target was enclosed within a sphere of radius $1.35\lambda$, which is only about $0.35\lambda$ from the farthest edge of the scatterer. This resulted in a system of 224,476 unknowns and converged in an average of 785 seconds per incidence angle on a 56-processor KSR1. The computed values from the FE-ABC code agree very well with measured data for both HH and VV polarizations. We include the example of the spherical mesh boundary to illustrate how quickly a relatively small structure ($1.5\lambda^3$) can become prohibitively expensive in terms of computing resources and time.

To reduce the computational demands, the natural choice is to use the conformal mesh termination scheme formulated in the previous section and utilized in the last example. Therefore, instead of using a spherical mesh truncation surface, the

Figure 6.13 Backscatter pattern of a metallic rectangular inlet ($1\lambda \times 1\lambda \times 1.5\lambda$) for HH polarization. Black dots indicate computed values, and the solid line represents measured data [28]. Mesh termination surface is spherical. [Plot: RCS versus observation angle $\theta_o$ (deg.).]

Figure 6.14 Backscatter pattern of a metallic rectangular inlet ($1\lambda \times 1\lambda \times 1.5\lambda$) for VV polarization. Black dots indicate computed values, and the solid line represents measured data [28]. Mesh termination surface is spherical. [Plot: RCS versus observation angle $\theta_o$ (deg.).]

mesh can be terminated with a rectangular box placed only $0.35\lambda$ away from the scatterer (see inset of Fig. 6.15). The problem size reduces dramatically to 145,000 unknowns, a 35% reduction over the spherical mesh termination scheme. The convergence time for each excitation vector is about 220 seconds, less than 4 minutes, when run on all 56 processors of a KSR1. The computed values are again compared with measured data for the HH polarization in Fig. 6.15; the agreement is excellent, albeit a bit worse than the spherical case. However, this observation is overshadowed by the fact that the problem size has been reduced by more than a third and the computing time by nearly three-fourths. Thus in many cases, a conformal ABC makes it possible to obtain a solution with the resident storage capacity and within a reasonable time interval. The results for the VV polarization with a rectangular mesh termination are equally accurate.
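The savings quoted for the rectangular inlet follow directly from the unknown counts and timings given above. The short sketch below simply redoes that arithmetic; the helper name `percent_reduction` is introduced here for illustration.

```python
def percent_reduction(before, after):
    """Relative reduction from 'before' to 'after', in percent."""
    return 100.0 * (before - after) / before

unknowns_sphere = 224_476   # spherical mesh termination (quoted)
unknowns_box = 145_000      # rectangular box termination (quoted)
time_sphere_s = 785.0       # average seconds per incidence angle (quoted)
time_box_s = 220.0          # seconds per excitation vector (quoted)

size_reduction = percent_reduction(unknowns_sphere, unknowns_box)
time_reduction = percent_reduction(time_sphere_s, time_box_s)
speedup = time_sphere_s / time_box_s

print(round(size_reduction, 1), round(time_reduction, 1), round(speedup, 2))
```

The run time falls faster than the unknown count, which is consistent with an iterative solver whose cost per iteration and iteration count both grow with problem size.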

Figure 6.15 Backscatter pattern of a metallic rectangular inlet ($1\lambda \times 1\lambda \times 1.5\lambda$) for HH polarization. Black dots indicate computed values, and the solid line represents measured data [28]. Mesh termination surface is piecewise planar. [Plot: RCS versus observation angle $\phi_o$ (deg.).]

Next, consider the scattering from a perfectly conducting cylindrical inlet. Even though integral equation codes are more efficient for such bodies of revolution, the goal with this test is to examine the performance of the conformal absorbing boundary conditions. Moreover, the real strength of finite elements lies in its ease of handling material inhomogeneities encountered in practical structures. The target is a perfectly conducting cylindrical inlet having a diameter of $1.25\lambda$ and a height of $1.875\lambda$. A rectangular outer boundary was first used and placed $0.45\lambda$ from the farthest edge of the target to enclose the scatterer. For this case, the radar cross section for a $\phi$-polarized incident wave in the $yz$ plane is shown in Fig. 6.16 and is compared with measured data. The agreement is quite good for all lobes except the

Figure 6.16 Backscatter pattern of a perfectly conducting cylindrical inlet (diameter = $1.25\lambda$, height = $1.875\lambda$) for HH polarization. The solid line indicates measured data [30], and the black dots indicate computed data. Mesh termination surface is a rectangular box. [Plot: RCS versus observation angle $\theta_o$ (deg.).]

third. However, the backscatter echo-area computed for the same geometry by Shankar [29] using the finite difference-time domain method agrees with the computed results via the FE-ABC for all incidence angles. In [29], the absorbing boundary was placed a few wavelengths from the scattering structure.

Next, we employ a conformal termination scheme with a cylindrical surface for mesh truncation. This example further demonstrates that a truly conformal mesh boundary is possible with the ABCs derived earlier in the chapter. The cylindrical outer boundary was placed about $0.45\lambda$ from the target, and the computed RCS for a $\phi$-polarized incident wave is given in Fig. 6.17. For purposes of validation, the results are compared with measured data [30] and with a body of revolution code [31] (Fig. 6.17). As can be observed from Figs. 6.16 and 6.17, the far-field results for a cylindrical and a rectangular termination do not differ significantly. However, the savings in computational cost is quite impressive. The cylindrical mesh termination has only 144,392 unknowns compared to the 191,788 unknowns for a rectangular truncation scheme. A spherical mesh termination would have swelled to about 265,000 unknowns, given identical sampling density and outer boundary distance. Thus, the problem size was reduced by about 45% and the computation time by a similar, if not greater, amount. The savings in computational resources is quite significant even when we compare the rectangular and cylindrical termination schemes: a 25% reduction in problem size and a similar decrease in computation time. This phenomenon is only to be expected from the geometrical point of view and, of course, improves with the problem size.
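The geometrical argument can be made concrete by comparing the volumes enclosed by the three termination surfaces. The sketch below is a rough estimate under assumptions that are not stated in the text: the unknown count is taken to scale with the enclosed volume, the same standoff of $0.45\lambda$ is applied on every side, the box is axis-aligned and tight, and the sphere is centered on the inlet with its radius set by the farthest rim point plus the standoff.

```python
import math

def box_volume(diameter, height, standoff):
    """Tight axis-aligned box around a cylinder, padded by 'standoff' on every side."""
    side = diameter + 2.0 * standoff
    return side * side * (height + 2.0 * standoff)

def cylinder_volume(diameter, height, standoff):
    """Coaxial cylinder padded by 'standoff' radially and at both ends."""
    radius = diameter / 2.0 + standoff
    return math.pi * radius ** 2 * (height + 2.0 * standoff)

def sphere_volume(diameter, height, standoff):
    """Centered sphere reaching 'standoff' beyond the farthest rim point."""
    radius = math.hypot(diameter / 2.0, height / 2.0) + standoff
    return 4.0 / 3.0 * math.pi * radius ** 3

d, h, s = 1.25, 1.875, 0.45          # inlet diameter, height, standoff (wavelengths)
v_box = box_volume(d, h, s)
v_cyl = cylinder_volume(d, h, s)
v_sph = sphere_volume(d, h, s)

cyl_vs_box = 100.0 * (1.0 - v_cyl / v_box)   # percent volume saved versus the box
cyl_vs_sph = 100.0 * (1.0 - v_cyl / v_sph)   # percent volume saved versus the sphere
print(round(cyl_vs_box, 1), round(cyl_vs_sph, 1))
```

Under these assumptions the cylinder encloses a fraction $\pi/4$ of the box volume, a saving of roughly a fifth, and well over a third less than the sphere. These estimates are of the same order as the quoted 25% and 45% reductions in unknowns; exact agreement is not expected, since the actual meshes and standoff conventions are not specified here.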

Figure 6.17 Backscatter pattern of a perfectly conducting cylindrical inlet (diameter = $1.25\lambda$, height = $1.875\lambda$) for HH polarization. Black dots indicate computed values, the solid line represents measured data [30], and the dotted line is the body of revolution data [31]. Mesh termination surface is a circular cylinder. [Plot: RCS versus observation angle $\theta_o$ (deg.).]

6.4.1.3 Plate. The motivation for testing the FE-ABC method for perfectly conducting plates is twofold. It is usually very difficult to model the scattering from the edges of the plate even using integral equation methods. Therefore, in this section we present examples to see how the method performs at edge-on incidence. Second, we examine the performance of termination boundaries of esoteric shapes. The first choice is to enclose the plate in a rectangular box. The second choice is to use a box with half cylinders attached to the faces normal to the plane of the plate, the reasoning being that because the edge of the plate behaves like a line source and scatters cylindrical waves, a cylindrical mesh termination is most suitable for wave absorption. Both mesh termination schemes require approximately the same number of unknowns; the superiority of one over the other is thus decided only on the basis of accuracy of the computed backscatter values.

The test case is a $3.5\lambda \times 2\lambda$ perfectly conducting rectangular plate. In Fig. 6.18, we plot the backscatter pattern for the $\phi\phi$ polarization in the $xz$ plane, i.e., over the long side of the plate. Generally, the agreement with reference data is quite good. However, the backscatter echo-area at edge-on incidence is not calculated accurately. Thus, we need to check whether other mesh termination shapes will perform better.

In Fig. 6.19, we show the RCS of the conducting plate in the $yz$ plane, i.e., over its short side, for the $\phi\phi$ polarization. The backscatter echo-area for edge-on incidence is picked up very well for a rectangular-cylindrical termination, whereas a rectangular truncation scheme gives completely incorrect results. These two schemes

Figure 6.18 Backscatter pattern ($\sigma_{\phi\phi}$) of a $3.5\lambda \times 2\lambda$ perfectly conducting plate in the $xz$ plane. The white dots indicate box termination; the black dots represent a combined box-cylinder termination. [Plot: RCS versus observation angle $\theta_o$ (deg.).]

Figure 6.19 Backscatter pattern ($\sigma_{\phi\phi}$) of a $3.5\lambda \times 2\lambda$ perfectly conducting plate in the $yz$ plane. The white dots indicate box termination; the black dots represent a combined box-cylinder termination. [Plot: RCS versus observation angle $\phi_o$ (deg.).]

have approximately the same storage requirement; in fact, the box-cylinder combination yields a slightly smaller system of equations. This example truly illustrates the power of a conformal truncation scheme composed of simple shapes; not only are the results far more accurate but even the storage requirement is slightly less.

In the above simulations, the boundary was terminated at $0.35\lambda$ from the flat face of the plate and $0.5\lambda$ from the edges of the plate. To test the accuracy of the ABC method as a function of mesh termination distance, we consider the backscatter patterns from the edges of the plate as the mesh termination distance is increased. Figure 6.20 shows that the backscatter values from the plate edges slowly take the shape of the reference data as the mesh truncation distance is increased. However, even though the results are seen to approach reference data as the mesh boundary is pushed farther away from the plate edge, the experiment also shows us the limitations of this technique. The FE-ABC method is a true 3D technique; therefore, although it is possible to use it in solving 2D problems, the associated computational cost makes it unjustifiably expensive. A surface formulation using integral equations or a hybrid finite element-boundary integral formulation (see Chapter 7) is more efficient for such applications.

Figure 6.20 Backscatter pattern ($\sigma_{\phi\phi}$) of a $3.5\lambda \times 2\lambda$ perfectly conducting plate in the $xz$ plane. The numbers in the legend indicate mesh termination distance from the plate edges. [Plot: RCS versus observation angle $\theta_o$ (deg.); legend: Reference, Rect ($0.45\lambda$), Mixed ($0.45\lambda$), Mixed ($0.65\lambda$).]

6.4.1.4 Conesphere. The last scattering example geometry presented in this chapter is unique in its own way. A conesphere is a hemisphere attached to a cone, and this shape is typical for airborne structures. It is a difficult geometry to mesh since a surface singularity exists at the tip of the cone. The singularity can be removed in two ways: (i) by creating a small region near the tip and detaching it from the surface or (ii) by chopping off a small part near the tip of the cone. The second option inevitably leads to small inaccuracies for backscatter from the conical tip; however, this option was chosen since the conical angle in our tested geometry is extremely small (around $7^\circ$) and the mesh generator fails to mesh the first case on numerous occasions. In Fig. 6.21, we plot the backscatter patterns of a $4.5\lambda$ long conesphere having a radius of $0.5\lambda$ for $\theta\theta$ and $\phi\phi$ polarizations. The mesh truncation surface is a rectangular box placed $0.4\lambda$ from the surface of the conesphere. The far-field results compare extremely well with computations from a body of revolution code [32].
In this conesphere example, the choice of a piecewise planar rectangular boundary might be questioned. A truly conformal boundary would have been a larger conesphere placed an appropriate distance from the target conesphere. However, there are two serious limitations. As mentioned in the derivation of the ABCs earlier in this chapter, ABCs need a piecewise smooth surface where the scattered or radiated field can be expressed in terms of an infinite series in $1/r$. ABCs have also been found to fail for concave and re-entrant structures. Thus, using a truly conformal ABC surface in the form of a conesphere is not a good idea. The second hurdle in using arbitrary conformal surfaces as the mesh termination boundary is the difficulty in implementation. Surfaces of arbitrary curvature will usually lead to loss of symmetry in the finite element matrix, thus resulting in a more complicated solution process for a small reduction in the size of the problem. In order to address these problems, mesh termination strategies are being investigated which use artificial absorbers instead of ABCs. Applications of artificial absorbers will be shown in the next section.

6.4.2 Antenna and Circuit Examples

This section demonstrates the application of ABCs and artificial absorbers for truncating finite element meshes in computing the radiation patterns of antennas and reflection coefficients of microwave circuits. The implementation of the ABCs is very similar to the scattering formulation except that there is the additional aspect of source modeling. This is outlined in detail in the next chapter.

6.4.2.1 Conformal Patch Antennas. Figure 6.22 illustrates the general configuration of printed antennas on conformal platforms. Various patch configurations situated on cylindrical platforms were considered for the purpose of examining the performance of ABCs for this application. Among those studied, we present the analysis of a 2 cm × 3 cm patch antenna printed atop a metallic cavity which is filled with a 5 cm × 6 cm × 0.07874 cm substrate having a dielectric constant of $\epsilon_r$ = 2.1\" This cavity is recessed in a metallic cylinder (whose infinite dimension is along the $z$ axis) with a radius of 15.28 cm, and the scattering and radiation calculations for this patch were carried out at 3 GHz. The second-order vector ABC is placed \321\202/\321\206, from the cavity aperture, while the lateral walls of the ABC were placed $0.5\lambda_0$ from the cavity aperture. The H-plane antenna pattern for an axially polarized patch is shown in Fig. 6.23 where the probe feed is placed at $\phi_f = 0$ and $z_f = -0.375$ cm (with $\phi_f = 0^\circ$ and $z_f = 0$ corresponding to the center of the patch). As in the case of

Figure 6.21 Backscatter pattern of a perfectly conducting conesphere for $\phi\phi$ and $\theta\theta$ polarizations. Black dots indicate computed values using the FE-ABC code (referred to as FEMATS), and the solid line represents data from a body of revolution code [32]. Mesh termination surface is a rectangular box. [Two plots: RCS versus observation angle $\theta_o$ (deg.).]

Figure 6.22 Cavity-backed patch antenna with ABC mesh termination. [Drawing labels: patch, metal.]

Figure 6.23 Convergence of the FE-ABC method for computing the H-plane radiation pattern of a cavity-backed axially polarized patch. The reference data is provided by a rigorous FE-BI formulation for the same cavity-backed antenna. [Plot: pattern versus angle $\phi$ (deg.).]

scattering, the radiation pattern calculated via the FE-ABC method is seen to be in excellent agreement with the pattern computed by the more rigorous finite element-boundary integral (FE-BI) [33] approach even when the mesh is terminated only $0.3\lambda$ from the aperture.
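To get a feel for the electrical size of this configuration, the quoted dimensions can be converted to free-space wavelengths at 3 GHz. The sketch below uses only dimensions stated in the text; the substrate permittivity is left out because its value is not legible here, and the dictionary keys are names chosen for illustration.

```python
C0 = 299_792_458.0                      # speed of light in vacuum, m/s

frequency_hz = 3.0e9
lambda0_cm = 100.0 * C0 / frequency_hz  # free-space wavelength in cm

dimensions_cm = {
    "patch_width": 2.0,
    "patch_length": 3.0,
    "cavity_width": 5.0,
    "cavity_length": 6.0,
    "substrate_thickness": 0.07874,
    "cylinder_radius": 15.28,
}

# Express every dimension in free-space wavelengths
in_wavelengths = {name: size / lambda0_cm for name, size in dimensions_cm.items()}

# Lateral ABC walls are quoted at 0.5 free-space wavelengths from the aperture
lateral_standoff_cm = 0.5 * lambda0_cm

for name, value in in_wavelengths.items():
    print(f"{name}: {value:.4f} wavelengths")
print(f"lateral standoff: {lateral_standoff_cm:.2f} cm")
```

At this frequency the free-space wavelength is very nearly 10 cm, so the cylinder radius is about one and a half wavelengths and the substrate is less than a hundredth of a wavelength thick.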

6.4.2.2 Patch Antenna on Ogive. In this example, an artificial absorber is used for terminating the radiation boundary. Artificial absorbers are especially useful for such situations when a truly conformal ABC may be much more difficult to implement. Figure 6.24(a) shows the setup, where a 2 cm × 3 cm rectangular patch

Figure 6.24 Cavity-backed rectangular patch on ogive: (a) setup, (b) antenna dimensions, (c) comparison with the measurement. [After Ozdemir and Volakis, © IEEE, 1997 [34].] [Panels: (a) side view and top view of the ogive; (b) patch, cavity, and feed location; (c) polarized radiated power versus observation angle $\phi$ (deg.), computation and measurement.]

is placed on the aperture of a 5 cm × 6 cm × 0.08 cm rectangular cavity recessed in the ogive's surface. Also, Fig. 6.25 shows the artificial absorber used for terminating the mesh around the patch. The ogive is electrically much larger than the cavity (ten times the maximum cavity dimension). The $\phi$-polarized radiation does

Figure 6.25 Illustration of the mesh termination approach using artificial absorbers for antenna analysis on doubly curved surfaces. [Courtesy of T. Ozdemir.]

not interact with the edges of the ogive and can therefore be computed by localizing the mesh near the cavity on the ogive. Thus, the radiated power pattern accounts only for the antenna (and the curvature of the platform), but does not include interactions with the ogive's tips. Figure 6.24(c) shows the computed $\phi$-polarized radiation as compared to the measurement [35]. The agreement with measured data is very good for this polarization. However, predicting the $\theta$-polarized radiation (not shown) requires modeling the entire ogive as this particular polarization has a vertical surface field component which is known to cause diffraction from the ogive's tips. A way to account for such secondary diffractions is to interface the finite element-artificial absorber (FE-AA) method with a high-frequency technique, and an encouraging study in this direction has been carried out in [36].

6.4.2.3 Cylindrical Via. For all practical problems in circuit design, a full wave analysis of the circuit components can be carried out only for small parts of the circuit. Therefore, in analyzing microwave circuit problems, ABCs may be required to predict radiation loss from circuit elements, for analyzing circuit discontinuities or for modeling small critical paths in a large, complicated integrated circuit design. Figure 6.26 shows a 0.77-mm radius cylindrical via discontinuity connecting

Figure 6.26 (a) Side view of cylindrical via connecting two microstrip lines, (b) top view. [After Wang et al., © IEEE, 1994 [37].] [Drawing dimensions: 5.9 mm, 0.33 mm, 3.3 mm.]

two striplines, each 3.3 mm by 0.33 mm. The metallization in the dielectric serves as a finite thickness (0.33 mm) ground plane. In [37], the authors analyze the structure with open and closed side walls of the top microstrip. As seen in Fig. 6.27, the power loss due to the open walls at frequencies below 10 GHz is negligible. However, for frequencies above 10 GHz there is a significant amount of radiation loss from the via.
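Radiation loss of this kind is commonly read off the scattering parameters through a power balance. The sketch below assumes a two-port with lossless conductors and dielectrics, so that any power neither reflected nor transmitted is attributed to radiation; the sample levels of -10 dB and -1 dB are hypothetical and are not taken from Fig. 6.27.

```python
def db_to_magnitude(level_db):
    """Convert an S-parameter level in dB to a linear magnitude."""
    return 10.0 ** (level_db / 20.0)

def radiated_fraction(s11, s21):
    """Fraction of incident power not accounted for by reflection or transmission.

    Assumes lossless materials, so the missing power is radiated.
    """
    return 1.0 - abs(s11) ** 2 - abs(s21) ** 2

# Hypothetical sample point: -10 dB return level and -1 dB transmission level
s11 = db_to_magnitude(-10.0)
s21 = db_to_magnitude(-1.0)
loss = radiated_fraction(s11, s21)
print(round(loss, 4))
```

For a structure with closed side walls and no material loss the same expression should return a value near zero, which gives a simple consistency check on computed S-parameters.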

Figure 6.27 Comparison of scattering parameters of cylindrical via for open and closed
walls of top microstrip. Horizontal axis: Frequency (GHz), 5 to 20. [After Wang et al., ©
IEEE, 1994 [37].]

APPENDIX: DERIVATION OF SOME VECTOR IDENTITIES

The curl of a vector in the Dupin coordinate system is given by

\[
\nabla \times \mathbf{E} = \nabla_t \times \mathbf{E} + \hat{n} \times \frac{\partial \mathbf{E}}{\partial n} \tag{6.76}
\]

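where $\nabla_t \times \mathbf{E}$ is called the surface curl involving only the tangential derivatives and
is defined as

\[
\nabla_t \times \mathbf{E} = -\hat{n} \times \nabla_t E_n + \hat{t}_2 \kappa_1 E_{t_1} - \hat{t}_1 \kappa_2 E_{t_2} + \hat{n}\, \nabla \cdot (\mathbf{E} \times \hat{n}) \tag{6.77}
\]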

As before, $\kappa_1$ and $\kappa_2$ denote the principal curvatures of the surface under
consideration, $E_{t_1}$, $E_{t_2}$ are the tangential components, and $E_n$ is the normal
component of the vector $\mathbf{E}$ on the surface.
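We are interested in the evaluation of the three vector identities given earlier in
the chapter. Let us consider simplifying the tangential components of the curl of a
vector, $\mathbf{E}$ in this case. Using the definition of the curl given above, we have

\[
\begin{aligned}
\hat{n} \times \nabla \times \mathbf{E} &= -(\hat{n} \times \hat{n} \times \nabla) E_n - \kappa_1 E_{t_1} \hat{t}_1 - \kappa_2 E_{t_2} \hat{t}_2 + \hat{n} \times \hat{n} \times \frac{\partial \mathbf{E}}{\partial n} \\
&= \nabla_t E_n + \hat{n} \times \nabla \times \mathbf{E}_t
\end{aligned} \tag{6.78}
\]

where $-(\hat{n} \times \hat{n} \times \nabla) = \nabla_t$. The first vector identity is, therefore, easily proved.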

Next, we will prove the second of the three identities. We start with the term
$\hat{n} \times \nabla \times \nabla_t E_n$ and simplify it using the definition of the curl of a vector given above.

\[
\hat{n} \times \nabla \times \nabla_t E_n = \hat{n} \times \nabla \times \left[ \nabla E_n - \hat{n} \frac{\partial E_n}{\partial n} \right] = -\hat{n} \times \nabla \times \left( \hat{n} \frac{\partial E_n}{\partial n} \right)
\]

Since $\nabla \cdot \mathbf{E} = \nabla \cdot \mathbf{E}_t + (\nabla \cdot \hat{n}) E_n + \partial E_n / \partial n$, we can simplify the above relation even
further by substituting the appropriate expression for the normal derivative of the
normal component of the electric field and using the fact that the electric field is
divergence-free in a source-free region.

\[
\begin{aligned}
\hat{n} \times \nabla \times \nabla_t E_n &= \nabla_t (\nabla \cdot \mathbf{E}_t) + (\nabla \cdot \hat{n}) \nabla_t E_n \\
&= \nabla_t (\nabla \cdot \mathbf{E}_t) + 2 \kappa_m \nabla_t E_n
\end{aligned} \tag{6.79}
\]

where $\kappa_m = (\kappa_1 + \kappa_2)/2$ is the mean curvature.
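The intermediate step of the second identity can be written out as a short worked derivation. This is only a sketch under two assumptions that the text does not state explicitly: that $\nabla \times \hat{n} = 0$ (the unit normals of a family of parallel surfaces), and that the tangential variation of $\nabla \cdot \hat{n}$ is neglected in the last line.

```latex
% Curl of the normal part, using \nabla \times (\hat{n} f) = \nabla f \times \hat{n}
% with f = \partial E_n / \partial n and \nabla \times \hat{n} = 0 (assumption):
\begin{aligned}
-\hat{n} \times \nabla \times \left( \hat{n} f \right)
  &= -\hat{n} \times \left( \nabla f \times \hat{n} \right)
   = -\left[ \nabla f - \hat{n} (\hat{n} \cdot \nabla f) \right]
   = -\nabla_t f
\end{aligned}

% Divergence-free field: f = \partial E_n / \partial n
%   = -\nabla \cdot \mathbf{E}_t - (\nabla \cdot \hat{n}) E_n, so
\begin{aligned}
-\nabla_t f
  &= \nabla_t (\nabla \cdot \mathbf{E}_t) + \nabla_t \left[ (\nabla \cdot \hat{n}) E_n \right] \\
  &\approx \nabla_t (\nabla \cdot \mathbf{E}_t) + (\nabla \cdot \hat{n}) \nabla_t E_n
   = \nabla_t (\nabla \cdot \mathbf{E}_t) + 2 \kappa_m \nabla_t E_n
\end{aligned}
```

For a planar surface both curvatures vanish and the result reduces to $\nabla_t (\nabla \cdot \mathbf{E}_t)$.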
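The proof of the third identity is more complicated because it involves two curl
operations on the electric field. We first need to switch the positions of the outermost
$\hat{n} \times$ and the $\nabla \times$ operators to arrive at a simplified form of the rather complex
expression. Therefore,

\[
\begin{aligned}
\hat{n} \times \nabla \times (\hat{n} \times \nabla \times \mathbf{E}) &= \nabla \times \{ \hat{n} \times \hat{n} \times \nabla \times \mathbf{E} \} - \hat{n} \nabla_t \cdot (\hat{n} \times \nabla \times \mathbf{E}) \\
&\quad - \Delta\kappa \{ (\nabla \times \mathbf{E})_{t_2} \hat{t}_1 + (\nabla \times \mathbf{E})_{t_1} \hat{t}_2 \} \\
&= -\nabla \times \{ \nabla \times \mathbf{E} - \hat{n} (\nabla \times \mathbf{E})_n \} - \hat{n} \nabla_t \cdot (\hat{n} \times \nabla \times \mathbf{E}) \\
&\quad - \Delta\kappa \{ (\nabla \times \mathbf{E})_{t_2} \hat{t}_1 + (\nabla \times \mathbf{E})_{t_1} \hat{t}_2 \}
\end{aligned} \tag{6.80}
\]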

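Now we use the fact that the electric field satisfies the wave equation to reduce the
expression even further.

\[
\begin{aligned}
\hat{n} \times \nabla \times (\hat{n} \times \nabla \times \mathbf{E}) &= -k_0^2 \mathbf{E} + \nabla \times \{ \hat{n} (\nabla \times \mathbf{E})_n \} - \hat{n} \nabla_t \cdot (\hat{n} \times \nabla \times \mathbf{E}) \\
&\quad - \Delta\kappa \{ (\nabla \times \mathbf{E})_{t_2} \hat{t}_1 + (\nabla \times \mathbf{E})_{t_1} \hat{t}_2 \} \\
&= \nabla \times \{ \hat{n} (\nabla \times \mathbf{E})_n \} - k_0^2 \mathbf{E}_t - \Delta\kappa \{ (\nabla \times \mathbf{E})_{t_2} \hat{t}_1 + (\nabla \times \mathbf{E})_{t_1} \hat{t}_2 \}
\end{aligned}
\]

where $\Delta\kappa = \kappa_1 - \kappa_2$.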

Thus, we have shown that all three identities hold as long as the vector, $\mathbf{E}$ in
this case, is divergenceless and satisfies the vector wave equation.

REFERENCES

[1] A. Chatterjee, J. M. Jin, and J. L. Volakis. Edge-based finite elements and
vector ABCs applied to 3D scattering. IEEE Trans. Antennas Propagat.,
41(2):221-226, February 1993.

[2] X. Yuan. Three-dimensional electromagnetic scattering from inhomogeneous
objects by the hybrid moment and finite element method. IEEE Trans. Antennas
Propagat., 38:1053-1058, 1990.

[3] J. M. Jin and J. L. Volakis. Electromagnetic scattering by and transmission
through a three-dimensional slot in a thick conducting plane. IEEE Trans.
Antennas Propagat., 39(4):543-550, April 1991.

[4] J. Angelini, C. Soize, and P. Soudais. Hybrid numerical method for harmonic
3D Maxwell equations: Scattering by a mixed conducting and inhomogeneous
anisotropic dielectric medium. IEEE Trans. Antennas Propagat., 41(1):66-76,
January 1993.

[5] J. D'Angelo and I. D. Mayergoyz. Three-dimensional RF scattering by the
finite element method. IEEE Trans. Magnetics, 27(5):3827-3832, September
1991.

[6] I. D. Mayergoyz and J. D'Angelo. New finite element formulation for 3D
scattering problem. IEEE Trans. Magnetics, 27(5):3967-3970, September 1991.

[7] L. C. Kempel and J. L. Volakis. Evaluation of new vector ABCs for conformal
printed antennas. 1994 URSI Radio Science Meeting Digest, Seattle, WA.

[8] T. Ozdemir and J. L. Volakis. A comparative study of an absorber boundary
condition and an artificial absorber for truncating finite element meshes. Radio
Science, 29:1255-1263, September-October 1994.

[9] W. C. Chew and H. W. Weedon. A 3D perfectly matched medium from
modified Maxwell's equations with stretched coordinates. Microwave Opt. Tech.
Lett., 7(13), September 1994.

[10] C. H. Wilcox. An expansion theorem for electromagnetic fields. Comm. Pure
Appl. Math., 9:115-134, May 1956.

[11] C. T. Tai. Generalized Vector and Dyadic Analysis. IEEE Press, New York,
1992.

[12] D. S. Jones. An improved surface radiation condition. IMA J. Appl. Math.,
48:163-193, 1992.

[13] J. van Bladel. Electromagnetic Fields. Hemisphere Publishing Corp., New York,
1985.

[14] S. M. Rytov. Computation of the skin effect by the perturbation method.
J. Exp. Theor. Phys., 10:180-189, 1940. Translation by V. Kerdemelidis and
K. M. Mitzner, Northrop Navair, Hawthorne, CA 90250.

[15] A. F. Peterson. Absorbing boundary conditions for the vector wave equation.
Microwave and Opt. Techn. Letters, 1:62-64, April 1988.

[16] J. P. Webb and V. N. Kanellopoulos. Absorbing boundary conditions for finite
element solution of the vector wave equation. Microwave and Opt. Techn.
Letters, 2(10):370-372, October 1989.

[17] J. P. Berenger. A perfectly matched layer for the absorption of electromagnetic
waves. J. Comp. Phys., 114(2):185-200, October 1994.

[18] D. S. Katz, E. T. Thiele, and A. Taflove. Validation and extension to three
dimensions of the Berenger PML absorbing boundary conditions for FD-TD
meshes. IEEE Microwave and Guided Wave Letters, 4(8):268-270, August 1994.

[19] R. Mittra and U. Pekel. A new look at the Perfectly Matched Layer (PML)
concept for the reflectionless absorption of electromagnetic waves. IEEE
Microwave and Guided Wave Lett., 5(3):84-86, March 1995.

[20] Z. J. Sacks, D. M. Kingsland, R. Lee, and J.-F. Lee. A perfectly matched
anisotropic absorber for use as an absorbing boundary condition. IEEE
Trans. Antennas Propagat., 43:1460-1463, 1995.

[21] D. M. Kingsland, J. Gong, J. L. Volakis, and J.-F. Lee. Performance of an
anisotropic artificial absorber for truncating finite element meshes. IEEE Trans.
Antennas Propagat., 44:975-982, July 1996.

[22] S. Legault, T. B. A. Senior, and J. L. Volakis. Design of planar absorbing layers
for domain truncation in FEM application. Electromagnetics, 16(4):451-454,
July-August 1996.

[23] J. Gong and J. L. Volakis. Optimal selection of a uniaxial artificial absorber
layer for truncating finite element meshes. IEE Electronics Lett., 31(18):1559-1561,
August 1995.

[24] W. C. Chew and J. M. Jin. Perfectly matched layers in the discretized space: An
analysis and optimization. Electromagnetics, 16:325-340, 1996.

[25] A. Chatterjee, M. Nurnberger, J. Volakis, and M. Casciato. FEMATS: A
general purpose scattering code using the finite element method. In IEEE National
Radar Conference Proceedings, pp. 339-344, Ann Arbor, MI, May 1996.

[26] A. D. Yaghjian and R. V. McGahan. Broadside radar cross-section of a
perfectly conducting cube. IEEE Trans. Antennas Propagat., 33(3):321-329, March
1985.

[27] D. C. Ross, J. L. Volakis, and H. T. Anastassiu. A hybrid finite element-modal
analysis of jet engine scattering. IEEE Trans. Antennas Propagat., 45:277-285,
March 1995.

[28] A. Woo, M. Schuh, M. Simon, H. T. G. Wang, and M. L. Sanders. Radar
cross-section measurement data of a simple rectangular cavity. Technical
Report NWC TM 7132, Naval Weapons Center, China Lake, CA, December
1991.

[29] V. Shankar, W. F. Hall, A. Mohammedian, and C. Rowell. Development of a
finite-volume, time-domain solver for Maxwell's equations. Technical report,
Rockwell International, May 1993. Prepared for NASA/NDC under contract
N62269-90-C-0257.

[30] M. Schuh, A. Woo, M. Sanders, and H. T. G. Wang. Radar cross-section
measurement data of four small cavities. Technical Report 108782, NASA
Ames, CA, November 1993.

[31] A. Glisson and D. R. Wilton. Simple and efficient numerical techniques for
treating bodies of revolution. Technical Report 105, University of Mississippi,
Oxford, 1982.

[32] J. M. Putnam and L. N. Medgyesi-Mitschang. Combined field integral equation
formulation for axially inhomogeneous bodies of revolution. Technical Report
MDC QA003, McDonnell-Douglas Research Labs, December 1987.

[33] L. C. Kempel and J. L. Volakis. Scattering by cavity-backed antennas on a
circular cylinder. IEEE Trans. Antennas Propagat., 42:1268-1279, September
1994.

[34] T. Ozdemir and J. L. Volakis. Triangular prisms for edge-based vector finite
element antenna analysis. IEEE Trans. Antennas Propagat., pp. 788-797, May
1997.

[35] R. J. Sliva and H. T. G. Wang. Personal communication.

[36] T. Ozdemir, M. W. Nurnberger, J. L. Volakis, R. Kipp, and J. Berrie. A
hybridization of finite element and high frequency methods for pattern
prediction of antennas on aircraft structures. IEEE Antennas Propagat. Mag.,
38(3):28-38, June 1996.

[37] J.-S. Wang and R. Mittra. Finite element analysis of MMIC structures and
electronic packages using absorbing boundary conditions. IEEE Trans.
Microwave Theory Tech., 42(3):441-449, March 1994.

[38] V. N. Kanellopoulos and J. P. Webb. The importance of the surface divergence
term in the finite element-vector absorbing boundary condition method. IEEE
Trans. Microwave Theory Tech., 43(9):2168-2170, September 1995.
Three-Dimensional FE-BI Method

7.1 INTRODUCTION

In this chapter, the finite element-boundary integral (FE-BI) method for full
three-dimensional geometries is presented. This is one of the most powerful computational
electromagnetics (CEM) techniques in use today and represents a hybridization of
the traditional method of moments with the finite element method. Interest in FE-BI
stems from the fact that volume integral equations have difficulty modeling
combined metal and dielectric structures and they lead to more computationally intensive
programs as compared to the finite element method. In the FE-BI method, the
boundary integral (or integral equation) is used to satisfy the following requirements:
1. Bound or terminate the computational domain in which the finite element
method is used.
2. Relate the electric and magnetic fields on the boundary.
The manner in which the FE-BI method satisfies these requirements will be
presented.

The most general formulation will be given, followed by several particular
examples including: scattering and radiation by cavity-backed antennas recessed in
either an infinite metallic plane or a metallic cylinder, scattering bodies or antennas
placed within an axisymmetric boundary, and infinitely periodic structures such as
array antennas. It is in these special cases that the FE-BI method has proven most
valuable since each case individually combines the flexibility of the finite element
method with the efficiency of a special boundary integral mesh closure. For both the
general and special cases, computational cost, flexibility, and
accuracy will be discussed. For the case of cavity-backed antennas recessed in a
ground plane, an extremely efficient solution technique that utilizes Fast Fourier
Transforms (e.g., the CG-FFT method) will be presented in detail.


For the most part, the method of weighted residuals and Galerkin's technique
(see Chapter 1) is used rather than the variational formulation (used in Chapters 5
and 6) unless otherwise noted. In addition, we assume isotropic media throughout
the computational domain. However, the formulation presented herein can be
extended to the more general anisotropic case without difficulty, and this is done
in the appendix for brick elements. To simplify the implementation of the method,
the material within the computational domain is assumed to be homogeneous within
each finite element, but can vary on an element-by-element basis. Also, most
formulations presented in this chapter are in terms of the total electric field since this is
most convenient in the case of antenna analysis. Magnetic field formulations can be
obtained using duality; however, such formulations are not suitable when an infinite
metallic surface is included in the geometry, as will be explained later in this chapter.
We begin with the general three-dimensional formulation.

7.2 GENERAL FORMULATION

The most general situation to which the FE-BI method may be applied is a
three-dimensional volume of arbitrary shape and composition. Such a region is shown in
Fig. 7.1, and its boundary can be either physical or fictitious (a mathematical entity
only). The finite element method permits a completely arbitrary material
composition within the computational domain. For example, metallic and various dielectric
structures can exist within the volume with no fundamental difference in the
formulation. For metallic bodies, the appropriate field components are forced to zero

Figure 7.1 Computational volume in which the finite element method is to be applied.

(for the total electric field formulation considered herein). Various dielectric and
magnetic materials are specified by appropriate permittivities and permeabilities
on an element-by-element basis. This is in marked contrast to the surface integral
equation (method of moments) since in that case the material must be homogeneous
within each enclosed domain.
We begin with a derivation of the FE-BI equations using the physical
equivalence principle.

7.2.1 Derivation of the FE-BI Equations

The derivation of the FE-BI equations begins with the vector wave equation. This
second-order partial differential equation is solved by first taking the inner product of
the vector wave equation and a vector sub-domain basis function, $\mathbf{W}_i$, thus forming the
weighted residual (see Chapters 1 and 4). Our goal is to minimize this residual or
equivalently to minimize the difference between the solution of the FE-BI discrete
approximation and physical reality. This procedure generates $N_e$ equations, where
$N_e$ is the number of sub-domain basis functions associated with the electric field within
and on the boundary of the volume. The resulting integro-differential equation is

\[
\int_V \nabla \times \left[ \frac{\nabla \times \mathbf{E}^{int}}{\mu_r} \right] \cdot \mathbf{W}_i \, dV - k_0^2 \int_V \epsilon_r \mathbf{E}^{int} \cdot \mathbf{W}_i \, dV = -\int_V \nabla \times \left[ \frac{\mathbf{M}^i}{\mu_r} \right] \cdot \mathbf{W}_i \, dV - j k_0 Z_0 \int_V \mathbf{J}^i \cdot \mathbf{W}_i \, dV \tag{7.1}
\]

In this, the left hand side contains the unknown interior electric fields ($\mathbf{E}^{int}$) while the
right hand side has the impressed sources ($\mathbf{J}^i$, $\mathbf{M}^i$). Since the excitation of the system
is not relevant to the derivation of the FE-BI equations, the right hand side can be
expressed as

\[
f_i^{int} = -\int_V \left\{ \nabla \times \left[ \frac{\mathbf{M}^i}{\mu_r} \right] + j k_0 Z_0 \mathbf{J}^i \right\} \cdot \mathbf{W}_i \, dV \tag{7.2}
\]

and its evaluation is left for specific applications. In practice, the electric current $\mathbf{J}^i$ in
(7.2) is useful for modeling filamentary current sources such as the ones used to
excite patch antennas, examples of which will be presented later in this chapter.
The magnetic current $\mathbf{M}^i$ can be used to represent aperture feeds within the
computational domain or on its boundary.
The FE-BI equation (7.1) contains second-order derivatives of the unknown
electric field due to the use of the wave equation. It is desirable to transfer one of the
derivatives from the unknown electric field onto the weight function so that linear
weight and expansion functions may be used. This derivative transfer is
accomplished by invoking the first vector Green's theorem [1] (see Chapter 5). Doing so,
(7.1) becomes

\[
\int_V \frac{\nabla \times \mathbf{E}^{int} \cdot \nabla \times \mathbf{W}_i}{\mu_r} \, dV - k_0^2 \int_V \epsilon_r \mathbf{E}^{int} \cdot \mathbf{W}_i \, dV - j k_0 Z_0 \oint_S \hat{n} \times \mathbf{H}^{int} \cdot \mathbf{W}_i \, dS = f_i^{int} \tag{7.3}
\]

This is the weak form of the wave equation, and it possesses useful properties
compared to (7.1). It has a symmetric volume contribution since an identical number
of derivatives are required of both the unknown electric field and weight function
and we will be using Galerkin's testing procedure. Hence, one may expect a
symmetric linear system associated with the volume integral provided the material within
the computational volume is reciprocal (i.e., not general anisotropic).
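The symmetry that follows from sharing one derivative each between the weight and expansion functions can be seen in a scalar, one-dimensional analog. The sketch below is not the vector edge-element formulation of this chapter: it assumes linear nodal elements on the unit interval, the scalar equation $-u'' - k_0^2 u = 0$, and Dirichlet end values, and the function name `assemble_1d_helmholtz` is hypothetical.

```python
import numpy as np

def assemble_1d_helmholtz(n_elem, k0, eps_r=1.0, mu_r=1.0, length=1.0):
    """Galerkin matrix for int (u' w')/mu_r dx - k0^2 int eps_r u w dx
    using linear elements (a scalar 1D analog of the weak form)."""
    h = length / n_elem
    A = np.zeros((n_elem + 1, n_elem + 1))
    stiff = (1.0 / (mu_r * h)) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    mass = (eps_r * h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])
    for e in range(n_elem):
        idx = [e, e + 1]
        A[np.ix_(idx, idx)] += stiff - k0**2 * mass
    return A

k0, n_elem = 2.0, 100
A = assemble_1d_helmholtz(n_elem, k0)
is_symmetric = np.allclose(A, A.T)  # same basis for testing and expansion

# Solve with u(0) = 0 and u(1) = 1; the exact solution is sin(k0 x)/sin(k0)
x = np.linspace(0.0, 1.0, n_elem + 1)
u = np.zeros(n_elem + 1)
u[-1] = 1.0
interior = np.arange(1, n_elem)
rhs = -A[interior, -1] * u[-1]          # move the known end value to the RHS
u[interior] = np.linalg.solve(A[np.ix_(interior, interior)], rhs)
max_error = np.max(np.abs(u - np.sin(k0 * x) / np.sin(k0)))
print(is_symmetric, max_error)
```

The element matrices are symmetric because the same functions are used for testing and expansion, which is the Galerkin property the text relies on; a dense array is used here only to keep the sketch short.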
Recall that in the beginning of this chapter, we stated that there are two
requirements which should be satisfied on the boundary of the finite element
mesh: (1) mesh closure and (2) relating the tangential electric field to the tangential
magnetic field. The latter requirement is clearly illustrated in (7.3) since the surface
term includes the surface electric field through the testing function $\mathbf{W}_i$ and the
tangential magnetic field $\hat{n} \times \mathbf{H}^{int}$. In addition, with some foresight, we recognize
that (7.3) represents an underdetermined system since the test functions are only
associated with the electric field while both the electric field and surface magnetic
field are unknown. We must therefore find a means of closing the mesh, relating the
surface electric and magnetic fields, and providing additional equations.
The exterior excitation (e.g., a plane wave) can be introduced into (7.3) by
considering the incident, reflected, and secondary (or scattered) fields as separable.
Specifically, the total exterior magnetic field can be expressed as the sum

\[
\mathbf{H}^{ext} = \mathbf{H}^i + \mathbf{H}^{refl} + \mathbf{H}^{scat} \tag{7.4}
\]

Here, the incident field, $\mathbf{H}^i$, and the reflected field, $\mathbf{H}^{refl}$, are known while the
secondary field ($\mathbf{H}^{scat}$) is obtained in terms of the interior fields using the surface
equivalence principle. This field decomposition is useful in particular for analyzing
structures that are infinite in extent, such as a conformal antenna recessed in a
metallic ground plane, since the reflected field is already known and hence need
not be computed. For finite structures, such as a scattering body encased within a
surface of revolution domain (see the FE-SOR method discussed later in this
chapter), the reflected field is omitted and only the impressed (incident) and secondary
(scattered) fields are considered, where $\mathbf{H}^{scat}$ would also have to include any reflected
fields that may be present. For radiation analysis, the impressed and reflected fields
are omitted altogether and the total field is set equal to the secondary field.
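A magnetic field integral equation (MFIE) [2], [3] can be formed once surface
equivalent currents are used as illustrated in Fig. 7.2. These currents can then be used
to express $\mathbf{H}^{scat}$, giving the MFIE [4]¹

\[
-\hat{n} \times [\mathbf{H}^i(\mathbf{r}) + \mathbf{H}^{refl}(\mathbf{r})] = -\frac{\mathbf{J}(\mathbf{r})}{2} - \oint_S \hat{n} \times [\nabla \times \overline{\overline{G}}(\mathbf{r}, \mathbf{r}') \cdot \mathbf{J}(\mathbf{r}')] \, dS' + j k_0 Y_0 \oint_S \hat{n} \times \overline{\overline{G}}(\mathbf{r}, \mathbf{r}') \cdot \mathbf{M}(\mathbf{r}') \, dS' \tag{7.5}
\]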

¹The first right hand side term is due to the identity (valid just interior to the surface S)

where (see Chapter 4) the horizontal bar implies the principal value of the integral. However, as pointed
out by Sancer [4], the principal value is not necessary since the numerical evaluation of the integral with r
on S does not produce a singularity. The 1/2 factor is actually obtained without invoking the principal
Figure 7.2 Volume and bounding surface involving a metallic ground plane. Figure labels:
"bubble" above ground plane; aperture in ground plane; ground plane (z = 0); cavity below
ground plane.

As usual, the electric and magnetic currents are associated with the tangential
external fields, e.g., $\mathbf{J} = \hat{n} \times \mathbf{H}^{ext}$ and $\mathbf{M} = \mathbf{E}^{ext} \times \hat{n}$, respectively. Also, (7.5) enforces the
identity $\mathbf{J} = \hat{n} \times \mathbf{H}^{int}$.
An alternative boundary integral equation can be derived by introducing the
electric field integral equation (EFIE). To do so, we decompose the electric fields as

\[
\mathbf{E}^{ext} = \mathbf{E}^i + \mathbf{E}^{refl} + \mathbf{E}^{scat} \tag{7.6}
\]

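where the secondary field ($\mathbf{E}^{scat}$) can be written again in terms of $\mathbf{J}$ and $\mathbf{M}$ in Fig. 7.2
by invoking the equivalence principle (see Chapter 1). Doing so results in the EFIE
[2], [5]

\[
-\hat{n} \times [\mathbf{E}^i(\mathbf{r}) + \mathbf{E}^{refl}(\mathbf{r})] = \frac{\mathbf{M}(\mathbf{r})}{2} + \oint_S \hat{n} \times [\nabla \times \overline{\overline{G}}(\mathbf{r}, \mathbf{r}') \cdot \mathbf{M}(\mathbf{r}')] \, dS' + j k_0 Z_0 \oint_S \hat{n} \times \overline{\overline{G}}(\mathbf{r}, \mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') \, dS' \tag{7.7}
\]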
A third alternative boundary integral equation is a linear combination of the MFIE
and EFIE known as the combined field integral equation (CFIE) [6], [7]

\[
(1 - \alpha)\,\text{MFIE} + \frac{\alpha}{Z_0}\,\text{EFIE} \tag{7.8}
\]

where $\alpha$ is a parameter between 0 and 1 indicating the amount of emphasis given to
the EFIE or MFIE (e.g., $\alpha = 0.25$ implies 25 percent EFIE and 75 percent MFIE). Both the
MFIE and EFIE suffer from spurious resonances when S forms a closed surface, as
in Fig. 7.1. That is, they break down at frequencies corresponding to the resonance
frequencies of the enclosed volume. The CFIE avoids these resonances by mixing the
MFIE and the EFIE [8]. We note that there are alternative forms of both the MFIE

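value theorem by placing r slightly off the surface S and then taking the limit as r approaches S.
Alternatively, the $\nabla$ operator can be moved outside the integral and applied after completing the
integration. Note that the identity

\[
\nabla \times \overline{\overline{G}}_0(\mathbf{r}, \mathbf{r}') = -\nabla \times [\overline{\overline{I}} G_0(\mathbf{r}, \mathbf{r}')] = -\nabla G_0(\mathbf{r}, \mathbf{r}') \times \overline{\overline{I}}
\]

implying $\nabla \times \overline{\overline{G}}_0(\mathbf{r}, \mathbf{r}') \cdot \mathbf{J}(\mathbf{r}') = -\nabla G_0(\mathbf{r}, \mathbf{r}') \times \mathbf{J}(\mathbf{r}')$ (given in Chapter 1), must be used to relate the curl of
the free space dyadic Green's function with the scalar one.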

and EFIE that may be more suitable for a particular application. However, for
simplicity and illustration, we will only use (7.5), (7.7), or a variation of them.
Neither of these boundary integral equations suffers from spurious resonances
when used to simulate cavities or antennas that are recessed in a ground plane
such as the case shown in Fig. 7.2.
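Once discretized, the CFIE amounts to a weighted sum of the MFIE and EFIE systems. The sketch below illustrates this with small random matrices standing in for the discretized operators; the function name `combine_cfie` is hypothetical, and the division of the EFIE by the free-space impedance $Z_0$ (so that both parts carry the same units) is an assumption of this sketch, not a statement of the exact normalization used in the text.

```python
import numpy as np

Z0 = 376.730313668  # free-space wave impedance in ohms

def combine_cfie(mfie_matrix, efie_matrix, mfie_rhs, efie_rhs, alpha, z0=Z0):
    """Return the CFIE system (1 - alpha) * MFIE + (alpha / z0) * EFIE.

    alpha = 0 gives the pure MFIE; alpha = 1 gives the EFIE scaled by 1/z0.
    The 1/z0 scaling is an assumed normalization.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    w_m = 1.0 - alpha
    w_e = alpha / z0
    return (w_m * mfie_matrix + w_e * efie_matrix,
            w_m * mfie_rhs + w_e * efie_rhs)

# Hypothetical stand-ins for discretized MFIE and EFIE operators
rng = np.random.default_rng(1)
n = 4
Zm = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Ze = Z0 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
bm = rng.standard_normal(n) + 0j
be = Z0 * rng.standard_normal(n) + 0j

Zc, bc = combine_cfie(Zm, Ze, bm, be, alpha=0.25)  # 25% EFIE, 75% MFIE
Zc0, bc0 = combine_cfie(Zm, Ze, bm, be, alpha=0.0)  # reduces to the MFIE
```

Both systems must be tested with the same functions and share the same unknown ordering for the sum to be meaningful.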
Note that in both (7.5) and (7.7), the dyadic Green's function is left unspecified.
For finite geometries (e.g., no infinite structures such as a metallic plane), the
free-space dyadic Green's function $\overline{\overline{G}}_0$ should be used. In contrast, for structures
involving an infinite metallic surface, such as a recessed conformal antenna, a practical
choice is the second kind electric dyadic Green's function [9]. In this, the metallic
ground plane condition is automatically included in the Green's function and
need not be enforced via currents. Hence the mesh boundary S is restricted to the
"bubble" above the ground plane, as shown in Fig. 7.2.
For cases where the volume is flush mounted with the ground plane, the
"bubble" reduces to the aperture in the ground plane, and for this limiting case
the electric currents can be eliminated from the boundary integral equations [10],
[11]. Also, the MFIE must be used for conformal antenna scattering calculations
when the aperture lies in a ground plane since the incident electric field is exactly
canceled in the aperture by the reflected electric field in the EFIE (7.7). Hence, no
excitation for the system can be specified for the EFIE. In contrast, the EFIE must
be used for open metal geometries since the MFIE is only valid for closed surfaces. A
compromise is to use the CFIE since it has both the MFIE and EFIE in it. However,
for the examples considered in this text, it is convenient to use the MFIE and hence
for the rest of the chapter, unless otherwise noted, we shall use the MFIE.
The weak form of the vector wave equation (7.3) involves the fields within the
enclosing boundary while the fields in (7.5) are in the outer region. These fields must
be coupled together to effect a hybridization of (7.3) and (7.5). This is done by
enforcing tangential field continuity across the computational volume's boundary

\[
\hat{n} \times \mathbf{H}^{int} = \hat{n} \times \mathbf{H}^{ext} \quad \text{on the surface } S \tag{7.9}
\]

\[
\hat{n} \times \mathbf{E}^{int} = \hat{n} \times \mathbf{E}^{ext} \quad \text{on the surface } S \tag{7.10}
\]

The continuity condition associated with the magnetic fields (7.9) is often termed a
"natural" condition and is enforced by setting $\mathbf{H}^{ext} = \mathbf{H}^{int}$ in (7.5). The electric field
continuity condition (7.10) must be explicitly enforced in the formulation and is
often termed an "essential" condition.
Combining (7.3), (7.5), and (7.9), we obtain the coupled FE and BI equations
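\[
\begin{aligned}
&\int_V \frac{\nabla \times \mathbf{E}^{int} \cdot \nabla \times \mathbf{W}_i}{\mu_r} \, dV - k_0^2 \int_V \epsilon_r \mathbf{E}^{int} \cdot \mathbf{W}_i \, dV - j k_0 Z_0 \oint_S \hat{n} \times \mathbf{H}^{int} \cdot \mathbf{W}_i \, dS = f_i^{int} \\
&-\frac{1}{2} \oint_S [\mathbf{Q}_i \cdot (\hat{n} \times \mathbf{H}^{int})] \, dS - \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \nabla \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{H}^{int} \, dS' \, dS \\
&\qquad - j k_0 Y_0 \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{E}^{ext} \, dS' \, dS = f_i^{ext}
\end{aligned} \tag{7.11}
\]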
where the exterior excitation term $f_i^{ext}$ is given by

= -jk0Z01 \302\246 + dS = jk0Z0/rl G.12)


ft* Q, x [H1
\320\271 \320\235\320\263\321\201\320\237]
is

and the testing functions associated with $\mathbf{H}^{int}$ are indicated by $\mathbf{Q}_i$. Note that there are
at present three classes of unknown fields in (7.11): $\mathbf{E}^{int}$, $\mathbf{E}^{ext}$, and $\mathbf{H}^{int}$. In (7.11), the
magnetic field continuity condition (7.9) was explicitly enforced; however, the
electric field continuity condition (7.10) must also be enforced to solve (7.11). This can
be accomplished in one of two ways: (1) implicitly by using identical basis functions
for $\mathbf{E}^{int}$ and $\mathbf{E}^{ext}$ on the surface S (hence, all occurrences of $\mathbf{E}^{ext}$ are replaced by $\mathbf{E}^{int}$ in
(7.11)); or (2) explicitly by enforcing the auxiliary relation

\[
\oint_S [\mathbf{Q}_i \cdot \hat{n} \times (\mathbf{E}^{int} - \mathbf{E}^{ext})] \, dS = 0 \tag{7.13}
\]
which satisfies (7.10) in a weak or average sense. Also, the testing functions for the
interior (FE) problem, $\mathbf{W}_i$, are not necessarily the same as the testing functions used
for the exterior (IE) problem, $\mathbf{Q}_i$. In fact, the testing functions $\mathbf{W}_i$ are associated with
the interior electric field while the testing functions $\mathbf{Q}_i$ are associated with the surface
magnetic field. Hence, for the general case of a different expansion for the exterior
and interior electric fields, we have the coupled equations

interior equation:

\[
\int_V \frac{\nabla \times \mathbf{W}_i \cdot \nabla \times \mathbf{E}^{int}}{\mu_r} \, dV - k_0^2 \int_V \epsilon_r \mathbf{W}_i \cdot \mathbf{E}^{int} \, dV - j k_0 Z_0 \oint_S \mathbf{W}_i \cdot \hat{n} \times \mathbf{H}^{int} \, dS = f_i^{int}
\]

exterior magnetic field testing equation:

\[
-\frac{1}{2} \oint_S [\mathbf{Q}_i \cdot (\hat{n} \times \mathbf{H}^{int})] \, dS - \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \nabla \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{H}^{int} \, dS' \, dS - j k_0 Y_0 \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{E}^{ext} \, dS' \, dS = f_i^{ext}
\]

coupling equation:

\[
\oint_S [\mathbf{Q}_i \cdot \hat{n} \times (\mathbf{E}^{int} - \mathbf{E}^{ext})] \, dS = 0 \tag{7.14}
\]

7.2.2 Solution of the FE-BI Equations

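The solution of (7.14) proceeds by expanding the volume electric field and
surface tangential electric and magnetic fields in terms of sub-domain basis functions

\[
\begin{aligned}
\mathbf{E}^{int} &= \sum_{j=1}^{N_v} E_j \mathbf{W}_j && \text{volume electric field} \\
\mathbf{E}^{ext} &= \sum_{j=N_v+1}^{N_v+N_{es}} E_j \mathbf{V}_j && \text{surface electric field} \\
\mathbf{H}^{int} &= \sum_{j=N_v+N_{es}+1}^{N_v+N_{es}+N_{hs}} H_j \mathbf{Q}_j && \text{surface magnetic field}
\end{aligned} \tag{7.15}
\]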

where $N_v$ is the number of volume electric field unknowns, $N_{es}$ is the number of
surface (exterior) electric field unknowns, $N_{hs}$ is the number of surface magnetic field
unknowns, and the total number of unknowns is given by $N = N_v + N_{es} + N_{hs}$. The
volume electric field is expanded in terms of volumetric basis functions that are based
on the edges of the geometry (see Chapters 2 and 5). The surface electric and
magnetic fields are expanded in terms of separate functions that have support only
over sub-domain surface patches, which are often triangular or rectangular in shape.
In both cases, since we are using the Galerkin procedure for converting a continuous
physical problem to a discrete approximation of that original problem, the same
basis functions used for testing in (7.14) are used for field expansions, though $\mathbf{W}_i$ and
$\mathbf{Q}_i$ are not necessarily identical. However, in (7.15) $\mathbf{V}_j$ is identical to $\mathbf{Q}_j$, though
different symbols are used for clarity in this text.
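As shown above, the enforcement of magnetic field continuity across the
interface was performed by equating the exterior and interior magnetic fields across the
volume boundary, S. Enforcement of the tangential electric field continuity must
also be accomplished. One approach requires the use of the three equations in (7.14),
including the coupling equation. When the three field expansions (7.15) are
substituted into (7.14), we get

interior electric field testing:

\[
\sum_{j=1}^{N_v} E_j \left\{ \int_V \frac{\nabla \times \mathbf{W}_i \cdot \nabla \times \mathbf{W}_j}{\mu_r} \, dV - k_0^2 \int_V \epsilon_r \mathbf{W}_i \cdot \mathbf{W}_j \, dV \right\} - j k_0 Z_0 \sum_{j=N_e+1}^{N} H_j \left\{ \oint_S \mathbf{W}_i \cdot \hat{n} \times \mathbf{Q}_j \, dS \right\} = f_i^{int}, \quad i = 1, 2, \ldots, N_v
\]

exterior surface magnetic field testing:

\[
\sum_{j=N_e+1}^{N} H_j \left\{ -\frac{1}{2} \oint_S [\mathbf{Q}_i \cdot (\hat{n} \times \mathbf{Q}_j)] \, dS - \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \nabla \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{Q}_j \, dS' \, dS \right\} - j k_0 Y_0 \sum_{j=N_v+1}^{N_e} E_j \left\{ \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{V}_j \, dS' \, dS \right\} = f_i^{ext}, \quad i = N_e + 1, N_e + 2, \ldots, N
\]

interior/exterior coupling equation:

\[
\sum_{j=1}^{N_v} E_j \left\{ \oint_S [\mathbf{Q}_i \cdot \hat{n} \times \mathbf{W}_j] \, dS \right\} - \sum_{j=N_v+1}^{N_e} E_j \left\{ \oint_S [\mathbf{Q}_i \cdot \hat{n} \times \mathbf{V}_j] \, dS \right\} = 0, \quad i = N_e + 1, N_e + 2, \ldots, N \tag{7.16}
\]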
where $N_e = N_v + N_{es}$ is the total number of electric field unknowns and
$N = N_v + N_{es} + N_{hs}$ is the total number of unknowns.
A different approach involves the implicit satisfaction of the electric field
continuity requirement by employing identical basis functions. In this case, the surface
basis functions $\mathbf{V}_j$ are chosen to be identical to a surface evaluation of the volume
basis function $\mathbf{W}_j$, e.g., $\mathbf{V}_j = \mathbf{W}_j$ as $(x, y, z) \to$ surface. Accordingly, we rewrite the
electric and magnetic field expansions as

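\[
\begin{aligned}
\mathbf{E} &= \sum_{j=1}^{N_e} E_j \mathbf{W}_j && \text{volume and surface electric field} \\
\mathbf{H} &= \sum_{j=N_e+1}^{N} H_j \mathbf{Q}_j && \text{surface magnetic field}
\end{aligned} \tag{7.17}
\]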

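Thus, by constraining the exterior field expansion to be identical to the interior field
expansion, continuity is assured. This also results in a reduction of the matrix order
since the interior surface electric field unknowns are now identical to the exterior
surface electric field unknowns. We can omit the coupling equation in (7.14) to get

interior volume electric field:

\[
\int_V \frac{\nabla \times \mathbf{W}_i \cdot \nabla \times \mathbf{E}^{int}}{\mu_r} \, dV - k_0^2 \int_V \epsilon_r \mathbf{W}_i \cdot \mathbf{E}^{int} \, dV - j k_0 Z_0 \oint_S \mathbf{W}_i \cdot \hat{n} \times \mathbf{H}^{int} \, dS = f_i^{int}
\]

exterior surface magnetic field:

\[
-\frac{1}{2} \oint_S [\mathbf{Q}_i \cdot (\hat{n} \times \mathbf{H}^{int})] \, dS - \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \nabla \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{H}^{int} \, dS' \, dS - j k_0 Y_0 \oint_S \oint_S \mathbf{Q}_i \cdot [\hat{n} \times \overline{\overline{G}} \times \hat{n}'] \cdot \mathbf{E}^{int} \, dS' \, dS = f_i^{ext} \tag{7.18}
\]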
[n

Using G.17) in G.18). we also get


interior volume electric field testing:

J2 Hj\\i W, \342\200\242
n x
Q;ds\\ =/}\302\246\", /=1,2 Ne

exterior surface magnetic field testing:

Y) Hf\\-\\i[Qr(nxQ,)]dS-i V
iQ/\320\223\320\277\321\205
x S x n'l Q.rfS'(

i=Ne+l,Ne +2 N G.19)

We observe that the number of unknowns is equal to the number of equations, $N$, and
that $N_{hs} = N - N_e$. It is also understood that each contribution is nonzero only when both the
test and expansion functions have support; e.g., although all electric field unknowns
are shown in the second equation of (7.19), only those associated with the surface
have support. The linear system represented by (7.19) is solved using either a direct
or iterative matrix solver to determine the unknown electric and magnetic fields (see
Chapter 9).
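The block structure of such a coupled system, and its solution by a direct or an iterative solver, can be sketched as follows. This is an illustration only: the blocks are filled with made-up, diagonally dominant numbers rather than actual finite element or boundary integral entries, the ordering of the unknowns (electric field first, surface magnetic field last, surface edges at the end of the electric block) is an assumption, and all variable names are hypothetical.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(0)
n_e, n_s = 40, 8          # electric unknowns; the last n_s lie on the surface
n_h = n_s                 # surface magnetic unknowns

# Sparse FE block (stand-in for the volume integrals tested with W_i)
A_fe = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n_e, n_e),
                dtype=complex).tocsr()

# Sparse coupling block: only surface test functions see H_j
C = sp.lil_matrix((n_e, n_h), dtype=complex)
for i in range(n_s):
    C[n_e - n_s + i, i] = -0.5j

# Dense BI blocks: every surface unknown interacts with every other one
B_h = -0.5 * np.eye(n_h) + 0.02 * (rng.random((n_h, n_h))
                                   + 1j * rng.random((n_h, n_h)))
B_e = np.zeros((n_h, n_e), dtype=complex)
B_e[:, n_e - n_s:] = 0.02 * (rng.random((n_h, n_s))
                             + 1j * rng.random((n_h, n_s)))

A = sp.bmat([[A_fe, C.tocsr()],
             [sp.csr_matrix(B_e), sp.csr_matrix(B_h)]], format="csr")
b = rng.random(n_e + n_h) + 1j * rng.random(n_e + n_h)

x_direct = spla.spsolve(A.tocsc(), b)      # direct sparse factorization
x_iter, info = spla.gmres(A, b)            # iterative Krylov solver
rel_res = np.linalg.norm(A @ x_iter - b) / np.linalg.norm(b)
print(info, rel_res)
```

Even in this toy layout the boundary integral rows are fully populated over the surface unknowns while the finite element rows hold only a few entries each, which is why the surface unknown count tends to dominate the storage.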
This latter set of coupled equations (7.19) yields the least number of unknowns
and the simplest formulation to implement. However, this approach limits the
flexibility of the method since the discretization required by the interior fields must also
be used on the exterior electric field. Consequently, the boundary integral portion of
the formulation may need to be oversampled to accommodate the geometrical
requirements of the interior region. The converse is also true where many volume
unknowns are essentially wasted to permit a detailed surface mesh. Regardless, the
result is a potential waste of computational resources and a loss of flexibility.

7.2.3 Comments on the General FE-BI Formulation

The FE-BI equations, (7.11)-(7.19), are the most general forms and can be
used under any circumstances. Such equations have been implemented by Antilla
and Alexopoulos [2] while Eibert and Hansen [12] used a slightly different integral
equation. An example of the kind of problems that can be modeled using such a
flexible method is shown in Fig. 7.3 where the geometry is a coated finite height
ogival cylinder. Figure 7.4 illustrates the RCS comparison between a FE-CFIE
method [2] and measured data for ?^-polarization.

Figure 7.3 Coated cylinder with ogival cross section: 60.96 cm x 20.32 cm metallic
cylinder having a 0.3175 cm dielectric coating with $\epsilon_r = 2.68 - j0.01$.
[After Antilla and Alexopoulos [2], © J. Opt. Soc. Am., 1994.]

The versatile FE-BI methods are still associated with large demands on
computational resources. The FE portion of (7.11) will permit the specification of a
complex inhomogeneous volume fill while imposing a low memory and compute
cycle burden, principally due to the resulting sparse matrix. However, the two
boundary terms are essentially identical to a surface method of moments
formulation with the resulting fully populated matrix. Figure 7.5 illustrates the fill profile of a
typical FE-BI matrix. The dark region indicates matrix entries that are nonzero while
the white space denotes zeros and hence corresponds to portions of the matrix that
need not be calculated or stored. Clearly, the volume finite element formulation
results in a sparsely populated portion of the matrix while the boundary integral
results in a fully populated portion of the matrix. The result is that for many
circumstances, the boundary integral terms impose a higher demand on computational
resources as compared to the FE portion. This is true even though the BI term is
responsible for only a small fraction of the total number of unknowns, as shown in
Fig. 7.5!

Figure 7.4 Radar cross section for the coated ogive cylinder shown in Fig. 7.3
(curves: Theta = 90 deg computation and measured data; horizontal axis 0.0 to 360.0,
vertical axis -30.0 to 30.0). [After Antilla and Alexopoulos [2], © J. Opt. Soc. Am.,
1994.]

Figure 7.5 Fill profile for a typical finite element-boundary integral matrix (row
versus column index, with the BI system block indicated). [Courtesy of
S. Bindiganavale.]

An example will illustrate this point. Consider a $\lambda \times \lambda \times \lambda$
volume that is discretized using edge-based brick elements with edge length a
conservative $\lambda/10$. Each face of the volume will have 180 unknowns per field
component. Including the edges at each face junction, the total number of surface
unknowns is 1200 per field component or 2400 total surface unknowns. The boundary
matrices associated with these unknowns would require approximately 12 MB of RAM
if single precision complex number storage is used. This is obviously not a large
burden even for many personal computers. However, consider the effect of doubling
the sample rate as is often necessary for complex geometries or radiation problems.
In this case, each edge of the mesh will be $\lambda/20$. The number of unknowns per
field component is now 4800 or 9600 total surface unknowns, and these require over
184 MB of RAM! Although this amount of RAM can be found in high-end engineering
workstations, it is clear that the boundary integral memory demand does not scale
favorably.
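The counts quoted above can be checked with a short script. This is only a sketch: the text does not say how the memory figures were obtained, so the code assumes (as labeled in the comments) that each figure refers to one dense square block over the unknowns of a single field component, stored at 8 bytes per single precision complex entry, with 1 MB taken as 10^6 bytes. Under those assumptions the numbers in the text are reproduced. The function names are illustrative.

```python
def surface_edge_unknowns(cells_per_side: int) -> int:
    """Edge unknowns on the surface of a cube with n x n square cells per face."""
    n = cells_per_side
    interior_per_face = 2 * n * (n - 1)  # edges strictly inside one face
    junction_edges = 12 * n              # edges along the 12 face junctions
    return 6 * interior_per_face + junction_edges


def dense_block_megabytes(unknowns: int, bytes_per_entry: int = 8) -> float:
    """Storage of one dense unknowns x unknowns block.

    Assumption: 8 bytes per entry (single precision complex), 1 MB = 1e6 bytes.
    """
    return unknowns ** 2 * bytes_per_entry / 1e6


for cells in (10, 20):  # lambda/10 and lambda/20 sampling of a one-wavelength cube
    per_component = surface_edge_unknowns(cells)
    print(cells, per_component, 2 * per_component,
          round(dense_block_megabytes(per_component), 2))
# prints: 10 1200 2400 11.52
#         20 4800 9600 184.32
```

Halving the edge length multiplies the surface unknown count by four and the dense storage by sixteen, which is the unfavorable scaling the paragraph describes.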
Hence, the FE-BI method is usually implemented for certain special cases that
reduce the BI's demand on resources. In the next several sections, we will present
examples of these special cases beginning with the case of a cavity-backed volume
recessed in a metallic ground plane. However, we first introduce the important topic
of excitation and feed modeling.

7.3 EXCITATION AND FEED MODELING

Solution of the FE-BI system requires specification of the source(s). Several sources
are typically used to excite the FE-BI system. These sources are broadly divided into
two categories: exterior and interior. By far the most common exterior source is the
plane wave. This is due to the fact that any incident field may be decomposed into a
number of plane waves and superposition guarantees that the solution to a general
excitation can be found by the sum of the solutions for each plane wave. However,
an arbitrary wave source may be readily specified (for example, a Gaussian beam or
a measured antenna pattern).
Interior sources are specified as either electric or magnetic currents. A number of
these sources are used in practice to model both filamentary (probe) feeds as well as
aperture feeds. The FE-BI method is particularly important for antenna modeling
since sufficiently accurate and efficient artificial mesh truncation procedures are not
available for accurate near-field calculations. Since the input impedance is required for
antenna analysis and gain calculations, near-field accuracy is critical as well as a good
feed model. In the following section, we present a brief discussion of feed modeling.

7.3.1 Plane Wave

One source, useful for determining the radar cross section (RCS), is the plane
wave

$$\mathbf{H}^i = Y_0 [\hat{k}^i \times \mathbf{E}^i] \tag{7.20}$$

where $Y_0 = 1/Z_0$ is the free space admittance, the polarization of the incident field is
indicated by $\hat{e}^i$, and the incident field direction is denoted by $\hat{k}^i = -\hat{r}$. That is,
$\hat{k}^i = -[\hat{x} \cos\phi^i \sin\theta^i + \hat{y} \sin\phi^i \sin\theta^i + \hat{z} \cos\theta^i]$ where $(\phi^i, \theta^i)$ denote the angles of the
incident plane wave field. If we assume an infinite, metallic ground plane, the
reflected field is given by

$$\mathbf{H}^{refl} = Y_0 [\hat{k}^r \times \mathbf{E}^{refl}] \tag{7.21}$$

where $\hat{k}^r = -\hat{x} \cos\phi^i \sin\theta^i - \hat{y} \sin\phi^i \sin\theta^i + \hat{z} \cos\theta^i$ and $\hat{e}^r$ denotes the polarization of
the reflected plane wave as dictated by the boundary conditions. Thus, in the ground
plane aperture, the sum of the incident and the reflected tangential magnetic field is twice the
incident field while the sum of the incident and reflected tangential electric fields is zero. Using
these fields in (7.12) and assuming evaluation in the z = 0 plane, we get the
generating functional for the system excitation vector

$$f_i^{ext} = -j2k_0 \int_S \mathbf{Q}_i \cdot \hat{z} \times [\hat{k}^i \times \hat{e}^i e^{-jk_0 (\hat{k}^i \cdot \mathbf{r})}]\, dS \tag{7.22}$$
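The doubling of the tangential magnetic field and the cancellation of the tangential electric field at the ground plane can be checked numerically. The sketch below is an illustration only. It assumes (this is not stated in the text) that the reflected polarization follows image theory for a perfect conductor, i.e. the reflected amplitude is the negated mirror image of the incident one, and it evaluates the fields at z = 0 where the incident and reflected phase factors coincide. The helper names are hypothetical.

```python
import numpy as np


def incident_direction(theta, phi):
    """Unit vector k^i = -r_hat for incidence angles (theta, phi) in radians."""
    return -np.array([np.cos(phi) * np.sin(theta),
                      np.sin(phi) * np.sin(theta),
                      np.cos(theta)])


def mirror_z(vector):
    """Mirror a vector in the z = 0 ground plane (flip its z component)."""
    return vector * np.array([1.0, 1.0, -1.0])


def aperture_plane_fields(theta, phi, e_inc, y0=1.0):
    """Total E and H amplitudes at z = 0, plus the incident H amplitude.

    Assumption: e_ref = -mirror_z(e_inc), the image-theory result for a
    perfectly conducting plane; y0 is the free space admittance.
    """
    k_inc = incident_direction(theta, phi)
    k_ref = mirror_z(k_inc)            # matches k^r given in the text
    e_ref = -mirror_z(e_inc)
    h_inc = y0 * np.cross(k_inc, e_inc)
    h_ref = y0 * np.cross(k_ref, e_ref)
    return e_inc + e_ref, h_inc + h_ref, h_inc


theta, phi = 0.7, 0.3
theta_hat = np.array([np.cos(phi) * np.cos(theta),
                      np.sin(phi) * np.cos(theta),
                      -np.sin(theta)])
phi_hat = np.array([-np.sin(phi), np.cos(phi), 0.0])
e_inc = 0.6 * theta_hat + 0.8 * phi_hat   # arbitrary mixed polarization

e_total, h_total, h_inc = aperture_plane_fields(theta, phi, e_inc)
print("tangential E total:", e_total[:2])
print("tangential H total / incident:", h_total[:2] / h_inc[:2])
```

For any incidence angle and polarization the tangential components of the total electric field vanish and the tangential total magnetic field equals twice the incident one, which is where the factor of 2 in the excitation functional comes from.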

7.3.2 Probe Feed

Another useful source is an impressed electric current, $\mathbf{J}^i$, which can be used to
simulate a probe feed. This feed has been used to model a patch antenna feed by the
method of moments [13] as well as in the finite element method [10]. Assuming an
infinitesimally thin current filament, the excitation term (7.2) for an arbitrarily
oriented probe feed is given by

$$f_i^{int} = -jk_0 Z_0 \int_V \mathbf{J}^i \cdot \mathbf{W}_i\, dV = -jk_0 Z_0 I_0\, \hat{l} \cdot \mathbf{W}_i(\mathbf{r}) \tag{7.23}$$

where $\hat{l}$ indicates the orientation of the probe, $\mathbf{W}_i$ is the weight or testing function
associated with the $i$th unknown, and $I_0$ is the current flowing through the filament.
For current probes lying along the x-, y-, or z-directions, we have

$$\begin{aligned}
\mathbf{J}^i \cdot \mathbf{W}_i &= I_0\, \delta(y - y_p)\, \delta(z - z_p)\, W_i^x(x, y, z) \\
\mathbf{J}^i \cdot \mathbf{W}_i &= I_0\, \delta(x - x_p)\, \delta(z - z_p)\, W_i^y(x, y, z) \\
\mathbf{J}^i \cdot \mathbf{W}_i &= I_0\, \delta(x - x_p)\, \delta(y - y_p)\, W_i^z(x, y, z)
\end{aligned} \tag{7.24}$$
where the coordinates $(x_p, y_p, z_p)$ indicate the location of the source within the
computational domain. Note that the filament is not constrained other than lying within
the computational volume. However, in practice a filament length of one finite
element cell is usually assumed for simplicity (so that integration over that length
will yield the moment $I_0 l$). With this assumption, the excitation vector generating
function becomes

$$\begin{aligned}
f_i^{int} &= -jk_0 Z_0 I_0 \int_V [\delta(y - y_p)\, \delta(z - z_p)]\, \hat{x} \cdot \mathbf{W}_i\, dV = -jk_0 Z_0 I_0\, \Delta x\, W_i^x(\cdot, y_p, z_p) \\
f_i^{int} &= -jk_0 Z_0 I_0 \int_V [\delta(x - x_p)\, \delta(z - z_p)]\, \hat{y} \cdot \mathbf{W}_i\, dV = -jk_0 Z_0 I_0\, \Delta y\, W_i^y(x_p, \cdot, z_p) \\
f_i^{int} &= -jk_0 Z_0 I_0 \int_V [\delta(x - x_p)\, \delta(y - y_p)]\, \hat{z} \cdot \mathbf{W}_i\, dV = -jk_0 Z_0 I_0\, \Delta z\, W_i^z(x_p, y_p, \cdot)
\end{aligned} \tag{7.25}$$

where the integration volume in (7.25) includes all elements containing the source.
Also, the three expressions in (7.25) correspond to x-, y-, and z-directed filaments,
respectively. If more than one test edge is involved in the feed model (e.g., when the
probe feed is in the interior of an element), the total contribution is given by
evaluating (7.25) for all edges for which $\hat{l} \cdot \mathbf{W}_i \neq 0$.
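As an illustration of how a z-directed probe inside a brick element contributes to several test edges, the following sketch evaluates the third expression of (7.25) for the four z-directed edges of one brick. It rests on an assumption that is not given in this section: the z-directed edge basis functions are taken to be the usual first-order brick edge functions, bilinear in x and y and constant in z (the specific formulae are deferred to the appendix). The function names, the edge labeling by corner index, and the numerical values are all hypothetical.

```python
def z_edge_weights(xp, yp, x0, y0, dx, dy):
    """Values of the four z-directed edge functions at the probe position.

    Assumption: bilinear variation in x and y, constant in z. Edges are keyed
    by the (i, j) corner of the brick footprint they pass through, with (0, 0)
    at (x0, y0) and (1, 1) at (x0 + dx, y0 + dy).
    """
    u = (xp - x0) / dx
    v = (yp - y0) / dy
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("probe lies outside this element")
    return {
        (0, 0): (1 - u) * (1 - v),
        (1, 0): u * (1 - v),
        (0, 1): (1 - u) * v,
        (1, 1): u * v,
    }


def probe_excitation(k0, z0, i0, dz, weights):
    """Excitation entries -j k0 Z0 I0 dz W_i^z(x_p, y_p) for each test edge."""
    return {edge: -1j * k0 * z0 * i0 * dz * w for edge, w in weights.items()}


# Probe at the center of a unit footprint: all four edges share the moment.
weights = z_edge_weights(0.5, 0.5, 0.0, 0.0, 1.0, 1.0)
entries = probe_excitation(k0=2.0, z0=377.0, i0=1.0, dz=0.1, weights=weights)
for edge, value in sorted(entries.items()):
    print(edge, value)
```

When the probe coincides with an edge, that edge receives the full moment and the other three entries vanish, which is the single-edge case described in the text.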
A word of caution is in order. The probe feed presented herein has been
criticized for being too simplistic in that all of the current is concentrated in an
infinitesimally thin filament and that the assumption of a constant current is
unrealistic. Admittedly, ignoring the diameter of the probe (and the coaxial aperture) leads
to errors, particularly in the input reactance, and approximate formulae have been
proposed to correct this error. However, these corrections can hardly be considered
as satisfactory. Furthermore, the constant current assumption results in limiting the
use of the probe feed to very thin regions where the fields are not expected to vary
substantially (e.g., the thin substrate between a patch antenna and its underlying
ground plane).
To correct these shortcomings, it is tempting to borrow solutions developed for
the integral equation methods. In particular, Aberle and Pozar [14] use unknowns on
the surface of the probe feed and enforce an EFIE on the probe to include the effects
of its finite diameter and current variations along the probe. However, to use such a
model requires use of the Green's function for the structure surrounding the probe.
Typically, such a Green's function is not available for structures considered for finite
element analysis. Use of a different Green's function, such as the half-space Green's
function, is not correct and should not be used.

7.3.3 Voltage Gap Feed

Another commonly used excitation, often employed in the integral equation
methods, is the voltage or gap generator. In this feed model, the voltage across a
small gap is specified as fixed. Thus, Gauss' Law can be used to write

$$E^i = \frac{V}{\delta} \tag{7.26}$$

where $E^i$ is the impressed electric field, $V$ is the voltage assumed a priori across the
gap, and $\delta$ is the length of the gap parallel to the incident field.
Assuming for simplicity that the gap coincides with the $i$th edge/unknown, the
gap voltage feed is implemented by forcing the electric field (the unknown) to be
equal to $V/d$ where $d$ is the length of the $i$th edge. This can be done during each
iteration of an iterative solver or via an appropriate auxiliary (additional equation)
condition.
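One way to realize the auxiliary condition is to replace the equation of the gap edge with the constraint itself. The sketch below shows this on a small stand-in system. It is a generic illustration and not the authors' implementation: the matrix, right-hand side, and function name are made up for the example, and a dense direct solve is used only to keep it short.

```python
import numpy as np


def impose_gap_voltage(matrix, rhs, edge, voltage, edge_length):
    """Replace the equation of one edge unknown by E_edge = V / d.

    Returns modified copies; the inputs are left untouched. Note that this
    row replacement makes the matrix nonsymmetric.
    """
    a = np.array(matrix, dtype=complex)   # np.array copies by default
    b = np.array(rhs, dtype=complex)
    a[edge, :] = 0.0
    a[edge, edge] = 1.0
    b[edge] = voltage / edge_length
    return a, b


# Hypothetical 3-unknown system standing in for the assembled FE-BI equations.
matrix = [[4.0, 1.0, 0.0],
          [1.0, 3.0, 1.0],
          [0.0, 1.0, 2.0]]
rhs = [0.0, 0.0, 0.0]

a, b = impose_gap_voltage(matrix, rhs, edge=1, voltage=1.0, edge_length=0.02)
solution = np.linalg.solve(a, b)
print(solution)   # the unknown on edge 1 equals V / d
```

If symmetry of the system matters for the chosen solver, the known value can instead be moved to the right-hand side of the remaining equations and the constrained unknown eliminated.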

7.3.4 Coaxial Cable Feed

As noted above, the probe feed model is of acceptable accuracy for very thin
substrates and for circumstances where the diameter of the probe may be safely
ignored. For thicker substrates or for circumstances where the diameter of the
probe and the size of the coaxial (or coax) aperture need be considered, an improved
feed model is necessary. The coaxial feed geometry is shown in Fig. 7.6. The following
derivation and model are based on the development given by Gong and Volakis [15].

Figure 7.6 (a) Side view of a cavity-backed antenna with a coax cable feed (labels:
cavity, patch); (b) illustration of the FE mesh at the cavity-cable junction. [After
Gong and Volakis [15], © IEEE, 1995.]

With the presence of the coax cable aperture, the boundary integral (not $f_i^{int}$)
in (7.3) will include the term

(7.27)

where $\hat{n} = \hat{z}$ and $S_f$ denotes the aperture of the coax cable. Assuming a TEM mode
across $S_f$, the fields within the cable may be expressed as

(7.28)

where

(7.29)

In these expressions, $\rho$ is the radial variable from the center of the inner conductor, $\hat{\rho}$
and $\hat{\phi}$ are the usual cylindrical unit vectors in the coax aperture, $I_0$ is the current
flowing through the coax cable, and $\epsilon_{rc}$ is the relative permittivity of the dielectric
between the inner and outer conductors of the coax cable. Also, $\Gamma$ is the reflection
coefficient of the incoming coax TEM mode at the coax cable opening but can be
eliminated from (7.29) to obtain the relation

(7.30)

We observe that (7.30) is the constraint at the cable junction in terms of the
new quantities $h_0$ and $e_0$ which can be used as new unknowns in place of the fields E
and H. However, before introducing (7.27) into the system, it is necessary to relate $e_0$
and $h_0$ to the unknowns (edges) lying within the coaxial aperture. Since the actual
field has a $1/\rho$ behavior in the cable, direct enforcement of field continuity on a
point-by-point basis is not possible using edge elements having constant or linear
variation. Instead, when using edge elements, one can only specify that the potential
difference across the coax cable be equal to that across the bordering volume edges.
Specifically, we find that

$$\Delta V = E_i\, (b - a) = e_0 \ln \frac{b}{a}, \qquad i = N_p \quad (p = 1, 2, \ldots, N_C) \tag{7.31}$$

where $\Delta V$ denotes the potential difference between the inner and outer surface of the
cable, and $E^{coax}$ denotes the field in the coax. Also, $N_p$ denotes the global number for
the edge across the coax cable and $N_C$ is the number of edges in the coax aperture.
When the condition (7.31) is used in the function (7.27), it introduces the excitation
into the finite element system without a need to extend the mesh inside the cable or to
employ a fictitious current probe. Specifically, we have

\342\200\224
CF _ /-coax

with

\320\273
C

Note that/f is nonzero only for thoseedgeswhich coincide with the aperture of the
coaxial cable. It is also apparent that//0\"* is a constant and becomes part of the
excitation function/1\"' when moved to the right hand side of the matrix system
Basically, the excitation column entries will be zero or equal to/?oax for those edges
coinciding with the coaxial cable aperture. Upon solution of the system, the input
admittance at the coax aperture (z = \320\236)
can be obtained using the expression
\342\200\236 f 2/0 1
=J_

where $Z_c$ is the characteristic impedance of the coax cable and the integration path C
is around the center conductor.

7.3.5 Aperture-Coupled Microstrip Line

Figure 7.7 illustrates a microstrip antenna fed by a microstrip line network
underneath the antenna via a coupling aperture. Care must be taken in incorporating
such a feed into the FE method [16], [17]. This is because the microstrip line usually
requires dramatically different discretization as compared to the cavity's geometries.
It is convenient to separate the computational domain to accommodate the
smaller element dimensions required for the guided feed structure. Also, different
finite element shapes may be favored for the antenna and the feed region. For
example, consider a circular patch antenna. The cavity/antenna fields may be
discretized using tetrahedral elements, whereas in the microstrip line region rectangular
bricks are the best candidates since the feed structure is rectangular in shape and the
substrate has a constant thickness.

Figure 7.7 Cross section of an aperture coupled patch antenna, showing the cavity
region I and the microstrip line region II for two different FEM computational
domains (labels: antenna elements, coupling aperture $S_a$, truncation plane).

Although both types of elements employ edge-based field expansions, the
meshes across the common area (coupling aperture) are different, and consequently
some connectivity matrix must be introduced to relate the mesh edges across the
aperture. This can be accomplished using the coupling equations (7.9), (7.10), and
(7.13), introduced previously. However, since the aperture is very narrow, a 'static'
field distribution may be assumed at any given frequency. Therefore, the potential
concept may be applied to relate the fields on either side of the aperture. To do so, let
us first classify the slot edges as follows [17]:

Cavity mesh (assumed to form right triangles across the aperture)


\302\246
Ef j=\\, 2,3.... parallel edges
\302\246
Ef 7 = 1,2,3,... diagonal edges
Feed mesh (assumed to form square elements across the aperture)
\302\246
\320\251 j
= 1. 2, 3,... parallel edgesonly
In these, parallel edges refer to edges parallel to the Cartesian axes. To ensure equal
potential across the aperture requires that

El;2
=
^(eJEJ
+ eHlE*+l) G.33)

in which

*,= G.34)
{+]

and in these, $l$ and $d$ are the lengths of the parallel and diagonal edges, respectively.
That is, $l$ is simply the width of the narrow rectangular aperture between the two
meshes. The coefficient $e_j$ is equal to $\pm 1$ depending on the sign conventions
associated with the meshes to either side of the coupling aperture. Essentially, (7.33) is a
potential approximation to the electric field continuity conditions (7.10) and (7.13).

7.3.6 Mode Matched Feed

Alternatively, a mode matching procedure can be used at the aperture to
rigorously include the higher order (evanescent) modes that are excited at the coax
aperture. This amounts to including an additional boundary integral at the feed
aperture and synthesizing a coax Green's function via a modal series. Modal series
feed models are discussed by Reddy et al. [18]. In this, the authors present as an
appendix the formulas for rectangular, circular, and coaxial feed apertures.
The electric field across the feed aperture can be expressed as a sum of incident
and reflected fields

$$\mathbf{E} = \mathbf{E}^i + \sum_m [a_m \mathbf{E}_m^{TE}(x, y) + b_m \mathbf{E}_m^{TM}(x, y)]\, e^{\gamma_m z} \tag{7.35}$$

with

$$\mathbf{E}^i(x, y, z) = \mathbf{E}^i(x, y)\, e^{-\gamma_0 z} \tag{7.36}$$

where $\mathbf{E}^i$ is the incident field, and $\mathbf{E}_m^{TE}$ and $\mathbf{E}_m^{TM}$ are the TE and TM modes,
respectively. The reflection coefficients are given by $a_m$ and $b_m$ for TE and TM modes,
respectively, while $\gamma_m$ is the corresponding propagation constant within the
waveguide. Note that in (7.36), the incident field is assumed to be propagating in the +z
direction; however, this is simply a convenient choice and other propagating fields
can readily be implemented.


Since the waveguide modes are orthogonal, the reflection coefficients are
given by

\\ ^ G.37|

is,
Across the feed aperture the magnetic field is obtained by Faraday's Law

$$\mathbf{H} = \frac{j}{k_0 Z_0} \nabla \times \mathbf{E} \tag{7.39}$$

assuming a nonmagnetic material ($\mu_r = 1$). This is used across the feed aperture in
(7.3) by equating it to the interior magnetic field ($\mathbf{H} = \mathbf{H}^{int}$). Hence, the excitation for
the system is given by

$$f_i^{mode} = jk_0 Z_0 \int (\mathbf{W}_i \times \hat{z}) \cdot \mathbf{H}\, dS \tag{7.40}$$

and upon applying (7.35)-(7.40) at $z = -L$ (e.g., within the waveguide, a distance $L$
from the aperture), (7.40) becomes

/-mode= 2yme

Note that this model assumes that the finite element mesh extends a distance $L$ into
the waveguide to reduce the number of modes that need be retained.



7.4 CAVITY RECESSED IN A GROUND PLANE

One of the most successful applications of the FE-BI method involves antennas
situated in a cavity recessed in an infinite, metallic ground plane. The reason this
application is so well-suited for the FE-BI method is that the costly boundary
integral portion of the formulation is minimized while the material flexibility of
the finite element method is retained. FE-BI was first applied to this problem by
Jin and Volakis [10] where they utilized brick elements. The computer program
developed under that effort, FEMA-BRICK, was exceptionally efficient and has
been used by government, industry, and academia for the analysis and design of
very large (hundreds of elements) patch and slot antenna arrays. The secret of this
particular computer program's efficiency is the use of a biconjugate gradient-fast
Fourier transform (BiCG-FFT) matrix solver [16], [19]. This approach will be
discussed below as well as a description of the BiCG-FFT method. The FE-BI method
has since been implemented using tetrahedral [11], [20] and prism [21] elements to
enhance flexibility; however, use of such flexible finite elements generally precludes
the use of a BiCG-FFT matrix solver. An additional enhancement of the basic
recessed cavity formulation involves grafting geometrical theory of diffraction
(GTD) coefficients to the FE-BI results to approximate a finite ground plane [22].
Figure 7.8 illustrates a cavity-backed aperture in a metallic ground plane. The
aperture lies in the xy plane and can have arbitrary shape. The cavity can also have
arbitrary composition, but we assume a metallic boundary on all its sides except for
the aperture. The material within the volume is assumed to be inhomogeneous, and
unless otherwise noted, the exterior region is assumed to be free space. The aperture
can be either open or partially covered with an infinitesimally thin metallic patch.
There can be more than one cavity/aperture, but for the sake of simplicity, we
assume that all apertures lie in the z = 0 plane.

Figure 7.8 Illustration of a cavity recessed in a metallic ground plane (labels:
aperture, infinite ground plane, recessed cavity, base of cavity).

7.4.1 Formulation

The efficiency of this method lies in the fact that the only exposed (nonmetallic)
surface is the aperture. Since the cavity's other walls are metallic, for a total electric
field formulation, the boundary conditions on those walls require a vanishing
tangential electric field. The assumption of the aperture lying in an infinite metallic
ground plane allows further simplification. The tangential surface electric fields
($\mathbf{E}^{ext}$) in (7.11) short out over the entire ground plane except for the aperture.
Thus, the magnetic current ($-\hat{n} \times \mathbf{E}^{ext}$) has support only over the aperture (it also
vanishes over any surface patch lying within the aperture).
Normally, an electric current ($\hat{n} \times \mathbf{H}^{ext}$) would be required over the entire,
infinite ground plane in order to represent the metallic plane. However, a special
dyadic Green's function can be used in place of the usual free-space Green's
function, and since this Green's function satisfies the Neumann boundary condition, the
electric currents are no longer required. Hence, the Green's function enforces the
metallic plane boundary condition (without a need to introduce an equivalent
electric current) thereby reducing the number of unknowns. This dyadic Green's
function is an electric dyadic Green's function of the second kind [9], and for the case of
an infinite metallic plane, it can be derived using image theory. Since this Green's
function converts a surface tangential magnetic current to an exterior magnetic field,
image theory states that the Green's function is simply twice the free-space Green's
function

where both the source and test points lie in the xy plane. Note that (7.42) contains a
minus sign, but alternative representations without the minus sign may be used
accompanied with appropriate changes in the formulation. The evaluation of the
exterior magnetic field requires only magnetic currents rather than both electric and
magnetic currents, and the total magnetic field is given by

$$\hat{z} \times \mathbf{H}^{int} = \hat{z} \times \mathbf{H}^{ext} = \hat{z} \times (\mathbf{H}^i + \mathbf{H}^{refl}) - jk_0 Y_0 \int_S [\hat{z} \times \bar{\bar{G}}_{e2} \times \hat{z}] \cdot \mathbf{E}^{int}\, dS' \tag{7.43}$$

An important difference between (7.43) and (7.5) is that since the magnetic field in
the aperture can be completely represented by the tangential electric field in the
aperture, e.g., through the use of (7.43), we can substitute the surface magnetic
field integral into (7.3). That is, the FE-BI equation, assuming surface field
expansion terms identical to the volume expansion terms, is given by

$$\int_V \left[ \frac{(\nabla \times \mathbf{E}^{int}) \cdot (\nabla \times \mathbf{W}_i)}{\mu_r} - k_0^2 \epsilon_r \mathbf{E}^{int} \cdot \mathbf{W}_i \right] dV - k_0^2 \int_S \int_{S'} [\mathbf{W}_i \cdot \hat{n} \times \bar{\bar{G}}_{e2} \times \hat{n}' \cdot \mathbf{E}^{int}]\, dS'\, dS = f_i^{int} + f_i^{ext} \tag{7.44}$$

Hence, rather than having separate but coupled equations (7.19), we have a single
equation (7.44).

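The FE-BI system is then determined by substituting in suitable edge-based
expansion functions, as was done in the previous section. Doing so, we get

$$\sum_{j=1}^{N} E_j \left\{ \int_V \left[ \frac{(\nabla \times \mathbf{W}_i) \cdot (\nabla \times \mathbf{W}_j)}{\mu_r} - k_0^2 \epsilon_r \mathbf{W}_i \cdot \mathbf{W}_j \right] dV - k_0^2 \int_S \int_{S'} [\mathbf{W}_i \cdot \hat{n} \times \bar{\bar{G}}_{e2} \times \hat{n}' \cdot \mathbf{W}_j]\, dS'\, dS \right\} = f_i^{int} + f_i^{ext}, \qquad i = 1, 2, 3, \ldots, N \tag{7.45}$$

where we have assumed that the volume expansion functions reduce to the surface
expansion functions in the aperture, thus ensuring the enforcement of (7.10). Note
that in (7.45), the surface term is nonzero only for test and source edges that lie in the
aperture.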

7.4.2 Solution Using Brick Elements

Although the formulation presented above can be and has been implemented
using various different element volumes (e.g., bricks, prisms, or tetrahedrals) [2], [3],
[11], [12], the use of brick elements with uniform discretization allows for a
particularly efficient matrix solver: an iterative, FFT-based method. Bricks are attractive for
discretizing rectangular volumes since they are easy to implement and readily
conform to the cavity's dimensions. Bricks are not suitable for drum- or odd-shaped
cavities since these volumes introduce stair-casing and thereby reduce the accuracy
of the solution.
As stated above, if brick elements are used, then a particularly efficient
implementation can be achieved. Experience has shown that the boundary integral portion
of the FE-BI equation (7.44) dominates the computational cost for many
applications both in terms of memory and compute cycles. This is because the sub-matrix
associated with the boundary integral is fully populated and hence requires $O(N_{BI}^2)$
storage and compute cycles per iteration, where $N_{BI}$ is the number of unknowns
associated with the boundary integral.
However, in the case of brick elements, a more efficient implementation can be
used due to additional symmetry in the BI sub-matrix. Brick elements utilize a
uniform surface discretization, and in this case, the boundary integral terms depend on
the physical distance between unknowns in terms of rows and columns of the surface
grid. Hence, the boundary integral sub-matrix entries can be written as

(7.46)

where $[m, n]$ is the row and column (in the surface grid) of the test edge and $[m', n']$ is
the corresponding position of the source edge. In these terms, the argument to the
Green's function is the difference between the row $(m - m')$ and column $(n - n')$
of the source and test points. Hence, the sub-matrix that results from (7.46) is Block
Toeplitz. This particular matrix structure lends itself well to efficient solution.
Specific formulae for the boundary integral matrix entries are presented in the
appendix.

Iterative solution algorithms refine an initial guess at the solution until a preset
error threshold is satisfied. The major computational cost driver in any iterative
algorithm is the matrix-vector multiply. All iterative solvers require at least one
matrix-vector multiply per iteration (some require two) [23]. The FE-BI equations
(7.44) can be solved using a symmetric matrix iterative solver such as the biconjugate
gradient (BiCG) method. To apply this algorithm, it is useful to divide the matrix
into individual operations for the FE and BI portions, viz.

1 U?\302\273 = If
{^ci Lt\302\260] raj J i4c>\"
{\321\204 {/imi
{,/

where $[\mathcal{A}]$ represents the FE matrix and $[\mathcal{G}]$ denotes the boundary integral
sub-matrix. As suggested by (7.47), the matrix-vector multiply required by the BiCG
method can be performed as two separate operations. The FE matrix can be multiplied
by the search vector (see Chapter 9 for the BiCG algorithm) and the result retained in a
temporary vector. The BI sub-matrix can be independently multiplied by the portion
of the search vector associated with boundary integral unknowns (the remainder of
the search vector would be multiplied by null rows and is therefore omitted). This
result is then added to the FE product, and it is identical to the product vector
obtained if the entire matrix were multiplied by the search vector at one time.
However, since the costly matrix-vector operation has been partitioned into a FE
and BI portion, it can be optimized to exploit the sparseness of the FE matrix and
the Block Toeplitz structure of the BI sub-matrix, respectively.
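The partitioned product can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the FEMA-BRICK implementation: the FE matrix is held in a simple coordinate (row, column, value) list, the BI sub-matrix is kept dense here for clarity, and all names and numerical values are hypothetical.

```python
import numpy as np


def fe_bi_matvec(fe_rows, fe_cols, fe_vals, bi_block, bi_index, x):
    """Compute y = (FE matrix + embedded BI block) x as two separate products.

    fe_rows, fe_cols, fe_vals: nonzero FE entries in coordinate form.
    bi_block: dense BI sub-matrix over the aperture unknowns.
    bi_index: positions of the aperture unknowns in the global vector
              (assumed to contain no repeated indices).
    """
    x = np.asarray(x, dtype=complex)
    y = np.zeros(x.shape[0], dtype=complex)
    # Sparse FE product: only stored nonzero entries are visited.
    np.add.at(y, np.asarray(fe_rows), np.asarray(fe_vals) * x[np.asarray(fe_cols)])
    # BI product: only the aperture portion of the search vector is used.
    y[bi_index] += bi_block @ x[bi_index]
    return y


# Hypothetical 5-unknown system in which unknowns 3 and 4 lie in the aperture.
fe_rows = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4])
fe_cols = np.array([0, 1, 0, 1, 2, 3, 2, 3, 4])
fe_vals = np.array([4, -1, -1, 4, 3, -2, -2, 5, 6], dtype=complex)
bi_index = np.array([3, 4])
bi_block = np.array([[1 + 1j, 0.5], [0.5, 1 + 1j]])
x = np.arange(1, 6, dtype=complex)

print(fe_bi_matvec(fe_rows, fe_cols, fe_vals, bi_block, bi_index, x))
```

In an FFT-based solver the dense product in the last step is the part that gets replaced by a convolution, while the sparse FE step is left as is.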


Sparse matrix multiplies are very efficient because only nonzero entries are
considered and stored. In fact, since the FE matrix entries for a brick element are
regular, there is no need to store more than a handful of entries. Each brick has 12
edges, four aligned along each of three orthogonal directions (x, y, z). Since each matrix
entry represents an interaction, all possible interactions can be represented by 288
interactions: 144 (12 squared) for the first volume integral in (7.44) and another 144
for the second volume integral. Many of these potential interactions yield zeros, and
thus significantly fewer interactions need be computed or stored. However, in
practice, the logic required to determine significant and insignificant (null) interactions is
extensive and usually 288 complex memory locations per layer of the mesh are
allocated. Each layer is stored separately since the layer thickness need not be
constant from layer to layer and therefore the matrix entries will be different. However,
if the layers are of constant thickness (e.g., all bricks throughout the mesh are
identical), only a single set of 288 interactions need be computed and retained.
This is still insignificant compared to the storage cost of the actual FE interaction
matrix! Specific formulae for the brick FE entries are provided in the appendix.
The Block Toeplitz structure of the BI sub-matrix, $[\mathcal{G}]$, can be exploited to yield
impressive memory consumption and run-time efficiency. Careful inspection of
(7.46) indicates that all interactions can be represented by computing and storing
a single row of the interaction matrix and a portion of another row. For example, if
the unknowns were numbered with all x-directed edges first followed by all
y-directed edges (assuming the aperture lies in the z = 0 plane), then all interactions
between two x-directed aperture edges are represented by the first $N_x$ entries of
the matrix's first row. The next $N_y$ entries of the first row represent all interactions
between x-directed test edges and y-directed source edges. Finally, the last $N_y$ entries
of the row associated with the first y-directed edge represent all possible
interactions between y-directed edges. This is the minimal set of interactions that need be
computed, although in practice two full rows of the matrix may be computed and
stored to simplify the required logic. (The additional partial row represents all
interactions where the test edge is y-directed and the source edges are x-directed. For the
situation considered herein, these are identical to the interactions involving an
x-directed test edge and a y-directed source edge due to symmetry.)


These minimal interactions may be used in an iterative solver by utilizing the
matrix partitioning scheme given by (7.47). That is, the FE entries may be
represented by a sparse FE matrix, [𝒜], whether stored or computed on-the-fly, and the
BI interactions may be represented by [𝒢]. During the crucial matrix-vector multiply
(per iteration), a sparse matrix operation is used for the FE portion and an FFT-based
matrix-vector multiply is used for the BI portion. The FE matrix-vector multiply
requires only $O(N)$ operations while the BI matrix-vector multiply requires only
$O\{[M_{\max} \log_2(M_{\max})]^2\}$ operations, where $N$ denotes the number of unknowns for
the entire system and $M_{\max}$ is the maximum number of edges in either the x- or
y-direction. The following section presents a detailed description concerning the
implementation of an FFT-based matrix-vector multiply scheme.
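
As a minimal sketch of this split matrix-vector product (an illustration, not the authors' implementation), the Python/NumPy fragment below adds a sparse FE contribution to a BI contribution that acts only on the aperture unknowns. The CSR storage layout, the names `csr_matvec` and `fe_bi_matvec`, the index list `aperture_idx`, and the callback `bi_apply` are all assumptions introduced here; in a real code `bi_apply` would be an FFT-based product rather than the small dense block used for the demonstration.

```python
import numpy as np

def csr_matvec(data, indices, indptr, x):
    """Sparse (CSR) matrix-vector product; cost grows with the number of nonzeros."""
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows, dtype=complex)
    for row in range(n_rows):
        lo, hi = indptr[row], indptr[row + 1]
        y[row] = np.dot(data[lo:hi], x[indices[lo:hi]])
    return y

def fe_bi_matvec(fe_csr, bi_apply, aperture_idx, x):
    """Return (A + G) x, assuming G couples only the aperture unknowns."""
    y = csr_matvec(*fe_csr, x)                      # FE part: sparse
    y[aperture_idx] += bi_apply(x[aperture_idx])    # BI part: aperture edges only
    return y

# Hypothetical 3-unknown system: tridiagonal FE matrix, 2 aperture unknowns.
data = np.array([2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0])
indices = np.array([0, 1, 0, 1, 2, 1, 2])
indptr = np.array([0, 2, 5, 7])
g_dense = np.array([[1.0, 0.5], [0.5, 1.0]])        # stand-in for the BI block
aperture_idx = np.array([1, 2])
x = np.array([1.0, 2.0, 3.0])

y = fe_bi_matvec((data, indices, indptr), lambda v: g_dense @ v, aperture_idx, x)
print(y)
```

Keeping the BI operator behind a callback means the iterative solver never needs the dense BI block, only its action on a vector.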

7.4.3 FFT-Based Matrix-Vector Multiply Scheme

The BI matrix-vector multiply can be computed on the basis of the discrete,
linear convolution theory and the Fast Fourier Transform (FFT) [16], [19]. In the
following, the general coordinates (u, v) may be considered as (u = x, v = y) for the
planar case. As seen later, planar topologies are not the only cases that yield a Block
Toeplitz matrix structure, and therefore use of general coordinates simplifies the
extension of this presentation to other situations (e.g., apertures in an infinite,
metallic cylinder).
The BI integral equation (7.46) can be represented by the following four sets of
equations:

$$
\begin{aligned}
g_{uu}[t-t', s-s'] &= +k_0^2 \int_{S_e}\int_{S_{e'}} W_u(u',v')\,W_u(u,v)\,G^{uu}(u-u', v-v')\,du'\,dv'\,du\,dv \\
g_{uv}[t-t', s-s'] &= -k_0^2 \int_{S_e}\int_{S_{e'}} W_v(u',v')\,W_u(u,v)\,G^{uv}(u-u', v-v')\,du'\,dv'\,du\,dv \\
g_{vu}[t-t', s-s'] &= -k_0^2 \int_{S_e}\int_{S_{e'}} W_u(u',v')\,W_v(u,v)\,G^{vu}(u-u', v-v')\,du'\,dv'\,du\,dv \\
g_{vv}[t-t', s-s'] &= +k_0^2 \int_{S_e}\int_{S_{e'}} W_v(u',v')\,W_v(u,v)\,G^{vv}(u-u', v-v')\,du'\,dv'\,du\,dv
\end{aligned}
\tag{7.48}
$$

where $g_{uu}$ represents the u-u interactions (first $N_u$ entries of the first row of [𝒢]), $g_{vv}$
represents the v-v interactions (last $N_v$ interactions of the $(N_u + 1)$th row), and so on.
$W_{u,v}(u,v)$ are the edge-based testing/expansion functions, and $e$ refers to the test
element while $e'$ denotes the source element. $N_u$ is the number of grid points in the
u-direction while $N_v$ is the number of grid points in the v-direction, as displayed in Fig.
7.9.
[Figure: two collocated grids, panels (a) and (b), indexed by column s and row t.]

Figure 7.9 Illustration of collocated surface meshes. $N_u$ is the number of nodes in
the u-direction while $N_v$ is the number of nodes in the v-direction.

In (7.48), the row of the physical surface mesh is denoted by t while the column
of the physical surface mesh is indicated by s. For a mesh involving both u-directed
and v-directed edges, there are two different collocated meshes: one for u-directed
edges and one for v-directed edges, as shown in Fig. 7.9. Figure 7.9(a) corresponds to
the u-directed edges while Fig. 7.9(b) refers to the v-directed edges. Notice that there
are $N_v - 2$ rows and $N_u - 1$ columns for the unknowns on the u-directed edges. Also,
there are $N_v - 1$ rows and $N_u - 2$ columns for the unknowns on the v-directed edges.
This example illustrates our comment that there are two collocated mesh schemes
with different numbering conventions.
These two separate meshes are important in implementing an FFT-based
matrix-vector product. Since such a matrix-vector product relies upon the physical
distance between the test and source functions rather than their matrix position,
understanding the physical layout of the meshes permits proper filling of the data
arrays and the correct calculations of the matrix-vector product. All of these
comments are based upon the fact that each of the sub-matrices represented in (7.48) is
used in a matrix-vector product and those products are truncated, discrete, linear
convolutions, and hence amenable to the BiCG-FFT method.
To proceed, we define the two-dimensional discrete Fourier transform (DFT)
pair

$$
\begin{aligned}
\tilde f[p,q] &= \mathrm{DFT}\{f[t,s]\} = \sum_{t=0}^{M_1-1}\sum_{s=0}^{M_2-1} f[t,s]\, e^{-j2\pi\left(\frac{pt}{M_1}+\frac{qs}{M_2}\right)} \\
f[t,s] &= \mathrm{DFT}^{-1}\{\tilde f[p,q]\} = \frac{1}{M_1 M_2}\sum_{p=0}^{M_1-1}\sum_{q=0}^{M_2-1} \tilde f[p,q]\, e^{+j2\pi\left(\frac{pt}{M_1}+\frac{qs}{M_2}\right)}
\end{aligned}
\tag{7.49}
$$

Using (7.49), the convolutions in (7.48) can then be rewritten as

$$
\sum_{t'=0}\sum_{s'=0} E[t',s']\, g[t-t', s-s'] = \mathrm{DFT}^{-1}\{\mathrm{DFT}\{E[t,s]\} \bullet \mathrm{DFT}\{g[t,s]\}\}
\tag{7.50}
$$

where "•" indicates a Hadamard (i.e., term-by-term) product. The order of the
relevant DFTs must be $M_1 \ge 2(\text{number of rows}) - 1$ and $M_2 \ge 2(\text{number of
columns}) - 1$, where the number of rows and columns of the discretization may
vary with each convolution. For example, the first convolution in (7.48) is associated
with u-testing and u-source edges and hence the number of rows and columns is
$(N_v - 2)$ and $(N_u - 1)$, respectively. The field sequences are loaded into an $M_1 \times M_2$
array in row/column order of the field discretization, and the remaining entries form
a zero pad.
The Green's function sequence must be loaded into a similar array (in the same
manner), and periodic replication must be performed to provide the necessary
"negative lags." The data array (matrix) entries represented by (7.48) using the first
u-directed edge for testing and all of the u-directed edges for sources (e.g., part of the
first row of the matrix) represent all interactions where the source edge is to the right
and above the test edge, as shown in Fig. 7.9. The "negative lags" are situations
where the source edges are to the left and/or below the test edge.
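If the sequence has the property $g[t-t', s-s'] = g[t'-t, s'-s]$, then this
replication process takes the form (with $\tilde g$ denoting the replicated array)

$$
\tilde g[t,s] =
\begin{cases}
g[t,s], & 0 \le t \le \frac{M_1}{2}-1,\quad 0 \le s \le \frac{M_2}{2}-1 \\
g[M_1+2-t,\, s], & \frac{M_1}{2} \le t \le M_1-1,\quad 0 \le s \le \frac{M_2}{2}-1 \\
g[t,\, M_2+2-s], & 0 \le t \le \frac{M_1}{2}-1,\quad \frac{M_2}{2} \le s \le M_2-1 \\
g[M_1+2-t,\, M_2+2-s], & \frac{M_1}{2} \le t \le M_1-1,\quad \frac{M_2}{2} \le s \le M_2-1
\end{cases}
\tag{7.51}
$$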

The first group in (7.51) consists of cases where the source edge is to the right and
above the test edge. The second group represents the cases where the source edge is
to the right and below the test edge. The third group is for interactions where the
source edge is to the left and above the test edge. Finally, the last group represents
interactions when the source edge is to the left and below the test edge.
If such symmetry is not present, all possible lags must be computed, requiring
longer matrix build time since more than the first u-directed and v-directed edges
need be used as sources. Whether even symmetry exists or not depends on the specific
expansion functions and thus is implementation specific.
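
The zero padding and periodic replication described above can be sketched in a few lines of Python/NumPy. This is a hedged illustration rather than the book's code: it takes the general case in which every lag (positive and negative) is supplied explicitly, so no symmetry is assumed, and it uses zero-based indexing. The function names and the layout of `g_lags` (lag (0, 0) stored at the array center) are assumptions made for this example. A brute-force double sum checks the FFT result.

```python
import numpy as np

def toeplitz2d_matvec_fft(g_lags, x):
    """Compute y[t, s] = sum_{t', s'} g[t - t', s - s'] x[t', s'] with FFTs.

    g_lags has shape (2R-1, 2C-1); entry [R-1+dt, C-1+ds] holds lag (dt, ds).
    """
    R, C = x.shape
    M1, M2 = 2 * R - 1, 2 * C - 1            # smallest admissible DFT orders
    # Periodic replication: lag (0, 0) moves to index [0, 0] and the negative
    # lags wrap around to the end of each axis.
    g_wrapped = np.fft.ifftshift(g_lags)
    x_padded = np.zeros((M1, M2), dtype=complex)
    x_padded[:R, :C] = x                     # remaining entries form the zero pad
    y_full = np.fft.ifft2(np.fft.fft2(g_wrapped) * np.fft.fft2(x_padded))
    return y_full[:R, :C]                    # truncate to the physical grid

def toeplitz2d_matvec_direct(g_lags, x):
    """Reference O((R*C)^2) evaluation of the same truncated convolution."""
    R, C = x.shape
    y = np.zeros((R, C), dtype=complex)
    for t in range(R):
        for s in range(C):
            for tp in range(R):
                for sp in range(C):
                    y[t, s] += g_lags[R - 1 + t - tp, C - 1 + s - sp] * x[tp, sp]
    return y

rng = np.random.default_rng(0)
R, C = 3, 4
shape = (2 * R - 1, 2 * C - 1)
g_lags = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
x = rng.standard_normal((R, C))

y_fft = toeplitz2d_matvec_fft(g_lags, x)
y_ref = toeplitz2d_matvec_direct(g_lags, x)
print(np.max(np.abs(y_fft - y_ref)))
```

When the kernel has the even symmetry discussed above, only the non-negative lags need to be computed and the remaining entries of `g_lags` can be filled by reflection before calling the routine.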


Once the periodic arrays are loaded, the required matrix-vector product for
the uu-interactions may be performed in $O((M_1 \log M_1)(M_2 \log M_2))$ operations
rather than the $O(((N_u - 2)(N_v - 3))^2)$ operations required for a standard
matrix-vector product. The comparison is shown in Fig. 7.10 with
$M_1 = 2(N_v - 3)$, $M_2 = 2(N_u - 2)$ and $N_v = N_u = N$. Clearly, when the number of nodes
per side exceeds 10-15, the FFT-based matrix-vector product is more efficient than a
conventional matrix-vector product. In practice, the FFT-based product is more
efficient than a standard product in terms of wall clock time even for N < 10, since in
order to exploit the memory savings afforded by uniform zoning of a convolutional
kernel without using FFTs, additional overhead is incurred to match the appropriate
matrix entry with the correct search vector entries. Similar results are obtained for
the other convolutions in (7.48).
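
The two operation counts quoted above can be tabulated directly. The short Python sketch below evaluates both expressions with all hidden constants set to one and the logarithm taken base 2 (both are assumptions, since the order notation fixes neither), and reports the first N at which the FFT count drops below the direct count.

```python
import math

def direct_ops(n):
    """Standard matrix-vector product: ((N_u - 2)(N_v - 3))^2 with N_u = N_v = n."""
    return ((n - 2) * (n - 3)) ** 2

def fft_ops(n):
    """FFT-based product: (M1 log M1)(M2 log M2), M1 = 2(n - 3), M2 = 2(n - 2)."""
    m1 = 2 * (n - 3)
    m2 = 2 * (n - 2)
    return (m1 * math.log2(m1)) * (m2 * math.log2(m2))

def crossover(n_min=4, n_max=30):
    """Smallest n in [n_min, n_max] for which the FFT count is the lower one."""
    for n in range(n_min, n_max + 1):
        if fft_ops(n) < direct_ops(n):
            return n
    return None

n_cross = crossover()
for n in (5, 10, 15, 20, 30):
    print(n, direct_ops(n), round(fft_ops(n)))
print("crossover at N =", n_cross)
```

With unit constants the crossover falls inside the 10-15 range cited in the text; actual wall clock behavior depends on the implementation overheads mentioned above.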
The interested reader is referred to [16], [19], and [24] for additional details.

[Plot: operation count versus number of rows/columns in grid (N), for N from 0 to 30.]

Figure 7.10 Comparison of the operation count (complexity) for a traditional
matrix-vector product versus an FFT-based matrix-vector product. N
is the number of nodes in each direction of the surface mesh.

7.4.4 Examples

For the convenience and educational development of the reader, information
on obtaining a fully functional FE-BI brick program is provided at the end of this
chapter. This program, LM_BRICK (a.k.a. Low Memory Brick), utilizes the optimization
techniques described previously for brick element implementations of the FE-BI
method. These features include:

1. Automatic mesh generator for rectangular cavity-backed patch and slot

antennas.

2. Precomputation of only necessary FE interactions and a custom sparse


matrix-vector product that utilizes this minimal data set.
3. FFT-based matrix-vector product for the BI sub-matrix.

This program can compute the Radar Cross Section (RCS) of a cavity-backed
patch or slot antenna, the radiation and gain pattern of a probe-fed conformal patch
or slot antenna, and the input impedance of such antennas. Due to the efficient
implementation, a large, finite array of similar or dissimilar elements may be
modeled. Also, since the FE method is used in the cavity volume, the dielectric fill may be
inhomogeneous on a brick-by-brick basis.
For example, consider calculating the RCS attributed to a 4 cm × 3 cm patch
antenna recessed in an 8 cm × 6 cm × 0.1 cm cavity that is homogeneously filled with
dielectric ($\epsilon_r = 2.0$). This antenna is shown in Fig. 7.11. Several different
discretizations are used to illustrate the computational scaling associated with the FE-BI
method using a BiCG-FFT solver. All calculations are made on a Pentium 60-MHz
personal computer running Linux, and the iterative solver tolerance was set
at 0.01. The first case involves 0.5 cm × 0.5 cm × 0.1 cm bricks which resulted in 411
unknowns. A total run time of 0.38 hours was required to compute the RCS for this
problem at 201 different frequencies. For the same geometry, using
0.5 cm × 0.5 cm × 0.05 cm bricks, the number of unknowns was 932 and the run
time was 1.63 hours. When the grid cell size was 0.5 cm × 0.5 cm × 0.025 cm, the
number of unknowns was 1974 and the corresponding run time was 5.86 hours.
Hence, as the number of volume unknowns increases, the memory consumption
increases linearly, since the volume unknowns are associated with a sparse matrix,
whereas the solve time increases super-linearly (i.e., between linear and quadratic).

Figure 7.11 Cavity-backed patch antenna geometry used in the examples.
In the previous paragraph, the xy grid was kept constant (i.e., 0.5 cm × 0.5 cm)
as the brick height was varied. Let us now consider the situation as the brick cell
height is kept constant at 0.1 cm and the xy grid, which is relevant to the boundary
integral, is varied from 0.5 cm × 0.5 cm to 0.25 cm × 0.25 cm and then down to
0.125 cm × 0.125 cm. As noted above, the solve time for the 0.5 cm × 0.5 cm ×
0.1 cm grid cell size was 0.38 hours for 201 frequencies. In the case of a
0.25 cm × 0.25 cm × 0.1 cm grid cell size, the corresponding solve time is 2.22 hours
(with 1781 unknowns). Finally, for the smaller cell size of 0.125 cm
× 0.125 cm × 0.1 cm the number of unknowns grows to 7401 with a corresponding
CPU time of 14.5 hours. Examining the ratio of solve time to number of
unknowns, it is clear that boundary integral unknowns scale more favorably
than volume unknowns. For example, the cell grids of
0.5 cm × 0.5 cm × 0.025 cm and 0.25 cm × 0.25 cm × 0.1 cm have roughly the
same number of unknowns (1974 and 1781, respectively); however, when the increase
in unknowns occurred within the volume rather than the surface, the solve time was
more than twice as long (5.86 hours versus 2.22 hours)! The efficiency associated with
the boundary unknowns, even though they
lead to a fully populated matrix, is due to the use of an FFT-based solver and the
improved convergence of the solver for dense systems. If a more traditional
matrix-vector product is used for the boundary integral portion, it would result in
dramatically less favorable scaling. Figure 7.12 illustrates the computed RCS for
these latter three discretizations. Note that the RCS resonant frequency changes,
indicating that the increased surface unknowns are refining the solution, i.e.,
improving the estimate of the fields within the volume and on the aperture. In the cases
where the aperture discretization was held constant but the volume was subdivided
into thinner layers, no appreciable change was observed in the radar cross-section
computations. Hence, the thin thickness of the cavity was sufficiently sampled
using a single layer of elements whereas increased aperture discretization improved
accuracy.
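
The run times quoted above can be reduced to empirical scaling exponents. The Python sketch below assumes a simple power-law model, time ∝ N^p, between consecutive discretizations (the model and the function name `scaling_exponents` are assumptions made for illustration; only the unknown counts and run times come from the text).

```python
import math

def scaling_exponents(unknowns, hours):
    """Exponent p of time ~ N**p between each pair of consecutive runs."""
    return [
        math.log(hours[i + 1] / hours[i]) / math.log(unknowns[i + 1] / unknowns[i])
        for i in range(len(unknowns) - 1)
    ]

# Brick height refined (volume unknowns grow), xy grid fixed at 0.5 cm x 0.5 cm.
volume = scaling_exponents([411, 932, 1974], [0.38, 1.63, 5.86])
# xy grid refined (aperture unknowns grow), brick height fixed at 0.1 cm.
surface = scaling_exponents([411, 1781, 7401], [0.38, 2.22, 14.5])

print("volume refinement exponents: ", [round(p, 2) for p in volume])
print("surface refinement exponents:", [round(p, 2) for p in surface])
```

Under this model the volume refinements give exponents between one and two, consistent with the "between linear and quadratic" remark, while the surface refinements give smaller exponents, consistent with the more favorable scaling of the boundary integral unknowns.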

[Plot: RCS versus frequency, 3.0 to 5.0 GHz, for cell sizes 0.5 × 0.5 × 0.1 cm,
0.25 × 0.25 × 0.1 cm, and 0.125 × 0.125 × 0.1 cm.]

Figure 7.12 RCS as a function of frequency for a 4 cm × 3 cm patch antenna printed
on an 8 cm × 6 cm × 0.1 cm cavity filled with a dielectric having $\epsilon_r = 2$;
three different volume discretizations are shown.

To illustrate the accuracy of the FE-BI method, the radar cross section of a
6 cm × 5 cm × 2 cm cavity with a 3 cm × 2 cm aperture was computed using the
FE-BI program cited above and a method of moments (MoM) program for the same
geometry. The cavity was assumed to be filled with a material having a dielectric
constant of $\epsilon_r = 2.17$. Figure 7.13 illustrates the comparison for normal incidence as
a function of frequency and polarization (the θθ-pol denotes the case where the
electric field has no φ component).
[Plot: RCS versus frequency, 3.0 to 5.0 GHz, comparing FEM and MoM results by polarization.]

Figure 7.13 Comparison between the FE-BI method and the method of moments
for computing the radar cross section of a 6 cm × 5 cm × 2 cm cavity
with a 3 cm × 2 cm slot aperture. [MoM data are courtesy of James T.
Aberle, 1997.]

Another example of the use of a planar FE-BI computer program involves the
design and analysis of finite conformal antenna arrays. The FE-BI method described
above, where brick elements are used for subdividing the cavity volume and the FFT
is used to handle the Block Toeplitz matrices, allows for the simulation of rather
large antenna arrays on a modest computer. Figure 7.14 illustrates the gain pattern
of a 5 × 5 patch antenna array. Each antenna element was similar to the one shown
in Fig. 7.11 except that the dielectric constant of the substrate was $\epsilon_r = 13.9$ and the
center-to-center spacing was 10 cm in each direction. The pattern at 2.45 GHz
was allowed to radiate broadside and steered 30 degrees from broadside using
standard beam steering techniques. Note that when the FE-BI method is used to simulate
this array, all mutual coupling is included in the solution, thus increasing the fidelity
of the model. This example was run on a Silicon Graphics workstation using
approximately 14 MB of RAM. It involved 10,275 unknowns and took
approximately seven minutes to compute each pattern!

[Plot: pattern versus θ, from -90 to 90 degrees.]

Figure 7.14 Radiation pattern of a 5 × 5 patch antenna array for broadside and 30
degrees off broadside. [Courtesy of Jeffrey Tackell, 1997.]

7.4.5 Aperture in a Thick Metallic Plane

A variation of the planar cavity-backed geometry considered above is the case
where the lower metallic surface is removed. The resulting geometry has two
apertures: one in the upper metallic plane and one in the lower metallic plane. This
geometry, which resembles an aperture in a thick metallic plane, is shown in Fig.
7.15. For this case, the lower aperture results in an additional BI integral resulting in
the FE-BI equation

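$$
\begin{aligned}
\int_V \left[ \frac{1}{\mu_r}\,\nabla\times\mathbf{E}^{\mathrm{int}}\cdot\nabla\times\mathbf{W}_i - k_0^2\,\epsilon_r\,\mathbf{E}^{\mathrm{int}}\cdot\mathbf{W}_i \right] dV
&+ k_0^2 \int_{S^+}\!\int_{S'^+} \left[ \mathbf{W}_i\times\hat{z}\cdot\bar{\bar{G}}_2\cdot\hat{z}\times\mathbf{E}^{\mathrm{ext}} \right] dS'\,dS \\
&= -k_0^2 \int_{S^-}\!\int_{S'^-} \left[ \mathbf{W}_i\times\hat{z}\cdot\bar{\bar{G}}_2\cdot\hat{z}\times\mathbf{E}^{\mathrm{ext}} \right] dS'\,dS + f_i^{\mathrm{int}} + f_i^{\mathrm{ext}}
\end{aligned}
\tag{7.52}
$$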

where $S^+$ and $S^-$ indicate the upper and lower apertures, respectively.
Solution of (7.52) parallels that of (7.44) except that two boundary integrals are
involved. As is the case with (7.44), (7.52) can be discretized using brick elements and
an FFT-based matrix-vector product scheme to yield maximum efficiency. Note that
the intra-aperture interactions possess the discrete convolutional property required
for FFT solution. There are no inter-aperture interactions since the boundary
integrals for each aperture are decoupled. Also, since brick elements are used, the
same boundary matrix can be used for both apertures except for different
geometrical parameters as appropriate.

Figure 7.15 A thick metallic plane with apertures in the z = 0 and z = -t planes.

Figure 7.16 illustrates the transmission coefficient as a function of aperture
length (A/λ) while the other dimension of the aperture, B, is held fixed. This
example illustrates the capability of the FE-BI method to compute the transmission
coefficient for an aperture in an increasingly thick plane, a difficult problem for the
more traditional method of moments since such a solution approach would require
the computationally expensive cavity Green's function [27].

[Plot: transmission coefficient versus aperture length (A/λ), 0.2 to 1.0, with curves
for 0.01λ, 0.10λ, and 0.25λ.]

Figure 7.16 Transmission coefficient as a function of aperture length for normal
incidence. The circles correspond to data for a thin conducting plane presented in [25].
[After Jin and Volakis [26], © IEEE.]

The interested reader is referred to [25] and [26] for further details concerning
the application of the FE-BI method to transmission problems.

7.5 CAVITY-BACKED ANTENNAS ON A CIRCULAR CYLINDER

A conformal antenna geometry, analogous to the case of a metallic ground plane,
involves an infinite metallic circular cylinder. An example of one such antenna is
shown in Fig. 7.17 where the patch elements are printed on a cavity-backed dielectric
substrate that is recessed in the cylinder.

Figure 7.17 Illustration of four conformal patch antennas mounted flush with the
surface of a metallic cylinder.

In this case, (7.44) may be discretized using
cylindrical shell elements similar to the brick. Cylindrical shell elements possess both
geometrical fidelity and simplicity for cylindrical-rectangular cavities. Figure 7.18
illustrates a typical shell element which has eight nodes connected by 12 edges: four
edges aligned along each of the three orthogonal directions of the cylindrical
coordinate system. Each element is associated with 12 vector shape functions given by

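$$
\begin{aligned}
\mathbf{W}_{12}(\rho,\phi,z) &= \mathbf{W}_\rho(\rho,\phi,z;\,\cdot,\phi_r,z_t,+), &
\mathbf{W}_{43}(\rho,\phi,z) &= \mathbf{W}_\rho(\rho,\phi,z;\,\cdot,\phi_l,z_t,-) \\
\mathbf{W}_{56}(\rho,\phi,z) &= \mathbf{W}_\rho(\rho,\phi,z;\,\cdot,\phi_r,z_b,-), &
\mathbf{W}_{87}(\rho,\phi,z) &= \mathbf{W}_\rho(\rho,\phi,z;\,\cdot,\phi_l,z_b,+) \\
\mathbf{W}_{14}(\rho,\phi,z) &= \mathbf{W}_\phi(\rho,\phi,z;\,\rho_b,\cdot,z_t,+), &
\mathbf{W}_{23}(\rho,\phi,z) &= \mathbf{W}_\phi(\rho,\phi,z;\,\rho_a,\cdot,z_t,-) \\
\mathbf{W}_{58}(\rho,\phi,z) &= \mathbf{W}_\phi(\rho,\phi,z;\,\rho_b,\cdot,z_b,-), &
\mathbf{W}_{67}(\rho,\phi,z) &= \mathbf{W}_\phi(\rho,\phi,z;\,\rho_a,\cdot,z_b,+) \\
\mathbf{W}_{15}(\rho,\phi,z) &= \mathbf{W}_z(\rho,\phi,z;\,\rho_b,\phi_r,\cdot,+), &
\mathbf{W}_{26}(\rho,\phi,z) &= \mathbf{W}_z(\rho,\phi,z;\,\rho_a,\phi_r,\cdot,-) \\
\mathbf{W}_{48}(\rho,\phi,z) &= \mathbf{W}_z(\rho,\phi,z;\,\rho_b,\phi_l,\cdot,-), &
\mathbf{W}_{37}(\rho,\phi,z) &= \mathbf{W}_z(\rho,\phi,z;\,\rho_a,\phi_l,\cdot,+)
\end{aligned}
\tag{7.53}
$$

where $\mathbf{W}_{lk}$ is associated with the edge which is delimited by local nodes (l, k) as
shown in Fig. 7.18 and $(\rho, \phi, z)$ denote the cylindrical coordinates. As can be inferred
from (7.53), three fundamental vector weight functions are required for the complete
representation of the shell element. They are

$$
\begin{aligned}
\mathbf{W}_\rho(\rho,\phi,z;\,\tilde\rho,\tilde\phi,\tilde z,\tilde s) &= \frac{\tilde s\,\rho_b}{\alpha h\,\rho}\,(\phi-\tilde\phi)(z-\tilde z)\,\hat\rho \\
\mathbf{W}_\phi(\rho,\phi,z;\,\tilde\rho,\tilde\phi,\tilde z,\tilde s) &= \frac{\tilde s}{t h}\,(\rho-\tilde\rho)(z-\tilde z)\,\hat\phi \\
\mathbf{W}_z(\rho,\phi,z;\,\tilde\rho,\tilde\phi,\tilde z,\tilde s) &= \frac{\tilde s}{t \alpha}\,(\rho-\tilde\rho)(\phi-\tilde\phi)\,\hat z
\end{aligned}
\tag{7.54}
$$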
where the element parameters $(\rho_a, \rho_b, \phi_l, \phi_r, z_b, z_t)$ are defined in Fig. 7.18,
$t = \rho_b - \rho_a$, $\alpha = \phi_r - \phi_l$, and $h = z_t - z_b$. Each local edge is distinguished by
$\tilde\rho$, $\tilde\phi$, $\tilde z$, and $\tilde s$ as given in (7.53). The $\rho^{-1}$ term which appears in the
definition of the ρ-directed weight (7.54) is essential in satisfying the divergence-free
requirement, i.e., so that $\nabla\cdot\mathbf{W}_j = 0$. Note that as the radius of the cylinder becomes
large, the curvature of these elements decreases, resulting in weight functions which are
functionally similar to the bricks presented by Jin and Volakis [26].

Figure 7.18 Cylindrical shell element.


In addition to the use of cylindrical shell elements, the dyadic Green's function
in (7.44) must also be changed to account for the metallic boundary condition on the
cylindrical metallic surface. Specifically, the dyadic Green's function $\bar{\bar{G}}_2$ must now
satisfy both the radiation condition and the Neumann boundary condition at $\rho = a$

$$
\hat\rho \times \nabla \times \bar{\bar{G}}_2(a, \phi, z;\, a, \phi', z') = 0
\tag{7.55}
$$

This dyadic Green's function may be expressed exactly as an eigenfunction series [9]

[Equation (7.56): eigenfunction series for the components of $\bar{\bar{G}}_2$; the expression is
not legible in this copy. See [9].]

where $G^{\phi z}(a, \bar\phi, \bar z) = G^{z\phi}(a, \bar\phi, \bar z)$, $\gamma = k_\rho a$ and $k_\rho = \sqrt{k_0^2 - k_z^2}$.

A different representation of this Green's function was used in [28]. In this,
(7.56) is converted into a creeping wave series using Watson's transformation. The
resulting series converges in two terms or less for large radius cylinders, and hence it
is more efficient than (7.56) as the radius increases. For reference, (7.56) typically
requires approximately $2k_0a$ series terms where $k_0$ is the wavenumber in free space
and $a$ is the radius of the cylinder. For large radius cylinders (greater than 5λ or so),
60 series terms or more for each component in (7.56) are required for each
combination of test and source edges! The creeping wave series expansion for the cylinder
Green's function is presented in [29].
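
As a quick check of the term count quoted above, take the rule of thumb of roughly $2k_0a$ terms and a radius of $a = 5\lambda$ (the free-space relation $k_0 = 2\pi/\lambda$ is the only ingredient added here):

$$
2k_0a = 2\left(\frac{2\pi}{\lambda}\right)(5\lambda) = 20\pi \approx 63
$$

which agrees with the estimate of 60 or more series terms per component.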
Replacing the Green's function components in (7.44) with (7.56), the fields
associated with a conformal cavity recessed in a metallic cylinder can be solved in
the same manner as the planar metallic ground plane example given previously. This
includes the use of an efficient FFT-based solver since the boundary integral
sub-matrix is Block Toeplitz provided the surface is discretized using uniform patches
(e.g., cylindrical shell elements).

7.5.1 Examples

The FE-BI method has been applied to scattering and radiation by cavities
recessed in an infinite metallic cylinder. Consider the scattering by a cavity-backed
patch antenna recessed in a circular cylinder. The patch is 3 cm × 2 cm and is placed
on top of a dielectric filled cavity that is 6 cm × 5 cm × 0.07874 cm. The dielectric
constant of the substrate is $\epsilon_r = 2.17$, and the antenna is flush mounted with the
cylinder surface. One of the strengths of the FE-BI method for singly curved
conformal antennas is its ability to investigate the effects of curvature on the RCS of a
patch such as the one shown in Fig. 7.17. The RCS for various cylinder radii
(denoted as "a") is shown in Fig. 7.19.

[Plot: RCS versus frequency, 4.0 to 6.0 GHz, for several cylinder radii a and the planar case.]

Figure 7.19 Radar cross section of a conformal patch antenna for transverse
magnetic (TM) polarization. The various curves correspond to different
cylinder radii, and the antenna is flush mounted to the surface of the
cylinder.

One antenna parameter expected to be sensitive to curvature is the input
impedance. This is because the input impedance is strongly influenced by the modal fields
within the cavity, and for certain excitations (polarizations), these fields are strongly
dependent on curvature. One such polarization is obtained by feeding the patch
antenna parallel to the cylinder axis (e.g., if the patch is centered on $(\phi, z) = (0, 0)$,
this feed is along the z-axis). Figure 7.20 is a Smith Chart representation of the input
impedance for a 3.5 cm × 3.5 cm patch antenna on different cylinder curvatures. The
input impedance is strongly influenced by the curvature of the cavity/antenna.

[Smith chart with curves for a = 10 cm, 14.95 cm, 20 cm, and 200 cm.]

Figure 7.20 Smith chart for a 3.5 cm × 3.5 cm patch antenna for frequencies
between 2.4 and 2.7 GHz. The cylinder radius varied between 10 cm
and quasi-planar (200 cm).

7.6 RECENT ADVANCES IN THE FE-BI METHOD

The material presented thus far in this chapter was developed during the late 1980s
and early 1990s. Recently published results have extended the flexibility of the
FE-BI method as more research groups have applied the method to a greater variety of
problems. The examples presented previously in this chapter involved a
cavity-backed inhomogeneous region recessed in either a ground plane or an infinite
metallic cylinder. The principal reason for these restrictions involves the computational
cost of the boundary integral. For the cases considered so far, boundary integral
solvers are available which are efficient in terms of both memory consumption and
compute cycle requirements.
Below we mention other extensions of the FE-BI method. Among them is the
finite element-periodic moment method (FE-PMM) approach for the simulation of
infinitely periodic structures with unprecedented flexibility. Also, recent papers have
introduced nonplanar, noncylindrical boundaries in the implementation of the
FE-BI. Further, techniques such as the surface of revolution (SOR) and fast multipole
method (FMM) offer increased flexibility with moderate cost. An overview of recent
publications follows. The interested reader should consult the cited papers for
implementation details.

7.6.1 Finite Element-Periodic Method of Moments

An extension of the FE-BI method was proposed by McGrath and Pyati [30]
and Lucas and Fontana [31] to investigate phased antenna arrays. In this, the finite
element method is used to model an antenna element that forms the unit cell of an
infinitely periodic array. Traditionally, periodic moment method (PMM)
formulations have been used to investigate the performance of infinitely periodic antennas.
However, the use of pure integral equation formulations limits the utility of the PMM
approach to layered substrates and simpler antenna elements which can be easily
modeled using the method of moments.

The flexibility of the finite element method permits the antenna element to be
constructed on arbitrary materials and to have arbitrary shape. Rather than using
one of the integral equations introduced at the beginning of this chapter, the finite
element-periodic moment method (FE-PMM) utilizes a periodic integral equation.
Specifically, Floquet modes are used to periodically replicate the boundary and
radiation conditions imposed on the unit cell. Note that in this application, periodic
boundary conditions are not only applied to the aperture of the antenna, but also on
all nonmetallic sides of the unit cell.
An example of the use of the FE-PMM method is illustrated in Fig. 7.21 where
a notch radiator is immersed within the unit cell mesh. This is an example of the type
of detail readily modeled via the finite element method. Figure 7.22 illustrates the
E-plane active reflection coefficient for an infinite array of such radiators. The H-plane
active reflection coefficient for that same array is given in Fig. 7.23.

Figure 7.21 Finite element mesh for flared notch antenna: (a) unit cell illustrating
coax aperture; (b) substrate surface mesh. [After McGrath [32].]

[Plot: curves at 3.0, 4.0, and 5.0 GHz versus scan angle θ (deg), E plane.]

Figure 7.22 E-plane active reflection coefficient versus scan angle for the notch
antenna shown in Fig. 7.21. [After McGrath [32].]

Recently, the FE-PMM method has been applied to a variety of additional
applications such as reinforced concrete, artificial dielectrics, and pyramidal cone
foam absorber [33]. The FE-PMM analysis has also been used to study bandgap
materials [34].

[Plot: curves at 3.0, 4.0, and 5.0 GHz versus scan angle θ (deg), H plane.]

Figure 7.23 H-plane active reflection coefficient versus scan angle for the notch
antenna shown in Fig. 7.21. [After McGrath [32].]

7.6.2 Finite Element-Surface of Revolution Method

At the beginning of this chapter, the most general form of the finite
element-boundary integral formulation was presented. As noted, the BI consumes much of
the CPU and memory requirements unless special techniques are used such as the
BiCG-FFT. This is, of course, a specialized implementation of the BI since it is only
applicable to planar or cylindrical apertures. Other specialized finite
element-boundary integral formulations have been presented in the literature, and these offer a
practical solution methodology for an important class of problems. One approach
couples the interior finite element solution with an external region that is a surface of
revolution (SOR). This special boundary permits the efficient enforcement of the
boundary and radiation conditions and can be used to simulate several cubic
wavelength geometries. The major uses of this finite element-surface of revolution
(FE-SOR) method are to compute the scattering by complex objects and the radiation by
axisymmetric antennas.
The FE-SOR method has been considered by Boyse and Seidl [35] for
scattering by nearly axisymmetric bodies. Their formulation involved the use of a finite
element expansion of the fields within the SOR and an eigenfunction series
expansion in the azimuthal direction for the boundary integral. They used node-based
tetrahedral elements within the mesh and a Fourier modal-azimuthal expansion
utilizing Hermite polynomials.
A group at the Jet Propulsion Laboratory (JPL) has also published a series of
articles [36], [37], [38], [39] detailing their hybrid FE-SOR formulation. In their
implementation, vector edge finite element expansion functions are used rather
than the node-based elements in [35]. Also, dissimilar boundary integral basis
functions are used and the finite element and boundary integral regions are coupled using
the method presented in [36]. Recently Zuffada, Cwik, and Jamnejad presented an
analysis of a circular waveguide antenna with a choke collar [39]. In this, they used a
magnetic field finite element formulation

[Equation (7.57) is not legible in this copy; see [39].]

and a combined field integral equation

[Equation (7.58) is not legible in this copy; see [39].]

where the integro-differential operators $Z_M$ and $Z_J$ are those used by the body of
revolution (BOR) integral equation formulation [40]. Also in [39], the essential
boundary conditions are enforced using

[Equation (7.59) is not legible in this copy; see [39].]

and tetrahedral finite elements are used to discretize (7.57). Finally, the SOR basis
functions are given by

[Equation (7.60) is not legible in this copy; see [39].]

where $T_k(t)$ is a triangle function spanning the kth annulus on the SOR. The
variables $t$ and $\phi$ refer to the local SOR coordinates, and $n$ refers to the mode in the
Fourier series expansion.
As an example of the implementation in [39], Fig. 7.24 illustrates a circular
waveguide antenna with a choke collar. The H-plane pattern for this antenna (see
Fig. 7.25) was computed using the FE-SOR method to determine the fields in the
shaded region shown in Fig. 7.24.

Figure 7.24 JPL circular waveguide with choke ring. The FE-SOR method was used
to model the shaded area. [After Zuffada, Cwik, and Jamnejad [39], © IEEE, 1997.]

[Plot: H-plane pattern versus θ, 0 to 180 degrees; circular waveguide with choke, ka = 2.4.]

Figure 7.25 H-plane pattern for the choked circular waveguide shown in Fig. 7.24.
[After Zuffada, Cwik, and Jamnejad [39], © IEEE, 1997.]

We note that in [39], a mode matching technique was used to model the feed.
This is in effect a separate integral equation applied across the feed aperture and
illustrates an additional application for the FE-BI method. When using the mode
matching technique, the sum of the modes synthesizes the required Green's function
used to represent the feed structure as discussed previously.

7.6.3 Fast Integral Solution Methods

Recent advances of the FE-BI technique for three-dimensional geometries
involve the use of fast integral equation solution methods to speed up the computation
of the BI matrix-vector products in the iterative solver. Among them, the fast
multipole method (FMM) has been considered for implementing the boundary integral
in two-dimensional [41] and three-dimensional FE-BI formulations [42]. This particular
formulation is suitable for planar cavity-backed structures (it is based on the same
approach as was presented earlier in this chapter), and the authors demonstrated a
speed-up in solve time for large aperture problems. The requirement for a large
aperture, hence a large separation between boundary groups, is inherent in the
FMM. Details on the theory and implementation of the FMM are given in
Chapter 8.
In the FMM, the boundary integral unknowns are grouped together and the
interaction between groups is computed rather than the interaction between
individual unknowns. These group interactions are then disaggregated to provide the
necessary interactions. The resulting order of operations and memory is $O(N^{\alpha})$
rather than $O(N^2)$, where $\alpha < 1.5$. Hence, an improvement in the solution efficiency
is realized provided the distance between groups is large. If this condition is not met,
accuracy issues arise since the interactions of elements can no longer be approximated
via FMM grouping. Also, the overhead associated with grouping the boundary
unknowns and then disaggregating them can be detrimental to the method's
efficiency, but this is problem dependent.
Another fast integral method, referred to as the adaptive integral method
(AIM), was recently introduced by Bleszynski et al. [43]. AIM is also an $O(N^{1.5})$
method, and its improved speed is based on mapping the boundary integral mesh onto
a uniform grid so that the $O(N \log N)$ FFT can be used for carrying out the
matrix-vector products. Research on fast integral methods is currently very active, and the
reader should consult future publications on the development and application of
these techniques.
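
The FFT-based matrix-vector product mentioned above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the AIM implementation of [43]: it assumes the boundary matrix is exactly Toeplitz (unknowns on a uniform one-dimensional grid), embeds it in a circulant matrix, and applies it with a small radix-2 FFT written here so the block is self-contained. All function names are hypothetical.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def toeplitz_matvec_fft(first_col, first_row, x):
    """Return T x, where T[i][j] = first_col[i-j] for i >= j, else first_row[j-i]."""
    n = len(x)
    m = 1
    while m < 2 * n:          # pad the circulant size to a power of two
        m *= 2
    # First column of the circulant embedding: [c0..c_{n-1}, zeros, r_{n-1}..r_1]
    circ = list(first_col) + [0j] * (m - 2 * n + 1) + list(reversed(first_row[1:]))
    padded = list(x) + [0j] * (m - n)
    spectrum = [a * b for a, b in zip(fft(circ), fft(padded))]
    y = [v / m for v in fft(spectrum, inverse=True)]   # normalize the inverse FFT
    return y[:n]

def toeplitz_matvec_dense(first_col, first_row, x):
    """Reference O(n^2) product, used only to check the FFT version."""
    n = len(x)
    return [sum((first_col[i - j] if i >= j else first_row[j - i]) * x[j]
                for j in range(n)) for i in range(n)]
```

The dense routine costs $O(N^2)$ operations, whereas the FFT route costs $O(N \log N)$; only the first row and first column of the matrix need to be stored.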

APPENDIX 1: EXPLICIT FORMULAS FOR BRICK ELEMENTS

The FE-BI equation for a cavity recessed in a metallic plane is given by (7.45) and is
reproduced here for convenience:

-*o[ f [WrzxGr2xz-Wy]</SVS}
J5 h' J

=/ !nl +/\320\223. / = 1,2,3 n G.61)


As suggested in (7.47), the linear system (7.61) can be written in matrix form as

{if) T[Q\\
[0]] {?}bi} _
~ {/=\302\253\342\200\242}
-
1 J ( }
(?f) [[0] [0]J (?*} {/\342\200\242?\"}

where [A] represents the FE matrix and [Q] denotes the boundary integral
sub-matrix. In this appendix, we give explicit formulas for the [A] matrix entries and
formulas that permit numerical evaluation of [Q]. We begin with [A] for anisotropic
media. The corresponding formulae for isotropic media are presented in Section 5.4.

Element Matrices for Brick Elements: Anisotropic Media

The unknown field, E, can be written in terms of three components corresponding
to edges lying parallel to the x-, y-, and z-axes, e.g.,

Ev
= G.63)
^\320\225\320\223\320\260\320\235\320\251\321\205,\321\203.=)

where $E_{dj}$ are the unknown field expansion coefficients associated with the jth local
edge, which is parallel to the d-axis, where d = {x, y, z}, and $W_{dj}$ are the expansion
functions associated with each local edge. In (7.63), there are four field expansions
per brick for each component, corresponding to the 12 edges of the brick. The
superscript e denotes the global element number, and the index j denotes the local edge
numbers. The local edges are defined as follows:

edge1  = node1 → node2      edge2  = node4 → node3
edge3  = node5 → node6      edge4  = node8 → node7
edge5  = node1 → node4      edge6  = node5 → node8
edge7  = node2 → node3      edge8  = node6 → node7
edge9  = node1 → node5      edge10 = node2 → node6
edge11 = node4 → node8      edge12 = node3 → node7      (7.64)

Figure 7.26 illustrates this brick element, the local node numbers used in (7.64), and
the edge lengths, $\{h_x, h_y, h_z\}$.

Figure 7.26 Brick element. The numbers indicate the local node numbers.
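
The edge definitions of (7.64) map naturally onto a small lookup table. The sketch below is illustrative only: the edge-to-node pairs are as read from (7.64), while the node coordinates are an assumption (nodes 1-4 counterclockwise on the bottom face, nodes 5-8 directly above them), since Fig. 7.26 is not reproduced here. The function names are hypothetical.

```python
# Local edge -> (start node, end node), as read from (7.64).
BRICK_EDGES = {
    1: (1, 2), 2: (4, 3), 3: (5, 6), 4: (8, 7),      # expected x-directed
    5: (1, 4), 6: (5, 8), 7: (2, 3), 8: (6, 7),      # expected y-directed
    9: (1, 5), 10: (2, 6), 11: (4, 8), 12: (3, 7),   # expected z-directed
}

def brick_node_coordinates(hx, hy, hz):
    """ASSUMED node layout: nodes 1-4 on z = 0 (counterclockwise), 5-8 on z = hz."""
    return {
        1: (0.0, 0.0, 0.0), 2: (hx, 0.0, 0.0), 3: (hx, hy, 0.0), 4: (0.0, hy, 0.0),
        5: (0.0, 0.0, hz), 6: (hx, 0.0, hz), 7: (hx, hy, hz), 8: (0.0, hy, hz),
    }

def edge_direction(edge, nodes):
    """Return (axis, signed length) of a local edge; axis is 'x', 'y', or 'z'."""
    start, end = BRICK_EDGES[edge]
    delta = [b - a for a, b in zip(nodes[start], nodes[end])]
    nonzero = [i for i, d in enumerate(delta) if d != 0.0]
    if len(nonzero) != 1:
        raise ValueError("edge %d is not parallel to a coordinate axis" % edge)
    return "xyz"[nonzero[0]], delta[nonzero[0]]
```

Under the assumed node layout, edges 1-4, 5-8, and 9-12 come out parallel to the x-, y-, and z-axes, respectively, which matches the statement that each field component has four expansion functions per brick.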

As mentioned in Chapter 7, for brick elements it is sufficient to compute and
store a 12 × 12 matrix for each layer of the mesh that corresponds to the first term of
(7.61), which for anisotropic media is written as

Vs)]dVe G.65)

A second set of 12 × 12 data arrays for each layer of the mesh is required for the
second integral in (7.61), which for anisotropic media becomes

(7.66)

The material parameters are now tensors and are given by

Mxv (J-xy M

G.67)
fi-zy fizz .

(7.68)

The interactions associated with (7.65) can be written as

, \320\251\320\230\321\203\321\203
+
^?[\320\220],+^\320\2251\320\233\320\237\320\267'

/<\302\246?,.-
^\321\202\320\260\320\263

Kfirx
= +
^[/f]4

[ ]\320\267 G.69)

For (7.69), the values for the brick element matrices are given by

2 -2 1 -1 2 -2 _j

2 2 -1 1 1 \302\273
-1 -2

1 -1 2 -2 2 - 1 2 1

1 1 -2 2 1 -2 1 2

1 -1 1 1 1 I 1 -1

1 1 -1 -1 1 -1 1
G.70)
1 -1 1 1 1 - 1 -1

1 1 -1 -1 1 1 -1 1

2 1 -2 -1 1 1 -1 -1

2 -1 2 1 1 1 -1 -1
[4s =
1 2 -1 -2 1 \342\200\224
1 1 1
1 -2 1 2 1 _ 1 1 1

The reader is cautioned that these matrices are not the same as the ones used for
isotropic media (see Chapter 5).
The element matrix terms for anisotropic dielectric media are given by

B) ~ fM B) ~
1\320\251\320\226\320\265\321\205\321\205 hxhyhUx>, l2) _
~ \320\254'\321\205\320\251\320\265\320\245;
\321\202
x y \320\273\320\263
36 24 \320\2424\342\200\224

G.71,
24 36 24

<21 _ \342\200\224
l~\" \302\246\"*'\342\200\242'\302\246 l l Jl
24 24 36

where $[L]_1$ is given in Chapter 5 and $[L]_2$ is given by

$$[L]_2 = \begin{bmatrix} 2 & 1 & 2 & 1 \\ 2 & 1 & 2 & 1 \\ 1 & 2 & 1 & 2 \\ 1 & 2 & 1 & 2 \end{bmatrix} \qquad (7.72)$$

Boundary Integral Contribution

The boundary integral present in (7.61) can be written as

/Bl = - t f x W,)
K\302\253
-
BS0)
\342\200\242
(i x W,)] dS' dS
Jsis
/Bl =^ -2 f f [W, \342\200\242
(zx Go x z) \342\200\242
Ws]dS'dS
G.7?)
is is
rBl _.
^Bl(l) ,

where $\bar{\bar{G}}_0$ is the free-space dyadic Green's function defined in Chapter 1. From this
Green's function, we recognize that

'I
/Bhi)
2\321\217
f [W,
\342\200\242
(zx 7 x ^- \320\272 dS'dS
z) \342\200\242
W,] \302\246 G.74)
JSJS

and

(7.75)

where $R = \sqrt{(x - x')^2 + (y - y')^2}$.
For convenience, we can express the surface basis functions as

= -
Wv(x, v; x, y. s) -^ (x x) G.76J

where the parameters s, x, and y can be adjusted to match the basis functions
presented in Chapter 2. Using (7.76) in (7.74), we find

' _ rBI(l)
rBI(l) \342\200\224
T
, ,\320\2221(|)
'.v.v *yy G.77)

where

/Bid) _ Slsi

G.78)
dx'dy'dxdy

in which the limit subscripts "l" and "u" denote the lower and upper limits,
respectively. The integrals in (7.78) cannot be evaluated in closed form; however, they may
be evaluated numerically without much difficulty. For the self-cell case, the
integration should be broken into subregions over which the terms within (·) are assumed
constant. The pertinent integral is then of the form

-ffTf
h', Jv| J..-; J.v;
\342\200\224jj-\\dx'dy'dxdy

*>'** + \320\273'^'\320\273*
-\320\223\320\242\320\223\320\223
Jv; Jr; .!.\302\273\342\200\242;
\302\246!..\302\246! 1T-r\\ J Jv, J J \320\273

G.79)

where the latter integral can now be evaluated analytically. Specifically we have [27]

dx'dy'dxdy
/(.v-\320\273-'

\\\\x lnCv + +
\320\233) .1' ln(.v + R)\\ dx dy

\\x-x')(y~y')
i(.v-.v')ln[Cv-.\302\273')

-
X')(V
(\302\246\320\243
- /)[(.Y- .V') \320\243\320\233]_*

The second boundary term, (7.75), requires further manipulation before
evaluation. Specifically, applying the divergence theorem twice and converting one of the
gradient operators from $\nabla'$ to $\nabla$ (e.g., $\nabla' G_0 = -\nabla G_0$), we have

:/5't/5 G.81)

Evaluating (7.81) using (7.76), we get

\320\2372)
= +,81B)
/\320\2221|2\302\273 G.82)

where

_ \302\246 *A
fB\\Q)

slss rBI(c)
xy
2khffX)
G.83)
.Bi(c)

/BlB) _ \302\246 V\302\273


/Bile)
\"
\"+2Ag(A!;J

Once again, $I^{BI(c)}$ is evaluated using analytical formulas for the self-cell and standard
numerical integration techniques for all other interactions.
For triangular elements, the interested reader is referred to [5], [44], [45],
and [46]. These papers provide various coordinate-free evaluations of the self-cell
integrals.

APPENDIX 2: BRICK FINITE ELEMENT-BOUNDARY INTEGRAL COMPUTER PROGRAM

To assist the reader in understanding some of the difficult concepts in Chapter 7, Dr.
Leo C. Kempel of Mission Research Corporation has made available a fully
functional finite element-boundary integral computer program (source code) and user's
guide via the World Wide Web:

http://www-personal.engin.umich.edu/~volakis/
Some features of this computer program, LMBRICK (a.k.a. Low Memory
Brick), are as follows:
1. Automatic mesh generator for rectangular cavity-backed patch and slot
antennas.
2. Precomputation of only necessary FE interactions and a custom sparse
matrix-vector product that utilizes this minimal data set.
3. FFT-based matrix-vector product for the BI sub-matrix, hence the ability to
model large arrays.
4. Probe and plane wave sources. Probes are parallel to the x-, y-, and z-axes,
though they may be arbitrarily placed within the antenna cavity.
5. Capability of computing monostatic and bistatic radar cross section (RCS)
and antenna gain.
6. Capability of computing input impedance at each feed probe.
7. Ability to model layers, blocks, or fully inhomogeneous (brick-by-brick)
isotropic dielectric and magnetic materials.
8. Carefully documented to aid the user in understanding the method.

Readers are encouraged to report problems and suggestions for further
capabilities to Dr. Kempel via his e-mail address: l.kempel@ieee.org.
REFERENCES

[1] C. T. Tai. Generalized Vector and Dyadic Analysis. IEEE Press, New York,
1992.
[2] G. E. Antilla and N. G. Alexopoulos. Scattering from complex three-dimensional
geometries by a curvilinear hybrid finite element-integral equation
approach. J. Opt. Soc. Am. A, 11(4):1445-1457, April 1994.
[3] J. M. Jin, J. L. Volakis, and J. D. Collins. A finite element-boundary integral
method for scattering and radiation by two- and three-dimensional structures.
IEEE Antennas Propagat. Soc. Mag., 33(3):22-32, June 1991.
[4] R. E. Collin. Field Theory of Guided Waves. IEEE Press, New York, 1991.
[5] S. M. Rao, D. R. Wilton, and A. W. Glisson. Electromagnetic scattering by
surfaces of arbitrary shape. IEEE Trans. Antennas Propagat., 30:409-418, May
1982.
[6] J. R. Mautz and R. F. Harrington. A combined-source formulation for
radiation and scattering from a perfectly conducting body. IEEE Trans. Antennas
Propagat., 27:445-454, July 1979.
[7] E. Arvas, A. Rahhal-Arabi, A. Sadigh, and S. M. Rao. Scattering from multiple
conducting and dielectric bodies of arbitrary shape. IEEE Antennas Propagat.
Soc. Mag., 33:29-36, April 1991.
[8] A. F. Peterson. The interior resonance problem associated with surface integral
equations of electromagnetics: numerical consequences and a survey of
remedies. Electromagnetics, 10(3):293-312, July-September 1990.
[9] C. T. Tai. Dyadic Green's Functions in Electromagnetic Theory. IEEE Press,
New York, 1994.
[10] J. M. Jin and J. L. Volakis. A hybrid finite element method for scattering and
radiation by microstrip patch antennas and arrays residing in a cavity. IEEE
Trans. Antennas Propagat., 39(11):1598-1604, November 1991.
[11] J. Gong, J. L. Volakis, A. Woo, and H. Wang. A hybrid finite element boundary
integral method for analysis of cavity-backed antennas of arbitrary shape.
IEEE Trans. Antennas Propagat., 42(9):1233-1242, September 1994.
[12] T. Eibert and V. Hansen. Calculation of unbounded field problems in free space
by a 3D FEM/BEM-hybrid approach. J. Electromagn. Waves Appl., 10(1):61-78,
1996.
[13] D. M. Pozar. Input impedance and mutual coupling of rectangular microstrip
antennas. IEEE Trans. Antennas Propagat., 30(6):1191-1196, November
1982.
[14] J. T. Aberle and D. M. Pozar. Accurate and versatile solutions for probe-fed
microstrip patch antennas and arrays. Electromagnetics, 11(1-2):1-19, 1991.
[15] J. Gong and J. L. Volakis. An efficient and accurate model of the coax cable
feeding structure for FEM formulations. IEEE Trans. Antennas Propagat.,
43(12):1474-1478, December 1995.
[16] J. L. Volakis and K. Barkeshli. Applications of the conjugate gradient FFT
method to radiation and scattering. In T. K. Sarkar, editor, Application of the
Conjugate Gradient Method to Electromagnetics and Signal Analysis, Chapter 6.
Elsevier, New York, 1991.
[17] J. L. Volakis, T. Ozdemir, and J. Gong. Hybrid finite element methodologies for
antennas and scattering. IEEE Trans. Antennas Propagat., 45(3):493-507,
March 1997.
[18] C. J. Reddy, M. D. Deshpande, C. R. Cockrell, and F. B. Beck. Analysis of
three-dimensional cavity-backed aperture antennas using a combined finite
element method/method of moments/geometrical theory of diffraction technique.
Technical Report 3548, NASA Langley Research Center, Hampton, VA,
November 1995.
[19] T. J. Peters and J. L. Volakis. Application of the conjugate gradient FFT
method to scattering from thin planar material plates. IEEE Trans. Antennas
Propagat., 36:518-526, April 1988.
[20] A. C. Polycarpou, J. T. Aberle, and C. A. Balanis. Analysis of arbitrary shaped
cavity-backed patch antennas using a hybridization of the finite element and
spectral domain methods. In IEEE Int. Symp. on Antennas and Propagation
Digest, pp. 130-133, Baltimore, MD, July 1996.
[21] T. Ozdemir, 1997. Personal communication.
[22] C. J. Reddy, M. D. Deshpande, C. R. Cockrell, and F. B. Beck. Radiation
characteristics of cavity backed aperture antennas in finite ground plane using
the hybrid FEM/MoM technique and geometrical theory of diffraction. IEEE
Trans. Antennas Propagat., 44(10):1327-1333, October 1996.
[23] J. L. Volakis. Iterative solvers. IEEE Antennas Propagat. Soc. Mag., 37:94-96,
December 1995.
[24] J. M. Jin and J. L. Volakis. Biconjugate gradient FFT solution for scattering by
planar plates. Electromagnetics, 12(1):105-119, January-March 1992.
[25] J. R. Mautz and R. F. Harrington. Electromagnetic transmission through a
rectangular aperture in a perfectly conducting plane. Sci. Rpt. 10, Air Force
Cambridge Res. Labs., Hanscom AFB, MA, February 1976. Contract
F19628-73-C-0047.
[26] J. M. Jin and J. L. Volakis. Electromagnetic scattering by and transmission
through a three-dimensional slot in a thick conducting plane. IEEE Trans.
Antennas Propagat., 39(4):543-550, April 1991.
[27] K. Barkeshli and J. L. Volakis. Electromagnetic scattering from an aperture
formed by a rectangular cavity recessed in a ground plane. J. Electromagn.
Waves Appl., 5:715-734, 1991.
[28] L. C. Kempel and J. L. Volakis. Scattering by cavity-backed antennas on a
circular cylinder. IEEE Trans. Antennas Propagat., 42:1268-1279, September
1994.
[29] P. H. Pathak and N. N. Wang. An analysis of the mutual coupling between
antennas on a smooth convex surface. Technical Report 784583-7, Ohio State
University ElectroScience Laboratory, Columbus, OH, October 1978.
[30] D. T. McGrath and V. P. Pyati. Phased array antenna analysis with the hybrid
finite element method. IEEE Trans. Antennas Propagat., 42(12):1625-1630,
December 1994.
[31] E. W. Lucas and T. P. Fontana. A 3-D hybrid finite element/boundary element
method for the unified radiation and scattering analysis of general infinite
periodic arrays. IEEE Trans. Antennas Propagat., 43(2):145-153, February
1995.
[32] D. T. McGrath. Phase array antenna analysis using hybrid finite element
methods. PhD thesis, Air Force Institute of Technology, Dayton, OH, 1993.
AFIT/DS/ENG/93-4.
[33] D. T. McGrath. Prediction of high power and wideband transmissivity of
periodic structures. In AMEREM Conf., Albuquerque, NM, May 1996.
[34] D. T. McGrath and V. P. Pyati. Periodic structure reflection and transmission
calculation using the hybrid finite element method. In IEEE Int. Symp. on
Antennas and Propagation Digest, pp. 142-145, Baltimore, MD, July 1996.
[35] W. Boyse and A. Seidl. A hybrid finite element method for near bodies of
revolution. IEEE Trans. Mag., 27:3833-3836, September 1991.
[36] T. Cwik. Coupling finite element and integral equation solutions using
decoupled boundary meshes. IEEE Trans. Antennas Propagat., 40:1496-1504,
December 1992.
[37] T. Cwik, C. Zuffada, and V. Jamnejad. Efficient coupling of finite element and
integral equation representations for three-dimensional modeling. In T. Itoh, G.
Pelosi, and P. Silvester, editors, Finite Element Software for Microwave
Engineering. John Wiley and Sons, New York, 1996.
[38] T. Cwik, C. Zuffada, and V. Jamnejad. Modeling three-dimensional scatterers
using a coupled finite element-integral equation formulation. IEEE Trans.
Antennas Propagat., 44(4):453-459, April 1996.
[39] T. Cwik, C. Zuffada, and V. Jamnejad. Modeling radiation with an efficient
hybrid finite element-integral equation-waveguide mode matching technique.
IEEE Trans. Antennas Propagat., 45(1):34-39, January 1997.
[40] L. N. Medgyesi-Mitschang and J. M. Putnam. Electromagnetic scattering from
axially inhomogeneous bodies of revolution. IEEE Trans. Antennas Propagat.,
32:797-806, 1984.
[41] S. S. Bindiganavale and J. L. Volakis. A hybrid FE-FMM technique for
electromagnetic scattering. IEEE Trans. Antennas Propagat., 45(1):180-181,
January 1997.
[42] N. Lu and J.-M. Jin. Application of the fast multipole method to finite-element
boundary-integral solution of scattering problems. IEEE Trans. Antennas
Propagat., 44(6):781-786, June 1996.
[43] E. Bleszynski, M. Bleszynski, and T. Jaroszewicz. AIM: Adaptive integral
method for solving large-scale electromagnetic scattering and radiation
problems. Radio Sci., 31(5):1225-1251, 1996.
[44] D. R. Wilton, S. M. Rao, A. W. Glisson, D. H. Schaubert, O. M. Al-Bundak,
and C. M. Butler. Potential integrals for uniform and linear source distributions
on polygonal and polyhedral domains. IEEE Trans. Antennas Propagat.,
32(3):276-281, March 1984.
[45] R. D. Graglia. On the numerical integration of the linear shape functions times
the 3D Green's function or its gradient on a plane triangle. IEEE Trans.
Antennas Propagat., 41(10):1448-1455, October 1993.
[46] T. F. Eibert and V. Hansen. On the calculation of potential integrals for linear
source distributions on triangular domains. IEEE Trans. Antennas Propagat.,
43(12):1499-1502, December 1995.
Fast Integral Methods
S. Bindiganavale and J. L. Volakis

8.1 THE ADAPTIVE INTEGRAL METHOD

When iterative methods are used for the solution of hybrid finite element-boundary
integral (FE-BI) systems, such as that in (4.133), most of the CPU time is typically
spent in computing the matrix-vector product appearing in

[Gv]{H^boundary} + [G]{Φ} = {V}      (8.1)

which is repeated here from (4.132). The greater CPU time is due to the fully
populated matrices [Gv] and [G]. Consequently, the CPU time for carrying out the
matrix-vector products is $O(N_b^2)$, whereas the corresponding CPU time for sparse
matrices approaches $O(N)$. As usual, N denotes the total number of unknowns in the
mesh domain and $N_b$ refers to the boundary unknowns. It was noted in Chapter 7
that the FE-BI method is a robust solution approach which combines the best
features of partial differential and integral equation methods. The BI reduces the
computational volume to a minimum without compromising accuracy. However, the
$O(N_b^2)$ CPU and memory growth of the BI compromises the method's utility for
large-scale simulations. Efforts have therefore been ongoing to reduce the computing
resources consumed for the solution of the BI subsystem. Use of the FFT in the case
of Toeplitz subsystems reduces the CPU and memory requirements down to
$O(N_b \log N_b)$. This approach was discussed in Chapter 7 and has been generalized
to triangular grids by Gong et al. [1]. More recently, a procedure was introduced to
cast arbitrary surface grids onto overlaying equivalent uniform grids, as illustrated in
Fig. 8.1. As a result, the resulting boundary matrix is again Toeplitz and the FFT
can be used to carry out the matrix-vector products. One such procedure was introduced
by Bleszynski et al. [2] and employs equivalent delta sources to represent the fields
exterior to the radiator or scatterer. These delta sources are placed on an equi-spaced
grid and are evaluated by matching moments of the fields generated by the original
surface currents/fields and those due to the delta sources on the new equi-spaced
grid. For planar BI surfaces, the delta sources are placed on a rectangular grid,
whereas in three dimensions the equi-spaced grid is cubical. Therefore,
three-dimensional FFTs must be used in the same manner as done with k-space methods [3], [4].

Figure 8.1 Mapping of the original triangular grid (original triangular discretization)
to a uniform AIM grid.

The method introduced by Bleszynski et al. [2] is referred to as the Adaptive
Integral Method (AIM) and has been implemented for scattering and radiation [2],
[5]. In all these applications of AIM, the FFT is only used to compute the
matrix-vector products associated with the far zone fields, whereas the near zone
interactions are computed using the original fields/currents on the BI surface. That is, [G] in
(8.1) is decomposed as

$[G] = [G_{near}] + [G_{far}]$

where $[G_{near}]$ is a banded or sparse matrix and $[G_{far}]$ is Toeplitz in form. With this
decomposition, the overall CPU and memory requirements of the BI subsystem are
reduced down to $O(N_b^{1.5})$ or less. The constant in front of $N_b^{1.5}$, though, varies with the
bandwidth of $[G_{near}]$. Typically, $[G_{near}]$ includes those elements which are within a distance
of 0.3λ to 0.5λ from the testing point to maintain the accuracy of the solution.
Clearly, this implies that AIM is more efficient for large-scale simulations involving
bodies which span many wavelengths. However, it has been observed [5] that AIM is
particularly attractive even for small bodies which include fine details, as is the case
with antennas. In some situations with only 1150 BI unknowns, as much as a tenfold
reduction in CPU and memory has been observed.
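
The near/far split can be illustrated with a short sketch that sorts element pairs by their separation. It is not taken from [2] or [5]: the function name, the point layout, and the specific near-zone radius (0.45λ, chosen inside the 0.3λ to 0.5λ range quoted above) are assumptions made purely for illustration.

```python
import math

def split_near_far(points, near_radius):
    """Split all (test, source) index pairs into near-zone and far-zone lists.

    points      : list of (x, y) element locations, in wavelengths
    near_radius : separation (in wavelengths) below which a pair is 'near'
    """
    near, far = [], []
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            distance = math.hypot(xi - xj, yi - yj)
            (near if distance <= near_radius else far).append((i, j))
    return near, far

# Hypothetical example: 100 elements spaced a tenth of a wavelength apart on a line.
points = [(0.1 * i, 0.0) for i in range(100)]
near_pairs, far_pairs = split_near_far(points, near_radius=0.45)
```

Only the near pairs need explicitly stored matrix entries, and their number grows linearly with the number of unknowns for a fixed near-zone radius, while the far pairs are the ones handled through the uniform grid and the FFT.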
AIM belongs to the category of matrix compression methods. At this time
other techniques are also being investigated to speed up the matrix-vector products
and reduce memory requirements of large-scale systems which may involve hundreds
of thousands of volume and boundary integral unknowns. Among these, the fast
multipole method (FMM) is being considered by several research groups and is
discussed below.

8.2 FAST MULTIPOLE METHOD

The fast multipole method (FMM) is an efficient approach for calculating the
matrix-vector products associated with dense subsystems such as that in (8.1). One of
the first applications of FMM was given by Barnes and Hut [6] for calculating
interstellar body interactions. More recently, the FMM was used quite successfully
to handle very large-scale interactions [7], [8]. The reader is referred to [9] for an
early overview of the FMM.
The first application of FMM to acoustics and electromagnetics appeared in
[10] and [11]. These articles demonstrate the $O(N_b^{1.5})$ CPU and memory requirement
of FMM. However, even lower CPU requirements are possible by using the
multilevel FMM discussed in [12] or the windowed FMM [13].
Below we describe the FMM method at a tutorial level for two-dimensional
applications. The reader is cautioned that the speed-up achieved by the various
compression schemes can compromise the accuracy of the solution [14].

8.2.1 Boundary Integral Equation

For simplicity let us consider the solution of the boundary integral

= 2#!nc -
#.(r) +J- \\ r'\\)dl\\ \320\263'
\320\235(\320\263\320\223\\\320\2720\\\320\263 \320\263,
\320\244(\320\263') \320\241
\320\261 (8.2)
2 ic

where $H_z^{inc}$ denotes the TE incident/excitation field and $H_0^{(2)}(\cdot)$ is the zeroth-order
Hankel function of the second kind. This integral is a specialization of (4.119)
and can be combined with the finite element system (4.133) for the solution of $H_z$
and Φ. Physically, (8.2) describes the field relation at the aperture of a
dielectrically filled groove, as illustrated in Fig. 8.2. It is constructed by enforcing the
condition

jf
(\302\253\320\276./'\32

Figure 8.2 Geometry of the groove recessed in a ground plane. (Labels: boundary/aperture
Γ; FEM domain.)

on the aperture, where

-r'\\)dx', \320\263'\320\265\320\241
\320\263.

(8.3)

From (4.76) we can identify that

bHz
_ jk0 +jko p
Lx
Zo
and from (4.114)

where M denotes the equivalent magnetic current over the aperture. As discussed in
Section 4.4.5, (8.3) accounts for image theory, which resulted in the introduction of
the factor of 2 in the right-hand side of (8.2). In the next few subsections we examine
the discretization and evaluation of the integral (8.3) using various versions of the
FMM. This exposition provides a close look at the characteristics of FMM for
electromagnetic applications and demonstrates the features which are responsible
for the CPU speed-up and memory reduction.

8.2.2 Exact FMM

In accordance with the FMM (see Figs. 8.3 and 8.4), the $N_b$ boundary
unknowns introduced for the discretization of (8.2) or (8.3) are subdivided into
groups with each group assigned $M_b$ unknowns. Thus, a total of $L_b \approx N_b/M_b$ groups
are constructed. The key step in all FMM procedures is to rewrite the integral (8.3)
as a product of terms each being a function of r (observation point) or r' (integration
point) but not both. In this manner, the evaluation of the integral is carried out by
considering the group-to-group interactions separately from the intragroup
interactions. Beyond the math, this breakdown of interactions/operations can be viewed in
the context of the manager-worker model. Basically, we can view each group as
managed by the center element with the workers comprising the elements of the
group. Communication/interaction among the groups takes place through the
managers, who in turn interact with the group elements. The decompositions reduce the
direct interdependence of each group member with the other elements belonging to
different groups, and this is at the heart of the CPU speed-up afforded by FMM. As
stated earlier, though, there are inherent approximations as part of the group
decomposition process which must be understood in order to assess the accuracy of each
FMM algorithm.

Figure 8.3 Computation of the boundary integral matrix-vector product using exact FMM.
(Labels: basis elements, test element, source element, group centers, source group l',
global origin.)

Figure 8.4 Computation of the boundary integral matrix-vector product using exact FMM.
(Labels: groups K − 1, K, K + 1; near group; far group.)
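
The grouping step can be sketched in a few lines. This is only an illustration of the bookkeeping, assuming the unknowns are ordered along a one-dimensional aperture so that consecutive indices form a group, and assuming that group pairs separated by at least `min_far_separation` groups are handled through group-to-group interactions; both the function names and that threshold are hypothetical.

```python
def make_groups(n_unknowns, group_size):
    """Split unknown indices 0..n_unknowns-1 into consecutive groups of M_b entries."""
    return [list(range(start, min(start + group_size, n_unknowns)))
            for start in range(0, n_unknowns, group_size)]

def classify_group_pairs(n_groups, min_far_separation=2):
    """Label each ordered group pair (l, l') as near or far by index separation.

    min_far_separation is an ASSUMED threshold: pairs closer than this are
    treated directly, the others through group-to-group interactions.
    """
    near, far = [], []
    for l in range(n_groups):
        for lp in range(n_groups):
            (far if abs(l - lp) >= min_far_separation else near).append((l, lp))
    return near, far

# Hypothetical example: N_b = 10 unknowns with M_b = 4 unknowns per group.
groups = make_groups(10, 4)
near_groups, far_groups = classify_group_pairs(len(groups))
```

In the manager-worker picture, each inner list holds the workers of one group, and only the far pairs communicate through the group centers.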
To achieve the decomposition of (8.3) into a product of functions in r and r',
we first invoke the addition theorem to rewrite the Hankel function as

+ r, - \\r,
-

(8.4)

where $r_{ll'}$ denotes the distance between the centers of the l and l' groups, as
illustrated in Figs. 8.3 and 8.4. Also, $\phi_{ll'}$ and $\phi_{r_l r_{l'}}$ are the angles between the vectors $\mathbf{r}_{ll'}$
and $\mathbf{r}_l - \mathbf{r}_{l'}$ with the x-axis, respectively. The source and observation points $\mathbf{r}_{l'}$ and $\mathbf{r}_l$
have their origin at the center of the l' and l groups, respectively, while r' and r are
measured from the origin. Typically, the semi-empirical formula

$Q/2 = k_0 D + 5 \ln(k_0 D + \pi)$      (8.5)

is used to truncate the sum (8.4), where D is the diameter of the circle enclosing the
groups. This is consistent with the radius of convergence associated with the Hankel
function. In general, $Q/2 = M_b$ ensures convergence. (It will be shown that Q is the
number of directions in which the radiation of the group is sampled. With $M_b$ being
the number of basis elements in the group, $Q = 2M_b$ satisfies the Nyquist criterion
for faithful replication of the source group radiation.)
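
As a quick numerical illustration of the truncation rule, the sketch below evaluates the semi-empirical formula (8.5), read here as $Q/2 = k_0 D + 5 \ln(k_0 D + \pi)$, and the alternative choice $Q = 2M_b$. The function names are hypothetical, and rounding up to the next integer is an assumption made so that the result can serve as a summation limit.

```python
import math

def truncation_order(k0, diameter):
    """Q/2 from the semi-empirical formula (8.5), rounded up to an integer."""
    kd = k0 * diameter
    return math.ceil(kd + 5.0 * math.log(kd + math.pi))

def directions_from_group_size(m_b):
    """Q = 2 * M_b, the Nyquist-type choice quoted in the text."""
    return 2 * m_b

# Hypothetical example: a group enclosed by a circle one wavelength in diameter.
wavelength = 1.0
k0 = 2.0 * math.pi / wavelength
half_q = truncation_order(k0, diameter=1.0 * wavelength)
```

The logarithmic term dominates for electrically small groups, while for large groups the truncation order grows essentially linearly with $k_0 D$.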
Next we introduce the Fourier integral of the Bessel function

'\302\246\"\"-'tf0 (8.6|

and in conjunction with (8.4) we can now rewrite (8.3) as

4\320\273\320\263
ikr'^ (8.7)
j2)r

where $\mathbf{k} = k_0(\hat{x}\cos\phi + \hat{y}\sin\phi)$ and $\phi$ is measured from the x-axis. In this,

= .(r')e\"\"i'dl'
f (8.8)
J \320\263

is identified as the far-field pattern of the source group and

0/2
= \320\2351~\\\320\272\320\276\320\2631\320\263)\320\265-\320\235\321\204-\321\204><1+1\022)
\320\223\342\200\236@)
]\320\237 (8.9)

is referred to as the translation operator providing the group-to-group (l to l')
interactions. From (8.7)-(8.9), we observe that the integral (8.3) has now been
decomposed into terms which separate out the dependence on r and r'. The final
evaluation of (8.3) proceeds by discretizing the integral over $\phi$
to yield the expression

= (8.10)
\320\257\320\223@
-^ \320\224*? 7>(*,) VriQ^\"

which is the radiated field from some location in the source group l' to a point
within the receiving group l. Note that $\Delta\phi = 2\pi/Q$ indicates the angular spacing
between the propagation vectors of plane waves emanating from a group. Thus
$\phi_q = q\,\Delta\phi$, $q = 1, \ldots, Q$, whereas $\mathbf{k}_q = k_0(\hat{x}\cos\phi_q + \hat{y}\sin\phi_q)$. As mentioned earlier,
the number of plane wave directions is set equal to twice the number of elements
in the group ($Q = 2M_b$), thus satisfying the Nyquist sampling theorem with respect
to the integration over $\phi$. Given the above steps, the exact FMM procedure for
carrying out the matrix-vector product can be summarized as follows:

1. Compute the pattern of the source group (aggregation). Mathematically, this corresponds to evaluating $V_{l'}(\phi_q)$ given in (8.8). The evaluation of $V_{l'}(\phi_q)$ for a single source group and at a single direction requires $M_b$ operations, corresponding to the number of elements in the group (the integration over the line segment is performed as a summation). Consequently, for $L_b$ groups and $Q$ directions for each group, the operation count is $Q M_b L_b$.
2. The next step is to employ the translation operator to evaluate the pattern of a source group at the center of the test group. Mathematically, this operation amounts to computing the coefficient $A_l(\phi_q) = \sum_{l'} T_{ll'}(\phi_q)\, V_{l'}(\phi_q)$. The evaluation of $A_l(\phi_q)$ involves an operation count of $Q L_b^2$, where again $L_b$ denotes the number of groups and $Q$ is the number of directions.
3. Finally, at the receiving group, the fields are redistributed (disaggregation). Mathematically, this amounts to computing the expression given in (8.10). Evaluating the sum at a single point requires $Q$ operations. Thus for $L_b$ groups, each containing $M_b$ unknowns, the operation count is $Q L_b M_b$.
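The three steps above can be sketched in a few lines of Python. This is only a structural illustration under stated assumptions: the aggregation, translation, and disaggregation factors (`S`, `T`, `R`, all hypothetical names) are filled with random complex numbers rather than the exponential and Hankel-function expressions of (8.8)-(8.10), and the `far` table marking well-separated groups is likewise an assumption. The sketch checks that the three-stage product reproduces the dense product built from the same factors.

```python
import random


def far_matvec_dense(x, group_of, S, T, R, far):
    """Reference product: Z[i][j] = sum_q R[i][q] * T[gi][gj][q] * S[j][q]."""
    n, Q = len(x), len(S[0])
    y = [0j] * n
    for i in range(n):
        for j in range(n):
            gi, gj = group_of[i], group_of[j]
            if far[gi][gj]:
                z_ij = sum(R[i][q] * T[gi][gj][q] * S[j][q] for q in range(Q))
                y[i] += z_ij * x[j]
    return y


def far_matvec_three_stage(x, group_of, S, T, R, far):
    """Same product computed by aggregation, translation, disaggregation."""
    n, Q, L = len(x), len(S[0]), len(far)
    # Step 1, aggregation: one pattern per source group and direction.
    V = [[0j] * Q for _ in range(L)]
    for j in range(n):
        for q in range(Q):
            V[group_of[j]][q] += S[j][q] * x[j]
    # Step 2, translation: group-to-group, one multiply per direction.
    A = [[0j] * Q for _ in range(L)]
    for l in range(L):
        for lp in range(L):
            if far[l][lp]:
                for q in range(Q):
                    A[l][q] += T[l][lp][q] * V[lp][q]
    # Step 3, disaggregation: redistribute to each test element.
    y = [0j] * n
    for i in range(n):
        for q in range(Q):
            y[i] += R[i][q] * A[group_of[i]][q]
    return y


def demo_error(L=4, M=3, Q=6, seed=1):
    """Largest difference between the two products for random placeholder data."""
    rng = random.Random(seed)

    def c():
        return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

    n = L * M
    group_of = [i // M for i in range(n)]
    S = [[c() for _ in range(Q)] for _ in range(n)]
    R = [[c() for _ in range(Q)] for _ in range(n)]
    T = [[[c() for _ in range(Q)] for _ in range(L)] for _ in range(L)]
    # Assumption: groups more than one index apart count as "far".
    far = [[abs(a - b) > 1 for b in range(L)] for a in range(L)]
    x = [c() for _ in range(n)]
    y_ref = far_matvec_dense(x, group_of, S, T, R, far)
    y_fmm = far_matvec_three_stage(x, group_of, S, T, R, far)
    return max(abs(a - b) for a, b in zip(y_ref, y_fmm))
```

The loop nesting mirrors the operation counts quoted in the steps: aggregation and disaggregation touch every element once per direction, while translation touches every pair of groups once per direction.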

From the above we conclude that the total operation count of the above three steps is $C_1 Q M_b L_b + C_2 Q L_b^2$. Also, the near-field operation count (by this it is meant that groups in the near vicinity of each other are treated using the standard moment method procedure) is $O(N_b M_b)$. On choosing $Q \sim O(M_b)$, and noting that $L_b = N_b/M_b$, the operation count of the three steps reduces to $C_1 M_b N_b + C_2 (N_b^2/M_b)$. On setting $M_b \sim \sqrt{N_b}$, implying $L_b = \sqrt{N_b}$ (an optimal choice), the final operation count is $N_b^{1.5}$, and this should be compared to the usual $N_b^2$ operation count of direct solvers. The reduction of the operation count from $O(N_b^2)$ down to $O(N_b^{1.5})$ is indeed dramatic. An appreciation of the CPU reduction can be acquired by setting, for example, $N_b = 2000$, which is a relatively small number of elements: $N_b^2 = 4 \times 10^6$ whereas $N_b^{1.5} \approx 8.9 \times 10^4$, a ratio of about 45 before the constants are taken into account. However, further improvements can still be achieved by nesting groups, leading to the multilevel FMM [12].
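The trade-off behind the choice of group size can be checked numerically. The short sketch below evaluates the count $C_1 M_b N_b + C_2 N_b^2/M_b$ for a few group sizes; the function name and the unit constants $C_1 = C_2 = 1$ are assumptions made only for illustration.

```python
import math


def exact_fmm_ops(n, m, c1=1.0, c2=1.0):
    """Far-field count C1*M*N + C2*N^2/M, valid for Q ~ M and L = N/M."""
    return c1 * m * n + c2 * n * n / m


n = 2000
m_opt = math.sqrt(n)              # group size balancing the two terms
best = exact_fmm_ops(n, m_opt)    # equals 2 * N^1.5 for unit constants
dense = n * n                     # cost of the dense matrix-vector product

for m in (10, 20, m_opt, 100, 200):
    print(f"M = {m:7.1f}  ops = {exact_fmm_ops(n, m):12.0f}")
print(f"dense / best = {dense / best:.1f}")
```

With unit constants both terms contribute equally at the optimum, so the saving relative to the dense product is half of the bare $N_b^2/N_b^{1.5}$ ratio.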

8.2.3 Windowed FMM

In the exact FMM, the translation operation between groups assumed isotropic radiation. However, it is reasonable to expect that the groups would interact strongly along the line joining them and less so in other directions. Indeed, it was shown in [13] that the translation operator could be regarded as composed of a geometrical optics (GO) term (along the line joining the source and test group) and two diffraction terms associated with the shadow boundaries of the GO term. To illustrate the validity of this concept, we plot in Fig. 8.5 the translation operator for different group separation distances along the groove of width 50λ. For this example, the number of unknowns on the boundary was 750, resulting in 27 groups. As seen, the "lit" region of the translation operator narrows as the group separation distance is increased, eventually displaying the predictable sinc function behavior for large group separation distances. The tapering off of the translation operator from a value oscillating around 2 down to zero for larger $\phi - \phi_{ll'}$ values is characteristic of the geometrical optics plus diffraction terms in the context of traditional high frequency methods. We may also comment that this high frequency model enables the identification of a lit region even for groups which are not widely separated (for example, see Fig. 8.5 for the translation operator between groups 1 and 3).

[Plot: translation operator versus angle (degrees) for Q = 54 and group separations of 5λ, 15λ, and 45λ; curve labels include "Groups 1 and 3" and "Groups 1 and 8".]

Figure 8.5 The translation operator for different groups on the boundary of a 50λ wide groove: 750 BI unknowns; 27 groups.
The key characteristic of the windowed FMM is the exploitation of the diminished value of $T_{ll'}(\phi)$ for large $\phi - \phi_{ll'}$. Basically, in the windowed FMM, the computation of $T_{ll'}(\phi)$ for these angles is avoided altogether. This can be accomplished by multiplying $T_{ll'}(\phi)$ with the filter (windowing) function

where

and $a$ is a taper factor to be specified. Note also that the window parameter was selected to provide a larger bandpass window when $r_{ll'}$ is smaller, as dictated by high frequency analysis. The discretized plane wave expansion can now be written as

=
\321\217?\321\201\302\253, (8.13)
1\320\243\342\200\236.{\321\204\321\207)\320\24211.(\321\204,)\320\243\320\263(\321\20411)\320
\320\220\321\204^2
_^\320\276
\320\266

By taking into account only the nonzero sector of $W_{ll'}(\phi)$, the operation count of the translation process is now reduced to $C_4 L_b^2 \sim N_b^2/M_b^2$, with the corresponding total operation count given by $C_1 M_b N_b + C_4 (N_b^2/M_b^2)$. Grouping the unknowns into $N_b^{1/3}$ elements per group results in a total operation count of $O(N_b^{4/3})$. This should be compared with the $O(N_b^{3/2})$ operation count of the exact FMM.
The computation of the boundary integral matrix-vector product by employing the windowed FMM is depicted pictorially in Fig. 8.6, illustrating that the filter function has the effect of eliminating plane wave interactions at directions away from the line joining the interacting groups.
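The effect of the window on the translation loop can be sketched as follows. This is a minimal illustration, not the filter of the text: a rectangular window of fixed half-width is assumed in place of the tapered function, and the names `window_indices` and `translate_windowed` are hypothetical. The point is only that the per-pair work becomes a constant independent of $Q$.

```python
def window_indices(Q, q_center, half_width):
    """Direction indices kept by a rectangular window centered on q_center.

    Indices wrap around because the Q directions cover a full circle.
    """
    return [(q_center + d) % Q for d in range(-half_width, half_width + 1)]


def translate_windowed(T, V, active):
    """Per-direction products T[q]*V[q], evaluated only inside the window.

    Returns the contributions (zero outside the window) and the number of
    multiplications actually performed.
    """
    out = [0j] * len(T)
    for q in active:
        out[q] = T[q] * V[q]
    return out, len(active)


Q = 54                                   # directions per group, as in Fig. 8.5
active = window_indices(Q, q_center=0, half_width=7)
T = [complex(1.0, 0.1 * q) for q in range(Q)]   # placeholder operator values
V = [complex(0.5, -0.2) for _ in range(Q)]      # placeholder group pattern
contrib, mults = translate_windowed(T, V, active)
print(mults, "multiplications instead of", Q)
```

In practice the list of active directions would be built once per pair of groups and reused at every iteration of the solver.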

8.2.4 Fast Far Field Algorithm (FAFFA)

This is an approximate version of the FMM since the algorithm is based on


introducing the large argument approximation of the Hankel function. That is, the
approximation

(*0|r
- r'|) - \302\253-J^'-'-r*.
U
_^ \320\265-\320\234-,\321\202\342\200\236, (8\320\2334)
Figure 8.6 Computation of the boundary integral matrix-vector product using the windowed FMM.

is used. As shown in Fig. 8.7, $r_{ll'}$ is the distance between the center of the test group $l$ and the center of the source group $l'$; $r_{l'n}$ is the distance between the $n$th source element and its group center; and $r_{lm}$ is the distance between the $m$th test element and its group center.

Figure 8.7 Computation of the boundary integral matrix-vector product using the FAFFA.

The introduction of the large argument expansion necessitates that the FMM procedure be used only for groups which are very well separated. However, (8.14) allows for the immediate decoupling of the test-source element interactions, thus enabling the computation of the matrix-vector product for far-field groups with a reduced operation count. This is illustrated below by going through the steps of the FAFFA corresponding to the three steps of the FMM:
1. The aggregation of source elements in a single source group now involves $M_b$ operations, corresponding to the number of elements in the source group. Specifically,

(8.15)

and since the above aggregation needs to be done for all source groups, the operation count becomes $O[(N_b/M_b)M_b] \sim O(N_b)$, where $N_b/M_b$ represents the total number of groups. Also, this operation, being dependent only on the test group rather than the test element, needs to be repeated for all $Q = N_b/M_b$ test groups, leading to a total operation count of $O(N_b^2/M_b)$ for aggregation. It should be noted that use of the large argument expansion, rather than the addition theorem for the Hankel function, results in the aggregation sum being a function of the test group also ($V_{ll'}$), unlike the exact FMM where the aggregation sum is a function of the source group only ($V_{l'}(\phi)$). Thus, the techniques by which the exact FMM and the FAFFA reduce the operation count differ: in the exact FMM the aggregation sum is characterized by a source group ($l'$) and a direction ($\phi$) which is not intertwined with the test group direction, whereas the aggregation sum in the FAFFA is characterized by the source group ($l'$) and the test group ($l$).
2. The main advantage of FAFFA is due to the faster computation of the translation operator. We have

(8.16)

where in the FAFFA the translation operator simplifies to

(8.17)

This should be compared to the sum (8.9) for the exact FMM. Clearly, (8.17) needs to be done only at the group level and involves $O(N_b^2/M_b^2)$ operations for all possible test and source group combinations, making it the least computationally intensive step.


3. The disaggregation or redistribution process is again the operation

v
\302\261 Ae-^'i-*\342\204\242
? (8.18)
r=\\

Since this operation involves only the source group instead of the source element, it needs to be done for each source group, implying $O(N_b/M_b)$ operations to generate a single row of the matrix-vector product. To generate $M_b$ rows, corresponding to a test group, the operation count would be $O(N_b)$. With $N_b/M_b$ test groups, the operation count is $O(N_b^2/M_b)$.
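The decoupling that the large argument expansion provides can be illustrated numerically. The sketch below treats only the phase term $e^{-jk_0|\mathbf{r}-\mathbf{r}'|}$; the amplitude factor of the Hankel function is omitted. The sign convention (unit vector pointing from the source group center to the test group center), the function names, and the sample coordinates are all assumptions chosen for illustration and are not taken from (8.14)-(8.18).

```python
import cmath
import math


def exact_phase(k0, r_test, r_src):
    """Phase term exp(-j*k0*|r - r'|) for one test/source element pair."""
    d = math.hypot(r_test[0] - r_src[0], r_test[1] - r_src[1])
    return cmath.exp(-1j * k0 * d)


def faffa_phase(k0, c_test, c_src, r_test, r_src):
    """Far-field factorization: (test factor) * (group factor) * (source factor).

    |r - r'| is approximated by R + u.(r - c_test) - u.(r' - c_src), where R
    is the distance between group centers and u the unit vector from the
    source group center to the test group center.
    """
    R = math.hypot(c_test[0] - c_src[0], c_test[1] - c_src[1])
    ux, uy = (c_test[0] - c_src[0]) / R, (c_test[1] - c_src[1]) / R
    d_test = ux * (r_test[0] - c_test[0]) + uy * (r_test[1] - c_test[1])
    d_src = ux * (r_src[0] - c_src[0]) + uy * (r_src[1] - c_src[1])
    src_factor = cmath.exp(1j * k0 * d_src)      # depends on source element
    group_factor = cmath.exp(-1j * k0 * R)       # depends on group pair only
    test_factor = cmath.exp(-1j * k0 * d_test)   # depends on test element
    return test_factor * group_factor * src_factor


k0 = 2.0 * math.pi                    # wavelength taken as the unit of length
src_center, src_elem = (0.0, 0.0), (0.3, 0.2)


def error_at(separation):
    """Factorization error for a test group 'separation' wavelengths away."""
    test_center = (separation, 0.0)
    test_elem = (separation - 0.25, -0.3)
    exact = exact_phase(k0, test_elem, src_elem)
    approx = faffa_phase(k0, test_center, src_center, test_elem, src_elem)
    return abs(exact - approx)


far_error = error_at(40.0)
near_error = error_at(3.0)
print(f"error at 40 wavelengths: {far_error:.4f}")
print(f"error at  3 wavelengths: {near_error:.4f}")
```

The error grows as the groups approach each other, which is why the FAFFA needs a larger near-group region than the exact FMM.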
Consolidating the above three steps for the FAFFA algorithm, we have

$$\text{Op. count} \sim C_1 N_b M_b + C_2 \frac{N_b^2}{M_b} \qquad (8.19)$$

where the first term refers to the operations associated with the near-field terms. As before, $M_b = \sqrt{N_b}$ and the total operation count is $O(N_b^{1.5})$. While the operation count for this algorithm could be further reduced down to $O(N_b^{4/3})$ by performing the process of "interpolation" and "anterpolation" as described in [15] for very large objects, we found that the accuracy deteriorated for the considered applications. Hence, only the $O(N_b^{1.5})$ version was used.

8.3 LOGIC FLOW

The operation counts described in the previous section for the various algorithms are illustrated with the help of flow diagrams and sections of code from the computation of the matrix-vector products for the far groups. Figures 8.8 and 8.11 depict the flow diagram and code for computing the matrix-vector product in the exact FMM. It is seen in Fig. 8.11 that each of the aggregation, translation, and disaggregation operations consists of a single multiplication, which is described below.
• The aggregation operation consists of the product of an entry of the trial vector (represented as Dum(J) in Fig. 8.11) with an aggregation factor, represented in Fig. 8.11 for the Jth element and Kth direction as SrcGc(J,K). This is given by SrcGc(J,K) = $\Delta_J\, e^{jk_0(\hat{x}\cos\phi_K + \hat{y}\sin\phi_K)\cdot\mathbf{r}_{J,JGr}}$, where $\Delta_J$ is the length of the Jth discretization element, $\phi_K$ is the Kth radiation direction, and $\mathbf{r}_{J,JGr}$ is the direction vector of the Jth element, measured from the center of the group (JGr) it belongs to. The result of the aggregation operation yields a term characterized by only the source group and radiation direction (represented in Fig. 8.11 as V(JGr,K)).
• The translation operation involves the multiplication of the aggregation sum, V(JGr,K), with a translation factor, represented in Fig. 8.11 for the IGrth test group, JGrth source group, and Kth radiation direction by Trans(IGr,JGr,K). This is given by

Trans(IGr,JGr,K) = $\sum_{n=-Q/2}^{Q/2} e^{-jn(\phi_K - \phi_{IGr,JGr} + \pi/2)}\, H_n^{(2)}(k_0 r_{IGr,JGr})$

where $\phi_K$ is the Kth radiation direction, and $r_{IGr,JGr}$ and $\phi_{IGr,JGr}$ are the distance and angle between the IGrth and JGrth groups. The result of the translation operation yields a term dependent only on the test group and radiation direction (represented in Fig. 8.11 as GrGr(IGr,K)).
• The disaggregation operation involves the multiplication of the translation sum, GrGr(IGr,K), with a disaggregation factor, represented in Fig. 8.11 for the Ith test element and the Kth radiation direction by TestGc(I,K). This is given by

TestGc(I,K) = $e^{-jk_0(\hat{x}\cos\phi_K + \hat{y}\sin\phi_K)\cdot\mathbf{r}_{I,IGr}}$

where $\mathbf{r}_{I,IGr}$ is the direction vector of the Ith element measured from the center of the group (IGr) it belongs to. The result of the disaggregation operation yields a term dependent only on the test element and is the contribution to the Ith entry of the product vector.
The windowed FMM differs from the exact FMM in the translation phase, and this is illustrated in Figs. 8.9 and 8.12. These figures illustrate that the windowed FMM achieves its reduced operation count by eliminating some of the directions in which plane wave interaction takes place. The innermost loop in the translation phase has an operation count which is a constant (15-25 in our simulations, 20 in
[Flow diagram with three panels: Aggregation, Translation, Disaggregation. Each panel steps through nested counters (source or test groups, radiation directions, and elements) and performs one operation per pass of the innermost loop; the operation count of each panel is given beneath it.]

Figure 8.8 Sequence of operations to be performed in the exact FMM.
[Flow diagram: loops over test groups, source groups, and radiation directions. The translation operation is performed for a single pair of source and test groups and for a single direction (operation count = 1). When the test group counter exceeds the total number of groups, translation ends and disaggregation begins.]

Figure 8.9 Sequence of operations to be performed in the translation process of the windowed FMM.
[Flow diagram: for each test group, ITestGrNew is set to 1 to indicate a new test group. The diagram then loops over the test elements of the group and over the source groups. While ITestGrNew = 1, aggregation is performed over the elements of each source group, followed by translation for the source and test group pair. Disaggregation is performed for every test element and source group. ITestGrNew is set to 0 once the test group has been handled for aggregation and translation. The total operation count is the sum of the aggregation, translation, and disaggregation counts.]

Figure 8.10 Sequence of operations to be performed in the FAFFA.

c  NGr        - Number of groups
c  IQAngles   - Number of radiation directions
c  DPhi       - Angular spacing between radiation directions
c  NElGr(NGr) - Array containing number of elements in each group
c  GetGlobal  - Function which gets the global element # given a
c               group number and local element number
c  DMin       - Minimum distance beyond which two groups are
c               treated as far groups
c  Distance   - Function which returns the distance between groups
c  Dum        - Vector multiplying the matrix - in the CG algorithm
c               this will be the search and residual vectors
c  AX         - Array which is the matrix-vector product
c  SrcGc, Trans & TestGc - aggregation, translation &
c               disaggregation factors respectively, described in
c               the text
c  S          - Flag; '*' selects the conjugated factors

      IQAngles = 2*NGr
      DPhi = 2*Pi/IQAngles

c  Aggregation - Operation count O(NM)
      do JGr = 1, NGr
        do K = 1, IQAngles
          do JEl = 1, NElGr(JGr)
            J = GetGlobal(JGr,JEl)
            if (S.eq.'*') then
              V(JGr,K) = V(JGr,K) + Dum(J)*conjg(SrcGc(J,K))
            else
              V(JGr,K) = V(JGr,K) + Dum(J)*SrcGc(J,K)
            endif
          enddo
        enddo
      enddo

c  Translation - Operation count O(N^2/M)
      do IGr = 1, NGr
        do JGr = 1, NGr
          if (Distance(IGr,JGr).gt.DMin) then
            do K = 1, IQAngles
              if (S.eq.'*') then
                GrGr(IGr,K) = GrGr(IGr,K)
     &                      + conjg(Trans(IGr,JGr,K))*V(JGr,K)
              else
                GrGr(IGr,K) = GrGr(IGr,K)
     &                      + Trans(IGr,JGr,K)*V(JGr,K)
              endif
            enddo
          endif
        enddo
      enddo

c  Disaggregation - Operation count O(NM)
      do IGr = 1, NGr
        do K = 1, IQAngles
          do IEl = 1, NElGr(IGr)
            I = GetGlobal(IGr,IEl)
            if (S.eq.'*') then
              AX(I) = AX(I) + GrGr(IGr,K)*conjg(TestGc(I,K))
            else
              AX(I) = AX(I) + GrGr(IGr,K)*TestGc(I,K)
            endif
          enddo
        enddo
      enddo

Figure 8.11 Code indicating the computation of the matrix-vector product in the exact FMM.
c  The symbols in this section of code represent the
c  same quantities as in the code for the exact FMM

c  Translation - Operation count O(N^2/M^2)
      do IGr = 1, NGr
        do JGr = 1, NGr
          if (Distance(IGr,JGr).gt.DMin) then
            do K = 1, IQAngles
              if (S.eq.'*') then
c               Directions removed by the window are skipped
                if (abs(Trans(IGr,JGr,K)).eq.0) then
                  continue
                else
                  GrGr(IGr,K) = GrGr(IGr,K)
     &                        + conjg(Trans(IGr,JGr,K))*V(JGr,K)
                endif
              else
                if (abs(Trans(IGr,JGr,K)).eq.0) then
                  continue
                else
                  GrGr(IGr,K) = GrGr(IGr,K)
     &                        + Trans(IGr,JGr,K)*V(JGr,K)
                endif
              endif
            enddo
          endif
        enddo
      enddo

Figure 8.12 Code indicating the computation of the matrix-vector product in the translation phase of the windowed FMM.

[13]) and is a significant reduction from the corresponding operation count in the
exact FMM.
The technique by which the FAFFA achieves its speed-up is depicted in Figs. 8.10 and 8.13. It is seen that the FAFFA "recycles" the plane wave spectra of the source group. For a given test group, the aggregation and translation operations are performed only once for each source group, so that only the disaggregation operation needs to be performed for each individual element of the test group. Similar to the exact FMM, the aggregation, translation, and disaggregation processes consist of a single multiplication. However, the factors used in the three processes and the method by which the reduced operation count is achieved are different.

• The aggregation operation again consists of the product of an entry of the trial vector with an aggregation factor, represented in Fig. 8.13 for the Jth element and IGrth test group as SrcGc(IGr,J). This is given by SrcGc(IGr,J) = $\Delta_J\, e^{-jk_0 \hat{r}_{IGr,JGr}\cdot\mathbf{r}_{J,JGr}}$, where $\Delta_J$ is the length of the Jth discretization element, $\hat{r}_{IGr,JGr}$ is the unit vector along the line joining the source and test groups, while $\mathbf{r}_{J,JGr}$ is the vector along the line joining the source element with its group center. Thus, an aggregation sum is formed for each combination of source and test groups.
• The translation operation involves the multiplication of the aggregation sum with a translation factor, represented in Fig. 8.13 for the IGrth test group

c  The symbols in this section of code represent the
c  same quantities as in the code for the exact FMM.
c  ITestGrNew is a counter which is set to 1 if a test group
c  is "new". This means that since the aggregation and
c  translation operations involve only the test group
c  rather than the test element, they need to be done
c  only once (for the first element in the test group).
c  For the rest of the elements in the test group only
c  the disaggregation operation needs to be done (corresponds
c  to ITestGrNew = 0).

      do IGr = 1, NGr
        ITestGrNew = 1
        do IEl = 1, NElGr(IGr)
          I = GetGlobal(IGr,IEl)
          do JGr = 1, NGr
            if (Distance(IGr,JGr).gt.DMin) then
              if (ITestGrNew.eq.1) then
c               Reset the aggregation sum for this source group
                V = (0.0,0.0)
                do JEl = 1, NElGr(JGr)
                  J = GetGlobal(JGr,JEl)
                  if (S.eq.'*') then
                    V = Dum(J)*conjg(SrcGc(IGr,J)) + V
                  else
                    V = Dum(J)*SrcGc(IGr,J) + V
                  endif
                enddo
                if (S.eq.'*') then
                  GrGr(JGr) = conjg(Trans(IGr,JGr))*V
                else
                  GrGr(JGr) = Trans(IGr,JGr)*V
                endif
              endif
              if (S.eq.'*') then
                AX(I) = AX(I) + GrGr(JGr)*conjg(TestGc(I,JGr))
              else
                AX(I) = AX(I) + GrGr(JGr)*TestGc(I,JGr)
              endif
            endif
          enddo
          ITestGrNew = 0
        enddo
      enddo

Figure 8.13 Code indicating the computation of the matrix-vector product in the FAFFA.

and JGrth source group by Trans(IGr,JGr) and given by Trans(IGr,JGr) = $e^{-jk_0 r_{IGr,JGr}} / \sqrt{k_0 r_{IGr,JGr}}$. Again, the translation operation needs to be done for each pair of test and source groups.
• The disaggregation operation involves the multiplication of the translation sum, GrGr(JGr), with a disaggregation factor, represented in Fig. 8.13 for the Ith test element and the JGrth source group by TestGc(I,JGr). This is given by TestGc(I,JGr) = $e^{-jk_0 \hat{r}_{IGr,JGr}\cdot\mathbf{r}_{I,IGr}}$, where $\mathbf{r}_{I,IGr}$ is the vector along the line joining the test element with its group center. It should be noted that to compute the interactions between a pair of groups, the aggregation and translation need to be done only once, and thus the crux of the FAFFA is indicated with highlighted sections of the code in Fig. 8.13.
8.4 RESULTS

The results presented in this section [16], [17] are based on an FMM computer code, incorporating a conjugate gradient solver, and executed on an HP 9000/750 workstation with a peak flop rate of 23.7 MFLOPS. The geometry considered was the rectangular groove shown in Fig. 8.2. Table 8.1 compares the execution time and RMS error [14] of the standard FE-BI to the FE-Exact FMM, FE-FAFFA, and the FE-Windowed FMM (FE-WFMM) for grooves of widths 25λ, 35λ, and 50λ. The groove had a depth of 0.35λ and a material filling of $\epsilon_r = 4$ and $\mu_r = 1$, and was illuminated at normal incidence. The data reveal that the FE-Exact FMM offers almost a 50 percent savings in execution time with almost no compromise in accuracy. While the FE-FAFFA is the fastest of the three algorithms, the RMS error was substantially higher (> 1 dB). If the maximum tolerable RMS error is set at 1 dB [14], the FE-Windowed FMM is the most attractive option since it meets the error criterion and is only slightly slower than the FE-FAFFA.
Table 8.2 gives the exact BI operation count rather than merely stating its order. The knowledge of the constants associated with each exponent of $N_b$ enables us to compare the requirements of two algorithms which might have the same order of operation count. In Table 8.2, $N_{NG}$ is the number of near groups (groups which are treated with the exact moment method procedure owing to their electrical proximity), which depends on the algorithm and the problem geometry. $N_{NG}$ is smallest for the FE-Exact FMM and largest for the FE-FAFFA, due to the use of the far-zone Green's function in the latter. Table 8.2 also gives the number of multiplications in a single BI matrix-vector product for the 50λ groove. For the FE-Exact FMM the number of multiplications is reduced by a factor of three over the FE-BI. However, the actual CPU time is reduced by a smaller factor due to computational overhead for the various function calls. Of interest is the comparison of the residual error as a function of the number of iterations in the conjugate gradient solver. Such a comparison is shown in Fig. 8.14 and it is seen that the curves for the FE-BI and the FE-Exact FMM overlap to graphical accuracy whereas the FE-WFMM shows a very small deviation from the exact result. Thus, the

TABLE 8.1 CPU Times and RMS Error of the Hybrid Algorithms

CPU time for BI (minutes, seconds)

Groove   Total      BI         FE-BI     FE-FAFFA   FE-Exact   FE-WFMM
width    unknowns   unknowns   (CG)                 FMM
25λ      2631       375        (8,48)    (3,16)     (5,25)     (4,13)
35λ      3681       525        (16,34)   (5,55)     (10,31)    (7,22)
50λ      5256       750        (45,1)    (14,31)    (26,18)    (16,10)

RMS error (dB)

Groove width   FE-FAFFA   FE-Exact FMM   FE-WFMM
25λ            1.12       0.0752         0.6218
35λ            1.2        0.1058         0.721
50λ            1.36       0.1123         0.843

TABLE 8.2 Exact BI Operation Count of the Hybrid Algorithms. $N_b$ is the Number of BI Unknowns. $N_{NG}$ is the Number of Near Groups (groups which are treated with the exact moment method). $\theta_{win}$ is the Width of the Window in the Windowed FMM.

Algorithm       Operation count for BI computation                      Operations (multiplications)
                (as implemented)                                        required for BI computation
                                                                        for the 50λ groove
FE-BI           $N_b^2$                                                 562500
FE-FAFFA        $(N_{NG}+2)N_b^{1.5} + (1-2N_{NG})N_b - N_{NG}N_b^{0.5}$   136890
FE-Exact FMM    $(N_{NG}+6)N_b^{1.5} - 2N_{NG}N_b$                       180356
FE-WFMM         $(4+N_{NG}+\theta_{win})N_b^{1.5} - N_{NG}\theta_{win}N_b$  153780

hybridization of the FMM does not have any adverse effect on the condition of the FE-BI system. The time for each iteration is reduced and the total number of iterations remains approximately the same, resulting in reduced overall solution time for the fast BI algorithms.
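The multiplication counts quoted for the 50λ groove can be cross-checked with a few lines of arithmetic. The sketch below is hedged on several points: the closed forms are a reading of Table 8.2 (the FE-Exact FMM form is inferred by analogy with the other rows), and the near-group numbers $N_{NG} = 3$ for the exact FMM and $N_{NG} = 5$ for the FAFFA are not stated in the text. They are assumed here only because they reproduce the tabulated counts for $N_b = 750$.

```python
import math


def ops_fe_bi(n):
    """Dense boundary-integral product: N^2 multiplications."""
    return n * n


def ops_exact_fmm(n, n_near):
    """Assumed closed form (N_NG + 6) N^1.5 - 2 N_NG N."""
    return (n_near + 6) * n ** 1.5 - 2 * n_near * n


def ops_faffa(n, n_near):
    """Assumed closed form (N_NG + 2) N^1.5 + (1 - 2 N_NG) N - N_NG N^0.5."""
    return (n_near + 2) * n ** 1.5 + (1 - 2 * n_near) * n - n_near * math.sqrt(n)


N_B = 750                              # BI unknowns for the 50-wavelength groove
dense = ops_fe_bi(N_B)
exact = ops_exact_fmm(N_B, n_near=3)   # N_NG = 3 is an assumption
faffa = ops_faffa(N_B, n_near=5)       # N_NG = 5 is an assumption

print(f"FE-BI        : {dense}")
print(f"FE-Exact FMM : {exact:.0f}  (ratio to FE-BI {dense / exact:.2f})")
print(f"FE-FAFFA     : {faffa:.0f}")
```

Under these assumptions the ratio between the dense and exact FMM counts is slightly above three, in line with the factor quoted in the text.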
The performance of the hybrid algorithms at a more stressing angle of incidence is depicted in Fig. 8.15. For this example calculation, the width of the groove was 10λ and it is seen that the RMS error follows the same trend as for normal incidence illumination. Moreover, even for this smaller size aperture, the scalability of the speed-up is maintained. The employed near-group radius was 1λ, implying that the matrix-vector products for groups separated by a distance less than a wavelength were computed using the exact method of moments procedure. Smaller near-group distances can be employed to reduce the CPU time even further, and near-group distances down to 0.3λ have been found to yield sufficiently accurate results.

[Plot: residual error (logarithmic scale) versus iteration number for FE-BI, FE-Exact FMM, and FE-Windowed FMM.]

Figure 8.14 Convergence curves for the hybrid algorithms for the groove of width 25λ.

[Plot (b): bistatic patterns (dB, from -30 to 30) versus observation angle (0 to 180 deg) for FE-BI, FE-Exact FMM, FE-Windowed FMM, and FE-FAFFA, with DMin = 1λ.]

(c) RMS error (dB)

Groove width   FE-FAFFA   FE-Exact FMM   FE-WFMM
10λ            0.7627     0.1621         0.3291

Figure 8.15 Scalability of the hybrid techniques to smaller problems: (a) Problem geometry, (b) Bistatic patterns, (c) Error table.

REFERENCES

[1] J. Gong, J. L. Volakis, A. Woo, and H. Wang. A hybrid finite element boundary integral method for analysis of cavity-backed antennas of arbitrary shape. IEEE Trans. Antennas Propagat., 42(9):1233-1242, September 1994.
[2] E. Bleszynski, M. Bleszynski, and T. Jaroszewicz. AIM: Adaptive integral method for solving large-scale electromagnetic scattering and radiation problems. Radio Sci., 31(5):1225-1251, 1996.
[3] N. N. Bojarski. k-space formulation of the electromagnetic scattering problem. Technical Report AFAL-TR-71-5, U.S. Air Force, March 1971.
[4] C. Y. Shen, K. J. Glover, M. I. Sancer, and A. D. Varvatsis. The discrete Fourier transform method of solving differential integral equations in scattering theory. IEEE Trans. Antennas Propagat., AP-37:1032-1049, August 1989.
[5] S. Bindiganavale, T. Ozdemir, J. L. Volakis, and J. Berrie. Broadband antenna analysis and scattering from planar structures using a fast integral method. Int. Radio Science Meeting, Montreal, CA, 1997.
[6] J. Barnes and P. Hut. A hierarchical O(N log N) force-calculation algorithm. Nature, 324(4):446-449, 1986.
[7] M. S. Warren and J. K. Salmon. Astrophysical N-body simulations using hierarchical tree data structures. In Proceedings Supercomputing 92, pp. 570-576, 1992.
[8] H.-Q. Ding, N. Karasawa, and W. A. Goddard III. Atomic level simulations on a million particles: The cell multipole method for Coulomb and London nonbond interactions. J. Chem. Phys., 97(6):4309-4315, 1992.
[9] L. F. Greengard. The Rapid Evaluation of Potential Fields in Particle Systems. The MIT Press, 1988.
[10] V. Rokhlin. Rapid solution of integral equations for scattering theory in two dimensions. Journal of Computational Physics, 86(2):414-439, 1990.
[11] R. Coifman, V. Rokhlin, and S. Wandzura. The fast multipole method for the wave equation: A pedestrian prescription. IEEE Antennas and Propagation Magazine, 35(3):7-12, 1993.
[12] J. M. Song and W. C. Chew. Multilevel fast multipole algorithm for solving combined field integral equation of electromagnetic scattering. Microwave and Optical Technology Letters, 10:14-19, 1995.
[13] R. J. Burkholder and D. H. Kwon. High-frequency asymptotic acceleration of the fast multipole method. Radio Science, 31(5):1199-1206, 1996.
[14] S. S. Bindiganavale and J. L. Volakis. Guidelines for using the fast multipole method to calculate the RCS of large objects. Microwave and Optical Technology Letters, 11(4):190-194, 1996.
[15] C. C. Lu and W. C. Chew. Fast far field approximation for calculating the RCS of large objects. Microwave and Optical Technology Letters, 8(5):238-241, 1995.
[16] S. S. Bindiganavale and J. L. Volakis. A hybrid FEM-FMM technique for electromagnetic scattering. IEEE Transactions on Antennas and Propagation, 45(1):180-181, 1997. (Also in Proc. 12th Ann. Rev. Progress Appl. Computat. Electromagn. (ACES), Monterey, CA, March 1996, pp. 563-570.)
[17] S. S. Bindiganavale and J. L. Volakis. Comparison of three FMM techniques for solving hybrid FE-BI systems. IEEE Antennas and Propagation Magazine, 39(4):47-60, 1997.
Numerical Issues

9.1 INTRODUCTION

In the previous chapter, we outlined the formulation of the finite element method as applied to problems in electromagnetics. In three dimensions, FEM is primarily a volume formulation and the number of unknowns escalates rapidly as the size of the problem increases. Therefore, the limiting factor in dealing with three-dimensional problems is the unknown count and the associated demands on storage and solution time. Techniques which have $O(N)$ storage and solution times are thus necessary to tackle three-dimensional problems. This is one of the principal reasons for the popularity of partial differential equation techniques over integral equation (IE) approaches, as the latter lead to dense matrices with $O(N^2)$ storage. As the problem size increases, the IE and hybrid methods, both of which need $O(N^l)$, $1 < l < 2$, storage, quickly become unmanageable in terms of storage and solution time. Another concern while solving problems having more than 100,000 unknowns (a scenario that can be envisioned for most practical problems) is to avoid software bottlenecks. The algorithmic complexity of any part of the program should increase at most linearly with the number of unknowns. This is not possible in many cases but, as a rule of thumb, it is generally true that schemes can be devised to manipulate sparse matrices using $O(N)$ storage and operation count.


In this chapter, we will present some numerical considerations for writing efficient sparse matrix codes, of which FEM is an example. The trade-offs associated with the various data structures used to represent sparse matrices and their impact on vectorization and parallelization are first discussed. Next, we review some techniques to solve linear systems of equations. Direct solvers like Cholesky decomposition and Gaussian elimination are discussed along with ordering algorithms for optimizing memory and CPU resources. We then focus on iterative solvers, point and block preconditioning strategies, and their corresponding trade-offs. A modified incomplete LU (ILU) preconditioner is presented, which seems to work better than the original ILU preconditioner for the weakly positive-definite matrix systems we have encountered. Iterative solvers for unsymmetric matrix systems are also mentioned to handle anisotropic geometries and situations where the boundary conditions make the system unsymmetric. We devote an entire section to sparse eigenanalysis, where we focus mainly on solving the generalized eigenvalue problem using sparse and full matrix methods. To solve large problems, the computationally intensive portions of the finite element code need to be parallelized on massively parallel architectures. A parallelization paradigm is discussed in connection with a distributed memory multiprocessor such as the KSR1 (Kendall Square Research) machine.

9.2 SPARSE STORAGE SCHEMES

The matrix systems in finite elements and related PDE methods are very sparse and
the percentage of sparsity increases with the number of unknowns. In an average
three-dimensional tetrahedral mesh with edge basis functions, the minimum number
of nonzero elements per row can be 9 and the maximum number of nonzeros per row
is about 30. The total number of nonzeros varies between $15N$ and $16N$, where $N$ is
the number of unknowns. Assuming a square matrix with $16N$ nonzeros, the matrix is
99.84 percent sparse for a 10,000-unknown problem whereas for 100,000 unknowns,
99.984 percent of the matrix entries are zero. Clearly, it makes little sense to store
these zero entries, which motivates us to find the best possible scheme for storing
such matrices. As we shall see in the subsequent paragraphs and indeed throughout
this chapter, the definition of best is not unique and is governed by computer
architecture.

There are various storage schemes for sparse matrices. In this chapter, we will
discuss the more viable ones: Compressed Sparse Row (CSR) format, ITPACK
format [1], and the jagged diagonal format. Knowledge of the storage formats is
important since the speed of computation on vector or parallel processors is directly
linked to the data structure used for matrix storage.
The most commonly used format for storing sparse matrices is referred to as
the Compressed Sparse Row (CSR) format. In this scheme, the values of the nonzero
elements of the sparse matrix A are stored by rows along with their corresponding
column indices, in two long vectors VAL and COL, respectively.¹ The dimension of
VAL and COL equals the number of nonzero elements in A. Another pointer
array, ROWPNTR, of dimension N stores the cumulative number of nonzero
elements up to and including each row. Thus the position of each element in the
sparse matrix is uniquely defined.

¹ For convenience, the matrices and column vectors will be denoted by bold letters in
this chapter only and the columns will be treated as vectors in the dot/inner product
definitions.

For example, consider the 5 x 5 unsymmetric matrix

$$
\mathbf{A} = \begin{bmatrix}
3 & 0 & 0 & 4 & 5 \\
7 & 0 & 4 & 0 & 2 \\
4 & 0 & 7 & 0 & 0 \\
0 & 0 & 8 & 0 & 0 \\
9 & 7 & 0 & 0 & 0
\end{bmatrix}
$$

Then according to the CSR scheme, VAL and COL will take the form

VAL = [3 4 5 7 4 2 4 7 8 9 7]

COL = [1 4 5 1 3 5 1 3 3 1 2]

and the row pointers will be stored in the array ROWPNTR

ROWPNTR = [3 6 8 9 11]

Note that the first value of ROWPNTR implies that after reading three entries of VAL,
we start reading entries that belong to the second row of A. After reading
the sixth entry of VAL, we begin reading entries of VAL that belong to the
third row of A, and so on. The last entry of ROWPNTR is always equal to the
length of the vector VAL.
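
The CSR layout just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_csr` and `csr_matvec` and the variable names `val`, `col`, `rowptr` are ours, not the book's, and plain Python lists stand in for the long vectors. Column indices are kept 1-based and the row pointers cumulative, as in the 5 x 5 example.

```python
# Dense form of the 5 x 5 example matrix from the text.
A = [
    [3, 0, 0, 4, 5],
    [7, 0, 4, 0, 2],
    [4, 0, 7, 0, 0],
    [0, 0, 8, 0, 0],
    [9, 7, 0, 0, 0],
]

def to_csr(dense):
    """Return (val, col, rowptr): nonzeros by rows, their 1-based column
    indices, and the cumulative nonzero count after each row."""
    val, col, rowptr = [], [], []
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                val.append(a)
                col.append(j + 1)        # 1-based column index
        rowptr.append(len(val))          # entries read once this row is done
    return val, col, rowptr

def csr_matvec(val, col, rowptr, x):
    """y = A x using only the stored nonzeros."""
    y, start = [], 0
    for end in rowptr:
        y.append(sum(val[k] * x[col[k] - 1] for k in range(start, end)))
        start = end
    return y

val, col, rowptr = to_csr(A)
y = csr_matvec(val, col, rowptr, [1, 1, 1, 1, 1])   # row sums of A
```

Note that the inner loop of `csr_matvec` runs only over the nonzeros of one row, which is the short vector length the text warns about for vector machines.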
In the above example, the matrix entries for each row were stored in ordered
fashion, i.e., in increasing order of column indices, but this is not necessary for
commutative operations like addition and multiplication. A similar data structure
which stores the values by columns, with the row indices instead of the column
indices, is called the Compressed Sparse Column (CSC) format. The CSC format is
sometimes used when the matrix is to be accessed along the columns and not the
rows, e.g., in the multiplication of the transpose of a sparse matrix with a vector.
The CSR/CSC scheme is very convenient for addition, multiplication, permutation,
and transposition of sparse matrices.
However, if the matrix is extremely sparse, then row-wise traversal can lead to
short vector lengths and a significant hit in performance on vector supercomputers.
Thus alternative storage schemes are necessary for sparse matrix codes to run
efficiently on vector machines.
Of course, storage can be further reduced if the matrix is symmetric since only
the upper or lower triangle of the matrix need be stored without loss of information.
However, there is a significant performance hit since the increased indirect
addressing causes tremendous memory contention. For this reason, it is advisable to
store the entire matrix since storage requirements are usually very low for PDE
methods and the resulting increase in runtime is not worth the storage trade-off.
In the ITPACK storage scheme, a sparse matrix of order N is stored using two
rectangular arrays VAL and COL. The rows of the array VAL contain the nonzero
elements of the corresponding rows of the original matrix. The number of columns
of VAL is equal to the maximum number of nonzeros in a row; rows containing fewer
nonzero elements are zero padded. Again, considering the sparse matrix A, the
corresponding ITPACK array VAL can be represented as

$$
\mathrm{VAL} = \begin{bmatrix}
3 & 4 & 5 \\
7 & 4 & 2 \\
4 & 7 & 0 \\
8 & 0 & 0 \\
9 & 7 & 0
\end{bmatrix}
$$

The column indices of the elements in VAL are stored in an integer array COL
defined as

$$
\mathrm{COL} = \begin{bmatrix}
1 & 4 & 5 \\
1 & 3 & 5 \\
1 & 3 & \ast \\
3 & \ast & \ast \\
1 & 2 & \ast
\end{bmatrix}
$$

The asterisk denotes that the corresponding elements of COL are zeros. The
ITPACK storage scheme is attractive for generating finite element matrices since
the number of comparisons required while augmenting the matrix depends only on
the locality of the corresponding variable and not on the number of unknowns. This
feature can also be used for implementing fast searches and comparisons, whenever
the matrix is extremely sparse. Moreover, the sparse matrix-vector multiplication
process can be highly vectorized because of large vector lengths when the number of
nonzeros in all rows is nearly equal. This is because the multiplication operation is
carried out by traversing the columns of VAL and COL, whose dimensions are $O(N)$.
However, for our application, almost half the space is lost in storing zeros. As a
result, a lot of storage as well as computational effort is wasted in storing and
operating on zeros, respectively.
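
A minimal sketch of the ITPACK layout, assuming plain Python lists for the two rectangular arrays; the names `to_itpack` and `itpack_matvec` are hypothetical. A column index of 0 marks a padded slot, matching the asterisks in the example. The outer loop of the product runs over the few columns of the arrays, so the inner loop has length N.

```python
A = [
    [3, 0, 0, 4, 5],
    [7, 0, 4, 0, 2],
    [4, 0, 7, 0, 0],
    [0, 0, 8, 0, 0],
    [9, 7, 0, 0, 0],
]

def to_itpack(dense):
    """Return (val, col): N x width arrays, zero padded on the right."""
    width = max(sum(1 for a in row if a != 0) for row in dense)
    val, col = [], []
    for row in dense:
        nz = [(a, j + 1) for j, a in enumerate(row) if a != 0]
        pad = width - len(nz)
        val.append([a for a, _ in nz] + [0] * pad)
        col.append([j for _, j in nz] + [0] * pad)   # 0 marks padding
    return val, col

def itpack_matvec(val, col, x):
    n, width = len(val), len(val[0])
    y = [0] * n
    for k in range(width):        # short outer loop over array columns
        for i in range(n):        # long inner loop of length N
            if col[i][k]:         # skip padded slots
                y[i] += val[i][k] * x[col[i][k] - 1]
    return y

val, col = to_itpack(A)
y = itpack_matvec(val, col, [1, 1, 1, 1, 1])
# Fraction of the allotted slots that hold padding rather than data.
wasted = sum(row.count(0) for row in col) / (len(col) * len(col[0]))
```

For this small example 4 of the 15 slots are padding; the text reports that the loss approaches one half for the finite element matrices of interest.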
The modified ITPACK scheme [2] does alleviate this problem to a certain
degree by sorting the rows of the matrix by decreasing number of nonzero elements.
However, 30 percent of the allotted space is still lost in zero padding.
The other storage format that has been found to be useful for sparse matrices is
the jagged diagonal storage scheme [3]. On a vector machine, this format gives better
performance in terms of vectorizability. In this scheme, the vector lengths are
approximately equal to the order of the system being solved. The rows are first
sorted by increasing degree of sparsity, i.e., by decreasing number of nonzeros. The
first jagged diagonal is constructed by taking the first element from each row of the
CSR data structure of the ordered matrix. The rest of the jagged diagonals can be
obtained in a similar fashion. The matrix is thus stored as a collection of subvectors
of decreasing length. The number of jagged diagonals equals the number of nonzeros
in the first row of the sorted matrix. An additional vector is required as before to
store the corresponding column numbers from the original sparse matrix. The inner
loop of the matrix-vector multiplication routine traverses the entire length of a
jagged diagonal, the maximum dimension of which is the same as that of the sparse
matrix. This feature enhances vectorization massively. The storage requirement of
the above format can be made to be the same as the previously mentioned CSR
format through careful programming. Again, taking the earlier sparse matrix
example, we see in Fig. 9.1 how the matrix is stored in the jagged diagonal format.
The arrays VAL and COL store the matrix values and the corresponding column
numbers of the sparse matrix, respectively. The PNTR vector has a length of 4 in
this case, which equals one plus the number of nonzeros in the most populated row
of the sparse matrix. This vector therefore stores the starting location of each jagged
diagonal in the arrays VAL and COL. The array ROWPERM indicates the locations
of the permuted rows of the original sparse matrix.

VAL      3 7 4 9 8 | 4 4 7 7 | 5 2
COL      1 1 1 1 3 | 4 3 3 2 | 5 5
PNTR     1 6 10 12
ROWPERM  1 2 3 5 4

Figure 9.1 Jagged diagonal storage format.

The code altered to use this format then runs at around 275 Mflops on a Cray C-90.
The dot product reaches speeds of 550 Mflops and the vector updates execute at
600 Mflops. It must be mentioned that the Cray C-90 is a substantially faster
machine than the Cray YMP but the CSR formatted matrix-vector multiplication
routine runs about four times slower on the C-90. Moreover, the speed of operation
is a direct consequence of the amount of sparsity in the matrix system. In spite of
these caveats, the jagged diagonal method is much better than the CSR or ITPACK
storage methods for implementation on vector processors.
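
The construction of the four arrays of Fig. 9.1 can be sketched as follows. The code is a plain-Python illustration, not the vectorized Fortran the timings refer to; `to_jagged` and `jagged_matvec` are names chosen here, and ties between rows of equal length are assumed to keep their original order.

```python
A = [
    [3, 0, 0, 4, 5],
    [7, 0, 4, 0, 2],
    [4, 0, 7, 0, 0],
    [0, 0, 8, 0, 0],
    [9, 7, 0, 0, 0],
]

def to_jagged(dense):
    """Return (val, col, ptr, perm) with 1-based indices as in Fig. 9.1."""
    rows = [[(a, j + 1) for j, a in enumerate(r) if a != 0] for r in dense]
    # Stable sort of the row numbers by decreasing nonzero count.
    perm = sorted(range(len(rows)), key=lambda i: -len(rows[i]))
    val, col, ptr = [], [], [1]
    for d in range(len(rows[perm[0]])):      # one pass per jagged diagonal
        for i in perm:
            if d < len(rows[i]):
                a, j = rows[i][d]
                val.append(a)
                col.append(j)
        ptr.append(len(val) + 1)             # start of the next diagonal
    return val, col, ptr, [i + 1 for i in perm]

def jagged_matvec(val, col, ptr, perm, x):
    y = [0] * len(perm)
    for d in range(len(ptr) - 1):
        start, end = ptr[d] - 1, ptr[d + 1] - 1
        for k in range(start, end):          # one long inner loop
            y[perm[k - start] - 1] += val[k] * x[col[k] - 1]
    return y

val, col, ptr, perm = to_jagged(A)
y = jagged_matvec(val, col, ptr, perm, [1, 1, 1, 1, 1])
```

Because the rows are sorted by length, the rows that contribute to a given jagged diagonal are always a leading block of the permutation, which is why `perm[k - start]` recovers the row without any search.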

9.3 DIRECT EQUATION SOLVER

In this and the two subsequent sections, we will concentrate on the various
techniques for solving the linear equation system

$$
\mathbf{A}\mathbf{x} = \mathbf{b} \tag{9.1}
$$

When A is dense, i.e., most of the elements of A are nonzero, the decision is
somewhat straightforward. The inversion of the matrix can be carried out in $O(N^3)$
operations using popular methods like LU decomposition or Cholesky factorization.
However, in the case of sparse matrices, a simple application of the traditional
methods can prove catastrophic, as storage and processor demands will far exceed
acceptable levels. Our focus in this section will, therefore, be on sparse factorization
techniques.

9.3.1 Factorization Schemes

LU decomposition is a method of reducing a general matrix into two triangular
matrices: L (lower triangular) and U (upper triangular with unit diagonals). Thus, in
mathematical notation, this can be written as

$$
\mathbf{A} = \mathbf{L}\mathbf{U} \tag{9.2}
$$

The advantage of this representation is that the subsequent systems

$$
\mathbf{L}\mathbf{w} = \mathbf{b} \tag{9.3}
$$

$$
\mathbf{U}\mathbf{x} = \mathbf{w} \tag{9.4}
$$

can be solved in $O(N^2)$ operations using forward and backward substitutions,
respectively. Moreover, the decomposition is unique if A is nonsingular and can be
factored without row interchanges. For symmetric positive definite matrices, the
procedure is called Cholesky factorization and the decomposition of A can be
written as

$$
\mathbf{A} = \mathbf{L}\mathbf{L}^T \tag{9.5}
$$

The Cholesky factorization thus preserves symmetry of the factored matrix and is
also a unique factorization of A. Public domain codes for both Cholesky
factorization and LU decomposition can be found in [4].
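
To make the two-stage solve concrete, here is a small dense sketch in Python, assuming no pivoting is needed and using the convention of the text in which U carries the unit diagonal. The function names `crout_lu` and `solve_lu` and the 3 x 3 test system are illustrative choices, not taken from [4].

```python
def crout_lu(A):
    """Factor A = L U with U unit upper triangular (no pivoting)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[float(i == j) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j, n):          # column j of L
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for i in range(j + 1, n):      # row j of U, right of the diagonal
            s = sum(L[j][k] * U[k][i] for k in range(j))
            U[j][i] = (A[j][i] - s) / L[j][j]
    return L, U

def solve_lu(L, U, b):
    n = len(b)
    w = [0.0] * n
    for i in range(n):                 # forward substitution, L w = b
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):       # back substitution, U x = w
        x[i] = w[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))
    return x

A = [[4.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [6.0, 8.0, 4.0]                    # chosen so that x = (1, 1, 1)
L, U = crout_lu(A)
x = solve_lu(L, U, b)
```

Once L and U are available, each additional right-hand side costs only the two triangular sweeps in `solve_lu`.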

9.3.2 Error Control

Due to finite precision arithmetic, floating point errors can creep into matrix
factorization schemes. For a full matrix of order n, the error bound can be expressed
as [5]

$$
|E| \le 3.01\, n\, \epsilon_M\, a_M \tag{9.6}
$$

where $\epsilon_M$ is the machine precision and $a_M$ is the largest element (in
magnitude) of the original or factorized matrix.
In sparse matrix factorizations, the error bound is actually lower since only a
few operations are performed on each nonzero element. The error matrix E for the LU
decomposition is expressed as

$$
\mathbf{L}\mathbf{U} = \mathbf{A} + \mathbf{E} \tag{9.7}
$$

where $E_{ij}$ is bounded by the expression

$$
|E_{ij}| \le 3.01\, \epsilon_M\, a_M\, n_{ij} \tag{9.8}
$$

and $n_{ij}$ is a matrix of integers given by

$$
n_{ij} = \sum_{k=1}^{\min(i,j)} n_{ij}^{(k)}, \qquad
n_{ij}^{(k)} =
\begin{cases}
1, & \text{if both } L_{ik} \text{ and } U_{kj} \text{ are nonzero} \\
0, & \text{otherwise}
\end{cases}
\tag{9.9}
$$

As can be observed from the error bounds, the growth of the error is directly
proportional to the maximum value of any element that occurs in the original matrix
or due to factorization. Thus, a strategy for monitoring element growth and then
reducing it points the way for error control. One of the most popular strategies is
pivoting. Scaling with the largest element in the corresponding row of the submatrix
(partial pivoting) or with the largest element in the entire submatrix (complete
pivoting) usually stabilizes the factorization process and provides accurate answers.
Complete pivoting comes with a severe price tag in computational expense; partial
pivoting is, therefore, a method of choice for most factorization schemes. It should
be mentioned here that if a matrix is positive definite, diagonal elements are chosen
as pivots, since element growth is bounded for such matrices and the factorization is
stable without interchanges.

For sparse factorizations, even partial pivoting can be too expensive and too
rigid. In such cases, threshold pivoting is employed which strives to maintain sparsity
and employs a user-defined threshold parameter to determine the choice of the pivot
[5]. Threshold pivoting is quite popular and is used in production level codes.
Zlatev's strategy [6] is a variation of threshold pivoting: in addition to maintaining
sparsity, it reduces the number of search rows for the pivot to a user-defined value.
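
The idea of threshold pivoting can be illustrated with a toy selection rule. Everything below is an assumption made for the sketch: the function name `threshold_pivot`, the threshold symbol `u`, the choice of searching one column of candidates, and the tie-breaking rule are ours and are not taken from [5] or [6]. A candidate is acceptable if its magnitude is at least `u` times the largest candidate; among acceptable candidates the one in the sparsest row is taken.

```python
def threshold_pivot(column, row_counts, u=0.1):
    """Pick a pivot row from the candidates of one column.

    column:     {row: value} for the candidate rows.
    row_counts: {row: number of nonzeros currently in that row}.
    u:          threshold in (0, 1]; u = 1 reduces to ordinary
                largest-magnitude (partial) pivoting.
    """
    biggest = max(abs(v) for v in column.values())
    acceptable = [r for r, v in column.items() if abs(v) >= u * biggest]
    # Prefer the sparsest acceptable row; break ties by larger magnitude.
    return min(acceptable, key=lambda r: (row_counts[r], -abs(column[r])))

column = {0: 10.0, 1: 2.0, 2: 0.5}
row_counts = {0: 9, 1: 3, 2: 2}
strict = threshold_pivot(column, row_counts, u=1.0)    # largest entry wins
relaxed = threshold_pivot(column, row_counts, u=0.1)   # sparser row wins
```

With `u = 0.1` the entry 2.0 is accepted even though it is five times smaller than the largest one, because its row has only three nonzeros and so promises less fill; the entry 0.5 is rejected as too small.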

9.3.3 Matrix Ordering Strategies

Sparse matrix techniques can be used to solve very large problems since the
storage required increases as $O(N)$, where N denotes the degrees of freedom for the
problem. However, when a sparse matrix is factorized, the upper or lower triangular
factors may not reflect the sparsity pattern of the respective upper or lower triangle
of the original sparse matrix. This phenomenon is called fill-in, which corresponds to
the additional nonzero elements generated during factorization. If matrix fill is left
uncontrolled, serious storage and performance penalties ensue. Fill is undesirable for
three compelling reasons:
■ Additional storage must be allocated for the extra nonzeros. An extreme
  example is a matrix with full first row, full first column and main diagonal,
  and zeros elsewhere, which would be completely filled on factorization.
■ The number of operations needed for factorization increases with increasing fill.
■ The error bounds defined earlier increase as the matrix becomes filled with
  more and more nonzero entries.
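
The extreme example in the first item can be checked with a short symbolic elimination, sketched here in Python. The matrix is represented only by its nonzero pattern, as a dictionary of neighbour sets; the function name `fill_in` and the size `n = 6` are arbitrary choices for the illustration.

```python
def fill_in(adj, order):
    """Count the new off-diagonal pairs created when the unknowns are
    eliminated in the given order (symmetric nonzero pattern)."""
    g = {v: set(nb) for v, nb in adj.items()}
    fill = 0
    for v in order:
        nbrs = g.pop(v)
        for u in nbrs:
            g[u].discard(v)
        for u in nbrs:                 # eliminating v couples its neighbours
            for w in nbrs:
                if u < w and w not in g[u]:
                    g[u].add(w)
                    g[w].add(u)
                    fill += 1
    return fill

n = 6
# Full first row and column plus the diagonal: vertex 0 touches everything.
arrow = {0: set(range(1, n))}
arrow.update({i: {0} for i in range(1, n)})

hub_first = fill_in(arrow, list(range(n)))            # natural ordering
hub_last = fill_in(arrow, list(range(1, n)) + [0])    # dense row moved last
```

Eliminating the dense row first fills every remaining off-diagonal position (10 symmetric pairs for n = 6), whereas the same matrix with that row and column permuted to the end factors with no fill at all.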
Strategies for the reduction of fill-in have their origins in graph theory. Since
the amount of fill depends on the row/column permutation selected, a convenient
ordering of the matrix will drastically reduce the computation time and storage
requirements of the factorization. However, it is extremely difficult to find an
optimum ordering which will guarantee the smallest possible fill-in or operation
count. In fact, no efficient general algorithm is known that generates an optimal
ordering for an arbitrary graph. Existing strategies attempt to find an ordering for
which the fill-in and operation count are low, without guaranteeing a true minimum.
In finite elements, all matrices are structurally symmetric, i.e., the positions of the
nonzeros form a symmetric pattern, even though the corresponding values may break
the matrix symmetry. Thus we will mention ordering strategies for symmetric
matrices only.
A graph consists of a set of vertices together with a set of edges. Thus, a finite
element mesh can be considered to be an undirected graph since the edge pairs
between nodes (u, v) and (v, u) are indistinguishable due to symmetry. Figure 9.2
shows a symmetric matrix and its corresponding labeled undirected graph. A graph
with n vertices is labeled when there exists a one-to-one correspondence between the
vertices and the integers 1, 2, ..., n. Ordering strategies for symmetric matrices hinge
on the fact that the graph of a symmetric matrix remains invariant under a symmetric
permutation of its rows and columns; what changes is merely the vertex labeling.

Figure 9.2 (a) Corresponding graph; (b) symmetric sparse matrix structure.

Before discussing the various algorithms involved with matrix ordering, we
need to be familiar with a few basic terms in graph theory. Any square matrix A
of order N with a symmetric nonzero pattern can be considered to be an undirected
graph with N labeled vertices, $v_1, v_2, \ldots, v_N$. The pair $(v_i, v_j)$ is an edge of the
graph if and only if $A_{ij} \neq 0$. The diagonal element $A_{ii}$ corresponds to a loop or
selfedge and is always present for a nonsingular matrix. If $(v_i, v_j)$ forms a valid
edge, vertices $v_i$, $v_j$ are said to be adjacent to each other and the corresponding
edge is incident on each of the vertices. The number of edges incident on a vertex
denotes the degree of the vertex. The distance between two vertices $d(v_i, v_j)$ is the
length of the shortest connected path between them. The largest distance between
$v_i$ and any other vertex of the graph is called the eccentricity of the vertex, $e(v_i)$.
The vertex with the largest eccentricity is called the peripheral vertex. Since no
efficient algorithms are available for determining a peripheral vertex, a
pseudo-peripheral vertex is used. $v_i$ is a pseudo-peripheral vertex if
$d(v_i, v_j) = e(v_i)$ implies that $e(v_i) = e(v_j)$, thereby guaranteeing that the
eccentricity of the selected vertex is large.
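
Distances and eccentricities follow directly from a breadth-first search, and a pseudo-peripheral vertex can be found by repeatedly jumping to a farthest vertex. The sketch below is one common heuristic of that kind, stated here as an assumption rather than as the procedure used in the book; it assumes a connected graph stored as a dictionary of neighbour lists, and it tests only one farthest vertex per step. The function names are ours.

```python
from collections import deque

def distances(adj, root):
    """Breadth-first search: shortest path length from root to each vertex."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def eccentricity(adj, v):
    return max(distances(adj, v).values())

def pseudo_peripheral(adj, start):
    """Move to a farthest vertex of lowest degree until the eccentricity
    stops growing; return the vertex reached."""
    v, ecc = start, eccentricity(adj, start)
    while True:
        dist = distances(adj, v)
        far = [u for u, d in dist.items() if d == ecc]
        u = min(far, key=lambda w: len(adj[w]))
        ecc_u = eccentricity(adj, u)
        if ecc_u <= ecc:
            return v
        v, ecc = u, ecc_u

# A chain of five vertices, 1 - 2 - 3 - 4 - 5.
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
middle_ecc = eccentricity(chain, 3)
end_vertex = pseudo_peripheral(chain, 3)
```

Starting from the middle of the chain (eccentricity 2), the search ends at one of the two end vertices, whose eccentricity of 4 is the largest in the graph.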
Most matrix reordering algorithms start with a vertex of minimum degree or a
pseudo-peripheral vertex. The bandwidth reduction algorithm mentioned here is due
to Cuthill and McKee [7]. Starting with a pseudo-peripheral vertex, all unlabeled
vertices adjacent to it are labeled successively in order of increasing degree; the
process is then repeated for each newly labeled vertex in turn until all vertices are
labeled. The reverse Cuthill-McKee algorithm is used when the matrix profile needs
to be minimized. In this case, the orderings of the Cuthill-McKee algorithm are
merely reversed to arrive at the reduced profile. Figure 9.3 shows a typical profile
reduction algorithm at work. Notice the bandedness of the final system compared with
the arbitrary sparsity pattern of the original matrix. King [8] also proposed a profile
reduction algorithm with similar performance characteristics as the reverse
Cuthill-McKee algorithm. The profile reduction and bandwidth reduction algorithms
are useful since they save both storage and operation count in the triangular
factorization process. However, none of them explicitly minimizes the fill-in of the
factors.
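
A compact sketch of the Cuthill-McKee labeling and its reversal is given below, assuming a connected graph held as a dictionary of neighbour lists and a starting vertex supplied by the caller. The function names and the five-vertex test graph are hypothetical; ties in degree are broken by vertex label purely to make the result reproducible.

```python
from collections import deque

def cuthill_mckee(adj, start):
    """Label vertices level by level, neighbours in order of increasing degree."""
    order, seen = [start], {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in sorted(adj[v], key=lambda w: (len(adj[w]), w)):
            if u not in seen:
                seen.add(u)
                order.append(u)
                queue.append(u)
    return order

def bandwidth(adj, order):
    """Largest |i - j| over the nonzeros when vertices are numbered by order."""
    pos = {v: k for k, v in enumerate(order)}
    return max(abs(pos[v] - pos[u]) for v in adj for u in adj[v])

# A chain 0 - 3 - 1 - 4 - 2 whose natural numbering scatters the nonzeros.
adj = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4]}

natural = bandwidth(adj, [0, 1, 2, 3, 4])
cm = cuthill_mckee(adj, 0)           # vertex 0 is an end of the chain
rcm = cm[::-1]                       # reverse Cuthill-McKee ordering
reordered = bandwidth(adj, rcm)
```

Reversing the ordering leaves the bandwidth unchanged; its benefit, as the text notes, lies in the profile of the reordered matrix.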
Figure 9.3 (a) Original matrix structure; (b) matrix structure after re-ordering using
a profile reduction algorithm. (Both panels: row and column indices from 0 to
3 x 10^4, nz = 469151.)

The algorithm commonly used for reducing fill-in during factorization of a
sparse matrix is called the minimum degree algorithm. The idea behind the algorithm
is intuitively simple, and it is one of the cheapest and most effective computationally.
Fill-in and operation count are minimized locally by selecting, at each stage of the
elimination process and among all possible diagonal entries, that row and column
which introduces the least number of nonzeros in the resulting factor. It is quite
amazing that such a simple idea works so effectively. One of the problems with the
original implementation of the algorithm was that the total storage could not be
predicted beforehand. To alleviate this problem, George and Liu [9] introduced the
concept of indistinguishable vertices. Two vertices are said to be indistinguishable
from each other if they are adjacent and share the same set of neighbors, and hence
have the same degree. Also, once they are indistinguishable at an intermediate
factorization step, they continue to be so till one of them is eliminated. If one of
them is of minimum degree, then elimination of one is directly followed by the
elimination of its partner in the next step. A detailed description of the minimum
degree algorithm is well beyond the scope of this book. It can be found in most texts
on graph theory as well as in Chapters 4 and 5 of [5].
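
The greedy rule at the heart of the minimum degree algorithm fits in a few lines when efficiency is ignored. The following is a naive sketch only: it recomputes degrees from explicit neighbour sets at every step and has none of the refinements (such as indistinguishable vertices) of production codes. The name `minimum_degree_order` and the test pattern are our own choices.

```python
def minimum_degree_order(adj):
    """Eliminate, at each step, a vertex of lowest current degree.
    Returns the elimination order and the number of fill pairs created."""
    g = {v: set(nb) for v, nb in adj.items()}
    order, fill = [], 0
    while g:
        v = min(g, key=lambda w: (len(g[w]), w))   # ties broken by label
        nbrs = g.pop(v)
        for u in nbrs:
            g[u].discard(v)
        for u in nbrs:                 # neighbours of v become coupled
            for w in nbrs:
                if u < w and w not in g[u]:
                    g[u].add(w)
                    g[w].add(u)
                    fill += 1
        order.append(v)
    return order, fill

# Pattern with a full first row and column: vertex 0 is joined to all others.
n = 6
arrow = {0: set(range(1, n))}
arrow.update({i: {0} for i in range(1, n)})

order, fill = minimum_degree_order(arrow)
```

For this pattern the rule postpones the dense vertex until its degree has dropped to one, so the factorization proceeds without any fill.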

9.4 ITERATIVE EQUATION SOLVERS

In PDE techniques like finite elements or finite differences, the order N of the
system of linear equations may be very large. Three-dimensional problems lead to
even bigger equation systems. As shown in the earlier section, direct solvers usually
suffer from fill-in to an extent that these large problems cannot be solved at a
reasonable cost even on state-of-the-art parallel machines. It is, therefore, essential
to employ solvers whose memory requirements are a small fraction of the storage
demand of the coefficient matrix. This necessitates the use of iterative algorithms
instead of direct solvers to preserve the sparsity pattern of the finite element matrix.
Especially attractive are iterative methods that involve the coefficient matrices
only in terms of matrix-vector products with $A$ or $A^T$. The most powerful iterative
algorithm of this type is the conjugate gradient algorithm for solving positive definite
linear systems [10]. In this section, we will discuss some algorithms which have been
found effective in solving the sparse matrices that occur in our application. These are
the biconjugate gradient (BiCG) and the quasi-minimal residual (QMR) [11]
algorithms. A version of the generalized minimal residual (GMRES) method is also
presented. These algorithms can also be used for solving unsymmetric matrix
systems as is the case with anisotropic materials.
The convergence pattern of the CG method for self-adjoint positive definite
(SPD) systems can be described by the classical bound

$$
E_n \le 4 E_0 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{2n} \tag{9.10}
$$

where $E_n = r_n^T A^{-1} r_n$, $\kappa = \lambda_{\max}/\lambda_{\min}$ is the spectral
condition number and n is the number of iterations. Axelsson [12] and Van der Vorst
[13] examine the convergence of the CG algorithm in detail. Its convergence is shown
to be superlinear, with a convergence rate that depends on the distribution of the
(mainly smallest) eigenvalues of A, rather than on the spectral condition number.
However, if the matrix A is not too far from being positive definite, which is the case
with the matrix systems emerging from edge element implementations, the BiCG and
CG algorithms should still converge. Some implementations espouse premultiplication
of A by $A^T$ to ensure positive definiteness of the system. Unless the condition
number is known a priori to be small or the matrix is unitary, this is a very bad idea,
since the condition number of $A^T A$ is the square of that of A and the convergence
is going to be drastically slow, as is evident from (9.10).
The biconjugate gradient (BiCG) method is a variation of the CG algorithm.
This scheme is useful for solving unsymmetric systems; however, it performs equally
well when applied to symmetric systems of linear equations. For symmetric matrices,
BiCG differs from CG in the way the inner product of the vectors is taken. BiCG
usually converges much faster than CG; however, the convergence is highly erratic.
The BiCG algorithm for unsymmetric matrix problems is given in Fig. 9.4. For
symmetric positive definite matrix systems, the BiCG algorithm needs approximately
half the computational work and only one matrix-vector multiply operation per
iteration. The complete algorithm is presented later in the chapter (Fig. 9.18).
The conjugate gradient squared (CGS) algorithm [14] performs best when
applied to unsymmetric systems of linear equations. A big advantage of CGS over
BiCG when solving unsymmetric equation systems is that the matrix-vector product
only involves the matrix $A$ and not $A^T$. Figure 9.5 shows the CGS algorithm for
positive definite matrices, as it appears in [14]. It should be noted that CGS requires
a second, arbitrary nonzero starting vector s. Usually, s is set to $r_0$ or $r_1$ if $r_0$ is
full or to random entries when $r_0$ is sparse. CGS usually converges faster than BiCG
but is more unstable since the residual polynomials are the squared BiCG polynomials
and hence exhibit even more erratic behavior than the BiCG residuals. Moreover,
there are cases where CGS diverges, while BiCG still converges. The erratic
convergence pattern of BiCG and CGS has to do with the nonpositive definiteness of
the inner products used [14]. The BiCGSTAB algorithm [15] alleviates this problem by
damping out the wild fluctuations.
Initialization
    $x_0$ given
    $r_0 = s_0 = b - A x_0$;  $p_{-1} = q_{-1} = 0$;  $\rho_{-1} = 1$;  $n = 0$

Repeat until (resd < tol)
    $\rho_n = s_n \cdot r_n$                          (1)
    $\beta_n = \rho_n / \rho_{n-1}$                   (2)
    $p_n = r_n + \beta_n p_{n-1}$                     (3)
    $q_n = s_n + \beta_n q_{n-1}$                     (4)
    $\alpha_n = \rho_n / (q_n \cdot A p_n)$           (5)
    $r_{n+1} = r_n - \alpha_n A p_n$                  (6)
    $s_{n+1} = s_n - \alpha_n A^T q_n$                (7)
    $x_{n+1} = x_n + \alpha_n p_n$                    (8)
EndRepeat

A is a sparse complex unsymmetric matrix.

Figure 9.4 Unsymmetric biconjugate gradient algorithm.
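
The two coupled recurrences of BiCG, one driven by $A$ and one by $A^T$, can be sketched in Python for a small real system. This is the standard form of the method and is meant only as an illustration: dense lists replace the sparse complex storage, the shadow residual is started equal to the residual, and an exact zero in either inner product is reported as a breakdown rather than handled. All function names are ours.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def matvec_t(A, x):
    """Product with the transpose, A^T x, without forming A^T."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]

def bicg(A, b, tol=1e-12, max_iter=100):
    n = len(b)
    x = [0.0] * n
    r = list(b)                        # residual, r0 = b - A x0 with x0 = 0
    s = list(r)                        # shadow residual
    p, q = [0.0] * n, [0.0] * n
    rho_old = 1.0
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, it
        rho = dot(s, r)
        if rho == 0.0:
            raise ZeroDivisionError("BiCG breakdown: s . r = 0")
        beta = rho / rho_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        q = [si + beta * qi for si, qi in zip(s, q)]
        Ap = matvec(A, p)
        sigma = dot(q, Ap)
        if sigma == 0.0:
            raise ZeroDivisionError("BiCG breakdown: q . A p = 0")
        alpha = rho / sigma
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        s = [si - alpha * ti for si, ti in zip(s, matvec_t(A, q))]
        rho_old = rho
    return x, max_iter

A = [[4.0, 1.0], [2.0, 3.0]]           # unsymmetric test matrix
b = [1.0, 2.0]
x, iterations = bicg(A, b)             # exact solution is (0.1, 0.6)
```

Each pass through the loop costs one product with the matrix and one with its transpose, which is the overhead that CGS and the transpose-free methods avoid.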

Initialization
    $x_0$ and $s$ given ($s$: arbitrary)
    $r_0 = b - A x_0$;  resd $= \sqrt{r_0^* \cdot r_0}$
    $q_0 = p_{-1} = 0$;  $\rho_{-1} = 1$;  $n = 0$

Repeat until (resd <= tol)
    $\rho_n = s \cdot r_n$                                    (1)
    $\beta_n = \rho_n / \rho_{n-1}$                           (2)
    $u_n = r_n + \beta_n q_n$                                 (3)
    $p_n = u_n + \beta_n (q_n + \beta_n p_{n-1})$             (4)
    $v_n = A p_n$                                             (5)
    $\sigma_n = s \cdot v_n$;  $\alpha_n = \rho_n / \sigma_n$ (6)
    $q_{n+1} = u_n - \alpha_n v_n$                            (7)
    $r_{n+1} = r_n - \alpha_n A (u_n + q_{n+1})$              (8)
    $x_{n+1} = x_n + \alpha_n (u_n + q_{n+1})$                (9)
    $n = n + 1$
EndRepeat

A is a sparse complex unsymmetric matrix.

Figure 9.5 Conjugate gradient squared algorithm.



Besides the problem of erratic convergence, BiCG suffers from the problem of
breakdowns. Although the breakdowns occur only in exact arithmetic, the robustness of
the iterative solver is compromised and near-breakdowns can adversely affect
convergence accuracy. The BiCG algorithm is said to have broken down if

$$
q_n \cdot A p_n = 0, \quad \text{when } s_n \neq 0,\ r_n \neq 0 \tag{9.11}
$$

or if

$$
s_n \cdot r_n = 0, \quad \text{when } s_n \neq 0,\ r_n \neq 0 \tag{9.12}
$$

According to Freund et al. [16], the first type of breakdown can be attributed to the
Galerkin condition that a BiCG residual must satisfy. The second type of breakdown
parallels the breakdown in the unsymmetric Lanczos process.
Freund [11] has proposed the quasi-minimal residual (QMR) algorithm with
look-ahead for solving linear equation systems. QMR eliminates the oscillations in
the BiCG residual norm and generates smooth, near monotonically converging
iterates. QMR also avoids breakdowns of the first and second kind: the latter is
corrected by using a look-ahead technique. Moreover, transpose-free QMR algorithms
exist for solving unsymmetric matrix systems. The reader is referred to [11] for
algorithms pertaining to unsymmetric matrices with look-ahead. It should be
mentioned in passing that in most cases, breakdowns do not occur; however, for the
sake of robustness, the look-ahead feature should be included in commercial codes.
The QMR algorithm for complex symmetric systems without look-ahead is presented
in Fig. 9.6.
In most cases, QMR converges in about the same number of iterations as
BiCG with one significant difference. Since the algorithm minimizes the residual at
each iteration, convergence criteria can be set at a maximum number of iterations
rather than at the residual norm for large problems. It has also been observed that
QMR stagnates for those iterations when BiCG yields wildly oscillating residual
norms.

Initialization
    $x_0$ given
    $r_0 = b - A x_0$;  resd $= \sqrt{r_0^* \cdot r_0}$;  $\rho_1 =$ resd;  $v_1 = r_0/$resd
    $p_0 = d_0 = 0$;  $c_0 = \epsilon_0 = 1$;  $\vartheta_0 = 0$;  $\eta_0 = -1$;  $n = 1$

Repeat for n = 1, 2, ...
    if $\epsilon_{n-1} = 0$ then stop (breakdown)
    $\delta_n = v_n \cdot v_n$;  if $\delta_n = 0$ then stop                              (1)
    $p_n = v_n - (\rho_n \delta_n / \epsilon_{n-1})\, p_{n-1}$                            (2)
    $\epsilon_n = p_n \cdot A p_n$;  $\beta_n = \epsilon_n / \delta_n$                    (3)
    $\tilde{v}_{n+1} = A p_n - \beta_n v_n$                                               (4)
    $\rho_{n+1} = \|\tilde{v}_{n+1}\|$;  $\vartheta_n = \rho_{n+1} / (c_{n-1} |\beta_n|)$ (5)
    $c_n = 1/\sqrt{1 + \vartheta_n^2}$;  $\eta_n = -\eta_{n-1} \rho_n c_n^2 / (\beta_n c_{n-1}^2)$   (6)
    $d_n = \eta_n p_n + (\vartheta_{n-1} c_n)^2 d_{n-1}$                                  (7)
    $x_n = x_{n-1} + d_n$;  $v_{n+1} = \tilde{v}_{n+1} / \rho_{n+1}$                      (8)
    $n = n + 1$
End

A is a sparse complex symmetric matrix.

Figure 9.6 Quasi-minimal residual (QMR) algorithm for complex symmetric
matrices without look-ahead.
Another mildly bothersome aspect of QMR is that the residual vector cannot
be recovered without going through a matrix-vector multiplication. However, there
is an excellent upper bound for the residual norm defined as

$$
\|r_n\| \le \|r_0\| \sqrt{n + 1}\; |s_1 s_2 \cdots s_n| \tag{9.13}
$$

which is approximately five times larger than the true residual norm. Usually, the
upper bound is computed until we are very close to the tolerance and then the
residual is computed exactly by carrying out an additional matrix-vector multiply
for about five to ten more iterations.

The GMRES (Generalized Minimal Residual) algorithm (see Fig. 9.7), first
proposed by Saad and Schultz [17], is another iterative solver for sparse systems. It
leads to the smallest residual for a fixed number of iteration steps. However, each of
these iteration steps becomes increasingly more expensive because GMRES stores
the iteration vectors. To limit the increasing storage requirements and workload per
iteration step, it is necessary to restart the algorithm. The proper choice of the
number of spanning vectors m requires a priori experience, and is dependent on the
system parameters and sampling rate. With regard to the number of operations,
GMRES requires only one matrix-vector product per iteration but the number of
inner products increases linearly with the iteration steps. In addition to the matrix
storage requirements, $(m + 1)N$ complex words are needed to store the m search or
spanning vectors. The only drawback of GMRES is that CPU time and storage per
iteration increase linearly with the iteration count. A way to overcome this is to
choose the number of spanning vectors m in an optimal way. If m is too small,
GMRES may exhibit slow convergence or it may not converge at all. On the other
hand, if m is large, an excessive amount of work is needed since the CPU time is
$O(m^2 N)$. More details about this solver are given in [18] and [17].

Initialization
    x is given
    Define the $(m + 1) \times m$ matrix $H_m = [h_{ij},\ 1 \le i \le m + 1 \text{ and } 1 \le j \le m]$
    Specify the number of spanning vectors m

START: $r_0 = b - A x$
    ...
Repeat2 for j = 1, ..., m
    ...
    Repeat1 for i = 1, ..., j
        ...
    End Repeat1
    ...
End Repeat2

Compute $y_m$ to minimize $\|\beta e_1 - H_m y\|_2$
$V_m = \{v_1, v_2, \ldots, v_m\}$
$x = x + V_m y_m$
If convergence is achieved, then stop, else go to START

Figure 9.7 Restarted GMRES algorithm ($e_j$ refers to the jth column of the identity
matrix).

Figure 9.8 shows a comparison of the performance among the three different
solvers (BiCG, QMR, and GMRES). This comparison is done for a sparse
symmetric three-dimensional FEM system of approximately 6700 unknowns
simulating an enclosed transmission line. The GMRES solver was applied with only
m = 12 and converged after about 75 restarts. Its error history was monotonic, a
typical characteristic of GMRES.

Finally, we make mention of the BiCGSTAB algorithm, originated by Van der
Vorst [15]. BiCGSTAB combines monotonic convergence with the superior storage
properties of BiCG.

Figure 9.8 Example convergence history of the BiCG, QMR, and GMRES
algorithms (horizontal axis: iterations). For GMRES, iterations refer to restarts.
[Courtesy of Y. Botros.]

9.5 PRECONDITIONING

The condition number of a system of equations usually increases with the number of
unknowns. It is then desirable to precondition the coefficient matrix such that the
modified system is well conditioned and converges in significantly fewer iterations
than the original system. The equivalent preconditioned system is of the form

$$C^{-1} A x = C^{-1} b \qquad (9.14)$$


The nonsingular preconditioning matrix C must satisfy the following conditions. It

1. should be a good approximation to A.
2. should be easy to compute.
3. should be invertible in O(N) operations.

The preconditioners mentioned in the following sections are the diagonal and the ILU
point and block preconditioners. Block preconditioners are usually preferable due to
reduced data movement between memory level hierarchies as well as a decreased
number of iterations required for convergence. Block algorithms are also suited for
high-performance computers with multiple processors since all scalar, vector, and
matrix operations can be performed with a high degree of parallelism.

9.5.1 Diagonal Preconditioner

The simplest preconditioner that can be used in iterative solvers is the point
diagonal preconditioner. The preconditioning matrix C is a diagonal matrix which is
easy to invert and has a storage requirement of N complex words, where N is the
number of unknowns. The entries of C are given by

$$C_{ij} = \delta_{ij} A_{ij}, \quad i = 1, \ldots, N; \; j = 1, \ldots, N \qquad (9.15)$$

where δ_ij is the Kronecker delta. The matrix C^{-1} contains the reciprocals of the
diagonal elements of A. For a positive definite matrix, the diagonal preconditioner
is very effective both in terms of cost and performance. In the matrix systems we
dealt with, the diagonally preconditioned systems converged in about 35 percent of
the number of iterations required for the unpreconditioned cases. The diagonal
preconditioner is also easily vectorizable and consumes 4.1 microseconds per
iteration per unknown on a Cray YMP, a marginal slowdown over the unpreconditioned
system.
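
As a small illustration of (9.15), the sketch below applies C^{-1} = diag(1/A_ii) inside a conjugate gradient loop. It is not the solver used for the timings quoted above: the real symmetric positive definite test matrix, the dense list-of-rows storage, and the names pcg, jacobi_inverse, and scaled_test_matrix are all assumptions made for this example.

```python
import math

def jacobi_inverse(A):
    """C^{-1} for the point diagonal preconditioner: reciprocals of the diagonal of A."""
    return [1.0 / A[i][i] for i in range(len(A))]

def pcg(A, b, inv_diag=None, tol=1e-10, max_iter=1000):
    """Conjugate gradients for SPD A; if inv_diag is given, z = C^{-1} r is used."""
    x = [0.0] * len(b)
    r = b[:]
    z = [ri * di for ri, di in zip(r, inv_diag)] if inv_diag else r[:]
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    bnorm = math.sqrt(sum(bi * bi for bi in b))
    for k in range(1, max_iter + 1):
        Ap = [sum(a * pj for a, pj in zip(row, p)) for row in A]
        alpha = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(sum(ri * ri for ri in r)) <= tol * bnorm:
            return x, k
        # Applying the preconditioner costs N multiplications per iteration.
        z = [ri * di for ri, di in zip(r, inv_diag)] if inv_diag else r[:]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

def scaled_test_matrix(n):
    """Hypothetical badly scaled SPD matrix S*T*S, T = tridiag(-0.3, 1, -0.3), S = diag(1..n)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = float((i + 1) ** 2)
        if i + 1 < n:
            A[i][i + 1] = A[i + 1][i] = -0.3 * (i + 1) * (i + 2)
    return A
```

Calling pcg(A, b) and pcg(A, b, jacobi_inverse(A)) on the same system returns the iteration count as the second value, so the effect of the diagonal scaling can be compared directly.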
A more general diagonal preconditioner is the block diagonal preconditioner.
The point diagonal preconditioner is a block diagonal preconditioner with block size
1. The block diagonal preconditioning matrix consists of m x m symmetric blocks, as
shown in Fig. 9.9. The inverse of the whole matrix is simply the inverse of each
individual block put together. If the preconditioning matrix C is broken up into n
blocks of size m, the storage requirement for the preconditioner is at most m^2 x N.
However, this method suffers a bit from fill-in since the inverted m x m blocks are
dense even though the original blocks may have been sparse. For this reason,
large blocks cannot be created since the inverted blocks would lead to full matrices
and take a significant fraction of the total CPU time for inversion. However, since
the structure of the preconditioning matrix is known a priori, this preconditioner
vectorizes well and has been observed to run at 194 MFLOPS on the Cray YMP for
a block size of 8. As an example, for a system of 20,033 unknowns, a block size of 2
caused the maximum reduction in the number of iterations (14 percent) and ran at
197 MFLOPS.

Figure 9.9 Structure of block preconditioning matrix.
The simple diagonal preconditioner was implemented and tested for solving the
problem (with approximately 6700 unknowns) of a shielded microstrip line. The
significance of the preconditioner is illustrated in Fig. 9.10, where the number of
iterations dropped from 174 to 78 when the diagonal preconditioner is employed. In
general, the diagonal preconditioner saves from 30 percent to 60 percent of the total
number of iterations and CPU time for all of the iterative solvers. This percentage, of
course, is highly dependent on many factors and parameters. The element shape
employed to discretize the computational domain, the sampling or discretization
rate, the layer parameters, etc., all affect the performance.

Figure 9.10 Convergence pattern for GMRES with and without diagonal
preconditioning for a standard problem (curves: GMRES without and with
preconditioning; horizontal axis: iterations). [Courtesy of Y. Botros.]

9.5.2 Incomplete LU (ILU) Preconditioner

The ILU preconditioner is considered an improvement over the diagonal
preconditioner. As mentioned earlier, the convergence of conjugate-gradient methods
depends on the clustering of eigenvalues in the spectrum of the preconditioned
system. The ILU preconditioner enhances the eigenvalue clustering, leading to
improved convergence. In our test cases, the traditional ILU preconditioner [19]
was employed with zero fill-in. Higher values of fill-in usually improve the
convergence even further until a trade-off is reached between storage, matrix-vector
multiplication time, and iteration count for convergence [12]. There is another flavor of
ILU called ILUT [20] which stores matrix elements only if they exceed a certain
threshold value relative to the diagonal. In the cases mentioned below, no attempt
was made to employ higher values of fill-in since the preconditioner already occupied
storage space equal to that of the coefficient matrix.

Algorithm 1: Modified ILU Preconditioner with Zero Fill-in

It is assumed that the data is stored in CSR format and that the column numbers for
each row are sorted in increasing order. The nonzero entries of the sparse matrix are
stored row by row in a vector and the column numbers in PC. SIG(i) contains the
total number of nonzeros till the ith row. The locations of the diagonal entries for
each row are stored in the vector DIAG. The preconditioner is stored in a complex
vector, LU.

for i=1 step 1 until n-1 do
begin
    lbeg=diag(i)
    lend=sig(i)
    d=lu(lbeg)
    for j=lbeg+1 step 1 until lend do
    begin
        jj=pc(j)
        ij=srch(jj,i)
        if (ij.ne.0) then
        begin
            e=lu(ij)/d
            lu(ij)=e
            for k=lbeg+1 step 1 until lend do
            begin
                kk=pc(k)
                ik=srch(kk,i)
                if (ik.ne.0) lu(ik)=lu(ik)-e*lu(k)
            end
        end
    end
end
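
For readers who want to experiment, the zero fill-in idea can be sketched compactly in Python. This is not a transcription of the CSR routine above: it assumes a small real matrix held densely as a list of rows, uses the row-oriented (IKJ) form of the elimination, and the names ilu0 and ilu_solve are hypothetical. Updates are simply skipped wherever the original matrix has a zero, which is what "zero fill-in" means.

```python
def ilu0(A):
    """ILU with zero fill-in. Returns one array holding U on and above the
    diagonal and the multipliers of the unit lower triangular L below it."""
    n = len(A)
    LU = [row[:] for row in A]
    pattern = [[A[i][j] != 0 for j in range(n)] for i in range(n)]
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue                      # no entry here, so nothing to eliminate
            LU[i][k] /= LU[k][k]              # multiplier l_ik
            for j in range(k + 1, n):
                if pattern[i][j]:             # update only existing nonzeros
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU

def ilu_solve(LU, r):
    """Apply the preconditioner: solve L U z = r by forward and backward substitution."""
    n = len(r)
    z = r[:]
    for i in range(n):                        # forward substitution (unit diagonal L)
        for j in range(i):
            z[i] -= LU[i][j] * z[j]
    for i in range(n - 1, -1, -1):            # backward substitution with U
        for j in range(i + 1, n):
            z[i] -= LU[i][j] * z[j]
        z[i] /= LU[i][i]
    return z
```

For a tridiagonal matrix no fill-in occurs, so ilu0 reproduces the exact LU factors; for an "arrow" matrix such as [[4, 1, 1], [1, 4, 0], [1, 0, 4]] the two zero positions stay zero in the factors, and L U agrees with A only on the original sparsity pattern.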

In comparison with the traditional ILU preconditioner given above, the
modified ILU preconditioner eliminates the inner loop over the integer variable k. The
modified algorithm basically scales the off-diagonal elements in the lower triangular
portion of the matrix by the column diagonal. Since the matrix is symmetric, it
retains the LDL^T form and is also positive definite if the coefficient matrix is positive
definite. For our test cases, the modified ILU was especially helpful since the
traditional ILU preconditioned system may not have been positive definite, as
documented in [21]. The modified ILU preconditioner is also less expensive to generate and
converges in about 1/3 the number of iterations taken by the point diagonal
preconditioner. However, on vector architectures, the time taken by the two
preconditioning strategies is approximately the same since each iteration of the ILU
preconditioned system is about three times more expensive [22]. The forward and
backward substitutions are very difficult to parallelize and prove to be the bottleneck
since they are inherently sequential processes with vector lengths approximately half
that of the sparse matrix-vector multiplication process. The triangular solver is also
extremely difficult to parallelize [23].
One way to improve the parallelization of the ILU preconditioner is to use level
scheduling and self-scheduling [23]. In particular, level scheduling can be used to
increase parallelizability by taking advantage of matrix structure and sparsity. For
solving any lower triangular system Lx = b, the ith unknown in the forward solution
is given by

$$x_i = \frac{1}{l_{ii}} \left( b_i - \sum_{j=1}^{i-1} l_{ij} x_j \right) \qquad (9.16)$$

If L is dense, all the components x_1, ..., x_{i-1} need to be computed before x_i can be
obtained. However, when L is sparse, most of the l_ij's are zero; hence, we may not
need to compute all of the unknowns x_1, ..., x_{i-1} before solving for x_i. Level
scheduling is based on this simple observation. The dependencies between the unknowns
can be modeled using a graph in which node i corresponds to the unknown x_i and an
edge from node j to node i indicates that l_ij ≠ 0, implying that the value of x_j is
needed for solving x_i. The operation shown in (9.16) can then be rewritten as

$$x_i = \frac{1}{l_{ii}} \Big( b_i - \sum_{j < i,\; l_{ij} \neq 0} l_{ij} x_j \Big) \qquad (9.17)$$

Thus x_i can be solved at the kth step if all the components x_j in (9.17) have been
computed in the earlier steps.
To implement the level scheduling algorithm, it is first necessary to define the
depth of a node and the level of the graph. The depth of a node is defined as the
maximum distance from the root [3]. Therefore, let us place an imaginary root node
with links to the nodes having no predecessors so that the depth of each node will be
defined from the same point. The depth of each node can now be computed with one
pass through the structure of the coefficient matrix L by

$$\mathrm{depth}(i) = \begin{cases} 1 & \text{if } l_{ij} = 0 \text{ for all } j < i \\ 1 + \max_{j<i} \{ \mathrm{depth}(j) : l_{ij} \neq 0 \} & \text{otherwise} \end{cases} \qquad (9.18)$$

The level of the graph can then be defined as the set of nodes with the same depth.
The level scheduling algorithm can now be implemented without physically ordering
the matrix, by solving the system in increasing order of node depth and distributing
the nodes at each depth among the available processors.

Algorithm 2: Forward Elimination Step with Level Scheduling

The number of levels of the graph, nlev, can be easily determined from the depth
information. To do so, let us define two other integer vectors: IORDER(i) stores the
ordering of the rows of L in terms of increasing node depth and ILEVEL(i)
stores the index to the start of each level in IORDER(i).

do k=1,...,nlev
    do j=ilevel(k),...,ilevel(k+1)-1 (parallel loop)
        i=iorder(j)
        execute Equation (9.17)
    enddo
enddo
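
The depth computation of (9.18) and the level-ordered forward solve can be illustrated with the short Python sketch below. It assumes a small lower triangular matrix stored densely as a list of rows and uses 0-based indices; the names level_schedule and forward_solve_by_levels are hypothetical, and the inner loop is run sequentially here although its iterations are independent and could be distributed among processors.

```python
def level_schedule(L):
    """Return (depth, levels): depth[i] follows (9.18); levels[k] lists the
    rows (0-based) whose depth is k + 1."""
    n = len(L)
    depth = [0] * n
    for i in range(n):                     # one pass through the structure of L
        preds = [depth[j] for j in range(i) if L[i][j] != 0]
        depth[i] = 1 + max(preds) if preds else 1
    nlev = max(depth)
    levels = [[i for i in range(n) if depth[i] == k] for k in range(1, nlev + 1)]
    return depth, levels

def forward_solve_by_levels(L, b, levels):
    """Solve L x = b level by level, using only the nonzero l_ij as in (9.17)."""
    x = [0.0] * len(b)
    for level in levels:                   # levels must be processed in order
        for i in level:                    # rows within a level are independent
            s = sum(L[i][j] * x[j] for j in range(i) if L[i][j] != 0)
            x[i] = (b[i] - s) / L[i][i]
    return x
```

For example, if rows 0 and 1 have no off-diagonal entries, row 2 depends on row 0, and row 3 depends on rows 1 and 2, the depths are 1, 1, 2, 3, so rows 0 and 1 can be solved simultaneously in the first step.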

However, in our experience, parallelizing the ILU preconditioned system with level
scheduling did not lead to significant speedup, mainly due to the enormous amount of
memory traffic that was generated. This observation was also made in [23], where
the authors estimated that the parallel algorithm generated as much as ten times
more traffic than the sequential code. The block ILU preconditioner considered next
reduces memory traffic and is thus more effective to parallelize.
In implementing the block ILU preconditioner, one block is distributed to each
processor in a multiprocessor architecture, thus achieving load balancing as well as
minimizing fill-in. The modified ILU decomposition outlined earlier is then carried
out on each of these individual blocks. Further, since the blocks are much larger than
in the block diagonal version, the preconditioner is a closer approximation to the
coefficient matrix. Moreover, the triangular solver is fully parallelized since each
processor solves an independent system of equations through forward and backward
substitution. For example, in an equation system with 20,033 unknowns, the number
of iterations was reduced by approximately half the number required by the diagonal
preconditioner. However, since the work done is less than twice that for the diagonal
preconditioner, only marginal savings of CPU time were achieved in this case. Also,
the number of iterations required for convergence is highly sensitive to block size, as
shown in Table 9.1 for the 20,033 unknown system. Table 9.1 clearly shows that a
larger block size (smaller number of blocks) does not guarantee faster convergence.
Nevertheless, there is an approximately 50 percent decrease in the number of
iterations over the point diagonal preconditioner, regardless of block size. The optimum
block size is dependent on the sparsity pattern of the matrix and can only be
determined empirically. The savings in the number of iterations over the point diagonal
preconditioner for 28 blocks is given in Table 9.2 for a system having 224,476
unknowns. From the table, it is clearly observed that the block ILU preconditioner
is very effective in reducing the iteration count; however, the CPU time required is
about 10 percent less than that required by the point diagonal preconditioner for the
best case.

TABLE 9.1 No. of Iterations versus Number of Blocks for a Block
ILU Preconditioned Biconjugate Gradient Solution Method

No. of Blocks    No. of Iterations

      1                127
      2                176
      4                185
      8                172
     12                162
     16                174
     24                223
     28                177

TABLE 9.2 Number of Iterations Required for Convergence of a
224,476 Unknown System Using the Point Diagonal and Block ILU
Preconditioning Strategies

                                  Iterations

Angle of Incidence    Point Diagonal (I)    Block ILU (II)    Ratio (II/I)

        0                   2943                 2758            .937
       10                   5985                 3834            .641
       20                   5464                 3984            .729
       30                   6048                 3651            .604
       40                   5770                 3256            .564
       50                   5107                 3720            .728
       60                   6517                 4162            .639
       70                   5076                 4108            .809
       80                   5305                 3551            .669
       90                   2898                 2832            .977

In a nutshell, the simplest and the most effective preconditioner was found to be the
diagonal preconditioner. It is also amenable to vectorization and parallelization. The
ILU preconditioner should be employed when the matrix system is ill-conditioned
and vectorizability is not an issue. On most high performance PCs and scalar
workstations, the ILU preconditioner performs better than the diagonal one. For parallel
architectures, block ILU is clearly the method of choice. Matrix ordering strategies
that minimize matrix profile can further enhance the block ILU performance since
only a small fraction of resultant nonzeros will lie outside the blocks.

9.5.3 Approximate Inverse Preconditioner

When the matrix A is indefinite, standard preconditioning techniques may fail
to achieve quick convergence. Most of the previous preconditioners (such as the
diagonal and ILU) will perform poorly in this case.
The Approximate Inverse Preconditioner (AIPC) is typically more robust and
operates by minimizing the Frobenius norm of the matrix R,

$$R = I - AM \qquad (9.19)$$

where M is the preconditioning matrix and I refers to the identity matrix.
The Frobenius norm of any matrix S of dimension (m x n) is given by

$$\|S\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |s_{ij}|^2} \qquad (9.20)$$

Referring to [17], this minimization can be achieved in two different ways. The
first is via the Global Iteration approach, which treats the matrix M as an unknown
sparse matrix and minimizes the objective function in (9.19). One of the well-known
techniques that employs this approach is the Global Steepest Descent Method. Its
implementation is as follows:

    Initialize M
    Repeat for i = 1, 2, ... till convergence
        Update M
    EndRepeat

The drawback of this technique is its high CPU time and memory cost (both of
order n^2). This is because the entire matrix is used during the minimization process.
However, the Column-Oriented Algorithm minimizes the individual functions

$$f_j(m) = \|e_j - A m_j\|_F^2, \quad j = 1, 2, \ldots, n \qquad (9.21)$$

where e_j and m_j are the jth columns of the identity and preconditioner matrices,
respectively. The F subscript implies the Frobenius norm defined in (9.20).
The algorithm for the Approximate Inverse Preconditioner is given in Fig.
9.11. Note that n_i in the 'Repeat2' loop of this algorithm can be set as small as
unity. However, higher values of n_i will lead to a better preconditioner and thus
reduce the number of iterations needed for convergence. The trade-off is that the
preconditioner will become more dense.

Initialize M

Repeat1 for each column j = 1, 2, ..., n
    m_j = M e_j
    Repeat2 for i = 1, 2, ..., n_i
        r_j = e_j - A m_j
        α_j = (r_j, A r_j) / (A r_j, A r_j)
        m_j = m_j + α_j r_j
        Apply numerical dropping to m_j
    EndRepeat2
EndRepeat1

n_i = number of MR iterations used to minimize R. Large values
lead to a better preconditioner.

Figure 9.11 Approximate inverse preconditioner using MR (minimum residual)
iteration.
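
The column-oriented minimum residual (MR) iteration can be tried out with the Python sketch below. Several things here are assumptions made for the example rather than part of the text: the matrix is small, real, and stored densely; the initial guess is taken as the inverse of the diagonal of A; dropping is a simple magnitude threshold drop_tol; and the names approx_inverse_mr and residual_fro are hypothetical.

```python
import math

def approx_inverse_mr(A, n_inner=1, drop_tol=0.0):
    """Build M column by column with n_inner MR steps per column."""
    n = len(A)
    cols = []
    for j in range(n):
        m = [0.0] * n
        m[j] = 1.0 / A[j][j]              # assumed initial guess: M = diag(A)^{-1}
        for _ in range(n_inner):
            Am = [sum(A[i][k] * m[k] for k in range(n)) for i in range(n)]
            r = [(1.0 if i == j else 0.0) - Am[i] for i in range(n)]   # r_j = e_j - A m_j
            Ar = [sum(A[i][k] * r[k] for k in range(n)) for i in range(n)]
            denom = sum(v * v for v in Ar)
            if denom == 0.0:
                break                     # column is already exact
            alpha = sum(a * b for a, b in zip(r, Ar)) / denom
            m = [mi + alpha * ri for mi, ri in zip(m, r)]
            # Numerical dropping keeps M sparse at the cost of accuracy.
            m = [mi if abs(mi) >= drop_tol else 0.0 for mi in m]
        cols.append(m)
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def residual_fro(A, M):
    """Frobenius norm of R = I - A M, as in (9.19) and (9.20)."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(n):
            am = sum(A[i][k] * M[k][j] for k in range(n))
            total += ((1.0 if i == j else 0.0) - am) ** 2
    return math.sqrt(total)
```

With drop_tol = 0 each MR step cannot increase the norm of a column residual, so residual_fro decreases as n_inner grows; a positive drop_tol trades some of that accuracy for a sparser M.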

9.5.4 Flexible GMRES with Preconditioning

In all the preconditioners discussed in the previous sections, it was implicitly
assumed that the preconditioning matrix M is fixed. However, in many cases, M
may not be a constant operator and therefore, iterative solvers preconditioned with
constant operators may not converge. Flexible iterative solvers are those that allow
variations or changes of the preconditioner from one iteration to another. One of
these solvers is the Flexible GMRES (FGMRES) algorithm. A pseudocode for the
FGMRES algorithm is given in Fig. 9.12 [17].

Initialization
    x is given
    Specify the number of spanning vectors m
    Define the (m + 1) x m matrix H_m = [h_ij, 1 ≤ i ≤ m + 1 and 1 ≤ j ≤ m]

START: r_0 = b - Ax;  β = ||r_0||_2;  v_1 = r_0 / β

Repeat1 for j = 1, ..., m
    z_j = M_j^{-1} v_j
    w = A z_j
    Repeat2 for i = 1, ..., j
        h_ij = w · v_i
        w = w - h_ij v_i
    EndRepeat2
    h_{j+1,j} = ||w||_2;  v_{j+1} = w / h_{j+1,j}
EndRepeat1

Compute y_m to minimize ||β e_1 - H_m y||_2
x = x + Z_m y_m, where Z_m = {z_1, z_2, ..., z_m}
If convergence is achieved, then stop, else go to START

Figure 9.12 Flexible GMRES algorithm.

9.6 EIGENANALYSIS

In the area of microwave circuits, accurate determination of the dominant eigenvalues
or propagation constants of a system is essential for a viable design. Since dominant
mode operation is usually desired, determination of those eigenvalues influences the
operating bandwidth of the design. In finite elements, eigenanalysis is complicated by
three factors: the matrices are sparse, the method gives rise to a generalized
eigenproblem, and only a few selected eigenvalues are desired. The dense eigenvalue
problem is essentially solved, and EISPACK has excellent black box routines to
do the job. However, the sparse eigenproblem is still an area of active research
and will be the main focus of this section.
It should be pointed out that the eigenvalue problem for a general matrix is
usually more difficult to solve than a set of linear equations. Since determination of
eigenvalues requires finding the roots of an nth-order polynomial, it is an essentially
iterative process, as polynomial equations cannot be solved algebraically for fifth-
and higher-order polynomials. However, before the onset of iterations, the system is
usually reduced to a convenient form for fast calculation of eigenvalues. The
reduction does not come cheaply and usually takes longer than the actual eigenvalue
calculation process. It is also in this reduction process that dense and sparse
problems are treated differently.
The standard eigenproblem is defined as

$$Ax = \lambda x \qquad (9.22)$$

where λ denotes the eigenvalue of the matrix A and x represents the corresponding
eigenvector. The generalized eigenproblem found most commonly in finite element
analysis is given as

$$Ax = \lambda Bx \qquad (9.23)$$

where λ is the eigenvalue of the A, B pencil. Usually, A is symmetric and B is
symmetric positive definite for finite element systems. However, as shown in
Chapter 5, B can be symmetric indefinite, which increases the computational rigor
significantly. The eigenproblem is usually reduced to a simpler form before starting
the iterative solution. The reduction is achieved in dense matrices by means of a
congruence transformation. The resulting eigenproblem amounts to solving for the
eigenvalues and eigenvectors of a symmetric tridiagonal matrix for symmetric
problems or an upper Hessenberg matrix when the original matrix is unsymmetric. A
tridiagonal matrix has nonzeros only in its diagonal and in its first upper and lower
codiagonals. An upper Hessenberg matrix has nonzeros only in its upper triangle,
diagonal, and first lower codiagonal. The reduction to tridiagonal or upper
Hessenberg form is achieved by a series of rotations, called Givens rotations, or
Householder reflections. For detailed information on these algorithms, the reader
is referred to [24]. Givens rotations are used when the matrix is sparse and banded
since it is possible to carry out these rotations such that no nonzeros are introduced
outside the band. Once the tridiagonal or the upper Hessenberg form has been
obtained, there exist powerful techniques like the QR and QL algorithms to
determine all the eigenvalues of the system. Such algorithms are readily available in
source code from the EISPACK library through netlib [4]. However, there are two
restrictions in the above approach: (1) the matrix needs to be banded and (2) the
entire spectrum of eigenvalues is computed. The necessity of a banded matrix is a
prohibitive requirement for finite element meshes with arbitrary sparsity patterns.
The computation of the entire spectrum can be avoided using the bisection method
based on Sturm sequences for calculating eigenvalues within a specified interval [24].

9.6.1 Direct and Inverse Iteration

For a truly sparse eigensolver, the original matrix A should be used in the
iterative process only as a matrix-vector product. This avoids the need for an explicit
inversion and resulting loss of sparsity. One of the earliest methods to accomplish
this is known as the power method, also known as the method of direct iteration. It is
best for computing a few select eigenvalues of large sparse matrices. A bonus is the
automatic availability of the desired eigenvector. The algorithm is simple and has
been given in Fig. 9.13 for solving (9.22). Thus, λ = (x, Ax)/(x, x) is an eigenvalue of
A and x is the corresponding eigenvector. The starting vector e_1 is a zero vector with
unit value at the first location. The more dominant the eigenvalue, the faster the
convergence to the desired eigenvalue. Interior eigenvalues can be found by using a
starting vector which is orthogonal to the existing eigenvectors. For example, if
λ_1, λ_2, ..., λ_n have been found along with the eigenvectors x^1, x^2, ..., x^n, then the
starting vector should be orthogonal to each of the preceding eigenvectors

$$q = p - (p^T x^1) x^1 - (p^T x^2) x^2 - \cdots - (p^T x^n) x^n$$

where p is the starting vector. Thus dominant eigenvalues with corresponding
eigenvectors can be obtained without modifying the nonzero structure of the matrix A.
However, the outlook is not so rosy in practice: round-off errors rapidly lead to loss
of orthogonality, and re-orthogonalization is necessary from time to time. The
criterion for re-orthogonalization is usually when |λ_1/λ| is larger than a prespecified
tolerance, where λ is the current estimate of the eigenvalue and λ_1 is the dominant
eigenmode. As we will see later, loss of orthogonality is the bane of sparse iterative
eigensolvers and cannot be avoided in finite precision arithmetic, leading to storage
and time constraints if interior eigenvalues are required in addition to the extremal
ones.

Initialization
    Choose any column vector, say x_0 = e_1

Repeat until x_{k+1} ≈ x_k
    x_{k+1} = A x_k / ||A x_k||

Figure 9.13 Power method.
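
A minimal Python sketch of the power method with deflation is given below for experimentation. It assumes a small real symmetric matrix stored densely whose wanted eigenvalues are positive (so the iterate does not change sign between steps), a starting vector e_1 that is not orthogonal to the wanted eigenvector, and re-orthogonalization against the known eigenvectors at every step rather than "from time to time". The function names are hypothetical.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def deflate(p, known):
    """q = p - sum_i (p . x_i) x_i for the known orthonormal eigenvectors x_i."""
    q = p[:]
    for xk in known:
        c = dot(q, xk)
        q = [qi - c * xi for qi, xi in zip(q, xk)]
    return q

def power_method(A, known=(), tol=1e-12, max_iter=5000):
    """Return (eigenvalue, eigenvector) for the dominant eigenpair in the
    complement of the span of the known eigenvectors."""
    n = len(A)
    x = deflate([1.0] + [0.0] * (n - 1), known)      # start from e_1
    norm = math.sqrt(dot(x, x))
    x = [xi / norm for xi in x]
    for _ in range(max_iter):
        y = deflate(matvec(A, x), known)             # re-orthogonalize every step
        norm = math.sqrt(dot(y, y))
        y = [yi / norm for yi in y]
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, x)))
        x = y
        if change < tol:
            break
    lam = dot(x, matvec(A, x)) / dot(x, x)           # Rayleigh quotient (x, Ax)/(x, x)
    return lam, x
```

Calling power_method(A) and then power_method(A, known=[x1]) with the eigenvector x1 returned by the first call yields the two largest eigenvalues in turn; only matrix-vector products with A are needed, so its sparsity is untouched.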

In some cases, eigenvalues immediately adjacent to a desired cutoff value are
sought. The determination of the dominant eigenmode of a two-dimensional
microstrip line is a case in point. The dominant mode is usually the one closest to the
maximum wavenumber supported by the dielectric medium. However, predicting an
eigenvalue close to the desired one is often not an easy task. Algorithms like the
determinant search method use preliminary guesses and the properties of Sturm
sequences to predict eigenvalues near the desired ones.
Once an educated guess can be made regarding the desired eigenvalue, the
eigenproblem becomes somewhat easier to solve. Shifts of origin can be carried
out to improve the performance of the algorithm. This works on the principle that
if λ_i, i = 1, ..., n, denotes the spectrum of A, then A - σI has eigenvalues λ_i - σ and
the same eigenvectors as A. In this way, interior eigenvalues can be calculated once
the neighborhood of the eigenvalue can be determined. The method converges
linearly with the factor λ_p/λ_q, where λ_p is the dominant eigenvalue and λ_q is the next
dominant one. Since the convergence rate depends heavily on the separation between
eigenvalues, closely spaced eigenvalues can cause the algorithm to stagnate.
The explanation as to why direct iterations converge can be found in [5].
Basically, if the eigenvectors are taken to form an orthonormal basis in n-dimensional
space, then pre-multiplication of an arbitrary vector u with A produces a tilt
toward the first dominant eigenvector by the factor λ_p/λ_q, where λ_p, λ_q are successive
eigenvalues. Successive premultiplication followed by normalization will converge to
the dominant eigenvector. Once an eigenvector is found, we choose another arbitrary
starting vector for computing the next dominant eigenvalue. The starting vector
must be orthogonal to the dominant eigenvector. This process is called deflation,
by which the iteration vector is restricted to the invariant subspace which is the
complement of the known eigenvectors.
The method of inverse iteration is used for solving the eigenvalues of the
inverse problem

$$A^{-1} x = \frac{1}{\lambda} x \qquad (9.24)$$

where A is symmetric and nonsingular. The method of inverse iteration with shifts
of origin can be efficiently applied to the computation of eigenvectors when a set of
eigenvalues is known from methods such as bisection.


Both methods, direct and inverse iteration, can be used for solving the generalized
eigenproblem. Thus, the power method with shift σ can be extended as given in
Fig. 9.14. The shift factor is usually taken close to the desired eigenvalue. The
method requires the solution of a linear system of equations at each step. It can
be accelerated by factorizing the matrix B using sparse techniques at the beginning of
the iterative procedure. In the inverse method with shifts of origin, the system to be
solved at every iteration is

$$B y_{k+1} = (A - \sigma B) x_k$$

Initialization
    Choose any column vector, say x_0 = e_1

Repeat
    Solve B y_{k+1} = (A - σB) x_k
    x_{k+1} = y_{k+1} / ||y_{k+1}||
    If x_{k+1} ≈ x_k then stop

Figure 9.14 Shifted power method for the generalized eigenproblem.
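
The idea of combining a shift of origin with the generalized eigenproblem can be illustrated with the Python sketch below. Note that it uses the standard shift-and-invert formulation, in which (A - σB) y = B x is solved at each step so that the eigenvalue nearest σ dominates; this is an assumption of the example and is not necessarily the exact form of Fig. 9.14. The matrices are small, real, symmetric, and dense, B is taken positive definite, a dense Gaussian elimination stands in for a sparse factorization, and all function names are hypothetical.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(M, rhs):
    """Dense Gaussian elimination with partial pivoting (stand-in for a sparse solver)."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def shifted_inverse_iteration(A, B, sigma, tol=1e-12, max_iter=500):
    """Return (lambda, x) for the eigenvalue of A x = lambda B x closest to sigma."""
    n = len(A)
    K = [[A[i][j] - sigma * B[i][j] for j in range(n)] for i in range(n)]
    x = [1.0] + [0.0] * (n - 1)                      # start from e_1
    lam = sigma
    for _ in range(max_iter):
        y = solve(K, matvec(B, x))                   # (A - sigma B) y = B x
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]
        lam_new = dot(x, matvec(A, x)) / dot(x, matvec(B, x))   # generalized Rayleigh quotient
        converged = abs(lam_new - lam) < tol * max(1.0, abs(lam_new))
        lam = lam_new
        if converged:
            break
    return lam, x
```

In a sparse setting the matrix A - σB would be factorized once before the loop and the factors reused for every right-hand side.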



9.6.2 Simultaneous Iteration

The method of simultaneous iteration (also known as subspace iteration) is a
generalization of the power method described in the previous section. This method
as well as the subsequent Lanczos algorithm rests on the concept of Rayleigh
matrices, Ritz values, and Ritz vectors. Let us consider Q = (q_1, q_2, ..., q_m) as an
orthonormal basis of a subspace S, invariant under A, arranged in the form of an
n x m matrix Q. Then C = Q^T A Q is a square symmetric Rayleigh matrix of order
m. It can be shown that the eigenvalues of C are also eigenvalues of A and that the
corresponding eigenvectors of A equal Qy, where y is the eigenvector of C. The
advantage is that a much smaller matrix C of order m ≪ n yields the desired extremal
eigenvalues of A. The eigenvalues of C are the best approximations to the eigenvalues
of A and are known as Ritz values and the corresponding eigenvectors are known as
Ritz vectors. Thus, if r_i = A x_i - μ_i x_i is the residual vector for the Ritz pair
(μ_i, x_i), then there is an eigenvalue of A in the interval [μ_i - ||r_i||, μ_i + ||r_i||].
In the method of simultaneous iteration, the orthonormal matrix Q is made up
of Ritz vectors at each step. The subspace S = span(Q) is not initially invariant
under A but becomes nearly so as the iteration progresses. The algorithm by
Reinsch [25] is given in Fig. 9.15. (For proof that the columns of V_m are Ritz vectors,
see [5].) Simultaneous iteration in conjunction with shifts can be a powerful tool for
solving large eigenvalue problems with minimal usage of computational resources.
Thus the k leftmost or k rightmost eigenvalues can be found along with their
corresponding eigenvectors, though the speed of convergence depends on the size of the
subspace.

Initialization
    Choose any orthonormal basis V_0 of dimension n x k, where k ≪ n
    m = 1

Repeat for m = 1, 2, ...
    C = A V_{m-1}
    Orthonormalize C such that C = QR, with Q unitary and R upper triangular
    Find eigenvalues of R R^T
        Spectral decomposition of R R^T = P D P^T,
        where D is diagonal with D_ii = μ_i and P is unitary
    V_m = Q P
    If residual vector < tolerance, convergence achieved
End Repeat

Figure 9.15 Simultaneous iteration for the standard eigenproblem.
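
A stripped-down variant of simultaneous iteration is sketched below in Python. It is deliberately simpler than the algorithm of Fig. 9.15: the spectral decomposition of R R^T is omitted, and the basis is just multiplied by A and re-orthonormalized by Gram-Schmidt at each step (often called orthogonal iteration). It assumes a small real symmetric dense matrix whose k dominant eigenvalues are positive and distinct in magnitude, and a starting basis e_1, ..., e_k that is not deficient in the wanted eigenvectors; the names are hypothetical.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(cols):
    """Modified Gram-Schmidt on a list of column vectors (the Q of C = QR)."""
    q = []
    for v in cols:
        w = v[:]
        for u in q:
            c = dot(w, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        nrm = math.sqrt(dot(w, w))
        q.append([wi / nrm for wi in w])
    return q

def simultaneous_iteration(A, k, iters=200):
    """Return (ritz_values, Q) approximating the k dominant eigenpairs of A."""
    n = len(A)
    Q = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(k)]   # e_1, ..., e_k
    for _ in range(iters):
        C = [matvec(A, q) for q in Q]     # C = A Q, one product per basis vector
        Q = orthonormalize(C)
    ritz = [dot(q, matvec(A, q)) for q in Q]   # diagonal of the Rayleigh matrix Q^T A Q
    return ritz, Q
```

The first column behaves exactly like the power method, while the orthonormalization keeps the remaining columns from collapsing onto it, which plays the role of deflation for all k vectors at once.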

In finite elements, where the generalized eigenproblem is to be solved in most
cases, simultaneous iteration provides a powerful tool for extracting the extremal
eigenvalues. If an educated guess can be made regarding the eigenvalue, the inverse
iteration with shift is used for finding the eigenvalues of large, sparse systems. As
shown earlier, the shifted generalized eigenproblem is defined as

$$(A - \sigma B) X = \Lambda B X \qquad (9.25)$$

where σ is the shift and Λ is the vector of eigenvalues. The algorithm is given in Fig.
9.16 and is taken from [5]. This algorithm works even if the matrix B in the A, B
pencil is symmetric indefinite. Note that the sparse system A - σB needs to be solved
at each step for multiple right-hand sides. It is convenient to do this using one of the
sparse direct solving strategies outlined in the earlier section. The generalized
eigenproblem solved within each step is small, since the order of the H_A, H_B pencil
is much smaller than n, the order of the A, B pencil.

Initialization
    Choose any V_0 of dimension n x k, where k ≪ n. Λ_0 is also available.
    m = 1

Repeat for m = 1, 2, ...
    Solve (A - σB) C = R for C
    Construct two Rayleigh matrices:
        H_A = C^T R
        H_B = C^T B C
    Solve the generalized eigenproblem:
        (H_A - σ H_B) P = Λ_m H_B P,
        where P is H_B-orthogonal
    V_m = C P
    If residual vector < tolerance, convergence achieved
End Repeat

Figure 9.16 Simultaneous iteration with shift for the generalized eigenproblem.

One of the problems with this method is that k, the size of the desired subspace,
is not known a priori. However, k can be modified within the iteration process by
adding new columns to the basis or by deflating the basis from the converged
eigenvectors.

9.6.3 Lanczos Algorithm

The Lanczos algorithm results when the initial guess for the orthonormal basis
is drawn from the Krylov subspace. If b is an arbitrary nonzero vector, a
Krylov subspace is defined as

$$K^m = \mathrm{span}(b, Ab, \ldots, A^{m-1} b) \qquad (9.26)$$

It can be reasoned from the convergence of the power method that K^m will be nearly
invariant under A when m is sufficiently large. Choosing the vectors from a Krylov
subspace lends the Lanczos algorithm some remarkable properties:

• The Rayleigh matrix C is tridiagonal, which simplifies the computation of
  Ritz pairs.
• The computation of the orthonormal Lanczos basis can be done through a
  three-term recurrence relation.
• Convergence to the eigenvalues is very rapid.

However, the chief flaw in the algorithm lies in the loss of orthogonality of the
Lanczos bases due to round-off errors. Selective, and sometimes complete,
re-orthogonalization is needed to correct this problem. However, this method can be
a bit expensive when interior eigenvalues are required. For extremal eigenvalues, the
Lanczos method is one of the most efficient for large sparse matrices as it has far

superior convergence properties compared with the power method. Lanczos converges
approximately as (2κ)^{2(n-1)}, whereas the power method converges as κ^{2(n-1)},
where n is the iteration number and κ = (λ_1/λ_2) < 1. The convergence rate is
especially critical if the separation between the two adjacent eigenvalues is small.
The basic algorithm without re-orthogonalization is given in Fig. 9.17. For a
definitive account of the Lanczos algorithm and its features, the reader is referred
to [26]. The source code is also available via netlib [4].
Initialization
    Choose any b of dimension n.
    Set q_0 = 0; r_0 = b; β_0 = ||b||_2
    m = 1

Repeat for m = 1, 2, ...
    q_m = r_{m-1} / β_{m-1}
    r_m = A q_m - β_{m-1} q_{m-1}
    α_m = q_m · r_m
    r_m = r_m - α_m q_m;  β_m = ||r_m||_2
    Solve the eigenproblem:
        T_m h_j = θ_j h_j, where T_m is tridiagonal
    If |β_m h_mj| < tolerance, convergence achieved
End Repeat

Figure 9.17 Lanczos algorithm for the standard eigenproblem.
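
The recurrence of Fig. 9.17 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the matrix is a small dense real symmetric list of rows, no re-orthogonalization is performed, and the extremal Ritz values are obtained by Sturm-sequence bisection on T_m, which is a substitute chosen here for brevity rather than the eigensolver used in the text. The function names lanczos, count_below, and extremal_ritz_values are hypothetical.

```python
import math

def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def lanczos(A, b, m):
    """Run up to m Lanczos steps; return the diagonal and off-diagonal of T_m."""
    q_prev = [0.0] * len(b)
    r = list(b)
    beta = math.sqrt(sum(x * x for x in r))        # beta_0 = ||b||_2
    alphas, betas = [], []
    for _ in range(m):
        if beta < 1e-12:                           # Krylov subspace is exhausted
            break
        q = [x / beta for x in r]                  # q_m = r_{m-1} / beta_{m-1}
        Aq = matvec(A, q)
        r = [aq - beta * qp for aq, qp in zip(Aq, q_prev)]
        alpha = sum(x * y for x, y in zip(q, r))   # alpha_m = q_m . r_m
        r = [x - alpha * y for x, y in zip(r, q)]
        beta = math.sqrt(sum(x * x for x in r))    # beta_m = ||r_m||_2
        alphas.append(alpha)
        betas.append(beta)
        q_prev = q
    return alphas, betas[:len(alphas) - 1]

def count_below(alphas, betas, x):
    """Number of eigenvalues of the tridiagonal T smaller than x (Sturm count)."""
    count, d = 0, 1.0
    for i, a in enumerate(alphas):
        d = a - x - (betas[i - 1] ** 2 / d if i > 0 else 0.0)
        if d == 0.0:
            d = 1e-30                              # avoid division by zero
        if d < 0.0:
            count += 1
    return count

def extremal_ritz_values(alphas, betas, tol=1e-10):
    """Smallest and largest eigenvalue of T by bisection inside Gershgorin bounds."""
    m = len(alphas)
    off = [0.0] + [abs(b) for b in betas] + [0.0]
    lo = min(a - off[i] - off[i + 1] for i, a in enumerate(alphas))
    hi = max(a + off[i] + off[i + 1] for i, a in enumerate(alphas))

    def kth(k):
        left, right = lo, hi
        while right - left > tol:
            mid = 0.5 * (left + right)
            if count_below(alphas, betas, mid) >= k:
                right = mid
            else:
                left = mid
        return 0.5 * (left + right)

    return kth(1), kth(m)
```

For a matrix of order n with n distinct eigenvalues and a starting vector that has a component along every eigenvector, running m = n steps reproduces the extremal eigenvalues up to round-off; for m < n the two returned values are the extremal Ritz values.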

The generalized eigenproblem gets considerably tougher to solve if the matrix B
in the A, B pencil is not positive definite or is unsymmetric. If A is symmetric and B
is symmetric indefinite, its Cholesky factorization does not exist and consequently,
the product B^{-1}A will usually be unsymmetric. The QZ algorithm is the method of
choice for solving full unsymmetric generalized eigenproblems. The Lanczos
tridiagonalization is particularly effective for solving generalized eigenproblems with
symmetric, indefinite matrices since the inversion and hence pre-multiplication
is not carried out explicitly. A procedure for solving the generalized symmetric
eigenproblem where A, B are both indefinite is given in [27].
In conclusion, it should be remarked that the speed of convergence of the sparse
eigensolution techniques is quite amazing compared to full matrix methods. This is
only to be expected, but a 500-fold speedup for the extremal eigenvectors of a
generalized eigenvalue problem of order 800 caught the author by surprise. When this
fact is coupled with the meager storage requirements, usage of sparse methods for
eigenvalue problems is clearly a win-win situation. The trade-off comes in the
increased programming labor and the inability to use public domain black-box routines.

9.7 PARALLELIZATION

As mentioned earlier, there are two problems which limit the vectorizability of a
sparse matrix code: short vector lengths and indirect addressing. There is not much
to be done about the second problem since sparse matrices must have indirect
addressing to exploit the O(N) storage feature. However, the first problem can be
removed by storing the matrix in an optimizable machine-dependent format. The
jagged diagonal method of matrix storage is a case in point. The still slower
execution speed of the matrix-vector multiply compared with the vector update is due to
the indirect addressing in the inner loop, which causes memory contention. On a
distributed memory architecture, the second problem can also be partly removed by
keeping local copies of the desired vector in each processor. The subsequent gather
and scatter operations of the updated vectors then consume the majority of the
processor communication time.
Below we discuss the implementation of a finite element code on two different
types of massively parallel architectures:² the KSR1 and the Intel iPSC/860. The
KSR1 is a parallel machine which implements a shared virtual memory, although the
memory is physically distributed for the sake of scalability. The Intel iPSC/860, on
the other hand, is a distributed memory, Multiple Instruction, Multiple Data
(MIMD) system in which the nodes process information independently of one
another and communicate by passing messages to each other. The conversion of
sequential or vectorized code to parallel code involves two primary tasks:

• Parallelization of DO loops
  Parallelism is introduced by allowing each processor to execute a portion of
  the DO loop.
• Distribution of arrays among the processor set
  Since each processor only has a limited amount of memory, each array is
  divided into smaller units that reside on each node. This also allows array
  accesses from each processor to be serviced by different nodes, thus reducing
  contention for resources on any single node.

On a cache-only memory machine such as the KSR1, only the first step is necessary
since the hardware cache system automatically takes care of data distribution among
the processors. This makes porting codes to the KSR1 quite easy. However, the
increased control of data distribution and communication on the iPSC/860 can
translate into improved performance for some applications. The data distribution on
message-passing architectures is controlled by a preprocessing step in which the
computation domain is partitioned efficiently among the processors. This efficient
partitioning step is more commonly referred to as Domain Decomposition, which
aims to minimize interprocessor communication and to achieve load balancing. A
detailed discussion of domain decomposition is outside the scope of this book; for
further information, the reader is referred to an excellent review paper by Hamandi
et al. [28].

² The KSR1 is no longer available, and the Intel iPSC/860 is being phased out.
Nevertheless, these parallel platforms represent examples of distributed memory
architectures.

1. KSR1 Port

The most important aspect of parallelization involves optimizing the iterative solver.
For the sake of simplicity, we consider the complex symmetric BiCG solver.
Figure 9.18 shows the symmetric BiCG algorithm; the unsymmetric method
given in Fig. 9.4 contains an additional matrix-vector multiply and a few additional
vector updates. For a system of equations containing N unknowns, all vectors in the
algorithm are of size N and the sparse matrix A is of order N. Table 9.3 shows the
operation counts per iteration for each type of vector operation, where nze denotes
the number of nonzero elements in the sparse matrix. In the finite element code, each
vector operation is implemented as a loop and parallelization is achieved by tiling
these loops. For P processors, the vectors are divided into P sections of N/P
consecutive elements. Each processor is assigned the same section of each vector. This
partitioning attempts to reduce communication while balancing load. To guarantee
correctness, synchronization points are added after lines 2, 7, and 9. Lines 2 and 7
require synchronization to guarantee that the dot products are computed correctly.

Initialization:
    x given
    r = b - Ax;  p = r;  tmp = r · r

Repeat until (resd < tol)
    q = A p                   (1)    Step 1
    α = tmp / (q · p)         (2)    Step 2
    x = x + α p               (3)
    r = r - α q               (4)
    q = C^{-1} r              (5)
    resd = sqrt(r · r*)       (6)
    β = (r · q) / tmp         (7)    Step 3
    tmp = β × tmp             (8)
    p = q + β p               (9)
End Repeat

A is a sparse complex symmetric matrix.
C is the preconditioning matrix.
q, p, x, r are complex vectors.
α, β, tmp are complex scalars; resd, tol are real scalars.

Figure 9.18 Symmetric biconjugate gradient method with preconditioning.
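
The loop of Fig. 9.18 translates almost line for line into code. The sketch below is a minimal illustration, not the authors' implementation: it assumes a small dense complex symmetric matrix stored as a list of rows, takes the preconditioner C to be the diagonal of A (a Jacobi preconditioner, chosen here only for simplicity), starts from x = 0, and initializes p and tmp with the preconditioned residual so that the first pass is consistent with lines (7) to (9). The name sym_bicg is hypothetical. Note that the dot products in lines (2) and (7) are unconjugated.

```python
import math

def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    """Unconjugated (bilinear) product used by the symmetric BiCG recurrences."""
    return sum(a * b for a, b in zip(u, v))

def sym_bicg(A, b, tol=1e-12, max_iter=100):
    """Preconditioned symmetric BiCG for a complex symmetric A, starting at x = 0."""
    n = len(b)
    diag = [A[i][i] for i in range(n)]            # assumed preconditioner C = diag(A)
    x = [0j] * n
    r = [complex(v) for v in b]                   # r = b - A x with x = 0
    q = [ri / di for ri, di in zip(r, diag)]
    p = list(q)
    tmp = dot(r, q)
    for it in range(1, max_iter + 1):
        q = matvec(A, p)                                      # (1)
        alpha = tmp / dot(q, p)                               # (2)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]         # (3)
        r = [ri - alpha * qi for ri, qi in zip(r, q)]         # (4)
        q = [ri / di for ri, di in zip(r, diag)]              # (5)
        resd = math.sqrt(sum(abs(ri) ** 2 for ri in r))       # (6)
        if resd < tol:
            return x, it
        beta = dot(r, q) / tmp                                # (7)
        tmp = beta * tmp                                      # (8)
        p = [qi + beta * pi for qi, pi in zip(q, p)]          # (9)
    return x, max_iter
```

Only line (1) touches the matrix; every other line is a vector update or a dot product, which is why the text treats the matrix-vector multiply as the main target of the parallelization effort.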

TABLE 9.3 Floating Point Operations per Iteration

                          Complex                  Real
Operation              ×          +             ×          +

Matrix multiply        nze        nze - N       4nze       4nze - 2N
Vector updates         4N         3N            16N        12N
Dot products           3N         3N            12N        12N

Note that the dot products in lines 6 and 7 require only one synchronization. The

line 9 synchronization guarantees that p is completely updated before the matrix

multiply for the next iteration begins.
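
The tiling of the vector loops into P sections of about N/P consecutive elements, and the role of the synchronization point in a dot product, can be illustrated as follows. This is a serial sketch that only mimics the data layout; the names block_ranges and tiled_dot are hypothetical, and the list of partial sums stands in for the per-processor results that a real parallel code would combine after synchronizing.

```python
def block_ranges(n, p):
    """Split range(n) into p consecutive sections whose sizes differ by at most one."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for rank in range(p):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def tiled_dot(u, v, p):
    """Dot product computed as p partial sums, one per (simulated) processor."""
    partial = [sum(u[i] * v[i] for i in range(lo, hi))
               for lo, hi in block_ranges(len(u), p)]
    # Synchronization point: every partial sum must be complete before the
    # reduction below, otherwise the scalar used in the next line is wrong.
    return sum(partial)
```

Because every vector uses the same section boundaries, the vector updates in lines (3), (4), and (9) touch only locally owned elements; only the reductions and the matrix-vector multiply need data from other sections.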


In the sparse matrix vector multiplication, each processor computes a block of
the result vector by multiplying the corresponding block of rows of the sparse matrix
with the operand vector. Since the operand vector is distributed among the
processors, data communication is required. The communication pattern is determined by
the sparsity structure of the matrix, which is derived from the unstructured mesh.
Therefore, the communication pattern is unstructured and irregular. Vector updates
and dot products are easily parallelized using the same block distribution as in the
sparse matrix vector multiply.
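
A row-block sparse matrix-vector multiply can be sketched as below, assuming the matrix is held in compressed sparse row form with arrays named indptr, indices, and data (a common naming convention adopted here, not taken from the text). The helper remote_columns is hypothetical; it lists the operand entries a processor must obtain from others, which is exactly the information that fixes the communication pattern.

```python
def csr_block_matvec(indptr, indices, data, x, row_lo, row_hi):
    """Compute rows row_lo..row_hi-1 of y = A x for a CSR matrix."""
    return [sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
            for i in range(row_lo, row_hi)]

def remote_columns(indptr, indices, row_lo, row_hi):
    """Operand indices referenced by the row block but owned by another processor."""
    cols = {indices[k] for i in range(row_lo, row_hi)
            for k in range(indptr[i], indptr[i + 1])}
    return sorted(c for c in cols if not row_lo <= c < row_hi)

# 4 x 4 tridiagonal example, rows split between two processors.
indptr = [0, 2, 5, 8, 10]
indices = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
data = [2, 1, 1, 2, 1, 1, 2, 1, 1, 2]
x = [1, 2, 3, 4]
y = (csr_block_matvec(indptr, indices, data, x, 0, 2)
     + csr_block_matvec(indptr, indices, data, x, 2, 4))
```

The indirect access x[indices[k]] in the inner loop is the source of both the vectorization difficulty and the irregular communication discussed in the text.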


Sparse computations are known to be hard to implement efficiently on a
distributed memory machine, mainly because of the unstructured and irregular
communication pattern. However, the previous scheme was easily and efficiently
implemented on the KSR1 Massively Parallel Processing machine thanks to the global
address space [29]. Table 9.4 shows the execution time of one iteration (in seconds)
and the speedup for different numbers of processors and for two problem sizes.

TABLE 9.4 Execution Time and Speedup for the Iterative Solver

               N = 20,033                      N = 224,476

           Execution Time                  Execution Time
Procs      (secs per iter)    Speedup      (secs per iter)    Speedup

1 (a)          .515             1              10.8             1
8              .071             7.3            1.4              7.7
16             .040             12.9           .671             16.1
29             .027             19.1           .304             35.6
60 (b)                                         .149             76.2

(a) For 1, 8, and 16 processors, only the first 100 iterations were run.
(b) Code run on a 64-node KSR at Cornell University.

For both problems, the performance scales surprisingly well up to a large
number of processors. For the 20,033-unknown problem, the speedup for the
parallelized sparse solver varies from 1 to 38 as the number of processors is increased
from 1 to 56 (Fig. 9.19). The overall performance of the solver on 28 processors is
more than three times that of a single processor on the Cray-YMP. The large
problem (224,476 unknowns) exhibits superlinear speedup which can be attributed
to a memory effect. As a matter of fact, the large data set does not entirely fit in the
local cache of a single node in the KSR, which results in a large number of page
faults. However, as the number of processors increases, the large data set is more
evenly distributed over the different processors' memories.
Figure 9.19 Speedup curve for the linear equation solver on the KSR1 (speedup versus
number of processors; curves: Linear, Measured (solver), Measured (matgen)).

The global matrix assembly is the second largest computation in terms of
execution time. Typically, the elemental matrices are computed for each element
in the 3D mesh and assembled in a global sparse matrix. A natural way of
parallelizing the global matrix assembly is to distribute the elements over the
processors, have each processor compute the elemental matrix of the elements it owns,
and update the global sparse matrix. Since the global sparse matrix is shared by all
processors, the update needs to be done atomically. On the KSR1 this can be done by
using the hardware lock mechanism.
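
The idea of protecting each shared update during assembly can be imitated on a shared-memory system with one software lock per matrix row. The following is a sketch under several assumptions: Python threads stand in for processors, the global matrix is a list of per-row dictionaries, and every element is a two-node element with the fixed matrix [[1, -1], [-1, 1]], which is purely a placeholder for the real elemental computation. The name assemble_with_row_locks is hypothetical, and the code says nothing about how the KSR1 hardware lock is actually invoked.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def assemble_with_row_locks(elements, n_nodes, n_workers=4):
    """Assemble two-node element matrices into a row-wise sparse structure."""
    rows = [dict() for _ in range(n_nodes)]            # rows[i][j] holds entry (i, j)
    locks = [threading.Lock() for _ in range(n_nodes)]

    def add_element(nodes):
        ke = [[1.0, -1.0], [-1.0, 1.0]]                # placeholder element matrix
        for a, i in enumerate(nodes):
            with locks[i]:                             # row i is locked during its update
                for b, j in enumerate(nodes):
                    rows[i][j] = rows[i].get(j, 0.0) + ke[a][b]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(add_element, elements))          # elements are spread over workers
    return rows

rows = assemble_with_row_locks([(i, i + 1) for i in range(4)], 5)
```

Locking at row granularity keeps the critical section short, so workers only wait on one another when their elements share a node.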
The performance for the matrix assembly is given in Table 9.5 and also in Fig.
9.19.

TABLE 9.5 Execution Time and Speedup for the Matrix Generation
and Assembly (20,033 unknowns)

Procs      Execution time in seconds      Speedup

1                  24.355                   1
2                  13.376                   1.8
4                  6.811                    3.6
8                  3.744                    6.5
16                 1.89                     12.9
25                 1.625                    15.0
28                 1.276                    19.1

9.7.1 Analysis of Communication

In the main loop (Fig. 9.18), significant communication between processors
takes place only during the sparse matrix vector multiply (line 1) and the vector
update of p (line 9). The rest of the vector operations incur little or no
communication at all. The distribution of the nonzero entries in the matrix affects
the amount and nature of communication. In this section, an analysis is presented of
the communication pattern incurred by the sparse matrix vector multiplication as
derived from analysis of the sparsity structure of the matrix.

Line 1. In the matrix-vector multiply, each processor computes an N/P-sized
subsection of the product q. The processor needs the elements of p that correspond
to the nonzero elements found in the N/P rows of A that are aligned with its
subsection. Because the matrix A remains constant throughout the program, the
set of elements of p that a given processor needs is the same for all iterations in
the loop. However, since p is updated at the end of each iteration, all copies of its
element set are invalidated in each processor's local cache except for the ones that the
processor itself updates. As a result, in each iteration, processors must obtain
updated copies of the required elements of p that they do not own.

These elements can be updated by a read miss to the corresponding subpage,
by an automatic update, or by an explicit prefetch or poststore instruction. Figure
9.20 lists the number of subpages that each of the 28 processors needs to acquire
from other processors. Automatic update of an invalid copy of a subpage becomes
more likely as the number of processors sharing this subpage grows. The number of
processors that need a given subpage (excluding the processor that updates the
subpage) is referred to as the degree of sharing of that subpage. Figure 9.21 shows
the degree of sharing histogram for the example problem. Since the only subpage
misses occurring in Step 1 of the sparse solver are coherence misses due to the vector
p, the use of the poststore instruction to broadcast the updated sections of the vector
p from Step 3 should eliminate the subpage misses in Step 1. However, the overhead
of executing the poststore instruction in Step 3 offsets the reduction in execution time
of Step 1. On a poststore, the processor typically stalls for 32 cycles while the local
cache is busy for 48 cycles. As a result, the net reduction in execution time is only 3
percent.

Figure 9.20 Counts of p subpages required by each processor during sparse
matrix-vector multiply (total copies = 5968). Horizontal axis: Thread ID.

Figure 9.21 Degree of sharing histogram of p subpages during sparse matrix-vector
multiply (28 processors). Horizontal axis: Degree of sharing.
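
The quantities plotted in Figs. 9.20 and 9.21 can be derived directly from the sparsity pattern. The sketch below assumes the pattern is given as a list of column indices per row, that rows and vector elements are split into equal consecutive blocks, and that a subpage holds eight consecutive vector elements (the figure quoted in the discussion of line 9). The name subpage_demand is hypothetical. A subpage that straddles an ownership boundary is counted as needed whenever it contains at least one remote element, which is a simplification of the real coherence behavior.

```python
from collections import Counter

def subpage_demand(row_cols, n_procs, per_subpage=8):
    """Return (subpages needed per processor, degree-of-sharing histogram)."""
    n = len(row_cols)
    block = -(-n // n_procs)                       # ceil(n / n_procs) rows each
    needs = []
    for rank in range(n_procs):
        lo, hi = rank * block, min((rank + 1) * block, n)
        remote = {c for i in range(lo, hi) for c in row_cols[i]
                  if not lo <= c < hi}             # elements of p owned elsewhere
        needs.append({c // per_subpage for c in remote})
    degree = Counter(s for pages in needs for s in pages)
    histogram = Counter(degree.values())           # degree -> number of subpages
    return [len(p) for p in needs], dict(histogram)

# Tridiagonal pattern of order 32 distributed over 4 processors.
n = 32
pattern = [[j for j in (i - 1, i, i + 1) if 0 <= j < n] for i in range(n)]
counts, histogram = subpage_demand(pattern, 4)
```

For an unstructured finite element mesh the per-processor counts and the histogram depend strongly on the node numbering, which is why reordering and partitioning affect communication cost.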

Line 9. Before proceeding with the updates of the N/P elements of p for
which it is responsible, each processor must acquire exclusive ownership for those
elements. Because a cache line holds eight consecutive elements, each processor will
generate N/8P requests for ownership (assuming all subpages are shared). In order
to hide access latencies, the request for ownership can be issued in the form of a
prefetch instruction after Step 1. This could lead to an eightfold decrease in the
number of subpage misses. However, as with the poststore instruction, the benefit
of prefetching is offset by the overhead of processing the prefetch instructions in Step
2. This is because the processor stalls for at least two cycles on a prefetch and the
local cache cannot satisfy any processor request until the prefetch is put on the ring.
The overall execution time is reduced by only 4 percent in this case.
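
As a quick numerical illustration of the N/8P estimate, the helper below (a hypothetical function, with sizes rounded up so that partially filled sections and subpages are counted) evaluates it for the smaller test problem on 28 processors.

```python
import math

def ownership_requests(n, n_procs, per_subpage=8):
    """Elements of p owned per processor and the ownership requests they imply."""
    owned = math.ceil(n / n_procs)
    return owned, math.ceil(owned / per_subpage)

owned, requests = ownership_requests(20033, 28)
```

Under these rounding assumptions each processor owns 716 elements and issues 90 ownership requests, compared with one access per element if requests were made element by element.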

Lines 2, 6, 7. The rest of the communication is due to the three dot products.
Each processor computes the dot product for the vector subsection that it owns.
These are then gathered and summed up on a single processor.

2. Intel iPSC/860 Port

The parallelization of the DO loops is one of the main tasks since the majority of the
computer time is spent on solving the linear system of equations. The basic strategy
for parallelizing the DO loops on the iPSC/860 is similar to the KSR1 with each
processor executing a portion of the DO loop. This scheme works fine as long as
there are no dependencies in the body of the loop, as is the case for the vector
updates and the sparse matrix-vector multiply of the linear solver. However, the
main loop in the matrix generation/assembly phase contains a dependency between
loop iterations. As on the KSR1, this problem is solved by using a mechanism in
which each processor locks a row of the matrix while performing an update. Since
the locking of each row is maintained by the processor whose memory holds the
particular row, processors lock and unlock rows by sending messages to the
appropriate row owner.
Even though the parallelization of loops enables programs to run faster on
multiprocessors, the distribution of arrays must be done for the code to run at all.
Arrays are distributed in the code by partitioning data along one array dimension
among the processors. Thus for a 1000-element array, processor 1 holds the first 100
elements, processor 2 the next 100 and so on. The straightforward method for
accessing this distributed array involves the translation of array references into
subroutine calls. Thus an expression x = a(i) is translated into the statement
call fetcha(i, x). The subroutine fetcha then sends a message to the processor that
holds element a(i), which in turn sends a reply message with the value of a(i).
Although this scheme requires the implementation of a new subroutine for each
distributed array and the replacement of each array access with a subroutine call,
the process is easy and mechanical.
The scheme mentioned above does not, however, result in good performance.
The primary reason for this is that the fixed overhead for sending a message is much
higher than the incremental cost of sending a single byte. The cost for sending ten or
even 100 bytes is usually not much higher than that of sending 1 byte. Thus, messages
need to be "bundled" for fast and efficient operation. However, the simple strategy
mentioned above is in direct contrast to message bundling. One way of overcoming
this conflict is to implement the simple scheme for parts of the code that do not take
up a significant portion of computation time, like the matrix generation/assembly
phase, and a better scheme for accessing the distributed arrays in the equation solver
phase.
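
The difference between element-by-element fetching and bundled access can be made concrete by counting messages in a small simulation. The class below is a hypothetical model, not the iPSC/860 programming interface: the array is split into equal consecutive blocks, owners are numbered from 0 rather than from 1 as in the text, and each remote access is charged one request plus one reply.

```python
class DistributedArray:
    """Block-distributed array that counts the messages needed to read it."""

    def __init__(self, values, n_procs):
        self.block = -(-len(values) // n_procs)        # elements per processor
        self.parts = [values[k:k + self.block]
                      for k in range(0, len(values), self.block)]
        self.messages = 0

    def owner(self, i):
        return i // self.block

    def fetch(self, i):
        """Simple scheme: one request and one reply per element."""
        self.messages += 2
        return self.parts[self.owner(i)][i % self.block]

    def fetch_many(self, idx):
        """Bundled scheme: one request and one reply per owner involved."""
        by_owner = {}
        for i in idx:
            by_owner.setdefault(self.owner(i), []).append(i)
        self.messages += 2 * len(by_owner)
        return [self.parts[self.owner(i)][i % self.block] for i in idx]

a = DistributedArray(list(range(1000)), 10)
single = [a.fetch(i) for i in (5, 6, 7, 150, 151)]
simple_cost = a.messages
bundled = a.fetch_many([5, 6, 7, 150, 151])
bundled_cost = a.messages - simple_cost
```

In this example the five element-wise fetches cost ten messages, while the bundled access reaches the same five values with four, since only two owners are involved.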
The primary operation in the solver that generates communication is the sparse
matrix-vector product. Since the matrix-vector product involves performing a dot
product of each row with the distributed vector, each processor must obtain the
values for the entire vector from the other processors. The dot product operation
must be carried out in several phases as each processor may not be able to hold the
entire vector in memory. Thus, each processor P begins the matrix-vector multiply
by sending its portion of the vector to other processors, then performs the following
tasks for every other processor P':

• Reads the portion of the vector owned by P'.
• Updates the partial dot product for each row by adding the product of the
  appropriate matrix element with the elements of the partial vector.

After performing the above operations for all the processors, the dot product is
complete. Unfortunately, each phase requires a pass over all the sparse matrix
rows owned by the processor. For better parallel performance, each row of the
matrix must be sorted to allow the phases to pass over the rows in order. It was
found that the problem scaled reasonably well for a small number of processors.
However, as the number of processors increased, more of the time was spent on
communication and book-keeping than on true computation.
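
The phased multiply described above can be sketched as follows. This is a serial illustration with hypothetical names: rows are stored as lists of (column, value) pairs sorted by column, x_parts[p] is the vector portion owned by processor p, offsets[p] is the global index of its first element, and the portions are visited in increasing owner order. A per-row cursor records how far each sorted row has been consumed, so every phase resumes where the previous one stopped instead of rescanning the row.

```python
def phased_matvec(rows, x_parts, offsets, my_rows):
    """Accumulate y = A x for the rows in my_rows, one vector portion at a time."""
    y = [0.0] * len(my_rows)
    cursor = [0] * len(my_rows)                   # next unread entry of each row
    for p, part in enumerate(x_parts):            # one phase per owning processor
        lo, hi = offsets[p], offsets[p] + len(part)
        for k, i in enumerate(my_rows):
            row, c = rows[i], cursor[k]
            while c < len(row) and row[c][0] < hi:
                col, val = row[c]
                y[k] += val * part[col - lo]      # entry lies in the current portion
                c += 1
            cursor[k] = c
    return y

# 4 x 4 tridiagonal example with the vector split into two portions.
rows = [[(0, 2.0), (1, 1.0)],
        [(0, 1.0), (1, 2.0), (2, 1.0)],
        [(1, 1.0), (2, 2.0), (3, 1.0)],
        [(2, 1.0), (3, 2.0)]]
x_parts = [[1.0, 2.0], [3.0, 4.0]]
y_top = phased_matvec(rows, x_parts, [0, 2], [0, 1])
y_bottom = phased_matvec(rows, x_parts, [0, 2], [2, 3])
```

Even with sorted rows, every phase still visits each locally owned row once, so the bookkeeping cost grows with the number of processors, in line with the scaling behavior reported in the text.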

REFERENCES

[1] D. R. Kincaid and T. C. Oppe. ITPACK on supercomputers. Numerical
Methods: Lecture Notes in Mathematics, 1005:151-161, 1982.
[2] G. V. Paolini and G. Radicati di Brozolo. Data structures to vectorize CG
algorithms for general sparsity patterns. BIT, 29:703-718, 1989.
[3] E. Anderson and Y. Saad. Solving sparse triangular linear systems on parallel
computers. International Journal of High Speed Computing, 1(1):73-95, 1989.
[4] netlib. Available through the World Wide Web at http://www.netlib.org.
[5] S. Pissanetzky. Sparse Matrix Techniques. Academic Press, New York, 1984.
[6] Z. Zlatev. On some pivotal strategies in Gaussian elimination by sparse
technique. SIAM J. Numer. Anal., 17:18-30, 1980.
[7] E. Cuthill and J. McKee. Reducing the bandwidth of sparse symmetric
matrices. In Proceedings of the 24th National Conference of the ACM.
Brandon Systems Press, NJ, 1969.
[8] I. P. King. An automatic reordering scheme for simultaneous equations derived
from network systems. Int. J. Numer. Meth. Eng., 2:523-533, 1970.
[9] A. George and J. W. H. Liu. Computer Solution of Large Sparse Positive
Definite Systems. Prentice-Hall, Englewood Cliffs, NJ, 1981.
[10] M. R. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear
systems. J. Res. Natl. Bur. Stand., 49:409-436, 1952.
[11] R. Freund. Conjugate-gradient type methods for linear systems with complex
symmetric coefficient matrices. SIAM J. Sci. Stat. Comput., 13:425-448, 1992.
[12] O. Axelsson. Solution of Linear Systems of Equations: Iterative Methods.
Number 572 in Lecture Notes in Mathematics. Springer-Verlag, Germany, 1977.
[13] H. A. Van der Vorst. Preconditioning by Incomplete Decompositions. PhD thesis,
University of Utrecht, Holland, 1982. ACCU-series 32.
[14] P. Sonneveld. CGS, a fast solver for nonsymmetric linear systems. SIAM J. Sci.
Stat. Comput., 10:35-52, 1989.
[15] H. A. Van der Vorst. Bi-CGSTAB: A fast and smoothly converging variant of
Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. Stat.
Comput., 13:631-644, 1992.
[16] R. W. Freund, G. H. Golub, and N. M. Nachtigal. Iterative solution of linear
systems. Acta Numerica, pp. 57-100, 1991.
[17] Y. Saad. Iterative Methods for Sparse Linear Systems. PWS Pub. Co., Boston,
1996.
[18] R. Barrett et al. Templates for the Solution of Linear Systems: Building Blocks for
Iterative Solvers. SIAM, 1994.
[19] H. P. Langtangen. Conjugate gradient methods and ILU preconditioning of
nonsymmetric matrix systems with arbitrary sparsity patterns. Int. J. Numer.
Meth. Fluids, 9:213-233, 1989.
[20] Y. Saad. ILUT: A dual threshold incomplete LU factorization. Technical
Report UMSI 92/38, University of Minnesota Supercomputer Institute,
Minneapolis, MN, March 1992.
[21] J. R. Lovell. Hierarchical basis functions for 3D finite element methods. ACES
Digest, pp. 657-663, 1993.
[22] A. Chatterjee, J. L. Volakis, and D. Windheiser. Parallel computation of 3D
electromagnetic scattering using finite elements. Int. J. Num. Modeling, 7:329-342,
1994.
[23] E. Rothberg and A. Gupta. Parallel ICCG on a hierarchical memory
multiprocessor - Addressing the triangular solve bottleneck. Parallel Computing,
18:719-741, 1992.
[24] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ.
Press, Baltimore, MD, 1983.
[25] C. H. Reinsch. A stable rational QR algorithm for the computation of
eigenvalues of a hermitian, tridiagonal matrix. Numer. Math., 25:591-597, 1971.
[26] J. Cullum and R. A. Willoughby. Lanczos algorithms for large symmetric
eigenvalue computations. Progress in Scientific Computing Series. Birkhauser Boston
Inc., 1983.
[27] J. F. Lee, D. K. Sun, and Z. J. Cendes. Full-wave analysis of dielectric
waveguides using tangential vector finite elements. IEEE Trans. Microwave Theory
Tech., MTT-39(8):1262-1271, August 1991.
[28] Hamandi, Lee, and Ozguner. Survey of domain decomposition methods. In
ACES Conference Digest, March 1995.
[29] D. Windheiser, E. Boyd, E. Hao, S. G. Abraham, and E. S. Davidson. KSR1
multiprocessor: Analysis of latency hiding techniques in a sparse solver. In Proc.
of the 7th International Parallel Processing Symposium, Newport Beach, April
1993.
Index

absorbers capacitor. 69
active, 198 caviiy

anisotropic, 199 cylindrical, 180


artificial. 130. 194 rectangular, 172
Perfectly Matched Layers (PML), 131, resonators, 171
194-201 ridged, 172

absorbing boundary conditions, 121 circuit transition


circular boundary. 121 coax-to-microstrip, 174
conformal. 186,192 CPW-to-microstrip. 173
cylindrical. 212 coatedconductor reflection, 70, 85-89
FEM formulation. 123. 184-193. groove geometry, 279
201-204 groovescattering, 132-134, 296
first-order. 186 parallel plates, 69, 83
planar. 192-193,206,211,216 radiation

rectangular boundary, 122 array reflection coefficient. 263


rectangular-cylindrical. 213 cavity-backed patch, 218

scattering examples cylindrical via, 220


circular cylinder. 128 patch antenna, 216, 255, 260-261

groove, in ground plane, 132 patch on ogive. 219


rectangular cylinder-coated. 145 scattering, three-dimensional

triangular cylinder. 129 composite cube, 207


second-order,188.192 conesphere, 217
spherical. 206. 209 cylindrical inlet, 210-212
survey. 183 metallic cube, 205
symmetric, 189 plate, 213
unsymmetric, 186 RCS of ogival cylinder. 236
adaptive integral method. 277 RCS of patch antenna, 252
aggregation, see Fast Multipole Method rectangular inlet, 210
anisotropic. 5, 27. 228. 268 scattering, two-dimensional, 120-127
antennas, see radiation circular cylinder, 128
applications groove,in ground plane, 132, 134, 296

337
338 Index

scattering, two-dimensional (continued} resonators, 171


rectangular cylinder-coated, 145 ridged. 172
triangular cylinder, 129 cavity-backed aperture. 245, 257. 266
transmission coefficient, 257 characteristic impedance
waveguide eigenvalues transmission line. 95
circular, 112 waveguide, 98
rectangular. 108-111 Cholesky, 299,303-304, 326
\320\242\320\225
modes, 111 circuit excitations, 170, 229. 238-243
TM modes. 4
\320\230 circuit transition

waveguide propagation coax-to-microstrip. 174


homogeneous. 97, 111-118 CPW-to-microstrip. 173
inhomogeneous, 98 circular cylinder, array radiation, 257
area coordinates, 43 circular cylinder scattering, 128. 135
array reflection coefficient, 263 coaled conductor reflection, 70, 85-89
artificial absorbers, 130-132, 194-201 coaled cylinder, scattering, 145

assembly collocation (point matching), 28


one-dimensional,76 combined field integral, 231
sample matlab code, 150 condensation of boundary conditions, 116
two-dimensional, 105, 108, 110. 115 condition number (matrix), 30
conductivity. 3, 194

Bayliss-Turkel, ABCs. 121 Conjugate Gradient (CG), 308


Berenger PML, 194 Conjugate Gradient-FFT (CG-FFT),
Besselfunction, 282 249 251
BICGSTAB, 310 Conjugate Gradient Squared (CGS),308
Biconjugate Gradient(BiCG), 245,248,318, constitutive relations, 2
328 continuity
see also iterative algorithms equation. 3
biconjugate gradient, unsymmetric, 309 field continuity. 232
bistatic echowidth, 146 interelement continuity, 165
boundary conditions coordinatestretching, 199

conductive, 20 coupling equation, 233, 243


Dirichlet, 8, 15, 80-81. 114 curl operator. 163-164
impedance, 17, 20-23,82-83, 160 curvature. 185
natural. 1, 17-20 curvilinear element, 61
Neumann. 8. 16.80-81,104, 130. 259 cylindrical boundary ABCs, 193
resistive sheet. 160
Rytov, 19 Dirichlet boundary condilions, see
sheet transition. 23-25, 160 boundary conditions
boundary integral mesh truncation, 134, disaggregation, see Fast Multipole Method
227, 247,270, 279-280 divergencetheorem. 11, 101,159,271
KirchhofT integral equation, 134-136 duality, II. 13
matrix. 137, 248 Dupin coordinate system, 184
self cell. 136.230
bricks, 45, 167-168, 268-270 echowidth, 124, 146-149
edge basis/elements
C\302\260
continuity, 38 definition, 138
C1 elements, 163 expansion, 138
capacitance of transmission line, 94 hierarchical, 62
capacitor, 69 matrix elements, 143
cavity three dimensions, 54, 56, 59, 163,
cylindrical, 180 166-167. 176,258.268
rectangular, 172 two dimensions,48, 51,54, 137
Index 339

vector plots, two-dimensional, 51-52, Fast Fourier Transform (FFT), 247, 249.
140-141 260, 267
see also elements fast integral methods, see adaptive integral
eigenvalue problem, 111,117, 144, 315, 320 method; Fast Multipole Method

degeneracy,172 Fast Multipole Method (FMM)


generalized eigenvalue problem, 165, 325 aggregation. 282. 285, 287, 292
power method, 322-323 pseudo code, 288, 290-291
Rayleigh matrix. 324, 325 disaggregation. 283, 286-287
standard eigenvaJue problem. 321. 325 pseudo code, 288,290-292
subspaee iteration, 324 exact FMM
electric conductivity. 194 algorithm, 280, 282
electric field integral operation count, 283. 294-295

three-dimensional. 10-12, 231, 240 pseudo code, 288, 291


two-dimensional, 14, 125, 127, 146 Fast Far Field Algorithm (FAFFA)
electrostatics, 6 algorithm, 285

element matrix operation count, 286. 294-295


brick, anisotropic. 268-272 pseudocode,290, 293

brick, isotropic. 167-168 logic flow. 287


linear, 75 matrix-vector product
quadrilateral. 149 procedure, 282

triangular element pseudo code. 293


283
edge-based, 143 multilevel FMM.

node-based. 104-105 operation count, 283-284.286,294-295


prism, 176-178 RMS error, 294. 296
triangular
elements translation, 282, 286-287
curvilinear. 61 pseudo code, 288, 292
H\302\260
(curl), 53, 166 windowed FMM
tf' (curl), 53 algorithm. 283

hexahedral, 54 operation count. 284, 289,294-295


hierarchical, 54-56 pseudo code, 289, 292

isoparametric, 41 feeds
linear. 39, 73 aperture, 229, 242
241
prism/pentahedral, 48, 59, 176-178, 247 coaxial cable,

40, 50, 149 filamentary current (probe),229,239


quadrilateral.
174
rectangular/brick, 40. 49, 167-168,
238, interchip feed-through,
243, 247, 252, 256, 268-270 microstrip, 242
257-259 modal excitations, 169
shell,

letrahedra/, 46, 51. 56-59. 166.242, 247 mode-matched, 169,243


triangular. 42. 51, 104. 143.272 plane wave. 238
gap, 240
equivalence principle, 132 voltage
see also circuit
equivalence surface. 9. 230
excitations

error control, 304 finite element-ABC. 121, 184-193,


expansion, see element matrix; elements;
201-204
Galerkin's method; piecewise Unite elemeni-artifietai absorber, 130,
functions 194-201
constant expansion; shape
Finite Element-Boundary Integral
(FE-B1) method, 227. 246.252, 256
face basis. 57 finite element method, one-dimensional
far zone, 12 assembly, 76
far zone field evaiuation. 146
126\342\200\224127. boundary condition implementation, 79,
fast far field algorithm, we Fast Multipole 83
Method boundary constraints, 78
  elemental matrix, 75
  Galerkin's method, 72
  history, 65
  memory, 66
  mesh examples, 67-68
  node numbering, 74
  procedure/steps, 68, 72
  pseudocode, 79, 89
  stiffness matrix, 78
  weak form, 72
  weighted residual method, 71, 75
finite element-potential formulation, 162
formulation
  electric, 34
  magnetic, 35
  potential, 35, 162
  scattered field, 202-204
  secondary (scattered), 35
  total field, 34, 201
functional, 159

Galerkin's method, 28, 32, 71, 74-75, 229-234
Gaussian curvature, 187
Gaussian elimination, 299
  see also LU decomposition
Generalized Minimal Residual (GMRES), 311, 314
  flexible, 320
global numbering, 168
Green's function
  dyadic, 10, 12-13, 232, 246, 251, 259, 270
  three-dimensional, static, 7
  two-dimensional, dynamic, 14, 125, 281
  two-dimensional, static, 7
Green's theorem, 8, 125, 229
groove scattering, 132, 296
ground plane, 133, 230, 239, 245

H⁰(curl), 53, 166
H¹(curl), 53
Hankel function, 125
  addition theorem, 281
  far field approximation, 284
Helmholtz equation, 6, 25, 98
hexahedral element, 54-56
hierarchical element, 56

impedance
  free space, 5, 14
  input, 261
  plane wave, 197
impedance boundary conditions, 17, 20-22, 82-83, 160
  natural, 1
  radiation, 5
  resistive, 19
inner product, 24
interchip feed-through, 174
interelement continuity, 165
isoparametric element, 41
isotropic medium, 2, 228, 269
iterative algorithms
  BICGSTAB, 310
  Biconjugate Gradient (BiCG), 245, 248, 318, 328
  biconjugate gradient, unsymmetric, 309
  Conjugate Gradient (CG), 308
  Conjugate gradient-FFT (CG-FFT), 249-251
  Conjugate Gradient Squared (CGS), 307
  convergence, 312, 314
  Generalized Minimal Residual (GMRES), 311, 314
  GMRES, flexible, 320
  Quasi-Minimal Residual (QMR), 310
  see also adaptive integral method; Fast Multipole Method

jacobian, 42, 46

Kirchhoff integral equation, 134-136
Krylov space, 311, 314, 325
k-space method, 278

Lagrange polynomials, 39
Lanczos algorithm, 325-326
Laplace's equation, 94
least squares method, 32-33
linear, 75
loss tangent, 4
LU decomposition, 303, 316-317
  Cholesky, 303-304, 326
  Incomplete LU (ILU), 300, 315 318

magnetic conductivity, 194
magnetic field integral
  three-dimensional, 10-12, 230
  two-dimensional, 14, 134, 279-280
magnetostatic, 9
material constants, 3
matrix
  bandwidth, 306-307
  Block Toeplitz, 247, 249, 260
  compression, see adaptive integral method; fast multipole method
  condition, 30, 308
  Cuthill-McKee, 306
  definitions, 31
  error control, 304
  forward elimination, 317
  matrix graph, 306
  matrix norms, 30
  norms, 30
  positive-definite, 31, 300
  preconditioning, 313-319
  storage, 299
  structure/formats, 306
    compressed sparse column, 301
    Compressed Sparse Row (CSR), 300-303
    finite element-ABC/PML, 307
    finite element-BI, 237, 247
    ITPACK, 300-303
    jagged, 300-303
    ordering, 305, 307
    pivoting, 304-305
    scheduling algorithm, 316-317
  vector product, 277, 282, 331-332
    pseudo code, 293
  see also element matrix
Maxwell's equations, 1-2
mesh examples, 67-68
mesh truncation, see absorbing boundary conditions; artificial absorbers; boundary integral mesh truncation
modal excitation, 169
moment method, 134-137, 229, 254
  periodic, see Galerkin's method
multiprocessor communication, 330

netlib, 321
Neumann boundary conditions, 7-8, 13, 16, 80-81, 104, 130, 259
node-based expansion, 102-103
node numbering
  global, 74, 168
  local, 74
null-space, 164
Nyquist sampling theorem, 282

orthogonalization, 322

packaging, 173
parallel plate waveguide, 69, 83
parallelization, 300, 328-330
  matrix-vector product, 331-332
  multiprocessor communication, 330
Perfectly Matched Layers (PML), 131, 194-201
Periodic Moment Method (PMM), 261
permeability, 3
permittivity, 3
phasor, 2
piecewise constant expansion, 134
pivoting, 304-305
planar boundary ABCs, 192-193
plane wave, see circuit excitations; feeds
plane wave impedance, 96, 197
Poisson's equation, 7
port scattering parameters, 169
positive-definite, 24, 31
potential formulation, 162
potentials, 7, 9, 35, 242-243
Poynting's theorem, 20-21
preconditioning, 313
  approximate inverse, 318-319
  block, 314
  diagonal, 313
  global steepest descent, 319
  ILU, 315
principal value, 230
prism/pentahedral element, 48, 59, 176-178, 247

quadrilateral element, 40, 50, 149
Quasi-Minimal Residual (QMR), 310

Radar Cross-Section (RCS), 205, 236, 252, 260
radar echo-area, 205
radiation
  cavity-backed patch, 218, 253, 256
  cylindrical via, 220
  notch antenna, 263
  patch on cylinder, 258, 262
  patch on ogive, 219
Rayleigh-Ritz method, 24, 32
  see also variational formulation
Rayleigh-Ritz minimization, 21, 24, 159, 161, 170
rectangular brick element, 40, 49, 167-168, 238, 243, 247, 252, 256, 268-270
rectangular cylinder scattering, 145
rectangular groove scattering, 133
resistive sheet boundary condition, 160
resonance, 157, 163, 231, 254

scattered field
  far zone evaluation, 126-127, 146
  integral expression, 125
  wave equation, two-dimensional, 97
scattering parameters
  N-port, 169
scattering, three-dimensions
  circular cylinder, 128, 135
  coated conductor reflection, 70, 85-89
  coated cylinder, 145
  composite cube, 207
  conesphere, 215
  cylindrical inlet, 211
  groove, 132
  metallic cube, 205
  plate, 213
  rectangular inlet, 209
  triangular cylinder, 129
  see also applications
scattering, two-dimensions, 120-127
self-adjoint, 24, 33
self-cell, 136
shape functions
  edge-based, 37, 48, 143
  node-based, 37, 39
  one-dimensional, 39, 73
  see also elements
sheet transition conditions, see boundary conditions
shell, 257-259
solvers, see iterative algorithms; LU decomposition; matrix
Sommerfeld radiation condition, 192
source modeling, see circuit excitations; feeds
spurious solutions, 157, 163
Sturm-Liouville problem, 69
superposition theorem, 23
surface curl, 185
surface gradient, 191
surface of revolution, 230, 264
surface waves, 129

tessellation, see mesh examples
tetrahedral element, 46, 51, 56-59, 166, 242, 247
time-harmonic, 2, 16
Toeplitz matrix, 247, 249, 260, 277
transfinite element method, 169
transition conditions, see boundary conditions
translation, see Fast Multipole Method
traveling waves, 129
triangular cylinder scattering, 129
triangular element, 42, 51, 104, 143, 272
two-dimensional problems, see applications

uniqueness theorem, 22

variational formulation, 24-27, 159, 161, 170
vector norms, 29
volume coordinates, 46, 56

Watson's transformation, 260
wave equation, 5
  general form, 100
  scalar, 97
  vector, 97-98, 184
  weak form, 72, 102, 230
waveguide eigenvalues
  circular, 112
  rectangular, 108-111
  TE modes, 111
  TM modes, 114
waveguide propagation
  homogeneous, 97, 144
  inhomogeneous, 98
wave (intrinsic) impedance, 5, 14
wave number, 5, 14
weak form
  one-dimensional, 72
  three-dimensional, 229-230
  two-dimensional, 102
weighted residual, 28, 71, 75
Whitney elements, 51, 140
  see also edge elements
Wilcox expansion, 186
windowed FMM, see Fast Multipole Method
About the Authors

John L. Volakis is a professor in the Department of Electrical Engineering and
Computer Science at the University of Michigan. He received his Ph.D. from
Ohio State University in 1982 and spent two years at Rockwell International
working on the B-1B program before going to Michigan. He has 20 years of
experience in numerical and analytical methods and pioneered the application and
development of hybrid finite element methods to high-frequency electromagnetics.
His publications include over 140 refereed journal articles, 12 book chapters on
analytical and numerical methods, and numerous conference articles. He also
coauthored the book Approximate Boundary Conditions in Electromagnetics
(Institution of Electrical Engineers, London, 1995). Dr. Volakis is a Fellow of the
IEEE and has advised over 20 Ph.D. students. He has served as an associate
editor for IEEE Transactions on Antennas and Propagation and Radio Science.
Currently, he is an associate editor for the IEEE Antennas and Propagation
Society Magazine and the Journal of Electromagnetic Waves and Applications.

Arindam Chatterjee obtained his Ph.D. from the University of Michigan in 1994.
From 1989 to 1994, he served as a research assistant and later as a Research Fellow
in the Radiation Laboratory, University of Michigan, Ann Arbor. His work there
dealt with the development, implementation, and application of the finite element
method and absorbing boundary conditions in modeling electromagnetic radiation
and scattering from arbitrary three-dimensional structures. From 1995 to 1996, he
worked at Compact Software. He is presently with the HP-EEsof division of Hewlett
Packard and works on the development of the HP HFSS (High-Frequency Structure
Simulator) finite element modeling package for CAD simulation.

Leo C. Kempel is a senior research engineer in Mission Research Corporation's
Electromagnetic Observables Sector. He received his Ph.D. from the University of
Michigan in 1994. In addition to conducting research in scattering reduction, Dr.
Kempel developed the finite element-boundary integral method for singly-curved
structures and modeled conformal antennas with complex material loading using
the FEM. His current interest has expanded to antennas on doubly-curved
conformal platforms, modeling anisotropic substrate, and to developing novel
hybridization strategies designed to marry the best properties of the finite element
method with other computational electromagnetics methods such as integral
equations or physical optics.
