Neural Network and Fuzzy System

Chapter 1

Introduction to Fuzzy Logic


1.1 Fuzzy Logic: The word fuzzy means uncertainty. Any event that does not resolve to one of the exact values (i.e., true or false) is fuzzy. Fuzzy logic was introduced in 1965 by Lotfi A. Zadeh. In other words, fuzzy logic is not logic that is fuzzy, but logic that is used to describe fuzziness. Everyday notions such as "warm", "tall", or "old" provide numerous examples through which the concept of fuzzy logic can be understood.
The notion central to fuzzy systems is that truth values (in fuzzy logic) or
membership values (in fuzzy sets) are indicated by a value on the range [0.0, 1.0],
with 0.0 representing absolute Falseness and 1.0 representing absolute Truth. For
example, let us take the statement:
"Rama is old."
If Rama’s age was 75, we might assign the statement the truth value of 0.80. The
statement could be translated into set terminology as follows:
"Rama is a member of the set of old people."
This statement would be rendered symbolically with fuzzy sets as:
mOLD(Rama) = 0.80
where m is the membership function, operating in this case on the fuzzy set of old
people, which returns a value between 0.0 and 1.0.
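To make the idea concrete, the following is a minimal Python sketch of such a membership function. The particular linear ramp and its breakpoints (55 and 80 years) are illustrative assumptions chosen so that an age of 75 maps to 0.80, as in the Rama example; they are not prescribed by the text.

    def membership_old(age):
        # Linear ramp: below 55 not "old" at all, above 80 fully "old".
        # The breakpoints are assumed so that membership_old(75) = 0.80.
        if age <= 55:
            return 0.0
        if age >= 80:
            return 1.0
        return (age - 55) / 25.0

    print(membership_old(75))  # -> 0.8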

A set is an unordered collection of distinct elements. It can be written explicitly by listing its elements within set brackets. If the order of the elements is changed or any element of a set is repeated, it makes no change to the set.

Example

 A set of all positive integers.


 A set of all the planets in the solar system.
 A set of all the states in India.
 A set of all the lowercase letters of the alphabet.

1.2 Differences between Fuzzy Logic and Neural Networks

Fuzzy logic allows making definite decisions based on imprecise or ambiguous data, whereas an artificial neural network (ANN) tries to incorporate the human thinking process to solve problems without modelling them mathematically. Even though both of these methods can be used to solve nonlinear and poorly specified problems, they are not otherwise related. In contrast to fuzzy logic, an ANN tries to apply the thinking process of the human brain to solve problems. Further, an ANN includes a learning process that involves learning algorithms and requires training data, whereas fuzzy logic requires the development of membership functions and rules that relate them. Fuzzy logic deals with fixed and approximate (not exact) reasoning: its variables can take any value in the range 0 to 1, in contrast to traditional binary sets, whose variables take the value 0 or 1. Since a variable can take any value between 0 and 1, a statement can be partially true, and this is why fuzzy logic is widely used in control-system applications. A neural network, on the other hand, is based on biological neural networks: it is made up of artificial neurons, interconnected with one another and working in unison to produce outputs, and it adapts to the system from the data given to it by adjusting the synaptic connections that exist between the neurons.

In short, fuzzy logic makes decisions based on the raw, ambiguous data given to it, whereas a neural network tries to learn from the data in the same way as a biological neural network. Both systems are used to solve nonlinear, complex problems, but they are otherwise unrelated.

1.3 Applications of Fuzzy System:

Aerospace

In aerospace, fuzzy logic is used in the following areas −

 Altitude control of spacecraft


 Satellite altitude control
 Flow and mixture regulation in aircraft deicing vehicles

Automotive

In automotive, fuzzy logic is used in the following areas −

 Trainable fuzzy systems for idle speed control


 Shift scheduling method for automatic transmission
 Intelligent highway systems
 Traffic control
 Improving efficiency of automatic transmissions

Business

In business, fuzzy logic is used in the following areas −

 Decision-making support systems


 Personnel evaluation in a large company

Defense

In defense, fuzzy logic is used in the following areas −



 Underwater target recognition
 Automatic target recognition of thermal infrared images
 Naval decision support aids
 Control of a hypervelocity interceptor
 Fuzzy set modeling of NATO decision making

Electronics

In electronics, fuzzy logic is used in the following areas −

 Control of automatic exposure in video cameras


 Humidity in a clean room
 Air conditioning systems
 Washing machine timing
 Microwave ovens
 Vacuum cleaners

Finance

In the finance field, fuzzy logic is used in the following areas −

 Banknote transfer control


 Fund management
 Stock market predictions

Industrial Sector

In industrial, fuzzy logic is used in following areas −

 Cement kiln control
 Heat exchanger control


 Activated sludge wastewater treatment process control
 Water purification plant control
 Quantitative pattern analysis for industrial quality assurance
 Control of constraint satisfaction problems in structural design
 Control of water purification plants

Manufacturing

In the manufacturing industry, fuzzy logic is used in following areas −

 Optimization of cheese production


 Optimization of milk production

Marine

In the marine field, fuzzy logic is used in the following areas −



 Autopilot for ships
 Optimal route selection
 Control of autonomous underwater vehicles
 Ship steering

Medical

In the medical field, fuzzy logic is used in the following areas −

 Medical diagnostic support system


 Control of arterial pressure during anesthesia
 Multivariable control of anesthesia
 Modeling of neuropathological findings in Alzheimer's patients
 Radiology diagnoses
 Fuzzy inference diagnosis of diabetes and prostate cancer

Securities

In securities, fuzzy logic is used in following areas −

 Decision systems for securities trading


 Various security appliances

Transportation

In transportation, fuzzy logic is used in the following areas −

 Automatic underground train operation


 Train schedule control
 Railway acceleration
 Braking and stopping

Pattern Recognition and Classification

In Pattern Recognition and Classification, fuzzy logic is used in the following areas −

 Fuzzy logic based speech recognition


 Fuzzy logic based handwriting recognition
 Fuzzy logic based facial characteristic analysis
 Command analysis
 Fuzzy image search

Psychology

In Psychology, fuzzy logic is used in following areas −



 Fuzzy logic based analysis of human behavior
 Criminal investigation and prevention based on fuzzy logic reasoning


1.4 Historical Evolution Fuzzy Set:

Human Reasoning was dominated for centuries by the fundamental “Laws of Thought”
(Korner, 1967), introduced by Aristotle (384-322 BC) and the philosophers that
preceded him, which include:
• The principle of identity
• The law of the excluded middle
• The law of contradiction

In particular, the second of the above laws, stating that every proposition has to be either "True" or "False", was the basis for the genesis of Aristotle's bivalent logic. The precision of traditional mathematics undoubtedly owes a large part of its success to this logic.

However, even when Parmenides proposed, around 400 BC, the first version of the law of the excluded middle, there were strong and immediate objections. For example, Heraclitus countered that things could be simultaneously true and not true, whereas the Buddha Siddhartha Gautama, who lived in India a century earlier, had already indicated that almost every notion contains elements of its opposite. The ancient Greek philosopher Plato (427-347 BC) laid the foundation of what was later called FL by claiming that there exists a third area beyond "True" and "False", where these two opposite notions can exist together. More modern philosophers like Hegel, Marx, Engels and others adopted and further cultivated this belief of Plato's.
The Polish philosopher Jan Lukasiewicz (1878-1956) was the first to propose a systematic alternative to bi-valued logic, introducing in the early 1900s a three-valued logic by adding the term "Possible" between "True" and "False" (Lejewski, 1967). Eventually he developed an entire notation and axiomatic system from which he hoped to derive modern mathematics. Later he also proposed four- and five-valued logics, and he finally arrived at the conclusion that, axiomatically, nothing could prevent the derivation of an infinite-valued logic.
But it was not until relatively recently that an infinite-valued logic was introduced (Zadeh, 1973), called FL, because it is based on the notion of FS initiated in 1965 (Zadeh, 1965) by Lotfi Zadeh, professor at the University of California, Berkeley. An important goal of FL is that through it algorithmic procedures can be devised which translate "fuzzy" terminology into numerical values, perform reliable operations upon those values, and then return natural-language statements in a reliable manner.

Zadeh (1921–2017) (Wikipedia, retrieved from the Web in February 2012) was born in Baku, Azerbaijan, USSR, to a Russian Jewish mother (Fanya Koriman), who was a pediatrician, and an Iranian Azeri father (Rahim Aleskerzade), who was a journalist on assignment from Iran.

At the age of 10, when Stalin introduced the collectivization of farms in the USSR, the Zadeh family moved to Iran. In 1942 Zadeh graduated from the University of Tehran with a degree in electrical engineering and moved to the USA in 1944. He received an MS from MIT in 1946 and a Ph.D. in electrical engineering from Columbia University in 1949. He taught for ten years at Columbia, being promoted to full professor in 1957, before moving to Berkeley in 1959. Among other contributions, he introduced jointly with J.R. Ragazzini in 1962 the pioneering z-transform method used today in digital analysis (Brule, 2016), whereas his more recent works include computing with words and perceptions (Zadeh, 1984; 2005a) and an outline towards a generalized theory of uncertainty (Zadeh, 2005b). It has been estimated that Zadeh, who died in Berkeley on 6 September 2017, aged 96, had already accumulated more than 950,000 citations by other researchers by 2011.
As was to be expected, the far-reaching theory of fuzzy systems aroused some objections in the scientific community. While there have been generic complaints about the fuzziness of assigning values to linguistic terms, the most cogent criticisms came from Haack (1979). She argued that there are only two areas – the nature of Truth and Falsity, and the utility of fuzzy systems – in which FL could possibly be needed, and then maintained that in both cases it can be shown that FL is unnecessary.

Fox (1981) responded to her objections, indicating that FL is useful in three areas: to handle real-world relationships which are inherently fuzzy, to calculate with the fuzzy data that frequently arise in real-world situations, and to describe the operation of some inferential systems which are inherently fuzzy. His most powerful arguments were that traditional logic and FL need not be seen as competitive but as complementary, and that FL, despite the objections of classical logicians, has found its way into practical applications and has proved very successful there.

1.5 The concept of Fuzzy Set

Real-life situations appear frequently in which some definitions have no clear boundaries, like "the young people of a city", "the good players of a team", "the diligent students of a class", etc. The need to model such situations mathematically was one of the main reasons that led to the development of FS theory.

Let U be the universe of discourse. Then, according to Zadeh (1965), a fuzzy subset A of U (or, for brevity, a FS in U) can be defined with the help of its membership function m_A: U → [0, 1], which assigns to each element x of U a real value m_A(x) in [0, 1], called the membership degree of x in A. The closer m_A(x) is to 1, the more x satisfies the characteristic property of A. One then defines A as a set of ordered pairs of the form

A = {(x, m_A(x)) : x ∈ U}.

Many authors, for reasons of simplicity, identify the FS A with its membership function m_A. A FS can also be denoted in the form of a symbolic sum, a symbolic power series, or a symbolic integral, when U is a finite set, a numerable set, or a set with the power of the continuum, respectively. For general facts on FS and the uncertainty connected to them we refer to the book of Klir and Folger (1988).
Example 1: The young human ages

Let U be the set of the non-negative integers not exceeding 140 (considered as the upper bound of human life), representing the human ages. The set of all ages not exceeding a given integer in U, e.g. 20, is a crisp subset of U. On the contrary, the set A of the young human ages, being not precisely defined, is a FS in U. The membership function of A can be defined, for instance, by

m_A(x) = 1, if 0 ≤ x ≤ 20
m_A(x) = [1 + ((x − 20)/5)²]⁻¹, if 20 < x ≤ 140

Therefore, the age of a recently born baby has membership degree m_A(0) = 1, while the age of 25 years has membership degree

m_A(25) = (1 + 1²)⁻¹ = 0.5, etc.
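A short Python sketch of this membership function follows. The piecewise form used is an assumption consistent with the two values given above, m_A(0) = 1 and m_A(25) = 0.5.

    def membership_young(age):
        # Ages up to 20 are fully "young"; beyond 20 the membership
        # decays so that m_A(25) = (1 + 1**2) ** -1 = 0.5.
        if age <= 20:
            return 1.0
        return 1.0 / (1.0 + ((age - 20) / 5.0) ** 2)

    print(membership_young(0))   # -> 1.0
    print(membership_young(25))  # -> 0.5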

Mathematical Representation of a Set

Sets can be represented in two ways −

Roster or Tabular Form

In this form, a set is represented by listing all the elements comprising it. The elements
are enclosed within braces and separated by commas.

Following are the examples of set in Roster or Tabular Form −

 Set of vowels in English alphabet, A = {a,e,i,o,u}


 Set of odd numbers less than 10, B = {1,3,5,7,9}



Set Builder Notation

In this form, the set is defined by specifying a property that elements of the set have in
common. The set is described as A = {x:p(x)}

Example 1 − The set {a,e,i,o,u} is written as

A = {x:x is a vowel in English alphabet}

Example 2 − The set {1,3,5,7,9} is written as

B = {x:1 ≤ x < 10 and (x%2) ≠ 0}

If an element x is a member of any set S, it is denoted by x ∈ S, and if an element y is not a member of set S, it is denoted by y ∉ S.

Example − If S = {1, 1.2, 1.7, 2}, then 1 ∈ S but 1.5 ∉ S.

Cardinality of a Set

Cardinality of a set S, denoted by |S|, is the number of elements of the set. The number is also referred to as the cardinal number. If a set has an infinite number of elements, its cardinality is ∞.

Example − |{1,4,3,5}| = 4,|{1,2,3,4,5,…}| = ∞

If there are two sets X and Y, |X| = |Y| denotes two sets X and Y having same
cardinality. It occurs when the number of elements in X is exactly equal to the number
of elements in Y. In this case, there exists a bijective function ‘f’ from X to Y.

|X| ≤ |Y| denotes that set X’s cardinality is less than or equal to set Y’s cardinality. It
occurs when the number of elements in X is less than or equal to that of Y. Here, there
exists an injective function ‘f’ from X to Y.

|X| < |Y| denotes that set X’s cardinality is less than set Y’s cardinality. It occurs when
the number of elements in X is less than that of Y. Here, the function ‘f’ from X to Y is
injective function but not bijective.

If |X| ≤ |Y| and |Y| ≤ |X| then |X| = |Y|. The sets X and Y are commonly referred to as equivalent sets.

Types of Sets

Sets can be classified into many types; some of which are finite, infinite, subset,
universal, proper, singleton set, etc.



Finite Set

A set which contains a definite number of elements is called a finite set.

Example − S = {x|x∈ N and 70 > x > 50}

Infinite Set

A set which contains an infinite number of elements is called an infinite set.

Example − S = {x|x∈ N and x > 10}

Subset

A set X is a subset of set Y (written as X ⊆ Y) if every element of X is an element of set Y.

Example 1 − Let X = {1,2,3,4,5,6} and Y = {1,2}. Here set Y is a subset of set X, as all the elements of set Y are in set X. Hence, we can write Y ⊆ X.

Example 2 − Let X = {1,2,3} and Y = {1,2,3}. Here set Y is a subset (not a proper subset) of set X, as all the elements of set Y are in set X. Hence, we can write Y ⊆ X.

Proper Subset

The term “proper subset” can be defined as “subset of but not equal to”. A Set X is a
proper subset of set Y (Written as X ⊂ Y) if every element of X is an element of set Y
and |X| < |Y|.

Example − Let X = {1,2,3,4,5,6} and Y = {1,2}. Here Y ⊂ X, since all elements of Y are contained in X and X has at least one element more than set Y.

Universal Set

It is a collection of all elements in a particular context or application. All the sets in that
context or application are essentially subsets of this universal set. Universal sets are
represented as U.

Example − We may define U as the set of all animals on earth. In this case, a set of all
mammals is a subset of U, a set of all fishes is a subset of U, a set of all insects is a
subset of U, and so on.



Empty Set or Null Set

An empty set contains no elements. It is denoted by Φ. As the number of elements in an empty set is finite, the empty set is a finite set. The cardinality of the empty set or null set is zero.

Example – S = {x|x∈ N and 7 < x < 8} = Φ

Singleton Set or Unit Set

A Singleton set or Unit set contains only one element. A singleton set is denoted by {s}.

Example − S = {x|x∈ N, 7 < x < 9} = {8}

Equal Set

If two sets contain the same elements, they are said to be equal.

Example − If A = {1,2,6} and B = {6,1,2}, they are equal as every element of set A is
an element of set B and every element of set B is an element of set A.

Equivalent Set

If the cardinalities of two sets are same, they are called equivalent sets.

Example − If A = {1,2,6} and B = {16,17,22}, they are equivalent, as the cardinality of A is equal to the cardinality of B, i.e. |A| = |B| = 3.

Overlapping Set

Two sets that have at least one common element are called overlapping sets. In case of
overlapping sets −

n(A∪B)=n(A)+n(B)−n(A∩B)
n(A∪B)=n(A−B)+n(B−A)+n(A∩B)
n(A)=n(A−B)+n(A∩B)
n(B)=n(B−A)+n(A∩B)
Example − Let, A = {1,2,6} and B = {6,12,42}. There is a common element ‘6’, hence
these sets are overlapping sets.



Disjoint Set

Two sets A and B are called disjoint sets if they do not have even one element in common. Therefore, disjoint sets have the following properties −

n(A∩B) = 0
n(A∪B) = n(A) + n(B)

Example − Let A = {1,2,6} and B = {7,9,14}. There is not a single common element, hence these sets are disjoint sets.

Operations on Classical Sets

Set Operations include Set Union, Set Intersection, Set Difference, Complement of Set,
and Cartesian Product.

Union

The union of sets A and B (denoted by A ∪ B) is the set of elements which are in A, in B, or in both A and B. Hence, A ∪ B = {x | x ∈ A OR x ∈ B}.

Example − If A = {10,11,12,13} and B = {13,14,15}, then A ∪ B = {10,11,12,13,14,15} – the common element occurs only once.

Intersection

The intersection of sets A and B (denoted by A ∩ B) is the set of elements which are in both A and B. Hence, A ∩ B = {x | x ∈ A AND x ∈ B}.

Example − If A = {10,11,12,13} and B = {13,14,15}, then A ∩ B = {13}.



Difference/ Relative Complement

The set difference of sets A and B (denoted by A–B) is the set of elements which are
only in A but not in B. Hence, A − B = {x|x∈ A AND x ∉ B}.

Example − If A = {10,11,12,13} and B = {13,14,15}, then (A − B) = {10,11,12} and (B − A) = {14,15}. Here, we can see (A − B) ≠ (B − A).

Complement of a Set

The complement of a set A (denoted by A′) is the set of elements which are not in set A.
Hence, A′ = {x|x∉ A}.

More specifically, A′ = (U−A) where U is a universal set which contains all objects.

Example − If A = {x | x belongs to the set of odd integers} then A′ = {y | y does not belong to the set of odd integers}.

Cartesian Product / Cross Product

The Cartesian product of n sets A1, A2, ..., An, denoted as A1 × A2 × ... × An, can be defined as the set of all possible ordered n-tuples (x1, x2, ..., xn) where x1 ∈ A1, x2 ∈ A2, ..., xn ∈ An.

Example − If we take two sets A = {a,b} and B = {1,2},

The Cartesian product of A and B is written as − A × B = {(a,1),(a,2),(b,1),(b,2)}



And, the Cartesian product of B and A is written as − B × A = {(1,a),(1,b),(2,a),(2,b)}
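These operations map directly onto Python's built-in set type; the following sketch simply reproduces the examples from this section.

    from itertools import product

    A = {10, 11, 12, 13}
    B = {13, 14, 15}

    print(A | B)  # union: {10, 11, 12, 13, 14, 15}
    print(A & B)  # intersection: {13}
    print(A - B)  # difference: {10, 11, 12}
    print(B - A)  # difference the other way: {14, 15}

    # Cartesian product of {a, b} and {1, 2}
    print(set(product({'a', 'b'}, {1, 2})))
    # {('a', 1), ('a', 2), ('b', 1), ('b', 2)}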

Properties of Classical Sets

Properties on sets play an important role for obtaining the solution. Following are the
different properties of classical sets −

Commutative Property

Having two sets A and B, this property states −

A∪B=B∪A
A∩B=B∩A
Associative Property

Having three sets A, B and C, this property states −

A∪(B∪C)=(A∪B)∪C
A∩(B∩C)=(A∩B)∩C
Distributive Property

Having three sets A, B and C, this property states −

A∪(B∩C)=(A∪B)∩(A∪C)
A∩(B∪C)=(A∩B)∪(A∩C)
Idempotency Property

For any set A, this property states −

A∪A=A
A∩A=A
Identity Property

For set A and universal set X, this property states −

A∪φ=A
A∩X=A



A∩φ=φ
A∪X=X
Transitive Property

Having three sets A, B and C, the property states −

If A⊆B⊆C

, then A⊆C

De Morgan’s Law

It is a very important law and helps in proving tautologies and contradictions. This law states −

(A ∩ B)′ = A′ ∪ B′
(A ∪ B)′ = A′ ∩ B′

Exercise:

1. What is Fuzzy System? Give Example for the same.


2. Explain the concept of fuzziness with suitable example.
3. State the applications of fuzzy sets.
4. What is a fuzzy set?
5. Give comparison between fuzzy system and neural networks.
6. Give the historical evolution of fuzzy system.



Chapter 2
Fuzzy Sets

Introduction:
Fuzzy logic starts with and builds on a set of user-supplied human language rules. The
fuzzy systems convert these rules to their mathematical equivalents. This simplifies the
job of the system designer and the computer, and results in much more accurate
representations of the way systems behave in the real world.

Additional benefits of fuzzy logic include its simplicity and its flexibility. Fuzzy logic
can handle problems with imprecise and incomplete data, and it can model nonlinear
functions of arbitrary complexity. "If you don't have a good plant model, or if the
system is changing, then fuzzy will produce a better solution than conventional control
techniques," says Bob Varley, a Senior Systems Engineer at Harris Corp., an aerospace
company in Palm Bay, Florida.

You can create a fuzzy system to match any set of input-output data. The Fuzzy Logic
Toolbox makes this particularly easy by supplying adaptive techniques such as adaptive
neuro-fuzzy inference systems (ANFIS) and fuzzy subtractive clustering.

Fuzzy logic models, called fuzzy inference systems, consist of a number of conditional
"if-then" rules. For the designer who understands the system, these rules are easy to
write, and as many rules as necessary can be supplied to describe the system adequately
(although typically only a moderate number of rules are needed).

In fuzzy logic, unlike standard conditional logic, the truth of any statement is a matter
of degree. (How cold is it? How high should we set the heat?) We are familiar with
inference rules of the form p -> q (p implies q). With fuzzy logic, it's possible to say
(.5* p ) -> (.5 * q). For example, for the rule if (weather is cold) then (heat is on), both
variables, cold and on, map to ranges of values. Fuzzy inference systems rely on
membership functions to explain to the computer how to calculate the correct value
between 0 and 1. The degree to which any fuzzy statement is true is denoted by a value
between 0 and 1.

Not only do the rule-based approach and flexible membership function scheme make
fuzzy systems straightforward to create, but they also simplify the design of systems
and ensure that you can easily update and maintain the system over time.

Fuzzy set theory was formalised by Professor Lotfi Zadeh at the University of California in 1965. What Zadeh proposed is very much a paradigm shift that first gained acceptance in the Far East; its successful application has ensured its adoption around the world.

A paradigm is a set of rules and regulations which defines boundaries and tells us what
to do to be successful in solving problems within these boundaries. For example the use
of transistors instead of vacuum tubes is a paradigm shift - likewise the development of
Fuzzy Set Theory from conventional bivalent set theory is a paradigm shift.

Bivalent set theory can be somewhat limiting if we wish to describe a 'humanistic' problem mathematically. For example, Fig. 1 below illustrates bivalent sets used to characterise the temperature of a room.

The most obvious limiting feature of bivalent sets that can be seen clearly from the
diagram is that they are mutually exclusive - it is not possible to have membership of
more than one set (opinion would widely vary as to whether 50 degrees Fahrenheit is
'cold' or 'cool' hence the expert knowledge we need to define our system is
mathematically at odds with the humanistic world). Clearly, it is not accurate to define a
transition from a quantity such as 'warm' to 'hot' by the application of one degree
Fahrenheit of heat. In the real world a smooth (unnoticeable) drift from warm to hot
would occur.

This natural phenomenon can be described more accurately by Fuzzy Set Theory. Fig.2
below shows how fuzzy sets quantifying the same information can describe this natural
drift.



Fuzzy logic is a problem-solving control-system methodology that lends itself to implementation in systems ranging from simple, small, embedded micro-controllers to large, networked, multi-channel PC- or workstation-based data acquisition and control systems. It can be implemented in hardware, software, or a combination of both. FL provides a simple way to arrive at a definite conclusion based upon vague, ambiguous, imprecise, noisy, or missing input information. FL's approach to control problems mimics how a person would make decisions, only much faster.
FL requires some numerical parameters in order to operate, such as what is considered a significant error and a significant rate-of-change-of-error, but exact values of these numbers are usually not critical unless very responsive performance is required, in which case empirical tuning would determine them. For example, a simple temperature control system could use a single temperature feedback sensor whose data is subtracted from the command signal to compute "error" and then time-differentiated to yield the error slope or rate-of-change-of-error, hereafter called "error-dot".
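As a minimal sketch of the computation just described (the fixed sample period dt is an assumption; the text does not specify one):

    def error_and_error_dot(command, feedback, prev_error, dt):
        # "error" is the feedback sensor reading subtracted from the
        # command signal; "error-dot" approximates its time derivative
        # by a finite difference over the sample period dt (assumed).
        error = command - feedback
        error_dot = (error - prev_error) / dt
        return error, error_dot

    # e.g. setpoint 22.0, measured 20.5, previous error 2.0, dt = 1 s
    print(error_and_error_dot(22.0, 20.5, 2.0, 1.0))  # (1.5, -0.5)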

2.1 History of Fuzzy Logic

Although the concept of fuzzy logic had been studied since the 1920s, the term "fuzzy logic" was first used in 1965 by Lotfi Zadeh, a professor at UC Berkeley in California. He observed that conventional computer logic was not capable of manipulating data representing subjective or unclear human ideas.

Fuzzy logic has been applied to various fields, from control theory to AI. It was designed to allow the computer to make distinctions among data that are neither true nor false, something similar to the process of human reasoning, with notions like "a little dark", "some brightness", etc.



2.2 Characteristics of Fuzzy Logic

Here are some important characteristics of fuzzy logic:

 A flexible and easy-to-implement machine learning technique
 Helps you to mimic the logic of human thought
 Logic may have two values which represent two possible solutions
 A highly suitable method for uncertain or approximate reasoning
 Views inference as a process of propagating elastic constraints
 Allows you to build nonlinear functions of arbitrary complexity
 Should be built with the complete guidance of experts

However, fuzzy logic is not a cure-all, so it is equally important to understand where we should not use it. Here are certain situations when you had better not use fuzzy logic:

 If you don't find it convenient to map an input space to an output space
 When common sense suffices
 When many controllers can do a fine job without it

2.3 Fuzzy Logic Architecture

Fuzzy logic architecture has four main parts:

Rule Base:

It contains all the rules and the if-then conditions offered by the experts to control the decision-making system. Recent developments in fuzzy theory provide various methods for the design and tuning of fuzzy controllers; these developments significantly reduce the number of fuzzy rules required.

Fuzzification:

The fuzzification step converts inputs: it translates crisp numbers into fuzzy sets. Crisp inputs, such as room temperature or pressure, are measured by sensors and passed into the control system for further processing.

Inference Engine:

It determines the degree of match between the fuzzy input and the rules. Based on the degree of match, it determines which rules to fire for the given input. The fired rules are then combined to develop the control actions.

Defuzzification:

Finally, the defuzzification process converts the fuzzy sets into a crisp value. Many techniques are available, so you need to select the one best suited to your expert system.
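The following Python sketch strings the four parts together for a one-input temperature controller. The membership functions, the three rules, and the heater power levels are illustrative assumptions, not taken from the text.

    def mu_cold(t): return max(0.0, min(1.0, (18.0 - t) / 6.0))
    def mu_warm(t): return max(0.0, 1.0 - abs(t - 21.0) / 4.0)
    def mu_hot(t):  return max(0.0, min(1.0, (t - 24.0) / 6.0))

    def controller(temp):
        # 1. Fuzzification: crisp sensor value -> membership degrees
        cold, warm, hot = mu_cold(temp), mu_warm(temp), mu_hot(temp)
        # 2./3. Rule base + inference engine: each rule fires to the
        # degree its antecedent holds (assumed heater levels 1/0.5/0)
        rules = [(cold, 1.0), (warm, 0.5), (hot, 0.0)]
        # 4. Defuzzification by a weighted average of the rule outputs
        num = sum(w * out for w, out in rules)
        den = sum(w for w, _ in rules)
        return num / den if den else 0.0

    print(controller(16.0))  # cold room -> heater power 1.0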

Fuzzy Logic vs. Probability

Fuzzy logic: Tom's degree of membership in the set of old people is 0.90.
Probability: There is a 90% chance that Tom is old.

Fuzzy logic takes truth degrees as a mathematical basis to model the phenomenon of vagueness.
Probability is a mathematical model of ignorance.

Crisp vs. Fuzzy

Crisp: has a strict boundary – an element is either in (T) or out (F).
Fuzzy: has a fuzzy boundary with a degree of membership.

Crisp: a crisp set can be viewed as a special case of a fuzzy set.
Fuzzy: a fuzzy set cannot, in general, be reduced to a crisp set.

Crisp: truth values are True/False, i.e. {0, 1}.
Fuzzy: membership values lie on [0, 1].

Crisp: the laws of the excluded middle and of non-contradiction hold.
Fuzzy: the laws of the excluded middle and of non-contradiction need not hold.

Classical Set vs. Fuzzy Set Theory

Classical set: classes of objects have sharp boundaries.
Fuzzy set theory: classes of objects do not have sharp boundaries.

Classical set: a classical set is defined by crisp boundaries, i.e., there is no uncertainty about the location of the set boundaries.
Fuzzy set theory: a fuzzy set always has ambiguous boundaries, i.e., there may be uncertainty about the location of the set boundaries.

Classical set: widely used in digital system design.
Fuzzy set theory: used only in fuzzy controllers.

Fuzzy Logic Examples

In fuzzy systems, values are denoted by a number in the range 0 to 1, where 1.0 represents absolute truth and 0.0 represents absolute falseness.

2.4 Application Areas of Fuzzy Logic

The list below shows how well-known companies use fuzzy logic in their products.

Anti-lock brakes (Nissan): uses fuzzy logic to control the brakes in hazardous cases depending on car speed, acceleration, and wheel speed.
Auto transmission (NOK/Nissan): fuzzy logic controls the fuel injection and ignition based on throttle setting, cooling-water temperature, RPM, etc.
Auto engine (Honda, Nissan): used to select the gear based on engine load, driving style, and road conditions.
Copy machine (Canon): used for adjusting drum voltage based on picture density, humidity, and temperature.
Cruise control (Nissan, Isuzu, Mitsubishi): used to adjust the throttle setting to set car speed and acceleration.
Dishwasher (Matsushita): used for adjusting the cleaning cycle and the rinse and wash strategies depending on the number of dishes and the amount of food on them.
Elevator control (Fujitec, Mitsubishi Electric, Toshiba): used to reduce waiting time based on passenger traffic.
Golf diagnostic system (Maruman Golf): selects a golf club based on the golfer's swing and physique.
Fitness management (Omron): fuzzy rules are applied to check the fitness of employees.
Kiln control (Nippon Steel): mixes cement.
Microwave oven (Mitsubishi Chemical): sets power and cooking strategy.
Palmtop computer (Hitachi, Sharp, Sanyo, Toshiba): recognizes handwritten Kanji characters.
Plasma etching (Mitsubishi Electric): sets etch time and strategy.

2.4.1 Advantages of Fuzzy Logic Systems

 The structure of fuzzy logic systems is simple and understandable
 Fuzzy logic is widely used for commercial and practical purposes
 It helps you to control machines and consumer products
 It may not offer precise reasoning, but it offers acceptable reasoning
 It helps you to deal with uncertainty in engineering
 Mostly robust, as no precise inputs are required
 It can be programmed to cope if a feedback sensor stops working
 It can easily be modified to improve or alter system performance
 Inexpensive sensors can be used, which keeps the overall system cost and complexity low
 It provides an effective solution to complex issues

2.4.2 Disadvantages of Fuzzy Logic Systems

 Fuzzy logic is not always accurate; the results are based on assumptions and so may not be widely accepted
 Fuzzy systems do not have the machine-learning and pattern-recognition capabilities of neural networks
 Validation and verification of a fuzzy knowledge-based system needs extensive testing with hardware
 Setting exact fuzzy rules and membership functions is a difficult task
 Fuzzy logic is sometimes confused with probability theory and its terminology

2.5 Features of Membership Functions

We will now discuss the different features of Membership Functions.

Core

For any fuzzy set A˜, the core of a membership function is the region of the universe that is characterized by full membership in the set. Hence, the core consists of all those elements y of the universe of information such that

μA˜(y) = 1

Support

For any fuzzy set A˜, the support of a membership function is the region of the universe that is characterized by nonzero membership in the set. Hence, the support consists of all those elements y of the universe of information such that

μA˜(y) > 0

Boundary

For any fuzzy set A˜, the boundary of a membership function is the region of the universe that is characterized by nonzero but incomplete membership in the set. Hence, the boundary consists of all those elements y of the universe of information such that

1 > μA˜(y) > 0
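For a discrete fuzzy set these three regions are easy to compute. In the Python sketch below the fuzzy set itself is an assumed example, stored as a dict from elements to membership degrees.

    # Core, support and boundary of a discrete fuzzy set, following
    # the definitions above. The fuzzy set A is an assumed example.
    A = {'a': 0.0, 'b': 0.3, 'c': 1.0, 'd': 0.7, 'e': 1.0}

    core     = {y for y, mu in A.items() if mu == 1.0}
    support  = {y for y, mu in A.items() if mu > 0.0}
    boundary = {y for y, mu in A.items() if 0.0 < mu < 1.0}

    print(core)      # elements with membership 1.0: {'c', 'e'}
    print(support)   # elements with membership > 0: {'b', 'c', 'd', 'e'}
    print(boundary)  # strictly between 0 and 1: {'b', 'd'}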



2.5.1 Fuzzification

It may be defined as the process of transforming a crisp set into a fuzzy set, or a fuzzy set into a fuzzier set. Basically, this operation translates accurate crisp input values into linguistic variables.

Following are the two important methods of fuzzification −

Support Fuzzification(s-fuzzification) Method

In this method, the fuzzified set can be expressed with the help of the following relation −

A˜ = μ1Q(x1) + μ2Q(x2) + ... + μnQ(xn)

Here the fuzzy set Q(xi) is called the kernel of fuzzification. This method is implemented by keeping μi constant and transforming xi to a fuzzy set Q(xi).

Grade Fuzzification (g-fuzzification) Method

It is quite similar to the above method, but the main difference is that it keeps xi constant and expresses μi as a fuzzy set.

2.5.2 Defuzzification

It may be defined as the process of reducing a fuzzy set into a crisp set or to convert a
fuzzy member into a crisp member.

We have already studied that the fuzzification process involves conversion from crisp
quantities to fuzzy quantities. In a number of engineering applications, it is necessary to
defuzzify the result or rather “fuzzy result” so that it must be converted to crisp result.
Mathematically, the process of Defuzzification is also called “rounding it off”.

The different methods of Defuzzification are described below −

Max-Membership Method

This method is limited to peaked output functions and is also known as the height method. Mathematically it can be represented as follows −

μA˜(x∗) ≥ μA˜(x), for all x ∈ X

Here, x∗ is the defuzzified output.



Centroid Method

This method is also known as the center of area or the center of gravity method. Mathematically, the defuzzified output x∗ will be represented as −

x∗ = ∫ μA˜(x)·x dx / ∫ μA˜(x) dx
Weighted Average Method

In this method, each membership function is weighted by its maximum membership value. Mathematically, the defuzzified output x∗ will be represented as −

x∗ = Σ μA˜(xi)·xi / Σ μA˜(xi)
Mean-Max Membership

This method is also known as the middle of the maxima. Mathematically, the defuzzified output x∗ will be represented as −

x∗ = (x1 + x2 + ... + xn) / n

where x1, ..., xn are the n points at which the membership function attains its maximum.
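A sketch of two of these methods for a fuzzy set sampled at discrete points; the sample data xs/mus below are assumed for illustration.

    xs  = [0, 1, 2, 3, 4, 5]
    mus = [0.0, 0.2, 0.8, 1.0, 1.0, 0.3]

    # Centroid: discrete approximation of the center-of-gravity ratio
    centroid = sum(m * x for x, m in zip(xs, mus)) / sum(mus)

    # Mean-max: average of the points where membership is maximal
    peak = max(mus)
    maxima = [x for x, m in zip(xs, mus) if m == peak]
    mean_max = sum(maxima) / len(maxima)

    print(round(centroid, 3))  # 3.121
    print(mean_max)            # (3 + 4) / 2 = 3.5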

2.6 Operations on Fuzzy Sets

Having two fuzzy sets A˜ and B˜, the universe of information U and an element 𝑦 of
the universe, the following relations express the union, intersection and complement
operation on fuzzy sets.

Union/Fuzzy ‘OR’

Let us consider the following representation to understand how the Union/Fuzzy ‘OR’
relation works −

μA˜∪B˜(y) = μA˜(y) ∨ μB˜(y), ∀y ∈ U

Here ∨ represents the ‘max’ operation.



Intersection/Fuzzy ‘AND’

Let us consider the following representation to understand how the Intersection/Fuzzy 'AND' relation works −

μA˜∩B˜(y) = μA˜(y) ∧ μB˜(y), ∀y ∈ U

Here ∧ represents the ‘min’ operation.

Complement/Fuzzy ‘NOT’

Let us consider the following representation to understand how the Complement/Fuzzy 'NOT' relation works −

μ¬A˜(y) = 1 − μA˜(y), ∀y ∈ U
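These three operations translate directly into code. In the sketch below, the discrete fuzzy sets A and B over a common universe U are assumed examples.

    # Union, intersection and complement of two discrete fuzzy sets,
    # using the max / min / 1-minus operations defined above.
    U = ['y1', 'y2', 'y3']
    A = {'y1': 0.2, 'y2': 0.7, 'y3': 1.0}
    B = {'y1': 0.5, 'y2': 0.4, 'y3': 0.6}

    union        = {y: max(A[y], B[y]) for y in U}  # fuzzy OR
    intersection = {y: min(A[y], B[y]) for y in U}  # fuzzy AND
    complement_A = {y: 1.0 - A[y] for y in U}       # fuzzy NOT

    print(union)         # {'y1': 0.5, 'y2': 0.7, 'y3': 1.0}
    print(intersection)  # {'y1': 0.2, 'y2': 0.4, 'y3': 0.6}
    print(complement_A)  # {'y1': 0.8, 'y2': 0.3, 'y3': 0.0}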



Definition. (support) Let A be a fuzzy subset of X; the support of A, denoted supp(A), is the crisp subset of X whose elements all have nonzero membership grades in A:

supp(A) = {x ∈ X | A(x) > 0}.

Definition. (normal fuzzy set) A fuzzy subset A of a classical set X is called normal if there exists an x ∈ X such that A(x) = 1. Otherwise A is subnormal.

Definition. (α-cut) An α-level set of a fuzzy set A of X is a non-fuzzy set denoted by [A]α and defined by

[A]α = {t ∈ X | A(t) ≥ α}, if α > 0; cl(supp A), if α = 0,

where cl(supp A) denotes the closure of the support of A.

Definition. (convex fuzzy set) A fuzzy set A of X is called convex if [A]α is a convex subset of X for all α ∈ [0, 1]. (Figure: an α-cut of a triangular fuzzy number.)

In many situations people are only able to characterize numeric information imprecisely. For example, people use terms such as about 5000, near zero, or essentially bigger than 5000. These are examples of what are called fuzzy numbers. Using the theory of fuzzy subsets we can represent these fuzzy numbers as fuzzy subsets of the set of real numbers.

Definition. (fuzzy number) A fuzzy number A is a fuzzy set of the real line with a normal, (fuzzy) convex and continuous membership function of bounded support. The family of fuzzy numbers will be denoted by F.



Definition. (quasi fuzzy number) A quasi fuzzy number A is a fuzzy set of the real line with a normal, fuzzy convex and continuous membership function satisfying the limit conditions

lim_{t→∞} A(t) = 0,  lim_{t→−∞} A(t) = 0.

Let A be a fuzzy number. Then [A]γ is a closed convex (compact) subset of R for all γ ∈ [0, 1]. Let us introduce the notations

a1(γ) = min [A]γ,  a2(γ) = max [A]γ.

In other words, a1(γ) denotes the left-hand side and a2(γ) the right-hand side of the γ-cut. It is easy to see that if α ≤ β then [A]α ⊇ [A]β.

Furthermore, the left-hand side function a1 : [0, 1] → R is monotone increasing and lower semi-continuous, and the right-hand side function a2 : [0, 1] → R is monotone decreasing and upper semi-continuous.

We shall use the notation [A]γ = [a1(γ), a2(γ)]. The support of A is the open interval (a1(0), a2(0)).

If A is not a fuzzy number then there exists a γ ∈ [0, 1] such that [A]γ is not a convex
subset of R.
Definition. (triangular fuzzy number) A fuzzy set A is called a triangular fuzzy number with peak (or center) a, left width α > 0 and right width β > 0 if its membership function has the following form

A(t) = 1 − (a − t)/α, if a − α ≤ t ≤ a;
A(t) = 1 − (t − a)/β, if a ≤ t ≤ a + β;
A(t) = 0, otherwise,

and we use the notation A = (a, α, β). It can easily be verified that

[A]γ = [a − (1 − γ)α, a + (1 − γ)β], ∀γ ∈ [0, 1].

The support of A is (a − α, a + β). A triangular fuzzy number with center a may be seen as a fuzzy quantity "x is approximately equal to a".

Definition. (trapezoidal fuzzy number) A fuzzy set A is called a trapezoidal fuzzy number with tolerance interval [a, b], left width α and right width β if its membership function has the following form

A(t) = 1 − (a − t)/α, if a − α ≤ t ≤ a;
A(t) = 1, if a ≤ t ≤ b;
A(t) = 1 − (t − b)/β, if b ≤ t ≤ b + β;
A(t) = 0, otherwise,

and we use the notation A = (a, b, α, β). It can easily be shown that

[A]γ = [a − (1 − γ)α, b + (1 − γ)β], ∀γ ∈ [0, 1].

The support of A is (a − α, b + β). A trapezoidal fuzzy number may be seen as a fuzzy quantity "x is approximately in the interval [a, b]".
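Both definitions translate directly into code. The sketch below implements the triangular membership function and its γ-cut from the formulas above; the example number A = (2, 1, 1) is an assumption for illustration.

    def triangular(t, a, alpha, beta):
        # Membership function of the triangular fuzzy number (a, alpha, beta)
        if a - alpha <= t <= a:
            return 1.0 - (a - t) / alpha
        if a < t <= a + beta:
            return 1.0 - (t - a) / beta
        return 0.0

    def gamma_cut(a, alpha, beta, gamma):
        # [A]^gamma = [a - (1-gamma)*alpha, a + (1-gamma)*beta]
        return (a - (1 - gamma) * alpha, a + (1 - gamma) * beta)

    # A = (2, 1, 1): "x is approximately equal to 2"
    print(triangular(2.0, 2, 1, 1))  # 1.0 (the peak)
    print(triangular(1.5, 2, 1, 1))  # 0.5
    print(gamma_cut(2, 1, 1, 0.5))   # (1.5, 2.5)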

Definition. (subsethood) Let A and B are fuzzy subsets of a classical set X. We say that
A is a subset of B if A(t) ≤ B(t), ∀t ∈ X.



Operations on fuzzy sets
We extend the classical set-theoretic operations from ordinary set theory to fuzzy sets. We note that all those operations which are extensions of crisp concepts reduce to their usual meaning when the fuzzy subsets have membership degrees that are drawn from {0, 1}. For this reason, when extending operations to fuzzy sets we use the same symbol as in set theory. Let A and B be fuzzy subsets of a nonempty (crisp) set X.

Definition. (intersection) The intersection of A and B is defined as

(A ∩ B)(t) = min{A(t), B(t)} = A(t) ∧ B(t), for all t ∈ X.

Intersection of two triangular fuzzy numbers.

Definition. (union) The union of A and B is defined as

(A ∪ B)(t) = max{A(t), B(t)} = A(t) ∨ B(t), for all t ∈ X.

Union of two triangular fuzzy numbers.

Definition. (complement) The complement of a fuzzy set A is defined as

(¬A)(t) = 1 − A(t).



A closely related pair of properties which hold in ordinary set theory are the law of the excluded middle, A ∨ ¬A = X, and the law of non-contradiction, A ∧ ¬A = ∅. It is clear that ¬1_X = ∅ and ¬∅ = 1_X; however, the laws of excluded middle and non-contradiction are not satisfied in fuzzy logic.

2.7 FUZZY LOGIC OBJECTIONS


It would be remarkable if a theory as far-reaching as fuzzy systems did not arouse some objections in the professional community. While there have been generic complaints about the "fuzziness" of the process of assigning values to linguistic terms, perhaps the most cogent criticisms come from Haack. A formal logician, Haack argues that there are only two areas in which fuzzy logic could possibly be demonstrated to be "needed," and then maintains that in each case it can be shown that fuzzy logic is not necessary.

The first area Haack defines is that of the nature of Truth and Falsity: if it could be
shown, she maintains, that these are fuzzy values and not discrete ones, then a need for
fuzzy logic would have been demonstrated. The other area she identifies is that of fuzzy
systems' utility: if it could be demonstrated that generalizing classic logic to encompass
fuzzy logic would aid in calculations of a given sort, then again a need for fuzzy logic
would exist.

In regards to the first statement, Haack argues that True and False are discrete terms.
For example, "The sky is blue" is either true or false; any fuzziness to the statement
arises from an imprecise definition of terms, not out of the nature of Truth. As far as
fuzzy systems' utility is concerned, she maintains that no area of data manipulation is
made easier through the introduction of fuzzy calculus; if anything, she says, the
calculations become more complex. Therefore, she asserts, fuzzy logic is unnecessary.

Fox has responded to her objections, indicating that there are three areas in which fuzzy
logic can be of benefit: as a "requisite" apparatus (to describe real-world relationships
which are inherently fuzzy); as a "prescriptive" apparatus (because some data is fuzzy,
and therefore requires a fuzzy calculus); and as a "descriptive" apparatus (because some
inferencing systems are inherently fuzzy).

His most powerful arguments come, however, from the notion that fuzzy and classic
logics need not be seen as competitive, but complementary. He argues that many of
Haack's objections stem from a lack of semantic clarity, and that ultimately fuzzy
statements may be translatable into phrases which classical logicians would find
palatable.

Exercise:

1. Explain in brief about fuzzy logic.


2. Discuss the history of fuzzy system.



3. Explain fuzzy operations with suitable example.
4. What is subsethood? Explain in brief with suitable example.
5. Explain trapezoidal and union operation.
6. Explain intersection and triangular operations.
7. Explain in brief quasi fuzzy number.



Chapter 3
Fuzzy Relations and Implications

3.1 Crisp and Fuzzy Relation

A crisp relation merely indicates whether elements are related; a fuzzy relation generalizes this by attaching degrees, or strengths, of relation (membership grades) to each tuple. A crisp relation is thus a restricted case of a fuzzy relation:

crisp relation + degrees or strengths of relation = fuzzy relation.

。Cartesian product:

×_{i∈Nn} Xi = {(x1, ..., xn) | xi ∈ Xi, i ∈ Nn}, where Nn = {1, 2, ..., n}.

。n-ary relation: a subset of ×_{i∈Nn} Xi, i.e.

R(X1, X2, ..., Xn) ⊆ X1 × X2 × ... × Xn

(a set contained in the universal set).

Characteristic function:

χ_R(x1, ..., xn) = 1 if (x1, ..., xn) ∈ R, and 0 otherwise.

。According to the number of sets involved, relations are binary, ternary, quaternary, quinary, ..., n-ary.

Relations

Definition of Relation

A relation among crisp sets X1, ..., Xn is a subset of X1 × ... × Xn, denoted as R(X1, ..., Xn) or R(Xi | 1 ≤ i ≤ n). So the relation R(X1, ..., Xn) ⊆ X1 × ... × Xn is a set, too. The basic concepts of sets can also be applied to relations: containment, subset, union, intersection, complement. Each crisp relation can be defined by its characteristic function

R(x1, ..., xn) = 1, if and only if (x1, ..., xn) ∈ R; 0, otherwise.

The membership of (x1, ..., xn) in R signifies that the elements of (x1, ..., xn) are related to each other.

Relation as Ordered Set of Tuples

A relation can be written as a set of ordered tuples. Thus R(X1, ..., Xn) can be represented by an n-dimensional membership array R = [r_{i1,...,in}]:
• each index i1 of R corresponds to exactly one member of X1,
• each index i2 of R corresponds to exactly one member of X2,
• and so on.

If (x1, ..., xn) ∈ X1 × ... × Xn corresponds to r_{i1,...,in} ∈ R, then

r_{i1,...,in} = 1, if and only if (x1, ..., xn) ∈ R; 0, otherwise.

Fuzzy Relations
The characteristic function of a crisp relation can be generalized to allow tuples to have
degrees of membership. • Recall the generalization of the characteristic function of a crisp set!
Then a fuzzy relation is a fuzzy set defined on tuples (x1,...,xn) that may have varying
degrees of membership within the relation. The membership grade indicates strength of the
present relation between elements of the tuple. The fuzzy relation can also be represented by
an n-dimensional membership array.

Cartesian Product of Fuzzy Sets: n Dimensions


Let n ≥ 2 fuzzy sets A1,...,An be defined in the universes of discourse X1,...,Xn, respectively.
The Cartesian product of A1,...,An denoted by A1 × ... × An is a fuzzy relation in the product
space X1 × ... × Xn. It is defined by its membership function

µA1×...×An(x1,...,xn) = ⊤(µA1(x1),...,µAn(xn))

where xi ∈ Xi, 1 ≤ i ≤ n. Usually ⊤ is the minimum (sometimes also the product).

Cartesian Product of Fuzzy Sets: 2 Dimensions


A special case of the Cartesian product is when n = 2. Then the Cartesian product of fuzzy
sets A∈ F(X) and B ∈ F(Y) is a fuzzy relation A × B ∈ F(X × Y) defined by

µA×B(x,y) = ⊤[µA(x), µB(y)], ∀x ∈ X, ∀y ∈ Y.
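A minimal sketch of this two-dimensional case with the minimum t-norm; the discrete fuzzy sets A and B below are assumed examples.

    # Cartesian product of fuzzy sets A in F(X) and B in F(Y) using
    # the minimum t-norm, as defined above.
    A = {'x1': 0.3, 'x2': 1.0}
    B = {'y1': 0.6, 'y2': 0.9}

    product = {(x, y): min(mu_a, mu_b)
               for x, mu_a in A.items()
               for y, mu_b in B.items()}

    print(product)
    # {('x1','y1'): 0.3, ('x1','y2'): 0.3, ('x2','y1'): 0.6, ('x2','y2'): 0.9}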

Subsequences

Consider the Cartesian product of all sets in the family

X = {Xi | i ∈ Nn = {1, 2, ..., n}}.

For each sequence (n-tuple) x = (x1, ..., xn) ∈ ×_{i∈Nn} Xi and each sequence (r-tuple, r ≤ n) y = (y1, ..., yr) ∈ ×_{j∈J} Xj, where J ⊆ Nn and |J| = r, y is called a subsequence of x if and only if yj = xj for all j ∈ J. We write y ≺ x to denote that y is a subsequence of x.

Projection

Given a relation R(x1, ..., xn), let [R ↓ Y] denote the projection of R on Y. It disregards all sets in X except those in the family

Y = {Xj | j ∈ J ⊆ Nn}.

Then [R ↓ Y] is a fuzzy relation whose membership function is defined on the Cartesian product of the sets in Y by

[R ↓ Y](y) = max_{x ≻ y} R(x).

Under special circumstances, this projection can be generalized by replacing the max
operator by another t-conorm.

Cylindric Extension
Another operation on relations is called cylindric extension. Let X and Y denote the same families of sets as used for projection. Let R be a relation defined on the Cartesian product of the sets in family Y. Let [R ↑ X \ Y] denote the cylindric extension of R into the sets Xi (i ∈ Nn) which are in X but not in Y. It follows that for each x with x ≻ y

[R ↑ X \ Y](x) = R(y).

The cylindric extension:
• produces the largest fuzzy relation that is compatible with the projection,
• is the least specific of all relations compatible with the projection,
• guarantees that no information not included in the projection is used to determine the extended relation.

Example

Consider again the example for the projection. The membership functions of the cylindric extensions of all projections are obtained under the assumption that their arguments are extended to (x1, x2, x3), e.g.

[R23 ↑ {X1}](0,0,2) = [R23 ↑ {X1}](1,0,2) = R23(0,2) = 0.2.

In this example none of the cylindric extensions is equal to the original fuzzy relation, although each is compatible with the respective projection. Some information was lost when the given relation was replaced by any one of its projections.



Cylindric Closure
Relations that can be reconstructed from one of their projections by cylindric extension exist, but they are rather rare. It is more common that a relation can be exactly reconstructed from several of its projections, by taking the set intersection of their cylindric extensions. The resulting relation is usually called the cylindric closure. Let the set of projections {Pi | i ∈ I} of a relation on X be given. Then the cylindric closure cyl{Pi} is defined for each x ∈ X as

cyl{Pi}(x) = min_{i∈I} [Pi ↑ X \ Yi](x),

where Yi denotes the family of sets on which Pi is defined.


Motivation and Domain

Binary relations are significant among n-dimensional relations; they are, in some sense, generalized mathematical functions. In contrast to functions from X to Y, binary relations R(X,Y) may assign to each element of X two or more elements of Y. Some basic operations on functions, e.g. inverse and composition, are applicable to binary relations as well. Given a fuzzy relation R(X,Y), its domain domR is the fuzzy set on X whose membership function is defined for each x ∈ X as

domR(x) = max_{y∈Y} R(x,y),

i.e. each element of X belongs to the domain of R to a degree equal to the strength of its strongest relation to any y ∈ Y.

Range and Height

The range ranR of R(X,Y) is a fuzzy relation on Y whose membership function is defined for each y ∈ Y as

ranR(y) = max_{x∈X} R(x,y),

i.e. the strength of the strongest relation which each y ∈ Y has to an x ∈ X equals the degree of membership of y in the range of R. The height h of R(X,Y) is a number defined by

h(R) = max_{y∈Y} max_{x∈X} R(x,y).

h(R) is the largest membership grade attained by any pair (x,y) in R.

Representation and Inverse

Consider e.g. the membership matrix R = [r_xy] with r_xy = R(x,y). The inverse R−1(Y,X) of R(X,Y) is a relation on Y × X defined by

R−1(y,x) = R(x,y), ∀x ∈ X, ∀y ∈ Y.

The membership matrix R−1 = [r_yx] representing R−1(Y,X) is the transpose of the matrix R representing R(X,Y), and

(R−1)−1 = R, ∀R.

Standard Composition

Consider the binary relations P(X,Y) and Q(Y,Z) with the common set Y. The standard composition of P and Q is defined as

(x,z) ∈ P ◦ Q ⇐⇒ ∃y ∈ Y : {(x,y) ∈ P ∧ (y,z) ∈ Q}.

In the fuzzy case this is generalized by

[P ◦ Q](x,z) = sup_{y∈Y} min{P(x,y), Q(y,z)}, ∀x ∈ X, ∀z ∈ Z.

If Y is finite, the sup operator is replaced by max. The standard composition is then also called the max-min composition.

Inverse of Standard Composition

The inverse of the max-min composition follows from its definition:

[P(X,Y) ◦ Q(Y,Z)]−1 = Q−1(Z,Y) ◦ P−1(Y,X).

Its associativity also comes directly from its definition:

[P(X,Y) ◦ Q(Y,Z)] ◦ R(Z,W) = P(X,Y) ◦ [Q(Y,Z) ◦ R(Z,W)].

Note that the standard composition is not commutative. In matrix notation: [r_ij] = [p_ik] ◦ [q_kj] with r_ij = max_k min(p_ik, q_kj).



For instance:

r11 = max{min(p11,q11), min(p12,q21), min(p13,q31)} = max{min(.3,.9), min(.5,.3), min(.8,1)} = .8
r32 = max{min(p31,q12), min(p32,q22), min(p33,q32)} = max{min(.4,.5), min(.6,.2), min(.5,0)} = .4
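The same computation in Python. The entries of P and Q that the text does not give (in particular the second row of P) are assumed for completeness.

    # Max-min composition of membership matrices, as defined above.
    P = [[0.3, 0.5, 0.8],
         [0.0, 0.7, 1.0],   # this row is an assumed placeholder
         [0.4, 0.6, 0.5]]
    Q = [[0.9, 0.5],
         [0.3, 0.2],
         [1.0, 0.0]]

    def max_min(P, Q):
        # r_ij = max_k min(p_ik, q_kj)
        return [[max(min(P[i][k], Q[k][j]) for k in range(len(Q)))
                 for j in range(len(Q[0]))]
                for i in range(len(P))]

    R = max_min(P, Q)
    print(R[0][0])  # r11 = 0.8, as computed in the text
    print(R[2][1])  # r32 = 0.4, as computed in the text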

Example: Types of Airplanes (Speed, Height, Type)

Consider the following fuzzy relations for airplanes: • relation A between maximal speed and
maximal height, • relation B between maximal height and the type.

Relational Join
A similar operation on two binary relations is the relational join. It yields triples (whereas composition returned pairs). For P(X,Y) and Q(Y,Z), the relational join P ∗ Q is defined by

[P ∗ Q](x,y,z) = min{P(x,y), Q(y,z)}, ∀x ∈ X, ∀y ∈ Y, ∀z ∈ Z.

Then the max-min composition is obtained by aggregating the join by the maximum:

[P ◦ Q](x,z) = max_{y∈Y} [P ∗ Q](x,y,z), ∀x ∈ X, ∀z ∈ Z.

Example

To convert the join S = P ∗ Q of the relations P and Q into its corresponding composition R = P ◦ Q, the values S(x,y,z) sharing the same pair (x,z) are aggregated using max. For instance,

R(1,β) = max{S(1,a,β), S(1,b,β)} = max{.7, .5} = .7

Binary Relations on a Single Set


It is also possible to define crisp or fuzzy binary relations among the elements of a single set X.
Such a binary relation is denoted by R(X,X) or R(X²) and is a subset of X × X = X². These
relations are often represented as directed graphs:
• Each element of X is represented as a node.
• Directed connections between nodes indicate the pairs of elements of X for which the grade of
membership is nonzero.
• Each connection is labeled with the membership grade of the corresponding pair in R.

Example

An example of R(X,X) defined on X = {1,2,3,4}. Two different representations are shown below.



Properties of Crisp Relations
A crisp relation R(X,X) is called
• reflexive if and only if ∀x ∈ X : (x,x) ∈ R,
• symmetric if and only if ∀x,y ∈ X : (x,y) ∈ R ↔ (y,x) ∈ R,
• transitive if and only if (x,z) ∈ R whenever both (x,y) ∈ R and (y,z) ∈ R for at least one y ∈ X.

Properties of Fuzzy Relations


These properties can be extended to fuzzy relations by defining them in terms of the membership
function of the relation. A fuzzy relation R(X,X) is called
• reflexive if and only if ∀x ∈ X : R(x,x) = 1,
• symmetric if and only if ∀x,y ∈ X : R(x,y) = R(y,x),
• transitive if it satisfies
R(x,z) ≥ max_{y∈X} min{R(x,y), R(y,z)}, ∀(x,z) ∈ X².

Note that a fuzzy binary relation that is reflexive, symmetric and transitive is called a fuzzy
equivalence relation.
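These three conditions translate directly into checks on a membership matrix. The following is a minimal Python sketch; the matrix R below and the helper names are illustrative choices of ours, not taken from the text:

def is_reflexive(R):
    # R(x,x) = 1 for every x
    return all(R[x][x] == 1 for x in range(len(R)))

def is_symmetric(R):
    # R(x,y) = R(y,x) for every pair
    n = len(R)
    return all(R[x][y] == R[y][x] for x in range(n) for y in range(n))

def is_maxmin_transitive(R):
    # R(x,z) >= max_y min(R(x,y), R(y,z)) for every (x,z)
    n = len(R)
    return all(R[x][z] >= max(min(R[x][y], R[y][z]) for y in range(n))
               for x in range(n) for z in range(n))

# Illustrative relation: reflexive and symmetric, but not max-min transitive,
# because R(0,2) = 0 < min(R(0,1), R(1,2)) = 0.6.
R = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.6],
     [0.0, 0.6, 1.0]]
print(is_reflexive(R), is_symmetric(R), is_maxmin_transitive(R))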

· Representation of a relation
An n-ary relation R(X1, ..., Xn) can be represented by an n-dimensional membership array
(r_{i1,i2,...,in}), where
r_{i1,i2,...,in} = 1 iff (x1, ..., xn) ∈ R, and 0 otherwise.

○ Example 3.1 :
 
X ={English , French} , Y ={ dollar , pound , franc , mark}

Z = {US, France, Canada, Britain, Germany}
 
R(X,Y,Z) = {(English, dollar, US), (French, franc, France),
(English, dollar, Canada), (French, dollar, Canada),
(English, pound, Britain)}

Membership arrays (one 4×5 array for each element of X; columns Z1=US, Z2=France, Z3=Canada, Z4=Britain, Z5=Germany):

X1 = English:
Y1 Dollar  1 0 1 0 0
Y2 Pound   0 0 0 1 0
Y3 Franc   0 0 0 0 0
Y4 Mark    0 0 0 0 0

X2 = French:
Y1 Dollar  0 0 1 0 0
Y2 Pound   0 0 0 0 0
Y3 Franc   0 1 0 0 0
Y4 Mark    0 0 0 0 0



· Fuzzy Relations
A fuzzy relation on the Cartesian product X1 × X2 × ... × Xn assigns to each tuple
(x1, x2, ..., xn) a membership grade
0 ≤ R(x1, x2, ..., xn) ≤ 1.

Example 3.2: A binary relation R that represents the concept "very far":



X = {New York, Paris}
Y = {Beijing, New York, London}

Relation in list notation:
R(X,Y) = 1/(NY, Beijing) + 0/(NY, NY) + 0.6/(NY, London) + 0.9/(Paris, Beijing) + 0.7/(Paris, NY) + 0.3/(Paris, London)

Relation in membership array:
          NY    Paris
Beijing   1     0.9
NY        0     0.7
London    0.6   0.3



· An ordinary fuzzy relation has the valuation set [0,1]; an L-fuzzy relation has an ordered
valuation set L.

3.2 Projection and Cylindric Extensions



· Consider the set family X = {X_i | i ∈ N_n}, and let
x = (x_i | i ∈ N_n) ∈ ×_{i∈N_n} X_i,
y = (y_j | j ∈ J) ∈ ×_{j∈J} X_j,
where J ⊆ N_n and |J| = r. Then y is called a subsequence of x (written y ≺ x) iff y_j = x_j for all j ∈ J.

⊙ Projection: [R ↓ Y], the projection of R on Y.
Given a relation R(X1, X2, ..., Xn) and Y = {X_j | j ∈ J ⊆ N_n}, the projection [R ↓ Y] is the fuzzy relation (set)
[R ↓ Y](y) = max_{x : y ≺ x} R(x).

※max can be generalized by other t-conorms

· Example 3.3: Let X1 = {X, Y}, X2 = {a, b}, X3 = {∗, $}, and

R(X1, X2, X3) = 0.9/(X,a,∗) + 0.4/(X,b,∗) + 1/(Y,a,∗) + 0.7/(Y,a,$) + 0.8/(Y,b,$)

Let R_{i,j} = [R ↓ {X_i, X_j}] and R_i = [R ↓ {X_i}]. Then

R_{1,2} = 0.9/(X,a) + 0.4/(X,b) + 1/(Y,a) + 0.8/(Y,b)
R_{1,3} = 0.9/(X,∗) + 0/(X,$) + 1/(Y,∗) + 0.8/(Y,$)
R_{2,3} = 1/(a,∗) + 0.7/(a,$) + 0.4/(b,∗) + 0.8/(b,$)
R_1 = 0.9/X + 1/Y
R_2 = 1/a + 0.8/b
R_3 = 1/∗ + 0.8/$

⊙ Cylindric Extension: [R ↑ X−Y], the cylindric extension of R into X−Y, where X−Y denotes the
sets X_i that are in X but not in Y. For a relation R defined on Y,

[R ↑ X−Y](x) = R(y) for each x such that y ≺ x.

· Example 3.4 (refer to Example 3.3):
Let X = {X1, X2, X3} and R = R_{1,2}, so Y = {X1, X2} and X−Y = {X3} = {∗, $}. From Example 3.3,

R_{1,2} = 0.9/(X,a) + 0.4/(X,b) + 1/(Y,a) + 0.8/(Y,b)

∴ [R ↑ X−Y] = [R_{1,2} ↑ {X3}] =
0.9/(X,a,∗) + 0.9/(X,a,$) + 0.4/(X,b,∗) + 0.4/(X,b,$) + 1/(Y,a,∗) + 1/(Y,a,$) + 0.8/(Y,b,∗) + 0.8/(Y,b,$)

The six cylindric extensions of the projections in Example 3.3 are
[R_{1,2} ↑ {X3}], [R_{1,3} ↑ {X2}], [R_{2,3} ↑ {X1}], [R_1 ↑ {X2,X3}], [R_2 ↑ {X1,X3}], [R_3 ↑ {X1,X2}].

Consider [R ↑ X−Y] = [R_2 ↑ {X1, X3}]: here Y = {X2} and X−Y = {X1, X3}, whose tuples are
{(X,∗), (X,$), (Y,∗), (Y,$)}. With

R = R_2 = 1/a + 0.8/b,

∴ [R_2 ↑ {X1, X3}] =
1/(X,a,∗) + 1/(X,a,$) + 1/(Y,a,∗) + 1/(Y,a,$) + 0.8/(X,b,∗) + 0.8/(X,b,$) + 0.8/(Y,b,∗) + 0.8/(Y,b,$)

Cylindric Closure
A relation may sometimes be reconstructed from several of its projections by taking the set
intersection of their cylindric extensions. Let {P_i | i ∈ I} be a set of projections of a relation
on X, and let Y_i be the family of sets on which P_i is defined. The cylindric closure is

cyl{P_i}(x) = min_{i∈I} [P_i ↑ X−Y_i](x) ⊇ R(x).

‧ Example:
cyl{R_{1,2}, R_{1,3}, R_{2,3}} = 0.9/(X,a,∗) + 0.4/(X,b,∗) + 1/(Y,a,∗) + 0.7/(Y,a,$) + 0.4/(Y,b,∗) + 0.8/(Y,b,$)

Referring to the original relation R(X1, X2, X3) in Example 3.3, R is not fully reconstructable
from these projections: the cylindric closure contains the extra element (Y,b,∗) with grade 0.4.
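Continuing the dictionary sketch from Example 3.3, the extension and closure can be computed as below (helper names and layout are our own; the last line recovers the extra element (Y,b,∗)):

from itertools import product

DOMAINS = [('X', 'Y'), ('a', 'b'), ('*', '$')]   # X1, X2, X3 of Example 3.3

R = {('X','a','*'): 0.9, ('X','b','*'): 0.4, ('Y','a','*'): 1.0,
     ('Y','a','$'): 0.7, ('Y','b','$'): 0.8}

def project(R, keep):
    out = {}
    for tup, grade in R.items():
        key = tuple(tup[i] for i in keep)
        out[key] = max(out.get(key, 0.0), grade)
    return out

def extension(P, keep):
    # [P up X-Y](x) = P(y): copy the grade to every tuple agreeing on `keep`
    return {tup: P.get(tuple(tup[i] for i in keep), 0.0)
            for tup in product(*DOMAINS)}

def closure(keeps):
    # component-wise min of the cylindric extensions of the projections
    exts = [extension(project(R, keep), keep) for keep in keeps]
    return {tup: min(e[tup] for e in exts) for tup in product(*DOMAINS)}

cyl = closure([(0, 1), (0, 2), (1, 2)])
print({t: g for t, g in cyl.items() if g > 0})
print({t: g for t, g in cyl.items() if g > R.get(t, 0.0)})   # the extra element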

3.3 Binary Relations R(X,Y)

If X ≠ Y, R(X,Y) can be drawn as a bipartite graph; if X = Y, as a directed graph.

‧ Representations:
i. matrices R = [r_ij], where r_ij = R(x_i, y_j)
ii. sagittal diagrams

Examples:

i) membership matrix:
      y1   y2   y3   y4   y5
x1    .9   1    0    0    0
x2    0    .4   0    0    0
x3    0    0    1    .2   0
x4    0    0    0    0    .4
x5    0    0    0    0    .5
x6    0    0    0    0    .2

ii) sagittal diagram (figure not shown)

‧ Domain: dom R
Crisp: dom R = {x ∈ X | (x,y) ∈ R for some y ∈ Y}
Fuzzy: dom R(x) = max_{y∈Y} R(x,y)

The domain of a fuzzy relation R(X,Y) is a fuzzy set on X; dom R(x) is its membership function.

e.g. dom R(x1) = max(0.9, 1) = 1

‧ Range: ran R
Crisp: ran R = {y ∈ Y | (x,y) ∈ R for some x ∈ X}
Fuzzy: ran R(y) = max_{x∈X} R(x,y)

e.g. ran R(y5) = max(0.4, 0.5, 0.2) = 0.5

‧ Height: h(R) = max_{y∈Y} max_{x∈X} R(x,y)

e.g. h(R) = 1, so R is a normal fuzzy relation.

‧Inverse: R 1 (Y , X )

R 1 ( y, x)  R( x, y)

 R 1  RT , ( R 1 ) 1  R

0.3 0.2

e.g. R  0 1 

0.6 0.4

0.3 0 0.6
R 1  RT   
0.2 1 0.4

‧ Composition: R(X,Z) = P(X,Y) ◦ Q(Y,Z)
R(x,z) = [P ◦ Q](x,z) = max_{y∈Y} min[P(x,y), Q(y,z)]   (max-min composition)

Properties:
P ◦ Q ≠ Q ◦ P
(P ◦ Q)−1 = Q−1 ◦ P−1
(P ◦ Q) ◦ R = P ◦ (Q ◦ R)

Matrix form: [r_ij] = [p_ik] ◦ [q_kj], where r_ij = max_k min(p_ik, q_kj)


R(x,z) = [P ◦ Q](x,z) = max_{y∈Y} [P(x,y) · Q(y,z)]   (max-product composition)

Matrix form: [r_ij] = [p_ik] ◦ [q_kj], where r_ij = max_k (p_ik · q_kj)

‧ Example
P =
0.3 0.5 0.8
0.0 0.7 1.0
0.4 0.6 0.5

Q =
0.9 0.5 0.7 0.7
0.3 0.2 0.0 0.9
1.0 0.0 0.5 0.5

Max-min composition:
0.8 0.3 0.5 0.5
1.0 0.2 0.5 0.7
0.5 0.4 0.5 0.6

Max-product composition:
0.8  0.15 0.4  0.45
1.0  0.14 0.5  0.63
0.5  0.2  0.28 0.54

‧ Relational join: R(X,Y,Z) = P(X,Y) ∗ Q(Y,Z)
R(x,y,z) = [P ∗ Q](x,y,z) = min[P(x,y), Q(y,z)]

※ The max-min composition can be obtained by aggregating appropriate elements of the
corresponding join.

‧ Example
※ [P ◦ Q](x,z) = max_{y∈Y} [P ∗ Q](x,y,z)

3.4 Binary Relations on a Single Set

‧Representations

◎ Characteristic Properties (crisp case)
i. reflexive: ∀x, (x,x) ∈ R
irreflexive: (x,x) ∉ R for some x
antireflexive: ∀x, (x,x) ∉ R
ii. symmetric: (x,y) ∈ R ⟹ (y,x) ∈ R
asymmetric: (x,y) ∈ R ⟹ (y,x) ∉ R
antisymmetric: (x,y) ∈ R and (y,x) ∈ R ⟹ x = y
strictly antisymmetric: x ≠ y ⟹ (x,y) ∉ R or (y,x) ∉ R
iii. transitive: (x,y) ∈ R and (y,z) ∈ R ⟹ (x,z) ∈ R
nontransitive: the above implication fails for at least one triple
antitransitive: (x,y) ∈ R and (y,z) ∈ R ⟹ (x,z) ∉ R

◎ Fuzzy Relations
i. reflexive: ∀x, R(x,x) = 1
irreflexive: R(x,x) ≠ 1 for some x
antireflexive: ∀x, R(x,x) ≠ 1
ε-reflexive: ∀x, R(x,x) ≥ ε

ii. symmetric: ∀x,y, R(x,y) = R(y,x)
asymmetric: R(x,y) ≠ R(y,x) for some x,y
antisymmetric: for all x ≠ y, R(x,y) > 0 implies R(y,x) = 0
iii. max-min transitive: ∀x,z, R(x,z) ≥ max_{y∈X} min[R(x,y), R(y,z)]
max-product transitive: ∀x,z, R(x,z) ≥ max_{y∈X} [R(x,y) · R(y,z)]
nontransitive: R(x,z) < max_{y∈X} min[R(x,y), R(y,z)] for some x,z
antitransitive: ∀x,z, R(x,z) < max_{y∈X} min[R(x,y), R(y,z)]

◎ Example 3.7: R = "very near" is reflexive, symmetric, and nontransitive.

◎ Summary
Figure 3.6 Some important types of binary relation R(X,X):
• reflexive + symmetric + transitive: crisp — equivalence; fuzzy — similarity
• symmetric + transitive: quasi-equivalence
• reflexive + symmetric: compatibility or tolerance
• reflexive + antisymmetric + transitive: partial ordering
• reflexive + transitive: preordering or quasi-ordering
• antireflexive + antisymmetric + transitive: strict ordering

◎ Transitive closure RT(X,X)

Algorithm for computing RT:
1. R′ = R ∪ (R ◦ R)
2. If R′ ≠ R, let R = R′ and go to step 1.
3. Stop: RT = R′.

Here ∪ is the component-wise max.

◎ Example 3.8

R =
0.7 0.5 0.0 0.0
0.0 0.0 0.0 1.0
0.0 0.4 0.0 0.0
0.0 0.0 0.8 0.0

Step 1:
R ◦ R =
0.7 0.5 0.0 0.5
0.0 0.0 0.8 0.0
0.0 0.0 0.0 0.4
0.0 0.4 0.0 0.0

R′ = R ∪ (R ◦ R) =
0.7 0.5 0.0 0.5
0.0 0.0 0.8 1.0
0.0 0.4 0.0 0.4
0.0 0.4 0.8 0.0

Step 2: R′ ≠ R, so let R = R′ and repeat step 1:
R ◦ R =
0.7 0.5 0.5 0.5
0.0 0.4 0.8 0.4
0.0 0.4 0.4 0.4
0.0 0.4 0.4 0.4

R′ = R ∪ (R ◦ R) =
0.7 0.5 0.5 0.5
0.0 0.4 0.8 1.0
0.0 0.4 0.4 0.4
0.0 0.4 0.8 0.4

Step 3: R′ ≠ R, so let R = R′ and repeat step 1. This time
R′ = R =
0.7 0.5 0.5 0.5
0.0 0.4 0.8 1.0
0.0 0.4 0.4 0.4
0.0 0.4 0.8 0.4

Step 4: Stop.

RT = R′
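The closure algorithm translates almost line for line into code. The sketch below (our own helper names) reproduces RT of Example 3.8:

def max_min(P, Q):
    n = len(P)
    return [[max(min(P[i][k], Q[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

def transitive_closure(R):
    # Repeat R' = R ∪ (R ◦ R) (component-wise max) until nothing changes.
    while True:
        RR = max_min(R, R)
        R_new = [[max(R[i][j], RR[i][j]) for j in range(len(R))]
                 for i in range(len(R))]
        if R_new == R:
            return R_new
        R = R_new

R = [[0.7, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.4, 0.0, 0.0],
     [0.0, 0.0, 0.8, 0.0]]
for row in transitive_closure(R):
    print(row)   # reproduces RT from Example 3.8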


3.5 Fuzzy Equivalence Relations

◎ A crisp binary relation that is reflexive, symmetric, and transitive is an equivalence
relation. It induces equivalence classes and a partition X/R.

◎ Example 3.9:
X = {1, 2, ..., 10}
R(X,X) = {(x,y) | x and y have the same remainder when divided by 3}
R is reflexive, symmetric, and transitive, hence an equivalence relation, with partition
X/R = {{1,4,7,10}, {2,5,8}, {3,6,9}}.


◎ Fuzzy Binary Relations
。A fuzzy equivalence relation is also called a similarity relation; its classes are called
similarity classes.

Two interpretations of a similarity relation:
1. It groups similar elements into crisp classes whose members are similar to each other to some
specified degree.
2. It associates with each x ∈ X a fuzzy set A_x (the similarity class of x) defined on X.

。For a fuzzy relation R, each α-cut αR with α ∈ [0,1] is a crisp relation (Theorem 2.5,
Eqs. (2.1), (2.2)). If R is a similarity relation, then every αR is a crisp equivalence relation.

。Let π(αR) denote the partition of X with respect to αR, and
Θ(R) = {π(αR) | α ∈ Λ_R}.
These partitions are nested: π(αR) is a refinement of π(βR) iff α ≥ β.

Prove that if a fuzzy relation R : X × X → [0,1] is a similarity relation, then each α-cut αR is
an equivalence relation.

Pf: Since R is a similarity relation, R is
reflexive: ∀x ∈ X, R(x,x) = 1;
symmetric: ∀x,y ∈ X, R(x,y) = R(y,x);
transitive: ∀(x,z) ∈ X², R(x,z) ≥ max_{y∈X} min[R(x,y), R(y,z)].

i. αR is reflexive:
∀x ∈ X, R(x,x) = 1 ≥ α for every α ∈ [0,1], so (x,x) ∈ αR.

ii. αR is symmetric:
Since R is symmetric, let R(x,y) = R(y,x) = a. Then either a ≥ α or a < α:
a. if a ≥ α, both (x,y) and (y,x) belong to αR;
b. if a < α, neither belongs to αR.
In either case, (x,y) ∈ αR iff (y,x) ∈ αR.

iii. αR is transitive:
Let (x,y) ∈ αR and (y,z) ∈ αR, i.e., R(x,y) = a1 ≥ α and R(y,z) = a2 ≥ α. By the transitivity of R,
R(x,z) ≥ max_{y∈X} min[R(x,y), R(y,z)] ≥ min(a1, a2) ≥ α,
so (x,z) ∈ αR. Hence αR is an equivalence relation.

Example 3.10: Let R(X,X) be a fuzzy relation on X = {a, b, c, d, e, f, g} that is reflexive,
symmetric, and transitive (R′ = R ∪ (R ◦ R) = R).

Its level set is Λ_R = {0.0, 0.4, 0.5, 0.8, 0.9, 1.0}, so there are five nested partitions π(αR)
for α > 0.

The similarity class of each element is the fuzzy set defined by the row of the membership matrix
corresponding to that element. For instance:

for c: 0/a + 0/b + 1/c + 0/d + 1/e + 0.9/f + 0.5/g
for e: 0/a + 0/b + 1/c + 0/d + 1/e + 0.9/f + 0.5/g

∴ c and e are similar at every level α.
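The nested partitions can be computed directly from the α-cuts. Since the membership matrix of Example 3.10 appears only in a figure that is not reproduced here, the sketch below uses a small hypothetical similarity matrix of our own:

import numpy as np

# Hypothetical 4-element similarity matrix (reflexive, symmetric, max-min transitive).
R = np.array([[1.0, 0.8, 0.0, 0.0],
              [0.8, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.9],
              [0.0, 0.0, 0.9, 1.0]])

def alpha_cut(R, alpha):
    return (R >= alpha).astype(int)

def partition(cut):
    # Equivalence classes of a crisp equivalence relation given as a 0/1 matrix.
    n = len(cut)
    seen, classes = set(), []
    for x in range(n):
        if x not in seen:
            cls = {y for y in range(n) if cut[x][y]}
            classes.append(sorted(cls))
            seen |= cls
    return classes

for alpha in (0.4, 0.8, 0.9, 1.0):
    print(alpha, partition(alpha_cut(R, alpha)))
# The partitions are nested: a higher alpha refines a lower alpha.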

3.6 Compatibility Relations

A compatibility relation is reflexive and symmetric. Alternative names: tolerance relation,
proximity relation.

‧ Crisp case:
Maximal compatibility classes: compatibility classes not properly contained within any other
compatibility class.
Complete cover: the set of all maximal compatibility classes.

‧ Fuzzy case:
α-compatibility class: a subset A of X such that ∀x,y ∈ A : R(x,y) ≥ α, where R is a fuzzy
compatibility relation.
Correspondingly one defines maximal α-compatibility classes and the complete α-cover.

Example 3.11: Let R(X,X) be a fuzzy relation on X = {1, ..., 9} that is reflexive and symmetric,
hence a compatibility relation. Since Λ_R = {0.0, 0.4, 0.5, 0.7, 0.8, 1.0}, there is a complete
α-cover for each level α.

e.g. α = 0.4:

1 1 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 

0 0 1 1 1 0 0 0 0
 
0 0 1 1 1 1 1 0 0
0.4
R  0 0 1 1 1 1 1 1 0
 
0 0 0 1 1 1 1 0 0
0 0 0 1 1 1 1 0 0
 
0 0 0 0 1 0 0 1 0
0 1 
 0 0 0 0 0 0 0

Look for complete subgraphs (cliques) of the graph of 0.4R:

(1,2), (3,4,5), (4,5,6,7), (5,8), (9)

together with their subsets, e.g. (3,4), (4,5,6), (4,5,7), (3,5), (5,6), (4,5), (5,6,7), (4,6,7),
(4,6), (6,7).

⟹ maximal compatibility classes (the complete 0.4-cover):

(1,2), (3,4,5), (4,5,6,7), (5,8), (9)

These do not partition X.

e.g. α = 0.5:

1 1 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 

0 0 1 1 1 0 0 0 0
 
0 0 1 1 1 1 1 0 0
0.5
R  0 0 1 1 1 1 1 1 0
 
0 0 0 1 1 1 0 0 0
0 0 0 1 1 0 1 0 0
 
0 0 0 0 1 0 0 1 0
0 1 
 0 0 0 0 0 0 0

 maximal compatible classes


( The complete 0.5-cover)

(1,2),(3,4,5),(4,5,6),(4,5,7),(5,8),(9)


3.7 Ordering Relations

˙ A partial ordering is reflexive, antisymmetric, and transitive.
x ≤ y: x precedes y; x is a predecessor of y, and y a successor of x.

First member (minimum): x such that x ≤ y for every y ∈ X (unique if it exists).
Last member (maximum): x such that y ≤ x for every y ∈ X (unique if it exists).
Minimal member: x such that y ≤ x implies x = y (may not be unique).
Maximal member: x such that x ≤ y implies x = y (may not be unique).

˙ Properties:
1. There is at most one first member and at most one last member.
2. There may be several maximal and minimal members.
3. If a first member x exists, then only one minimal member y exists, and x = y.
4. If a last member x exists, then only one maximal member y exists, and x = y.
5. The first member of a partial ordering is the last member of the inverse partial ordering, and
the last member is the first member of the inverse partial ordering.


※ In a partial ordering, it is not guaranteed that for every pair (x,y) either x ≤ y or y ≤ x.
If one of these holds, x and y are called comparable; otherwise they are noncomparable.

˙ For A ⊆ X:
If x ∈ X satisfies x ≤ y for every y ∈ A, then x is a lower bound of A on X.
If x ∈ X satisfies y ≤ x for every y ∈ A, then x is an upper bound of A on X.

˙ The greatest lower bound (or infimum), GLB, is a lower bound that succeeds every other lower
bound. The least upper bound (or supremum), LUB, is an upper bound that precedes every other
upper bound.

˙ Lattice: a partial ordering on X in which every pair S = {x1, x2} ⊆ X has a GLB and a LUB.

˙ Connected: a partial ordering is said to be connected iff ∀x,y ∈ X, x ≠ y implies x < y or y < x.

˙ Linear ordering (total ordering, simple ordering, complete ordering): a connected partial
ordering; every pair (x,y) is comparable.

˙ Hasse diagrams: diagrams representing partial orderings in which an upward connection
indicates ≤.

˙ Example 3.12: Crisp partial orderings (figure not shown).

˙ A fuzzy partial ordering is reflexive, antisymmetric, and transitive under some form of
transitivity.

※ Any fuzzy partial ordering can be resolved into a series of crisp partial orderings, i.e., by
taking a series of α-cuts that produce increasing levels of refinement.

˙ In a fuzzy partial ordering R, two fuzzy sets are associated with each x ∈ X:

the dominating class R≥[x], with R≥[x](y) = R(x,y), and
the dominated class R≤[x], with R≤[x](y) = R(y,x).


˙ x is undominated iff R(x,y) = 0 for all y ≠ x;
x is undominating iff R(y,x) = 0 for all y ≠ x.

˙ The fuzzy upper bound for A ⊆ X is the fuzzy set
U(R,A) = ∩_{x∈A} R≥[x].

※ If a least upper bound of A exists, it is the unique element x such that
1. U(R,A)(x) > 0, and
2. R(x,y) > 0 for every y ∈ support[U(R,A)].

˙ Example 3.13
Fuzzy partial ordering R:
      a     b     c     d     e
a     1     0.7   0     1     0.7
b     0     1     0     0.9   0
c     0.5   0.7   1     1     0.8
d     0     0     0     1     0
e     0     0.1   0     0.9   1

1. Each row gives the dominating class of the corresponding element; each column gives its
dominated class.
2. d is undominated; c is undominating.
3. For A = {a,b}, U(R,A) = the intersection of the dominating classes of a and b
= 0.7/b + 0.9/d.
4. LUB(A) = b.


5. The crisp orderings captured by the fuzzy ordering, e.g. α = 0.5:

0.5R =
1 1 0 1 1
0 1 0 1 0
1 1 1 1 1
0 0 0 1 0
0 0 0 1 1

number of 1s per column → 2 3 1 5 3

※ The ordering becomes weaker with increasing α.


Fuzzy preordering: reflexive and transitive.

Fuzzy weak ordering:
i. an ordering satisfying the properties of a fuzzy linear ordering except antisymmetry;
ii. equivalently, a fuzzy preordering in which x ≠ y implies R(x,y) > 0 or R(y,x) > 0.

Fuzzy strict ordering: antireflexive, antisymmetric, and transitive.

3.8 Morphisms

‧ A crisp homomorphism h from (X,R) to (Y,Q), where R(X,X) and Q(Y,Y) are binary relations,
satisfies
(x1,x2) ∈ R ⟹ (h(x1), h(x2)) ∈ Q.

‧ A fuzzy homomorphism h, for fuzzy binary relations R(X,X) and Q(Y,Y), satisfies
R(x1,x2) ≤ Q[h(x1), h(x2)].

※ It is possible that (h(x1), h(x2)) is in the support of Q while (x1,x2) is not in the support
of R.

※ If this is never the case, h is called a strong homomorphism.


‧ A crisp strong homomorphism h satisfies
(x1,x2) ∈ R ⟹ (h(x1), h(x2)) ∈ Q, and
(y1,y2) ∈ Q ⟹ (x1,x2) ∈ R for some x1 ∈ h−1(y1), x2 ∈ h−1(y2).

※ When h is many-to-one, h−1(y) is a set of elements of X.

‧ Fuzzy strong homomorphism h:
h imposes a partition πh on X. Let A = {a1, a2, ..., an} and B = {b1, b2, ..., bn} be blocks of
πh, and let R, Q be fuzzy relations. Then h is a strong homomorphism iff

max_{i,j} R(ai, bj) = Q(y1, y2),

where y1 = h(ai) for all ai ∈ A and y2 = h(bj) for all bj ∈ B.


‧ Example 3.14

R(X,X) on X = {a, b, c, d}:
R =
0   0.5 0   0
0   0   0.9 0
1   0   0   0.5
0   0.6 0   0

Q(Y,Y):
Q =
0.5 0.9 0
1   0   0.9
1   0.9 0

→ h is an ordinary fuzzy homomorphism (one way): R(x1,x2) ≤ Q(h(x1), h(x2)). It is not strong,
since Q(α,β) = 0.9 while R(d,c) = 0, i.e., (α,β) is in the support of Q while (d,c) is not in the
support of R, where h(d) = α and h(c) = β.


Another example:

R(X,X) =
0.8 0.4 0   0   0   0
0   0.5 0   0.7 0   0
0   0   0.3 0   0   0
0   0.5 0   0   0.9 0.5
0   0   0   1   0   0
0   0   0   0   1   0.8

Q(Y,Y) =
0.7 0   0.9
0.4 0.8 0
1   0   1

→ h is a strong fuzzy homomorphism (two way).

※ Q represents a simplification of R.

‧ Isomorphism (congruence): h is one-to-one and onto; X ↔ Y.

Endomorphism (subgraph): h : X → Y with Y ⊆ X.

Automorphism: both an isomorphism and an endomorphism, i.e., X = Y and R = Q.


3.9 Sup-i Compositions of Fuzzy Relations (generalize the max-min composition)

Let i be a t-norm (sup plays the role of the t-conorm max in the finite case), and let P(X,Y),
Q(Y,Z) be fuzzy relations. The sup-i composition P ∘ᵢ Q is the fuzzy relation on X × Z defined by

[P ∘ᵢ Q](x,z) = sup_{y∈Y} i[P(x,y), Q(y,z)].

‧ Properties
1. (P ∘ᵢ Q) ∘ᵢ R = P ∘ᵢ (Q ∘ᵢ R)
2. P ∘ᵢ (∪_j Q_j) = ∪_j (P ∘ᵢ Q_j)
3. P ∘ᵢ (∩_j Q_j) ⊆ ∩_j (P ∘ᵢ Q_j)
4. (∪_j P_j) ∘ᵢ Q = ∪_j (P_j ∘ᵢ Q)
5. (∩_j P_j) ∘ᵢ Q ⊆ ∩_j (P_j ∘ᵢ Q)
6. (P ∘ᵢ Q)−1 = Q−1 ∘ᵢ P−1

3. Show Eq. (3.16), i.e., P ∘ᵢ (∩_{j∈J} Q_j) ⊆ ∩_{j∈J} (P ∘ᵢ Q_j),
where P(X,Y) and Q_j(Y,Z) are fuzzy relations.

pf. From Eq. (3.13), [P ∘ᵢ Q](x,z) = sup_{y∈Y} i[P(x,y), Q(y,z)], so

[P ∘ᵢ (∩_{j∈J} Q_j)](x,z) = sup_{y∈Y} i[P(x,y), (∩_{j∈J} Q_j)(y,z)].

Let Q = ∩_{j∈J} Q_j. Then Q ⊆ Q_j for every j ∈ J, i.e., for all (y,z): Q(y,z) ≤ Q_j(y,z).
Since i is monotonically increasing,

i[P(x,y), Q(y,z)] ≤ i[P(x,y), Q_j(y,z)] for every j ∈ J and every (x,y), (y,z),

hence

sup_{y∈Y} i[P(x,y), Q(y,z)] ≤ sup_{y∈Y} i[P(x,y), Q_j(y,z)] for every j ∈ J,

so

[P ∘ᵢ (∩_{j∈J} Q_j)](x,z) ≤ inf_{j∈J} [P ∘ᵢ Q_j](x,z) = [∩_{j∈J} (P ∘ᵢ Q_j)](x,z), ∀(x,z),

i.e., P ∘ᵢ (∩_{j∈J} Q_j) ⊆ ∩_{j∈J} (P ∘ᵢ Q_j).

。The sup-i composition is monotonically increasing: if Q1 ⊆ Q2, then
P ∘ᵢ Q1 ⊆ P ∘ᵢ Q2    (5.20)
Q1 ∘ᵢ P ⊆ Q2 ∘ᵢ P    (5.21)

。Identity of ∘ᵢ:
E(x,y) = 1 if x = y, 0 if x ≠ y (the identity matrix), so that
E ∘ᵢ P = P ∘ᵢ E = P.

。A relation R on X² is i-transitive iff
R(x,z) ≥ i[R(x,y), R(y,z)], ∀x, y, z ∈ X,
equivalently R ∘ᵢ R ⊆ R.

。The i-transitive closure RT(i) is the smallest i-transitive relation containing R.

。Theorem 3.1: For any fuzzy relation R,
RT(i) = ∪_{n=1}^∞ R^(n), where R^(n) = R ∘ᵢ R^(n−1) (and R^(1) = R).
proof:
i. By (5.15) and (5.17),
RT(i) ∘ᵢ RT(i) = (∪_{n≥1} R^(n)) ∘ᵢ (∪_{m≥1} R^(m)) = ∪_{n,m≥1} (R^(n) ∘ᵢ R^(m)) = ∪_{n,m≥1} R^(n+m)
= ∪_{k≥2} R^(k) ⊆ ∪_{k≥1} R^(k) = RT(i),
i.e., RT(i) is i-transitive (RT(i) ∘ᵢ RT(i) ⊆ RT(i)).

ii. Let S be i-transitive with R ⊆ S. Since the sup-i composition is monotonically increasing
((5.20), (5.21)),
R^(2) = R ∘ᵢ R ⊆ S ∘ᵢ S ⊆ S.
If R^(n) ⊆ S, then by the i-transitivity of S and mathematical induction,
R^(n+1) = R ∘ᵢ R^(n) ⊆ S ∘ᵢ S ⊆ S,
so R^(k) ⊆ S for all k, and hence
RT(i) = ∪_{k≥1} R^(k) ⊆ S,
i.e., RT(i) is the smallest i-transitive relation containing R.

。Theorem 3.2: If R is a reflexive fuzzy relation on X², |X| = n, then
RT(i) = R^(n−1), and R^(m) = R^(n−1) for all m ≥ n − 1.
proof:
i. R is reflexive, so E ⊆ R and hence
R = E ∘ᵢ R ⊆ R ∘ᵢ R = R^(2);
by repetition, R^(n) ⊆ R^(n+1) for every n.   (A)

ii. Show R^(n) ⊆ R^(n−1). If x = y, then R^(n−1)(x,x) = 1 by reflexivity. If x ≠ y, by the
extended definition of the composition,
R^(n)(x,y) = sup_{z1,...,z_{n−1}} i[R(x,z1), R(z1,z2), ..., R(z_{n−1},y)].
Since |X| = n, each sequence x = z0, z1, ..., z_{n−1}, z_n = y of n+1 elements contains at least
two identical elements, say z_r = z_s (r < s). Deleting the segment between z_r and z_s yields a
chain of some length k ≤ n − 1 whose value is not smaller, since a t-norm satisfies i(a,b) ≤ a
and removing factors therefore cannot decrease the value. Hence
i[R(x,z1), ..., R(z_{n−1},y)] ≤ R^(k)(x,y) ≤ R^(n−1)(x,y)   (k ≤ n − 1)
for every chain, where the last step uses (A). Taking the supremum,
R^(n)(x,y) ≤ R^(n−1)(x,y), ∀x,y ∈ X, i.e., R^(n) ⊆ R^(n−1).   (B)

By (A) and (B), R^(n) = R^(n−1), and consequently
RT(i) = R^(n−1).
3.10 Inf-ωᵢ Compositions of Fuzzy Relations

。The ωᵢ operation:
ωᵢ(a,b) = sup{x ∈ [0,1] | i(a,x) ≤ b},
where a, b ∈ [0,1] and i is a continuous t-norm.

※ If i is the logical conjunction (∧, and), then ωᵢ is the logical implication (⟹, if-then).

。Theorem 3.3
1. i(a,b) ≤ d iff ωᵢ(a,d) ≥ b
2. ωᵢ(ωᵢ(a,b), b) ≥ a
3. ωᵢ(i(a,b), d) = ωᵢ(a, ωᵢ(b,d))
4. a ≤ b implies (i) ωᵢ(a,d) ≥ ωᵢ(b,d) and (ii) ωᵢ(d,a) ≤ ωᵢ(d,b)
5. i(ωᵢ(a,b), ωᵢ(b,d)) ≤ ωᵢ(a,d)
6. ωᵢ(inf_j a_j, b) ≥ sup_j ωᵢ(a_j, b)
7. ωᵢ(sup_j a_j, b) = inf_j ωᵢ(a_j, b)
8. ωᵢ(b, sup_j a_j) ≥ sup_j ωᵢ(b, a_j)
9. ωᵢ(b, inf_j a_j) = inf_j ωᵢ(b, a_j)
10. i(a, ωᵢ(a,b)) ≤ b
proof:
(1) i. If i(a,b) ≤ d, then b ∈ {x | i(a,x) ≤ d}, hence
b ≤ sup{x | i(a,x) ≤ d} = ωᵢ(a,d).
ii. If b ≤ ωᵢ(a,d), then since i is continuous and monotonically increasing,
i(a,b) ≤ i(a, ωᵢ(a,d)) = i(a, sup{x | i(a,x) ≤ d}) = sup{i(a,x) | i(a,x) ≤ d} ≤ d.

(3) By (1), i(a,x) ≤ ωᵢ(b,d) ⟺ i(b, i(a,x)) ≤ d; by associativity and commutativity,
i(b, i(a,x)) = i(i(a,b), x), so
i(a,x) ≤ ωᵢ(b,d) ⟺ i(i(a,b), x) ≤ d ⟺ x ≤ ωᵢ(i(a,b), d).   (A)
Hence
ωᵢ(a, ωᵢ(b,d)) = sup{x | i(a,x) ≤ ωᵢ(b,d)} = sup{x | x ≤ ωᵢ(i(a,b), d)} = ωᵢ(i(a,b), d).

(7) Let s = sup_j a_j.   (B)
Then a_j ≤ s for all j, and by (4),
ωᵢ(s,b) ≤ ωᵢ(a_j, b) for all j, so ωᵢ(s,b) ≤ inf_j ωᵢ(a_j, b).   (C)
Conversely, inf_j ωᵢ(a_j, b) ≤ ωᵢ(a_{j0}, b) for each j0 ∈ J, so by (1),
i(a_{j0}, inf_j ωᵢ(a_j, b)) ≤ b for each j0,
and by the continuity of i,
i(s, inf_j ωᵢ(a_j, b)) = sup_{j0} i(a_{j0}, inf_j ωᵢ(a_j, b)) ≤ b,
so by (1), ωᵢ(s,b) ≥ inf_j ωᵢ(a_j, b).   (D)
By (B), (C), (D): ωᵢ(sup_j a_j, b) = ωᵢ(s,b) = inf_j ωᵢ(a_j, b).
(2) Show ωᵢ(ωᵢ(a,b), b) ≥ a (Theorem 3.3 (2)).

proof: ωᵢ(a,b) = sup{x ∈ [0,1] | i(a,x) ≤ b}, and by Theorem 3.10,
imin(a,b) ≤ i(a,b) ≤ min(a,b).
In both cases below we show i(ωᵢ(a,b), a) ≤ b; property 1 (i(a,b) ≤ d iff ωᵢ(a,d) ≥ b) then
yields ωᵢ(ωᵢ(a,b), b) ≥ a.

i. If a ≤ b: then i(a,x) ≤ min(a,x) ≤ a ≤ b for every x, so
ωᵢ(a,b) = sup{x ∈ [0,1] | i(a,x) ≤ b} = 1,
and by axioms i1, i2:
i(ωᵢ(a,b), a) = i(1,a) = a ≤ b.

ii. If a > b: by commutativity (axiom i2) and the continuity of i,
i(ωᵢ(a,b), a) = i(a, ωᵢ(a,b)) = i(a, sup{x | i(a,x) ≤ b}) = sup{i(a,x) | i(a,x) ≤ b} ≤ b.

Hence in both cases i(ωᵢ(a,b), a) ≤ b, and therefore ωᵢ(ωᵢ(a,b), b) ≥ a.
(4) Prove Theorem 3.3 (4): a ≤ b implies i. ωᵢ(a,d) ≥ ωᵢ(b,d) and ii. ωᵢ(d,a) ≤ ωᵢ(d,b).

proof: i.
ωᵢ(a,d) = sup{x | i(a,x) ≤ d}   (A)
ωᵢ(b,d) = sup{x | i(b,x) ≤ d}   (B)
Since a ≤ b and i is monotonically increasing, i(a,x) ≤ i(b,x) for every x, so
{x | i(b,x) ≤ d} ⊆ {x | i(a,x) ≤ d},
hence (B) ≤ (A), i.e., ωᵢ(b,d) ≤ ωᵢ(a,d).
ii. Similarly, if i(d,x) ≤ a then i(d,x) ≤ b, so {x | i(d,x) ≤ a} ⊆ {x | i(d,x) ≤ b} and
ωᵢ(d,a) ≤ ωᵢ(d,b).

(5) Show i(ωᵢ(a,b), ωᵢ(b,d)) ≤ ωᵢ(a,d).

proof: By property 1 it suffices to show i(a, i(ωᵢ(a,b), ωᵢ(b,d))) ≤ d. Using associativity,
property 10 and monotonicity:
i(a, i(ωᵢ(a,b), ωᵢ(b,d))) = i(i(a, ωᵢ(a,b)), ωᵢ(b,d)) ≤ i(b, ωᵢ(b,d)) ≤ d.

(10) Show i(a, ωᵢ(a,b)) ≤ b.

proof: Since i(a,b) ≤ min(a,b):
A. If a ≤ b, then ωᵢ(a,b) = 1 and
i(a, ωᵢ(a,b)) = i(a,1) = a ≤ b.
B. If a > b, then by the continuity of i,
i(a, ωᵢ(a,b)) = i(a, sup{x | i(a,x) ≤ b}) = sup{i(a,x) | i(a,x) ≤ b} ≤ b.
。The inf-ωᵢ composition (denoted here by P ωᵢ Q):
[P ωᵢ Q](x,z) = inf_{y∈Y} ωᵢ(P(x,y), Q(y,z))

。Theorem 3.4:
(1) (P ∘ᵢ Q ⊆ R) ⟺ (Q ⊆ P−1 ωᵢ R) ⟺ (P ⊆ (Q ωᵢ R−1)−1)
(2) P ωᵢ (Q ωᵢ S) = (P ∘ᵢ Q) ωᵢ S

。Theorem 3.5:
(∪_j P_j) ωᵢ Q = ∩_j (P_j ωᵢ Q)
(∩_j P_j) ωᵢ Q ⊇ ∪_j (P_j ωᵢ Q)
P ωᵢ (∩_j Q_j) = ∩_j (P ωᵢ Q_j)
P ωᵢ (∪_j Q_j) ⊇ ∪_j (P ωᵢ Q_j)

。Theorem 3.6: if Q1 ⊆ Q2, then P ωᵢ Q1 ⊆ P ωᵢ Q2 and Q1 ωᵢ R ⊇ Q2 ωᵢ R.

proof: Q1 ⊆ Q2 implies Q1 ∩ Q2 = Q1 and Q1 ∪ Q2 = Q2.
By Theorem 3.5, (P ωᵢ Q1) ∩ (P ωᵢ Q2) = P ωᵢ (Q1 ∩ Q2) = P ωᵢ Q1,
hence P ωᵢ Q1 ⊆ P ωᵢ Q2.
Likewise, (Q1 ωᵢ R) ∩ (Q2 ωᵢ R) = (Q1 ∪ Q2) ωᵢ R = Q2 ωᵢ R,
hence Q2 ωᵢ R ⊆ Q1 ωᵢ R.
。Theorem 3.7:
1. P−1 ∘ᵢ (P ωᵢ Q) ⊆ Q
2. R ⊆ P ωᵢ (P−1 ∘ᵢ R)
3. P ⊆ (P ωᵢ Q) ωᵢ Q−1
4. R ⊆ (R ωᵢ Q−1) ωᵢ Q

proof:
(1) By Theorem 3.4 (1), (Q′ ⊆ P′−1 ωᵢ R′) ⟹ (P′ ∘ᵢ Q′ ⊆ R′). Since P ωᵢ Q = (P−1)−1 ωᵢ Q,
applying this implication with P′ = P−1, Q′ = P ωᵢ Q and R′ = Q gives
P−1 ∘ᵢ (P ωᵢ Q) ⊆ Q.
(2) By Theorem 3.4 (1), (P′ ∘ᵢ Q′ ⊆ R′) ⟹ (Q′ ⊆ P′−1 ωᵢ R′). Applying this with P′ = P−1,
Q′ = R and R′ = P−1 ∘ᵢ R (the premise holds trivially) gives
R ⊆ (P−1)−1 ωᵢ (P−1 ∘ᵢ R) = P ωᵢ (P−1 ∘ᵢ R).
(3) Taking inverses in (1): [P−1 ∘ᵢ (P ωᵢ Q)]−1 ⊆ Q−1, i.e.,
(P ωᵢ Q)−1 ∘ᵢ P ⊆ Q−1.   (C)
Applying Theorem 3.4 (1) to (C) with P′ = (P ωᵢ Q)−1, Q′ = P and R′ = Q−1 gives
P ⊆ (P ωᵢ Q) ωᵢ Q−1.
(4) follows from (3) by replacing P with R and Q with Q−1.
Chapter 4

Single Layer Perceptron: Perceptron convergence theorem, method of steepest descent, least mean
square algorithms.

Single-Layer NN Systems

Here, a simple Perceptron model and an ADALINE network model are presented.

Single Layer Perceptron

Definition: An arrangement of one input layer of neurons feeding forward to one output layer of
neurons is known as a Single Layer Perceptron.

Fig.: Single layer Perceptron, with inputs x1, ..., xn, weights wij, and outputs y1, ..., ym.


Simple Perceptron Model

y_j = f(net_j) = 1 if net_j > 0, and 0 if net_j ≤ 0, where net_j = Σ_{i=1}^n x_i w_ij.

• Learning Algorithm: Training the Perceptron
The training of the Perceptron is a supervised learning algorithm where weights are adjusted to
minimize error whenever the computed output does not match the desired output.
- If the output is correct, then no adjustment of weights is done, i.e.
W_ij^{K+1} = W_ij^K

- If the output is 1 but should have been 0, then the weights are decreased on the active input
links, i.e.
W_ij^{K+1} = W_ij^K − α·x_i
- If the output is 0 but should have been 1, then the weights are increased on the active input
links, i.e.
W_ij^{K+1} = W_ij^K + α·x_i

where W_ij^{K+1} is the new adjusted weight, W_ij^K is the old weight, x_i is the input, and α is
the learning rate parameter. A small α leads to slow learning and a large α leads to fast
learning.

• Perceptron and Linearly Separable Tasks

The Perceptron cannot handle tasks which are not linearly separable.
- Definition: Sets of points in 2-D space are linearly separable if the sets can be separated by
a straight line.
- Generalizing, a set of points in n-dimensional space is linearly separable if there is a
hyperplane of (n−1) dimensions that separates the sets.

Example: (a) linearly separable patterns; (b) not linearly separable patterns.

Note: The Perceptron cannot find weights for classification problems that are not linearly
separable.


• XOR Problem: Exclusive OR operation

Fig.: Output of XOR in the (x1, x2) plane.

Even parity means an even number of 1 bits in the input; odd parity means an odd number of 1
bits in the input.
- There is no way to draw a single straight line so that the circles are on one side of the line
and the dots on the other side.
- The Perceptron is unable to find a line separating even-parity input patterns from odd-parity
input patterns.

• Perceptron Learning Algorithm

The algorithm is illustrated step by step.

* Step 1:
Create a perceptron with (n+1) input neurons x0, x1, ..., xn, where x0 = 1 is the bias input.
Let O be the output neuron.

* Step 2:
Initialize the weights W = (w0, w1, ..., wn) to random values.

* Step 3:
Iterate through the input patterns Xj of the training set using the weight set, i.e., compute
the weighted sum of inputs net_j = Σ_{i=0}^n x_i w_i for each input pattern j.

* Step 4:
Compute the output y_j using the step function
y_j = f(net_j) = 1 if net_j > 0, and 0 otherwise.

* Step 5:
Compare the computed output y_j with the target output for each input pattern j. If all the
input patterns have been classified correctly, then output (read) the weights and exit.

* Step 6:
Otherwise, update the weights as given below:
If the computed output y_j is 1 but should have been 0, then w_i = w_i − α·x_i, i = 0,1,2,...,n.
If the computed output y_j is 0 but should have been 1, then w_i = w_i + α·x_i, i = 0,1,2,...,n,
where α is the learning parameter and is constant.

* Step 7:
Go to Step 3.

* END
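A minimal sketch of Steps 1-7 in Python; the training set (the AND function) and the learning rate α = 0.2 are illustrative choices of ours. Because AND is linearly separable, the loop terminates:

import random

def train_perceptron(patterns, targets, alpha=0.2, max_epochs=100):
    n = len(patterns[0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n + 1)]   # Step 2 (w[0] = bias weight)
    for _ in range(max_epochs):                             # Step 7: loop back to Step 3
        all_correct = True
        for x, d in zip(patterns, targets):                 # Step 3
            xb = [1] + list(x)                              # Step 1: x0 = 1 bias input
            net = sum(xi * wi for xi, wi in zip(xb, w))
            y = 1 if net > 0 else 0                         # Step 4: step function
            if y != d:                                      # Steps 5-6
                all_correct = False
                sign = -1 if y == 1 else +1                 # decrease or increase weights
                w = [wi + sign * alpha * xi for wi, xi in zip(w, xb)]
        if all_correct:                                     # Step 5: exit condition
            break
    return w

w = train_perceptron([(0,0), (0,1), (1,0), (1,1)], [0, 0, 0, 1])
print(w)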

An ADALINE consists of a single neuron of the McCulloch-Pitts type, whose weights are determined
by the normalized least mean square (LMS) training law. The LMS learning rule is also referred
to as the delta rule. It is a well-established supervised training method that has been used
over a wide range of diverse applications.

• Architecture of a simple ADALINE

The basic structure of an ADALINE is similar to a neuron with a linear activation function and a
feedback loop. During the training phase of the ADALINE, the input vector as well as the desired
output are presented to the network.
[The complete training mechanism is explained below.]


• ADALINE Training Mechanism

(Ref. figure above: architecture of a simple ADALINE)

* The basic structure of an ADALINE is similar to a linear neuron with an extra feedback loop.

* During the training phase of the ADALINE, the input vector X = [x1, x2, ..., xn]^T as well as
the desired output are presented to the network.

* The weights are adaptively adjusted based on the delta rule.

* After the ADALINE is trained, an input vector presented to the network with fixed weights will
result in a scalar output.

* Thus, the network performs an n-dimensional mapping to a scalar value.

* The activation function is not used during the training phase. Once the weights are properly
adjusted, the response of the trained unit can be tested by applying various inputs which are
not in the training set. If the network produces consistent responses to a high degree with the
test inputs, it is said that the network can generalize. The processes of training and
generalization are two important attributes of this network.

Usage of ADALINE:
In practice, an ADALINE is used to

- make binary decisions; the output is sent through a binary threshold;

- realize logic gates such as AND, NOT and OR;

- realize only those logic functions that are linearly separable.
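A sketch of LMS (delta-rule) training on a linear unit, assuming illustrative data (the AND function with ±1 targets) and a learning rate of our own choosing; the threshold is applied only after training, as described above:

def train_adaline(patterns, targets, eta=0.05, epochs=200):
    n = len(patterns[0])
    w = [0.0] * (n + 1)                                      # w[0] is the bias weight
    for _ in range(epochs):
        for x, d in zip(patterns, targets):
            xb = [1.0] + list(x)
            y = sum(xi * wi for xi, wi in zip(xb, w))        # linear output during training
            err = d - y
            w = [wi + eta * err * xi for wi, xi in zip(w, xb)]   # delta (LMS) rule
    return w

# Binary decisions are then made by thresholding the linear output.
w = train_adaline([(0,0), (0,1), (1,0), (1,1)], [-1, -1, -1, 1])
print([1 if sum(xi * wi for xi, wi in zip([1.0] + list(x), w)) > 0 else -1
       for x in [(0,0), (0,1), (1,0), (1,1)]])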

Applications of Neural Networks

Neural network applications can be grouped into the following categories:

* Clustering:
A clustering algorithm explores the similarity between patterns and places similar patterns in a
cluster. Best-known applications include data compression and data mining.

* Classification/Pattern recognition:
The task of pattern recognition is to assign an input pattern (like a handwritten symbol) to one
of many classes. This category includes algorithmic implementations such as associative memory.

* Function approximation:
The task of function approximation is to find an estimate of an unknown function subject to
noise. Various engineering and scientific disciplines require function approximation.

* Prediction systems:
The task is to forecast some future values of time-sequenced data. Prediction has a significant
impact on decision support systems. Prediction differs from function approximation by
considering the time factor: the system may be dynamic and may produce different results for the
same input data based on the system state (time).

Multilayer Perceptron: Derivation of the back-propagation algorithm, learning factors
MULTI-LAYER PERCEPTRONS

In the previous section we showed that by adding an extra hidden unit, the XOR problem can be
solved. For binary units, one can prove that this architecture is able to perform any
transformation given the correct connections and weights. The most primitive construction is the
following. For a given transformation y = d(x), we can divide the set of all possible input
vectors into two classes:
X+ = {x | d(x) = 1} and X− = {x | d(x) = −1}.
Since there are N input units, the total number of possible input vectors x is 2^N. For every
x^p ∈ X+ a hidden unit h can be reserved whose activation y_h is 1 if and only if the specific
pattern p is present at the input: we can choose its weights w_ih equal to the specific pattern
x^p and the bias θ_h equal to 1 − N, such that

y_h^p = sgn( Σ_i w_ih x_i^p − N + ½ )

is equal to 1 for x^p = w_h only. Similarly, the weights to the output neuron can be chosen such
that the output is one as soon as one of the M predicate neurons is one:

y_o^p = sgn( Σ_{h=1}^M y_h^p + M − ½ )

This perceptron will give y_o = 1 only if x ∈ X+: it performs the desired mapping. The problem
is the large number of predicate units, which is equal to the number of patterns in X+, which is
maximally 2^N. Of course we can do the same trick for X−, and we will always take the minimal
number of mask units, which is maximally 2^{N−1}. A more elegant proof is given by Minsky and
Papert, but the point is that for complex transformations the number of required units in the
hidden layer is exponential in N.

Back-Propagation

As we have seen in the previous chapter, a single-layer network has severe restrictions: the
class of tasks that can be accomplished is very limited. In this chapter we will focus on
feed-forward networks with layers of processing units. Minsky and Papert showed in 1969 that a
two-layer feed-forward network can overcome many restrictions, but did not present a solution to
the problem of how to adjust the weights from input to hidden units. An answer to this question
was presented by Rumelhart, Hinton and Williams in 1986, and similar solutions appeared to have
been published earlier (Parker, 1985; Cun, 1985).

The central idea behind this solution is that the errors for the units of the hidden layer are
determined by back-propagating the errors of the units of the output layer. For this reason the
method is often called the back-propagation learning rule. Back-propagation can also be
considered as a generalization of the delta rule for non-linear activation functions and
multilayer networks.


MULTI-LAYER FEED-FORWARD NETWORKS

A feed-forward network has a layered structure. Each layer consists of units which receive their
input from units in a layer directly below and send their output to units in a layer directly
above. There are no connections within a layer.

The N_i inputs are fed into the first layer of N_{h,1} hidden units. The input units are merely
'fan-out' units; no processing takes place in these units. The activation of a hidden unit is a
function F_i of the weighted inputs plus a bias. The output of the hidden units is distributed
over the next layer of N_{h,2} hidden units, until the last layer of hidden units, of which the
outputs are fed into a layer of N_o output units (see the figure below).

Although back-propagation can be applied to networks with any number of layers, it has been
shown (Cybenko, 1989; Funahashi, 1989; Hornik, Stinchcombe, & White, 1989; Hartman, Keeler, &
Kowalski, 1990) that only one layer of hidden units suffices to approximate any function with
finitely many discontinuities to arbitrary precision, provided the activation functions of the
hidden units are non-linear (the universal approximation theorem).

Fig. A multi-layer network with layers of units.

In most applications a feed-forward network with a single layer of hidden units is used with a
sigmoid activation function for the units.

THE GENERALISED DELTA RULE

Since we are now using units with non-linear activation functions, we have to generalise the
delta rule, which was presented in chapter 11 for linear functions, to the set of non-linear
activation functions. The activation is a differentiable function of the total input,

y_k^p = F(s_k^p)    (12.1)

in which

s_k^p = Σ_j w_jk y_j^p + θ_k    (12.2)

To get the correct generalization of the delta rule as presented in the previous chapter, we
must set

Δ_p w_jk = −γ ∂E^p/∂w_jk    (12.3)

The error E^p is defined as the total quadratic error for pattern p at the output units:

E^p = ½ Σ_{o=1}^{N_o} (d_o^p − y_o^p)²    (12.4)
where d_o^p is the desired output for unit o when pattern p is clamped. We further set
E = Σ_p E^p as the summed squared error. We can write

∂E^p/∂w_jk = (∂E^p/∂s_k^p)(∂s_k^p/∂w_jk)    (12.5)

By equation (12.2) we see that the second factor is

∂s_k^p/∂w_jk = y_j^p    (12.6)

When we define

δ_k^p = −∂E^p/∂s_k^p    (12.7)

we will get an update rule which is equivalent to the delta rule as described in the previous
chapter, resulting in a gradient descent on the error surface if we make the weight changes
according to:

Δ_p w_jk = γ δ_k^p y_j^p    (12.8)

The trick is to figure out what δ_k^p should be for each unit k in the network. The interesting
result, which we now derive, is that there is a simple recursive computation of these δ's which
can be implemented by propagating error signals backward through the network.

To compute δ_k^p we apply the chain rule to write this partial derivative as the product of two
factors, one factor reflecting the change in error as a function of the output of the unit and
one reflecting the change in the output as a function of changes in the input. Thus, we have

δ_k^p = −∂E^p/∂s_k^p = −(∂E^p/∂y_k^p)(∂y_k^p/∂s_k^p)    (12.9)

Let us compute the second factor. By equation (12.1) we see that

∂y_k^p/∂s_k^p = F′(s_k^p)    (12.10)


which is simply the derivative of the squashing function F for the k-th unit, evaluated at the
net input s_k^p to that unit. To compute the first factor of equation (12.9), we consider two
cases. First, assume that unit k is an output unit k = o of the network. In this case, it
follows from the definition of E^p that

∂E^p/∂y_o^p = −(d_o^p − y_o^p)    (12.11)

which is the same result as we obtained with the standard delta rule. Substituting this and
equation (12.10) in equation (12.9), we get

δ_o^p = (d_o^p − y_o^p) F′(s_o^p)    (12.12)

for any output unit o. Secondly, if k is not an output unit but a hidden unit k = h, we do not
readily know the contribution of the unit to the output error of the network. However, the error
measure can be written as a function of the net inputs from the hidden to the output layer,
E^p = E^p(s_1^p, s_2^p, ..., s_{N_o}^p), and we use the chain rule to write

∂E^p/∂y_h^p = Σ_{o=1}^{N_o} (∂E^p/∂s_o^p)(∂s_o^p/∂y_h^p) = Σ_{o=1}^{N_o} (∂E^p/∂s_o^p) w_ho
= −Σ_{o=1}^{N_o} δ_o^p w_ho    (12.13)

Substituting this in equation (12.9) yields

δ_h^p = F′(s_h^p) Σ_{o=1}^{N_o} δ_o^p w_ho    (12.14)

Equations (12.12) and (12.14) give a recursive procedure for computing the δ's for all units in
the network, which are then used to compute the weight changes according to equation (12.8).
This procedure constitutes the generalized delta rule for a feed-forward network of non-linear
units.

12.3.1 Understanding Back-Propagation

The equations derived in the previous section may be mathematically correct, but what do they
actually mean? Is there a way of understanding back-propagation other than reciting the
necessary equations?

The answer is, of course, yes. In fact, the whole back-propagation process is intuitively very
clear. What happens in the above equations is the following. When a learning pattern is clamped,
the activation values are propagated to the output units, and the actual network output is
compared with the desired output values; we usually end up with an error in each of the output
units. Let's call this error e_o for a particular output unit o. We have to bring e_o to zero.

The simplest method to do this is the greedy method: we strive to change the connections in the
neural network in such a way that, next time around, the error e_o will be zero for this
particular pattern. We know from the delta rule that, in order to reduce an error, we have to
adapt its incoming weights according to

Δw_ho = (d_o − y_o) y_h

That is step one. But it alone is not enough: when we only apply this rule, the weights from
input to hidden units are never changed, and we do not have the full representational power of
the feed-forward network as promised by the universal approximation theorem.

In order to adapt the weights from input to hidden units, we again want to apply the delta rule.
In this case, however, we do not have a value for δ for the hidden units. This is solved by the
chain rule, which does the following: distribute the error of an output unit o to all the hidden
units that it is connected to, weighted by this connection. Differently put, a hidden unit h
receives a delta from each output unit o equal to the delta of that output unit weighted with
(= multiplied by) the weight of the connection between those units.

In symbols: δ_h = Σ_o δ_o w_ho. Well, not exactly: we forgot the activation function of the
hidden unit; F′ has to be applied to the delta before the back-propagation process can continue.

WORKING WITH BACK-PROPAGATION

The application of the generalised delta rule thus involves two phases: During the first phase
the input x is presented and propagated forward through the network to compute the output values
y_o^p for each output unit. This output is compared with its desired value d_o, resulting in an
error signal δ_o^p for each output unit.

The second phase involves a backward pass through the network during which the error signal is
passed to each unit in the network and appropriate weight changes are calculated.

12.4.1 Weight Adjustments with Sigmoid Activation Function

The results from the previous section can be summarised in three equations:

* The weight of a connection is adjusted by an amount proportional to the product of the error
signal δ_k on the unit k receiving the input and the output of the unit j sending this signal
along the connection:

Δ_p w_kj = γ δ_k^p y_j^p    (12.16)

* If the unit is an output unit, the error signal is given by

δ_o^p = (d_o^p − y_o^p) F′(s_o^p)    (12.17)

Take as the activation function F the 'sigmoid' function as defined in chapter 2:

y^p = F(s^p) = 1 / (1 + e^{−s^p})    (12.18)

In this case the derivative is equal to

F′(s^p) = e^{−s^p} / (1 + e^{−s^p})² = y^p (1 − y^p)    (12.19)

such that the error signal for an output unit can be written as:

δ_o^p = (d_o^p − y_o^p) y_o^p (1 − y_o^p)    (12.20)

* The error signal for a hidden unit is determined recursively in terms of the error signals of
the units to which it directly connects and the weights of those connections. For the sigmoid
activation function:

δ_h^p = F′(s_h^p) Σ_{o=1}^{N_o} δ_o^p w_ho = y_h^p (1 − y_h^p) Σ_{o=1}^{N_o} δ_o^p w_ho    (12.21)


Learning Rate and Momentum

The learning procedure requires that the change in weight be proportional to ∂E^p/∂w. True
gradient descent requires that infinitesimal steps be taken. The constant of proportionality is
the learning rate γ. For practical purposes we choose a learning rate that is as large as
possible without leading to oscillation. One way to avoid oscillation at large γ is to make the
change in weight dependent on the past weight change by adding a momentum term:

Δw_jk(t+1) = γ δ_k^p y_j^p + α Δw_jk(t)    (12.22)

where t indexes the presentation number and α is a constant which determines the effect of the
previous weight change.

The role of the momentum term is shown in Fig. 12.2. When no momentum term is used, it takes a
long time before the minimum is reached with a low learning rate, whereas for high learning
rates the minimum is never reached because of the oscillations. When adding the momentum term,
the minimum will be reached faster.

Fig. The descent in weight space: (a) for small learning rate; (b) for large learning rate
(note the oscillations); (c) with large learning rate and momentum term added.
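Equations (12.16)-(12.22) can be sketched as a small training loop. The network below (2 inputs, 3 sigmoid hidden units, 1 sigmoid output, trained on XOR) and the values γ = 0.5, α = 0.9 are illustrative choices of ours; convergence depends on the random initialization:

import math, random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

random.seed(1)
W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]  # hidden weights (bias, x1, x2)
W2 = [random.uniform(-1, 1) for _ in range(4)]                      # output weights (bias + 3 hidden)
dW1 = [[0.0] * 3 for _ in range(3)]                                 # previous weight changes
dW2 = [0.0] * 4
gamma, alpha = 0.5, 0.9                                             # learning rate, momentum

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
for _ in range(5000):                                               # learning per pattern
    for x, d in data:
        xb = [1.0, x[0], x[1]]
        h = [sigmoid(sum(w * xi for w, xi in zip(row, xb))) for row in W1]
        hb = [1.0] + h
        y = sigmoid(sum(w * hi for w, hi in zip(W2, hb)))
        delta_o = (d - y) * y * (1 - y)                             # eq. (12.20)
        delta_h = [h[j] * (1 - h[j]) * delta_o * W2[j + 1]          # eq. (12.21)
                   for j in range(3)]
        dW2 = [gamma * delta_o * hi + alpha * dw for hi, dw in zip(hb, dW2)]  # eq. (12.22)
        W2 = [w + dw for w, dw in zip(W2, dW2)]
        for j in range(3):
            dW1[j] = [gamma * delta_h[j] * xi + alpha * dw
                      for xi, dw in zip(xb, dW1[j])]
            W1[j] = [w + dw for w, dw in zip(W1[j], dW1[j])]

for x, d in data:
    xb = [1.0, x[0], x[1]]
    hb = [1.0] + [sigmoid(sum(w * xi for w, xi in zip(row, xb))) for row in W1]
    print(x, d, round(sigmoid(sum(w * hi for w, hi in zip(W2, hb))), 2))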

Learning Per Pattern

Although, theoretically, the back-propagation algorithm performs gradient descent on the total
error only if the weights are adjusted after the full set of learning patterns has been
presented, more often than not the learning rule is applied to each pattern separately, i.e., a
pattern p is applied, E^p is calculated, and the weights are adapted (p = 1, 2, ..., P). There
exists empirical indication that this results in faster convergence. Care has to be taken,
however, with the order in which the patterns are taught. For example, when using the same
sequence over and over again, the network may become focused on the first few patterns. This
problem can be overcome by using a permuted training method.

Example 12.1: A feed-forward network can be used to approximate a function from examples.
Suppose we have a system (for example a chemical process or a financial market) of which we want
to know the characteristics. The input of the system is given by the two-dimensional vector x
and the output is given by the one-dimensional vector d. We want to estimate the relationship
d = f(x) from 80 examples {x^p, d^p} as depicted in Fig. 12.3 (top left). A feed-forward network
was programmed with two inputs, 10 hidden units with sigmoid activation function and an output
unit with a linear activation function. Check for yourself how equation (12.20) should be
adapted for the linear instead of sigmoid activation function. The network weights are
initialized to small values and the network is trained for 5,000 learning iterations with the
back-propagation training rule described in the previous section. The relationship between x and
d as represented by the network is shown in Fig. 12.3 (top right), while the function which
generated the learning samples is given in Fig. 12.3 (bottom left). The approximation error is
depicted in Fig. 12.3 (bottom right). We see that the error is higher at the edges of the region
within which the learning samples were generated. The network is considerably better at
interpolation than extrapolation.

Fig. Example of function approximation with a feed-forward network. Top left: the original
learning samples; Top right: the approximation with the network; Bottom left: the function which
generated the learning samples; Bottom right: the error in the approximation.

Exercise:
Q1. What is meant by single-layer NN systems?
Q2. Explain the architecture of a simple ADALINE.
Q3. What are the uses of ADALINE?
Q4. What are the applications of neural networks?
Q5. What is meant by a learning algorithm?
Q6. Explain multi-layer perceptrons.
Q7. What is the process of back-propagation?
Q8. Write a short note on multi-layer feed-forward networks.
Q9. State the generalised delta rule.
Q10. How can we understand back-propagation?
Q11. Explain working with back-propagation.
Q12. What is meant by learning rate and momentum?

Chapter 5

Radial Basis and Recurrent Neural Networks: RBF network structure, theorems and the separability
of patterns, RBF learning strategies

Radial Basis and Recurrent Neural Networks: RBF network structure, theorems and the separability
of patterns, RBF learning strategies, K-means and LMS algorithms, comparison of RBF and MLP
networks: energy function, spurious states, error performance.

Radial basis function (RBF) networks are feed-forward networks trained using a supervised
training algorithm. They are typically configured with a single hidden layer of units whose
activation function is selected from a class of functions called basis functions. While similar
to back-propagation in many respects, radial basis function networks have several advantages.
They usually train much faster than back-propagation networks. They are less susceptible to
problems with non-stationary inputs because of the behaviour of the radial basis function hidden
units.

Popularized by Moody and Darken (1989), RBF networks have proven to be a useful neural network
architecture. The major difference between RBF networks and back-propagation networks (that is,
multilayer perceptrons trained by the back-propagation algorithm) is the behaviour of the single
hidden layer. Rather than using the sigmoidal or S-shaped activation function as in
back-propagation, the hidden units in RBF networks use a Gaussian or some other basis kernel
function. Each hidden unit acts as a locally tuned processor that computes a score for the match
between the input vector and its connection weights or centres. In effect, the basis units are
highly specialized pattern detectors. The weights connecting the basis units to the outputs are
used to take linear combinations of the hidden units to produce the final classification or
output. In this chapter, first the structure of the network will be introduced and it will be
explained how it can be used for function approximation and data interpolation. Then it will be
explained how it can be trained.

The Structure of the RBF Networks

Radial basis functions were first introduced in the solution of real multivariable interpolation
problems. Broomhead and Lowe (1988), and Moody and Darken (1989) were the first to exploit the
use of radial basis functions in the design of neural networks.

The structure of an RBF network in its most basic form involves three entirely different layers.


Structure of the standard RBF network

The input layer is made up of source nodes (sensory units) whose number is equal to the
dimension p of the input vector u.


Hidden layer

The second layer is the hidden layer, which is composed of nonlinear units that are connected
directly to all of the nodes in the input layer. It is of high enough dimension and serves a
different purpose from that in a multilayer perceptron.

Each hidden unit takes its input from all the nodes at the input layer. As mentioned above, each
hidden unit contains a basis function, which has the parameters centre and width. The centre of
the basis function for a node i at the hidden layer is a vector c_i whose size is the same as
the input vector u, and there is normally a different centre for each unit in the network.

First, the radial distance d_i between the input vector u and the centre of the basis function
c_i is computed for each unit i in the hidden layer as

d_i = ||u − c_i||    (5.1.1)

using the Euclidean distance.

The output h_i of each hidden unit i is then computed by applying the basis function G to this
distance:

h_i = G(d_i, σ_i)    (5.1.2)

As shown in Figure 5.2, the basis function is a curve (typically a Gaussian function, with the
width corresponding to the variance σ_i) which has a peak at zero distance and decreases as the
distance from the centre increases.


Figure 5.2. The response region of an RBF hidden node around its centre as a function of the
distance from this centre.

For an input space u ∈ R², that is M = 2, this corresponds to a two-dimensional Gaussian centred
at c_i on the input space, where also c_i ∈ R², as shown in Figure 5.3.

Figure 5.3. Response of a hidden unit on the input space for u ∈ R².

5.1.2 Output layer

The transformation from the input space to the hidden-unit space is nonlinear, whereas the
transformation from the hidden-unit space to the output space is linear.

The j-th output is computed as

x_j = f_j(u) = w_{0j} + Σ_{i=1}^L w_{ij} h_i,   j = 1, 2, ..., M    (5.1.3)

Mathematical model

In summary, the mathematical model of the RBF network can be expressed as:

x = f(u),   f : R^N → R^M    (5.1.4)

x_j = f_j(u) = w_{0j} + Σ_{i=1}^L w_{ij} G(||u − c_i||),   j = 1, 2, ..., M    (5.1.5)

where ||u − c_i|| is the Euclidean distance between u and c_i.
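A sketch of the forward pass of eqs. (5.1.1)-(5.1.3) in Python. The Gaussian form exp(−d²/(2σ²)) is one common choice for G; the centres and weights below reuse the three Gaussians of Table 5.1 purely as illustrative values:

import numpy as np

def rbf_forward(u, centers, sigma, W, w0):
    # (5.1.1) radial distances, (5.1.2) Gaussian basis outputs, (5.1.3) linear layer
    d = np.linalg.norm(centers - u, axis=1)
    h = np.exp(-(d ** 2) / (2 * sigma ** 2))
    return w0 + W @ h

centers = np.array([[0.2], [0.6], [0.9]])   # L = 3 centres (illustrative)
sigma = np.full(3, 0.1)                     # common width
W = np.array([[0.2, 0.5, 0.3]])             # M = 1 output
w0 = np.zeros(1)
print(rbf_forward(np.array([0.6]), centers, sigma, W, w0))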

5.2 Function approximation

Let y = g(u) be a given function of u, y ∈ R, u ∈ R, g : R → R, and let G_i, i = 1..L, be a
finite set of basis functions. The function g can be written in terms of the given basis
functions as

y = g(u) = Σ_{i=1}^L w_i G_i(u) + r(u)    (5.2.1)

where r(u) is the residual.

The function y can be approximated as

y = g(u) ≈ Σ_{i=1}^L w_i G_i(u)    (5.2.2)

The aim is to minimize the error by setting the parameters of G_i appropriately. A possible
choice for the error definition is the L2 norm of the residual function r(u), which is defined
as

||r(u)||_{L2} = Σ r(u)²    (5.2.3)

5.2.1 Approximation by RBFNN

Now consider the single-input single-output RBF network shown in Figure 5.4. Then x can be
written as

x = f(u) = Σ_{i=1}^L w_i G_i(||u − c_i||)    (5.2.4)

By the use of such a network, y can be written as

y = Σ_{i=1}^L w_i G_i(||u − c_i||) + r(u) = f(u) + r(u)    (5.2.5)

where f(u) is the output of the RBFNN given in Figure 5.4 and r(u) is the residual.

By setting the centres c_i, the variances σ_i, and the weights w_i appropriately, the error can
be minimized.

Figure 5.4. Single-input, single-output RBF network.

Whatever we discussed here for g : R → R can be generalized to g : R^N → R^M easily by using an
N-input, M-output RBFNN as given in Figure 5.1 previously.

5.2.2 Data Interpolation

Given input-output training patterns (u^k, y^k), k = 1, 2, ..., K, the aim of data interpolation
is to approximate the function y from which the data is generated. Since the function y is
unknown, the problem can be stated as a minimization problem which takes only the sample points
into consideration:

Choose w_{ij} and c_i, i = 1, 2, ..., L, j = 1, 2, ..., M, so as to minimize

J(w,c) = Σ_{k=1}^K ||y^k − f(u^k)||²    (5.2.6)

As an example, the output of an RBF network trained to fit the data points given in Table 5.1 is
given in Figure 5.5.

TABLE 5.1: 13 data points generated by using the sum of three Gaussians with c1 = 0.2000,
c2 = 0.6000, c3 = 0.9000, w1 = 0.2000, w2 = 0.5000, w3 = 0.3000, σ = 0.1000.

data no   1      2      3      4      5      6      7      9      10     11     12     13
x         0.0500 0.2000 0.2500 0.3000 0.4000 0.4300 0.4800 0.6000 0.7000 0.8000 0.9000 0.9500
f(x)      0.863  0.2662 0.2362 0.1687 0.1260 0.1756 0.3290 0.6694 0.4573 0.3320 0.4063 0.3535

Figure 5.5. Output of the RBF network trained to fit the data points given in Table 5.1.

5.3 Training RBF Networks

The training of an RBF network can be formulated as a nonlinear unconstrained optimization problem given below:

Given input-output training patterns (u_k, y_k), k = 1, 2, ..., K, choose w_ij and c_i, i = 1, 2, ..., L, j = 1, 2, ..., M, so as to minimize the cost J(w, c) given in Eq. (5.2.6).


Note that the training problem becomes quadratic once the c_i's (radial basis function centers) are known.

5.3.1 Adjusting the widths

In its simplest form, all hidden units in the RBF network have the same width or degree of sensitivity to inputs. However, in portions of the input space where there are few patterns, it is sometimes desirable to have hidden units with a wide area of reception. Likewise, in portions of the input space which are crowded, it might be desirable to have very highly tuned processors with narrow reception fields. Computing these individual widths increases the performance of the RBF network at the expense of a more complicated training process.

5.3.2 Adjusting the centers

Remember that in a backpropagation network, all weights in all of the layers are adjusted at the same time. In radial basis function networks, however, the weights into the hidden layer basis units are usually set before the second layer of weights is adjusted. As the input moves away from the connection weights, the activation value falls off. This behavior leads to the use of the term "center" for the first-layer weights. These center weights can be computed using Kohonen feature maps, statistical methods such as K-means clustering, or some other means. In any case, they are then used to set the areas of sensitivity for the RBF network's hidden units, which then remain fixed.

5.3.3 Adjusting the weights

Once the hidden layer weights are set, a second phase of training is used to adjust the output weights. This process typically uses the standard steepest descent algorithm. Note that the training problem becomes quadratic once the c_i's (radial basis function centers) are known.



Exercise

Q1. Write a short note on radial basis and recurrent neural networks.
Q2. Explain the structure of the RBF networks.
Q3. What is meant by a hidden layer?
Q4. What is meant by function approximation?
Q5. Write a short note on data interpolation.
Q6. Write a short note on adjusting the centers.
Q7. Write a short note on adjusting the widths.


Chapter 6
K-means and LMS algorithms, Comparison of RBF and MLP networks
k-means clustering algorithm

k-means is one of the simplest unsupervised learning algorithms that solve the well known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centers, one for each cluster. These centers should be placed in a cunning way, because a different location causes a different result. So the better choice is to place them as far away from each other as possible. The next step is to take each point belonging to a given data set and associate it to the nearest center. When no point is pending, the first step is completed and an early groupage is done. At this point we need to re-calculate k new centroids as barycenters of the clusters resulting from the previous step. After we have these k new centroids, a new binding has to be done between the same data set points and the nearest new center. A loop has been generated. As a result of this loop we may notice that the k centers change their location step by step until no more changes are done, or in other words the centers do not move any more. Finally, this algorithm aims at minimizing an objective function known as the squared error function, given by:

J(V) = Σ_{i=1}^{c} Σ_{j=1}^{c_i} (||x_i - v_j||)²

where '||x_i - v_j||' is the Euclidean distance between x_i and v_j, 'c_i' is the number of data points in the i-th cluster, and 'c' is the number of cluster centers.

Algorithmic steps for k-means clustering

Let X = {x_1, x_2, x_3, ..., x_n} be the set of data points and V = {v_1, v_2, ..., v_c} be the set of centers.


1) Randomly select 'c' cluster centers.
2) Calculate the distance between each data point and the cluster centers.
3) Assign each data point to the cluster center whose distance from the data point is the minimum of all the cluster centers.
4) Recalculate the new cluster centers using

   v_i = (1/c_i) Σ_{j=1}^{c_i} x_j

   where 'c_i' represents the number of data points in the i-th cluster.
5) Recalculate the distance between each data point and the newly obtained cluster centers.
6) If no data point was reassigned, stop; otherwise repeat from step 3.
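A minimal NumPy sketch of these steps (Euclidean distance assumed; an empty cluster simply keeps its previous center):

    import numpy as np

    def kmeans(X, c, max_iter=100, seed=0):
        rng = np.random.default_rng(seed)
        V = X[rng.choice(len(X), size=c, replace=False)]   # 1) random initial centers
        labels = np.zeros(len(X), dtype=int)
        for _ in range(max_iter):
            d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)   # 2) distances
            labels = d.argmin(axis=1)                                   # 3) nearest center
            V_new = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                              else V[i] for i in range(c)])             # 4) new centroids
            if np.allclose(V_new, V):                                   # 6) centers stopped moving
                break
            V = V_new                                                   # 5) iterate
        return V, labels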

Advantages

1) Fast, robust and easy to understand.
2) Relatively efficient: O(tknd), where n is the number of objects, k is the number of clusters, d is the dimension of each object, and t is the number of iterations. Normally, k, t, d << n.
3) Gives best results when the data sets are distinct or well separated from each other.

Fig. I : Showing the result of k-means for ‘N’ = 60 and ‘c’ = 3



Disadvantages

1) The learning algorithm requires a priori specification of the number of cluster centers.
2) The use of Exclusive Assignment - if there are two highly overlapping data then k-means will not be able to resolve that there are two clusters.
3) The learning algorithm is not invariant to non-linear transformations, i.e. with different representations of the data we get different results (data represented in the form of Cartesian co-ordinates and polar co-ordinates give different results).
4) Euclidean distance measures can unequally weight underlying factors.
5) The learning algorithm provides the local optima of the squared error function.
6) Randomly choosing the cluster centers cannot lead us to a fruitful result (please refer to the figure).
7) Applicable only when the mean is defined, i.e. fails for categorical data.
8) Unable to handle noisy data and outliers.
9) The algorithm fails for non-linear data sets.

Fig. II : Showing a non-linear data set where the k-means algorithm fails.

6.4 Least-Mean-Square Algorithm
The least-mean-square (LMS) algorithm is based on the use of instantaneous estimates of the autocorrelation function r_x(j,k) and the cross-correlation function r_xd(k). These estimates are deduced directly from the defining equations (6.8) and (6.7) as follows:

r̂_x(j, k; n) = x_j(n) x_k(n)    (6.20)

and

r̂_xd(k; n) = x_k(n) d(n)    (6.21)

The use of a hat in r̂_x and r̂_xd is intended to signify that these quantities are "estimates." The definitions introduced in Eqs. (6.20) and (6.21) have been generalized to include a nonstationary environment, in which case all the sensory signals and the desired response assume time-varying forms too. Thus, substituting r̂_x(j,k;n) and r̂_xd(k;n) in place of r_x(j,k) and r_xd(k) in Eq. (6.17), we get
ŵ_k(n+1) = ŵ_k(n) + η [x_k(n) d(n) - Σ_{j=1}^{p} ŵ_j(n) x_j(n) x_k(n)]
         = ŵ_k(n) + η [d(n) - Σ_{j=1}^{p} ŵ_j(n) x_j(n)] x_k(n)
         = ŵ_k(n) + η [d(n) - y(n)] x_k(n),    k = 1, 2, ..., p    (6.22)

where y(n) is the output of the spatial filter computed at iteration n in accordance with the LMS algorithm; that is,
y(n) = Σ_{j=1}^{p} ŵ_j(n) x_j(n)    (6.23)

Note that in Eq. (6.22) we have used ŵ_k(n) in place of w_k(n) to emphasize the fact that Eq. (6.22) involves "estimates" of the weights of the spatial filter. Figure 6.3 illustrates the operational environment of the LMS algorithm, which is completely described by Eqs. (6.22) and (6.23). A summary of the LMS algorithm is presented in Table 6.1, which clearly illustrates the simplicity of the algorithm. As indicated in this table, for the initialization of the algorithm, it is customary to set all the initial values of the weights of the filter equal to zero.

In the method of steepest descent applied to a "known" environment, the weight vector w(n), made up of the weights w_1(n), w_2(n), ..., w_p(n), starts at some initial value w(0) and then follows a precisely defined trajectory (along the error surface) that eventually terminates on the optimum solution w_o, provided that the learning-rate parameter η is chosen properly. In contrast, in the LMS algorithm applied to an "unknown" environment, the weight vector ŵ(n), representing an "estimate" of w(n), follows a random trajectory. For this reason, the LMS algorithm is sometimes referred to as a "stochastic gradient algorithm." As the number of iterations in the LMS algorithm approaches infinity, ŵ(n) performs a random walk (Brownian motion) about the optimum solution w_o; see Appendix D.

FIGURE 6.3 Adaptive spatial filter.

Another way of stating the basic difference between the method of steepest descent and the LMS algorithm is in terms of the error calculations involved. At any iteration n, the method of steepest descent minimizes the mean-squared error J(n). This cost function involves ensemble averaging, the effect of which is to give the method of steepest descent an "exact" gradient vector that improves in pointing accuracy with increasing n. The LMS algorithm, on the other hand, minimizes an instantaneous estimate of the cost function J(n). Consequently, the gradient vector in the LMS algorithm is "random," and its pointing accuracy improves "on the average" with increasing n.

The basic difference between the method of steepest descent and the LMS algorithm may also be stated in terms of time-domain ideas, emphasizing other aspects of the adaptive filtering problem. The method of steepest descent minimizes the sum of error squares, integrated over all previous iterations of the algorithm, using estimates of the autocorrelation function r_x and cross-correlation function r_xd. In contrast, the LMS algorithm simply minimizes the instantaneous error squared, defined as (1/2)e²(n), thereby reducing the storage requirement to the minimum possible. In particular, it does not require storing any more information than is present in the weights of the filter.

TABLE 6.1 Summary of the LMS Algorithm

1. Initialization. Set
   ŵ_k(1) = 0 for k = 1, 2, ..., p.
2. Filtering. For time n = 1, 2, ..., compute
   y(n) = Σ_{j=1}^{p} ŵ_j(n) x_j(n)
   e(n) = d(n) - y(n)
   ŵ_k(n+1) = ŵ_k(n) + η e(n) x_k(n)  for k = 1, 2, ..., p
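A sketch of Table 6.1 in code (NumPy; the input matrix X, desired response d and step size eta are illustrative assumptions):

    import numpy as np

    def lms(X, d, eta=0.05):
        n_steps, p = X.shape
        w = np.zeros(p)                  # 1) initialization: w_k(1) = 0
        for n in range(n_steps):         # 2) filtering, for time n = 1, 2, ...
            y = w @ X[n]                 # filter output y(n)
            e = d[n] - y                 # error e(n) = d(n) - y(n)
            w = w + eta * e * X[n]       # update w_k(n+1) = w_k(n) + eta e(n) x_k(n)
        return w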

It is also important to recognize that the LMS algorithm can operate in a stationary or nonstationary environment. By a "nonstationary" environment we mean one in which the statistics vary with time. In such a situation the optimum solution assumes a time-varying form, and the LMS algorithm therefore has the task of not only seeking the minimum point of the error surface but also tracking it. In this context, the smaller we make the learning-rate parameter η, the better will be the tracking behaviour of the algorithm. However, this improvement in performance is attained at the cost of a slow adaptation rate (Haykin, 1991; Widrow and Stearns, 1985).
Signal-Flow Graph Representation of the LMS Algorithm

Equation (6.22) provides a complete description of the time evolution of the weights in the LMS algorithm. Rewriting the second line of this equation in matrix form, we may express it as follows:

ŵ(n+1) = ŵ(n) + η [d(n) - x^T(n) ŵ(n)] x(n)    (6.24)

where

ŵ(n) = [ŵ_1(n), ŵ_2(n), ..., ŵ_p(n)]^T    (6.25)

and

x(n) = [x_1(n), x_2(n), ..., x_p(n)]^T    (6.26)

Rearranging terms in Eq. (6.24), we have

ŵ(n+1) = [I - η x(n) x^T(n)] ŵ(n) + η x(n) d(n)    (6.27)

where I is the identity matrix. In using the LMS algorithm, we note that

ŵ(n) = z^{-1}[ŵ(n+1)]    (6.28)

where z^{-1} is the unit-delay operator implying storage. Using Eqs. (6.27) and (6.28), we may thus represent the LMS algorithm by the signal-flow graph depicted in Fig. 6.4.

The signal-flow graph of Fig. 6.4 reveals that the LMS algorithm is an example of a stochastic feedback system. The presence of feedback has a profound impact on the convergence behaviour of the LMS algorithm, as discussed next.

FIGURE 6.4 Signal-flow graph representation of the LMS algorithm.

Q1. What is meant by the k-means clustering algorithm?
Q2. What are the algorithmic steps for k-means clustering?
Q3. Write the advantages and disadvantages of k-means clustering.
Q4. Explain the Least-Mean-Square algorithm.
Q5. Explain the signal-flow graph representation of the LMS algorithm.

Chapter 7
Hopfield networks: energy function, spurious states, error performance.

Hopfield Network

The Hopfield neural network was invented by Dr. John J. Hopfield in 1982. It consists of a single layer which contains one or more fully connected recurrent neurons. The Hopfield network is commonly used for auto-association and optimization tasks.

Discrete Hopfield Network

A Hopfield network operates in a discrete line fashion, or in other words, it can be said that the input and output patterns are discrete vectors, which can be either binary (0, 1) or bipolar (+1, -1) in nature. The network has symmetrical weights with no self-connections, i.e., w_ij = w_ji and w_ii = 0.

Architecture

Following are some important points to keep in mind about the discrete Hopfield network:

* This model consists of neurons with one inverting and one non-inverting output.
* The output of each neuron should be the input of other neurons but not the input of self.
* Weight/connection strength is represented by w_ij.
* Connections can be excitatory as well as inhibitory. It would be excitatory if the output of the neuron is the same as the input, otherwise inhibitory.
* Weights should be symmetrical, i.e. w_ij = w_ji.

The output from Y_1 going to Y_2, Y_i and Y_n has the weights w_12, w_1i and w_1n respectively. Similarly, the other arcs have their weights on them.

Training Algorithm

During training of the discrete Hopfield network, weights will be updated. As we know, we can have binary input vectors as well as bipolar input vectors. Hence, in both cases, weight updates can be done with the following relation.

Case 1 - Binary input patterns


For a set of binary patterns s(p), p = 1 to P, where s(p) = (s_1(p), s_2(p), ..., s_i(p), ..., s_n(p)), the weight matrix is given by

w_ij = Σ_{p=1}^{P} [2s_i(p) - 1][2s_j(p) - 1]    for i ≠ j

Case 2 - Bipolar input patterns

For a set of bipolar patterns s(p), p = 1 to P, where s(p) = (s_1(p), s_2(p), ..., s_i(p), ..., s_n(p)), the weight matrix is given by

w_ij = Σ_{p=1}^{P} s_i(p) s_j(p)    for i ≠ j

Step 1 - Initialize the weights, which are obtained from the training algorithm by using the Hebbian principle.
Step 2 - Perform steps 3-9, if the activations of the network are not consolidated.
Step 3 - For each input vector x, perform steps 4-8.
Step 4 - Make the initial activation of the network equal to the external input vector x as follows:
   y_i = x_i for i = 1 to n
Step 5 - For each unit Y_i, perform steps 6-9.
Step 6 - Calculate the net input of the network as follows:
   y_in_i = x_i + Σ_j y_j w_ji
Step 7 - Apply the activation as follows over the net input to calculate the output:
   y_i = 1 if y_in_i > θ_i
   y_i = y_i if y_in_i = θ_i
   y_i = 0 if y_in_i < θ_i
   Here θ_i is the threshold.
Step 8 - Broadcast this output y_i to all the other units.
Step 9 - Test the network for conjunction.
Energy Function Evaluation

An energy function is defined as a function that is a bounded and non-increasing function of the state of the system. The energy function, also called a Lyapunov function, determines the stability of the discrete Hopfield network, and is characterized as follows:

E_f = -(1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} y_i y_j w_ij - Σ_{i=1}^{n} x_i y_i + Σ_{i=1}^{n} θ_i y_i

Condition: In a stable network, whenever the state of a node changes, the above energy function will decrease. Suppose node i has changed state from y_i(k) to y_i(k+1); then the energy change ΔE_f is given by the following relation:

ΔE_f = E_f(y_i(k+1)) - E_f(y_i(k))
     = -(Σ_{j=1}^{n} w_ij y_j(k) + x_i - θ_i)(y_i(k+1) - y_i(k))
     = -(net_i) Δy_i

Here Δy_i = y_i(k+1) - y_i(k).

The change in energy depends on the fact that only one unit can update its activation at a time.
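The pieces above fit together in a few lines; the following is a minimal sketch of a bipolar discrete Hopfield net (NumPy; thresholds are taken as zero and the stored patterns are illustrative):

    import numpy as np

    def train_hopfield(patterns):
        # Hebbian weights for bipolar patterns: w_ij = sum_p s_i(p) s_j(p), w_ii = 0
        P = np.asarray(patterns)
        W = P.T @ P
        np.fill_diagonal(W, 0)
        return W

    def recall(W, x, steps=100):
        # asynchronous updates: one unit at a time until the state is stable
        y = x.copy()
        for _ in range(steps):
            prev = y.copy()
            for i in np.random.permutation(len(y)):
                net = x[i] + W[i] @ y          # y_in_i = x_i + sum_j y_j w_ji
                y[i] = 1 if net > 0 else -1    # bipolar activation, theta_i = 0
            if np.array_equal(y, prev):        # stable state reached
                break
        return y

    def energy(W, y, x):
        # E_f = -1/2 sum_ij y_i y_j w_ij - sum_i x_i y_i (thresholds zero)
        return -0.5 * y @ W @ y - x @ y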

Continuous Hopfield Network

In comparison with the discrete Hopfield network, the continuous network has time as a continuous variable. It is also used in auto-association and optimization problems such as the travelling salesman problem.

Model: The model or architecture can be built up by adding electrical components such as amplifiers which can map the input voltage to the output voltage over a sigmoid activation function.

Energy Function Evaluation :

E_f = (1/2) Σ_{i=1}^{n} Σ_{j=1, j≠i}^{n} y_i y_j w_ij - Σ_{i=1}^{n} x_i y_i + (1/λ) Σ_{i=1}^{n} Σ_{j=1, j≠i}^{n} w_ij g_{ri} ∫₀^{y_i} a⁻¹(y) dy

Here λ is the gain parameter and g_{ri} the input conductance.


Some important points about Boltzmann Machine -

* They use a recurrent structure.
* They consist of stochastic neurons, which have one of two possible states, either 1 or 0.
* Some of the neurons in this are adaptive (free state) and some are clamped (frozen state).
* If we apply simulated annealing on a discrete Hopfield network, then it would become a Boltzmann machine.


Spurious states

Spurious states are patterns x_s ∉ P, where P is the set of patterns to be memorized. In other words, they correspond to local minima in the energy function that shouldn't be there. They can be composed of various combinations of the original patterns or simply the negation of any pattern in the original pattern set. These tend to become present when α = |P|/N (where N is the number of neurons) becomes too high for a certain learning rule.

It turns out that spurious states are important for deriving the critical capacity α in Hopfield networks. Because we know that the dynamical update equations always reduce the energy of a system, spurious minima will trap the network and return incorrect or incomplete results. Typically these spurious minima have a higher energy and a smaller basin than real patterns (though this is not guaranteed when |P| is too large). This naturally leads to a stochastic solution using a Monte Carlo type approach, where enough energy is given to the neurons so that they aren't stuck in the local minima but don't jump out of the closest correct pattern minimum (these are Boltzmann machines).

Here is some hand-wavy intuition. The learning rules project the current configuration of the network into the subspace spanned by the pattern vectors and then calculate the pattern vector that lies closest to the projected configuration vector. But even if you had completely orthogonal patterns, you cannot specify more patterns than the number of neurons (because then either you duplicate a pattern or the next pattern you add isn't orthogonal).

The real problem is that most learning rules give a capacity far below this bound (e.g. the Hebb rule gives α ≈ 0.138 using mean-field derivations) because the projection into the subspace is not orthogonal. This is not an issue if the patterns themselves are orthogonal (i.e. completely uncorrelated), but that is very rarely the case in practice.

There are ways to "unlearn" these spurious minima too. Good references include the Rojas book, which is available for free online. Also, if you can get your hands on the Hertz book, look at Eq. (10.22), which is the mean-field equation whose solutions give the possible states, including spurious ones (they also give an explanation of how to find them specifically).

Exercise:
Q1. Write a short note on the Hopfield network.
Q2. Explain the discrete Hopfield network.
Q3. What are the important points of the discrete Hopfield network?
Q4. Write a note on the training algorithm with its cases.
Q5. Explain energy function evaluation.
Q6. What is meant by a continuous Hopfield network?

Chapter 8

Simulated Annealing: The Boltzmann machine, Boltzmann learning rule, Bidirectional Associative Memory
Simulated Annealing

Statistical Mechanics and the Simulated Annealing

The starting point of statistical mechanics is an energy function. We consider a physical system with a set of probabilistic states x = {x}, each of which has energy E(x). For a system at a temperature T > 0, its state x varies with time, and quantities such as E that depend on the state fluctuate. Although there must be some driving mechanism for these fluctuations, part of the idea of temperature involves treating them as random. When a system is first prepared, or after a change of parameters, the fluctuations have on average a definite direction such that the energy E decreases. However, some time later, any such trend ceases and the system just fluctuates around a constant average value. Then we say that the system is in thermal equilibrium.

A fundamental result from physics tells us that in thermal equilibrium each of the possible states x occurs with a probability determined according to the Boltzmann-Gibbs distribution,

P(x) = (1/Z) e^{-E(x)/T}

where the normalizing factor

Z = Σ_x e^{-E(x)/T}

is called the partition function; it is independent of the state x but depends on the temperature.

The Boltzmann-Gibbs distribution is usually derived from very general assumptions about the microscopic dynamics of materials. The coefficient T is related to the absolute temperature T_a of the system as

T = k_B T_a

where the coefficient k_B is Boltzmann's constant, having the value 1.38 × 10⁻¹⁶ erg/K. Interestingly enough, the same distribution can also be arrived at from the viewpoint of information theory. Although the temperature T has no physical meaning in information theory, it is interpreted as a pseudo-temperature in an abstract manner.



Given a state distribution function f_d(x), let P(x(k) = x_i) be the probability of the system being at state x_i at the present time k. Furthermore, let P(x(k+1) = x_j | x(k) = x_i) represent the conditional probability of the next state x_j given the present state x_i. The notation P(x_i) and P(x_j|x_i) will be used simply to denote these probabilities respectively. In equilibrium the state distribution and the state transition reach a balance satisfying:

P(x_j | x_i) P(x_i) = P(x_i | x_j) P(x_j)

Therefore, in equilibrium the Boltzmann-Gibbs distribution given by equation (8.1.1) results in:

P(x_j | x_i) = 1 / (1 + e^{ΔE/T})

where ΔE = E(x_j) - E(x_i).


Figure 8.1 Relation between temperature and probability of the states [Kung 93]

The Metropolis algorithm provides a simple method for simulating the evolution of a physical system in a heat bath to thermal equilibrium [Metropolis et al]. It is based on the Monte Carlo simulation technique, which aims to approximate the expected value <g(x)> of some function g(x) of a random vector x with a given density function f_d(x). For this purpose several x vectors, say X_k, k = 1..K, are randomly generated according to the density function f_d(x) and then Y_k is found as Y_k = g(X_k). By using the strong law of large numbers:

lim_{K→∞} (1/K) Σ_{k=1}^{K} Y_k = <Y> = <g(x)>

the average of the generated Y vectors can be used as an estimate of <g(x)> [Sheldon 1989].

In each step of the Metropolis algorithm, an atom (unit) of the system is subjected to a small random displacement, and the resulting change ΔE in the energy of the system is observed. If ΔE < 0, then the displacement is accepted, and the new system configuration with the displaced atom is used as the starting point for the next step of the algorithm. If, on the other hand, ΔE > 0, then the algorithm proceeds in a probabilistic manner so that the configuration with the displaced atom is accepted with a probability given by:

P(ΔE) = e^{-ΔE/T}



Provided enough transitions are made in the Metropolis algorithm, the system reaches thermal equilibrium. Thus, by repeating the basic steps of the Metropolis algorithm, we effectively simulate the motions of the atoms of a physical system at temperature T. Moreover, the choice of P(ΔE) defined in Eq. (8.1.8) ensures that thermal equilibrium is characterized by the Boltzmann-Gibbs distribution provided in Eq. (8.1.5).

Referring to Eq. (8.1.5), notice that P(x_i) > P(x_j) implies E(x_i) < E(x_j), and vice versa. So maximizing the probability function is equivalent to minimizing the energy function. Furthermore, notice that this property is independent of the temperature, although the discrimination becomes more apparent as the temperature decreases (Figure 8.1).

Therefore, the temperature parameter T provides a new free parameter for steering the step size toward the global optimum. With a high temperature, the equilibrium can be reached more rapidly. However, if the temperature is too high, all the states will have a similar level of probability. On the other hand, when T → 0, the average state becomes very close to the global minimum. This idea, though very attractive at first glance, cannot be implemented directly in practice. In fact, with a low temperature, it will take a very long time to reach equilibrium and, more seriously, the state is more easily trapped by local minima. Therefore, it is necessary to start at a high temperature and then decrease it gradually. Correspondingly, the probable states then gradually concentrate around the global minimum (Figure 8.2).

Figure 8.2 The energy levels adjusted for high and low temperature

This has an analogy with metallurgical annealing, in which a body of metal is heated near to its melting point and is then slowly cooled back down to room temperature. This process eliminates dislocations and other crystal lattice disruptions by thermal agitation at high temperature. Furthermore, it prevents the formation of new dislocations by cooling the metal very slowly. This provides the necessary time to repair any dislocations that occur as the temperature drops. The essence of this process is that the global energy function of the metal will eventually reach an absolute minimum value.

If the material is cooled rapidly, its atoms are often captured in unfavorable locations in the lattice. Once the temperature has dropped far below the melting point, these defects survive forever, since any local rearrangement of atoms costs more energy than whatever is available in thermal fluctuations. The atomic lattice thus remains captured in a local energy minimum. In order to escape from local minima and to have the lattice in the global energy minimum, the thermal fluctuations can be enhanced by reheating the material until energy-consuming local rearrangements occur at a reasonable rate. The lattice imperfections then start to move and annihilate, until the atomic lattice is free of defects - except for those caused by thermal fluctuations. These can be gradually reduced if the temperature is decreased so slowly that thermal equilibrium is maintained at all times during the cooling process. How much time must be spent on the cooling process depends on the specific situation. A great deal of experience is required to perform the annealing in an optimal way. If the temperature is decreased quickly, some thermal fluctuations are frozen in. On the other hand, if one proceeds too slowly, the process never ends.
The amazing thing about annealing is that the statistical process of thermal agitation leads to approximately the same final energy state. This result is independent of the initial condition of the metal and any of the details of the statistical annealing process. The mathematical concept of simulated annealing derives from an analogy with this physical behavior.

The simulated annealing algorithm is a variant of the Metropolis algorithm in which the temperature is time dependent. In analogy with metallurgical annealing, it starts with a high temperature and gradually decreases it. At each temperature, it applies the update rule given by Eq. (8.1.8) several times. An annealing schedule specifies a finite sequence of temperature values and a finite number of transitions attempted at each value of the temperature. The annealing schedule developed by [Kirkpatrick et al 1983] is as follows.

The initial value T_0 of the temperature is chosen high enough to ensure that virtually all proposed transitions are accepted by the simulated annealing algorithm. Then the cooling is performed. At each temperature, enough transitions are attempted so that there is a predetermined number of transitions per experiment on the average. At the end, the system is frozen and annealing stops if the desired number of acceptances is not achieved at a predetermined number of successive temperatures. In the following, we provide the annealing procedure in more detail:

SIMULATED ANNEALING

Step 1. Set initial values: assign a high value to the temperature as T(0) = T_0; decide on constants K_T, K_A and K_S, typical values for which are 0.8 < K_T < 0.99, K_A = 10, and K_S = 3.

Step 2. Decrement the temperature: T(k) = K_T T(k-1), where K_T is a constant smaller than but close to unity.

Step 3. Attempt enough transitions at each temperature, so that there are K_A accepted transitions per experiment on the average.

Step 4. Stop if the desired number of acceptances is not achieved at K_S successive temperatures; else repeat steps 2 and 3.
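A minimal sketch of this schedule (Python; energy and neighbor are user-supplied, problem-specific functions, and the cap on attempts per temperature is an arbitrary illustrative choice):

    import numpy as np

    def simulated_annealing(energy, neighbor, x0, T0=10.0, K_T=0.9, K_A=10, K_S=3):
        rng = np.random.default_rng()
        x, T, failures = x0, T0, 0
        while failures < K_S:                       # Step 4: stop after K_S failed temperatures
            accepted = 0
            for _ in range(100 * K_A):              # Step 3: attempt transitions
                x_new = neighbor(x)
                dE = energy(x_new) - energy(x)
                # Metropolis acceptance, Eq. (8.1.8)
                if dE < 0 or rng.random() < np.exp(-dE / T):
                    x, accepted = x_new, accepted + 1
                if accepted >= K_A:
                    break
            failures = failures + 1 if accepted < K_A else 0
            T = K_T * T                             # Step 2: decrement the temperature
        return x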

A very important property of simulated annealing is its asymptotic convergence. It has been proved in [Geman and Geman 84] that if T(k) at iteration k is chosen such that it satisfies

T(k) > T_0 / log(1 + k)

provided the initial temperature T_0 is high enough, then the system will converge to the minimum energy configuration. The main drawback of simulated annealing is the large amount of computational time necessary for stochastic relaxation. Many elementary transformations are performed at each temperature step in order to reach a near-equilibrium state.

Exercise:

Q1. What is meant by simulated annealing?
Q2. Explain statistical mechanics.
Q3. Explain the simulated annealing algorithm.
Q4. Write a short note on the Boltzmann learning rule.
Q5. Explain bidirectional associative memory.



Chapter 9
The Boltzmann machine, Boltzmann learning rule

9.2 Boltzmann Machine

The Boltzmann machine [Hinton et al 83] is a connectionist model having a stochastic nature. The structure of the Boltzmann machine is similar to the Hopfield network, but it adds some probabilistic component to the output function. It uses simulated annealing concepts, in spite of the deterministic nature of state transitions in the Hopfield network [Hinton et al 83, Aarts et al 1986, Allwright and Carpenter 1989, Laarhoven and Aarts 1987].

A Boltzmann machine can be viewed as a recurrent neural network consisting of N two-state units. Depending on the purpose, the states can be chosen from binary space, that is x ∈ {0,1}^N, or from bipolar space, x ∈ {-1,1}^N. The energy function of the Boltzmann machine is:

E(x) = -(1/2) Σ_i Σ_j w_ij x_i x_j - Σ_i θ_i x_i

The connections are symmetrical by definition, that is w_ij = w_ji. Furthermore, in the bipolar case the convergence of the machine requires w_ii = 0 (i.e., no self-loops). However, in the binary case self-loops are allowed.

The objective of a Boltzmann machine is to reach the global minimum of its energy function, which is the state having minimum energy. Similar to the simulated annealing algorithm, the state transition mechanism of the Boltzmann machine uses a stochastic acceptance criterion, thus allowing it to escape from its local minima. In a sequential Boltzmann machine, units change their states one by one, while they change state all together in a parallel Boltzmann machine.

Let X denote the state space of the machine, that is, the set of all possible states. Among these, state vectors differing in only one bit are called neighboring states. The neighborhood N_x ⊂ X is defined as the set of all neighboring states of x. Let x^j denote the neighboring state that is obtained from x by changing the state of neuron j. Hence, in the binary case we have

x^j_i = x_i for i ≠ j, and x^j_j = 1 - x_j,    x ∈ {0,1}^N, x^j ∈ N_x

In the bipolar case, this becomes:

x^j_i = x_i for i ≠ j, and x^j_j = -x_j,    x ∈ {-1,1}^N, x^j ∈ N_x

The difference in energy when the global state of the machine is changed from x to x^j is ΔE(x^j | x) = E(x^j) - E(x). Note that the contribution of the connections w_km, k ≠ j, m ≠ j, to E(x) and E(x^j) is identical; furthermore w_ij = w_ji. For the binary case, by using equations (9.2.1) and (9.2.2), we obtain

ΔE(x^j | x) = (2x_j - 1)(Σ_i w_ij x_i + θ_j),    x ∈ {0,1}^N

For the bipolar case it is

ΔE(x^j | x) = 2x_j (Σ_i w_ij x_i + θ_j),    x ∈ {-1,1}^N,  w_ii = 0

Therefore, the change in the energy can be computed by considering only local information. In a sequential Boltzmann machine, a trial for a state transition is a two-step process. Given a state x, first a unit j is selected as a candidate to change state. The selection probability usually has uniform distribution over the units. Then a probabilistic function determines whether a state transition will occur or not. The state x^j is accepted with probability

P(x^j | x) = 1 / (1 + e^{ΔE(x^j|x)/T})

where T is a control parameter having an analogy with temperature. Initially the temperature is set large enough to accept almost all state transitions with probability close to 0.5, and then T is decreased in time to zero (Figure 9.3). With a proper cooling schedule, the sequential Boltzmann machine converges asymptotically to a state having minimum energy.

Figure 9.3. Acceptance probability in Boltzmann machine for different temperatures

A Boltzmann machine starts execution with a random initial configuration. Initially, the value of T is very large. A cooling schedule determines how and when to decrement the control parameter. As T → 0, fewer and fewer state transitions occur. If no state transitions occur for a specified number of trials, it is decided that the Boltzmann machine has reached the final state.
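A single sequential trial can be sketched as follows (binary units, NumPy; theta holds the bias terms θ_j, and the energy difference is the binary-case formula above):

    import numpy as np

    def boltzmann_trial(x, W, theta, T, rng):
        j = rng.integers(len(x))                        # select a candidate unit uniformly
        dE = (2 * x[j] - 1) * (W[:, j] @ x + theta[j])  # energy change for flipping unit j
        if rng.random() < 1.0 / (1.0 + np.exp(dE / T)): # stochastic acceptance criterion
            x[j] = 1 - x[j]
        return x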

A state x* ∈ X is called a locally minimal state if

ΔE(x*^j | x*) ≥ 0,    j = 1, ..., N

Note that a local minimum is a state whose energy cannot be decreased by a single state transition. Let the set of all local minima be denoted by X*. While the Hopfield network is trapped mostly in one of these local minima, the Boltzmann machine can escape from the local minima because of its probabilistic nature. Although the machine asymptotically converges to a global minimum, the finite-time approximation of the Boltzmann machine prevents guaranteeing convergence to a state with minimum energy. However, still the final state of the machine will be a nearly minimal one among X*.

Use of the Boltzmann machine as a neural optimizer involves two phases, as explained for the Hopfield network in Chapter 4. In the first phase, the connection weights are determined. For this purpose, an energy function for the given application is decided. In non-constrained optimization applications, the energy function can be obtained directly from the cost function. However, in the case of constrained optimization, the energy function must be derived using both the original cost function and the constraints. The next step is to determine the connection weights {w_ij} by considering this energy function. Then, in the second phase, the machine searches for the global minimum through the annealing procedure.

Exercise:

Q1. Write a short note on the Boltzmann machine.
Q2. Explain what is meant by Boltzmann learning.
Q3. What are the rules of Boltzmann learning?
Q4. Write a short note on learning per pattern.
Q5. Explain the structure of the RBF networks.

Chapter 10

Bidirectional Associative Memory

BIDIRECTIONAL ASSOCIATIVE MEMORY (BAM)

Several versions of the heteroassociative recurrent neural network, or bidirectional associative memory (BAM), were developed by Kosko (1988, 1992a). A bidirectional associative memory [Kosko, 1988] stores a set of pattern associations by summing bipolar correlation matrices (an n by m outer product matrix for each pattern to be stored). The architecture of the net consists of two layers of neurons, connected by directional weighted connection paths. The net iterates, sending signals back and forth between the two layers until all neurons reach equilibrium (i.e., until each neuron's activation remains constant for several steps). Bidirectional associative memory neural nets can respond to input to either layer. Because the weights are bidirectional and the algorithm alternates between updating the activations for each layer, we shall refer to the layers as the X-layer and the Y-layer (rather than the input and output layers).

Figure : Bidirectional Associative Memory (BAM)

Architecture - The single-layer nonlinear feedback BAM network (with heteroassociative content-addressable memory) has n units in its X-layer and m units in its Y-layer. The connections between the layers are bidirectional; i.e., if the weight matrix for signals sent from the X-layer to the Y-layer is W, the weight matrix for signals sent from the Y-layer to the X-layer is W^T.

Discrete BAM

The two bivalent (binary or bipolar) forms of BAM are closely related. In each, the weights are found from the sum of the outer products of the bipolar form of the training vector pairs. Also, the activation function is a step function, with the possibility of a nonzero threshold. The bipolar vectors improve the performance of the net.

* The weight matrix to store a set of input and target vectors s(p) : t(p), p = 1, ..., P, where

  s(p) = (s_1(p), ..., s_i(p), ..., s_n(p));  t(p) = (t_1(p), ..., t_j(p), ..., t_m(p)),

  can be determined by the Hebb rule.

* The formulas for the entries depend on whether the training vectors are binary or bipolar.

For binary input vectors, the weight matrix W = {w_ij} is given by

w_ij = Σ_{p=1}^{P} (2s_i(p) - 1)(2t_j(p) - 1)

For bipolar input vectors, the weight matrix W = {w_ij} is given by

w_ij = Σ_{p=1}^{P} s_i(p) t_j(p)

Activation Function: The activation function for the discrete BAM is the appropriate step function, depending on whether binary or bipolar vectors are used.

* For binary input vectors, the activation function for the Y-layer is

  y_j = 1 if y_in_j > 0;  y_j = y_j if y_in_j = 0;  y_j = 0 if y_in_j < 0

  and the activation function for the X-layer is

  x_i = 1 if x_in_i > 0;  x_i = x_i if x_in_i = 0;  x_i = 0 if x_in_i < 0
* For bipolar input vectors, the activation function for the Y-layer is

  y_j = 1 if y_in_j > θ_j;  y_j = y_j if y_in_j = θ_j;  y_j = -1 if y_in_j < θ_j

  and the activation function for the X-layer is

  x_i = 1 if x_in_i > θ_i;  x_i = x_i if x_in_i = θ_i;  x_i = -1 if x_in_i < θ_i

Note that if the net input is exactly equal to the threshold value, the activation function "decides" to leave the activation of that unit at its previous value.

* The activations of all units are initialized to zero.
* The first signal is to be sent from the X-layer to the Y-layer. However, if the input signal for the X-layer is the zero vector, the input signal to the Y-layer will be unchanged by the activation function, and the process will be the same as if the first piece of information had been sent from the Y-layer to the X-layer.
* Signals are sent only from one layer to the other at any step of the process, not simultaneously in both directions.
Algorithm
1. Initialize the weights to store a set of P vectors; initialize all activations to 0.
2. For each testing input, do Steps 3-7.
3a. Present input pattern x to the X-layer (i.e., set activations of the X-layer to the current input pattern).
3b. Present input pattern y to the Y-layer. (Either of the input patterns may be the zero vector.)
4. While activations are not converged, do Steps 5-7.
5. Update activations of units in the Y-layer:
   Compute net inputs: y_in_j = Σ_i w_ij x_i
   Compute activations: y_j = f(y_in_j)
   Send signal to the X-layer.
6. Update activations of units in the X-layer:
   Compute net inputs: x_in_i = Σ_j w_ij y_j
   Compute activations: x_i = f(x_in_i)
   Send signal to the Y-layer.
7. Test for convergence: if the activation vectors x and y have reached equilibrium, then stop; otherwise, continue.
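A small sketch of the discrete bipolar BAM (NumPy; the stored pattern pairs S and T would be (P, n) and (P, m) bipolar arrays):

    import numpy as np

    def bam_weights(S, T):
        # Hebb rule for bipolar pairs: w_ij = sum_p s_i(p) t_j(p); W has shape (n, m)
        return S.T @ T

    def bam_recall(W, x, steps=20):
        # iterate x -> y -> x until both layers reach equilibrium
        step = lambda net, old: np.where(net > 0, 1, np.where(net < 0, -1, old))
        y = np.zeros(W.shape[1])
        for _ in range(steps):
            y_new = step(x @ W, y)         # Y-layer: y_in_j = sum_i x_i w_ij
            x_new = step(y_new @ W.T, x)   # X-layer: x_in_i = sum_j y_j w_ij
            if np.array_equal(x_new, x) and np.array_equal(y_new, y):
                break
            x, y = x_new, y_new
        return x, y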

Continuous BAM

A continuous bidirectional associative memory [Kosko, 1988] transforms input smoothly and continuously into output in the range [0, 1], using the logistic sigmoid function as the activation function for all units.

* For binary input vectors (s(p), t(p)), p = 1, 2, ..., P, the weights are determined by the formula

  w_ij = Σ_{p=1}^{P} (2s_i(p) - 1)(2t_j(p) - 1)

* The activation function is the logistic sigmoid

  f(y_in_j) = 1 / (1 + exp(-y_in_j))

  where a bias is included in calculating the net input to any unit,

  y_in_j = b_j + Σ_i w_ij x_i

  and corresponding formulas apply for the units in the X-layer.

A number of other forms of BAMs have been developed. In some, the activations change based on a differential equation known as Cohen-Grossberg activation dynamics (Cohen & Grossberg, 1983).

Application

Example 11: A BAM net to associate letters with simple bipolar codes

Consider the possibility of using a (discrete) BAM network (with bipolar vectors) to map two simple letters (given by 5×3 patterns) to the following bipolar codes:

The target output vector t for letter A is (-1, 1) and for the letter C it is (1, 1).

The weight matrices are:

(to store A: -1 1)   (to store C: 1 1)   (W, to store both)
  1 -1                 -1 -1               0 -2
 -1  1                  1  1               0  2
  1 -1                  1  1               2  0
 -1  1                  1  1               0  2
  1 -1                 -1 -1               0 -2
 -1  1                 -1 -1              -2  0
 -1  1                  1  1               0  2
 -1  1                 -1 -1              -2  0
 -1  1                 -1 -1              -2  0
 -1  1                  1  1               0  2
  1 -1                 -1 -1               0 -2
 -1  1                 -1 -1              -2  0
 -1  1                 -1 -1              -2  0
  1 -1                  1  1               2  0
 -1  1                  1  1               0  2

To illustrate the use of a BAM, we first demonstrate that the net gives the correct y vector when presented with the x vector for either the pattern A or the pattern C:

INPUT PATTERN A
[-1 1 -1 1 -1 1 1 1 1 1 -1 1 1 -1 1] W = [-14 16] → [-1 1]

INPUT PATTERN C
[-1 1 1 1 -1 -1 1 -1 -1 1 -1 -1 -1 1 1] W = [14 16] → [1 1]

To see the bidirectional nature of the net, observe that the y vectors can also be used as input. For signals sent from the Y-layer to the X-layer, the weight matrix is the transpose of the matrix W, i.e. W^T.

For the input vector associated with pattern A, namely (-1, 1), we have

[-1 1] W^T = [-2 2 -2 2 -2 2 2 2 2 2 -2 2 2 -2 2] → [-1 1 -1 1 -1 1 1 1 1 1 -1 1 1 -1 1]

This is pattern A.

Similarly, if we input the vector associated with pattern C, namely (1, 1), we obtain

[1 1] W^T = [-2 2 2 2 -2 -2 2 -2 -2 2 -2 -2 -2 2 2] → [-1 1 1 1 -1 -1 1 -1 -1 1 -1 -1 -1 1 1]

This is pattern C.

Exercise:

Q1. Write a short note on bidirectional associative memory.
Q2. What is meant by discrete BAM?
Q3. Explain the algorithm of BAM.
Q4. Explain continuous BAM.
Q5. What are the applications of BAM?

Chapter 11

Fuzzy Set, Properties, Operations on Fuzzy sets, Fuzzy relations


Introduction

Fuzzy Logic

Fuzzy set theory was developed by Lotfi A. Zadeh [Zadeh, 1965], professor of computer science at the University of California in Berkeley, to provide a mathematical tool for dealing with the concepts used in natural language (linguistic variables). Fuzzy logic is basically a multivalued logic that allows intermediate values to be defined between conventional evaluations.

However, the story of fuzzy logic started much earlier. To devise a concise theory of logic, and later mathematics, Aristotle posited the so-called "Laws of Thought". One of these, the "Law of the Excluded Middle," states that every proposition must either be True (T) or False (F). Even when Parmenides proposed the first version of this law (around 400 B.C.) there were strong and immediate objections: for example, Heraclitus proposed that things could be simultaneously True and not True. It was Plato who laid the foundation for what would become fuzzy logic, indicating that there was a third region (beyond T and F) where these opposites "tumbled about." A systematic alternative to the bi-valued logic of Aristotle was first proposed by Łukasiewicz around 1920, when he described a three-valued logic, along with the mathematics to accompany it. The third value he proposed can best be translated as the term "possible," and he assigned it a numeric value between T and F. Eventually, he proposed an entire notation and axiomatic system from which he hoped to derive modern mathematics.

Later, he explored four-valued logics and five-valued logics, and then declared that in principle there was nothing to prevent the derivation of an infinite-valued logic. Łukasiewicz felt that three- and infinite-valued logics were the most intriguing, but he ultimately settled on a four-valued logic because it seemed to be the most easily adaptable to Aristotelian logic. It should be noted that Knuth also proposed a three-valued logic similar to Łukasiewicz's, from which he speculated that mathematics would become even more elegant than in traditional bi-valued logic. The notion of an infinite-valued logic was introduced in Zadeh's seminal work "Fuzzy Sets," where he described the mathematics of fuzzy set theory, and by extension fuzzy logic. This theory proposed making the membership function (or the values F and T) operate over the range of real numbers [0, 1].

New operations for the calculus of logic were proposed, and shown to be in principle at least a generalization of classic logic. Fuzzy logic provides an inference morphology that enables approximate human reasoning capabilities to be applied to knowledge-based systems. The theory of fuzzy logic provides a mathematical strength to capture the uncertainties associated with human cognitive processes, such as thinking and reasoning. The conventional approaches to knowledge representation lack the means for representing the meaning of fuzzy concepts. As a consequence, the approaches based on first-order logic and classical probability theory do not provide an appropriate conceptual framework for dealing with the representation of commonsense knowledge, since such knowledge is by its nature both lexically imprecise and noncategorical. The development of fuzzy logic was motivated in large measure by the need for a conceptual framework which can address the issue of uncertainty and lexical imprecision.

Some of the essential characteristics of fuzzy logic relate to the following (Zadeh, 1992): In fuzzy logic, exact reasoning is viewed as a limiting case of approximate reasoning. In fuzzy logic, everything is a matter of degree. In fuzzy logic, knowledge is interpreted as a collection of elastic or, equivalently, fuzzy constraints on a collection of variables. Inference is viewed as a process of propagation of elastic constraints. Any logical system can be fuzzified. There are two main characteristics of fuzzy systems that give them better performance for specific applications: fuzzy systems are suitable for uncertain or approximate reasoning, especially for systems whose mathematical model is difficult to derive; and fuzzy logic allows decision making with estimated values under incomplete or uncertain information.

The theory has been attacked several times during its existence. For example, in 1972 Zadeh's colleague R. E. Kalman (the inventor of the Kalman filter) commented on the importance of fuzzy logic: "...Zadeh's proposal could be severely, ferociously, even brutally criticized from a technical point of view. This would be out of place here. But a blunt question remains: Is Zadeh presenting important ideas or is he indulging in wishful thinking?..."

The heaviest critique has been presented by probability theoreticians and that is the reason why
many fuzzy logic authors (Kosko, Zadeh and Klir) have included the comparison between probability and
fuzzy logic in their publications. Fuzzy researchers try to separate fuzzy logic from probability theory,
whereas some probability theoreticians consider fuzzy logic a probability in disguise.

Fuzzy Set

Since set theory forms a base for logic, we begin with fuzzy set theory in order to "pave the way" to fuzzy logic and fuzzy logic systems.

6.1.1 Set - Theoretical Operations and Basic Definitions

In classical set theory the membership of an element x in a set A (A is a crisp subset of the universe X) is defined by

μ_A(x) = 0 if x ∉ A,  and  μ_A(x) = 1 if x ∈ A    (6.1)

The element either belongs to the set or not. In fuzzy set theory, the element can belong to the
set partially with a degree and the set does not have crisp boundaries. That leads to the following
definition.


Definition 6.1.1 (fuzzy set, membership function) Let X be a nonempty set, for example X = R^n, called the universe of discourse. A fuzzy set A ⊆ X is characterized by the membership function

μ_A: X → [0, 1]    (6.2)

where μ_A(x) is the grade (degree) of membership of x ∈ X in set A.

From the definition we can see that fuzzy set theory is a generalized set theory that includes classical set theory as a special case. Since {0,1} ⊂ [0,1], crisp sets are fuzzy sets. The membership function (6.2) can also be viewed as a distribution of truth of a variable. In the literature the fuzzy set A is often presented as a set of ordered pairs:

A = {(x, μ_A(x)) | x ∈ X}    (6.3)

where the first part determines the element and the second part determines the grade of membership. Another way to describe a fuzzy set has been presented by Zadeh, Dubois and Prade. If X is infinite, the fuzzy set A can be expressed by

A = ∫ μ_A(x)/x    (6.4)

If X is finite, A can be expressed in the form

A = μ_A(x_1)/x_1 + ... + μ_A(x_n)/x_n = Σ_{i=1}^{n} μ_A(x_i)/x_i    (6.5)

Note :Symbolin (2.4) has nothing to do with integral (it denotes an uncountable enumeration) and /
denotes atuple. The plus sign represents the union. Also note that fuzzy sets are membership
functions. Nevertheless, we may still use the set theoretic notations like A B.


This is the name of a fuzzy set given by AB .
Example 6.1.1

Discrete case: A=0.1/x1+ 0.4/x2 + 0.8/ x3 + 1.0/ x4 + 0.8/ x5 + 0.4/ x6 + 0.1/ x7

1-
x-c , x[c - h, h]
h
x-c
Continuous case : A(x) = , x [c, c + h]
h

Unedited Version: Neural Network and Fuzzy System


4
0 , otherwise


Figure 6.1 Discrete and continuous membership functions

Definition 6.1.2 (support) The support of a fuzzy set A is the crisp set that contains all elements of A with non-zero membership grade:

supp(A) = {x ∈ X | μ_A(x) > 0}    (6.6)

If the support is finite, it is called compact support. If the support of a fuzzy set A consists of only one point, it is called a fuzzy singleton. If the membership grade of this fuzzy singleton is one, A is called a crisp singleton [Zimmermann, 1985].
Definition 6.1.3 (core) The core (nucleus, center) of a fuzzy set A is defined by

core(A) = {x ∈ X | μ_A(x) = 1}

Definition 6.1.4 (height) The height of a fuzzy set A on X is defined by

hgt(A) = sup_{x∈X} μ_A(x)

and A is called normal if hgt(A) = 1, and subnormal if hgt(A) < 1.

Note: A non-empty fuzzy set can be normalized by dividing μ_A(x) by sup_x μ_A(x). Normalizing A can be regarded as a mapping from fuzzy sets to possibility distributions [Joslyn, 1994].


The relation between a fuzzy set membership function μ, a possibility distribution and a probability distribution p: The definition p_A(x) = μ_A(x) could hold if μ is additively normal. Additively normal means here that the stochastic normalization

∫ μ_A(x) dx = 1

would have to be satisfied. So it can be concluded that any given fuzzy set could define either a probability distribution or a possibility distribution, depending on the properties of μ. Both probability distributions and possibility distributions are special cases of fuzzy sets. All general distributions are in fact fuzzy sets [Joslyn, 1994].

Definition 6.1.5 (convex fuzzy set) A fuzzy set A is convex if

∀x, y ∈ X and ∀λ ∈ [0, 1]:  μ_A(λx + (1 - λ)y) ≥ min(μ_A(x), μ_A(y))

Definition 6.1.6 (width of a convex fuzzy set) The width of a convex fuzzy set A is defined by

width(A) = sup(supp(A)) - inf(supp(A))

Definition 6.1.7 (α-cut) The α-cut of a fuzzy set A is defined by

A_α = {x ∈ X | μ_A(x) ≥ α}

Figure 2.2 An α-cut of a triangular fuzzy number

Definition 6.1.8 (fuzzy partition) A set of fuzzy sets {A_i} is called a fuzzy partition if

∀x ∈ X:  Σ_{i=1}^{N_A} μ_{A_i}(x) = 1

provided the A_i are nonempty and subsets of X.


Definition 6.1.9 (fuzzy number) A fuzzy set A (subset of the real line R) is a fuzzy number if the fuzzy set is convex and normal, its membership function is piecewise continuous, and its core consists of one value only. The family of fuzzy numbers is denoted by F. In many situations people are only able to characterize numeric information imprecisely. For example, people use terms such as about 5000, near zero, or essentially bigger than 5000. These are examples of what are called fuzzy numbers. Using the theory of fuzzy subsets we can represent these fuzzy numbers as fuzzy subsets of the set of real numbers.

Figure 6.3 Fuzzy number

Figure 6.4 Non-fuzzy number

Note: A fuzzy number is always a fuzzy set, but a fuzzy set is not always a fuzzy number.

An example of a fuzzy number is 'about 1', defined by μ_A(x) = exp(-β(x - 1)²) with β > 0. It is also a quasi fuzzy number because lim_{x→±∞} μ_A(x) = 0.

Definition 6.1.10 (fuzzy interval) A fuzzy interval is a fuzzy set with the same restrictions as in Definition 6.1.9, but the core does not need to be one point only. Fuzzy intervals are direct generalizations of crisp intervals [a, b] ⊂ R.

Definition 6.1.11 (LR-representation of fuzzy numbers) Any fuzzy number can be described by

μ_A(x) = L((a - x)/α)  for x ∈ [a - α, a]
μ_A(x) = 1             for x ∈ [a, b]
μ_A(x) = R((x - b)/β)  for x ∈ [b, b + β]
μ_A(x) = 0             otherwise

where [a, b] is the core of A, and L: [0,1] → [0,1], R: [0,1] → [0,1] are shape functions (called briefly s-functions) that are continuous and non-increasing, such that L(0) = R(0) = 1 and L(1) = R(1) = 0, where L stands for the left-hand side and R for the right-hand side of the membership function [Zimmermann, 1993].
Definition 6.1.12 (LR-representation of quasi fuzzy numbers) Any quasi fuzzy number can be described by

μ_A(x) = L((a - x)/α)  for x ≤ a
μ_A(x) = 1             for x ∈ [a, b]
μ_A(x) = R((x - b)/β)  for x ≥ b

where [a, b] is the core of A, and L: [0,∞) → [0,1], R: [0,∞) → [0,1] are shape functions that are continuous and non-increasing, such that L(0) = R(0) = 1, and they approach zero: lim_{x→∞} L(x) = 0, lim_{x→∞} R(x) = 0.

For example, f(x) = e^{-x}, f(x) = e^{-x²} and f(x) = max(0, 1 - x) are such shape functions. In the following, the classical set theoretic operations are extended to fuzzy sets.

Definition 6.1.13 (set theoretic operations)

μ_∅(x) = 0 (empty set)
μ_X(x) = 1 (basic set, universe)
A = B ⟺ μ_A(x) = μ_B(x) ∀x ∈ X (identity)
A ⊂ B ⟺ μ_A(x) ≤ μ_B(x) ∀x ∈ X (subsethood)
∀x ∈ X: μ_{A∪B}(x) = max(μ_A(x), μ_B(x)) (union)
∀x ∈ X: μ_{A∩B}(x) = min(μ_A(x), μ_B(x)) (intersection)
∀x ∈ X: μ_{A^c}(x) = 1 - μ_A(x) (complement)

Union could also be represented by A ∪ B = {(x, max(μ_A(x), μ_B(x))) | x ∈ X}. The same notation could also be used with intersection and complement.
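On a discrete universe these operations reduce to elementwise max/min over the membership grades; a small sketch (Python dicts; the two fuzzy sets are made up):

    def f_union(A, B):
        return {x: max(A.get(x, 0), B.get(x, 0)) for x in sorted(set(A) | set(B))}

    def f_intersection(A, B):
        return {x: min(A.get(x, 0), B.get(x, 0)) for x in sorted(set(A) | set(B))}

    def f_complement(A):
        return {x: 1 - m for x, m in A.items()}

    A = {'x1': 0.1, 'x2': 0.4, 'x3': 0.8}
    B = {'x2': 0.7, 'x3': 0.2, 'x4': 1.0}
    print(f_union(A, B))         # {'x1': 0.1, 'x2': 0.7, 'x3': 0.8, 'x4': 1.0}
    print(f_intersection(A, B))  # {'x1': 0.0, 'x2': 0.4, 'x3': 0.2, 'x4': 0.0}
    print(f_complement(A))       # {'x1': 0.9, 'x2': 0.6, 'x3': 0.2} (up to float rounding)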


Figure 6.5 Fuzzy sets: A is a subset of B; the graph of the universal fuzzy subset in X = [0, 10].

Theorem 2.1.1 The following properties of set theory are valid:

(A^c)^c = A (involution)
A ∪ B = B ∪ A,  A ∩ B = B ∩ A (commutativity)
(A ∪ B) ∪ C = A ∪ (B ∪ C),  (A ∩ B) ∩ C = A ∩ (B ∩ C) (associativity)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),  A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) (distributivity)
A ∪ A = A,  A ∩ A = A (idempotence)
A ∩ (A ∪ B) = A,  A ∪ (A ∩ B) = A (absorption)
(A ∪ B)^c = A^c ∩ B^c,  (A ∩ B)^c = A^c ∪ B^c (De Morgan's laws)

Proof: The above properties can be proved by simple direct calculations. For example, for involution: μ_{(A^c)^c}(x) = 1 - (1 - μ_A(x)) = μ_A(x).
Fuzzy Relations

Definition 2.1.17 (fuzzy relation) A fuzzy relation is characterized by a function

μ_R: X_1 × ... × X_m → [0, 1]

where the X_i are the universes of discourse and X_1 × ... × X_m is the product space. If we have two finite universes, the fuzzy relation can be presented as a matrix (a fuzzy matrix) whose elements are the intensities of the relation; R has the membership function μ_R(u, v), where u ∈ X_1, v ∈ X_2. Two fuzzy relations are combined by a so-called sup-* or max-min composition, which will be given in Definition 2.1.19.

Figure 2.7 Shadows of a fuzzy relation

Note: Fuzzy relations are fuzzy sets, and so the operations of fuzzy sets (union, intersection, etc.) can be applied to them.

Example 2.1.2 Let the fuzzy relation R = "approximately equal" correspond to the equality of two numbers. The intensity of cell μ_R(u, v) of the following matrix can be interpreted as the degree of membership of the ordered pair in R. The numbers to be compared are {1,2,3,4} and {3,4,5,6}.

v \ u    1    2    3    4
  3     .6   .8   1    .8
  4     .4   .6   .8   1
  5     .2   .4   .6   .8
  6     .1   .2   .4   .6

The matrix shows that the pair (4, 4) is approximately equal with intensity 1 and the pair (1, 6) is approximately equal with intensity 0.1.

Definition 2.1.18 (Cartesian product) Let A_i ⊂ X_i be fuzzy sets. Then the Cartesian product is defined by

μ_{A_1 × ... × A_n}(x_1, ..., x_n) = min_i {μ_{A_i}(x_i)}

Note: min can be replaced by a more general t-norm.

Figure 2.8 Cartesian product of two fuzzy sets is a fuzzy relation in X × Y


Definition 2.1.19 (sup-*-composition, max-min composition) The composition of two relations R ∘ S is defined by the membership function on X × Y

μ_{R∘S}(x, y) = sup_v T(μ_R(x, v), μ_S(v, y))    (2.15)

where R is a relation in X × V and S is a relation in V × Y, x ∈ X, y ∈ Y, and T is a t-norm; in the sup-sum composition, sup is replaced by the sum. If S is just a fuzzy set (not a relation) in V, then (2.15) becomes

μ_{R∘S}(x) = sup_v T(μ_R(x, v), μ_S(v))

Example 2.1.3 Let X = {1,2,3,4}, fuzzy set A = small = {(1,1), (2,0.6), (3,0.2), (4,0)} and fuzzy relation R = "approximately equal":

R =  1   .5   0   0
     .5  1   .5   0
     0   .5  1   .5
     0   0   .5   1

A ∘ R = max_x min{μ_A(x), μ_R(x, y)} = {(1,1), (2,.6), (3,.5), (4,.2)}

The interpretation of the example: x is small. If x and y are approximately equal, then y is more or less small.
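Example 2.1.3 can be checked in a couple of lines (NumPy):

    import numpy as np

    A = np.array([1.0, 0.6, 0.2, 0.0])       # fuzzy set "small" on X = {1,2,3,4}
    R = np.array([[1.0, 0.5, 0.0, 0.0],      # relation "approximately equal"
                  [0.5, 1.0, 0.5, 0.0],
                  [0.0, 0.5, 1.0, 0.5],
                  [0.0, 0.0, 0.5, 1.0]])

    # max-min composition: (A o R)(y) = max_x min(mu_A(x), mu_R(x, y))
    AoR = np.max(np.minimum(A[:, None], R), axis=0)
    print(AoR)                               # [1.  0.6 0.5 0.2]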

Exercise:

Q1. What is meant by fuzzy logic?
Q2. Explain fuzzy sets.
Q3. Explain fuzzy relations.
Q4. Write the definition of a fuzzy number.
Q5. Write the definition of the LR-representation of fuzzy numbers.

Chapter 12
The extension principle, fuzzy measures, Membership function’s

The Extension Principle

The extension principle is said to be one of the most important tools in fuzzy logic. It gives the means to generalize non-fuzzy concepts, e.g. mathematical operations, to fuzzy sets. Any fuzzifying generalization must be consistent with the crisp cases.

Definition 2.1.20 (extension principle) Let A_1, ..., A_n be fuzzy sets defined on X_1, ..., X_n, and let f be a function f: X_1 × ... × X_n → V. The extension of f operating on A_1, ..., A_n gives the membership function (fuzzy set F)

μ_F(v) = sup_{(u_1,...,u_n) ∈ f⁻¹(v)} min(μ_{A_1}(u_1), ..., μ_{A_n}(u_n))

when the inverse of f exists. Otherwise, define μ_F(v) = 0. The function f is called the inducing mapping.

If the domain is either discrete or compact, sup-min can be replaced by max-min. On continuous domains, the sup-operation and the operation that satisfies the criterion

S_w(x, y) = x if y = 0;  y if x = 0;  1 otherwise

should be used [Driankov et al., 1993].
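For a discrete universe and a single-argument f, the extension principle reduces to a max over the preimage of each output value; a small sketch (Python; the fuzzy set "about zero" and the function are made up):

    def extend(f, A):
        # mu_F(v) = max over all u with f(u) = v of mu_A(u)
        F = {}
        for u, mu in A.items():
            v = f(u)
            F[v] = max(F.get(v, 0.0), mu)
        return F

    A = {-2: 0.2, -1: 0.6, 0: 1.0, 1: 0.6, 2: 0.2}   # fuzzy "about zero"
    print(extend(lambda u: u * u, A))                # {4: 0.2, 1: 0.6, 0: 1.0}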

Fuzzy Rules

Fuzzy logic was originally meant to be a technique for modeling human thinking and reasoning, which is done by fuzzy rules. This idea has been replaced by the thought that fuzzy rules form an interface between humans and computers [Brown & Harris, 1994]. Humans explain their actions and knowledge using linguistic rules, and fuzzy logic is used to represent this knowledge on computers. There are three principal ways to obtain these rules:

1. human experts provide rules
2. data driven: rules are formed by training methods
3. a combination of 1 and 2.
The first way is the ideal case for fuzzy systems. Although the rules are not precise, they contain important
information about the system. In practice human experts may not provide a sufficient number of rules, and especially in
the case of complex systems the amount of knowledge may be very small or even non-existent. Thus the second way
must be used instead of the first one (provided the data is available). The third way is suited for the cases when some
knowledge exists and a sufficient amount of data for training is available. In this case fuzzy rules obtained from experts roughly
approximate the behavior of the system, and by applying training this approximation is made more precise. Rules
provided by the experts form an initial point for the training and thus exclude the necessity of random initialization and
diminish the risk of getting stuck in a local minimum (provided the expert knowledge is good enough).

It has been shown in [Mouzouris, 1996] that linguistic information is important in the absence of
sufficient numerical data but it becomes less important as more data become available.

Fuzzy rules define the connection between input and output fuzzy linguistic variables, and they
can be seen to act as associative memories: resembling inputs are converted to resembling outputs.
Rules have a structure of the form:

IF (antecedent) THEN (consequent)

In more detail, the structure is

IF (x1 is Ai1 AND ... AND xd is Aid) THEN (y is Bi)        (2.34)

where the Aij and Bi are fuzzy sets (they define complete fuzzy partitions) in Uj ⊂ R and V ⊂ R, respectively.
The linguistic variable x is a vector of dimension d in U = U1 × ... × Ud, and the linguistic variable y ∈ V.
Vector x is an input to the fuzzy system and y is an output of the fuzzy system. Note that Bi can also be a
singleton (the consequence part becomes: ...THEN (y is zi)). Further, if the fuzzy system is used as a
classifier, the consequence part becomes: ...THEN class is c.

A fuzzy rule base consists of a collection of rules {R1, R2, ..., RM}, where each rule Ri can be considered to be
of the form (2.34). This does not cause a loss of generality, since a multi-input multi-output (MIMO) fuzzy logic system can
always be decomposed into a group of multi-input single-output (MISO) fuzzy logic systems. Furthermore, (2.34) includes
the following types of rules as special cases [Wang, 1994]:


 IF (x1 is Ai1 AND ... AND xk is Aik) THEN (y is Bi), where k < d

 IF ((x1 is Ai1 AND ... AND xk is Aik) OR (xk+1 is Ai,k+1 AND ... AND xd is Aid)) THEN (y is Bi)
 (y is Bi)
 (y is Bi) UNLESS (x1 is Ai1 AND ... AND xd is Aid)

 non-fuzzy rules

In control systems the production rules are of the form

IF <process state> THEN <control action>        (2.35)

where the <process state> part contains a description of the process output at the kth sampling
instant. Usually this description contains values of error and change-of-error. The <control
action> part describes the control output (change-in-control) which should be produced given
the particular <process state>. For example, a fuzzified PI-controller is of the type (2.35).
Important properties for a set of rules are completeness, consistency and continuity. These are
defined in the following.

Definition 6.1.27 (completeness) A rule base is said to be complete if any combination of input
values results in an appropriate output value:

∀x ∈ X : hgt(out(x)) > 0

Definition 6.1.28 (inconsistency) A rule base is inconsistent if there are two rules with the same
rule antecedent but different rule consequences.

This means that two rules that have the same antecedent map to two non-overlapping
fuzzy output sets. When the output sets are non-overlapping, then there is something wrong
with the output variables or the rules are inconsistent or discontinuous.

Definition 6.1.29 (continuity) A rule base is continuous if the neighboring rules do not have fuzzy output sets
that have empty intersection.
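The inconsistency of Definition 6.1.28 can be tested mechanically: scan the rule base for two rules that share an antecedent but differ in their consequences. A small Python sketch with a made-up rule base:

# Inconsistency check (Definition 6.1.28): same antecedent, different consequent.
# Rules are (antecedent, consequent) pairs; this rule base is hypothetical.
rules = [
    (("cold", "low"), "P2"),
    (("cold", "ok"),  "Z"),
    (("cold", "low"), "N1"),   # conflicts with the first rule
]

seen = {}
for antecedent, consequent in rules:
    if antecedent in seen and seen[antecedent] != consequent:
        print("inconsistent:", antecedent, "->", seen[antecedent], "and", consequent)
    seen[antecedent] = consequent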

Exercise:
Q1. What is meant by the extension principle?
Q2. What are fuzzy rules?
Q3. Write the definition of completeness.
Q4. Write the definition of inconsistency.

Chapter 13
Fuzzification and defuzzification methods, fuzzy controllers

Fuzzifier and Defuzzifier

The fuzzifier maps a crisp point x' into a fuzzy set

A'(x) = 1, if x = x'
        0, otherwise        (2.43)

where x' is the input. A fuzzifier of the form (2.43) is called a singleton fuzzifier. If the input contains noise,
it can be modeled by using a fuzzy number; such a fuzzifier could be called a nonsingleton fuzzifier. Because
of its simplicity, the singleton fuzzifier is preferred to the nonsingleton fuzzifier.
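A minimal Python sketch of the two fuzzifier types; the triangular shape and the half-width w used to model input noise in the nonsingleton case are assumptions for illustration:

def singleton_fuzzifier(x_prime):
    # A'(x) = 1 if x = x', 0 otherwise (form (2.43)).
    return lambda x: 1.0 if x == x_prime else 0.0

def nonsingleton_fuzzifier(x_prime, w=0.2):
    # Noisy input modeled as a triangular fuzzy number centered on x'
    # (shape and half-width w are assumed, not prescribed by the text).
    return lambda x: max(0.0, 1.0 - abs(x - x_prime) / w)

A1 = singleton_fuzzifier(0.5)
A2 = nonsingleton_fuzzifier(0.5)
print(A1(0.5), A1(0.6))   # 1.0 0.0
print(A2(0.5), A2(0.6))   # 1.0 0.5 (up to floating-point rounding)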

Figure 6.15 Fuzzy logic controller. Figure 2.16 Fuzzy singleton as fuzzifier

The defuzzifier maps fuzzy sets to a crisp point. Several defuzzification methods have been suggested.
The following five are the most common :

 Center of Gravity (CoG): In the case of 1-dimensional fuzzy sets it is often called the Center of Area (CoA) method.
Some authors (for example, [Driankov et al., 1993]) regard CoG and CoA as the same method, while others (for
example, [Jager, 1995]) give them different forms. If CoA is calculated by dividing the area of the combination of output
membership functions by two and then taking from the left so much that we get an area equal to the right one, then
it is clearly a distinct method. CoG determines the center of gravity of the mass which is formed as a combination of the
clipped or scaled output fuzzy membership functions. The intersection part of these membership functions can be
taken once or twice into the calculation. Driankov [1993] separates the Center of Gravity methods such that, if the
intersection is calculated once the method is Center of Area, and if it is calculated twice the method is called Center
of Sums (CoS). In Fig. 2.17 the defuzzified value obtained by CoS is slightly smaller than that obtained by the CoA method.

 Height method (HM): Can be considered as a special case of CoG whose output membership functions
are singletons. If symmetric output sets are used in CoG, they have the same centroid no matter how wide the set is,
and CoG reduces to HM. HM calculates a normalized weighted sum of the clipped or scaled singletons. HM is
sometimes called the center average defuzzifier or fuzzy-mean defuzzifier.

 Middle of Maxima (MoM): Calculates the center of the maximum in the clipped membership function. The
rest of the distribution is considered unimportant.

 First of Maxima (FoM): As MoM, but takes the leftmost value instead of the center value.


Figure 6.17 Defuzzification methods

Defuzzification methods can be compared by some criteria, which might be the continuity of the output and the
computational complexity. HM, CoA and CoS produce continuous outputs. The simplest and quickest of these is
the HM method, and for large problems it is the best choice. The fuzzy systems using it have a close relation to some
well-known interpolation methods (we will return to this relation later).
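To make the methods concrete, the following Python sketch computes a CoG value over a sampled output membership function and an HM value from rule singletons; all of the numbers are invented for illustration:

# Two defuzzification methods on invented sample data.

# Center of Gravity (CoG) over a sampled combined output membership function:
y  = [0.0, 0.25, 0.5, 0.75, 1.0]    # sample points of the output variable
mu = [0.1, 0.4,  0.9, 0.6,  0.2]    # membership values at those points
cog = sum(yi * mi for yi, mi in zip(y, mu)) / sum(mu)

# Height method (HM): each rule contributes a singleton at center c[i],
# weighted by its firing strength h[i] (a normalized weighted sum):
c = [0.0, 0.5, 1.0]                 # hypothetical output-set centers
h = [0.2, 0.9, 0.3]                 # hypothetical firing strengths
hm = sum(hi * ci for hi, ci in zip(h, c)) / sum(h)

print(round(cog, 3), round(hm, 3))  # 0.545 0.536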

The maximum methods (MoM, FoM) have been widely used. The underlying idea of MoM (with max-min
inference) can be explained as follows. Each input variable is divided into a number of intervals, which means that the
whole input space is divided into a large number of d-dimensional boxes. If a new input point is given, the corresponding
value for y is determined by finding which box the point falls in and then returning the average value of the
corresponding y-interval associated with that input box. Because of the piecewise constant output, MoM is inefficient
for approximating nonlinear continuous functions. Kosko has shown [Kosko, 1997] that if there are many rules that fire
simultaneously, the maximum function tends to approach a constant function. This may cause problems, especially in
control.

The property of CoS is that the shape of the final membership function used as a basis for defuzzification will
resemble a normal density function more and more as the number of functions used in the summation grows.
Systems of this type, which sum up the output membership functions to form the final output set, are called additive
fuzzy systems.

Fuzzy logic is widely used in machine control. The term “fuzzy” refers to the fact that the logic
involved can deal with concepts that cannot be expressed as “true” or “false” but rather as “partially
true”. Although alternative approaches such as genetic algorithms and neural networks can perform just
as well as fuzzy logic in many cases, fuzzy logic has the advantage that the solution to the problem can be
cast in terms that human operators can understand, so that their experience can be used in the design of
the controller. This makes it easier to mechanize tasks that are already successfully performed by
humans.

Fuzzy controllers are very simple conceptually. They consist of an input stage, a processing stage,
and an output stage. The input stage maps sensor or other inputs, such as switches, thumbwheels, and
so on, to the appropriate membership functions and truth values. The processing stage invokes each
appropriate rule and generates a result for each, then combines the results of the rules. Finally, the
output stage converts the combined result back into a specific control output value.

The most common shape of membership functions is triangular, although trapezoidal and bell curves are also

used, but the shape is generally less important than the number of curves and their placement. From three to seven
curves are generally appropriate to cover the required range of an input value, or the “universe of
discourse” in fuzzy jargon.

As discussed earlier, the processing stage is based on a collection of logic rules in the form of IF-THEN
statements, where the IF part is called the “antecedent” and the THEN part is called the
“consequent”. Typical fuzzy control systems have dozens of rules.

Consider a rule for a thermostat :

IF (temperature is “cold”) THEN (heater is “high”)

This rule uses the truth value of the “temperature” input, which is some truth value of “cold”, to generate
a result in the fuzzy set for the “heater” output, which is some value of “high”. This result is used with the results of
other rules to finally generate the crisp composite output. Obviously, the greater the truth value of “cold”, the
higher the truth value of “high”, though this does not necessarily mean that the output itself will be set to “high”,
since this is only one rule among many. In some cases, the membership functions can be modified by “hedges” that
are equivalent to adverbs. Common hedges include “about”, “near”, “close to”, “approximately”, “very”, “slightly”,
“too”, “extremely”, and “somewhat”. These operations may have precise definitions, though the definitions can
vary considerably between different implementations. “Very”, for one example, squares membership functions;
since the membership values are always less than 1, this narrows the membership function. “Extremely” cubes the
values to give greater narrowing, while “somewhat” broadens the function by taking the square root.

In practice, the fuzzy rule sets usually have several antecedents that are combined using fuzzy operators, such as
AND, OR, and NOT, though again the definitions tend to vary: AND, in one popular definition, simply uses the minimum
weight of all the antecedents, while OR uses the maximum value. There is also a NOT operator that subtracts a
membership function from 1 to give the “complementary” function.
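These operator and hedge definitions translate directly into code. A sketch under the min/max/complement and squaring/cubing/square-root definitions given above:

# Common fuzzy operators and hedges, as defined in the text.
import math

def f_and(a, b): return min(a, b)      # AND: minimum of the antecedent truths
def f_or(a, b):  return max(a, b)      # OR: maximum of the antecedent truths
def f_not(a):    return 1.0 - a        # NOT: complement

def very(a):      return a ** 2        # "very" squares the membership value
def extremely(a): return a ** 3        # "extremely" cubes it
def somewhat(a):  return math.sqrt(a)  # "somewhat" takes the square root

mu_cold = 0.64
print(very(mu_cold))      # ~0.41 (narrower)
print(somewhat(mu_cold))  # 0.8   (broader)
print(f_not(mu_cold))     # ~0.36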

There are several ways to define the result of a rule, but one of the most common and simplest is the

“max-min” inference method, in which the output membership function is given the truth value generated by the

premise.

Rules can be solved in parallel in hardware, or sequentially in software. The results of all the rules
that have fired are “defuzzified” to a crisp value by one of several methods. There are dozens, in theory,
each with various advantages or drawbacks.

The “centroid” method is very popular, in which the “center of mass” of the result provides the
crisp value. Another approach is the “height” method, which takes the value of the biggest contributor.
The centroid method favors the rule with the output of greatest area, while the height method obviously
favors the rule with the greatest output value.

The diagram below demonstrates max-min inferencing and centroid defuzzification for a system
with input variables “x”, “y”, and “z” and an output variable “n”. Note that “mu” is standard fuzzy-logic
nomenclature for “truth value”:


Notice how each rule provides a result as a truth value of a particular membership function for the output
variable. In centroid defuzzification the values are OR’d, that is, the maximum value is used and values are not
added, and the results are then combined using a centroid calculation.

Fuzzy control system design is based on empirical methods, basically a methodical approach to trial and error.
The general process is as follows:

 Document the system’s operational specifications and inputs and outputs.

 Document the fuzzy sets for the inputs.

 Document the rule set.

 Determine the defuzzification method.

 Run through test suite to validate system, adjust details as required.

 Complete document and release to production.


As a general example, consider the design of a fuzzy controller for a steam turbine. The block
diagram of this control system appears as follows :
The input and output variables map into the following fuzzy set :

N3 : Large negative.

N2 : Medium negative.

N1 : Small negative.

Z : Zero.

P1 : Small positive.

P2 : Medium positive.

P3 : Large positive.

The rule set includes such rules as:

rule 1 : IF temperature IS cool AND pressure IS weak, THEN throttle is P3.

rule 2 : IF temperature IS cool AND pressure IS low, THEN throttle is P2.

rule 3 : IF temperature IS cool AND pressure IS ok, THEN throttle is Z.

rule 4 : IF temperature IS cool AND pressure IS strong, THEN throttle is N2.


In practice, the controller accepts the inputs and maps them into their membership functions and truth values.
These mappings are then fed into the rules. If the rule specifies an AND relationship between the mappings of the two
input variables, as the examples above do, the minimum of the two is used as the combined truth value; if an OR is
specified, the maximum is used. The appropriate output state is selected and assigned a membership value at the truth
level of the premise. The truth values are then defuzzified. For an example, assume the temperature is in the “cool”
state, and the pressure is in the “low” and “ok” states. The pressure values ensure that only rules 2 and 3 fire:

The two outputs are then defuzzified through centroid defuzzification:



The output value will adjust the throttle and then the control cycle will begin again to generate the next value.

Building a fuzzy controller

Consider implementing a simple feedback controller with a microcontroller chip:

A fuzzy set is defined for the input error variable “e”, and the derived change in error, “delta”,
as well as the “output”, as follows:
LP : large positive
SP : small positive
ZE : zero
SN : small negative
LN : large negative

If the error ranges from -1 to +1, with the analog-to-digital converter used having a resolution of 0.25,
then the input variable’s fuzzy set (which, in this case, also applies to the output variable) can be
described very simply as a table, with the error / delta / output values in the top row and the truth
values for each membership function arranged in rows beneath :

         -1    -0.75  -0.5   -0.25  0     0.25  0.5   0.75  1

mu (LP)   0     0      0      0     0     0     0.3   0.7   1
mu (SP)   0     0      0      0     0.3   0.7   1     0.7   0.3
mu (ZE)   0     0      0.3    0.7   1     0.7   0.3   0     0
mu (SN)   0.3   0.7    1      0.7   0.3   0     0     0     0
mu (LN)   1     0.7    0.3    0     0     0     0     0     0

or, in graphical form (where each “X” has a value of 0.1):
LN SN ZE SP LP
+------------------------------------------------------------------------------------------------------- +
-1.0 | XXXXXXXXXX XXX : : : |
-0.75 | XXXXXXX XXXXXXX : : : |
-0.5 | XXX XXXXXXXXXX XXX : : |
-0.25 | : XXXXXXX XXXXXXX : : |
0.0 | : XXX XXXXXXXXXX XXX : |
0.25 | : : XXXXXXX XXXXXXX : |
0.5 | : : XXX XXXXXXXXXX XXX |
0.75 | : : : XXXXXXX XXXXXXX |
1.0 | : : : XXX XXXXXXXXXX |
| |
+-------------------------------------------------------------------------------------------------------+


Suppose this fuzzy system has the following rule base :

rule 1 : IF e = ZE AND delta = ZE THEN output = ZE


rule 2 : IF e = ZE AND delta = SP THEN output = SN
rule 3 : IF e = SN AND delta = SN THEN output = LP
rule 4 : IF e = LP OR delta = LP THEN output = LN
These rules are typical for control applications in that the antecedents consist of the logical combination
of the error and error-delta signals, while the consequent is a control command output. The rule outputs
can be defuzzified using a discrete centroid computation :

SUM ( I = 1 TO 4 OF ( mu(I) * output(I) ) ) / SUM( I = 1 TO 4 OF mu(I) )


Now, suppose that at a given time we have:
e = 0.25
delta = 0.5
Then this gives :
         e     delta

mu (LP)  0     0.3
mu (SP)  0.7   1
mu (ZE)  0.7   0.3
mu (SN)  0     0
mu (LN)  0     0

Plugging this into rule 1 gives :

rule 1 : IF e = ZE AND delta = ZE THEN output = ZE


mu (1) = MIN ( 0.7, 0.3 ) = 0.3
output(1) = 0

-- where :

 mu(1) : Truth value of the result membership function for rule 1. In terms of a centroid calculation, this is
the “mass” of this result for this discrete case.

 output(1) : Value (for rule 1) where the result membership function (ZE) is maximum over the output variable fuzzy
set range. That is, in terms of a centroid calculation, the location of the “center of mass” for this individual result. This
value is independent of the value of “mu”. It simply identifies the location of ZE along the output range.

The other rules give :


rule 2 : IF e = ZE AND delta = SP THEN output = SN
mu (2) = MIN ( 0.7, 1 ) = 0.7
output (2) = -0.5
rule 3 : IF e = SN AND delta = SN THEN output = LP
mu (3) = MIN ( 0.0, 0.0 ) = 0
output (3) = 1

rule 4 : IF e = LP OR delta = LP THEN output = LN
mu (4) = MAX ( 0.0, 0.3 ) = 0.3
output (4) = -1


The centroid computation yields:

( 0.3*0 + 0.7*(-0.5) + 0*1 + 0.3*(-1) ) / ( 0.3 + 0.7 + 0 + 0.3 ) = -0.65 / 1.3 = -0.5

for the final control output. Simple. Of course the hard part is figuring out what rules actually
work correctly in practice.
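The worked example above can be verified with a few lines of Python; this sketch reads the membership values from the table, fires the four rules, and applies the discrete centroid (the variable names are ours):

# Rule firing and discrete centroid defuzzification for the example above.
# Membership values at e = 0.25 and delta = 0.5, read from the table:
e     = {"LP": 0.0, "SP": 0.7, "ZE": 0.7, "SN": 0.0, "LN": 0.0}
delta = {"LP": 0.3, "SP": 1.0, "ZE": 0.3, "SN": 0.0, "LN": 0.0}

# Peak locations of the output sets, read off the same table:
loc = {"LP": 1.0, "SP": 0.5, "ZE": 0.0, "SN": -0.5, "LN": -1.0}

mu = [min(e["ZE"], delta["ZE"]),    # rule 1: AND -> MIN
      min(e["ZE"], delta["SP"]),    # rule 2: AND -> MIN
      min(e["SN"], delta["SN"]),    # rule 3: AND -> MIN
      max(e["LP"], delta["LP"])]    # rule 4: OR  -> MAX
out = [loc["ZE"], loc["SN"], loc["LP"], loc["LN"]]   # rule consequents

crisp = sum(m * o for m, o in zip(mu, out)) / sum(mu)
print(crisp)   # -0.5, up to floating-point rounding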

If you have problems figuring out the centroid equation, remember that a centroid is defined by
summing all the moments (location times mass) around the center of gravity and equating the sum to
zero. So if x0 is the center of gravity, x_i is the location of each mass, and m_i is each mass, this gives:

SUM( m_i * (x_i - x0) ) = 0,   i.e.   x0 = SUM( m_i * x_i ) / SUM( m_i )

In our example, the values of mu correspond to the masses, and the values of X to the locations of the masses
(mu, however, only ‘corresponds to the masses’ if the initial ‘mass’ of the output functions are all the
same/equivalent. If they are not the same, i.e. some are narrow triangles, while others may be wide
trapezoids or shouldered triangles, then the mass or area of the output function must be known or
calculated. It is this mass that is then scaled by mu and multiplied by its location x_i).
This system can be implemented on a standard microprocessor, but dedicated fuzzy chips are now available. For
example, Adaptive Logic INC of San Jose, California, sells a “fuzzy chip”, the AL220, that can accept four analog inputs
and generate four analog outputs. A block diagram of the chip is shown below :

ADC : analog-to-digital converter

DAC : digital-to-analog converter

SH : sample/hold


Antilock brakes
As a first example, consider an anti-lock braking system, directed by a microcontroller chip. The microcontroller
has to make decisions based on brake temperature, speed, and other variables in the system.

The variable “temperature” in this system can be subdivided into a range of “states”: “cold”,
“cool”, “moderate”, “warm”, “hot”, “very hot”. The transition from one state to the next is hard
to define.

An arbitrary static threshold might be set to divide “warm” from “hot”. For example, at exactly 90
degrees, warm ends and hot begins. But this would result in a discontinuous change when the
input value passed over that threshold. The transition wouldn’t be smooth, as would be required
in braking situations.
The way around this is to make the states fuzzy. That is, allow them to change gradually from one
state to the next. In order to do this there must be a dynamic relationship established between
different factors.

We start by defining the input temperature states using “membership functions”:

With this scheme, the input variable’s state no longer jumps abruptly from one state to the next. Instead, as the
temperature changes, it loses value in one membership function while gaining value in the next. In other words,
its ranking in the category of cold decreases as it becomes more highly ranked in the warmer category.

At any sampled timeframe, the “truth value” of the brake temperature will almost always be in some degree
part of two membership functions: i.e.: ‘0.6 nominal and 0.4 warm’, or ‘0.7 nominal and 0.3 cool’, and so on.
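A Python sketch of such overlapping triangular membership functions; the temperature break points are assumptions, chosen so that a reading of 68 degrees is 0.6 “nominal” and 0.4 “warm”:

# Overlapping triangular membership functions for brake temperature
# (the break points 40/60/80/100 degrees are assumed for illustration).
def tri(x, a, b, c):
    # Triangular membership with feet at a and c and peak at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def nominal(t): return tri(t, 40.0, 60.0, 80.0)
def warm(t):    return tri(t, 60.0, 80.0, 100.0)

t = 68.0
print(nominal(t), warm(t))   # 0.6 0.4 -- the reading belongs to both states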

The above example demonstrates a simple application, using the abstraction of values from
multiple values. This only represents one kind of data, however: in this case, temperature.

Additional sophistication could be added to this braking system with further factors such
as traction, speed, and inertia, set up in dynamic functions, according to the designed fuzzy
system.[9]
Logical interpretation of fuzzy control
In spite of the appearance there are several difficulties in giving a rigorous logical interpretation of
the IF-THEN rules. As an example, interpret a rule as IF (temperature is “cold”) THEN (heater is “high”) by
the first-order formula Cold(x) → High(y) and assume that r is an input such that Cold(r) is false. Then the
formula Cold(r) → High(t) is true for any t, and therefore any t gives a correct control given r. A rigorous
logical justification of fuzzy control is given in Hajek’s book (see Chapter 7), where fuzzy control is
represented as a theory of Hajek’s basic logic.[2] Also, in Gerla 2005 [10] another logical approach to fuzzy
control is proposed, based on fuzzy logic programming. Indeed, denote by f the fuzzy function arising from an
IF-THEN system of rules. Then we can translate this system into a fuzzy program P containing a series of
rules whose head is “Good(x,y)”. The interpretation of this predicate in the least fuzzy Herbrand model of
P coincides with f. This gives further useful tools for fuzzy control.

Exercise:
Q1. Explain fuzzifiers and defuzzifiers.
Q2. Write a short note on fuzzification.
Q3. Explain the concept of defuzzification methods.
Q4. What is the process of building a fuzzy controller?
Q5. Write a short note on the centroid computation.
Q6. Write a short note on antilock brakes.
