
UNIT 2

ARTIFICIAL INTELLIGENCE

UNIT II PROBLEM SOLVING


Heuristic search strategies – heuristic functions. Local search and optimization problems – local
search in continuous space – search with non-deterministic actions – search in partially
observable environments – online search agents and unknown environments

HEURISTIC SEARCH STRATEGIES

WHAT IS HEURISTIC OR INFORMED SEARCH?

➢ Heuristic (informed) search uses problem-specific knowledge beyond the definition of the problem
itself.

➢ It can find solutions more efficiently than an uninformed strategy can.

TYPES OF HEURISTIC OR INFORMED SEARCH

➢ Best First Search

➢ Greedy Best First Search

➢ A*

o MA*
o IDA*
o RBFS
o SMA*

BEST FIRST SEARCH:


➢ Best-first search is an instance of the general TREE-SEARCH or GRAPH-SEARCH
algorithm in which a node is selected for expansion based on an evaluation function,
f(n).

➢ The evaluation function is construed as a cost estimate, so the node with the lowest
evaluation is expanded first.

➢ The implementation of best-first graph search is identical to that for uniform-cost search
except for the use of f instead of g to order the priority queue.

➢ The choice of f determines the search strategy.

Advantages:
➢ Best-first search can behave like BFS or DFS depending on the evaluation function, so it
gains some of the advantages of both algorithms.
➢ With a good evaluation function, this algorithm is more efficient than uninformed BFS and DFS.
Disadvantages:
➢ It can behave as an unguided depth-first search in the worst case.
➢ Like DFS, it can get stuck in a loop.
➢ This algorithm is not optimal in general.

HEURISTIC FUNCTION

Best-first algorithms include as a component of f(n) a heuristic function, denoted h(n):

h(n) = estimated cost of the cheapest path from the state at node n to a goal state.
➢ Heuristic functions are the most common form in which additional knowledge of the
problem is imparted to the search algorithm.

➢ We consider them to be arbitrary, nonnegative, problem-specific functions, with one
constraint: if n is a goal node, then h(n) = 0.
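As a concrete illustration (not part of the original notes), a straight-line-distance heuristic for a route-finding problem satisfies both properties: it is non-negative and it evaluates to 0 at the goal. The coordinates used below are hypothetical.

```python
import math

def straight_line_distance(state, goal):
    """A typical heuristic h(n): Euclidean distance between map coordinates.
    It is non-negative and evaluates to 0 when state == goal."""
    (x1, y1), (x2, y2) = state, goal
    return math.hypot(x2 - x1, y2 - y1)

print(straight_line_distance((0, 0), (3, 4)))   # 5.0 (estimate for a non-goal state)
print(straight_line_distance((3, 4), (3, 4)))   # 0.0 (h(n) = 0 at the goal)
```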

GREEDY BFS

➢ Greedy best-first search tries to expand the node that is closest to the goal, on the
grounds that this is likely to lead to a solution quickly.
➢ Thus, it evaluates nodes by using just the heuristic function; that is, f(n) = h(n).
➢ The algorithm is called “greedy”—at each step it tries to get as close to the goal as it can.

ALGORITHM:

1. Step 1: Place the starting node into the OPEN list.


2. Step 2: If the OPEN list is empty, Stop and return failure.
3. Step 3: Remove the node n with the lowest value of h(n) from the OPEN list, and
place it in the CLOSED list.
4. Step 4: Expand the node n, and generate the successors of node n.
5. Step 5: Check each successor of node n, and find whether any node is a goal node or not.
If any successor node is goal node, then return success and terminate the search, else
proceed to Step 6.
6. Step 6: For each successor node, the algorithm computes the evaluation function f(n) = h(n), and then
checks whether the node is already in the OPEN or CLOSED list. If the node is in
neither list, add it to the OPEN list.
7. Step 7: Return to Step 2.
EXAMPLE:

Expand the nodes of S and put in the CLOSED list

Initialization: Open [A, B], Closed [S]


Iteration 1: Open [A], Closed [S, B]
Iteration 2: Open [E, F, A], Closed [S, B]
           : Open [E, A], Closed [S, B, F]
Iteration 3: Open [I, G, E, A], Closed [S, B, F]
           : Open [I, E, A], Closed [S, B, F, G]
Hence the final solution path will be: S ----> B ----> F ----> G

IMPLEMENTATION

Greedy BFS is implemented using a priority queue ordered by the heuristic value h(n).
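A minimal Python sketch of this implementation is given below. It assumes the graph is an adjacency dictionary and that h is a dictionary of heuristic values; the particular graph and h values are hypothetical (the figure for the example is not reproduced in these notes), chosen so that the search reproduces the S → B → F → G trace from the example above.

```python
import heapq

def greedy_best_first_search(graph, h, start, goal):
    """Greedy best-first search: the frontier is a priority queue ordered by h(n) alone.
    graph: dict mapping a node to an iterable of neighbour nodes.
    h:     dict mapping a node to its heuristic estimate h(n).
    Returns the path found (not necessarily optimal) or None."""
    frontier = [(h[start], start, [start])]          # (h, node, path)
    closed = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for succ in graph.get(node, []):
            if succ not in closed:
                heapq.heappush(frontier, (h[succ], succ, path + [succ]))
    return None

# Hypothetical graph and heuristic values consistent with the trace above:
graph = {'S': ['A', 'B'], 'A': [], 'B': ['E', 'F'], 'E': [], 'F': ['I', 'G'], 'I': [], 'G': []}
h = {'S': 13, 'A': 12, 'B': 4, 'E': 8, 'F': 2, 'I': 9, 'G': 0}
print(greedy_best_first_search(graph, h, 'S', 'G'))  # ['S', 'B', 'F', 'G']
```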

PERFORMANCE EVALUATION:

a. Time Complexity: The worst-case time complexity of greedy best-first search is
O(b^m).
b. Space Complexity: The worst-case space complexity of greedy best-first search
is also O(b^m), where b is the branching factor and m is the maximum depth of the search space.
c. Complete: Greedy best-first search is incomplete; the tree-search version can get
stuck in loops even if the given state space is finite.
d. Optimal: The greedy best-first search algorithm is not optimal.

Note:
✓ With a good heuristic function, however, the complexity can be reduced substantially.

✓ The amount of the reduction depends on the particular problem and on the quality of the
heuristic.
A* ALGORITHM

➢ The most widely known form of best-first search is called A∗ search. It evaluates nodes
by combining g(n), the cost to reach the node, and h(n), the cost to get from the node to
the goal:
o f(n) = g(n) + h(n)

➢ Since g(n) gives the path cost from the start node to node n, and h(n) is the estimated cost
of the cheapest path from n to the goal, we have

f(n) = estimated cost of the cheapest solution through n .

➢ Thus, if we are trying to find the cheapest solution, a reasonable thing to try first is the
node with the lowest value of g(n) + h(n). It turns out that this strategy is more than just
reasonable: provided that the heuristic function h(n) satisfies certain conditions, A∗
search is both complete and optimal. The algorithm is identical to UNIFORM-COST-
SEARCH except that A∗ uses g + h instead of g.

ALGORITHM:

✓ Step1: Place the starting node in the OPEN list.


✓ Step 2: Check if the OPEN list is empty or not, if the list is empty then return failure and
stops.
✓ Step 3: Select the node from the OPEN list that has the smallest value of the evaluation
function (g + h). If node n is the goal node, return success and stop; otherwise:
✓ Step 4: Expand node n, generate all of its successors, and put n into the CLOSED list.
For each successor n', check whether n' is already in the OPEN or CLOSED list; if not,
compute the evaluation function for n' and place it into the OPEN list.
✓ Step 5: Otherwise, if node n' is already in the OPEN or CLOSED list, redirect its back
pointer to the parent that gives the lowest g(n') value.
✓ Step 6: Return to Step 2.
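These steps can be sketched in Python as follows. The open list is a priority queue ordered by f = g + h, and the back-pointer bookkeeping of Step 5 is handled by re-inserting a successor whenever a cheaper path to it is found. The graph and heuristic values below are assumptions reconstructed to be consistent with the worked example that follows; the original figure is not reproduced in these notes.

```python
import heapq

def a_star_search(graph, h, start, goal):
    """A* search: the frontier is a priority queue ordered by f(n) = g(n) + h(n).
    graph: dict mapping node -> dict of {neighbour: step cost}.
    h:     dict mapping node -> heuristic estimate (assumed admissible).
    Returns (path, cost) of the cheapest solution found, or None."""
    frontier = [(h[start], 0, start, [start])]        # (f, g, node, path)
    best_g = {start: 0}                               # cheapest g found so far per node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for succ, step_cost in graph.get(node, {}).items():
            new_g = g + step_cost
            if succ not in best_g or new_g < best_g[succ]:
                best_g[succ] = new_g                  # "back pointer": keep the cheaper path
                heapq.heappush(frontier, (new_g + h[succ], new_g, succ, path + [succ]))
    return None

# Hypothetical graph and h consistent with the worked example below:
graph = {'S': {'A': 1, 'G': 10}, 'A': {'B': 2, 'C': 1},
         'C': {'D': 3, 'G': 4}, 'B': {}, 'D': {}, 'G': {}}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 6, 'G': 0}
print(a_star_search(graph, h, 'S', 'G'))              # (['S', 'A', 'C', 'G'], 6)
```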
OPTIMALITY OF A*:

A∗ has the following properties:

The tree-search version of A* is optimal if h(n) is admissible, while the graph-search version is
optimal if h(n) is consistent.

EXAMPLE:

In this example, we will traverse the given graph using the A* algorithm. The heuristic value of
all states is given in the below table so we will calculate the f(n) of each state using the formula
f(n)= g(n) + h(n), where g(n) is the cost to reach any node from start state.
Here we will use OPEN and CLOSED list.

Solution:
Initialization: {(S, 5)}
Iteration1: {(S--> A, 4), (S-->G, 10)}
Iteration2: {(S--> A-->C, 4), (S--> A-->B, 7), (S-->G, 10)}
Iteration3: {(S--> A-->C--->G, 6), (S--> A-->C--->D, 11), (S--> A-->B, 7), (S-->G, 10)}
Iteration 4 will give the final result, as S--->A--->C--->G it provides the optimal path with cost
6.

PERFORMANCE EVALUATION:

➢ Complete: A* algorithm is complete as long as:


a. The branching factor is finite.
b. The cost of every action is bounded below by a fixed positive constant.
➢ Optimal: The A* search algorithm is optimal if it satisfies the following two conditions:
➢ Admissible: the first condition required for optimality is that h(n) should be an
admissible heuristic for A* tree search. An admissible heuristic is optimistic in nature.
➢ Consistent: the second condition, required for A* graph search, is that h(n) should be
consistent.
➢ Time Complexity: The time complexity of A* search depends on the heuristic
function; in the worst case the number of nodes expanded is exponential in the depth of the solution d, so
the time complexity is O(b^d), where b is the branching factor.
➢ Space Complexity: The space complexity of A* search is also O(b^d), since it keeps all
generated nodes in memory.
Advantages:
o For a given consistent heuristic, no other optimal algorithm is guaranteed to expand fewer
nodes than A* (A* is optimally efficient).
o A* search is optimal and complete (under the conditions above).
o The algorithm can solve very complex problems.
Disadvantages:
o With an inadmissible heuristic it does not always produce the shortest path, since it then
relies on heuristic approximation.
o A* can still take exponential time when the heuristic is not accurate enough.
o The main drawback of A* is its memory requirement: it keeps all generated nodes in
memory, so it is not practical for many large-scale problems.

MEMORY BOUNDED A*:

ITERATIVE DEEPENING A*

➢ The simplest way to reduce memory requirements for A∗ is to adapt the idea of iterative
deepening to the heuristic search context, resulting in the iterative-deepening A∗ (IDA∗)
algorithm.

➢ The main difference between IDA∗ and standard iterative deepening is that the cutoff
used is the f-cost (g+h) rather than the depth; at each iteration, the cutoff value is the
smallest f-cost of any node that exceeded the cutoff on the previous iteration.

ALGORITHM:

1. Start with an initial cost limit.


The algorithm begins with an initial cost limit, which is usually set to the heuristic cost
estimate of the optimal path to the goal node.
2. Perform a depth-first search (DFS) within the cost limit.
The algorithm performs a DFS search from the starting node until it reaches a node with a
cost that exceeds the current cost limit.
3. Check for the goal node.
If the goal node is found during the DFS search, the algorithm returns the optimal path to
the goal.
4. Update the cost limit.
If the goal node is not found during the DFS search, the algorithm updates the cost limit
to the smallest f-cost of any node that exceeded the current limit during the search.
5. Repeat the process until the goal is found. The algorithm repeats the process, raising the
cost limit each iteration, until the goal node is found.
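The steps above can be sketched in Python as follows, reusing the graph/heuristic dictionary representation from the A* sketch earlier. This is an illustrative sketch, not the textbook's exact pseudocode.

```python
import math

def ida_star(graph, h, start, goal):
    """Iterative-deepening A*: repeated depth-first searches bounded by f = g + h.
    graph: dict mapping node -> {neighbour: step cost}; h: dict of heuristic estimates."""

    def bounded_dfs(node, g, bound, path):
        f = g + h[node]
        if f > bound:
            return f, None                    # report the f-cost that exceeded the bound
        if node == goal:
            return f, path
        minimum = math.inf                    # smallest f-cost seen above the bound
        for succ, cost in graph.get(node, {}).items():
            if succ not in path:              # avoid cycles along the current path
                t, found = bounded_dfs(succ, g + cost, bound, path + [succ])
                if found is not None:
                    return t, found
                minimum = min(minimum, t)
        return minimum, None

    bound = h[start]                          # initial cutoff: heuristic estimate of the start node
    while True:
        t, found = bounded_dfs(start, 0, bound, [start])
        if found is not None:
            return found
        if t == math.inf:
            return None                       # no solution exists
        bound = t                             # next cutoff: smallest f-cost that exceeded the old one

# With the hypothetical graph and h from the A* sketch above, ida_star(graph, h, 'S', 'G')
# first uses bound 5, then bound 6, and returns ['S', 'A', 'C', 'G'].
```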

EXAMPLE:

SOLUTION:
We want to find the optimal path from node A to node F using the IDA* algorithm. The
first step is to set an initial cost limit. Let's use the heuristic estimate of the optimal path,
which is 7 (the sum of the costs from A to C to F).
1. Set the cost limit to 7.
2. Start the search at node A.
3. Expand node A and generate its neighbors, B and C.
4. Evaluate the heuristic cost of the paths from A to B and A to C, which are 5 and 10
respectively.
5. Since the cost of the path to B is less than the cost limit, continue the search from node B.
6. Expand node B and generate its neighbors, D and E.
7. Evaluate the heuristic cost of the paths from A to D and A to E, which are 10 and 9
respectively.
8. Since the cost of the path to D exceeds the cost limit, backtrack to node B.
9. Evaluate the heuristic cost of the path from A to C, which is 10.
10. Since the cost of the path to C is less than the cost limit, continue the search from node C.
11. Expand node C and generate its neighbor, F.
12. Evaluate the heuristic cost of the path from A to F, which is 7.
13. Since the cost of the path to F is less than the cost limit, return the optimal path, which is
A - C - F.

ADVANTAGES:

➢ IDA∗ is practical for many problems with unit step costs and avoids the substantial
overhead associated with keeping a sorted queue of nodes.
➢ Completeness: The IDA* method is a complete search algorithm, which means that, if an
optimum solution exists, it will be discovered.
➢ Memory effectiveness: The IDA* method only keeps one path in memory at a time,
making it memory efficient.
➢ Flexibility: Depending on the application, the IDA* method may be employed with a
number of heuristic functions.
➢ Performance: The IDA* method sometimes outperforms other search algorithms such as
uniform-cost search (UCS) or breadth-first search (BFS).
DISADVANTAGES

➢ Unfortunately, it suffers from the same difficulties with real valued costs as does the
iterative version of uniform-cost search.

➢ Although IDA* is memory-efficient in that it only saves one path at a time, there are
some situations when it may still be necessary to use a substantial amount of memory.

RECURSIVE-BFS:

Recursive best-first search (RBFS) is a simple recursive algorithm that attempts
to mimic the operation of standard best-first search, but using only linear space.

The algorithm is shown in Figure 3.26.

Its structure is similar to that of a recursive depth-first search, but rather than continuing
indefinitely down the current path, it uses the f limit variable to keep track of the f-value of the
best alternative path available from any ancestor of the current node. If the current node exceeds
this limit, the recursion unwinds back to the alternative path. As the recursion unwinds, RBFS
replaces the f-value of each node along the path with a backed-up value—the best f-value of its
children. In this way, RBFS remembers the f-value of the best leaf in the forgotten subtree and
can therefore decide whether it’s worth re-expanding the subtree at some later time. RBFS is
somewhat more efficient than IDA∗, but still suffers from excessive node regeneration.
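Figure 3.26 itself is not reproduced in these notes; the following is a rough Python sketch of the RBFS idea described above, assuming the same graph and heuristic dictionaries used in the earlier sketches (hypothetical problem data).

```python
import math

def recursive_best_first_search(graph, h, start, goal):
    """A linear-space sketch of RBFS.
    graph: dict mapping node -> {neighbour: step cost}; h: dict of heuristic estimates."""

    def rbfs(node, g, f_node, f_limit, path):
        # returns (solution path or None, backed-up f-value)
        if node == goal:
            return path, f_node
        successors = []
        for succ, cost in graph.get(node, {}).items():
            g_s = g + cost
            # a child's f-value is at least the backed-up f-value of its parent
            successors.append([max(g_s + h[succ], f_node), g_s, succ])
        if not successors:
            return None, math.inf
        while True:
            successors.sort()                       # best (lowest f) first
            best = successors[0]
            if best[0] > f_limit:
                return None, best[0]                # fail; report the best alternative f
            alternative = successors[1][0] if len(successors) > 1 else math.inf
            result, best[0] = rbfs(best[2], best[1], best[0],
                                   min(f_limit, alternative), path + [best[2]])
            if result is not None:
                return result, best[0]

    solution, _ = rbfs(start, 0, h[start], math.inf, [start])
    return solution
```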

EXAMPLE:
PERFORMANCE EVALUATION:

1. OPTIMALITY: Like A∗ tree search, RBFS is an optimal algorithm if the heuristic function
h(n) is admissible.

2.TIME AND SPACE COMPLEXITY:

✓ Its space complexity is linear in the depth of the deepest optimal solution, but its time
complexity is rather difficult to characterize: it depends both on the accuracy of the
heuristic function and on how often the best path changes as nodes are expanded.

✓ IDA∗ and RBFS suffer from using too little memory. Between iterations, IDA∗ retains
only a single number: the current f-cost limit. RBFS retains more information in memory,
but it uses only linear space: even if more memory were available, RBFS has no way to
make use of it. Because they forget most of what they have done, both algorithms may
end up re-expanding the same states many times over.

✓ Furthermore, they suffer the potentially exponential increase in complexity associated


with redundant paths in graphs.

SMA*
Refer handwritten notes.
LOCAL SEARCH ALGORITHM AND OPTIMISATION PROBLEM

Informed and uninformed search algorithms expand nodes systematically in two ways:

keeping different paths in memory, and

selecting the best suitable path,

which leads to a solution state required to reach the goal node. But beyond these "classical
search algorithms," we have "local search algorithms," in which the path cost does not
matter; such algorithms focus only on the solution state needed to reach the goal node.

A local search algorithm operates on a single current node rather than on multiple paths, and
generally moves only to neighbors of that node.

Although local search algorithms are not systematic, they still have the following two
advantages:

Local search algorithms use very little (often a constant amount of) memory, as they operate
only on a single current state.
Most often, they can find a reasonable solution in large or infinite state spaces where
classical, systematic algorithms are unsuitable.

Does the local search algorithm work for a pure optimization problem?

Yes, local search algorithms work for pure optimization problems. A pure optimization problem
is one in which every node (state) is a candidate solution, and the target is to find the best state of all
according to the objective function. However, a local search on a pure optimization problem is not
guaranteed to find the highest-quality (globally optimal) state; it may only find a locally good one.
Note: An objective function is a function whose value is either minimized or maximized in
different contexts of the optimization problems. In the case of search algorithms, an objective
function can be the path cost for reaching the goal node, etc.
Working of a Local search algorithm
Let's understand the working of a local search algorithm with the help of an example:
Consider the below state-space landscape having both:
Location: It is defined by the state.
Elevation: It is defined by the value of the objective function or heuristic cost function.

The local search algorithm explores the above landscape by finding the following two points:
Global Minimum: if the elevation corresponds to cost, then the task is to find the
lowest valley, which is known as the global minimum.
Global Maximum: if the elevation corresponds to an objective function, then the task is to find
the highest peak, which is called the global maximum. It is the highest point in the landscape.

VARIOUS LOCAL SEARCH ALGORITHM:

1.HILL CLIMBING ALGORITHM


2.SIMULATED ANNEALING ALGORITHM
3.LOCAL BEAM SEARCH ALGORITHM
4.GENETIC ALGORITHM
HILL CLIMBING ALGORITHM:
➢ The hill-climbing search algorithm (steepest-ascent version) is
shown in Figure 4.2.

➢ It is simply a loop that continually moves in the direction of increasing value—that is,
uphill. It terminates when it reaches a “peak” where no neighbor has a higher value.

➢ The algorithm does not maintain a search tree, so the data structure for the current node
need only record the state and the value of the objective function.

➢ Hill climbing does not look ahead beyond the immediate neighbors of the current state.
This resembles trying to find the top of Mount Everest in a thick fog while suffering from
amnesia.

State-space Landscape of Hill climbing algorithm

To understand the concept of hill climbing algorithm, consider the below landscape representing
the goal state/peak and the current state of the climber. The topographical regions shown in the
figure can be defined as:

Global Maximum: the highest point of the entire landscape, which is the goal state.
Local Maximum: a peak that is higher than each of its neighboring states but lower than the global
maximum.
Flat local maximum: a flat area of the landscape where all neighboring states have the same value,
so there is no uphill or downhill move.
Shoulder: a flat area from which uphill progress toward a summit is still possible.
Current state: the current position of the climber.

FLOWCHART

Types of Hill climbing search algorithm


There are following types of hill-climbing search:
Simple hill climbing
Steepest-ascent hill climbing
Stochastic hill climbing
Random-restart hill climbing
Simple hill climbing search
Simple hill climbing is the simplest technique for climbing a hill. The task is to reach the highest
peak of the mountain. Here, the climber moves one step at a time: if the next step is better than the
current state, the climber moves there; otherwise it remains in the same state. This search considers
only the current state and one successor at a time.
Simple hill climbing Algorithm
1. Create a CURRENT node and a GOAL node.
2. If the CURRENT node = GOAL node, return GOAL and terminate the search.
3. Else, if a NEIGHBOUR node is better than the CURRENT node, move to that neighbour.
4. Loop until the goal is reached or no better point can be found.

Steepest-ascent hill climbing


Steepest-ascent hill climbing is different from simple hill climbing search. Unlike simple hill
climbing, it considers all the successor nodes, compares them, and chooses the node that is
closest to the solution. Steepest-ascent hill climbing is similar to best-first search in that it
examines all successors rather than just one.
Note: Both simple and steepest-ascent hill climbing fail when there is no closer (better) node.
Steepest-ascent hill climbing algorithm
1. Create a CURRENT node and a GOAL node.
2. If the CURRENT node = GOAL node, return GOAL and terminate the search.
3. Loop until no better successor node can be found.
4. If there is a better successor node, expand it (move to it).
5. When the GOAL is attained, return GOAL and terminate.
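A small Python sketch of steepest-ascent hill climbing is given below; the neighbours and value functions are hypothetical placeholders for whatever successor generator and objective function the problem supplies.

```python
def steepest_ascent_hill_climbing(initial, neighbours, value):
    """Steepest-ascent hill climbing: always move to the best successor,
    stopping when no successor improves on the current state.
    neighbours(state) -> iterable of successor states
    value(state)      -> objective value to be maximised"""
    current = initial
    while True:
        succs = list(neighbours(current))
        if not succs:
            return current
        best = max(succs, key=value)
        if value(best) <= value(current):
            return current                  # local maximum or plateau: no uphill move exists
        current = best

# Toy example: maximise f(x) = -(x - 3)^2 over integer states
value = lambda x: -(x - 3) ** 2
neighbours = lambda x: [x - 1, x + 1]
print(steepest_ascent_hill_climbing(0, neighbours, value))   # 3
```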
Stochastic hill climbing
Stochastic hill climbing does not examine all the neighboring nodes. It selects one successor node
at random and decides whether to move to it or to look for a better one.

First-choice hill climbing

It implements stochastic hill climbing by generating successors randomly until one is generated
that is better than the current state. This is a good strategy when a state has many (e.g.,
thousands) of successors.

Random-restart hill climbing


The random-restart algorithm is based on a "try and try again" strategy. It conducts a series of
hill-climbing searches from randomly generated initial states until a goal is found. Its success
depends mostly on the shape of the landscape: if there are few plateaus, local maxima, and ridges,
it becomes easy to reach the destination.

Limitations of Hill climbing algorithm


The hill climbing algorithm is a fast, greedy approach. It finds a solution state rapidly because
it is usually easy to improve a bad state. However, this search has the following limitations:

➢ Local Maxima: a peak that is higher than all of its neighboring states but lower than the global
maximum. It is not the goal peak, because there is another peak higher than it.
➢ Plateau: a flat surface area where no uphill move exists. It becomes difficult for the
climber to decide in which direction to move in order to reach the goal point;
sometimes the climber gets lost in the flat area.

➢ Ridges: a sequence of local maxima, often of similar height, that is difficult to navigate
with single uphill moves; the climber may get stuck at one of these points even though it is
not the global maximum.

SIMULATED ANNEALING ALGORITHM:

A hill-climbing algorithm that never makes “downhill” moves toward states with lower
value (or higher cost) is guaranteed to be incomplete, because it can get stuck on a local
maximum. In contrast, a purely random walk—that is, moving to a successor chosen uniformly at
random from the set of successors—is complete but extremely inefficient. Therefore, it seems
reasonable to try to combine hill climbing with a random walk in some way that yields both
efficiency and completeness. Simulated annealing is such an algorithm.

In metallurgy, annealing is the process used to temper or harden metals and glass by
heating them to a high temperature and then gradually cooling them, thus allowing the material
to reach a low energy crystalline state. To explain simulated annealing, we switch our point of
view from hill climbing to gradient descent (i.e., minimizing cost) and imagine the task of
getting a ping-pong ball into the deepest crevice in a bumpy surface. If we just let the ball roll, it
will come to rest at a local minimum. If we shake the surface, we can bounce the ball out of the
local minimum. The trick is to shake just hard enough to bounce the ball out of local minima but
not hard enough to dislodge it from the global minimum.

The simulated-annealing solution is to start by shaking hard (i.e., at a high temperature)


and then gradually reduce the intensity of the shaking (i.e., lower the temperature).

Instead of picking the best move, however, it picks a random move. If the move improves
the situation, it is always accepted. Otherwise, the algorithm accepts the move with some
probability less than 1. The probability decreases exponentially with the “badness” of the
move—the amount ΔE by which the evaluation is worsened. The probability also decreases as
the “temperature” T goes down: “bad” moves are more likely to be allowed at the start when T is
high, and they become more unlikely as T decreases. If the schedule lowers T slowly enough, the
algorithm will find a global optimum with probability approaching 1.

Simulated annealing was first used extensively to solve VLSI layout problems in the
early 1980s. It has been applied widely to factory scheduling and other large-scale optimization
tasks.

ALGORITHM:
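The textbook's pseudocode is not reproduced in these notes; the sketch below illustrates the acceptance rule described above. The neighbour, value and schedule functions are hypothetical placeholders supplied by the caller.

```python
import math
import random

def simulated_annealing(initial, neighbour, value, schedule):
    """Simulated annealing for a maximisation problem.
    neighbour(state) -> a randomly chosen successor of state
    value(state)     -> objective value (higher is better)
    schedule(t)      -> temperature at time step t; the search stops when it reaches 0"""
    current = initial
    t = 0
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = neighbour(current)
        delta_e = value(nxt) - value(current)
        # always accept an improving move; accept a worsening move with probability e^(ΔE/T)
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            current = nxt
        t += 1

# Toy run: maximise f(x) = -(x - 3)^2 with an exponentially decaying temperature
value = lambda x: -(x - 3) ** 2
neighbour = lambda x: x + random.choice([-1, 1])
schedule = lambda t: 0 if t > 2000 else 10 * (0.99 ** t)
print(simulated_annealing(0, neighbour, value, schedule))    # usually 3, or a state close to it
```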
EXAMPLE:

• Problem: Sudoku - Find a completed Sudoku grid that satisfies all the rules.
• Start with an initial partially filled Sudoku grid.
• While the temperature is above the minimum temperature:
– Randomly select an empty cell and assign a random number.
– Calculate the number of conflicts in the current and neighbor grids.
– If the neighbor grid has fewer conflicts, accept it.
– If it has more conflicts, accept it with a probability based on the Boltzmann
distribution and the current temperature.
• Decrease the temperature.
• Continue until a termination criterion is met.
• Output the Sudoku grid with the fewest conflicts or a solved Sudoku puzzle if zero
conflicts are reached.

LOCAL BEAM SEARCH ALGORITHM

The local beam search algorithm keeps track of k states rather than just one. It begins with k
randomly generated states. At each step, all the successors of all k states are generated. If any one is a
goal, the algorithm halts; otherwise, it selects the k best successors from the complete list and repeats. At
first sight, a local beam search with k states might seem to be nothing more than running k random
restarts in parallel instead of in sequence.

In fact, the two algorithms are quite different. In a random-restart search, each search process
runs independently of the others. In a local beam search, useful information is passed among the parallel
search threads. In effect, the states that generate the best successors say to the others, “Come over
here, the grass is greener!” The algorithm quickly abandons unfruitful searches and moves its resources to
where the most progress is being made. In its simplest form, local beam search can suffer from a lack of
diversity among the k states—they can quickly become concentrated in a small region of the state space,
making the search little more than an expensive version of hill climbing.

A variant called stochastic beam search, analogous to stochastic hill climbing, helps alleviate
this problem. Instead of choosing the best k from the pool of candidate successors, stochastic beam
search chooses k successors at random, with the probability of choosing a given successor being an
increasing function of its value. Stochastic beam search bears some resemblance to the process of natural
selection, whereby the “successors” (offspring) of a “state” (organism) populate the next generation
according to their “value” (fitness).
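A compact Python sketch of local beam search follows; random_state, neighbours, value and is_goal are hypothetical problem-specific callbacks, and the comment notes where stochastic beam search would differ.

```python
import heapq

def local_beam_search(k, random_state, neighbours, value, is_goal, max_steps=1000):
    """Local beam search: keep the k best states and expand all of them each step.
    random_state() -> a fresh random state; neighbours(s) -> successor states;
    value(s) -> objective to maximise; is_goal(s) -> True when s solves the problem."""
    beam = [random_state() for _ in range(k)]
    for _ in range(max_steps):
        successors = []
        for s in beam:
            for succ in neighbours(s):
                if is_goal(succ):
                    return succ
                successors.append(succ)
        if not successors:
            break
        # keep only the k best successors; stochastic beam search would instead sample k of
        # them at random, with probability an increasing function of value(succ)
        beam = heapq.nlargest(k, successors, key=value)
    return max(beam, key=value)
```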

GENETIC ALGORITHM:

A genetic algorithm (or GA) is a variant of stochastic beam search in which successor states are
generated by combining two parent states rather than by modifying a single state.

IMPORTANT TERMS IN GENETIC ALGORITHM


1.POPULATION:

A GA begins with a set of k randomly generated states, called the population.

2.INDIVIDUAL

Each state, or individual, is represented as a string over a finite alphabet—most commonly, a string of 0s
and 1s.

3.FITNESS FUNCTION

The production of the next generation of states is shown in Figure 4.6(b)–(e). In (b), each state is rated by
the objective function, or (in GA terminology) the fitness function.

A fitness function should return higher values for better states, so, for the 8-queens problem we use the
number of nonattacking pairs of queens, which has a value of 28 for a solution. The values of the four
states are 24, 23, 20, and 11.

In this particular variant of the genetic algorithm, the probability of being chosen for reproducing is
directly proportional to the fitness score, and the percentages are shown next to the raw scores.

4.CROSS OVER

For each pair to be mated, a crossover point is chosen randomly from the positions in the string. In Figure
4.6, the crossover points are after the third digit in the first pair and after the fifth digit
in the second pair. In (d), the offspring themselves are created by crossing over the parent strings at the
crossover point. For example, the first child of the first pair gets the first three digits from the first parent
and the remaining digits from the second parent, whereas the second child gets the first three digits from
the second parent and the rest from the first parent. The 8-queens states involved in this reproduction step
are shown in Figure 4.7. The example shows that when two parent states are quite different, the crossover
operation can produce a state that is a long way from either parent state. It is often the case that the
population is quite diverse early on in the process, so crossover (like simulated annealing) frequently
takes large steps in the state space early in the search process and smaller steps later on when most
individuals are quite similar.

5.MUTATION:

Finally, in (e), each location is subject to random mutation with a small independent probability. One
digit was mutated in the first, third, and fourth offspring. In the 8-queens problem, this corresponds to
choosing a queen at random and moving it to a random square in its column. Figure 4.8 describes an
algorithm that implements all these steps.
ADVANTAGE OF GENETIC ALGORITHM

Like stochastic beam search, genetic algorithms combine an uphill tendency with random exploration and
exchange of information among parallel search threads. The primary advantage, if any, of genetic
algorithms comes from the crossover operation. Yet it can be shown mathematically that, if the positions
of the genetic code are permuted initially in a random order, crossover conveys no advantage. Intuitively,
the advantage comes from the ability of crossover to combine large blocks of letters that have evolved
independently to perform useful functions, thus raising the level of granularity at which the search
operates. For example, it could be that putting the first three queens in positions 2, 4, and 6 (where they
do not attack each other) constitutes a useful block that can be combined with other blocks to construct a
solution.

SCHEMA AND INSTANCE

The theory of genetic algorithms explains how this works using the idea of a schema, which is a
substring in which some of the positions can be left unspecified. For example, the schema 246*****
describes all 8-queens states in which the first three queens are in positions 2, 4, and 6, respectively.
Strings that match the schema (such as 24613578) are called instances of the schema. It can be shown
that if the average fitness of the instances of a schema is above the mean, then the number of instances of
the schema within the population will grow over time. Clearly, this effect is unlikely to be significant if
adjacent bits are totally unrelated to each other, because then there will be few contiguous blocks that
provide a consistent benefit. Genetic algorithms work best when schemata correspond to meaningful
components of a solution. For example, if the string is a representation of an antenna, then the schemata
may represent components of the antenna, such as reflectors and deflectors.

ALGORITHM:
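Figure 4.8 is not reproduced in these notes; the following is a minimal Python sketch of the selection/crossover/mutation cycle described above. Individuals are assumed to be lists of genes of length at least 2, fitness is assumed to return a positive score, and mutate is a hypothetical helper that makes one small random change.

```python
import random

def genetic_algorithm(population, fitness, mutate, generations=1000, mutation_rate=0.1):
    """A minimal genetic algorithm over fixed-length individuals (lists of genes)."""
    for _ in range(generations):
        # selection: parents chosen with probability proportional to fitness
        weights = [fitness(ind) for ind in population]
        new_population = []
        for _ in range(len(population)):
            x, y = random.choices(population, weights=weights, k=2)
            # crossover: splice the two parents at a random point
            c = random.randint(1, len(x) - 1)
            child = x[:c] + y[c:]
            # mutation: small independent probability of a random change
            if random.random() < mutation_rate:
                child = mutate(child)
            new_population.append(child)
        population = new_population
    return max(population, key=fitness)
```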
APPLICATION OF GENETIC ALGORITHM

In practice, genetic algorithms have had a widespread impact on optimization problems, such as circuit
layout and job-shop scheduling. At present, it is not clear whether the appeal of genetic algorithms arises
from their performance or from their aesthetically pleasing origins in the theory of evolution. Much work
remains to be done to identify the conditions under which genetic algorithms perform well.

LOCAL SEARCH IN CONTINUOUS SPACE


---------------REFER HANDWRITTEN NOTES----------

SEARCHING WITH NON-DETERMINISTIC ACTIONS

➢ We assumed that the environment is fully observable and deterministic and that the agent
knows what the effects of each action are. Therefore, the agent can calculate exactly which state
results from any sequence of actions and always knows which state it is in. Its percepts provide
no new information after each action, although of course they tell the agent the initial state.

➢ When the environment is either partially observable or nondeterministic (or both), percepts
become useful. In a partially observable environment, every percept helps narrow down the set of
possible states the agent might be in, thus making it easier for the agent to achieve its goals.
When the environment is nondeterministic, percepts tell the agent which of the possible outcomes
of its actions has actually occurred. In both cases, the future percepts cannot be determined in
advance and the agent’s future actions will depend on those future percepts. So the solution to a
problem is not a sequence but a contingency plan (also known as a strategy) that specifies what
to do depending on what percepts are received.

The erratic vacuum world

The state space has eight states, as shown in Figure 4.9. There are three actions—Left, Right, and
Suck—and the goal is to clean up all the dirt (states 7 and 8).

Now suppose that we introduce nondeterminism in the form of a powerful but erratic vacuum cleaner. In
the erratic vacuum world, the Suck action works as follows:

• When applied to a dirty square the action cleans the square and sometimes cleans up

dirt in an adjacent square, too.


• When applied to a clean square the action sometimes deposits dirt on the carpet.
TRANSITION MODEL IN NON-DETERMINISTIC ENVIRONMENT

To provide a precise formulation of this problem, we need to generalize the notion of a transition
model . Instead of defining the transition model by a RESULT function that returns a single state, we use a
RESULTS function that returns a set of possible outcome states. For example, in the erratic vacuum world,

the Suck action in state 1 leads to a state in the set {5, 7}—the dirt in the right-hand square may or may

not be vacuumed up.

SOLUTION:

We also need to generalize the notion of a solution to the problem. For example, if we start in state 1,
there is no single sequence of actions that solves the problem. Instead, we need a contingency plan such
as the following:

[Suck, if State =5 then [Right, Suck] else [ ]]

Thus, solutions for nondeterministic problems can contain nested if–then–else statements; this means that
they are trees rather than sequences. This allows the selection of actions based on contingencies arising
during execution. Many problems in the real, physical world are contingency problems because exact
prediction is impossible. For this reason, many people keep their eyes open while walking around or
driving.
AND-OR SEARCH TREE

In a deterministic environment, the only branching is introduced by the agent’s own choices in each state.
We call these nodes OR nodes. In the vacuum world, for example, at an OR
node the agent chooses Left or Right or Suck.

In a nondeterministic environment, branching is also introduced by the environment’s choice of outcome


for each action. We call these nodes AND nodes. For example, the Suck action in state 1 leads to a state in
the set {5, 7}, so the agent would need to find a plan for state 5 and for state 7. These two kinds of nodes
alternate, leading to an AND–OR tree as illustrated in Figure 4.10.
A solution for an AND–OR search problem is a subtree that

(1) has a goal node at every leaf,


(2) specifies one action at each of its OR nodes, and
(3) includes every outcome branch
at each of its AND nodes.
INTERLEAVING

One may also consider a somewhat different agent design, in which the agent can act before it has
found a guaranteed plan and deals with some contingencies only as they arise during execution.
This type of interleaving of search and execution is also useful for exploration problems and for game
playing.

Figure 4.11 gives a recursive, depth-first algorithm for AND–OR graph search. One key aspect of
the algorithm is the way in which it deals with cycles, which often arise in nondeterministic problems
(e.g., if an action sometimes has no effect or if an unintended effect can be corrected). If the current state
is identical to a state on the path from the root, then it returns with failure. This doesn’t mean that there is
no solution from the current state; it simply means that if there is a noncyclic solution, it must be reachable
from the earlier incarnation of the current state, so the new incarnation can be discarded.

With this check, we ensure that the algorithm terminates in every finite state space, because every
path must reach a goal, a dead end, or a repeated state. Notice that the algorithm does not check whether
the current state is a repetition of a state on some other path from the root, which is important for
efficiency.
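A Python sketch of the recursive AND–OR depth-first search described above follows. It assumes a problem object exposing initial, is_goal(s), actions(s) and results(s, a) (the set of possible outcome states); this interface is a hypothetical stand-in for the formulation in the text.

```python
def and_or_graph_search(problem):
    """Depth-first AND-OR search. Returns a conditional plan of the form
    [action, {outcome_state: subplan, ...}] or None if no plan exists."""

    def or_search(state, path):
        if problem.is_goal(state):
            return []                          # empty plan: already at a goal
        if state in path:
            return None                        # cycle on the current path: give up here
        for action in problem.actions(state):
            plan = and_search(problem.results(state, action), path + [state])
            if plan is not None:
                return [action, plan]
        return None

    def and_search(states, path):
        plans = {}
        for s in states:                       # every possible outcome needs its own subplan
            plan = or_search(s, path)
            if plan is None:
                return None
            plans[s] = plan
        return plans

    return or_search(problem.initial, [])
```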

EXPLORATION OF AND-OR GRAPH:

AND–OR graphs can also be explored by breadth-first or best-first methods. The concept of a heuristic
function must be modified to estimate the cost of a contingent solution rather than a sequence, but the
notion of admissibility carries over and there is an analog of the A∗ algorithm for finding optimal
solutions.

SEARCHING WITH PARTIAL OBSERVATION

The key concept required for solving partially observable problems is the belief state, representing the
agent’s current belief about the possible physical states it might be in, given the sequence of actions and
percepts up to that point.
SEARCHING WITH NO OBSERVATION

When the agent’s percept provides no information at all, we have what is called a sensorless
problem or sometimes a conformant problem. At first, one might think the sensorless agent has no hope
of solving a problem if it has no idea what state it’s in; in fact, sensorless problems are quite often
solvable. Moreover, sensorless agents can be surprisingly useful, primarily because they don’t rely on
sensors working properly.

In manufacturing systems, for example, many ingenious methods have been developed for
orienting parts correctly from an unknown initial position by using a sequence of actions with no sensing
at all. The high cost of sensing is another reason to avoid it: for example, doctors often prescribe a broad-
spectrum antibiotic rather than using the contingent plan of doing an expensive blood test, then waiting
for the results to come back, and then prescribing a more specific antibiotic and perhaps hospitalization
because the infection has progressed too far.

We can make a sensorless version of the vacuum world. Assume that the agent knows the
geography of its world, but doesn’t know its location or the distribution of dirt. In that case, its initial state

could be any element of the set {1, 2, 3, 4, 5, 6, 7, 8}. Now, consider what happens if it tries the

action Right. This will cause it to be in one of the states {2, 4, 6, 8}—the agent now has more

information! Furthermore, the action sequence [Right, Suck] will always end up in one of the states {4,

8}. Finally, the sequence [Right, Suck, Left, Suck] is guaranteed to reach the goal state 7 no matter what

the start state. We say that the agent can coerce the world into state 7.
Steps in Solving a problem in No Observation Environment:
Searching with Partial Observation:

➢ For a general partially observable problem, we have to specify how the environment generates
percepts for the agent. For example, we might define the local-sensing vacuum world to be one in
which the agent has a position sensor and a local dirt sensor but has no sensor capable of
detecting dirt in other squares.
➢ The formal problem specification includes a PERCEPT(s) function that returns the percept
received in a given state. (If sensing is nondeterministic, then we use a PERCEPTS function that
returns a set of possible percepts.) For example, in the local-sensing vacuum world, the
PERCEPT in state 1 is [A, Dirty]. Fully observable problems are a special case in which
PERCEPT(s)=s for every state s, while sensorless problems are a special case in which
PERCEPT(s)=null.
➢ When observations are partial, it will usually be the case that several states could have produced
any given percept. For example, the percept [A, Dirty] is produced by state 3 as well as by state 1.
Hence, given this as the initial percept, the initial belief state for the local-sensing vacuum world
will be {1, 3}. The ACTIONS, STEP-COST, and GOAL-TEST are constructed from the
underlying physical problem just as for sensorless problems, but the transition model is a bit more
complicated. We can think of transitions from one belief state to the next for a particular action as
occurring in three stages.
1. The prediction stage is the same as for sensorless problems: given the action a in belief state b,
the predicted belief state is b̂ = PREDICT(b, a).
2. The observation prediction stage determines the set of percepts o that could be observed in the
predicted belief state:

POSSIBLE-PERCEPTS(b̂) = {o : o = PERCEPT(s) and s ∈ b̂}.

3. The update stage determines, for each possible percept, the belief state that would result from the
percept. The new belief state b_o is just the set of states in b̂ that could have produced the
percept:

b_o = UPDATE(b̂, o) = {s : o = PERCEPT(s) and s ∈ b̂}.

Notice that each updated belief state b_o can be no larger than the predicted belief state b̂; observations
can only help reduce uncertainty compared to the sensorless case. Moreover, for deterministic sensing,
the belief states for the different possible percepts will be disjoint, forming a partition of the original
predicted belief state.
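These three stages can be written down almost literally in Python. The sketch below assumes deterministic actions and sensing, with result(s, a) and percept(s) supplied as callbacks by the underlying physical problem (hypothetical names), and belief states represented as Python sets.

```python
def predict(belief, action, result):
    """Prediction stage: apply the physical RESULT function to every state in the belief."""
    return {result(s, action) for s in belief}

def possible_percepts(predicted_belief, percept):
    """Observation-prediction stage: percepts that could be observed in the predicted belief."""
    return {percept(s) for s in predicted_belief}

def update(predicted_belief, o, percept):
    """Update stage: keep only the states that could have produced percept o."""
    return {s for s in predicted_belief if percept(s) == o}
```

For deterministic sensing, the sets returned by update for the different possible percepts are disjoint and together partition the predicted belief state, as noted above.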
Solving partially observable problems

The preceding section showed how to derive the RESULTS function for a nondeterministic belief-state
problem from an underlying physical problem and the PERCEPT function. Given such a formulation, the
AND–OR search algorithm of Figure 4.11 can be applied directly to derive a solution.
An agent for partially observable environments:

The design of a problem-solving agent for partially observable environments is quite similar to the simple
problem-solving agent in Figure 3.1: the agent formulates a problem, calls a search algorithm (such as
AND-OR-GRAPH-SEARCH) to solve it, and executes the solution. There are two main differences.
First, the solution to a problem will be a conditional plan rather than a sequence; if the first step is an if–
then–else expression, the agent will need to test the condition in the if-part and execute the then-part or
the else-part accordingly. Second, the agent will need to maintain its belief state as it performs actions and
receives percepts. This process resembles the prediction–observation–update process in Equation (4.5)
but is actually simpler because the percept is given by the environment rather than calculated by the agent.

➢ Figure 4.17 shows the belief state being maintained in the kindergarten vacuum world with local
sensing, wherein any square may become dirty at any time unless the agent is actively cleaning it
at that moment.
➢ In partially observable environments—which include the vast majority of real-world
environments—maintaining one’s belief state is a core function of any intelligent system. This
function goes under various names, including monitoring, filtering and state estimation.
Equation (4.6) is called a recursive state estimator because it computes the new belief state from
the previous one rather than by examining the entire percept sequence.
➢ If the agent is not to “fall behind,” the computation has to happen as fast as percepts are coming
in. As the environment becomes more complex, the exact update computation becomes infeasible
and the agent will have to compute an approximate belief state, perhaps focusing on the
implications of the percept for the aspects of the environment that are of current interest.
ONLINE SEARCH AGENTS AND UNKNOWN ENVIRONMENTS

OFFLINE SEARCH AGENT

They compute a complete solution before setting foot in the real world and then execute the
solution.

ONLINE SEARCH AGENT

In contrast, an online search agent interleaves computation and action: first, it takes an action,
then it observes the environment and computes the next action.

After each action, an online agent receives a percept telling it what state it has reached; from this
information, it can augment its map of the environment. The current map is used to decide where
to go next. This interleaving of planning and action means that online search algorithms are quite
different from the offline search algorithms.

An online depth-first search agent is shown in Figure 4.21.


This agent stores its map in a table, RESULT[s, a], that records the state resulting from executing
action a in state s. Whenever an action from the current state has not been explored, the agent tries
that action. The difficulty comes when the agent has tried all the actions in a state. In offline depth-
first search, the state is simply dropped from the queue; in an online search, the agent has to
backtrack physically. In depth-first search, this means going back to the state from which the agent
most recently entered the current state. To achieve that, the algorithm keeps a table that lists, for
each state, the predecessor states to which the agent has not yet backtracked. If the agent has run
out of states to which it can backtrack, then its search is complete.
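A Python sketch of such an agent is shown below. It follows the structure described above (a RESULT table, per-state lists of untried actions and of unbacktracked predecessors), but it is a simplified sketch rather than the exact Figure 4.21 pseudocode, and, like the textbook agent, it assumes every action is reversible.

```python
class OnlineDFSAgent:
    """Sketch of an online depth-first search agent.
    actions(s) returns the actions applicable in s; goal_test(s) returns True at a goal.
    Call step(current_state) once per move; it returns the next action, or None to stop."""

    def __init__(self, actions, goal_test):
        self.actions, self.goal_test = actions, goal_test
        self.result = {}          # map: (state, action) -> observed resulting state
        self.untried = {}         # state -> actions not yet tried in that state
        self.unbacktracked = {}   # state -> predecessors not yet backtracked to
        self.s = self.a = None    # previous state and action

    def step(self, s_prime):
        if self.goal_test(s_prime):
            return None
        if s_prime not in self.untried:
            self.untried[s_prime] = list(self.actions(s_prime))
        if self.s is not None:
            self.result[(self.s, self.a)] = s_prime
            self.unbacktracked.setdefault(s_prime, []).insert(0, self.s)
        if self.untried[s_prime]:
            a = self.untried[s_prime].pop()
        elif self.unbacktracked.get(s_prime):
            back = self.unbacktracked[s_prime].pop(0)
            # backtrack physically: pick an already-recorded action from s_prime leading to back
            a = next(act for (st, act), res in self.result.items()
                     if st == s_prime and res == back)
        else:
            return None           # nothing untried and nowhere to backtrack: search complete
        self.s, self.a = s_prime, a
        return a
```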

SIGNIFICANCE OF ONLINE SEARCH AGENT

 Online search is a good idea in dynamic or semi-dynamic domains—domains where there


is a penalty for sitting around and computing too long.
 Online search is also helpful in nondeterministic domains because it allows the agent to
focus its computational efforts on the contingencies that actually arise rather than those
that might happen but probably won’t.

ONLINE SEARCH PROBLEMS

An online search problem must be solved by an agent executing actions, rather than by pure
computation. We assume a deterministic and fully observable environment but we stipulate that
the agent knows only the following:

 ACTIONS(s), which returns a list of actions allowed in state s;


 The step-cost function c(s, a, s′)—note that this cannot be used until the agent knows
that s′ is the outcome; and
 GOAL-TEST(s).
Note in particular that the agent cannot determine RESULT(s, a) except by actually being in s
and doing a. For example, in the maze problem shown in Figure 4.19, the agent does not know that
going Up from (1,1) leads to (1,2); nor, having done that, does it know that going Down will take
it back to (1,1). Finally, the agent might have access to an admissible heuristic function h(s) that
estimates the distance from the current state to a goal state. For example, in Figure 4.19, the agent
might know the location of the goal and be able to use the Manhattan-distance heuristic.
COMPETITIVE RATIO:

Typically, the agent’s objective is to reach a goal state while minimizing cost. (Another possible
objective is simply to explore the entire environment.) The cost is the total path cost of the path
that the agent actually travels. It is common to compare this cost with the path cost of the path the
agent would follow if it knew the search space in advance—that is, the actual shortest path (or
shortest complete exploration). In the language of online algorithms, this is called the competitive
ratio; we would like it to be as small as possible.

ADVERSARY ARGUMENT:

 If some actions are irreversible—i.e., they lead to a state from which no
action leads back to the previous state—the online search might accidentally reach a dead-
end state from which no goal state is reachable. Our claim, to be more precise, is that no
algorithm can avoid dead ends in all state spaces.
 To an online search algorithm that has visited states S and A, the two state spaces look
identical, so it must make the same decision in both. Therefore, it will fail in one of them.
This is an example of an adversary argument.

ONLINE LOCAL SEARCH:

Like depth-first search, hill-climbing search has the property of locality in its node expansions. In fact,
because it keeps just one current state in memory, hill-climbing search is already an online search
algorithm! Unfortunately, it is not very useful in its simplest form because it leaves the agent sitting at
local maxima with nowhere to go. Moreover, random restarts cannot be used, because the agent cannot
transport itself to a new state.

Augmenting hill climbing with memory rather than randomness turns out to be a more effective approach.

LRTA* algorithm:

The basic idea is to store a “current best estimate” H(s) of the cost to reach the goal from each state that
has been visited. H(s) starts out being just the heuristic estimate h(s) and is updated as the agent gains
experience in the state space.
Figure 4.23 shows a simple example in a one-dimensional state space. In (a), the agent seems to be stuck
in a flat local minimum at the shaded state.

 Rather than staying where it is, the agent should follow what seems to be the best path to the
goal given the current cost estimates for its neighbors.
 The estimated cost to reach the goal through a neighbor s′ is the cost to get to s′ plus the estimated
cost to get to a goal from there—that is, c(s, a, s′) + H(s′).
 There are two actions in the example, with estimated costs 1+9 and 1+2, so it seems best to move
right.
 Now, it is clear that the cost estimate of 2 for the shaded state was overly optimistic.
 Since the best move cost 1 and led to a state that is at least 2 steps from a goal, the shaded state
must be at least 3 steps from a goal, so its H should be updated accordingly, as shown in Figure
4.23(b).
 Continuing this process, the agent will move back and forth twice more, updating H each time and
“flattening out” the local minimum until it escapes to the right.

An LRTA∗ agent is guaranteed to find a goal in any finite, safely explorable environment. Unlike A∗, however,
it is not complete for infinite state spaces—there are cases where it can be led infinitely astray. It can
explore an environment of n states in O(n^2) steps in the worst case.
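The update rule described above can be sketched as a small Python agent. The actions, c, h and goal_test callbacks are hypothetical descriptions of the underlying problem, and the sketch assumes every visited state has at least one applicable action (a safely explorable space).

```python
class LRTAStarAgent:
    """Sketch of an LRTA* agent: it learns improved cost-to-go estimates H(s) as it moves.
    Call step(current_state) once per move; it returns the next action, or None at a goal."""

    def __init__(self, actions, c, h, goal_test):
        self.actions, self.c, self.h, self.goal_test = actions, c, h, goal_test
        self.result = {}                 # (state, action) -> observed resulting state
        self.H = {}                      # learned estimate of the cost to reach a goal
        self.s = self.a = None           # previous state and action

    def lrta_cost(self, s, a, s2):
        # optimism under uncertainty: an untried action is assumed to cost only h(s)
        if s2 is None:
            return self.h(s)
        return self.c(s, a, s2) + self.H[s2]

    def step(self, s_prime):
        if self.goal_test(s_prime):
            return None
        self.H.setdefault(s_prime, self.h(s_prime))
        if self.s is not None:
            self.result[(self.s, self.a)] = s_prime
            # back up: best estimated cost to a goal through any action of the previous state
            self.H[self.s] = min(self.lrta_cost(self.s, b, self.result.get((self.s, b)))
                                 for b in self.actions(self.s))
        # move toward the neighbour that currently looks cheapest
        self.a = min(self.actions(s_prime),
                     key=lambda b: self.lrta_cost(s_prime, b, self.result.get((s_prime, b))))
        self.s = s_prime
        return self.a
```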

LEARNING IN ONLINE SEARCH:

The initial ignorance of online search agents provides several opportunities for learning.

First, the agents learn a “map” of the environment—more precisely, the outcome of each action in each
state—simply by recording each of their experiences. (Notice that the assumption of deterministic
environments means that one experience is enough for each action.)

Second, the local search agents acquire more accurate estimates of the cost of each state by using local
updating rules, as in LRTA∗.
