|
|
Resolving the General Dangling Else/If-Else Ambiguity
Description of the AmbiguityCommonly, the syntax for if-else statements is written thus:
if statement
-> if clause, statement
-> if clause, statement, "else", statement
statement
-> simple statement
-> if statement
-> loop statement
where simple statement subsumes expression statements, block
statements, break and continue statements, and even the do/while
statement; in short, any statement which is neither left nor right
recursive. loop statement subsumes while statements and for
statements, i.e., right recursive statements of the form:
loop statement
-> loop header, statement
This syntax is ambiguous, as can be illustrated by the following example:
if (conditionA) if (conditionB) statementA; else statementB;Using the syntax given above, this statement can be interpreted either as:
if (conditionA) {
if (conditionB) statementA; else statementB;
}
or as:
if (conditionA) {
if (conditionB) statementA;
}
else statementB;
Both interpretations are consistent with the syntax definition given
above, but they have very different outcomes. Conventionally, parsers are
rigged using some sort of trick to select the first interpretation. This
trick is often called a "disambiguation rule". In AnaGram, you could use
the sticky attribute statement:
[
sticky {statement}
]
However, such tricks are not necessary, since the statement syntax can be
made conflict-free as described below. A simpler conflict-free syntax for
statements is provided as an example in section 4.3 of "Compilers:
Principles, Techniques and Tools" by Aho, Sethi and Ullman, but their
syntax is incomplete in that it does not, in fact, provide for
right-recursive statements such as while and for.
An Unambiguous Syntax for StatementsThe problem with the conventional syntax for statements is that it represents a simple taxonomy of statements rather than an appropriate description of syntax. Although it is true that there are if statements, loop statements and simple statements, this is not the most useful way to classify statements from a syntactic point of view.The following syntax requires a following else to be paired with the most recent unpaired if, thus disallowing the if-else ambiguity. We define an open statement as one which has at least one if that is not paired with a following else within the statement. A closed statement either does not contain an if at all, or if it does, the if is paired with a following else. We can then write the statement syntax as follows:
statement
-> open statement
-> closed statement
open statement
-> if clause, statement
-> if clause, closed statement, "else", open statement
-> loop header, open statement
closed statement
-> simple statement
-> if clause, closed statement, "else", closed statement
-> loop header, closed statement
In this syntax, we allow only closed statements between an if and its
matching else. In other words, between an if and an else an if is allowed
only if it is paired with a matching else clause. The net effect is
that each else is associated with the nearest preceding if.
In the next section we will show how this syntax can be developed by suitably rewriting our original syntax.
Developing the Conflict-Free SyntaxTo see how to resolve the ambiguity and develop the above syntax, let us begin with the customary ambiguous syntax described in the first section and try rewriting statement as
statement
-> open statement
-> closed statement
where, as described above, open statement is any statement that has
a "dangling" if, that is, an if which could be paired with a
following else. A closed statement is one which does not have a
dangling if.
Substituting the above into the second rule for if statement yields:
if statement
-> if clause, statement
-> if clause, closed statement, "else", statement
-> if clause, open statement, "else", statement
Clearly, the last of these three rules for if statement is always
ambiguous, since open statement, by definition, contains one or more
ifs not paired with elses, and thus it is not clear which
if should be associated with the else.
Therefore, let us eliminate the last rule, leaving our rules for if statement as follows:
if statement
-> if clause, statement
-> if clause, closed statement, "else", statement
One could ask whether it is legitimate to discard a rule so cavalierly. In
fact it can be shown, by means of algebraic manipulations too tedious to
include here, that all sentences produced by the discarded rule can also be
produced by the remaining rules. Thus the discarded rule adds nothing to
the grammar but ambiguity.
Now, it remains to determine just which rules for statement belong to open statement and which to closed statement. To this end, we expand the original rules for statement using the two rules above:
statement
-> simple statement
-> if clause, statement
-> if clause, closed statement, "else", statement
-> loop header, statement
The first rule is clearly closed. The second rule is clearly open since it
contains at least one unpaired if. The
last two rules cannot be classified as they stand, since they would be
open or closed depending on whether the final statement were
open or closed. Therefore, we expand the final token of the last two
rules to yield:
statement
-> simple statement
-> if clause, statement
-> if clause, closed statement, "else", open statement
-> if clause, closed statement, "else", closed statement
-> loop header, open statement
-> loop header, closed statement
Now, we can reorder the above rules:
statement
-> if clause, statement
-> if clause, closed statement, "else", open statement
-> loop header, open statement
-> simple statement
-> if clause, closed statement, "else", closed statement
-> loop header, closed statement
By our definitions, the first three rules are open statements
and the last three rules are closed statements. So we can
finally rewrite the syntax:
statement
-> open statement
-> closed statement
open statement
-> if clause, statement
-> if clause, closed statement, "else", open statement
-> loop header, open statement
closed statement
-> simple statement
-> if clause, closed statement, "else", closed statement
-> loop header, closed statement
yielding the conflict-free syntax presented in the previous section.
|
| XIDEK Interpreter Kit Table of Contents | | | Parsifal Software Home Page |
|---|