L20_Imperative -- Blocks and Procedures
We have seen so far that the traditional imperative model of computation has as its
computational elements “statements” or “commands” that alter the program state.
As expected, these include
• an identity command — skip,
• assignment as the basic operation — x := e,
• sequential composition — c1; c2,
• conditionals — if b then c1 else c2 fi — and
• repetition — while b do c1 od
But what are the basic data on which these operations act? The “program state”.
Now so far we have treated the program state as a table, i.e., a mapping from
variables to values (answers). However, the reality underlying this model is that the
program state can be decomposed into two mappings, the first a finite domain
function (called a “binding”) ℓ : 𝒳 ⇀fin Loc from variables to a set of locations
Loc, and another finite domain function (called a “store”) σ : Loc ⇀fin Ans from
locations to values (answers). So what we saw as a table γ is actually their
composition ℓσ : 𝒳 ⇀fin Ans, i.e., γ(x) = σ(ℓ(x)).
The reason to do so is that in the imperative model, during the course of execution
of a command, the bindings do not change; it is only the store that changes.
That is, the memory location with which a variable is associated remains the same;
it’s the contents of the locations that are changed by executing a command.
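The two-level state can be sketched in C as follows. This is only an illustration: the names `binding`, `store`, `loc_of`, `lookup` and `assign`, the fixed sizes, and the choice of `int` values are all assumptions made for the example, not part of the formal model.

```c
#include <assert.h>
#include <string.h>

#define NVARS 2   /* assumed: two program variables, x and y */
#define NLOCS 4   /* assumed: a small pool of locations */

/* Binding: variable name -> location (fixed while a command executes). */
struct bind { const char *var; int loc; };
static const struct bind binding[NVARS] = { {"x", 0}, {"y", 1} };

/* Store: location -> value (this is what commands update). */
static int store[NLOCS];

/* l(x): the location bound to a variable; the variable must be bound. */
static int loc_of(const char *var) {
    for (int i = 0; i < NVARS; i++)
        if (strcmp(binding[i].var, var) == 0)
            return binding[i].loc;
    assert(0 && "unbound variable");
    return -1;
}

/* The composed table: first find the location, then read its contents. */
static int lookup(const char *var) { return store[loc_of(var)]; }

/* x := a updates the store at l(x); the binding itself is untouched. */
static void assign(const char *var, int a) { store[loc_of(var)] = a; }
```

Calling `assign("x", 5)` and then `assign("x", 9)` changes what `lookup("x")` returns, while `loc_of("x")` stays the same throughout.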
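Working with the binding ℓ : 𝒳 ⇀fin Loc and the store σ : Loc ⇀fin Ans
separately, the rules for executing commands read as follows. The binding ℓ stays
fixed to the left of the turnstile, expressions are evaluated in the composed table
ℓσ, and only the store is transformed.

• ℓ ⊢ σ −[skip]→ σ

•   ℓσ ⊢ e ⟹ a
    ──────────────────────────────   (eager evaluation)
    ℓ ⊢ σ −[x := e]→ σ[ℓ(x) ↦ a]

•   ℓ ⊢ σ −[c1]→ σ1     ℓ ⊢ σ1 −[c2]→ σ2
    ─────────────────────────────────────
    ℓ ⊢ σ −[c1; c2]→ σ2

•   ℓσ ⊢ b ⟹ T     ℓ ⊢ σ −[c1]→ σ1
    ─────────────────────────────────────
    ℓ ⊢ σ −[if b then c1 else c2 fi]→ σ1

•   ℓσ ⊢ b ⟹ F     ℓ ⊢ σ −[c2]→ σ2
    ─────────────────────────────────────
    ℓ ⊢ σ −[if b then c1 else c2 fi]→ σ2

•   ℓσ ⊢ b ⟹ F
    ──────────────────────────────
    ℓ ⊢ σ −[while b do c1 od]→ σ

•   ℓσ ⊢ b ⟹ T     ℓ ⊢ σ −[c1]→ σ1     ℓ ⊢ σ1 −[while b do c1 od]→ σ2
    ──────────────────────────────────────────────────────────────────
    ℓ ⊢ σ −[while b do c1 od]→ σ2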
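The rules can be read as a recursive interpreter in which the binding is a read-only argument and the store is the only thing updated. The following C sketch is one possible rendering under stated assumptions: variables are represented by small integer indices, values are `int`, the expression language (`NUM`, `VAR`, `ADD`, `SUB`, `LT`) is invented for the example, and `sum_to` is a hypothetical test program.

```c
#include <assert.h>
#include <stddef.h>

/* Expressions e, b: constants, variables, and a few assumed operators. */
typedef enum { NUM, VAR, ADD, SUB, LT } ekind;
typedef struct expr {
    ekind kind;
    int n;                        /* NUM: the constant; VAR: variable index */
    const struct expr *l, *r;     /* operands of ADD, SUB, LT */
} expr;

/* Commands c: one constructor per syntactic form. */
typedef enum { SKIP, ASSIGN, SEQ, IF, WHILE } ckind;
typedef struct cmd {
    ckind kind;
    int x;                        /* ASSIGN: index of the target variable */
    const expr *e;                /* ASSIGN: right-hand side; IF/WHILE: guard */
    const struct cmd *c1, *c2;    /* SEQ, IF: both; WHILE: body in c1 */
} cmd;

/* Evaluate e in the table obtained by composing binding and store. */
static int eval(const int *binding, const int *store, const expr *e) {
    switch (e->kind) {
    case NUM: return e->n;
    case VAR: return store[binding[e->n]];
    case ADD: return eval(binding, store, e->l) + eval(binding, store, e->r);
    case SUB: return eval(binding, store, e->l) - eval(binding, store, e->r);
    case LT:  return eval(binding, store, e->l) < eval(binding, store, e->r);
    }
    assert(0 && "unknown expression kind");
    return 0;
}

/* Execute c: the binding is read-only, the store is updated in place. */
static void exec(const int *binding, int *store, const cmd *c) {
    switch (c->kind) {
    case SKIP:
        break;
    case ASSIGN:  /* evaluate eagerly, then update the store at l(x) */
        store[binding[c->x]] = eval(binding, store, c->e);
        break;
    case SEQ:
        exec(binding, store, c->c1);
        exec(binding, store, c->c2);
        break;
    case IF:
        exec(binding, store, eval(binding, store, c->e) ? c->c1 : c->c2);
        break;
    case WHILE:   /* guard false: store unchanged; true: body, then repeat */
        while (eval(binding, store, c->e))
            exec(binding, store, c->c1);
        break;
    }
}

/* Hypothetical program: s := 0; while 0 < n do s := s + n; n := n - 1 od */
static int sum_to(int n0) {
    enum { N, S };                              /* variable indices */
    const int binding[] = { 3, 1 };             /* n -> loc 3, s -> loc 1 */
    int store[4] = { 0, 0, 0, 0 };
    store[binding[N]] = n0;

    const expr zero = { NUM, 0, NULL, NULL }, one = { NUM, 1, NULL, NULL };
    const expr n = { VAR, N, NULL, NULL }, s = { VAR, S, NULL, NULL };
    const expr guard = { LT, 0, &zero, &n };
    const expr s_plus_n = { ADD, 0, &s, &n }, n_minus_1 = { SUB, 0, &n, &one };

    const cmd init = { ASSIGN, S, &zero, NULL, NULL };
    const cmd add  = { ASSIGN, S, &s_plus_n, NULL, NULL };
    const cmd dec  = { ASSIGN, N, &n_minus_1, NULL, NULL };
    const cmd body = { SEQ, 0, NULL, &add, &dec };
    const cmd loop = { WHILE, 0, &guard, &body, NULL };
    const cmd prog = { SEQ, 0, NULL, &init, &loop };

    exec(binding, store, &prog);
    return store[binding[S]];
}
```

The `WHILE` case uses a native loop rather than a recursive call to `exec`; this is a design choice that avoids deep recursion and is intended to behave the same way as unfolding the two while rules.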
Recall that the Principle of Qualification stated that any syntactic category can be
qualified by a definition.
During the course, we have also made type definitions in languages such as OCaml.
Both constant definitions and type definitions are present in languages like Pascal
or C, usually in that order (since definitions of array types often require a size,
which is specified as a constant).
The types mentioned in these variable declarations are important in these languages
for another reason: they indicate the amount of memory to allocate for each
variable.
The listing of multiple variables having the same type is merely a convenience for
readability. In a simplified form, we could have written “x: integer; y:
integer”. It is a matter of taste whether you prefer the Pascal style (where
“<variable> : <type>” can be read as analogous to set membership) or the C style
(where the type is written first, which can be seen as similar to predicate notation).
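A small C sketch of the usual ordering (constant definition, then type definition, then variable declarations) may help. The names `SIZE`, `vector`, `x`, `y`, `v` and the two helper functions are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE 10              /* constant definition (hypothetical name) */

typedef int vector[SIZE];    /* type definition that needs the constant */

/* Variable declarations in the C style: type first, then the names. */
static int x, y;             /* shorthand for: int x; int y; */
static vector v;

/* The declared type fixes how much memory each variable occupies. */
static size_t bytes_for_scalars(void) { return sizeof x + sizeof y; }
static size_t bytes_for_v(void) { return sizeof v; }
```

Here `bytes_for_v()` is `SIZE` times the size of an `int`, which is the sense in which the declared type tells the compiler how much memory to allocate.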
Note that new bindings are being created in these declarations. Thus we can
present an (incremental) elaboration rule for variable declarations as follows:
Now we can state the operational rule for a block (i.e., a command qualified with
local variable definitions).
Here, the variable declarations create new bindings (written as ℓ⁺) and the
corresponding store for the newly allocated locations (written as σ⁺). Then we
execute the command in the block to obtain a resulting store σ′, which is then
pruned by removing all the newly allocated locations. (This is specified by writing
σ′|dom(σ), the restriction of σ′ to the domain of σ.) Note that on exiting the block
the bindings are restored to the original.
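C blocks behave in the way just described, which the following sketch tries to make observable. The function `block_demo` and its variables are hypothetical; the addresses taken with `&` stand in for locations.

```c
#include <assert.h>

/* Returns 1 if the outer binding of x is back in force after the block. */
static int block_demo(void) {
    int x = 1;            /* outer x, bound to some location */
    int g = 0;            /* outer variable that the block will update */
    int *outer_x = &x;    /* remember the location bound to the outer x */

    {                     /* block: local declarations, then a command */
        int x = 10;       /* new binding for x at a fresh location */
        int t = 5;        /* new local variable */
        assert(&x != outer_x);   /* the inner x lives at a different location */
        g = x + t;        /* effect on an original location: survives the block */
        x = 99;           /* effect on a new location: pruned on exit */
    }

    /* On exit, x again denotes the outer location, which still holds 1. */
    return &x == outer_x && x == 1 && g == 15;
}
```

The update to `g` corresponds to the part of σ′ that is kept by the restriction to dom(σ), while the inner `x` and `t` correspond to the locations that are pruned.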
The abstractions of commands are called procedures. (In C, these are loosely called
“functions”, but these are really value-returning procedures, since they can have
side-effects on global variables.) The Principle says that procedure calls can be
included in the syntax of commands.
This is a definitional form where the name p is being given to the command (block)
abstract. Here x1 : τ1, …, xn : τn constitutes the list of formal parameters of the
procedure, and y1 : τ′1, …, ym : τ′m the list of local variables of the procedure. The
body of the procedure is the command c, which can operate over all the
variables x1, …, xn and y1, …, ym as well as any “global” variables declared outside
the procedure.
The syntax for an invocation of this procedure is of the form p(e1, …, en), and this can
appear wherever any command can appear. We refer to the tuple of expressions
(e1, …, en) as the actual arguments to the procedure in that call.
Note that most imperative languages do not consider procedural abstracts as
first-class citizens (so the space of defined procedures is distinct from the
syntactic space of commands, though commands now include procedure calls).
For a procedure definition as given above, the operational semantics for a
“call-by-value” procedure call is as follows:
where l1, …, ln are all distinct locations and fresh with respect to
range(ℓ) ∪ dom(σ).
Note that we execute the body of the procedure as a block, and restore the bindings
to those that prevailed before the procedure call, but retain the effects on the store
at the original locations, pruning away only the newly allocated locations, whether
for the actual arguments or for local variables in the body of the procedure.
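A minimal C sketch of this behaviour, with hypothetical names (`counter`, `p`, `call_demo`): the effect on a global variable is retained after the call, while the formal parameter and the local variable live in fresh locations that disappear on return.

```c
#include <assert.h>

static int counter = 0;      /* "global" variable, declared outside p */

/* p(x): x is a formal parameter, y a local variable of the procedure. */
static void p(int x) {
    int y = x * 2;           /* local: lives in a fresh location */
    x = x + 1;               /* updates the fresh location for x only */
    counter = counter + y;   /* updates an original location: retained */
}

/* Returns the caller's argument variable after the call. */
static int call_demo(void) {
    int a = 3;
    p(a);                    /* the value 3 is copied into x's location */
    return a;
}
```

After `call_demo()`, the caller's `a` still holds 3, whereas `counter` has been increased by 6.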
Consider the Pascal procedure, intended to swap the contents of two variables,
written in C as
void swap(int x, int y) {
    int temp;
    temp = x;
    x = y;
    y = temp;
}
Now, within the procedure call, the contents of these three locations are changed as
specified. However, all three locations are deallocated (by popping the stack) when
the call returns, so the contents of the argument variables a and b are unaffected.
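This can be checked directly. In the sketch below, `swap_by_value` and `by_value_demo` are hypothetical names, and `int` is an assumed parameter type.

```c
#include <assert.h>

/* Call-by-value version: x, y and temp are three fresh locations. */
static void swap_by_value(int x, int y) {
    int temp;
    temp = x;
    x = y;
    y = temp;    /* only the copies have been exchanged */
}

/* Returns 1 if a and b still hold their original values after the call. */
static int by_value_demo(void) {
    int a = 1, b = 2;
    swap_by_value(a, b);
    return a == 1 && b == 2;
}
```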
Note, however, that if we had inlined the body of the function, substituting the
actual argument names for the formal parameters, then execution would indeed have
swapped their contents:
temp := a;
a := b;
b := temp
So instead of passing the results of evaluating the arguments, i.e., the contents of
variables a and b, we pass the locations to which variables a and b refer. This
difference is indicated in the header of the procedure declaration in Pascal:
The var in the parameter declaration indicates to the compiler that the arguments’
locations have to be passed. Now, the compiler will be informed that the
arguments
void swap(int *x, int *y) {
    int temp;
    temp = *x;
    *x = *y;
    *y = temp;
}
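A short check of the location-passing version, again with hypothetical names (`swap_by_location`, `by_location_demo`) and an assumed `int` type: the caller passes the addresses of its variables, so the updates land in the original locations.

```c
#include <assert.h>

/* The parameters hold the locations of the caller's variables. */
static void swap_by_location(int *x, int *y) {
    int temp;
    temp = *x;
    *x = *y;     /* the caller's first variable changes here, mid-procedure */
    *y = temp;
}

/* Returns 1 if the contents of a and b were exchanged by the call. */
static int by_location_demo(void) {
    int a = 1, b = 2;
    swap_by_location(&a, &b);
    return a == 2 && b == 1;
}
```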
Note that the contents of the actual arguments change while the procedure is
executing. So there isn’t a pleasant modular aspect to procedure execution.
(This also has security implications: if, for some reason, control has to return
abruptly from the procedure call to the calling routine, there may be inconsistent
changes to the “global variables”. Even with normal return, if the stack can be
corrupted somehow, by writing into parts of it below the logical boundary of the
procedure, one can induce bugs by “stack smashing”, where, by overwriting
the return address, control “returns” to a location where a malicious program begins
to run.)