Compiling LISP Procedures
    <expression> ::= ( <function> <arg-1> ... <arg-n> )
    <function>   ::= <identifier>
    <arg-1>      ::= <constant> | <expression>

    Figure 2

Note that Figure 2 uses an extended BNF notation. The three dots are intended to mean "any number of occurrences of." Also note that the grammar is recursive -- that is, the syntactic unit <expression> is defined in terms of <arg-1>, which is defined in terms of <expression>. This circular definition expresses precisely and concisely the structure of the type of LISP expressions making up the first subset of LISP to be compiled.

An example of an expression defined by the grammar of Figure 2 is

    (f (g A) (h B))

where f, g, and h are assumed to be the names of user-defined or built-in procedures and A and B are constant arguments. Using the LISP convention of quoting constants and reverting to a case-insensitive notation, the expression to be compiled becomes

    (F (G (QUOTE A)) (H (QUOTE B)))

--- An Aside (Or Four)

Four additional assumptions need to be presented before the compiler can be discussed.

First, the LISP procedure to be compiled is assumed to be syntactically and semantically correct, although the latter is not really important. This assumption is justified by recognizing that the compiler for LISP procedures is itself a procedure built into an otherwise conventional LISP system. (See assumptions two and three below.) The primary result of this assumption is that the compiler for LISP procedures does not have to do any error checking.

Second, the LISP procedure to be compiled is assumed to be in its usual LISP form -- that is, it is a binary tree. Presumably it was developed using a standard LISP system. An important result of this assumption is that the compiler for LISP procedures does not have to do either a lexical analysis or a syntax analysis of the procedure to be compiled. Since the compiler starts with the equivalent of a conventional parse tree, all it really has to be concerned with is the traditional code generation phase associated with the compilation of ordinary high-level languages.

Third, the language used to implement the compiler is going to be LISP. The result of this assumption is that the LISP compiler is just another LISP procedure. This raises an interesting possibility -- once the compiler is available it could be used to compile itself. The result should be a fast LISP compiler.

Fourth, the target language is also LISP. That is, the first LISP compiler will generate LISP code. LISP aficionados will understand this immediately -- others please be tolerant and patient. Future reports will incrementally lead to a LISP compiler, written in LISP, which will generate native machine code.

--- Run-time Function Invocation Conventions

An argument value will be computed using the available cpu registers and then temporarily stored on top of the stack (so that the cpu registers can be used to compute the next argument value). The final argument value, say argument value n, will not be pushed onto the stack only to be immediately popped off again. Instead the final argument value will be computed and moved directly into register Rn. See the next paragraph.

Argument values will be passed in the cpu registers -- argument value 1 in R1, argument value 2 in R2, and so forth. Hence, after all of the argument values have been computed and stacked (except argument value n, of course) they must be popped off of the stack and into the correct cpu registers.

Note: the business about not stacking argument value n is, in fact, a first attempt at optimizing the code produced by the compiler. If all argument values were treated the same way, the result would be that immediately after computing the final argument value and pushing it onto the stack it would just have to be popped off of the stack into register Rn. The convention adopted here will save one pair of push-pop instructions. Not much of a saving? The saving is one pair of push-pop instructions FOR EVERY FUNCTION CALLED! The saving is positively worth the effort to achieve it.

Function values are to be returned in R1. When a function is called, it is free to use the cpu registers any way it needs to without having to restore them before returning. If any of the cpu registers contain information to be used after the function call returns, this information must be stacked before the function is called and then restored after the function returns.

--- The First Compiler for LISP Procedures

The compiler for the initial subset of LISP is shown in Figure 3. The compiler is presented in a subset of LISP which should run on virtually all of the LISP implementations in existence. In particular, you do not need a version of COMMON LISP [Steele 1984] to implement the compiler shown here.

The compiler comprises the main procedure COMPEXP (COMPile EXPression) and two subsidiary procedures COMPLIS (COMPile argument LISt) and COMPAPPLY (COMPile a function APPLication).

The highest-level procedure, COMPEXP, accepts the expression to be compiled and returns a list representing the sequence of target language instructions to be carried out in order to achieve the effect of evaluating the original input expression. COMPEXP begins by calling ISCONST to see if the expression to be compiled is a constant. If so, COMPEXP calls on MKSEND to generate a "move immediate" instruction. If the expression handed to COMPEXP is not a constant then it must be a function application. COMPEXP calls COMPLIS to compile the argument list and then calls COMPAPPLY to compile the actual function call.

The subsidiary procedure COMPLIS accepts an argument list to be compiled and returns a list representing the sequence of target language instructions to be carried out in order to achieve the effect of evaluating the original argument list. COMPLIS begins by checking to see if the
SIGART Newsletter, January 1987, Number 99, Page 28
    ;;; THE PRIMARY PROCEDURES

    (DEFUN COMPEXP (EXP)
      (COND ((ISCONST EXP)
             (LIST (MKSEND 1 EXP)))
            (T (COMPAPPLY (FUNC EXP)
                          (COMPLIS (ARGLIST EXP))
                          (LENGTH (ARGLIST EXP))))))

    (DEFUN COMPLIS (U)
      (COND ((NULL U) '())
            ((NULL (REST U))
             (COMPEXP (FIRST U)))
            (T (APPEND-3 (COMPEXP (FIRST U))
                         (LIST (MKALLOC 1))
                         (COMPLIS (REST U))))))

    (DEFUN COMPAPPLY (FN VALS N)
      (APPEND-3 VALS
                (MKLINK N)
                (LIST (MKCALL FN))))

    Figure 3

    ;;; THE RECOGNIZER PROCEDURE

    (DEFUN ISCONST (X)
      (OR (NUMBERP X)
          (EQ X T)
          (EQ X NIL)
          (AND (NOT (ATOM X))
               (EQ (FIRST X) 'QUOTE))))

    ;;; THE SELECTOR PROCEDURES

    (DEFUN FUNC (X) (FIRST X))
    (DEFUN ARGLIST (X) (REST X))

    ;;; THE CONSTRUCTOR PROCEDURES

    (DEFUN MKSEND (DEST VAL) (LIST 'MOVEI DEST VAL))
    (DEFUN MKALLOC (DEST) (LIST 'PUSH 'SP DEST))
    (DEFUN MKCALL (FN) (LIST 'CALL FN))
    (DEFUN MKLINK (N)
      (COND ((= N 1) '())
            (T (CONCAT (MKMOVE N 1)
                       (MKLINK1 (SUB1 N))))))
    (DEFUN MKLINK1 (N)
      (COND ((ZEROP N) '())
            (T (CONCAT (MKPOP N)
                       (MKLINK1 (SUB1 N))))))
    (DEFUN MKPOP (N) (LIST 'POP 'SP N))
    (DEFUN MKMOVE (DEST VAL) (LIST 'MOVE DEST VAL))

    Figure 4

argument list handed to it is empty. (Note: COMPLIS calls itself and this test is necessary to prevent an infinite recursion.) If so, COMPLIS simply returns an empty list -- that is, the compilation of an empty argument list is an empty list of target language instructions.

If the argument list is not empty, COMPLIS next checks to see if it is a list of one element. If so, COMPLIS calls COMPEXP to compile the one argument. COMPLIS then simply returns the list of target language instructions returned to it by COMPEXP.

If the argument list consists of two or more elements, COMPLIS calls on COMPEXP to compile the first argument and then calls itself recursively to compile the rest of the argument list. COMPLIS returns the list of target language instructions that will evaluate the original argument list.

COMPAPPLY accepts a function name, a list of target language instructions that will evaluate an argument list, and an integer equal to the length of the original argument list. COMPAPPLY returns the list of target instructions handed to it extended with target instructions that (1) will ensure that the argument values are all in the correct cpu registers, and (2) will then call the function. The list of target language instructions returned by COMPAPPLY is the compiled form of the original expression handed to COMPEXP.

In the spirit of good LISP programming style, the three primary procedures of the compiler are supported by a "recognizer" procedure (ISCONST), two "selector" procedures (FUNC and ARGLIST), and seven "constructor" procedures (MKSEND, MKALLOC, MKCALL, MKLINK, MKLINK1, MKPOP, and MKMOVE). The seven constructor procedures are the target code generation procedures. The definitions for these ten recognizer, selector, and constructor procedures are shown in Figure 4.

The recognizer procedure ISCONST checks to see if the argument handed to it is a constant. ISCONST recognizes the following constants: numbers, the LISP atom T, the LISP atom NIL, and any quoted expression.

The selector procedure FUNC simply returns the first element of the list handed to it. The selector procedure ARGLIST simply deletes the first element of the list handed to it and returns the rest.

The seven constructor procedures are used to generate the actual target language instructions. These procedures contain all of the machine-specific details known to the compiler. (Note: The names of these procedures are those used by Allen -- they relate to earlier, more abstract material in the book [Allen 1978].)

    MKSEND generates a move immediate instruction.
    MKALLOC generates a push instruction.
    MKCALL generates a function call instruction.
    MKLINK and MKLINK1 together generate the requisite MOVE POP POP...POP sequence to put the computed argument values where they belong just prior to an actual function call.
    MKPOP generates a pop instruction.
    MKMOVE generates a load instruction.

The compiler is written in a version of LISP which includes the procedures FIRST, REST, CONCAT, APPEND-3, and LISTP. All of these procedures may or may not be available on a particular LISP system. They weren't available on the author's system, UTLISP (University of Texas LISP) running on a CDC CYBER mainframe. Hence, again in the spirit of good LISP programming style, they were simply defined in terms of the primitive procedures actually available. The definitions for these five auxiliary procedures are shown in Figure 5.
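For readers working in Common Lisp rather than UTLISP, the following is a minimal sketch of what such auxiliary definitions might look like. It is not the author's Figure 5. It assumes that CONCAT adds one item to the front of a list (that is, it behaves like CONS, which is how MKLINK and MKLINK1 use it) and that APPEND-3 joins three lists in order. SUB1 is not one of the five auxiliaries, but Common Lisp does not provide it under that name, so a stand-in is included as well. FIRST, REST, and LISTP are already built into Common Lisp and need no definition there.

```lisp
;; Hypothetical Common Lisp stand-ins for the auxiliary procedures.
;; These are assumptions inferred from how Figures 3 and 4 use the
;; names, not the UTLISP definitions from the article.

(defun concat (item items)
  "Assumed meaning: add ITEM to the front of the list ITEMS (like CONS)."
  (cons item items))

(defun append-3 (a b c)
  "Join the three lists A, B and C, in that order, into one list."
  (append a b c))

(defun sub1 (n)
  "Return N minus one; used by MKLINK and MKLINK1."
  (1- n))
```

With stand-ins along these lines, the procedures of Figures 3 and 4 should load in a Common Lisp system, although that has not been checked against UTLISP itself.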
    (COMPEXP '(F (G (QUOTE A)) (H (QUOTE B))))

The compiled form of the expression generated by COMPEXP is a list of eight target language instructions:

    ( (MOVEI 1 (QUOTE A))
      (CALL G)
      (PUSH SP 1)
      (MOVEI 1 (QUOTE B))
      (CALL H)
      (MOVE 2 1)
      (POP SP 1)
      (CALL F)
    )

--- Executing The Target Code

A trace of the execution of this compiled code is shown in Figure 6.

    STEP  INSTRUCTION            R1     R2     STACK (Top...Bottom)
    0                            ??     ??     ---
    1     (MOVEI 1 (QUOTE A))    A      ??     ---
    2     (CALL G)               A      ??     Ret-Add
                                 (G A)  ??     ---
    3     (PUSH SP 1)            (G A)  ??     (G A)
    4     (MOVEI 1 (QUOTE B))    B      ??     (G A)
    5     (CALL H)               B      ??     Ret-Add (G A)
                                 (H B)  ??     (G A)
    6     (MOVE 2 1)             (H B)  (H B)  (G A)
    7     (POP SP 1)             (G A)  (H B)  ---
    8     (CALL F)               (G A)  (H B)  Ret-Add
                                 (F)    ??     ---

The seventh instruction (POP SP 1) pops the value on top of the stack into R1. After this instruction has been executed R1 will contain the value of the first argument to the function F and R2 will contain the value of the second argument to F.

The eighth instruction (CALL F) invokes the function F with argument values (G A) and (H B). The value computed by F using these argument values is returned in R1.

4. FUTURE WORK

--- Introduction

The procedure COMPEXP developed above is capable of compiling a small but important subset of LISP into a LISP pseudo-code. The development has been discussed in depth in order to make clear the necessary background material and to present the assumptions and conventions adopted. Work already completed and work in progress are concerned with extending COMPEXP in a number of dimensions.

--- Compilers For Larger Subsets of LISP

Compilers for larger subsets of LISP have already been completed. The first extension was to add the capability of compiling LISP conditional expressions. These expressions have the general form

    (COND (P1 E1) (P2 E2) ... (Pn En))

and are the LISP equivalent of the control structure

    if P1 then E1 else
    if P2 then E2 else
    if Pn then En
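The correspondence between a COND expression and a chain of if-then-else tests can be seen in a small Common Lisp example. This is only an illustration of the equivalence described above. The procedure names SIGN-COND and SIGN-IF are hypothetical, and Common Lisp's IF is used here to stand in for the "if ... then ... else" notation.

```lisp
;; A three-clause COND: the predicates are tried in order and the
;; expression paired with the first true predicate gives the value.
(defun sign-cond (n)
  (cond ((< n 0) 'negative)
        ((= n 0) 'zero)
        (t 'positive)))

;; The same decision written as a chain of nested if-then-else tests.
;; The last test has no else branch, mirroring "if Pn then En".
(defun sign-if (n)
  (if (< n 0)
      'negative
      (if (= n 0)
          'zero
          (if t 'positive))))
```

Both procedures return the same symbol for any number handed to them, which is the sense in which the two notations are equivalent.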