Chapter 1

Programming languages

In this chapter we shall:


• outline the discipline of programming linguistics, which is the study of program-
ming languages, encompassing concepts and paradigms, syntax, semantics, and
pragmatics, and language processors such as compilers and interpreters;
• briefly survey the historical development of programming languages, covering the
major programming languages and paradigms.

1.1 Programming linguistics


The first high-level programming languages were designed during the 1950s. Ever
since then, programming languages have been a fascinating and productive area
of study. Programmers endlessly debate the relative merits of their favorite pro-
gramming languages, sometimes with almost religious zeal. On a more academic
level, computer scientists search for ways to design programming languages that
combine expressive power with simplicity and efficiency.
We sometimes use the term programming linguistics to mean the study of
programming languages. This is by analogy with the older discipline of linguistics,
which is the study of natural languages. Both programming languages and natural
languages have syntax (form) and semantics (meaning). However, we cannot take
the analogy too far. Natural languages are far broader, more expressive, and
subtler than programming languages. A natural language is just what a human
population speaks and writes, so linguists are restricted to analyzing existing (and
dead) natural languages. On the other hand, programming linguists can not only
analyze existing programming languages; they can also design and specify new
programming languages, and they can implement these languages on computers.
Programming linguistics therefore has several aspects, which we discuss briefly
in the following subsections.

1.1.1 Concepts and paradigms


Every programming language is an artifact, and as such has been consciously
designed. Some programming languages have been designed by a single person
(such as C++), others by small groups (such as C and JAVA), and still others by
large groups (such as ADA).
A programming language, to be worthy of the name, must satisfy certain
fundamental requirements.

A programming language must be universal. That is to say, every problem
must have a solution that can be programmed in the language, if that problem can
be solved at all by a computer. This might seem to be a very strong requirement,
but even a very small programming language can meet it. Any language in which
we can define recursive functions is universal. On the other hand, a language with
neither recursion nor iteration cannot be universal. Certain application languages
are not universal, but we do not generally classify them as programming languages.
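As a rough illustration of what recursion alone can express, the following Python sketch computes with nothing but function definitions, conditional expressions, and recursive calls; it uses no loops and no assignment. The function names are invented for this example, and the sketch is meant only to suggest why recursion is such a powerful ingredient, not to prove universality.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor by Euclid's algorithm, using recursion only."""
    return a if b == 0 else gcd(b, a % b)


def length(items: list) -> int:
    """Number of elements in a list, counted without a loop or a counter variable."""
    return 0 if not items else 1 + length(items[1:])
```

Each recursive call plays the role that one pass through a loop body would play in an iterative version.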
A programming language should also be reasonably natural for solving prob-
lems, at least problems within its intended application area. For example, a
programming language whose only data types are numbers and arrays might be
natural for solving numerical problems, but would be less natural for solving prob-
lems in commerce or artificial intelligence. Conversely, a programming language
whose only data types are strings and lists would be an unnatural choice for solving
numerical problems.
A programming language must also be implementable on a computer. That is
to say, it must be possible to execute every well-formed program in the language.
Mathematical notation (in its full generality) is not implementable, because in
this notation it is possible to formulate problems that cannot be solved by any
computer. Natural languages also are not implementable, because they are impre-
cise and ambiguous. Therefore, mathematical notation and natural languages, for
entirely different reasons, cannot be classified as programming languages.
In practice, a programming language should be capable of an acceptably
efficient implementation. There is plenty of room for debate over what is acceptably
efficient, especially as the efficiency of a programming language implementation
is strongly influenced by the computer architecture. FORTRAN, C, and PASCAL
programmers might expect their programs to be almost as efficient (within a factor
of 2–4) as the corresponding assembly-language programs. PROLOG programmers
have to accept an order of magnitude lower efficiency, but would justify this on
the grounds that the language is far more natural within its own application area;
besides, they hope that new computer architectures will eventually appear that
are more suited for executing PROLOG programs than conventional architectures.
In Parts II and III of this book we shall study the concepts that underlie
the design of programming languages: data and types, variables and storage,
bindings and scope, procedural abstraction, data abstraction, generic abstraction,
type systems, control, and concurrency. Although few of us will ever design a
programming language (which is extremely difficult to do well), as programmers
we can all benefit by studying these concepts. Programming languages are our
most basic tools, and we must thoroughly master them to use them effectively.
Whenever we have to learn a new programming language and discover how it
can be effectively exploited to construct reliable and maintainable programs, and
whenever we have to decide which programming language is most suitable for
solving a given problem, we find that a good understanding of programming
language concepts is indispensable. We can master a new programming language
most effectively if we understand the underlying concepts that it shares with other
programming languages.
Just as important as the individual concepts are the ways in which they may
be put together to design complete programming languages. Different selections
of key concepts support radically different styles of programming, which are
called paradigms. There are six major paradigms. Imperative programming is
characterized by the use of variables, commands, and procedures; object-oriented
programming by the use of objects, classes, and inheritance; concurrent
programming by the use of concurrent processes and various control abstractions;
functional programming by the use of functions; logic programming by the use of
relations; and scripting by the presence of very high-level features. We
shall study all of these paradigms in Part IV of this book.

1.1.2 Syntax, semantics, and pragmatics


Every programming language has syntax, semantics, and pragmatics. We have
seen that natural languages also have syntax and semantics, but pragmatics is
unique to programming languages.
• A programming language’s syntax is concerned with the form of programs:
how expressions, commands, declarations, and other constructs must be
arranged to make a well-formed program.
• A programming language’s semantics is concerned with the meaning of
programs: how a well-formed program may be expected to behave when
executed on a computer.
• A programming language’s pragmatics is concerned with the way in which
the language is intended to be used in practice.
Syntax influences how programs are written by the programmer, read by
other programmers, and parsed by the computer. Semantics determines how
programs are composed by the programmer, understood by other programmers,
and interpreted by the computer. Pragmatics influences how programmers are
expected to design and implement programs in practice. Syntax is important, but
semantics and pragmatics are more important still.
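The distinction between syntax and semantics can be made concrete with a small Python sketch. It assumes Python's standard ast module as the parser; the helper names is_well_formed and meaning are invented for this illustration.

```python
import ast


def is_well_formed(source: str) -> bool:
    """Syntax check only: can the text be parsed as a Python expression?"""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False


def meaning(source: str):
    """Semantics: the value the expression yields when it is evaluated."""
    return eval(source)
```

Under these assumptions, "1 +" is rejected on syntactic grounds alone, whereas "1 / 0" is perfectly well-formed and fails only when executed. Conversely, "2 + 3" and "3+2" differ in form but have the same meaning.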
To underline this point, consider how an expert programmer thinks, given a
programming problem to solve. Firstly, the programmer decomposes the prob-
lem, identifying suitable program units (procedures, packages, abstract types,
or classes). Secondly, the programmer conceives a suitable implementation of
each program unit, deploying language concepts such as types, control structures,
exceptions, and so on. Lastly, the programmer codes each program unit. Only at
this last stage does the programming language’s syntax become relevant.
In this book we shall pay most attention to semantic and pragmatic issues. A
given construct might be provided in several programming languages, with varia-
tions in syntax that are essentially superficial. Semantic issues are more important.
We need to appreciate subtle differences in meaning between apparently similar
constructs. We need to see whether a given programming language confuses dis-
tinct concepts, or supports an important concept inadequately, or fails to support
it at all. In this book we study those concepts that are so important that they are
supported by a variety of programming languages.
In order to avoid distracting syntactic variations, wherever possible we shall
illustrate each concept using the following programming languages: C, C++, JAVA,
and ADA. C is now middle-aged, and its design defects are numerous; however, it is
very widely known and used, and even its defects are instructive. C++ and JAVA are
modern and popular object-oriented languages. ADA is a programming language
that supports imperative, object-oriented, and concurrent programming. None of
these programming languages is by any means perfect. The ideal programming
language has not yet been designed, and is never likely to be!

1.1.3 Language processors


This book is concerned only with high-level languages, i.e., programming languages
that are (more or less) independent of the machines on which programs are
executed. High-level languages are implemented by compiling programs into
machine language, by interpreting them directly, or by some combination of
compilation and interpretation.
Any system for processing programs – executing programs, or preparing them
for execution – is called a language processor. Language processors include com-
pilers, interpreters, and auxiliary tools like source-code editors and debuggers.
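A combination of compilation and interpretation can be observed directly in Python, used here purely as a convenient example. The sketch below relies on the built-in functions compile and eval; the remark about bytecode describes the CPython implementation and need not hold for every Python implementation.

```python
source = "6 * 7"

# Compilation step: translate the source text into a code object
# (in CPython, this holds bytecode for a virtual machine).
code = compile(source, "<example>", "eval")

# Interpretation step: the virtual machine executes the compiled code.
result = eval(code)
```

The two steps are normally hidden from the programmer, which is in keeping with the point that a language can be understood without knowing in detail how it is implemented.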
We have seen that a programming language must be implementable. However,
this does not mean that programmers need to know in detail how a programming
language is implemented in order to understand it thoroughly. Accordingly,
implementation issues will receive limited attention in this book, except for a
short section (‘‘Implementation notes’’) at the end of each chapter.

1.2 Historical development


Today’s programming languages are the product of developments that started in
the 1950s. Numerous concepts have been invented, tested, and improved by being
incorporated in successive programming languages. With very few exceptions, the
design of each programming language has been strongly influenced by experience
with earlier languages. The following brief historical survey summarizes the
ancestry of the major programming languages and sketches the development of
the concepts introduced in this book. It also reminds us that today’s programming
languages are not the end product of developments in programming language
design; exciting new concepts, languages, and paradigms are still being developed,
and the programming language scene ten years from now will probably be rather
different from today’s.
Figure 1.1 summarizes the dates and ancestry of several important program-
ming languages. This is not the place for a comprehensive survey, so only the
major programming languages are mentioned.
FORTRAN was the earliest major high-level language. It introduced symbolic
expressions and arrays, and also procedures (‘‘subroutines’’) with parameters. In
other respects FORTRAN (in its original form) was fairly low-level; for example, con-
trol flow was largely effected by conditional and unconditional jumps. FORTRAN has
developed a long way from its original design; the latest version was standardized
as recently as 1997.

[Figure 1.1: timeline from 1950 to 2000, with columns for object-oriented,
imperative, concurrent, functional, and logic languages. Languages shown:
FORTRAN, LISP, ALGOL60, COBOL, PL/I, SIMULA, ALGOL68, PASCAL, SMALLTALK, PROLOG,
C, MODULA, ML, ADA83, C++, HASKELL, JAVA, ADA95, and C#. The key distinguishes
major influence from minor influence.]

Figure 1.1 Dates and ancestry of major programming languages.

COBOL was another early major high-level language. Its most important
contribution was the concept of data descriptions, a forerunner of today’s data
types. Like FORTRAN, COBOL’s control flow was fairly low-level. Also like FORTRAN,
COBOL has developed a long way from its original design, the latest version being
standardized in 2002.
ALGOL60 was the first major programming language to be designed for
communicating algorithms, not just for programming a computer. ALGOL60 intro-
duced the concept of block structure, whereby variables and procedures could
be declared wherever in the program they were needed. It was also the first
major programming language to support recursive procedures. ALGOL60 influ-
enced numerous successor languages so strongly that they are collectively called
ALGOL-like languages.
FORTRAN and ALGOL60 were most useful for numerical computation, and
COBOL for commercial data processing. PL/I was an attempt to design a
general-purpose programming language by merging features from all three. On
top of these it introduced many new features, including low-level forms of excep-
tions and concurrency. The resulting language was huge, complex, incoherent,
and difficult to implement. The PL/I experience showed that simply piling feature
upon feature is a bad way to make a programming language more powerful and
general-purpose.
A better way to gain expressive power is to choose an adequate set of concepts
and allow them to be combined systematically. This was the design philosophy
of ALGOL68. For instance, starting with concepts such as integers, arrays, and
procedures, the ALGOL68 programmer can declare an array of integers, an array of
arrays, or an array of procedures; likewise, the programmer can define a procedure
whose parameter or result is an integer, an array, or another procedure.
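The idea of combining a few concepts systematically can be sketched as follows. The example is written in Python rather than ALGOL68, and all the names in it (double, square, compose, apply_each) are invented for illustration.

```python
from typing import Callable, List

IntFunc = Callable[[int], int]


def double(n: int) -> int:
    return 2 * n


def square(n: int) -> int:
    return n * n


# An array of integers, an array of arrays, and an array of procedures.
numbers: List[int] = [1, 2, 3]
grid: List[List[int]] = [[1, 2], [3, 4]]
operations: List[IntFunc] = [double, square]


def compose(f: IntFunc, g: IntFunc) -> IntFunc:
    """A procedure whose parameters and result are themselves procedures."""
    return lambda n: g(f(n))


def apply_each(fs: List[IntFunc], n: int) -> List[int]:
    """A procedure that takes an array of procedures and returns an array of integers."""
    return [f(n) for f in fs]
```

No special feature is needed for any of these combinations: each follows from allowing arrays and procedures to be built over any type, including each other.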
PASCAL, however, turned out to be the most popular of the ALGOL-like
languages. It is simple, systematic, and efficiently implementable. PASCAL and
ALGOL68 were among the first major programming languages with both a rich
variety of control structures (conditional and iterative commands) and a rich
variety of data types (such as arrays, records, and recursive types).
C was originally designed to be the system programming language of the UNIX
operating system. The symbiotic relationship between C and UNIX has proved very
good for both of them. C is suitable for writing both low-level code (such as the
UNIX system kernel) and higher-level applications. However, its low-level features
are easily misused, resulting in code that is unportable and unmaintainable.
PASCAL’s powerful successor, ADA, introduced packages and generic units –
designed to aid the construction of large modular programs – as well as high-level
forms of exceptions and concurrency. Like PL/I, ADA was intended by its designers
to become the standard general-purpose programming language. Such a stated
ambition is perhaps very rash, and ADA also attracted a lot of criticism. (For
example, Tony Hoare quipped that PASCAL, like ALGOL60 before it, was a marked
advance on its successors!) The critics were wrong: ADA was very well designed,
is particularly suitable for developing high-quality (reliable, robust, maintainable,
efficient) software, and is the language of choice for mission-critical applications
in fields such as aerospace.
We can discern certain trends in the history of programming languages. One
has been a trend towards higher levels of abstraction. The mnemonics and symbolic
labels of assembly languages abstract away from operation codes and machine
addresses. Variables and assignment abstract away from inspection and updating
of storage locations. Data types abstract away from storage structures. Control
structures abstract away from jumps. Procedures abstract away from subroutines.
Packages achieve encapsulation, and thus improve modularity. Generic units
abstract procedures and packages away from the types of data on which they
operate, and thus improve reusability.
Another trend has been a proliferation of paradigms. Nearly all the languages
mentioned so far have supported imperative programming, which is characterized
by the use of commands and procedures that update variables. PL/I and ADA sup-
port concurrent programming, characterized by the use of concurrent processes.
However, other paradigms have also become popular and important.
Object-oriented programming is based on classes of objects. An object has
variable components and is equipped with certain operations. Only these opera-
tions can access the object’s variable components. A class is a family of objects with
similar variable components and operations. Classes turn out to be convenient
reusable program units, and all the major object-oriented languages are equipped
with rich class libraries.
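A minimal sketch of these ideas, in Python, is given below. The class Account and its operations are hypothetical. Note one assumption: Python marks a component as private only by convention (the leading underscore), whereas languages such as C++, JAVA, and ADA can enforce the restriction.

```python
class Account:
    """A class: a family of objects with the same components and operations."""

    def __init__(self, opening_balance: int = 0) -> None:
        # Variable component; by convention only the operations below touch it.
        self._balance = opening_balance

    def deposit(self, amount: int) -> None:
        """Operation that updates the object's variable component."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self) -> int:
        """Operation that inspects the object's variable component."""
        return self._balance
```

Two objects created from Account have separate balances but share the same operations, which is what makes the class a reusable program unit.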
The concepts of object and class had their origins in SIMULA, yet another
ALGOL-like language. SMALLTALK was the earliest pure object-oriented language,
in which entire programs are constructed from classes.
C++ was designed by adding object-oriented concepts to C. C++ brought
together the C and object-oriented programming communities, and thus became
very popular. Nevertheless, its design is clumsy; it inherited all C’s shortcomings,
and it added some more of its own.
JAVA was designed by drastically simplifying C++, removing nearly all its
shortcomings. Although primarily a simple object-oriented language, JAVA can
also be used for distributed and concurrent programming. JAVA is well suited for
writing applets (small portable application programs embedded in Web pages), as
a consequence of a highly portable implementation (the Java Virtual Machine) that
has been incorporated into all the major Web browsers. Thus JAVA has enjoyed a
symbiotic relationship with the Web, and both have experienced enormous growth
in popularity. C# is very similar to JAVA, apart from some relatively minor design
improvements, but its more efficient implementation makes it more suitable for
ordinary application programming.
Functional programming is based on functions over types such as lists and
trees. The ancestral functional language was LISP, which demonstrated at a
remarkably early date that significant programs can be written without resorting
to variables and assignment.
ML and HASKELL are modern functional languages. They treat functions as
ordinary values, which can be passed as parameters and returned as results from
other functions. Moreover, they incorporate advanced type systems, allowing us to
write polymorphic functions (functions that operate on data of a variety of types).
ML (like LISP) is an impure functional language, since it does support variables
and assignment. HASKELL is a pure functional language.
As noted in Section 1.1.1, mathematical notation in its full generality is
not implementable. Nevertheless, many programming language designers have
sought to exploit subsets of mathematical notation in programming languages.
Logic programming is based on a subset of predicate logic. Logic programs infer
relationships between values, as opposed to computing output values from input
values. PROLOG was the ancestral logic language, and is still the most popular.
In its pure logical form, however, PROLOG is rather weak and inefficient, so
it has been extended with extra-logical features to make it more usable as a
programming language.
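The difference between a relation and a function can be suggested by a small Python sketch. It is not how PROLOG works (there is no unification or backtracking here), and the relation PARENT and the helper names are hypothetical; the point is only that a relation can be queried in either direction, with no fixed notion of input and output.

```python
# A relation is a set of tuples: (parent, child).
PARENT = {("alice", "bob"), ("bob", "carol"), ("bob", "dave")}


def query(relation, pattern):
    """Return the tuples that match the pattern; None marks an unknown position."""
    return sorted(
        t for t in relation
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


def grandparent(relation):
    """Infer a new relation: g is a grandparent of c if g is a parent of some parent of c."""
    return {(g, c) for (g, p) in relation for (p2, c) in relation if p == p2}
```

The same relation answers both "who are bob's children?" with the pattern ("bob", None) and "who is bob's parent?" with the pattern (None, "bob").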
Programming languages are intended for writing application programs and
systems programs. However, there are other niches in the ecology of computing.
An operating system such as UNIX provides a language in which a user or system
administrator can issue commands from the keyboard, or store a command
script that will later be called whenever required. An office system (such as a word
processor or spreadsheet system) might enable the user to store a script (‘‘macro’’)
embodying a common sequence of commands, typically written in VISUAL BASIC.
The Internet has created a variety of new niches for scripting. For example, the
results of a database query might be converted to a dynamic Web page by a script,
typically written in PERL. All these applications are examples of scripting. Scripts
(‘‘programs’’ written in scripting languages) typically are short and high-level, are
developed very quickly, and are used to glue together subsystems written in other
languages. So scripting languages, while having much in common with imperative
programming languages, have different design constraints. The most modern and
best-designed of these scripting languages is PYTHON.

Summary
In this introductory chapter:
• We have seen what is meant by programming linguistics, and the topics encompassed
by this term: concepts and paradigms; syntax, semantics, and pragmatics; and
language processors.
• We have briefly surveyed the history of programming languages. We saw how new
languages inherited successful concepts from their ancestors, and sometimes intro-
duced new concepts of their own. We also saw how the major paradigms evolved:
imperative programming, object-oriented programming, concurrent programming,
functional programming, logic programming, and scripting.

Further reading
Programming language concepts and paradigms are covered not only in this book,
but also in TENNENT (1981), GHEZZI and JAZAYERI (1997), SEBESTA (2001), and SETHI
(1996). Programming language syntax and semantics are covered in WATT (1991).
Programming language processors are covered in AHO et al. (1986), APPEL (1998),
and WATT and BROWN (2000).
The early history of programming languages (up to the 1970s) was the theme of a
major conference, reported in WEXELBLAT (1980). Comparative studies of
programming languages may be found in HOROWITZ (1995), PRATT and ZELCOWITZ
(2001), and SEBESTA (2001). A survey of scripting languages may be found in
BARRON (2000).
More detailed information on the programming languages mentioned in this chapter
may be found in the references cited in Table 1.1.

Exercises
Note: Harder exercises are marked *.

Exercises for Section 1.1


1.1.1 Here is a whimsical exercise to get you started. For each programming language
that you know, write down the shortest program that does nothing at all.
How long is this program? This is quite a good measure of the programming
language’s verbosity!

Table 1.1 Descriptions of major programming and scripting languages.

Programming language   Description

ADA                    ISO/IEC (1995); www.ada-auth.org/~acats/arm.html
ALGOL60                Naur (1963)
ALGOL68                van Wijngaarden et al. (1976)
C                      Kernighan and Ritchie (1989); ISO/IEC (1999)
C++                    Stroustrup (1997); ISO/IEC (1998)
C#                     Drayton et al. (2002)
COBOL                  ISO/IEC (2002)
FORTRAN                ISO/IEC (1997)
JAVA                   Joy et al. (2000); Flanagan (2002)
LISP                   McCarthy et al. (1965); ANSI (1994)
HASKELL                Thompson (1999)
ML                     Milner et al. (1997)
MODULA                 Wirth (1977)
PASCAL                 ISO (1990)
PERL                   Wall et al. (2000)
PL/I                   ISO (1979)
PROLOG                 Bratko (1990)
PYTHON                 Beazley (2001); www.python.org/doc/current/ref/
SIMULA                 Birtwhistle et al. (1979)
SMALLTALK              Goldberg and Robson (1989)

Exercises for Section 1.2


*1.2.1 The brief historical survey of Section 1.2 does not mention all major pro-
gramming languages (only those that have been particularly influential, in the
author’s opinion). If a favorite language of yours has been omitted, explain
why you think that it is important enough to be included, and show where your
language fits into Figure 1.1.
*1.2.2 FORTRAN and COBOL are very old programming languages, but still widely used
today. How would you explain this paradox?
*1.2.3 Imperative programming was the dominant paradigm from the dawn of
computing until about 1990, after which it was overtaken by object-oriented
programming. How would you explain this development? Why has functional
or logic programming never become dominant?
Comparative Programming Languages
Contents

Preface to the third edition

1 Introduction
1.1 The diversity of languages
1.2 The software development process
1.3 Language design
1.4 Languages or systems?
1.5 The lexical elements
Summary
Exercises
Bibliography

2 Historical survey
2.1 Early machines
2.2 Fortran
2.3 Algol
2.4 Business data processing languages
2.5 General or multipurpose languages
2.6 Developing programs interactively
2.7 Special-purpose languages
2.8 Systems programming languages
2.9 Modules, classes and abstract data types
2.10 Functional and logic languages
2.11 Conclusions
Summary
Exercises
Bibliography

3 Types, values and declarations
3.1 Names
3.2 Declarations and binding
3.3 Type definitions
3.4 Numeric data types
3.5 Logical types
3.6 Character types
3.7 Enumeration types
3.8 Reference and pointer variables
Summary
Exercises
Bibliography

4 Expressions and statements


4.1 Expressions
4.2 Statements
4.3 Sequencing and control
4.4 Iterative statements
4.5 goto statement
4.6 Exception handling
Summary
Exercises
Bibliography

5 Program structure
5.1 Introduction
5.2 Procedural and object-oriented architecture
5.3 Alternative program architectures
5.4 Separate compilation
5.5 Larger units
Summary
Exercises
Bibliography

6 Procedures, functions and methods


6.1 Introduction
6.2 Parameters
6.3 Functions
6.4 Storage management
6.5 Recursion
6.6 Forward references
6.7 Subprograms as parameters
Summary
Exercises
Bibliography

7 Structured data
7.1 Introduction
7.2 Arrays
7.3 Records and classes
7.4 Dynamic data structures
7.5 Parametrised types
7.6 Strings
7.7 Sets
7.8 Files
Summary
Exercises
Bibliography

8 Inheritance and dynamic binding
8.1 Introduction
8.2 Inheritance
8.3 Polymorphism and dynamic binding
8.4 Comparing the procedural and object-oriented approach
8.5 Abstract methods and classes
8.6 Multiple inheritance
8.7 Problems with inheritance
8.8 Behavioural inheritance
8.9 Conclusion
Summary
Exercises
Bibliography

9 Functional languages
9.1 Introduction
9.2 Lisp
9.3 FP systems
9.4 Modern functional languages
9.5 Concluding remarks
Summary
Exercises
Bibliography

10 Logic programming
10.1 The Prolog approach
10.2 The basics of Prolog
10.3 Data objects
10.4 Efficiency in Prolog
10.5 A Prolog example
10.6 Concluding remarks
Summary
Exercises
Bibliography

11 Concurrency and networking
11.1 Introduction
11.2 Process synchronisation and communication
11.3 Internet programming
11.4 Real-time programming
Summary
Exercises
Bibliography

12 Syntax and semantics


12.1 Syntax
12.2 Semantics
Summary
Exercises
Bibliography

13 Input, output and GUIs


13.1 Introduction
13.2 Input and output of text
13.3 Graphical user interfaces
13.4 Binary and direct access files
13.5 Multimedia
Summary
Exercises
Bibliography

14 The future
14.1 Introduction
14.2 Procedural and object-oriented languages
14.3 Declarative languages
14.4 Language design
14.5 Implementation considerations
14.6 Development methods
14.7 Conclusions
Summary
Bibliography

Appendices
1 Language summaries
2 Language texts

Solutions to selected exercises

Index
Preface to the third edition

Aims and objectives

In this book we consider the principal programming language concepts and show how
they are dealt with in object-oriented languages such as Java and Delphi, in traditional
procedural languages such as Pascal, C and Fortran, in hybrid object-oriented or
object-based languages such as C++ and Ada 95, in functional languages such as ML and
in logic languages like Prolog.
The programming language scene has always been bedevilled by protagonists pushing
their favourite language. This often leads to a reluctance to examine rival languages
and to a 'love me, love my language' approach. The lack of a scientific approach to
languages tends to emphasise the differences between them, even when these are quite
minor. Our approach is to find common ground between languages and to identify
the underlying principles. Although the top-level organisation of a program written
in an object-oriented language is different from that of one written in a procedural
language, their underlying principles are the same. Similarly, although the approach
of logic languages and functional languages is different from that of procedural and
object-oriented languages, there are many areas of similarity, and seeing these helps
to understand the differences better.
This approach is important at the present time when there are major and important
controversies as to which way we should be heading. To what extent is it worthwhile
developing new versions of the old faithfuls as has happened with Fortran 90 and
Object-Oriented Cobol? Should we standardise instead on modern object-oriented
languages like Java? It is certainly the case that procedural languages have proved
themselves extremely resilient and have been given a new lease of life by the
incorporation of object-oriented features as has happened with C++, Ada 95 and Object
Pascal. What is going to be the future position of functional and logic languages?
What effect will new hardware designs and the growth of the Internet have on
programming languages? Although it is likely that many of these issues will not be
resolved for many years, it is important that they are addressed.
The type of course taught in our department is, we think, fairly typical. It combines
both theoretical and practical work and so there is a limit to the amount of material that
can be covered. We feel that it is important to get the balance right. A purely theoretical
approach usually leaves most computing students unable to fit the theory with their
practical experience, while an approach that merely piles one language on top of the
other leaves the student ignorant of the common threads in language design.

Changes from the second edition


There have been major developments in programming languages and their use since
the second edition was published in 1993 and the changes reflect this. Revisions have
been made throughout the book to bring material up to date and as a result of feedback
from students at Stirling University. Although object orientation was a major paradigm
in 1993, object-oriented languages had not achieved the central position that they enjoy
today. Secondly, features to support Graphical User Interfaces (GUIs) have only recently
become an integral part of language design. This has meant that topics such as event-driven
programming have moved from being advanced to being taught in introductory
courses.
Changes include:
• Java joins Pascal and C++ as one of the central languages used in the book,
• the object-oriented approach plays a central role,
• discussion of language support for graphical user interfaces and event-driven programs,
• discussion of scripting languages such as Perl,
• revised discussion of concurrency is centred on Ada 95 tasks and Java threads,
• discussion of Internet programming includes applets, CGI programs and CORBA,
• revised discussion on exceptions with exception handling in Eiffel being replaced
by exception handling in Java,
• web site giving course support and executable program examples.
Our aim has been to keep the size of the book within bounds. Therefore, to offset
the inclusion of new material, we have significantly reduced or removed coverage of
languages which are no longer in widespread use.
The first two editions were written jointly by Leslie Wilson and Robert Clark while
the changes to the third edition are solely the work of Robert Clark. He therefore takes
full responsibility for any errors that may have been introduced!

Intended audience

Although this book is intended primarily as a student text for a comparative language or
language concepts course, we hope that it will also be read by practising computer programmers.
It is easy to be overwhelmed by the large number of available programming
languages and we hope this book provides some guidance through the programming
language jungle.
Students are likely to have been taught either an object-oriented or a procedural
language as their first language with the proportion being taught an object-oriented
language continuing to increase in the next few years. Hence, while the previous editions
assumed proficiency in a structured procedural language such as Pascal, C or Ada, this
edition is equally suited to students whose first language is an object-oriented language
such as Java, C++ or Delphi and who do not have experience of a procedural language.
We have not assumed any prior knowledge of logic or functional languages.

Structure and contents

Many books on comparative programming languages have separate chapters on each of
the main languages covered. We have not adopted this approach, but have organised the
book so that there are separate chapters on the main language concepts with examples
and discussion of how they are dealt with in particular languages. The emphasis is on
object-oriented and procedural languages and although a wide range of languages are
covered, Java, Pascal, Ada, C and C++ are used in a central role.
Chapter 1 Introduction This shows how programming languages fit into the software
development process and the effect this has had on language design.
Chapter 2 Historical survey This provides a historical survey so that the development
of present languages can be traced.
Chapter 3 Types, values and declarations This deals with variables, types and
declarations. It is shown that many of the differences between languages can be
explained in terms of the binding time of their attributes.
Chapter 4 Expressions and statements This discusses expressions and statements
with the emphasis on structured control statements.
Chapter 5 Program structure This deals with the high-level organisation of object-oriented
and procedural languages, including hybrids between the two.
Chapter 6 Procedures, functions and methods This looks at procedures and methods
with particular attention being paid to parameter-passing mechanisms.
Chapter 7 Structured data This deals with arrays, records and dynamic data structures
and how classes can be used to implement abstract data types.
Chapter 8 Inheritance and dynamic binding The concepts of inheritance and dynamic
binding are introduced and explained through examples in Java, C++ and Ada 95.
Chapter 9 Functional languages This introduces functional languages and describes
both the traditional LISP approach, together with its modern dialect Scheme, and
the approach taken in newer languages such as FP and ML.
Chapter 10 Logic programming This deals with logic programming and is mainly
concerned with the language Prolog.
Chapter 11 Concurrency and networking This describes how concurrency is handled
in Ada 95 and Java and how these languages deal with inter-process communication
and synchronisation. Applets, CGI scripts and distributed programming are
then discussed.
Chapter 12 Syntax and semantics This describes how the syntax of programming
languages can be described formally in BNF and outlines two approaches, denotational
and axiomatic semantics, for formally describing the semantics of programming
language constructs.
Chapter 13 Input, output and GUIs This describes both traditional text-based input
and output and the now standard approach of graphical user interfaces.
Chapter 14 The future This reviews the present situation and attempts to predict
the direction in which language design is likely to go.
Appendix 1 Language summaries This summarises the features of the principal
languages dealt with in the text.
Appendix 2 Language texts This gives an annotated bibliography for a wide range
of programming languages.
Chapters 1-6 introduce the basic concepts and are best read in the given order, while
Chapters 7-13 build on earlier material and, as they are largely self-contained, can be
taken in any order. Each chapter contains a synopsis, outlining the major topics to be
covered, a concise end-of-chapter summary, exercises and an annotated bibliography.
Solutions to selected exercises are given at the end of the book.
Finally we should make clear what this book is not. It is not a language reference
manual or a text on language implementation. We make no attempt to teach any particular
language, but the annotated bibliography at the end of the book gives suggestions
for further reading in all the languages covered. It is also not a book on the principles of
program construction, although throughout the book we view each language construct
from the point of view of whether or not it helps or hinders the construction of readable
and reliable programs.

Acknowledgements

We would like to thank colleagues in the Computing Science Department at the Uni-
versity of Stirling, in particular Alan Hamilton, Simon Jones, Sam Nelson, Charles
Rattray and Leslie Smith. They have acted as an ideal sounding board and their helpful
criticisms of various chapter drafts have prevented us from straying too far into error.
In addition, we would like to thank Kate Brewin of Pearson Education for her help
and encouragement.
Robert Clark
Stirling, June 2000
Introduction
1.1 The diversity of languages
1.2 The software development process
1.3 Language design
1.4 Languages or systems?
1.5 The lexical elements

This chapter looks at the different stages involved in the development of software and concludes
that the main purpose of a programming language is to help in the construction of
reliable software. It also discusses how designers have tried to include expressive power,
simplicity and orthogonality in their languages whilst noting that pragmatic matters such
as implementation and error detection have a significant influence. We also consider the
distinction between a language and its development environment.
The basic low-level building blocks used in the construction of a language are considered;
that is, the character set, the rules for identifiers and special symbols, and how
comments, blanks and layout are handled.

1.1 The diversity of languages

Although over a thousand different programming languages have been designed by
various research groups, international committees and computer companies, most of
these languages have never been used outside the group which designed them while
others, once popular, have been replaced by newer languages. Nevertheless, a large
number of languages remain in current use and new languages continue to emerge.
This situation can appear very confusing to students who have mastered one language,
often Pascal, Delphi, C++ or Java, and perhaps have a reading knowledge of a couple
of others. They might well ask: 'Does a lifetime of learning new languages await
me?'
Fortunately, the situation is not as bleak as it appears because, although two
languages may seem to be superficially very different, they often have many more
similarities than differences. Individual languages are not usually built on separate
principles; in fact, their differences are often due to quite minor variations in the same
principle.
The aim of this book is to consider the principal programming language concepts
and to show how they have been dealt with in various languages. We will see that by
studying these features and principles we can better understand why languages have
been designed in the way they have. Furthermore, when faced with a new language,
we can identify where the language differs from those we already know and where it
provides the same facilities disguised in a different syntax.

1.2 The software development process


A computer is a tool that solves problems by means of programs (or software) written
in a programming language. The development of software is a multi-stage process.
First, it is necessary to determine what needs to be done. Unfortunately, initial informal
user requirements are usually vague, inconsistent, ambiguous and incomplete. The
purpose of requirements analysis is to understand and clarify the requirements and
often involves resolving the conflicting views of different users.
The next stage is concerned with the production of a document, the specification,
which defines as accurately as possible the problem to be solved; in other words, it
determines what the system is to do. Requirements analysis and specification are the
most difficult tasks in software development.
Having defined what the system is to achieve, we then design a solution and implement
the design on a computer. It is only at the implementation stage that a programming
language becomes directly involved.
The aim of validation and verification is to show that the implemented solution
does what the users expect and satisfies the original specification. Although there has
been a lot of theoretical work on verification, or program proving, it is usually still
necessary to run the program with carefully chosen test data. But the problem with
program testing is that it can only show the presence of errors; it can never prove their
absence.
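The limitation of testing can be illustrated with a minimal sketch. It is written in Python purely for brevity, and the function `is_leap_year` and its test data are hypothetical, chosen only for illustration. The function contains a deliberate error, yet every case in a plausible-looking test set passes; only one further, carefully chosen case reveals the fault.

```python
def is_leap_year(year):
    """Deliberately incomplete leap-year rule (hypothetical example)."""
    # Error: the century rule is ignored (1900 was not a leap year).
    return year % 4 == 0


# A plausible test set: every case passes, yet the function is still wrong.
test_data = {1996: True, 1997: False, 2004: True, 2023: False}
all_passed = all(is_leap_year(year) == expected
                 for year, expected in test_data.items())

# One more carefully chosen case exposes the error the other tests missed.
century_case_correct = (is_leap_year(1900) is False)
```

Passing the first four tests says nothing about the years that were not tried, which is exactly why successful tests cannot establish the absence of errors.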
The final stage of software development, usually termed maintenance, covers two
quite distinct activities:
1. The correction of errors that were missed at an earlier stage but have been detected
after the program has been in active service.
2. Modification of the program to take account of additions or changes in the users'
requirements.
Although a programming language is only explicitly introduced during the implementation
stage, it has traditionally influenced the earlier stages of the process. Designers
are, for example, often aware of the implementation language to be used and bias their
designs to take account of the language's strong points.
Figure 1.1 The waterfall model of software development.

Development models
It is important to realise that the software development process is iterative, not sequential.
Therefore, knowledge gained at any one of the stages outlined can (and should) be
used to give feedback to earlier stages. The traditional approach is to treat the different
stages in the development process as being self-contained and this has led to the waterfall
model of software development shown in Figure 1.1.
However, there has been increasing acceptance of the idea that an incremental and
iterative approach is much more realistic. Central to this approach is the idea of risk
management. Every time we make a decision, there is the possibility that we get it
wrong. We therefore want to have continual feedback to show up possible errors because
the longer an error remains undetected, the more expensive it will be to put right. This
led to the spiral model shown in Figure 1.2 (Boehm, 1988). We start at the centre of the
spiral and go repeatedly through the different stages as our system is built incrementally.
Many modern languages are object-oriented and this has led to the creation of
object-oriented development methods. In other development methods, there is a clear
distinction in the techniques used in specification, design and implementation. However,
in object-oriented development, a problem can be understood and a solution designed
and then implemented using the same framework of a set of communicating objects.
The object-oriented development process is therefore well suited to an incremental and
iterative approach. At any given stage, different objects can be described at different
levels of abstraction. As the iterative development process continues, we incrementally
add more detail to the object descriptions.

Figure 1.2 Simplified Boehm spiral model (labels recoverable from the figure: determine objectives, alternatives, constraints; evaluate alternatives, identify and resolve risks; plan next phase).

The need for a notation in which a specification or design can be written down
has led to the development of both specification and program design languages. Such
languages are at a higher level of abstraction and give fewer details than implementation
languages. Many specification languages are mathematical in form and are amenable to
proof techniques. However, languages of this type are outside the scope of this book and
so we only look at what are conventionally considered to be implementation languages,
although some functional languages have been used as executable specification lan-
guages.
Another approach is to use graphical notations to capture the requirements and
represent designs. Examples of diagrams that occur in many different development
methods are data flow diagrams, entity relationship diagrams, state transition diagrams
and message sequence charts. A problem is that each development method can use its
own set of diagram notations which, although they are representing much the same thing,
can differ in detail. This situation is far from satisfactory as it can suggest differences
in the development process that do not really exist. In object-oriented development,
a standard notation called the Unified Modeling Language (UML) has been adopted
so that different methods now use a common notation (Booch et al., 1998). There are
several different kinds of diagram in UML, but the single most important is the class
diagram which shows the classes involved in an object system and their associations.
An example class diagram is shown in Chapter 8.
The use of a systematic software development process has greatly influenced both
language design and how languages are used. For example, Pascal was designed to
support the ideas of structured programming. The problems of constructing large
systems and of program maintenance led to the introduction of language features that
allow large systems to be broken down into self-contained modules. Packages in Ada
and classes in object-oriented languages satisfy that need. It is clear, therefore, that
programming languages do not exist in a vacuum; rather, the design of modern languages
is a direct response to the needs and problems of the software development process.

1.3 Language design

Most widely used programming languages are imperative; examples are Fortran,
COBOL, C, C++, Pascal, Ada and Java. A program written in an imperative language
achieves its effect by changing the value of variables or the attributes of objects by
means of assignment statements. Until quite recently, most widely used imperative
languages were procedural, that is their organisation was centred around the definition
of procedures. Many procedural languages have now been extended to include object-oriented
features (C by C++, Pascal by Delphi, Ada by Ada 95, COBOL by OO COBOL,
Basic by Visual Basic) while other new purely object-oriented languages such as Eiffel
and Java have been designed. Object-oriented programs are organised as a set of objects
which communicate with one another through small strictly defined interfaces.
Other approaches to language design include functional languages (such as pure
Lisp and ML) and logic languages (such as Prolog). These alternative approaches are
dealt with in Chapters 9 and 10 respectively.
The primary purpose of a programming language is to support the construction
of reliable software. Hence, in most modern languages, type checking takes place at
compile time, which is a considerable help in catching logical errors before the program
is run. It is also important that a language is user friendly so that it is straightforward to
design, write, read, test, run, document and modify programs written in that language.
To understand how these objectives may be achieved, the issues of language design
can be divided into several broad categories:

• expressive power,
• simplicity and orthogonality,
• implementation,
• error detection and correction,
• correctness and standards.

Expressive power
A programming language with high expressive power enables solutions to be expressed
in terms of the problem being solved rather than in terms of the computer on which
the solution is to be implemented. Hence, the programmer can concentrate on problem
solving. Such a language should provide a convenient notation to describe both algorithms
and data structures in addition to supporting the ideas of structured programming
and modularisation.
Another aspect of expressive power is the number of types provided together with
their associated operations. Instead of providing a large number of built-in types, most
modern languages provide facilities, such as the Ada package or the C++ and Java class,
for defining new types, called abstract data types. Such languages can then provide a
wide range of predefined types by means of standard libraries which the programmer
can use to build new types for the problem in hand. When a language, together with
its standard libraries, does not include a suitable range of types and operations, then
the programmer generally has to provide these by declarations, thereby distracting
the programmer's attention to the lower level aspects of solving the problem. Often,
languages may have high expressive power in some areas, but not in others; for example,
Ada has a range of numerical operations that give it expressive power for numerical
work, but it is less effective in data processing applications.
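As a rough illustration of an abstract data type, here is a minimal sketch in Python (chosen for brevity; the text itself refers to the Ada package and the C++ and Java class). The class name `Stack` and its operations are illustrative assumptions. Users of the type see only the operations, not the list used to represent it.

```python
class Stack:
    """A stack abstract data type: clients use push, pop and is_empty,
    and need not know how the items are stored."""

    def __init__(self):
        self._items = []  # hidden representation (private by convention)

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


# Usage: the problem is expressed in terms of stack operations only.
s = Stack()
s.push(1)
s.push(2)
top = s.pop()
```

Because clients depend only on the operations, the representation could later be changed without affecting programs that use the type.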
Also included under the heading of expressive power is readability; that is, the ease
with which someone familiar with the language can read and understand programs
written by other people. Readability is considerably enhanced by a well-designed
comment facility, and good layout and naming conventions. In practice, it should be
possible to write programs that can act, to an extent, as their own documentation, thereby
making maintenance and extension of the program much easier.

Simplicity and orthogonality


Simplicity implies that a language allows programs to be expressed concisely in a
manner that is easily written, understood and read. This objective is often underrated
by computer scientists, but is a high priority for non-professional programmers. The
success, first of Basic and then of Visual Basic, is an eloquent commentary on the
importance that users place on simplicity.
A simple language either avoids complexity or handles it well. Inherent in most simple
languages is the avoidance of features that most human programmers find difficult.
Simple languages should not allow alternative ways of implementing constructs nor
should they produce surprising results from standard applications of their rules. An
orthogonal language is one in which any combination of the basic language constructs is
allowed and so there are few, if any, restrictions or special cases. Examples of orthogonal
languages are Algol 68 and Smalltalk, which were both designed with the aim of keeping
the number of basic concepts as small as possible. The idea was that the resulting
language would be simple as it would only consist of combinations of features from a
small set of basic concepts.
There can, however, be a clash between the ideas of orthogonality and simplicity.
For example, Pascal, which is not orthogonal, is simpler to learn and use than Algol 68.
Where a new special construct is introduced in Pascal, the same effect is achieved in
Algol 68 by the combination of simpler existing constructs. As an example, Pascal
separates the notion of the type of a parameter from whether it is a value or a variable
parameter. (Details are given in Chapter 6.) Algol 68, on the other hand, combines both
pieces of information within the parameter type. Although the Algol 68 approach is
elegant and powerful, the more pragmatic approach (Wirth, 1975) taken in the design
of Pascal has led to a more understandable language.
What is generally agreed is that the use of constructs should be consistent; that is,
they should have a similar effect wherever they appear. This is an important design
principle for any language although it is obviously of great importance in an orthogonal
language which gets its expressive power from a large number of combinations of basic
concepts. Whether simplicity or orthogonality is the goal, once the basic constructs
are known, their combination should be predictable. This is sometimes called the law
of minimum surprise. However, again the importance of simplicity should not be
underestimated. In Java, the declaration int x; defines x to be an integer variable
while the declaration SomeClass x; defines x to be a reference to a SomeClass
object. We therefore have the same syntax meaning different things. Although this is
inconsistent, it can be argued that inventing new syntax to make the distinction clear
would have just complicated matters.

Implementation
Execution of a program written in an imperative language, such as Pascal, Ada or C++,
normally takes place by translating (compiling) the source program into an equivalent
machine code program. This machine code program is then executed. The ease with
which a language can be translated and the efficiency of the resulting code can be
major factors in a language's success. Large languages, for example, have an inherent
disadvantage in this respect because the compiler will, almost inevitably, be large, slow
and expensive.
An alternative to compiling a source program is to use an interpreter. An interpreter
can directly execute a source program, but what is more common is for a source program
to be translated into some intermediate form which is then executed by the interpreter.
The interpreter can be said to implement a virtual machine. Executing a program
under the control of an interpreter is much slower than running the equivalent machine
code program, but does give much more flexibility at run time. The added flexibility
is important in languages whose main purpose is symbolic manipulation rather than
numerical calculation. Examples of such languages are the string processing language
SNOBOL4, the object-oriented language Smalltalk, the functional language Lisp, the
logic language Prolog and the scripting language Perl.
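The idea of translating a program into an intermediate form that an interpreter then executes can be sketched with a toy example. The Python code below is an illustration only: the instruction names (`PUSH`, `ADD`, `MUL`), the postfix source notation and the functions `compile_expr` and `run` are all assumptions invented for this sketch, not part of any real virtual machine.

```python
def compile_expr(tokens):
    """Translate postfix tokens into instructions for a toy stack machine."""
    code = []
    for tok in tokens:
        if tok == "+":
            code.append(("ADD",))
        elif tok == "*":
            code.append(("MUL",))
        else:
            code.append(("PUSH", int(tok)))
    return code


def run(code):
    """Interpret the intermediate code: this loop plays the virtual machine."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if instr[0] == "ADD" else left * right)
    return stack.pop()


# The source "2 3 4 * +" means 2 + (3 * 4) in postfix notation.
program = compile_expr("2 3 4 * +".split())
result = run(program)
```

The translated `program` is independent of the machine it was produced on; any computer with an implementation of `run` can execute it, at the cost of the interpretive loop.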
The use of an interpreter also supports an interactive programming environment
in which programs may be developed incrementally. When developing a Lisp program,
for example, a programmer can interact directly with the Lisp interpreter and type in
the definition of functions followed by expressions which call these functions. The
expressions are immediately executed and the results made available. This allows the
early detection, and easy correction, of logical errors. Once the complete program has
been developed, it can be compiled so that it will run faster.
Java is an imperative language and so we would expect that it would normally
be compiled into machine code. However, that is not the case; Java programs are
interpreted. An exciting use of Java is to animate web pages. A person can download a
web page which contains a Java applet (a small application) and, using a Java-enabled
web browser such as Netscape or Internet Explorer, can run the applet. To achieve this,
it must be possible for a Java program to be translated on one computer and to run
on a different kind of computer and the easiest way of doing this is to translate Java
source programs into code for a Java virtual machine. Java-enabled browsers provide
interpreters for the Java virtual machine.
Some language designers, notably Wirth, the designer of Pascal and Modula-2, have
made many of their design decisions on the basis of the ease with which a feature can
be compiled and executed efficiently. One of the many advantages of having a close
working relationship between the language design and language implementation teams
is that the designers can obtain early feedback on constructs that are causing trouble.
Often, features that are difficult to translate are also difficult for human programmers
to understand. Algol 68 is a prime example of a language that had a lack of success due
to the fact that it was designed by a committee who largely ignored implementation
considerations, as they felt that such considerations would restrict the ability to produce
a powerful language. In contrast, the implementation of C, C++, Pascal and Java went
hand in hand with their design and the Ada design team was dominated by language
implementers.
However, it is necessary to achieve a proper balance between the introduction of
powerful new features and their ease of implementation. ISO Standard Pascal, for
example, has features, such as procedures being able to accept array parameters of
differing lengths, which were omitted from the original version of the language on the
grounds that they were too expensive to implement.

Error detection and correction


It is important that programs are correct and satisfy their original specification. However,
demonstrating that this is indeed the case is no easy matter. As most programmers still
rely on program testing as a means of showing that a program is error free, a good
language should assist in this task. It is therefore sensible for language designers to
include features that help in error detection and to omit features that are difficult to
check.
Ideally, errors should be found at compile time when they are easier to pinpoint and
correct. The later an error is detected in the software development process, the more
difficult it is to find and correct without destroying the program structure.
As an example of the importance of language design on error detection, consider the
original Fortran method of type declarations where the initial letter of a variable name
implicitly determines the type of the variable. Although this method is convenient and
greatly reduces the number of declarations required, it is in fact inherently unsound
since any misspelling of variable names is not detected at compile time and leads to
logical errors.
Conversely, explicit type declarations have the following advantages. Firstly, they
provide extra information that enables more checking to be carried out at compile time
and, secondly, they act as part of the program documentation.
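The hazard of implicit declarations can be illustrated by analogy. The sketch below is in Python rather than Fortran, and the function names are hypothetical. Python, like original Fortran, requires no declarations, so a misspelt name on the left of an assignment silently creates a new variable instead of being reported as an error.

```python
def total_with_typo(values):
    total = 0
    for v in values:
        # Misspelling: 'totl' is silently created as a new variable,
        # so 'total' is never updated and no error is reported.
        totl = total + v
    return total


def total_correct(values):
    total = 0
    for v in values:
        total = total + v
    return total


wrong = total_with_typo([1, 2, 3])
right = total_correct([1, 2, 3])
```

In a language with compulsory explicit declarations, the undeclared name `totl` would be rejected at compile time rather than surfacing later as a logical error.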

Correctness and standards


The most exacting requirement of correctness is proving that a program satisfies its
original specification. With the major exception of purely functional languages, which
are amenable to mathematical reasoning, such proofs of correctness have not, as yet,
had a major influence on language design. However, the basic ideas of structured
programming do support the notion of proving the correctness of a program, as it
is clearly easier to reason about a program with high-level control structures than about
one with unrestricted goto statements.
To prove that a program is correct, or to reason about the meaning of a program, it
is necessary to have a rigorous definition of the meaning of each language construct.
(Methods for defining the syntax and semantics of a language are discussed in Chapter 12.)
However, although it is not difficult to provide a precise definition of the syntax
of a language, it is very difficult, if not impossible, to produce a full semantic definition
and as far as most programmers are concerned it is unreadable anyway.
It is therefore vital in the early stages of a language's development to have an informal
description that is understandable by programmers. As in many aspects of computer
science, there needs to be a compromise between exactness and informality.
A programming language should also have an official standard definition to which all
implementers adhere. Unfortunately, this seldom happens as implementers often omit
features that are difficult to implement and add features that they feel will improve the
language. As a result, program portability suffers. The exception to this is Ada. An Ada
compiler must be validated using a specially constructed suite of test programs before
it can be called an Ada compiler. It is interesting that one of the aims of these tests is
to rule out supersets as well as subsets of the language. This is an excellent idea and it
is hoped that it will become the norm.

1.4 Languages or systems?


An important feature of many modern languages is that they support network programming
and the creation of graphical user interfaces (GUIs). These features are often
provided through libraries and so we have the question of whether they are part of the
language or part of its support environment. The problem is compounded by the fact
that GUIs and networking are often highly dependent on the facilities provided by the
operating system.
A major advantage of Java is that it provides support for GUIs and network programming
in a way that is independent of any particular operating system. Java has an
extensive set of standard libraries where the necessary facilities are defined in terms of
the Java Virtual Machine. As all Java programs make heavy use of these libraries, they
are regarded by Java programmers as an integral part of the language.
Languages such as Visual Basic and Delphi also provide these facilities through
an extensive set of libraries, but differ from Java in that they are closely tied to
a particular operating system, namely Microsoft Windows. This allows a close and
efficient integration between the language and operating system facilities. However, it
does do away with one of the major advantages of high-level languages which is that
they are machine independent. It also raises the question of when we are talking about
a new language and when we are talking about a new implementation of an existing
language.
There are many different implementations of C++, each of which provides its own set
of libraries for GUIs. It is therefore clear when we are talking about the C++ language
and when we are talking about a particular implementation. However, the GUI and
networking facilities of Visual Basic and Delphi form such a large part of the system
used by their programmers that they can claim to be new languages although they do
have Basic and Object Pascal respectively as their core. Moreover, Visual Basic and
Delphi both provide extensive visual development environments. One view is therefore
that they are not languages, but are system development environments. This lack of a
clear distinction between a language and its development environment will continue to
increase as support facilities become ever more sophisticated.
An important feature of programs that use graphical user interfaces is that they are
event driven. They wait for some user event such as the click of a mouse over the
representation of a button on the screen, handle that event and then wait for the next
user event to occur. This leads to a very different program structure from that provided
by traditional programming languages. Writing event driven programs is difficult, but is
dealt with in languages such as Java, Visual Basic and Delphi by most of the work being
done behind the scenes. This allows the programmer to work at a very high level of
abstraction and not worry about implementation details. With earlier languages, event
handling had to be explicitly programmed. This is therefore another example of where
the distinction between a language and its supporting environment has become blurred.
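The event-driven structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the EventLoop class and the event names button_click and key_press are hypothetical, and a real GUI library would block while waiting for the operating system to deliver events rather than drain a prepared queue.

```python
from collections import deque


class EventLoop:
    """Hypothetical minimal event loop: handlers are registered per event name."""

    def __init__(self):
        self.handlers = {}    # event name -> list of handler functions
        self.queue = deque()  # pending (event name, payload) pairs

    def on(self, event_name, handler):
        """Register a handler to be called whenever event_name occurs."""
        self.handlers.setdefault(event_name, []).append(handler)

    def post(self, event_name, payload=None):
        """Simulate a user event arriving."""
        self.queue.append((event_name, payload))

    def run(self):
        """Handle queued events in arrival order until none remain."""
        while self.queue:
            name, payload = self.queue.popleft()
            for handler in self.handlers.get(name, []):
                handler(payload)


clicks = []
loop = EventLoop()
loop.on("button_click", lambda button: clicks.append(button))
loop.post("button_click", "OK")
loop.post("key_press", "x")        # no handler registered, so it is ignored
loop.post("button_click", "Cancel")
loop.run()
```

The programmer writes only the handlers; the loop that waits for and dispatches events is the part that languages such as Java, Visual Basic and Delphi keep behind the scenes.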

1.5 The lexical elements


The basic building blocks used in writing programs in a particular language are often
known as the lexical elements. This covers such items as the character set, the rules for
identifiers and operators, the use of keywords or reserved words, how comments are
written, and the manner in which blanks and layout are handled.

Character set
The character set can be thought of as containing the basic building blocks of a pro-
gramming language - letters, digits and special characters such as arithmetic operators
and punctuation symbols. Two different approaches were taken when deciding the
character set to be used in early languages. One is to choose all the characters deemed
necessary. This is the approach taken with APL and Algol 60, but it has the drawback
that either special input/output equipment has to be used or changes have to be made
to the published language when it is used on a computer.
The other approach is to use only the characters commonly available with current
input and output devices. Hence, the character set of early versions of Fortran was
restricted by the 64 characters available with punched cards while Pascal initially was
constrained by the character set available with the CDC 6000 series computer on which
it was first implemented.
Since the early 1970s, most input and output devices have supported internationally
accepted character sets such as ASCII (American Standard Code for Information
Interchange) and this has been reflected in the character sets of languages. The ASCII
character set has 128 characters of which 95 are printable; the remaining characters are
special control characters. The printable characters are the upper and lower case letters,
digits, punctuation characters, arithmetic operators and three different sets of brackets
(), [] and {}. Composite symbols are used to extend the range of symbols available.
Commonly used examples are the relational operators <= and >= and the assignment
operator := used in the Algol family of languages.
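The ASCII figures quoted above can be checked with a short Python sketch. It assumes the usual convention that codes 32 to 126 (including the space) are printable and that codes 0 to 31 together with 127 are control characters; the variable names are illustrative only.

```python
# Classify the 128 ASCII codes: 32..126 are printable (including the space),
# while 0..31 and 127 are control characters.
printable = [chr(code) for code in range(128) if 32 <= code <= 126]
control = [code for code in range(128) if code < 32 or code == 127]

# Break the printable characters down into some of the groups named in the text.
letters = [c for c in printable if c.isalpha()]
digits = [c for c in printable if c.isdigit()]
brackets = [c for c in printable if c in "()[]{}"]
```

Counting the lists gives 95 printable and 33 control characters, with 52 letters, 10 digits and the three pairs of brackets among the printable ones.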
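More recently, the Unicode character set has been created to give a much larger
range of characters. Unicode was originally designed as a 16-bit code, compared with
the 7 bits of an ASCII character (normally stored in an 8-bit byte). Java uses the
Unicode character set: its char type holds a 16-bit unit, and characters outside the
original 16-bit range are represented by a pair of such units.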
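As a rough illustration of character widths, the following Python sketch counts how many 16-bit units are needed to encode a character in UTF-16, the encoding on which Java's char type is based. The helper name utf16_units is hypothetical. Characters added to Unicode beyond the original 16-bit range need two units.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode one character in UTF-16."""
    return len(ch.encode("utf-16-be")) // 2


ascii_letter = "A"            # U+0041, also an ASCII character
accented = "\u00e9"           # U+00E9, e with acute accent: not in ASCII
emoji = "\U0001F600"          # U+1F600, outside the original 16-bit range

units = {ch: utf16_units(ch) for ch in (ascii_letter, accented, emoji)}
ascii_bytes = len(ascii_letter.encode("ascii"))  # one byte per ASCII character
```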

Identifiers and reserved words


The character set is the collection from which the symbols making up the vocabulary of
a programming language are formed. Clearly, a language needs conventions for grouping
characters into words so that names (usually known as identifiers in computing) can be
given to entities such as variables, constants, etc. (Naming conventions are discussed
in Chapter 3.)
Some of the words in a programming language are given a special meaning. Examples
of this are DO and GOTO in Fortran and begin, end and for in Pascal. Two methods
are used for including such words in a language. The method adopted by Fortran is to
allow such words to have their special well-defined meaning in certain contexts. The
words are then called keywords. This method was also adopted by the designers of PL/I
since it limited the number of special words that the programmer had to remember - the
scientific programmer using PL/I is unlikely to know all the business-oriented keywords
while the business programmer is unlikely to know all the scientific keywords. However,
the drawback of this method is that the reader of a program written in a language with
keywords has the task of deciding whether a keyword is being used for its special
meaning or is an occurrence of an ordinary identifier. Furthermore, when an error
occurs due to the inadvertent use of an unknown keyword, it is not always clear when
a word has its special meaning, without consulting all the declarations.
The alternative method, used initially by COBOL and adopted by most modern
languages, is to restrict the use of such words to their special meaning. The words are
then called reserved words. The advantage of the reserved word method is best seen
in languages like Pascal and C++ where the number of such words is quite small.
In COBOL, however, the number of reserved words is much larger - over 300 -
and so the programmer has the task of remembering a large number of words that
must not be used for such things as variables. As well as reserved words, languages
often have predefined identifiers. These are ordinary identifiers that have been given
an initial definition by the system, but which may be redefined by the programmer.
Examples, in Pascal, are the predefined type Integer and the input and output
procedures read and write. In languages such as Ada and Java, a large number
of identifiers are defined in the standard libraries. Such identifiers can be redefined in
programs.
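The difference between a reserved word and a predefined identifier can be demonstrated in Python, a language not discussed in this chapter but one which follows the reserved word method. The sketch below is illustrative only: the helper can_be_assigned is hypothetical, and it simply asks the Python compiler whether a name may appear on the left of an assignment.

```python
import keyword


def can_be_assigned(name: str) -> bool:
    """Return True if 'name = 0' compiles, i.e. the name may be used as a variable."""
    try:
        compile(f"{name} = 0", "<example>", "exec")
        return True
    except SyntaxError:
        return False


# 'for' is a reserved word: it can never be used as an ordinary identifier.
reserved_ok = can_be_assigned("for")

# 'len' is a predefined identifier: it has an initial definition supplied by
# the system, but a program is free to redefine it.
predefined_ok = can_be_assigned("len")

# The standard keyword module lists the reserved words of the language.
number_of_reserved_words = len(keyword.kwlist)
```

Here reserved_ok is False and predefined_ok is True, which mirrors the position of begin and Integer in Pascal.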
In Algol 60 programs, reserved words are written in a different typeface, either
underlined or bold face, depending on the situation. The drawback of this is that many
input devices cannot cope with underlined words, so less attractive alternatives, such as
writing reserved words in quotes, had to be used. In handwritten versions of programs
in Pascal and Ada, for example, reserved words are often underlined so that they stand
out while in books they are often printed in boldface. In the version presented to a
compiler, however, they are typed in the same way as ordinary identifiers.

Comments
Almost all languages allow comments, thereby making the program more readily
understood by the human reader. Such comments are, however, ignored by the compiler.
In early languages such as Fortran, which has a fixed format of one statement per line,
comments are terminated by the end of a line. In Fortran's case, the comment lines were
started by a C in column 1. A similar method is used in other early languages such as
COBOL and SNOBOL4.
Algol 60 uses a different method for comments: they begin with the reserved word
comment and terminate with a semi-colon. However, the problem with this method is
that programmers often fail to terminate comments correctly. Consequently, the com-
piler, interpreting the program exactly, incorporates the next declaration or statement
into the comment. Errors caused in this way are difficult to find as the error message, if
one is generated, is usually quite unrelated to the actual error.
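The effect of a missing semi-colon can be seen with a small Python sketch that removes Algol 60 style comments from a piece of source text. It is a deliberate simplification: the function strip_algol_comments is hypothetical, and it treats every occurrence of the word comment as the start of a comment, whereas Algol 60 recognises it only in certain positions.

```python
import re


def strip_algol_comments(source: str) -> str:
    """Remove text from the word 'comment' up to and including the next ';'."""
    return re.sub(r"\bcomment\b[^;]*;", "", source)


# Correctly terminated comment: only the comment text is removed.
good = "x := 1; comment set y; y := 2; z := 3;"
good_result = strip_algol_comments(good)

# Semi-colon forgotten after the comment text: the assignment to y is
# silently swallowed as part of the comment.
bad = "x := 1; comment set y y := 2; z := 3;"
bad_result = strip_algol_comments(bad)
```

In the second case no syntax error is produced at all: the statement y := 2 simply disappears, which is why such mistakes are hard to trace.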
Most later languages enclose comments in brackets. Pascal, for example, uses either
(* and *) or { and } while C uses /* and */. But this approach still leaves the problem
of terminating a comment unresolved. Some compilers alleviate this problem by giving
a warning if a statement separator - that is, a semi-colon - occurs within a comment.
Ada, in contrast, commences a comment with two hyphens and ends it by the end of
a line, so reverting to the methods of the earliest high-level languages. In C++ and
Java, the programmer can either use C style comments or start a comment by // and
terminate it by the end of a line.
The problem of failing to properly terminate a comment is largely solved by integrated
development environments (IDEs) which automatically colour-code different parts of
the program. When a comment is in a different colour, it is obvious when it has not
been properly terminated.

Spaces and line termination


Early programming languages varied considerably in the importance they attached to
spaces. At one extreme, languages such as Algol 60 (and Fortran within columns 7 to
72 of a line) ignore spaces wherever they occur while, at the other extreme, SNOBOL4
uses spaces as separators and as the primitive operation of concatenating two strings.
Most languages, however, use spaces as separators in a manner similar to that of natural
language. Several spaces or new lines may also be used wherever a single space is
allowed. Spaces in identifiers are normally forbidden, but to aid readability many
languages, for example C++ and Ada, include the underscore character so identifiers
like current_account and centre_of_gravity can be used. A recent trend is
not to use the underscore, even when it is available, but to capitalise inner words as in:
CurrentAccount and centreOfGravity.
The significance of the end of a line also varies according to the language. Early ver-
sions of Fortran and COBOL, because they were card oriented, adopted the convention
that the end of a line terminated a statement. If the statement could not be contained
within one line, a continuation character was required in the following line. Most recent
high-level languages use the semi-colon either to separate (Pascal) or to terminate (Ada,
C++, Java) statements. This method is often known as free format since it means that
a new line can be started anywhere that a space may occur in a statement. Fortran 90
includes a free format option in which spaces are significant and in which columns 1
to 6 have no special significance.
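The idea of free format can be illustrated by a Python sketch that splits source text into statements at semi-colons and treats any run of spaces or new lines as a single separator. The function split_statements is hypothetical and ignores complications such as semi-colons inside strings or comments.

```python
def split_statements(source: str) -> list:
    """Split free-format source text into statements terminated by ';'.

    Line breaks and runs of spaces are treated as a single separator.
    """
    statements = []
    for chunk in source.split(";"):
        words = chunk.split()  # collapses spaces and new lines
        if words:
            statements.append(" ".join(words))
    return statements


one_line = "x := 1; y := x + 2;"
spread_out = """x :=
    1;
y   :=   x
    + 2;"""

same = split_statements(one_line) == split_statements(spread_out)
```

Both layouts yield the same two statements, since a new line may appear anywhere that a space is allowed.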

Summary

1. Although there are many different programming languages in existence, languages
do, in fact, have more similarities than they have differences.
2. The stages in software development are requirements analysis, specification, design,
implementation, verification and maintenance. A programming language is not
directly involved until the implementation stage.
3. The evolution of systematic software development methods has greatly influenced
language design.
4. The primary purpose of a programming language is to support the construction of
reliable software.
5. A programming language should be simple, have high expressive power and
language constructs that are consistent - that is, they should have a similar effect
wherever they appear.
6. It is important to have a proper balance between the introduction of powerful
language features and their ease of implementation.
7. Languages should support the detection of as many errors as possible at compile
time.
8. A language should have a single official definition to which all implementers adhere.
9. It is often difficult to distinguish between a language and its development
environment.
10. The lexical elements of a language cover such items as the character set and the
rules for identifiers and special symbols.
11. Fortran and PL/I have keywords while most other languages such as Pascal have
reserved words.

Exercises

1.1 An orthogonal language is constructed by combining a small set of basic
language constructs. It should therefore be simple to learn and use. Discuss whether
that is in fact the case.
1.2 Smalltalk can be regarded as a purer object-oriented language than Java. Does
that make it easier to write object-oriented programs in Smalltalk than it is in
Java?
1.3 Describe the importance of readability in the creation of reliable software. What
are the features of a language that enhance readability and what makes it more
difficult?
1.4 How important is it that there is a single standard definition of a language?


1.5 What are the advantages and disadvantages of considering a library to be part
of a language definition as opposed to it being considered part of a language
implementation?

Bibliography
The approach taken in this book to focus on language concepts and show how these
concepts are realised in different programming languages is similar to that adopted by
Ghezzi and Jazayeri (1997) and Sebesta (1998). The book by Sethi (1996) also takes
this approach, but is more theoretical.
The alternative approach of having separate chapters on the main programming
languages has been adopted by MacLennan (1987) and Friedman (1991). Pratt and
Zelkowitz (1996), on the other hand, have combined both approaches: the language
features are discussed in the first part of the book while the second part is devoted to
individual languages. The result is comprehensive, although it leads to a very large
text. Both Pratt and MacLennan also contain a large amount of material on language
implementation.
Boehm, B.W. (1988). 'A Spiral Model of Software Development and Enhancement',
Computer, 21(5), 61-72.
Booch, G., Rumbaugh, J. and Jacobson, I. (1998). The Unified Modeling Language
User Guide. Addison-Wesley.
Friedman, L.W. (1991). Comparative Programming Languages. Prentice-Hall.
Ghezzi, C. and Jazayeri, M. (1997). Programming Language Concepts (Third Edition).
John Wiley & Sons.
MacLennan, B.J. (1987). Principles of Programming Languages (Second Edition).
Holt, Rinehart and Winston.
Pratt, T.W. and Zelkowitz, M. (1996). Programming Languages: Design and
Implementation (Third Edition). Prentice-Hall.
Sebesta, R. (1998). Concepts of Programming Languages (Fourth Edition).
Addison-Wesley.
Sethi, R. (1996). Programming Languages (Second Edition). Addison-Wesley.
Wirth, N. (1975). 'On the Design of Programming Languages', IFIP 74, pp. 386-393,
North-Holland.
