SYMBOL TABLE ORGANIZATION
THE SYMBOL TABLE
When identifiers are found, they will be entered into a symbol
table, which will hold all relevant information about identifiers.
This information will be used later by the semantic analyzer and
the code generator.
[Figure: compiler phases and the symbol table. Boxes: Lexical Analyzer, Syntax Analyzer, Symbol Table, Semantic Analyzer, Code Generator.]
SYMBOL TABLE
A compiler needs to collect and use information about the names appearing in
the source program. This information is entered into a data structure called a
symbol table.
The symbol table is built up during the lexical and syntactic analysis.
The information collected about a name includes:
1- The string of characters by which it is denoted.
2- Its type (e.g. integer, real, string).
3- Its form (e.g. a simple variable, a structure).
4- Its location in memory.
5- Other attributes depending on the language.
Each entry in the symbol table is a pair of the form (name,
information).
Each time a name is encountered, the symbol table is searched to
see whether that name has been seen previously.
If new, it is entered into the table.
Information about that name is entered into the table during
lexical and syntactic analysis.
The information collected in the symbol table is used during several
stages in the compilation process:
In semantic analysis, i.e. in checking that uses of names are consistent with
their implicit or explicit declarations.
In code generation, to know how much and what kind of run-time storage must be allocated to a name.
SYMBOL TABLE REQUIREMENTS
As a minimum, the symbol table must provide:
quick insertion of an identifier
quick search for an identifier
efficient insertion of information (attributes) about an identifier
quick access to information about a certain identifier
space and time efficiency
SYMBOL TABLE ENTRIES
STORING CHARACTERS
Method 1: A fixed-size space within each entry, large
enough to hold the longest possible name. Most names will
be much shorter than this, so a lot of storage will be
wasted.
Method 2: Store all symbols in one large separate array.
Each symbol is terminated with an end of symbol mark
(EOS). Each symbol table record contains a pointer to the
first character of the symbol.
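Method 2 can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the names StringPool, add, get, and name_ptr are hypothetical, and the choice of "\0" as the EOS mark is an assumption.

```python
EOS = "\0"  # assumed end-of-symbol mark; any character not allowed in names works


class StringPool:
    """One large character array shared by all symbols (Method 2)."""

    def __init__(self):
        self.chars = []  # the single large array of characters

    def add(self, name):
        """Append name plus EOS; return the index of its first character."""
        start = len(self.chars)
        self.chars.extend(name)
        self.chars.append(EOS)
        return start

    def get(self, start):
        """Read characters from start up to (not including) the EOS mark."""
        end = self.chars.index(EOS, start)
        return "".join(self.chars[start:end])


pool = StringPool()
# Each symbol table record stores only a pointer (index) into the pool.
records = [{"name_ptr": pool.add(n)} for n in ("i", "count", "total_sum")]
```

Each name costs only its own length plus one EOS character, so short names no longer waste the space reserved for the longest possible name.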
CONTENTS OF A SYMBOL TABLE
A simple table is a table with two fields, a name field and an
information field. We require several capabilities of the symbol
table. We need to be able to:
1- Determine whether a given name is in the table.
2- Add a new name to the table.
3- Access the information associated with a given name.
4- Add new information for a given name.
5- Delete a name or group of names from the table.
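The five capabilities listed above map onto a small interface. The sketch below is illustrative only: the class and method names are hypothetical, and a Python dict stands in for whatever organization the compiler actually uses.

```python
class SymbolTable:
    """Table of (name, information) pairs; information is a dict of attributes."""

    def __init__(self):
        self.entries = {}

    def contains(self, name):            # 1- is the name in the table?
        return name in self.entries

    def add(self, name):                 # 2- add a new name
        self.entries.setdefault(name, {})

    def info(self, name):                # 3- access the information
        return self.entries[name]

    def set_info(self, name, **attrs):   # 4- add new information
        self.entries[name].update(attrs)

    def delete(self, *names):            # 5- delete a name or group of names
        for name in names:
            self.entries.pop(name, None)


table = SymbolTable()
for name in ("x", "y", "x"):   # "x" is seen twice but entered only once
    if not table.contains(name):
        table.add(name)
table.set_info("x", type="integer", form="simple variable")
```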
SYMBOL TABLE DATA STRUCTURES
Issues to consider
Operations required
Insert
Add symbol to symbol table
Look Up
Find symbol in the symbol table (and get its attributes)
Insertion is done only once
Look Up is done many times
Need Fast Look Up.
CHAP 8 SYMBOL TABLE
SYMBOL TABLE ORGANIZATION
basic operations: enter() and find()
considerations: number of names
storage space
retrieval time
organizations:
<1> unordered list (linked list/array)
<2> ordered list: if the names are kept sorted in an array, the array can be searched using binary search
(-) expensive insertion
(+) good for a fixed set of names (e.g. reserved words, assembly opcodes)
<3> binary search tree
On average, searching takes O(log(n)) time.
However, names in programs are not chosen randomly, so the tree can become unbalanced and searching can degrade towards O(n).
<4> hash table: most common
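Organization <2>, the ordered list, can be sketched with Python's standard bisect module. This is an illustration only: the RESERVED word set is an assumed example, and the helpers find and enter are hypothetical names that echo the basic operations listed above.

```python
import bisect

# Assumed fixed set of reserved words, kept in sorted order.
RESERVED = sorted(["begin", "do", "else", "end", "if", "then", "while"])


def find(names, name):
    """Binary search: return the index of name in the sorted list, or -1."""
    i = bisect.bisect_left(names, name)
    return i if i < len(names) and names[i] == name else -1


def enter(names, name):
    """Insert name in sorted position; shifting later elements costs O(n)."""
    if find(names, name) == -1:
        bisect.insort(names, name)
```

Lookup needs O(log n) comparisons, but every insertion may shift the rest of the array, which is why this organization suits a set of names that is fixed in advance.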
LIST DATA STRUCTURE
The simplest and easiest to implement data structure for a symbol table is
a linear list of records.
In this method an array is used to store names and their associated
information.
New names are added in the order in which they arrive.
A pointer, available, is maintained at the end of all stored
records.
To retrieve information about a name, we search from the beginning of the array up to the position marked by available.
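A minimal sketch of the array-based list follows. It assumes that a search scans from the first record up to the available pointer, and that the array has a fixed capacity; the class name, method names, and error handling are hypothetical.

```python
class ListSymbolTable:
    """Unordered array of records; `available` marks the first free slot."""

    def __init__(self, capacity=100):
        self.records = [None] * capacity
        self.available = 0

    def find(self, name):
        """Linear search over the stored records only."""
        for i in range(self.available):
            if self.records[i][0] == name:
                return self.records[i][1]
        return None

    def enter(self, name, info):
        """Store a new (name, info) record at `available` and advance it."""
        if self.find(name) is not None:
            raise KeyError(f"{name} already declared")
        if self.available == len(self.records):
            raise OverflowError("symbol table full")
        self.records[self.available] = (name, info)
        self.available += 1
```

Entering a name is cheap once the duplicate check is done, but that check and every lookup cost time proportional to the number of stored names.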
SELF ORGANIZING LISTS
Using a linked list:
A link field is added to each record.
We search the records in the order given by
the link fields.
A pointer, first, is maintained to point to the first record of the
symbol table.
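The linked organization can be sketched as below. The text only says that records are searched in link order starting from first; moving a found record to the front is one common self-organizing heuristic and is an assumption here, as are the names Record and SelfOrganizingList.

```python
class Record:
    def __init__(self, name, info, link=None):
        self.name = name
        self.info = info
        self.link = link  # link field: next record in search order


class SelfOrganizingList:
    def __init__(self):
        self.first = None  # pointer to the first record

    def enter(self, name, info):
        """New records are linked in at the front."""
        self.first = Record(name, info, self.first)

    def find(self, name):
        """Follow the links from `first`; move a found record to the front."""
        prev, rec = None, self.first
        while rec is not None:
            if rec.name == name:
                if prev is not None:          # unlink and relink at the front
                    prev.link = rec.link
                    rec.link = self.first
                    self.first = rec
                return rec.info
            prev, rec = rec, rec.link
        return None
```

Under this heuristic, frequently referenced names drift towards the front of the list, so repeated lookups of the same name become cheaper.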
BINARY SEARCH TREE
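Another approach to organizing the symbol table is to add two link
fields to each record, left and right, which point to its left and right children.
We use these fields to link the records into a binary search tree. Every name is inserted as a
descendant of the root node so that the binary search tree property always holds: for a record name_i,
name_j < name_i for every name_j in its left subtree, and name_i < name_j for every name_j in its right subtree.
That is, all names smaller than name_i are reached through its left child,
and all larger names through its right child.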
while P ≠ nil do
    if NAME = NAME(P) then return P      /* name found, take action on success */
    else if NAME < NAME(P) then P := LEFT(P)      /* visit left child */
    else P := RIGHT(P)      /* NAME(P) < NAME: visit right child */
/* if we fall through the loop, we have failed to find NAME */
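The search loop above translates almost line for line into Python. The sketch below also adds an insertion routine, which the text does not spell out; the names Node, find, and enter are hypothetical, and no rebalancing is attempted.

```python
class Node:
    def __init__(self, name, info=None):
        self.name = name
        self.info = info
        self.left = None   # names smaller than self.name
        self.right = None  # names larger than self.name


def find(root, name):
    """Iterative search, following the pseudocode above."""
    p = root
    while p is not None:
        if name == p.name:
            return p
        p = p.left if name < p.name else p.right
    return None  # fell through the loop: name is not in the tree


def enter(root, name, info=None):
    """Insert name if absent; return the (possibly new) root."""
    if root is None:
        return Node(name, info)
    p = root
    while True:
        if name == p.name:
            return root
        side = "left" if name < p.name else "right"
        child = getattr(p, side)
        if child is None:
            setattr(p, side, Node(name, info))
            return root
        p = child
```

If names are entered in sorted order (for example a1, a2, a3), every new node becomes a right child and the tree degenerates into a list, so search time grows linearly.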
SYMBOL TABLE DATA STRUCTURE
Hash table:
Run the symbol name through a hash function to create
an index into a table.
If some other symbol has already claimed that slot, then
rehash with another hash function to get another index,
and so on until a free slot is found.
The hash table must be large enough to accommodate
the largest expected number of symbols.
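The rehashing scheme can be sketched as open addressing with a family of hash functions. Everything specific here is an assumption made for illustration: the class name, the two string hashes, the double-hashing form (h1 + k * h2) mod size for the k-th rehash, and the default table size of 211.

```python
class HashSymbolTable:
    """Open addressing: on a collision, rehash to get another index."""

    def __init__(self, size=211):
        self.size = size            # assumed prime, so every slot can be probed
        self.slots = [None] * size
        self.count = 0

    def _hash(self, name, attempt):
        """attempt 0 is the primary hash; attempt k > 0 is the k-th rehash."""
        h1 = sum(ord(c) * 31 ** i for i, c in enumerate(name)) % self.size
        h2 = 1 + sum(ord(c) for c in name) % (self.size - 1)  # never zero
        return (h1 + attempt * h2) % self.size

    def _probe(self, name):
        """Return the slot index holding name, or the free slot where it goes."""
        for attempt in range(self.size):
            i = self._hash(name, attempt)
            if self.slots[i] is None or self.slots[i][0] == name:
                return i
        raise OverflowError("hash table full")

    def enter(self, name, info):
        i = self._probe(name)
        if self.slots[i] is None:
            self.count += 1
        self.slots[i] = (name, info)

    def find(self, name):
        try:
            slot = self.slots[self._probe(name)]
        except OverflowError:
            return None
        return slot[1] if slot else None
```

Because every symbol occupies one slot of the table itself, the table cannot hold more symbols than it has slots, which is why its size must be chosen for the largest expected number of symbols.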