CS445/CS545 Compiler Construction
SPRING 2009
COURSE INFORMATION
Instructor: Christino Tamon
Lectures: Tue-Thu 9:30-10:45am, Science Center 342 (or 334 ITL Lab)
Office hours: SC373, Tue-Thu 9:15-9:30,10:45-13:00
Pre-requisites: CS344 Data Structures and Algorithms, CS345 Automata Theory and Formal Languages
SYLLABUS:
A study of compiler design. Overview of the compilation process.
Formal definition of syntax, lexical scanning, parsing including LL and LR grammars,
run-time structures, intermediate code generation, and storage allocation.
Students are expected to develop a compiler for a substantial subset of a
high-level language using compiler tools such as lex and yacc.
GRADING Project 50%; Exams 50%
TEXTS
- Required:
Alfred V. Aho, Monica Lam, Ravi Sethi and Jeffrey D. Ullman. "Compilers: Principles, Techniques, and Tools," 2nd edition,
Addison-Wesley (2006).
- Brian W. Kernighan and Dennis M. Ritchie, "The C Programming Language," 2nd edition,
Prentice-Hall (1988) (on reserve)
- John Levine, Tony Mason, and Jason Brown, "lex & yacc," 2nd edition, O'Reilly (1992) (on reserve)
OUTLINE
- Language translation. Compilation. Overview.
- Lexical Analysis. Regular expressions.
- Syntax Analysis. Theory of Parsing: top-down LL(1); bottom-up SLR(1), LR(1) (and LALR(1)).
- Semantic Analysis. Parse trees. Syntax-directed translation. Type checking.
- Code generation. Runtime environments.
OBJECTIVES/OUTCOMES
The objective of this course is to learn the fundamentals of compiler theory and its application
in constructing a real working compiler. The specific outcomes are:
- Proficiency in C.
- Knowledge of standard UNIX tools for software design and compiler construction.
- Knowledge of assembly language and its run-time environment relevant for compilers.
- Ability to build a small and functional compiler in a UNIX programming environment.
REQUIREMENTS/POLICIES
Although attendance is not mandatory, students are responsible for all course materials
covered in lectures and any exams given during class periods. Students that need to make
up missing course work must follow the official Clarkson policies regarding these matters.
All students must submit their own work; the exchange of ideas are encouraged but
ultimately the submitted work must be the student's own. Please refer to the Clarkson
University Regulations for more guidelines on academic integrity and related matters.
SCHEDULE
- Introduction: read Chapter 1 [Dragon]
Phases: lexical analysis, syntax analysis, semantic analysis,
intermediate code generation, code optimization, code generation.
- Course project: Appendix A [Dragon, 1st ed]
Source language: Pascal; target language: IA32.
Implementation language and tools: C and lex & yacc.
- Syntax-directed translation: read Chapter 2 [Dragon]
Recursive-descent parsing (top down). LL(1) grammars.
- Parsing techniques: read Chapter 4 [Dragon]
Top-down (stack-based) LL(1); bottom-up SLR(1); LR(1) and LALR(1) (yacc)
- Symbol Table: hashing (see Chapter 7 [Dragon])
with P.J. Weinberger's hashpjw function.
- Semantic analysis: read Chapter 5 and 6 [Dragon]
building syntax trees; simple type systems.
- Code generation: Chapter 9 [Dragon].
Run-time environments Chapter 7 [Dragon]; access to nonlocal names.
The gencode algorithm (9.10) and its labeling algorithm
(c.f.: Ershov numbers).
DRAGON PROJECT
- Arrange an online demo of your compiler (30min).
- Send a self-contained compressed tar source of your compiler by email.
- Submit a hardcopy of your compiler documentation: design document, user manual,
testing report, status report (limitations, caveats, or bugs), and Dragon haiku.
Indicate clearly in your report an extra feature that is unique to your compiler
(choose one of the bonus options in the check-list below).
PROJECT CHECK LIST
Note:
Bonus (optional for all),
Team (required for groups of more than one),
Grad (required for CS545)
(1.0) Lexical Analysis
- line numbering
- comments [style: (* .. *)]
- Bonus: allow two styles of comments [(* .. *) and {..}]
- Bonus: scientific notation for floating point values
(1.5) Syntax Analysis: grammar additions
- nested subprograms
- array access as both L-value and R-value
- Team/Grad: FOR statement
(2.0) Symbol Table
- Bonus: statistics of hashpjw
- Bonus: memory leak handler
(2.5) Syntax Tree (Intermediate Code Generation)
- visual print
- Bonus: memory leak handler
(3.0) Semantic Analysis
- Clear the check list
- error reporting
- Bonus: error recovery
Note: test files in /afs/cu/class/cs445/public/test-files/semantic-analysis/
(4.0) Code Generation
- Input/Output
- simple expressions (arithmetic and relational): gencode
- statements (assignment, conditional, loop)
- nonlocal names: base frame pointer (static scope parent)
- recursive routines (GCD test)
- Team/Grad: complex expressions (register spilling)
- Grad: arrays (L-value, R-value, parameters, nonlocal)
- Bonus: floating-point support
Note: test files in /afs/cu/class/cs445/public/test-files/code-generation/