SearchWiki:

CC Course 2005

PmWiki

Recent Changes Printable View Page History Edit Page

C - Supplementary Extension


Team

  • Martin Leitner ( martin<dot>leitner<at>sbg.ac.at )
  • Till Radermacher ( till<dot>radermacher<at>gmail<dot>com )

Introduction

This compiler is being programmed during the course compiler construction held by Prof. Dr. Kirsch at the University of Salzburg during the summer semester 2005. The compiler itself is named after the two programmers: Marillyn.

You can find the documentation of our compiler here: Attach:marillyn_doku_071505.pdf .

The presentation we held on 28th of June, 2005 is also available: Attach:marillyn_ppt.zip .

The source code including some example programs can be downloaded here: Attach:marillyn.zip .


C - SUpplmenetary? eXtension

C - SUX is a C based language which adds some functionalities, but keeps its syntax close to the original language. Some more complicated concepts are not implemented since this reduces the complexity of the language and makes it easier to access.


The Scanner

The scanner scans the source code and generates logical tokens out of it. One logical token could be something like "String", {, ==, while, else, 123 or just a variable like my_var. It is implemented in a function called getsym() which delivers the next logical token from the source code.

Status: done, man hours: 40


The Scanner Vokabulary

 while, if, else if, else, 
EOF, FILE,
true, false, 
int, char, boolean, void, struct,
return, 
typedef, define, include,
=, +, -, *, /, 
<, >, !=, ==, >=, <=, ||, &&, &, 
;, :, ',', '.', #,
(, ), {, }, [, ] 


The Parser

The parser is implemented in the fucntion parser(). However, it should be noted that the type checking and code generation are also being processed in this function, since Marillyn is a single pass compiler. The function parser() itself is very simple since it just uses other functions to check the source code for syntactical errors.

Arithmetical expressions are implemented using the EBNF of Niklaus Wirth used in the book Compiler Construction (Addison-Wesley 1996, ISBN 0-201-40353-6).

Status: done, man hours: 80


Type Checking

A symbol table is generated at the beginning of the program, which is later used for typechecking. In this symbol table all variables and functions are stored with their type, class, size, function (for a distinction between local and global variables) and their name. The type checking as mentioned above is implemented in the same subfunctions that parser() uses.

Status: done, man hours: 70


Virtual Machine

As with the EBNF the virtual machine was developed close to the original VM used in the book Compiler Construction by Wirth. It uses the RISC architecture with 32 registers and four megabytes of main memory. The virtual machine itself scans the file source.vm which contains the instructions written by the compiler. One instruction looks something like this: 'INSTRUCTION A B C'. Normally these instructions manipulate the registers and the stack. E.g. the instructions LWD (load word) and STW (store word) are used to copy a value from a memory address to a register and vice versa.

Status: done, man hours: 20

The instructions are:

  
ADD = 100, SUB = 110, MUL = 120, DIV = 130, MOD = 140, CMP = 150, CHK = 160
ADDI = 200, SUBI = 210, MULI = 220, DIVI = 230, MODI = 240, CMPI = 250, CHKI = 260
AND = 300, BIC = 310, OR =  320, XOR = 330
ANDI = 400, BICI = 410, ORI =  420, XORI = 430
LSH = 500, ASH = 510
LSHI = 600, ASHI = 610
LDW = 700, LDB = 710, POP = 720, STW = 730, STB = 740, PSH = 750
BEQ = 800, BNE = 810, BLT = 820, BGE = 830, BGT = 840, BLE = 850, BSR = 860, BJU = 870, 
RET = 880, EIF = 890
RD =  900, WRD = 910, WRH = 920, WRL = 930
SPRC =  950, PRM = 960, NPRM=961, SPRM=962, CPRM=963, DPRM=964, WORD = 965, PRF = 970, 
FPRF = 971, FOPN = 972, FGTC = 973, FEOF = 974, EXIT = 975, FLE = 976, FRV=977, FCLS=978
 


Code Generation

As the type checking, the code generation is done in the same functions as the parsing.

For variables which are about to be processed items are generated which store the current memory address and the register in which the variable is placed. The function put(), which gets an instruction as a parameter value, produces a machine instruction for the virtual machine. A typical machine instruction would be 'ADD 1 1 2' which adds the value stored in the registers one and two and stores them in the register one. An equivalent source code instruction would be a = a + b;.

Code is generated for arithmetical expressions, arrays, structs, if and while expressions. Standard functions are also implemented. Code generation for functions and modules is left open.

Status: under development, man hours: 160

Edit Page - Page History - Printable View - Recent Changes - WikiHelp - SearchWiki
Page last modified on July 15, 2005, at 07:09 PM