CC Course 2005PmWiki |
C - Supplementary ExtensionTeam
IntroductionThis compiler is being programmed during the course compiler construction held by Prof. Dr. Kirsch at the University of Salzburg during the summer semester 2005. The compiler itself is named after the two programmers: Marillyn. You can find the documentation of our compiler here: Attach:marillyn_doku_071505.pdf . The presentation we held on 28th of June, 2005 is also available: Attach:marillyn_ppt.zip . The source code including some example programs can be downloaded here: Attach:marillyn.zip . C - SUpplmenetary? eXtensionC - SUX is a C based language which adds some functionalities, but keeps its syntax close to the original language. Some more complicated concepts are not implemented since this reduces the complexity of the language and makes it easier to access. The ScannerThe scanner scans the source code and generates logical tokens out of it. One logical token could be something like "String", {, ==, while, else, 123 or just a variable like my_var. It is implemented in a function called getsym() which delivers the next logical token from the source code. Status: done, man hours: 40 The Scanner Vokabulary while, if, else if, else,
EOF, FILE,
true, false,
int, char, boolean, void, struct,
return,
typedef, define, include,
=, +, -, *, /,
<, >, !=, ==, >=, <=, ||, &&, &,
;, :, ',', '.', #,
(, ), {, }, [, ]
The ParserThe parser is implemented in the fucntion parser(). However, it should be noted that the type checking and code generation are also being processed in this function, since Marillyn is a single pass compiler. The function parser() itself is very simple since it just uses other functions to check the source code for syntactical errors. Arithmetical expressions are implemented using the EBNF of Niklaus Wirth used in the book Compiler Construction (Addison-Wesley 1996, ISBN 0-201-40353-6). Status: done, man hours: 80 Type CheckingA symbol table is generated at the beginning of the program, which is later used for typechecking. In this symbol table all variables and functions are stored with their type, class, size, function (for a distinction between local and global variables) and their name. The type checking as mentioned above is implemented in the same subfunctions that parser() uses. Status: done, man hours: 70 Virtual MachineAs with the EBNF the virtual machine was developed close to the original VM used in the book Compiler Construction by Wirth. It uses the RISC architecture with 32 registers and four megabytes of main memory. The virtual machine itself scans the file source.vm which contains the instructions written by the compiler. One instruction looks something like this: 'INSTRUCTION A B C'. Normally these instructions manipulate the registers and the stack. E.g. the instructions LWD (load word) and STW (store word) are used to copy a value from a memory address to a register and vice versa. Status: done, man hours: 20 The instructions are: ADD = 100, SUB = 110, MUL = 120, DIV = 130, MOD = 140, CMP = 150, CHK = 160 ADDI = 200, SUBI = 210, MULI = 220, DIVI = 230, MODI = 240, CMPI = 250, CHKI = 260 AND = 300, BIC = 310, OR = 320, XOR = 330 ANDI = 400, BICI = 410, ORI = 420, XORI = 430 LSH = 500, ASH = 510 LSHI = 600, ASHI = 610 LDW = 700, LDB = 710, POP = 720, STW = 730, STB = 740, PSH = 750 BEQ = 800, BNE = 810, BLT = 820, BGE = 830, BGT = 840, BLE = 850, BSR = 860, BJU = 870, RET = 880, EIF = 890 RD = 900, WRD = 910, WRH = 920, WRL = 930 SPRC = 950, PRM = 960, NPRM=961, SPRM=962, CPRM=963, DPRM=964, WORD = 965, PRF = 970, FPRF = 971, FOPN = 972, FGTC = 973, FEOF = 974, EXIT = 975, FLE = 976, FRV=977, FCLS=978 Code GenerationAs the type checking, the code generation is done in the same functions as the parsing. For variables which are about to be processed items are generated which store the current memory address and the register in which the variable is placed. The function put(), which gets an instruction as a parameter value, produces a machine instruction for the virtual machine. A typical machine instruction would be 'ADD 1 1 2' which adds the value stored in the registers one and two and stores them in the register one. An equivalent source code instruction would be a = a + b;. Code is generated for arithmetical expressions, arrays, structs, if and while expressions. Standard functions are also implemented. Code generation for functions and modules is left open. Status: under development, man hours: 160 |