ACSE 2.0.2
Advanced Compiler System for Education (basic documentation)
ACSE is a complete toolchain consisting of a compiler (acse itself), an assembler (asrv32im), and a simulator (simrv32im), developed for education on the topic of compilers in the "Formal Languages and Compilers" course at Politecnico di Milano.
ACSE, together with its supporting tools, aims to provide a simple but reasonably accurate sandbox for learning how a compilation toolchain works. It is based on the standard RISC-V specification: specifically, ACSE emits RV32IM assembly code files that are also compatible with the GNU toolchain and RARS.
The language accepted by ACSE is called LANCE (Language for Compiler Education), and is an extremely simplified subset of C. In more detail:
- Types: int and int[] (fixed size array of int).
- [...] (+=, ++, ...)
- Control flow statements (while, do-while, if), no break and continue.
- read and write, which read and write integers from standard input and output respectively.

Here is an example LANCE program which computes the first 10 Fibonacci numbers by storing them in an array, and then prints them out:
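The LANCE listing itself does not appear in this text. As a rough stand-in, the sketch below performs the same computation in plain C rather than in LANCE syntax. The helper names fillFibonacci and printFibonacci are hypothetical, and printf stands in for LANCE's write. It restricts itself to while loops and plain assignments, in the spirit of the restrictions listed above.

```c
#include <assert.h>
#include <stdio.h>

#define FIB_COUNT 10

/* Hypothetical helper: stores the first `count` Fibonacci numbers
 * (starting from 0, 1) in `out`. */
void fillFibonacci(int out[], int count)
{
  assert(count >= 2);
  out[0] = 0;
  out[1] = 1;
  int i = 2;
  while (i < count) {
    out[i] = out[i - 1] + out[i - 2];
    i = i + 1;
  }
}

/* Hypothetical helper: computes the numbers, then prints one per line. */
void printFibonacci(void)
{
  int fib[FIB_COUNT];
  fillFibonacci(fib, FIB_COUNT);
  int i = 0;
  while (i < FIB_COUNT) {
    printf("%d\n", fib[i]);
    i = i + 1;
  }
}
```

Calling printFibonacci() from a main function prints the ten values, one per line.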
ACSE is written in C99 and is supported on Linux, macOS, and Windows (through MSYS2 or WSL).
If you are using Linux or macOS, ACSE requires the following programs to be installed:
If you use Windows, you must first install either the MSYS2 environment or Windows Subsystem for Linux (WSL). Both provide a Linux-like environment inside Windows.
Once you have installed either MSYS2 or WSL, you can use the following instructions just as if you were using Linux or macOS.
To build the ACSE compiler toolchain, open a terminal and type:
make
The built executables will be located in the bin directory.
To compile some examples (located in the directory tests), type:
make tests
You can compile and run new LANCE programs in this way (suppose you have saved a LANCE program in myprog.src):
./bin/acse myprog.src -o myprog.asm
./bin/asrv32im myprog.asm -o myprog.o
./bin/simrv32im myprog.o
Alternatively, you can add a test to the tests directory by following these steps:
1. Create a new directory inside tests. You can choose whatever directory name you wish, as long as it is not test.
2. Put the source files to be compiled in that directory. Their extension must be .src.
3. Run make tests to compile all tests, including the one you have just added.

The make tests command only runs the ACSE compiler and the assembler; you will have to invoke the simulator manually.
All assembly files produced by ACSE are compatible with RARS so you can also run any compiled program through it.
All symbols are in lower camel case, and all functions that operate on a structure have a prefix that depends on the structure. Additional prefixes are used for special purposes:
Additional rules are used for ACSE alone:
All source code files use 2 spaces for alignment and indentation.
In ACSE, the lexer and parser, defined in scanner.l and parser.y, are generated by Flex and Bison respectively. The semantic actions in the parser also take care of initial code generation, which makes ACSE a syntax-directed translator.
The current state of the compilation process is stored in an object of type t_program. This object contains the current intermediate representation (IR), consisting of the instruction list and the symbol table. The semantic actions manipulate the IR through the functions defined in program.h and codegen.h. The IR is a symbolic and simplified form of RISC-V assembly code. Specifically, the functions prefixed with gen add new instructions at the end of the instruction list, and the functions createLabel() and assignLabel() allow for the creation and placement of labels. The ACSE IR provides an unbounded number of temporary registers, and new (unused) registers can be retrieved by calling getNewRegister().
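To make the append-only instruction list, the labels, and the unbounded temporaries more concrete, here is a minimal self-contained sketch in C. It is a toy model only: every type and function below (ToyProgram, toyNewRegister, toyCreateLabel, toyAssignLabel, toyGenLi, and so on) is hypothetical and written for illustration. None of these names or signatures are taken from ACSE's program.h or codegen.h.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_MAX_INSTRS 64

/* Toy stand-ins for the concepts described in the text; not ACSE's API. */
typedef enum { TOY_LABEL, TOY_LI, TOY_ADDI, TOY_BLT } ToyOpcode;

typedef struct {
  ToyOpcode op;
  int rd, rs1, rs2; /* temporary register numbers */
  int imm;          /* immediate operand for TOY_LI and TOY_ADDI */
  int label;        /* label identifier for TOY_LABEL and TOY_BLT */
} ToyInstr;

typedef struct {
  ToyInstr instrs[TOY_MAX_INSTRS];
  int numInstrs;
  int nextRegister; /* temporaries are just increasing integers */
  int nextLabel;
} ToyProgram;

void toyInit(ToyProgram *p)
{
  p->numInstrs = 0;
  p->nextRegister = 0;
  p->nextLabel = 0;
}

/* Returns a temporary register number that has never been used before. */
int toyNewRegister(ToyProgram *p) { return p->nextRegister++; }

/* Reserves a label identifier without placing it in the code yet. */
int toyCreateLabel(ToyProgram *p) { return p->nextLabel++; }

/* Appends a blank instruction at the end of the list. */
static ToyInstr *toyAppend(ToyProgram *p, ToyOpcode op)
{
  assert(p->numInstrs < TOY_MAX_INSTRS);
  ToyInstr *instr = &p->instrs[p->numInstrs++];
  instr->op = op;
  instr->rd = instr->rs1 = instr->rs2 = 0;
  instr->imm = 0;
  instr->label = -1;
  return instr;
}

/* Places a previously created label at the current end of the list. */
void toyAssignLabel(ToyProgram *p, int label)
{
  toyAppend(p, TOY_LABEL)->label = label;
}

void toyGenLi(ToyProgram *p, int rd, int imm)
{
  ToyInstr *instr = toyAppend(p, TOY_LI);
  instr->rd = rd;
  instr->imm = imm;
}

void toyGenAddi(ToyProgram *p, int rd, int rs1, int imm)
{
  ToyInstr *instr = toyAppend(p, TOY_ADDI);
  instr->rd = rd;
  instr->rs1 = rs1;
  instr->imm = imm;
}

void toyGenBlt(ToyProgram *p, int rs1, int rs2, int label)
{
  ToyInstr *instr = toyAppend(p, TOY_BLT);
  instr->rs1 = rs1;
  instr->rs2 = rs2;
  instr->label = label;
}

/* Builds the IR of a loop that counts from 0 up to 10. */
void toyBuildCountingLoop(ToyProgram *p)
{
  toyInit(p);
  int counter = toyNewRegister(p);
  int limit = toyNewRegister(p);
  toyGenLi(p, counter, 0);
  toyGenLi(p, limit, 10);
  int loopStart = toyCreateLabel(p);
  toyAssignLabel(p, loopStart);
  toyGenAddi(p, counter, counter, 1);
  toyGenBlt(p, counter, limit, loopStart);
}

/* Prints the list in an assembly-like form, with temporaries shown as tN. */
void toyPrint(const ToyProgram *p)
{
  for (int i = 0; i < p->numInstrs; i++) {
    const ToyInstr *in = &p->instrs[i];
    switch (in->op) {
      case TOY_LABEL: printf("L%d:\n", in->label); break;
      case TOY_LI: printf("  li t%d, %d\n", in->rd, in->imm); break;
      case TOY_ADDI: printf("  addi t%d, t%d, %d\n", in->rd, in->rs1, in->imm); break;
      case TOY_BLT: printf("  blt t%d, t%d, L%d\n", in->rs1, in->rs2, in->label); break;
    }
  }
}
```

In this toy model a label is created first and placed later, so a forward branch could refer to a label before its position in the list is known.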
After the generation of the initial IR by the frontend, the IR is normalized by translating system function calls to lower level code, and other pseudo-instructions not defined by the RISC-V standard are translated to legal equivalent instruction sequences. Then, each temporary register is allocated to a physical machine register, spilling values to memory if the number of physical registers is not sufficient. Finally, the instructions of the program are written to the assembly-language output file specified by the command line arguments.
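The text does not say which register allocation algorithm ACSE uses. As one possible illustration of mapping temporaries to a limited set of physical registers with spilling, the sketch below implements a simplified linear scan over live intervals. All names (ToyInterval, toyLinearScan, TOY_SPILLED) are hypothetical, and the sketch only decides which temporaries are spilled; it does not generate the load and store instructions a real allocator would insert.

```c
#include <assert.h>

#define TOY_MAX_PHYS 32
#define TOY_SPILLED (-1)

/* Live interval of one temporary register (hypothetical structure). */
typedef struct {
  int start;   /* index of the first instruction using the temporary */
  int end;     /* index of the last instruction using the temporary */
  int physReg; /* output: physical register index, or TOY_SPILLED */
} ToyInterval;

/* Simplified linear scan. `iv` must be sorted by increasing start. */
void toyLinearScan(ToyInterval *iv, int n, int numPhys)
{
  int owner[TOY_MAX_PHYS]; /* interval currently held by each register */
  assert(numPhys > 0 && numPhys <= TOY_MAX_PHYS);
  for (int r = 0; r < numPhys; r++)
    owner[r] = -1;

  for (int i = 0; i < n; i++) {
    assert(i == 0 || iv[i - 1].start <= iv[i].start);

    /* Release registers whose interval ended before this one starts. */
    for (int r = 0; r < numPhys; r++) {
      if (owner[r] >= 0 && iv[owner[r]].end < iv[i].start)
        owner[r] = -1;
    }

    /* Find a free register, and the occupied one whose interval ends last. */
    int freeReg = -1;
    int victimReg = -1;
    for (int r = 0; r < numPhys; r++) {
      if (owner[r] < 0) {
        if (freeReg < 0)
          freeReg = r;
      } else if (victimReg < 0 ||
                 iv[owner[r]].end > iv[owner[victimReg]].end) {
        victimReg = r;
      }
    }

    if (freeReg >= 0) {
      iv[i].physReg = freeReg;
      owner[freeReg] = i;
    } else if (iv[owner[victimReg]].end > iv[i].end) {
      /* Spill the interval that stays live longest, reuse its register. */
      iv[owner[victimReg]].physReg = TOY_SPILLED;
      iv[i].physReg = victimReg;
      owner[victimReg] = i;
    } else {
      /* The new interval is the one that lives longest: spill it. */
      iv[i].physReg = TOY_SPILLED;
    }
  }
}
```

Spilling the interval that ends last is the usual heuristic for this family of allocators, since it frees a register for the longest time.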
ACSE is copyright (c) 2008-2024 Politecnico di Milano, and is licensed under the GNU GPL version 3. It has been developed by the following contributors:
Additional help and input has been provided by:
Please report any suspected bugs or defects to daniele.cattaneo <at> polimi.it.