From 2e124e11be8b88458f4fe8b8eb1ec0ff66a34eb7 Mon Sep 17 00:00:00 2001 From: Yann Herklotz Date: Sat, 10 Mar 2018 21:37:11 +0000 Subject: Improving documentation --- 3-compiler-documentation.md | 221 ------------------------------------ README.md | 160 ++++++++++++++++++++++++++ doc/.gitignore | 0 my-ast.png | Bin 62299 -> 0 bytes readme.md | 50 -------- res/my-ast.png | Bin 0 -> 62299 bytes test.c | 10 -- testing/ref_compiler.sh | 3 - testing/testcases/test_ADD.c | 4 - testing/testcases/test_ADD_driver.c | 8 -- 10 files changed, 160 insertions(+), 296 deletions(-) delete mode 100644 3-compiler-documentation.md create mode 100644 README.md delete mode 100644 doc/.gitignore delete mode 100644 my-ast.png delete mode 100644 readme.md create mode 100644 res/my-ast.png delete mode 100644 test.c delete mode 100644 testing/ref_compiler.sh delete mode 100644 testing/testcases/test_ADD.c delete mode 100644 testing/testcases/test_ADD_driver.c diff --git a/3-compiler-documentation.md b/3-compiler-documentation.md deleted file mode 100644 index da75e1d..0000000 --- a/3-compiler-documentation.md +++ /dev/null @@ -1,221 +0,0 @@ -Documentation -============= - -In total the documentation burden is (at most) 1000 words -plus one diagram. Assessment of the documentation is not relative -to compiler functionality, it just requires a description -of the compiler as-is, and a realistic assessment of the -compiler's strengths and weaknesses. - - -AST -=== - -Overview Diagram ----------------- - -_Add a diagram of your AST, which is designed to *usefully* communicate -the *important* properties of the AST._ - -![my-ast.png](my-ast.png) - - -Description ------------ - -_Describe the structure and organisation of your AST in 200 words -or fewer_. - -I used a pure abstract Node class as an entry point to the AST. -I then had a TranslationUnit class which contained the external declarations and function -definitions in a vector. All other classes used linked lists instead of vectors to store -multiple elements, for example, the statement class had a pointer to a next statement instead -of having a separate StatementList class. This meant that I did not have to write an extra List -class and made it easier to inherit from the right classes. I used the grammar to separated the -classes properly, as an Expression is completely different to a Type or a Statement or a Function. -This meant that I could separate the member functions and not have to declare all of them in the -Node class, as that would lead to a lot of empty definitions. These classes were mostly abstract too -but contained a few function definitions that would throw exceptions so that I did not have to -define these functions when I did not need them. I also had a few specific member functions -(eg. for UnaryExpression) for which I had to use dynamic and static pointer casting to access -them. - - -Strengths ---------- - -_Give two strengths or capabilities of your AST, using 50 words or less for each one_. - -### Strength 1 - -The class hierarchy and structure is very logical as it closely matches the grammar. -This also that it was easy to know what member variables a class needed and the -inheritance was very logical too. All the member functions are well separated -and only appear where they are actually used. - -### Strength 2 - -All the general base classes, that are mostly abstract as well, are in the bison union, -which means that I can use and assign all of those classes directly, and be more -specific in the member variables of those classes so that they only contain the types I need. - -Limitations ------------ - -_Give two limitations of your AST, using 50 words or less for each one_. - -### Limitation 1 - -The Type class is not very useful as it does not capture arrays correctly and store -all their information when using a multidimensional array for example. It also does not -enable me to extract this information correctly as it will only give me the primitive type of -the array. - -### Limitation 2 - -As I did not want all the classes to contain functions that they do not need, classes like -UnaryOperator have member functions. To access these I have to use dynamic casts and with my -linked lists, I always have to check for nullptr before doing anything with it. - - -Variable binding -================ - -General approach ----------------- - -_Describe your overall approach to mapping variable, parameters, etc. -into registers or memory locations at execution time, using 200 words -or less_. - -I did not use many registers because they are a limited resource, and instead I decided -only to use registers $2 and $3 for performing operations and rarely use registers $t0 and $t1 -when loading the address of a value, for example for pointers. I also used registers $4, $5, $6, $7 -for passing values to functions. I used a frame pointer for the current function to access the -stack. This frame is split into multiple parts, first, enough space to store and pass values -when making function calls, then space for all declared variables in the function, and lastly enough -space to store temporary results from expressions. It also stores the previous frame pointer and the -return address at the end of the frame. Every time I perform any operations, I store -the result of that operation in that temporary space in the frame. I keep track of the stack -position and temporary expression stack position using the Bindings class. This class also includes -a map of variables bindings, which binds the identifier to its type and stack position -relative to the frame pointer. The Bindings class is passed by value to account for scopes and -variable shadowing. - -Strengths ---------- - -_Give two strengths or capabilities of your binding approach, using 50 words or less for each one_. - -### Strength 1 - -The Bindings class stores the type of the identifier so that I can look -it up and perform the right operation in the Expression class. Storing the type, however, also -means that I do not have to store the type of an Expression, but can just deduce it. - -### Strength 2 - -By only using two registers for operations, I do not have to worry about having no more -registers available, and which registers will not be overwritten after a function call. -The temporary expression result will always be in the current frame of the function even after -a function call. - -Limitations ------------ - -_Give two limitations of your binding approach, using 50 words or less for each one_. - -### Limitation 1 - -As I am only using two registers to perform operations, I have to include loads and -stores to access the temporary results of the operations. I also store results of an operation -when I do not need the result anymore. This means that the code will run much slower. - -### Limitation 2 - -When counting the variables that are being assigned in the current function, I assume that -they are all integers. I also always store the return value $31 even when there is no function -call. This means that the frame is always much larger than it needs to be. - - -Reflection -========== - -Strengths ---------- - -_What two aspects of your compiler do you think work well (beyond -those identified in the AST and binding parts)?_ - -### Strength 1 - -The Type class was not structured well for handling arrays, however, it does work well -with the other types and makes it easy to store and load variables using the -right operations. It also made it easy to identify pointers for pointer arithmatic and store -values in arrays without padding. - -### Strength 2 - -Function calls work well as they follow the MIPS ABI, and leave enough space for the called -function to store the parameters in the previous frame. The return address is also always stored, -which make recursive calls possible. Function declarations also work because they do not have -to emit assembly. - -Scope for Improvement ---------------------- - -_What parts of your compiler do you think could be improved?_ - -### Improvement 1 - -I would like to work on multidimensional arrays as they do not fully work. To do -that I would have to add information such as the initial array in the type so that I can assign -arrays inside the array to a pointer and perform the right pointer arithmetic. - -### Improvement 2 - -I would also like to reduce the number of dynamic and static casts by adding more -intermediate classes. This would also make the whole AST design more extensible and it would -be easier to possibly add structures in the future. This would also make classes more specialized -and maintainable. - - -Functionality (not assessed) -============================ - -Which of these features does your compiler support (insert -an `x` to check a box): - -1 - [x] Local variables -2 - [x] Integer arithmetic -3 - [x] While -4 - [x] IfElse -5 - [x] For -6 - [x] Function calls -7 - [x] Arrays -8 - [x] Pointers -9 - [x] Strings -10 - [ ] Structures -11 - [ ] Floating-point - -Note that all features will be tested, regardless of what -appears here. This is documentation of what you expect to work, -versus what doesn't work. - - -Feedback (not assessed) -======================= - -_What aspects of your compiler would you like feedback on. -Be specific, as "what could I have done differently" is -too general to answer._ - -### Feedback 1 - -I would like to know if the Type class could have been improved for dealing with arrays and -pointers of pointers. - -### Feedback 2 - -Is using static and dynamic casts to figure out the type of an object a viable method or is -there a better way? diff --git a/README.md b/README.md new file mode 100644 index 0000000..b12c461 --- /dev/null +++ b/README.md @@ -0,0 +1,160 @@ +Documentation +============= + +In total the documentation burden is (at most) 1000 words +plus one diagram. Assessment of the documentation is not relative +to compiler functionality, it just requires a description +of the compiler as-is, and a realistic assessment of the +compiler's strengths and weaknesses. + + +AST +=== + +Overview Diagram +---------------- + +Overview of the ast that is built by the compiler. + +![my-ast.png](my-ast.png) + + +Description +----------- + +### Description of the structure + +I used a pure abstract Node class as an entry point to the AST. +I then had a TranslationUnit class which contained the external declarations and function +definitions in a vector. All other classes used linked lists instead of vectors to store +multiple elements, for example, the statement class had a pointer to a next statement instead +of having a separate StatementList class. This meant that I did not have to write an extra List +class and made it easier to inherit from the right classes. I used the grammar to separated the +classes properly, as an Expression is completely different to a Type or a Statement or a Function. +This meant that I could separate the member functions and not have to declare all of them in the +Node class, as that would lead to a lot of empty definitions. These classes were mostly abstract too +but contained a few function definitions that would throw exceptions so that I did not have to +define these functions when I did not need them. I also had a few specific member functions +(eg. for UnaryExpression) for which I had to use dynamic and static pointer casting to access +them. + + +Strengths +--------- + +The class hierarchy and structure is very logical as it closely matches the grammar. +This also that it was easy to know what member variables a class needed and the +inheritance was very logical too. All the member functions are well separated +and only appear where they are actually used. + +All the general base classes, that are mostly abstract as well, are in the bison union, +which means that I can use and assign all of those classes directly, and be more +specific in the member variables of those classes so that they only contain the types I need. + +Limitations +----------- + +The Type class is not very useful as it does not capture arrays correctly and store +all their information when using a multidimensional array for example. It also does not +enable me to extract this information correctly as it will only give me the primitive type of +the array. + +As I did not want all the classes to contain functions that they do not need, classes like +UnaryOperator have member functions. To access these I have to use dynamic casts and with my +linked lists, I always have to check for nullptr before doing anything with it. + + +Variable binding +================ + +General approach +---------------- + +I did not use many registers because they are a limited resource, and instead I decided +only to use registers $2 and $3 for performing operations and rarely use registers $t0 and $t1 +when loading the address of a value, for example for pointers. I also used registers $4, $5, $6, $7 +for passing values to functions. I used a frame pointer for the current function to access the +stack. This frame is split into multiple parts, first, enough space to store and pass values +when making function calls, then space for all declared variables in the function, and lastly enough +space to store temporary results from expressions. It also stores the previous frame pointer and the +return address at the end of the frame. Every time I perform any operations, I store +the result of that operation in that temporary space in the frame. I keep track of the stack +position and temporary expression stack position using the Bindings class. This class also includes +a map of variables bindings, which binds the identifier to its type and stack position +relative to the frame pointer. The Bindings class is passed by value to account for scopes and +variable shadowing. + +Strengths +--------- + +The Bindings class stores the type of the identifier so that I can look +it up and perform the right operation in the Expression class. Storing the type, however, also +means that I do not have to store the type of an Expression, but can just deduce it. + +By only using two registers for operations, I do not have to worry about having no more +registers available, and which registers will not be overwritten after a function call. +The temporary expression result will always be in the current frame of the function even after +a function call. + +Limitations +----------- + +As I am only using two registers to perform operations, I have to include loads and +stores to access the temporary results of the operations. I also store results of an operation +when I do not need the result anymore. This means that the code will run much slower. + +When counting the variables that are being assigned in the current function, I assume that +they are all integers. I also always store the return value $31 even when there is no function +call. This means that the frame is always much larger than it needs to be. + + +Reflection +========== + +Strengths +--------- + +The Type class was not structured well for handling arrays, however, it does work well +with the other types and makes it easy to store and load variables using the +right operations. It also made it easy to identify pointers for pointer arithmatic and store +values in arrays without padding. + +Function calls work well as they follow the MIPS ABI, and leave enough space for the called +function to store the parameters in the previous frame. The return address is also always stored, +which make recursive calls possible. Function declarations also work because they do not have +to emit assembly. + +Scope for Improvement +--------------------- + +I would like to work on multidimensional arrays as they do not fully work. To do +that I would have to add information such as the initial array in the type so that I can assign +arrays inside the array to a pointer and perform the right pointer arithmetic. + +I would also like to reduce the number of dynamic and static casts by adding more +intermediate classes. This would also make the whole AST design more extensible and it would +be easier to possibly add structures in the future. This would also make classes more specialized +and maintainable. + + +Functionality +============= + +Which of these features does your compiler support (insert +an `x` to check a box): + +- [x] Local variables +- [x] Integer arithmetic +- [x] While +- [x] IfElse +- [x] For +- [x] Function calls +- [x] Arrays +- [x] Pointers +- [x] Strings +- [ ] Structures +- [ ] Floating-point + +Note that all features will be tested, regardless of what +appears here. This is documentation of what you expect to work, +versus what doesn't work. diff --git a/doc/.gitignore b/doc/.gitignore deleted file mode 100644 index e69de29..0000000 diff --git a/my-ast.png b/my-ast.png deleted file mode 100644 index c7ad076..0000000 Binary files a/my-ast.png and /dev/null differ diff --git a/readme.md b/readme.md deleted file mode 100644 index 94aa8ce..0000000 --- a/readme.md +++ /dev/null @@ -1,50 +0,0 @@ -The module code deliverables come in three parts: - - Deliverable | Due | Weight | Notes --------------------------------|-------------|--------|------------------------------------ -[1-lexer.md](1-lexer.md) | Tue 7th Feb | 10% | A C lexer that produces JSON output -[2-parser.md](2-parser.md) | Tue 7th Mar | 10% | A (restricted) C parser that produces XML output -[3-compiler.md](3-compiler.md) | Tue 28th Mar | 50% | The final C compiler - -You will all get a bare private repository. It is up to you -if you want to clone the master specification, or to start from -scratch. Both approaches just require the git operations already used -in lab 1, and possibly looking at the documentation for the commands -you have already used. Remember that you can always do `git ${CMD} --help` -for any CMD such as `pull`, `push`, `commit`, `remote`. - -Starting from scratch ---------------------- - -1 - Create a local git repository using `git init` in a directory of your choice. - -2 - Commit files to your local repository (as you would normally commit) - -3 - Add the URL of your private repository as the remote `origin` (similar - to manually adding the `spec` remote in the lab). - -4 - You should now be able to push and pull commits to origin, which is - your private repo. - -Cloning the spec ----------------- - -1 - Clone the master specification from the public URL. This will produce - a directory called `langproc-2016-cw` by default. - -2 - Rename it to something more meaningful, e.g. `langproc-2016-cw-${LOGIN}`. - -3 - Use `git remote` to rename `origin` (which points at the public spec) to - `spec`. - -4 - Use `git remote` to add a new remote called `origin` pointing at - your private repository. - -5 - You should then be able to push and pull to `origin`, and pull from `spec`, - as in the lab. - -Submissions ------------ - -Submission will be via github (code) + blackboard (commit hash), -as in the lab. diff --git a/res/my-ast.png b/res/my-ast.png new file mode 100644 index 0000000..c7ad076 Binary files /dev/null and b/res/my-ast.png differ diff --git a/test.c b/test.c deleted file mode 100644 index 4bfc9da..0000000 --- a/test.c +++ /dev/null @@ -1,10 +0,0 @@ -int main() -{ - int x[4][3] = { - { 1, 2 }, - { 3, 4, 5}, - { 6, 7, 8 } - }; - - return x[2][1]; -} diff --git a/testing/ref_compiler.sh b/testing/ref_compiler.sh deleted file mode 100644 index b5425b0..0000000 --- a/testing/ref_compiler.sh +++ /dev/null @@ -1,3 +0,0 @@ -#!/bin/bash - -mips-linux-gnu-gcc -c -S -x c - -o - diff --git a/testing/testcases/test_ADD.c b/testing/testcases/test_ADD.c deleted file mode 100644 index 327dc85..0000000 --- a/testing/testcases/test_ADD.c +++ /dev/null @@ -1,4 +0,0 @@ -int f(int a, int b) -{ - return 1; -} diff --git a/testing/testcases/test_ADD_driver.c b/testing/testcases/test_ADD_driver.c deleted file mode 100644 index 9e74f32..0000000 --- a/testing/testcases/test_ADD_driver.c +++ /dev/null @@ -1,8 +0,0 @@ -int f(int a, int b); - -int main() -{ - int r=f(10,11); - - return r == 21; -} -- cgit