diff options
author | Yann Herklotz <git@yannherklotz.com> | 2022-02-26 15:00:07 +0000 |
---|---|---|
committer | Yann Herklotz <git@yannherklotz.com> | 2022-02-26 15:00:07 +0000 |
commit | 4dac87d44f2856f5db25726a2c381f869ee73af6 (patch) | |
tree | 6edf60103fe6f35d01b27fef2877e50d1d577439 /content.org | |
parent | 5e7713bc58833552a36b1dca4e582952860a5d2a (diff) | |
download | vericert-docs-4dac87d44f2856f5db25726a2c381f869ee73af6.tar.gz vericert-docs-4dac87d44f2856f5db25726a2c381f869ee73af6.zip |
Update name of content
Diffstat (limited to 'content.org')
-rw-r--r-- | content.org | 859 |
1 files changed, 859 insertions, 0 deletions
diff --git a/content.org b/content.org new file mode 100644 index 0000000..b702c3a --- /dev/null +++ b/content.org @@ -0,0 +1,859 @@ +#+title: Vericert Documentation +#+author: Yann Herklotz +#+email: git@ymhg.org +#+setupfile: setup.org + +* Vericert +:PROPERTIES: +:EXPORT_FILE_NAME: _index +:CUSTOM_ID: vericert +:END: + +A formally verified high-level synthesis (HLS) tool written in Coq, building on top of [[https://github.com/AbsInt/CompCert][CompCert]]. +This ensures the correctness of the C to Verilog translation according to our Verilog semantics and +CompCert's C semantics, removing the need to check the resulting hardware for behavioural +correctness. + +** Features + +The project is currently a work in progress. Currently, the following C features are supported and +have all been proven correct, providing a verified translation from C to Verilog: + +- all int operations, +- non-recursive function calls, +- local arrays and pointers, and +- control-flow structures such as if-statements, for-loops, etc... + +** Content + +- [[#building][Vericert Documentation]] + +** Papers + +- OOPSLA '21 :: Yann Herklotz, James D. Pollard, Nadesh Ramanathan, and John Wickerson. Formal + Verification of High-Level Synthesis. In /Proc. ACM Program. Lang./ 5, OOPSLA, 2021. [ [[./static/papers/fvhls_oopsla21.pdf][pdf]] ] + +- LATTE '21 :: Yann Herklotz and John Wickerson. High-level synthesis tools should be proven + correct. In /Workshop on Languages, Tools, and Techniques for Accelerator + Design/, 2021. [ [[./static/papers/hlsspc_latte2021.pdf][pdf]] ] + +** Mailing lists + +For discussions, you can join the following mailing list: [[https://lists.sr.ht/~ymherklotz/vericert-discuss][lists.sr.ht/~ymherklotz/vericert-discuss]]. + +For contributing patches to the [[https://sr.ht/~ymherklotz/vericert/][sourcehut]] repository: [[https://lists.sr.ht/~ymherklotz/vericert-devel][lists.sr.ht/~ymherklotz/vericert-devel]]. + +* Index +:PROPERTIES: +:EXPORT_HUGO_SECTION: menu +:EXPORT_FILE_NAME: index +:EXPORT_HUGO_CUSTOM_FRONT_MATTER: :headless true +:END: + +- [[#docs][Docs]] + - [[#building][Building]] + +* Documentation + +** Docs +:PROPERTIES: +:EXPORT_FILE_NAME: _index +:EXPORT_HUGO_SECTION: docs +:CUSTOM_ID: docs +:END: + +Vericert translates C code into a hardware description language called Verilog, which can then be +synthesised into hardware, to be placed onto a field-programmable gate array (FPGA) or +application-specific integrated circuit (ASIC). + +#+attr_html: :width 600 +#+caption: Current design of Vericert, where HTL is an intermediate language representing a finite state machine with data-path (FSMD) and Verilog is the target language. +#+name: fig:design +[[./static/images/toolflow.svg]] + +The design shown in Figure [[fig:design]] shows how Vericert leverages an existing verified C compiler +called [[https://compcert.org/compcert-C.html][CompCert]] to perform this translation. + +** Building Vericert +:PROPERTIES: +:EXPORT_FILE_NAME: building +:EXPORT_HUGO_SECTION: docs +:CUSTOM_ID: building +:END: + +To build Vericert, the provided Makefile can be used. External dependencies are needed to build the +project, which can be pulled in automatically with [[https://nixos.org/nix/][nix]] using the provided ~default.nix~ and ~shell.nix~ +files. + +The project is written in Coq, a theorem prover, which is extracted to OCaml so that it can then be +compiled and executed. The dependencies of this project are the following: + +- [[https://coq.inria.fr/][Coq]]: theorem prover that is used to also program the HLS tool. +- [[https://ocaml.org/][OCaml]]: the OCaml compiler to compile the extracted files. +- [[https://github.com/ocaml/dune][dune]]: build tool for ocaml projects to gather all the ocaml files and compile them in the right + order. +- [[http://gallium.inria.fr/~fpottier/menhir/][menhir]]: parser generator for ocaml. +- [[https://github.com/ocaml/ocamlfind][findlib]] to find installed OCaml libraries. +- [[https://gcc.gnu.org/][GCC]]: compiler to help build CompCert. + +These dependencies can be installed manually, or automatically through Nix. + +*** Downloading CompCert + +CompCert is added as a submodule in the ~lib/CompCert~ directory. It is needed to run the build +process below, as it is the one dependency that is not downloaded by nix, and has to be downloaded +together with the repository. To clone CompCert together with this project, you can run: + +#+begin_src shell +git clone --recursive https://github.com/ymherklotz/vericert +#+end_src + +If the repository is already cloned, you can run the following command to make sure that CompCert is +also downloaded: + +#+begin_src shell +git submodule update --init +#+end_src + +*** Setting up Nix + +Nix is a package manager that can create an isolated environment so that the builds are +reproducible. Once nix is installed, it can be used in the following way. + +To open a shell which includes all the necessary dependencies, one can use: + +#+begin_src shell +nix-shell +#+end_src + +which will open a shell that has all the dependencies loaded. + +*** Makefile build + +If the dependencies were installed manually, or if one is in the ~nix-shell~, the project can be built +by running: + +#+begin_src shell +make -j8 +#+end_src + +and installed locally, or under the ~PREFIX~ location using: + +#+begin_src shell +make install +#+end_src + +Which will install the binary in ~./bin/vericert~ by default. However, this can be changed by changing +the ~PREFIX~ environment variable, in which case the binary will be installed in ~$PREFIX/bin/vericert~. + +*** Testing + +To test out ~vericert~ you can try the following examples which are in the test folder using the +following: + +#+begin_src shell +./bin/vericert test/loop.c -o loop.v +./bin/vericert test/conditional.c -o conditional.v +./bin/vericert test/add.c -o add.v +#+end_src + +Or by running the test suite using the following command: + +#+begin_src shell +make test +#+end_src +** Using Vericert +:PROPERTIES: +:EXPORT_FILE_NAME: using-vericert +:EXPORT_HUGO_SECTION: docs +:CUSTOM_ID: using-vericert +:END: + +Vericert can be used to translate a subset of C into Verilog. As a simple example, consider the +following C file (~main.c~): + +#+begin_src C +void matrix_multiply(int first[2][2], int second[2][2], int multiply[2][2]) { + int sum = 0; + for (int c = 0; c < 2; c++) { + for (int d = 0; d < 2; d++) { + for (int k = 0; k < 2; k++) { + sum = sum + first[c][k]*second[k][d]; + } + multiply[c][d] = sum; + sum = 0; + } + } +} + +int main() { + int f[2][2] = {{1, 2}, {3, 4}}; + int s[2][2] = {{5, 6}, {7, 8}}; + int m[2][2] = {{0, 0}, {0, 0}}; + + matrix_multiply(f, s, m); + return m[1][1]; +} +#+end_src + +It can be compiled using the following command, assuming that vericert is somewhere on the path. + +#+begin_src shell +vericert main.c -o main.v +#+end_src + +The Verilog file contains a top-level test-bench, which can be given to any Verilog simulator to +simulate the hardware, which should give the same result as executing the C code. Using [[http://iverilog.icarus.com/][Icarus +Verilog]] as an example: + +#+begin_src shell +iverilog -o main_v main.v +#+end_src + +When executing, it should therefore print the following: + +#+begin_src shell +$ ./main_v +finished: 50 +#+end_src + +This gives the same result as executing the C in the following way: + +#+begin_src shell +$ gcc -o main_c main.c +$ ./main_c +$ echo $? +50 +#+end_src + +** Unreleased Features +:PROPERTIES: +:EXPORT_FILE_NAME: unreleased +:EXPORT_HUGO_SECTION: docs +:CUSTOM_ID: unreleased-features +:END: + +The following are unreleased features in Vericert that are currently being worked on and have not +been completely proven correct yet. Currently this includes features such as: + +- [[#scheduling][scheduling]], +- [[#operation-chaining][operation chaining]], +- [[#if-conversion][if-conversion]], and +- [[#functions][functions]]. + +This page gives some preliminary information on how the features are implemented and how the proofs +for the features are being done. Once these features are properly implemented, they will be added +to the proper documentation. + +*** Scheduling +:PROPERTIES: +:CUSTOM_ID: scheduling +:END: + +Scheduling is an optimisation which is used to run various instructions in parallel that are +independent to each other. + +*** Operation Chaining +:PROPERTIES: +:CUSTOM_ID: operation-chaining +:END: + +Operation chaining is an optimisation that can be added on to scheduling and allows for the +sequential execution of instructions in a clock cycle, while executing other instructions in +parallel in the same clock cycle. + +*** If-conversion +:PROPERTIES: +:CUSTOM_ID: if-conversion +:END: + +If-conversion is an optimisation which can turn code with simple control flow into a single block +(called a hyper-block), using predicated instructions. + +*** Functions +:PROPERTIES: +:CUSTOM_ID: functions +:END: + +Functions are currently only inlined in Vericert, however, we are working on a proper interface to +integrate function calls into the hardware. + +** Coq Style Guide + :PROPERTIES: + :CUSTOM_ID: coq-style-guide + :EXPORT_FILE_NAME: coq-style-guide + :EXPORT_HUGO_SECTION: docs + :END: + +This style guide was taken from [[https://github.com/project-oak/silveroak][Silveroak]], it outlines code style for Coq code in this +repository. There are certainly other valid strategies and opinions on Coq code style; this is laid +out purely in the name of consistency. For a visual example of the style, see the [[#example][example]] at the +bottom of this file. + +*** Code organization + :PROPERTIES: + :CUSTOM_ID: code-organization + :END: +**** Legal banner + :PROPERTIES: + :CUSTOM_ID: legal-banner + :END: + +- Files should begin with a copyright/license banner, as shown in the example above. + +**** Import statements + :PROPERTIES: + :CUSTOM_ID: import-statements + :END: + +- =Require Import= statements should all go at the top of the file, followed by file-wide =Import= + statements. + + - =Import=s often contain notations or typeclass instances that might override notations or + instances from another library, so it's nice to highlight them separately. + +- One =Require Import= statement per line; it's easier to scan that way. +- =Require Import= statements should use "fully-qualified" names (e.g. =Require Import + Coq.ZArith.ZArith= instead of =Require Import ZArith=). + + - Use the =Locate= command to find the fully-qualified name! + +- =Require Import='s should go in the following order: + + 1. Standard library dependencies (start with =Coq.=) + 2. External dependencies (anything outside the current project) + 3. Same-project dependencies + +- =Require Import='s with the same root library (the name before the first =.=) should be grouped + together. Within each root-library group, they should be in alphabetical order (so =Coq.Lists.List= + before =Coq.ZArith.ZArith=). + +**** Notations and scopes + :PROPERTIES: + :CUSTOM_ID: notations-and-scopes + :END: + +- Any file-wide =Local Open Scope='s should come immediately after the =Import=s (see example). + + - Always use =Local Open Scope=; just =Open Scope= will sneakily open the scope for those who import + your file. + +- Put notations in their own separate modules or files, so that those who import your file can + choose whether or not they want the notations. + + - Conflicting notations can cause a lot of headache, so it comes in very handy to leave this + flexibility! + +*** Formatting + :PROPERTIES: + :CUSTOM_ID: formatting + :END: +**** Line length + :PROPERTIES: + :CUSTOM_ID: line-length + :END: + +- Maximum line length 80 characters. + + - Many Coq IDE setups divide the screen in half vertically and use only half to display source + code, so more than 80 characters can be genuinely hard to read on a laptop. + +**** Whitespace and indentation + :PROPERTIES: + :CUSTOM_ID: whitespace-and-indentation + :END: + +- No trailing whitespace. + +- Spaces, not tabs. + +- Files should end with a newline. + + - Many editors do this automatically on save. + +- Colons may be either "English-spaced", with no space before the colon and one space after (=x: nat=) + or "French-spaced", with one space before and after (=x : nat=). + +- Default indentation is 2 spaces. + + - Keeping this small prevents complex proofs from being indented ridiculously far, and matches IDE + defaults. + +- Use 2-space indents if inserting a line break immediately after: + + - =Proof.= + - =fun <...> =>= + - =forall <...>,= + - =exists <....>,= + +- The style for indenting arguments in function application depends on where you make a line + break. If you make the line break immediately after the function name, use a 2-space + indent. However, if you make it after one or more arguments, align the next line with the first + argument: + + #+begin_src coq + (Z.pow + 1 2) + (Z.pow 1 2 3 + 4 5 6) + #+end_src + +- =Inductive= cases should not be indented. Example: + + #+begin_src coq + Inductive Foo : Type := + | FooA : Foo + | FooB : Foo + . + #+end_src + +- =match= or =lazymatch= cases should line up with the "m" in =match= or "l" in =lazymatch=, as in the + following examples: + + #+begin_src coq + match x with + | 3 => true + | _ => false + end. + + lazymatch x with + | 3 => idtac + | _ => fail "Not equal to 3:" x + end. + + repeat match goal with + | _ => progress subst + | _ => reflexivity + end. + + do 2 lazymatch goal with + | |- context [eq] => idtac + end. + #+end_src + +*** Definitions and Fixpoints + :PROPERTIES: + :CUSTOM_ID: definitions-and-fixpoints + :END: + +- It's okay to leave the return type of =Definition='s and =Fixpoint='s implicit (e.g. ~Definition x := 5~ + instead of ~Definition x : nat := 5~) when the type is very simple or obvious (for instance, the + definition is in a file which deals exclusively with operations on =Z=). + +*** Inductives + :PROPERTIES: + :CUSTOM_ID: inductives + :END: + +- The =.= ending an =Inductive= can be either on the same line as the last case or on its own line + immediately below. That is, both of the following are acceptable: + + #+begin_src coq + Inductive Foo : Type := + | FooA : Foo + | FooB : Foo + . + Inductive Foo : Type := + | FooA : Foo + | FooB : Foo. + #+end_src + +*** Lemma/Theorem statements + :PROPERTIES: + :CUSTOM_ID: lemmatheorem-statements + :END: + +- Generally, use =Theorem= for the most important, top-level facts you prove and =Lemma= for everything + else. +- Insert a line break after the colon in the lemma statement. +- Insert a line break after the comma for =forall= or =exist= quantifiers. +- Implication arrows (=->=) should share a line with the previous hypothesis, not the following one. +- There is no need to make a line break after every =->=; short preconditions may share a line. + +*** Proofs and tactics + :PROPERTIES: + :CUSTOM_ID: proofs-and-tactics + :END: + +- Use the =Proof= command (lined up vertically with =Lemma= or =Theorem= it corresponds to) to open a + proof, and indent the first line after it 2 spaces. + +- Very small proofs (where =Proof. <tactics> Qed.= is <= 80 characters) can go all in one line. + +- When ending a proof, align the ending statement (=Qed=, =Admitted=, etc.) with =Proof=. + +- Avoid referring to autogenerated names (e.g. =H0=, =n0=). It's okay to let Coq generate these names, + but you should not explicitly refer to them in your proof. So =intros; my_solver= is fine, but + =intros; apply H1; my_solver= is not fine. + + - You can force a non-autogenerated name by either putting the variable before the colon in the + lemma statement (=Lemma foo x : ...= instead of =Lemma foo : forall x, ...=), or by passing + arguments to =intros= (e.g. =intros ? x= to name the second argument =x=) + +- This way, the proof won't break when new hypotheses are added or autogenerated variable names + change. + +- Use curly braces ={}= for subgoals, instead of bullets. + +- /Never write tactics with more than one subgoal focused./ This can make the proof very confusing to + step through! If you have more than one subgoal, use curly braces. + +- Consider adding a comment after the opening curly brace that explains what case you're in (see + example). + + - This is not necessary for small subgoals but can help show the major lines of reasoning in large + proofs. + +- If invoking a tactic that is expected to return multiple subgoals, use =[ | ... | ]= before the =.= to + explicitly specify how many subgoals you expect. + + - Examples: =split; [ | ].= =induction z; [ | | ].= + - This helps make code more maintainable, because it fails immediately if your tactic no longer + solves as many subgoals as expected (or unexpectedly solves more). + +- If invoking a string of tactics (composed by =;=) that will break the goal into multiple subgoals + and then solve all but one, still use =[ ]= to enforce that all but one goal is solved. + + - Example: =split; try lia; [ ]=. + +- Tactics that consist only of =repeat=-ing a procedure (e.g. =repeat match=, =repeat first=) should + factor out a single step of that procedure a separate tactic called =<tactic name>_step=, because + the single-step version is much easier to debug. For instance: + + #+begin_src coq + Ltac crush_step := + match goal with + | _ => progress subst + | _ => reflexivity + end. + Ltac crush := repeat crush_step. + #+end_src + +*** Naming + :PROPERTIES: + :CUSTOM_ID: naming + :END: + +- Helper proofs about standard library datatypes should go in a module that is named to match the + standard library module (see example). + + - This makes the helper proofs look like standard-library ones, which is helpful for categorizing + them if they're genuinely at the standard-library level of abstraction. + +- Names of modules should start with capital letters. +- Names of inductives and their constructors should start with capital letters. +- Names of other definitions/lemmas should be snake case. + +*** Example + :PROPERTIES: + :CUSTOM_ID: example + :END: +A small standalone Coq file that exhibits many of the style points. + +#+begin_src coq +(* + * Vericert: Verified high-level synthesis. + * Copyright (C) 2021 Name <email@example.com> + * + * <License...> + *) + + Require Import Coq.Lists.List. + Require Import Coq.micromega.Lia. + Require Import Coq.ZArith.ZArith. + Import ListNotations. + Local Open Scope Z_scope. + + (* Helper proofs about standard library integers (Z) go within [Module Z] so + that they match standard-library Z lemmas when used. *) + Module Z. + Lemma pow_3_r x : x ^ 3 = x * x * x. + Proof. lia. Qed. (* very short proofs can go all on one line *) + + Lemma pow_4_r x : x ^ 4 = x * x * x * x. + Proof. + change 4 with (Z.succ (Z.succ (Z.succ (Z.succ 0)))). + repeat match goal with + | _ => rewrite Z.pow_1_r + | _ => rewrite Z.pow_succ_r by lia + | |- context [x * (?a * ?b)] => + replace (x * (a * b)) with (a * b * x) by lia + | _ => reflexivity + end. + Qed. + End Z. + (* Now we can access the lemmas above as Z.pow_3_r and Z.pow_4_r, as if they + were in the ZArith library! *) + + Definition bar (x y : Z) := x ^ (y + 1). + + (* example with a painfully manual proof to show case formatting *) + Lemma bar_upper_bound : + forall x y a, + 0 <= x <= a -> 0 <= y -> + 0 <= bar x y <= a ^ (y + 1). + Proof. + (* avoid referencing autogenerated names by explicitly naming variables *) + intros x y a Hx Hy. revert y Hy x a Hx. + (* explicitly indicate # subgoals with [ | ... | ] if > 1 *) + cbv [bar]; refine (natlike_ind _ _ _); [ | ]. + { (* y = 0 *) + intros; lia. } + { (* y = Z.succ _ *) + intros. + rewrite Z.add_succ_l, Z.pow_succ_r by lia. + split. + { (* 0 <= bar x y *) + apply Z.mul_nonneg_nonneg; [ lia | ]. + apply Z.pow_nonneg; lia. } + { (* bar x y < a ^ y *) + rewrite Z.pow_succ_r by lia. + apply Z.mul_le_mono_nonneg; try lia; + [ apply Z.pow_nonneg; lia | ]. + (* For more flexible proofs, use match statements to find hypotheses + rather than referring to them by autogenerated names like H0. In this + case, we'll take any hypothesis that applies to and then solves the + goal. *) + match goal with H : _ |- _ => apply H; solve [auto] end. } } + Qed. + + (* Put notations in a separate module or file so that importers can + decide whether or not to use them. *) + Module BarNotations. + Infix "#" := bar (at level 40) : Z_scope. + Notation "x '##'" := (bar x x) (at level 40) : Z_scope. + End BarNotations. +#+end_src +* Blog +** Blog +:PROPERTIES: +:EXPORT_FILE_NAME: _index +:EXPORT_HUGO_SECTION: blog +:CUSTOM_ID: blog +:END: + +Blog posts: + +** A First Look at Vericert :introduction:summary:@article: +:PROPERTIES: +:EXPORT_DATE: 2021-10-09 +:EXPORT_FILE_NAME: a-first-look-at-vericert +:EXPORT_HUGO_SECTION: blog +:EXPORT_HUGO_CUSTOM_FRONT_MATTER: :summary "Vericert is a formally verified high-level synthesis tool, translating C code into a hardware design expressed in Verilog." :aliases '("/2021/09/a-first-look-at-vericert/") +:CUSTOM_ID: a-first-look-at-vericert +:END: + +Compilers are increasingly being relied upon to produce binaries from code without introducing any +bugs in the process. Therefore, formally verified compilers like [[https://compcert.org][CompCert]] were introduced to +formally prove that the compilation process does not change the behaviour of the program. This +allowed companies like Airbus to move critical code from pure assembly to C. + +Similarly, the hardware industry mainly uses hardware description languages, such as Verilog or +VHDL, to design their hardware, which can then either be placed onto an FPGA for quick testing, or +eventually turned into an ASIC. As there are many commercial tools available for the verification +of HDLs, designers prefer to have full control over the hardware and therefore use HDLs directly. +However, there are a few downsides with using an HDL instead of higher-level abstractions. First, +it makes is computationally harder to check the functionality of hardware, as it needs to be +simulated. Second, it often requires a lot of boilerplate code that could be abstracted away +otherwise. Alternatives do exist though, such as high-level synthesis tools (HLS), or higher-level +hardware description languages. + +So why not use higher-level HDLs then? They seem to solve all the problems that current HDLs face. +Well, the adoption is hindered by the fact that even though it seems like the perfect solution to +have a new language dedicated to hardware design that provides a higher-level of abstraction than +existing HDLs, the fact is that the syntax and semantics of such a language often change drastically +compared to the existing HDLs which everyone is familiar with. It is already the case that there is +a shortage of hardware designers, and it is therefore even more unlikely that these hardware +designers would know an alternative hardware design language. + +HLS, on the other hand, tackles the problem from a different perspective, and instead ideally aims +to bring hardware design to the software designer. This is achieved by compiling a subset of valid +software programs directly into hardware, therefore making it much easier to get started with +hardware design and write algorithms in hardware. This not only brings hardware design to a larger +number of people, but also brings all the benefits of the software ecosystem to verify the hardware +designs and algorithms. The following sections will describe the benefits and drawbacks of HLS in +more detail, as well as why one would even consider having to formally verify the translation as +well. + +*** A bit about HLS + +Ideally, HLS converts any software code into an equivalent hardware design, meaning one does not +have to worry about designing any hardware and only has to focus on the functionality of the +hardware. In addition to that, this means that a lot of software code can be reused to create +hardware accelerators, speeding up the design process even more. + +#+caption: Typical HLS flow compared to the standard software flow. +#+attr_html: :width 500px +[[./static/images/hls-flow-handdrawn.svg]] + +However, in practice this does not seem to be the case yet. For now, the state-of-the-art HLS +compilers restrict the subset of the input language greatly, and often cannot optimise it optimally. +The main problem with optimisations is that hardware has a much greater design space which can be +explored, compared to software. The HLS compiler therefore has to guess in which direction it +should optimise, and try and find a Pareto optimal solution. This is unlikely to be optimal, +especially if one has various different constraints one could not supply to the HLS tool, making it +very difficult to then use the generated hardware. These are currently the main research questions +in HLS. + +One aspect which is often overlooked though, is that HLS also needs to be correct to be useful, as +it generates designs that are often difficult to check. One of the main advantages of HLS is also +that it should be possible to check the correctness of the hardware design purely in software, +making the design process so much more efficient. The fact that it might therefore introduce bugs +in the process, means that the final HLS designs need to be re-verified afterwards. + +We have written a [[./static/papers/hlsspc_latte2021.pdf][workshop paper]] published at LATTE'21 which describes why HLS in particular is an +important translation to verify. + +*** How do we formally verify it? + +#+caption: Add a back end branching off the three-address code intermediate language to produce Verilog. +#+attr_html: :width 500px +[[./static/images/toolflow-handdrawn.svg]] + +We use CompCert, an existing formally verified C compiler, to translate C into 3AC[fn:1], which is +the starting point of our high-level synthesis pass. We chose this intermediate language in +CompCert to branch off of, as it does not make any assumptions on the number of registers available, +which is ideal for hardware as registers are abundantly available. In addition to that, each +instruction in 3AC actually directly corresponds to simple operations that can be implemented in +hardware. + +The process of formally verifying the translation of instructions is quite simple then, we just need +to prove that the Verilog statement we translate it to corresponds to the instruction. In most +cases there is an equivalent operator in Verilog that performs this action, however, in some cases +they don't quite match up perfectly, in which case we need to prove some properties about integer +arithmetic to prove that the two operations are the same.[fn:2] + +**** Translating the memory model + +The main problem in the proof is: how do we translate CompCert's abstract view and description of +memory to a more concrete implementation of RAM which can subsequently be implemented as a proper +RAM by the synthesis tool? + +Implementation wise this is quite straight-forward. However, there are some constraints that need +to be taken into account: + +- The memory in hardware needs to be word-addressed for the best performance. +- Memory in hardware needs to use a proper interface to work. + +These two constraints don't really complicate the implementation much, but they do complicate the +proofs quite considerably. Firstly, it's not straightforward that /any/ address can even be divided +by four (to translate it from a byte-addressed memory into an index of a word-addressed array). +Secondly, it's not clear that the RAM will always behave properly, as one now has to reason about a +completely different ~always~ block, which will be triggered at all clock edges. + +However, this translation is still provable. The first issue can be simplified by proving that +loads and stores that are valid always have to be word-aligned, meaning they can then be divided by +four. + +The insertion of the RAM is also provable, because we can design a self-disabling RAM which will +only be activated when it is being used, making the proof considerably simpler as one doesn't have +to reason about the RAM when it isn't enabled. + +*** Useful links + +- [[./static/papers/fvhls_oopsla21.pdf][OOPSLA'21 preprint]]. +- [[https://youtu.be/clPiKbKVlUA][OOPSLA'21 presentation]]. +- [[https://github.com/ymherklotz/vericert][Github repository]]. + +* Future Work + +** Future Work +:PROPERTIES: +:EXPORT_FILE_NAME: future +:END: + +This section contains future work that should be added to Vericert to make it into a better +high-level synthesis tool. + +The next interesting optimisations that should be looked at are the following: + +- [[#globals][globals]], +- [[#type-support][type support]], +- [[#memory-partitioning][memory partitioning]], and +- [[#loop-pipelining][loop pipelining]]. + +*** Globals +:PROPERTIES: +:CUSTOM_ID: globals +:END: + +Globals are an important feature to add, as have to be handled carefully in HLS, because they have +to be placed into memory, and are often used in HLS designs. Proper handling of globals would allow +for a larger subset of programs to be compiled, even allowing for larger benchmarks to be used, such +as CHStone. + +#+begin_quote +Vericert is a formally verified high-level synthesis (HLS) tool written in Coq, which translates C code +into Verilog hardware designs. One challenge in HLS is supporting different memory architectures +and interacting with memory in an optimal way. + +Vericert currently lacks support for multiple memories, and instead only constructs one large memory +which contains all elements of the stack. Due to this limitation, Vericert mainly inlines all +function calls and does not support global variables. This project would be about adding support +for multiple memories to Vericert and proving the correctness of the improved translation. This +would then allow globals to be compiled, which greatly increases the kinds of programs that can be +translated. + +This project would be ideal for students interested in hardware and theorem proving in Coq. +#+end_quote + +*** Type Support +:PROPERTIES: +:CUSTOM_ID: type-support +:END: + +It would also be useful to have support for other datatypes in C, such as ~char~ or ~short~, as using +these small datatypes is also quite popular in HLS to make the final designs more efficient. + +#+begin_quote +Vericert is a formally verified high-level synthesis (HLS) tool written in Coq, which translates C code +into Verilog hardware designs. However, it currently only supports 32-bit integer types, which +limits the effectiveness of HLS, as smaller types cannot be supported and are therefore not +represented properly. + +This project would address this problem by adding support for multiple different types in Vericert, +and prove the new translation correct in Coq. Furthermore, the current memory in Vericert also only +supports 32-bits, so this project could also address that by generalising the memory module and +supporting smaller memory types. +#+end_quote + +*** Register Allocation +:PROPERTIES: +:CUSTOM_ID: register-allocation +:END: + +Register allocation is an optimisation which minimises the register resource usage of the design, +while still keeping the same throughput in most cases. Compared to standard software compilers, HLS +tools do not normally require a lot of register allocation, however, this can still help in a lot of +cases when registers are used unnecessarily. + +#+begin_quote +Vericert is a formally verified high-level synthesis (HLS) tool written in Coq, which translates C code +into Verilog hardware designs. +#+end_quote + +*** Memory Partitioning +:PROPERTIES: +:CUSTOM_ID: memory-partitioning +:END: + +Memory partitioning is quite an advanced optimisation, which could be combined with the support for +globals so as to make memory layouts on the FPGA more efficient and run various memory operations in +parallel. + +*** Loop pipelining +:PROPERTIES: +:CUSTOM_ID: loop-pipelining +:END: + +Loop pipelining is an optimisation to schedule loops, instead of only scheduling the instructions +inside of the loop. There are two versions of loop pipelining, software and hardware loop +pipelining. The former is done purely on instructions, whereas the latter is performed in tandem +with [[#scheduling][scheduling]]. + +* Footnotes +[fn:2] One example is with the ~Oshrximm~ instruction, which is represented using division in +CompCert, whereas it should actually perform multiple shifts. + +[fn:1] Also known as RTL in the CompCert literature, however, we refer to it as 3AC to avoid +confusion with register-transfer level, often used to describe the target of the HLS tool. |