Update name of content

author: Yann Herklotz <git@yannherklotz.com> 2022-02-26 15:00:07 +0000
committer: Yann Herklotz <git@yannherklotz.com> 2022-02-26 15:00:07 +0000
commit: 4dac87d44f2856f5db25726a2c381f869ee73af6 (patch)
tree: 6edf60103fe6f35d01b27fef2877e50d1d577439 /content.org
parent: 5e7713bc58833552a36b1dca4e582952860a5d2a (diff)
download: vericert-docs-4dac87d44f2856f5db25726a2c381f869ee73af6.tar.gz
vericert-docs-4dac87d44f2856f5db25726a2c381f869ee73af6.zip
1 files changed, 859 insertions, 0 deletions
diff --git a/content.org b/content.org
new file mode 100644
index 0000000..b702c3a
--- /dev/null
+++ b/content.org
@@ -0,0 +1,859 @@
+#+title: Vericert Documentation
+#+author: Yann Herklotz
+#+email: git@ymhg.org
+#+setupfile: setup.org
+
+* Vericert
+:PROPERTIES:
+:EXPORT_FILE_NAME: _index
+:CUSTOM_ID: vericert
+:END:
+
+A formally verified high-level synthesis (HLS) tool written in Coq, building on top of [[https://github.com/AbsInt/CompCert][CompCert]].
+This ensures the correctness of the C to Verilog translation according to our Verilog semantics and
+CompCert's C semantics, removing the need to check the resulting hardware for behavioural
+correctness.
+
+** Features
+
+The project is currently a work in progress.  Currently, the following C features are supported and
+have all been proven correct, providing a verified translation from C to Verilog:
+
+- all int operations,
+- non-recursive function calls,
+- local arrays and pointers, and
+- control-flow structures such as if-statements, for-loops, etc...
+
+** Content
+
+- [[#building][Vericert Documentation]]
+
+** Papers
+
+- OOPSLA '21 :: Yann Herklotz, James D. Pollard, Nadesh Ramanathan, and John Wickerson. Formal
+  Verification of High-Level Synthesis. In /Proc. ACM Program. Lang./ 5, OOPSLA, 2021. [ [[./static/papers/fvhls_oopsla21.pdf][pdf]] ]
+
+- LATTE '21 :: Yann Herklotz and John Wickerson. High-level synthesis tools should be proven
+  correct. In /Workshop on Languages, Tools, and Techniques for Accelerator
+  Design/, 2021. [ [[./static/papers/hlsspc_latte2021.pdf][pdf]] ]
+
+** Mailing lists
+
+For discussions, you can join the following mailing list: [[https://lists.sr.ht/~ymherklotz/vericert-discuss][lists.sr.ht/~ymherklotz/vericert-discuss]].
+
+For contributing patches to the [[https://sr.ht/~ymherklotz/vericert/][sourcehut]] repository: [[https://lists.sr.ht/~ymherklotz/vericert-devel][lists.sr.ht/~ymherklotz/vericert-devel]].
+
+* Index
+:PROPERTIES:
+:EXPORT_HUGO_SECTION: menu
+:EXPORT_FILE_NAME: index
+:EXPORT_HUGO_CUSTOM_FRONT_MATTER: :headless true
+:END:
+
+- [[#docs][Docs]]
+  - [[#building][Building]]
+
+* Documentation
+
+** Docs
+:PROPERTIES:
+:EXPORT_FILE_NAME: _index
+:EXPORT_HUGO_SECTION: docs
+:CUSTOM_ID: docs
+:END:
+
+Vericert translates C code into a hardware description language called Verilog, which can then be
+synthesised into hardware, to be placed onto a field-programmable gate array (FPGA) or
+application-specific integrated circuit (ASIC).
+
+#+attr_html: :width 600
+#+caption: Current design of Vericert, where HTL is an intermediate language representing a finite state machine with data-path (FSMD) and Verilog is the target language.
+#+name: fig:design
+[[./static/images/toolflow.svg]]
+
+The design shown in Figure [[fig:design]] shows how Vericert leverages an existing verified C compiler
+called [[https://compcert.org/compcert-C.html][CompCert]] to perform this translation.
+
+** Building Vericert
+:PROPERTIES:
+:EXPORT_FILE_NAME: building
+:EXPORT_HUGO_SECTION: docs
+:CUSTOM_ID: building
+:END:
+
+To build Vericert, the provided Makefile can be used. External dependencies are needed to build the
+project, which can be pulled in automatically with [[https://nixos.org/nix/][nix]] using the provided ~default.nix~ and ~shell.nix~
+files.
+
+The project is written in Coq, a theorem prover, which is extracted to OCaml so that it can then be
+compiled and executed. The dependencies of this project are the following:
+
+- [[https://coq.inria.fr/][Coq]]: theorem prover that is used to also program the HLS tool.
+- [[https://ocaml.org/][OCaml]]: the OCaml compiler to compile the extracted files.
+- [[https://github.com/ocaml/dune][dune]]: build tool for ocaml projects to gather all the ocaml files and compile them in the right
+  order.
+- [[http://gallium.inria.fr/~fpottier/menhir/][menhir]]: parser generator for ocaml.
+- [[https://github.com/ocaml/ocamlfind][findlib]] to find installed OCaml libraries.
+- [[https://gcc.gnu.org/][GCC]]: compiler to help build CompCert.
+
+These dependencies can be installed manually, or automatically through Nix.
+
+*** Downloading CompCert
+
+CompCert is added as a submodule in the ~lib/CompCert~ directory. It is needed to run the build
+process below, as it is the one dependency that is not downloaded by nix, and has to be downloaded
+together with the repository. To clone CompCert together with this project, you can run:
+
+#+begin_src shell
+git clone --recursive https://github.com/ymherklotz/vericert
+#+end_src
+
+If the repository is already cloned, you can run the following command to make sure that CompCert is
+also downloaded:
+
+#+begin_src shell
+git submodule update --init
+#+end_src
+
+*** Setting up Nix
+
+Nix is a package manager that can create an isolated environment so that the builds are
+reproducible. Once nix is installed, it can be used in the following way.
+
+To open a shell which includes all the necessary dependencies, one can use:
+
+#+begin_src shell
+nix-shell
+#+end_src
+
+which will open a shell that has all the dependencies loaded.
+
+*** Makefile build
+
+If the dependencies were installed manually, or if one is in the ~nix-shell~, the project can be built
+by running:
+
+#+begin_src shell
+make -j8
+#+end_src
+
+and installed locally, or under the ~PREFIX~ location using:
+
+#+begin_src shell
+make install
+#+end_src
+
+Which will install the binary in ~./bin/vericert~ by default. However, this can be changed by changing
+the ~PREFIX~ environment variable, in which case the binary will be installed in ~$PREFIX/bin/vericert~.
+
+*** Testing
+
+To test out ~vericert~ you can try the following examples which are in the test folder using the
+following:
+
+#+begin_src shell
+./bin/vericert test/loop.c -o loop.v
+./bin/vericert test/conditional.c -o conditional.v
+./bin/vericert test/add.c -o add.v
+#+end_src
+
+Or by running the test suite using the following command:
+
+#+begin_src shell
+make test
+#+end_src
+** Using Vericert
+:PROPERTIES:
+:EXPORT_FILE_NAME: using-vericert
+:EXPORT_HUGO_SECTION: docs
+:CUSTOM_ID: using-vericert
+:END:
+
+Vericert can be used to translate a subset of C into Verilog.  As a simple example, consider the
+following C file (~main.c~):
+
+#+begin_src C
+void matrix_multiply(int first[2][2], int second[2][2], int multiply[2][2]) {
+    int sum = 0;
+    for (int c = 0; c < 2; c++) {
+        for (int d = 0; d < 2; d++) {
+            for (int k = 0; k < 2; k++) {
+                sum = sum + first[c][k]*second[k][d];
+            }
+            multiply[c][d] = sum;
+            sum = 0;
+        }
+    }
+}
+
+int main() {
+    int f[2][2] = {{1, 2}, {3, 4}};
+    int s[2][2] = {{5, 6}, {7, 8}};
+    int m[2][2] = {{0, 0}, {0, 0}};
+
+    matrix_multiply(f, s, m);
+    return m[1][1];
+}
+#+end_src
+
+It can be compiled using the following command, assuming that vericert is somewhere on the path.
+
+#+begin_src shell
+vericert main.c -o main.v
+#+end_src
+
+The Verilog file contains a top-level test-bench, which can be given to any Verilog simulator to
+simulate the hardware, which should give the same result as executing the C code.  Using [[http://iverilog.icarus.com/][Icarus
+Verilog]] as an example:
+
+#+begin_src shell
+iverilog -o main_v main.v
+#+end_src
+
+When executing, it should therefore print the following:
+
+#+begin_src shell
+$ ./main_v
+finished: 50
+#+end_src
+
+This gives the same result as executing the C in the following way:
+
+#+begin_src shell
+$ gcc -o main_c main.c
+$ ./main_c
+$ echo $?
+50
+#+end_src
+
+** Unreleased Features
+:PROPERTIES:
+:EXPORT_FILE_NAME: unreleased
+:EXPORT_HUGO_SECTION: docs
+:CUSTOM_ID: unreleased-features
+:END:
+
+The following are unreleased features in Vericert that are currently being worked on and have not
+been completely proven correct yet.  Currently this includes features such as:
+
+- [[#scheduling][scheduling]],
+- [[#operation-chaining][operation chaining]],
+- [[#if-conversion][if-conversion]], and
+- [[#functions][functions]].
+
+This page gives some preliminary information on how the features are implemented and how the proofs
+for the features are being done.  Once these features are properly implemented, they will be added
+to the proper documentation.
+
+*** Scheduling
+:PROPERTIES:
+:CUSTOM_ID: scheduling
+:END:
+
+Scheduling is an optimisation which is used to run various instructions in parallel that are
+independent to each other.
+
+*** Operation Chaining
+:PROPERTIES:
+:CUSTOM_ID: operation-chaining
+:END:
+
+Operation chaining is an optimisation that can be added on to scheduling and allows for the
+sequential execution of instructions in a clock cycle, while executing other instructions in
+parallel in the same clock cycle.
+
+*** If-conversion
+:PROPERTIES:
+:CUSTOM_ID: if-conversion
+:END:
+
+If-conversion is an optimisation which can turn code with simple control flow into a single block
+(called a hyper-block), using predicated instructions.
+
+*** Functions
+:PROPERTIES:
+:CUSTOM_ID: functions
+:END:
+
+Functions are currently only inlined in Vericert, however, we are working on a proper interface to
+integrate function calls into the hardware.
+
+** Coq Style Guide
+  :PROPERTIES:
+  :CUSTOM_ID: coq-style-guide
+  :EXPORT_FILE_NAME: coq-style-guide
+  :EXPORT_HUGO_SECTION: docs
+  :END:
+
+This style guide was taken from [[https://github.com/project-oak/silveroak][Silveroak]], it outlines code style for Coq code in this
+repository. There are certainly other valid strategies and opinions on Coq code style; this is laid
+out purely in the name of consistency. For a visual example of the style, see the [[#example][example]] at the
+bottom of this file.
+
+*** Code organization
+   :PROPERTIES:
+   :CUSTOM_ID: code-organization
+   :END:
+**** Legal banner
+    :PROPERTIES:
+    :CUSTOM_ID: legal-banner
+    :END:
+
+- Files should begin with a copyright/license banner, as shown in the example above.
+
+**** Import statements
+    :PROPERTIES:
+    :CUSTOM_ID: import-statements
+    :END:
+
+- =Require Import= statements should all go at the top of the file, followed by file-wide =Import=
+  statements.
+
+  - =Import=s often contain notations or typeclass instances that might override notations or
+    instances from another library, so it's nice to highlight them separately.
+
+- One =Require Import= statement per line; it's easier to scan that way.
+- =Require Import= statements should use "fully-qualified" names (e.g. =Require Import
+  Coq.ZArith.ZArith= instead of =Require Import ZArith=).
+
+  - Use the =Locate= command to find the fully-qualified name!
+
+- =Require Import='s should go in the following order:
+
+  1. Standard library dependencies (start with =Coq.=)
+  2. External dependencies (anything outside the current project)
+  3. Same-project dependencies
+
+- =Require Import='s with the same root library (the name before the first =.=) should be grouped
+  together. Within each root-library group, they should be in alphabetical order (so =Coq.Lists.List=
+  before =Coq.ZArith.ZArith=).
+
+**** Notations and scopes
+    :PROPERTIES:
+    :CUSTOM_ID: notations-and-scopes
+    :END:
+
+- Any file-wide =Local Open Scope='s should come immediately after the =Import=s (see example).
+
+  - Always use =Local Open Scope=; just =Open Scope= will sneakily open the scope for those who import
+    your file.
+
+- Put notations in their own separate modules or files, so that those who import your file can
+  choose whether or not they want the notations.
+
+  - Conflicting notations can cause a lot of headache, so it comes in very handy to leave this
+    flexibility!
+
+*** Formatting
+   :PROPERTIES:
+   :CUSTOM_ID: formatting
+   :END:
+**** Line length
+    :PROPERTIES:
+    :CUSTOM_ID: line-length
+    :END:
+
+- Maximum line length 80 characters.
+
+  - Many Coq IDE setups divide the screen in half vertically and use only half to display source
+    code, so more than 80 characters can be genuinely hard to read on a laptop.
+
+**** Whitespace and indentation
+    :PROPERTIES:
+    :CUSTOM_ID: whitespace-and-indentation
+    :END:
+
+- No trailing whitespace.
+
+- Spaces, not tabs.
+
+- Files should end with a newline.
+
+  - Many editors do this automatically on save.
+
+- Colons may be either "English-spaced", with no space before the colon and one space after (=x: nat=)
+  or "French-spaced", with one space before and after (=x : nat=).
+
+- Default indentation is 2 spaces.
+
+  - Keeping this small prevents complex proofs from being indented ridiculously far, and matches IDE
+    defaults.
+
+- Use 2-space indents if inserting a line break immediately after:
+
+  - =Proof.=
+  - =fun <...> =>=
+  - =forall <...>,=
+  - =exists <....>,=
+
+- The style for indenting arguments in function application depends on where you make a line
+  break. If you make the line break immediately after the function name, use a 2-space
+  indent. However, if you make it after one or more arguments, align the next line with the first
+  argument:
+
+  #+begin_src coq
+    (Z.pow
+       1 2)
+    (Z.pow 1 2 3
+           4 5 6)
+  #+end_src
+
+- =Inductive= cases should not be indented. Example:
+
+  #+begin_src coq
+    Inductive Foo : Type :=
+    | FooA : Foo
+    | FooB : Foo
+    .
+  #+end_src
+
+- =match= or =lazymatch= cases should line up with the "m" in =match= or "l" in =lazymatch=, as in the
+  following examples:
+
+  #+begin_src coq
+    match x with
+    | 3 => true
+    | _ => false
+    end.
+
+    lazymatch x with
+    | 3 => idtac
+    | _ => fail "Not equal to 3:" x
+    end.
+
+    repeat match goal with
+           | _ => progress subst
+           | _ => reflexivity
+           end.
+
+    do 2 lazymatch goal with
+         | |- context [eq] => idtac
+         end.
+  #+end_src
+
+*** Definitions and Fixpoints
+   :PROPERTIES:
+   :CUSTOM_ID: definitions-and-fixpoints
+   :END:
+
+- It's okay to leave the return type of =Definition='s and =Fixpoint='s implicit (e.g. ~Definition x := 5~
+  instead of ~Definition x : nat := 5~) when the type is very simple or obvious (for instance, the
+  definition is in a file which deals exclusively with operations on =Z=).
+
+*** Inductives
+   :PROPERTIES:
+   :CUSTOM_ID: inductives
+   :END:
+
+- The =.= ending an =Inductive= can be either on the same line as the last case or on its own line
+  immediately below. That is, both of the following are acceptable:
+
+  #+begin_src coq
+    Inductive Foo : Type :=
+    | FooA : Foo
+    | FooB : Foo
+    .
+    Inductive Foo : Type :=
+    | FooA : Foo
+    | FooB : Foo.
+  #+end_src
+
+*** Lemma/Theorem statements
+   :PROPERTIES:
+   :CUSTOM_ID: lemmatheorem-statements
+   :END:
+
+- Generally, use =Theorem= for the most important, top-level facts you prove and =Lemma= for everything
+  else.
+- Insert a line break after the colon in the lemma statement.
+- Insert a line break after the comma for =forall= or =exist= quantifiers.
+- Implication arrows (=->=) should share a line with the previous hypothesis, not the following one.
+- There is no need to make a line break after every =->=; short preconditions may share a line.
+
+*** Proofs and tactics
+   :PROPERTIES:
+   :CUSTOM_ID: proofs-and-tactics
+   :END:
+
+- Use the =Proof= command (lined up vertically with =Lemma= or =Theorem= it corresponds to) to open a
+  proof, and indent the first line after it 2 spaces.
+
+- Very small proofs (where =Proof. <tactics> Qed.= is <= 80 characters) can go all in one line.
+
+- When ending a proof, align the ending statement (=Qed=, =Admitted=, etc.) with =Proof=.
+
+- Avoid referring to autogenerated names (e.g. =H0=, =n0=). It's okay to let Coq generate these names,
+  but you should not explicitly refer to them in your proof. So =intros; my_solver= is fine, but
+  =intros; apply H1; my_solver= is not fine.
+
+  - You can force a non-autogenerated name by either putting the variable before the colon in the
+    lemma statement (=Lemma foo x : ...= instead of =Lemma foo : forall x, ...=), or by passing
+    arguments to =intros= (e.g. =intros ? x= to name the second argument =x=)
+
+- This way, the proof won't break when new hypotheses are added or autogenerated variable names
+  change.
+
+- Use curly braces ={}= for subgoals, instead of bullets.
+
+- /Never write tactics with more than one subgoal focused./ This can make the proof very confusing to
+  step through! If you have more than one subgoal, use curly braces.
+
+- Consider adding a comment after the opening curly brace that explains what case you're in (see
+  example).
+
+  - This is not necessary for small subgoals but can help show the major lines of reasoning in large
+    proofs.
+
+- If invoking a tactic that is expected to return multiple subgoals, use =[ | ... | ]= before the =.= to
+  explicitly specify how many subgoals you expect.
+
+  - Examples: =split; [ | ].= =induction z; [ | | ].=
+  - This helps make code more maintainable, because it fails immediately if your tactic no longer
+    solves as many subgoals as expected (or unexpectedly solves more).
+
+- If invoking a string of tactics (composed by =;=) that will break the goal into multiple subgoals
+  and then solve all but one, still use =[ ]= to enforce that all but one goal is solved.
+
+  - Example: =split; try lia; [ ]=.
+
+- Tactics that consist only of =repeat=-ing a procedure (e.g. =repeat match=, =repeat first=) should
+  factor out a single step of that procedure a separate tactic called =<tactic name>_step=, because
+  the single-step version is much easier to debug. For instance:
+
+  #+begin_src coq
+    Ltac crush_step :=
+      match goal with
+      | _ => progress subst
+      | _ => reflexivity
+      end.
+    Ltac crush := repeat crush_step.
+  #+end_src
+
+*** Naming
+   :PROPERTIES:
+   :CUSTOM_ID: naming
+   :END:
+
+- Helper proofs about standard library datatypes should go in a module that is named to match the
+  standard library module (see example).
+
+  - This makes the helper proofs look like standard-library ones, which is helpful for categorizing
+    them if they're genuinely at the standard-library level of abstraction.
+
+- Names of modules should start with capital letters.
+- Names of inductives and their constructors should start with capital letters.
+- Names of other definitions/lemmas should be snake case.
+
+*** Example
+   :PROPERTIES:
+   :CUSTOM_ID: example
+   :END:
+A small standalone Coq file that exhibits many of the style points.
+
+#+begin_src coq
+(*
+ * Vericert: Verified high-level synthesis.
+ * Copyright (C) 2021 Name <email@example.com>
+ *
+ * <License...>
+ *)
+
+  Require Import Coq.Lists.List.
+  Require Import Coq.micromega.Lia.
+  Require Import Coq.ZArith.ZArith.
+  Import ListNotations.
+  Local Open Scope Z_scope.
+
+  (* Helper proofs about standard library integers (Z) go within [Module Z] so
+     that they match standard-library Z lemmas when used. *)
+  Module Z.
+    Lemma pow_3_r x : x ^ 3 = x * x * x.
+    Proof. lia. Qed. (* very short proofs can go all on one line *)
+
+    Lemma pow_4_r x : x ^ 4 = x * x * x * x.
+    Proof.
+      change 4 with (Z.succ (Z.succ (Z.succ (Z.succ 0)))).
+      repeat match goal with
+             | _ => rewrite Z.pow_1_r
+             | _ => rewrite Z.pow_succ_r by lia
+             | |- context [x * (?a * ?b)] =>
+               replace (x * (a * b)) with (a * b * x) by lia
+             | _ => reflexivity
+             end.
+    Qed.
+  End Z.
+  (* Now we can access the lemmas above as Z.pow_3_r and Z.pow_4_r, as if they
+     were in the ZArith library! *)
+
+  Definition bar (x y : Z) := x ^ (y + 1).
+
+  (* example with a painfully manual proof to show case formatting *)
+  Lemma bar_upper_bound :
+    forall x y a,
+      0 <= x <= a -> 0 <= y ->
+      0 <= bar x y <= a ^ (y + 1).
+  Proof.
+    (* avoid referencing autogenerated names by explicitly naming variables *)
+    intros x y a Hx Hy. revert y Hy x a Hx.
+    (* explicitly indicate # subgoals with [ | ... | ] if > 1 *)
+    cbv [bar]; refine (natlike_ind _ _ _); [ | ].
+    { (* y = 0 *)
+      intros; lia. }
+    { (* y = Z.succ _ *)
+      intros.
+      rewrite Z.add_succ_l, Z.pow_succ_r by lia.
+      split.
+      { (* 0 <= bar x y *)
+        apply Z.mul_nonneg_nonneg; [ lia | ].
+        apply Z.pow_nonneg; lia. }
+      { (* bar x y < a ^ y *)
+        rewrite Z.pow_succ_r by lia.
+        apply Z.mul_le_mono_nonneg; try lia;
+          [ apply Z.pow_nonneg; lia | ].
+        (* For more flexible proofs, use match statements to find hypotheses
+           rather than referring to them by autogenerated names like H0. In this
+           case, we'll take any hypothesis that applies to and then solves the
+           goal. *)
+        match goal with H : _ |- _ => apply H; solve [auto] end. } }
+  Qed.
+
+  (* Put notations in a separate module or file so that importers can
+     decide whether or not to use them. *)
+  Module BarNotations.
+    Infix "#" := bar (at level 40) : Z_scope.
+    Notation "x '##'" := (bar x x) (at level 40) : Z_scope.
+  End BarNotations.
+#+end_src
+* Blog
+** Blog
+:PROPERTIES:
+:EXPORT_FILE_NAME: _index
+:EXPORT_HUGO_SECTION: blog
+:CUSTOM_ID: blog
+:END:
+
+Blog posts:
+
+** A First Look at Vericert :introduction:summary:@article:
+:PROPERTIES:
+:EXPORT_DATE: 2021-10-09
+:EXPORT_FILE_NAME: a-first-look-at-vericert
+:EXPORT_HUGO_SECTION: blog
+:EXPORT_HUGO_CUSTOM_FRONT_MATTER: :summary "Vericert is a formally verified high-level synthesis tool, translating C code into a hardware design expressed in Verilog." :aliases '("/2021/09/a-first-look-at-vericert/")
+:CUSTOM_ID: a-first-look-at-vericert
+:END:
+
+Compilers are increasingly being relied upon to produce binaries from code without introducing any
+bugs in the process.  Therefore, formally verified compilers like [[https://compcert.org][CompCert]] were introduced to
+formally prove that the compilation process does not change the behaviour of the program.  This
+allowed companies like Airbus to move critical code from pure assembly to C.
+
+Similarly, the hardware industry mainly uses hardware description languages, such as Verilog or
+VHDL, to design their hardware, which can then either be placed onto an FPGA for quick testing, or
+eventually turned into an ASIC.  As there are many commercial tools available for the verification
+of HDLs, designers prefer to have full control over the hardware and therefore use HDLs directly.
+However, there are a few downsides with using an HDL instead of higher-level abstractions.  First,
+it makes is computationally harder to check the functionality of hardware, as it needs to be
+simulated.  Second, it often requires a lot of boilerplate code that could be abstracted away
+otherwise.  Alternatives do exist though, such as high-level synthesis tools (HLS), or higher-level
+hardware description languages.
+
+So why not use higher-level HDLs then? They seem to solve all the problems that current HDLs face.
+Well, the adoption is hindered by the fact that even though it seems like the perfect solution to
+have a new language dedicated to hardware design that provides a higher-level of abstraction than
+existing HDLs, the fact is that the syntax and semantics of such a language often change drastically
+compared to the existing HDLs which everyone is familiar with.  It is already the case that there is
+a shortage of hardware designers, and it is therefore even more unlikely that these hardware
+designers would know an alternative hardware design language.
+
+HLS, on the other hand, tackles the problem from a different perspective, and instead ideally aims
+to bring hardware design to the software designer.  This is achieved by compiling a subset of valid
+software programs directly into hardware, therefore making it much easier to get started with
+hardware design and write algorithms in hardware.  This not only brings hardware design to a larger
+number of people, but also brings all the benefits of the software ecosystem to verify the hardware
+designs and algorithms.  The following sections will describe the benefits and drawbacks of HLS in
+more detail, as well as why one would even consider having to formally verify the translation as
+well.
+
+*** A bit about HLS
+
+Ideally, HLS converts any software code into an equivalent hardware design, meaning one does not
+have to worry about designing any hardware and only has to focus on the functionality of the
+hardware.  In addition to that, this means that a lot of software code can be reused to create
+hardware accelerators, speeding up the design process even more.
+
+#+caption: Typical HLS flow compared to the standard software flow.
+#+attr_html: :width 500px
+[[./static/images/hls-flow-handdrawn.svg]]
+
+However, in practice this does not seem to be the case yet.  For now, the state-of-the-art HLS
+compilers restrict the subset of the input language greatly, and often cannot optimise it optimally.
+The main problem with optimisations is that hardware has a much greater design space which can be
+explored, compared to software.  The HLS compiler therefore has to guess in which direction it
+should optimise, and try and find a Pareto optimal solution.  This is unlikely to be optimal,
+especially if one has various different constraints one could not supply to the HLS tool, making it
+very difficult to then use the generated hardware.  These are currently the main research questions
+in HLS.
+
+One aspect which is often overlooked though, is that HLS also needs to be correct to be useful, as
+it generates designs that are often difficult to check.  One of the main advantages of HLS is also
+that it should be possible to check the correctness of the hardware design purely in software,
+making the design process so much more efficient.  The fact that it might therefore introduce bugs
+in the process, means that the final HLS designs need to be re-verified afterwards.
+
+We have written a [[./static/papers/hlsspc_latte2021.pdf][workshop paper]] published at LATTE'21 which describes why HLS in particular is an
+important translation to verify.
+
+*** How do we formally verify it?
+
+#+caption: Add a back end branching off the three-address code intermediate language to produce Verilog.
+#+attr_html: :width 500px
+[[./static/images/toolflow-handdrawn.svg]]
+
+We use CompCert, an existing formally verified C compiler, to translate C into 3AC[fn:1], which is
+the starting point of our high-level synthesis pass.  We chose this intermediate language in
+CompCert to branch off of, as it does not make any assumptions on the number of registers available,
+which is ideal for hardware as registers are abundantly available.  In addition to that, each
+instruction in 3AC actually directly corresponds to simple operations that can be implemented in
+hardware.
+
+The process of formally verifying the translation of instructions is quite simple then, we just need
+to prove that the Verilog statement we translate it to corresponds to the instruction.  In most
+cases there is an equivalent operator in Verilog that performs this action, however, in some cases
+they don't quite match up perfectly, in which case we need to prove some properties about integer
+arithmetic to prove that the two operations are the same.[fn:2]
+
+**** Translating the memory model
+
+The main problem in the proof is: how do we translate CompCert's abstract view and description of
+memory to a more concrete implementation of RAM which can subsequently be implemented as a proper
+RAM by the synthesis tool?
+
+Implementation wise this is quite straight-forward.  However, there are some constraints that need
+to be taken into account:
+
+- The memory in hardware needs to be word-addressed for the best performance.
+- Memory in hardware needs to use a proper interface to work.
+
+These two constraints don't really complicate the implementation much, but they do complicate the
+proofs quite considerably.  Firstly, it's not straightforward that /any/ address can even be divided
+by four (to translate it from a byte-addressed memory into an index of a word-addressed array).
+Secondly, it's not clear that the RAM will always behave properly, as one now has to reason about a
+completely different ~always~ block, which will be triggered at all clock edges.
+
+However, this translation is still provable.  The first issue can be simplified by proving that
+loads and stores that are valid always have to be word-aligned, meaning they can then be divided by
+four.
+
+The insertion of the RAM is also provable, because we can design a self-disabling RAM which will
+only be activated when it is being used, making the proof considerably simpler as one doesn't have
+to reason about the RAM when it isn't enabled.
+
+*** Useful links
+
+- [[./static/papers/fvhls_oopsla21.pdf][OOPSLA'21 preprint]].
+- [[https://youtu.be/clPiKbKVlUA][OOPSLA'21 presentation]].
+- [[https://github.com/ymherklotz/vericert][Github repository]].
+
+* Future Work
+
+** Future Work
+:PROPERTIES:
+:EXPORT_FILE_NAME: future
+:END:
+
+This section contains future work that should be added to Vericert to make it into a better
+high-level synthesis tool.
+
+The next interesting optimisations that should be looked at are the following:
+
+- [[#globals][globals]],
+- [[#type-support][type support]],
+- [[#memory-partitioning][memory partitioning]], and
+- [[#loop-pipelining][loop pipelining]].
+
+*** Globals
+:PROPERTIES:
+:CUSTOM_ID: globals
+:END:
+
+Globals are an important feature to add, as have to be handled carefully in HLS, because they have
+to be placed into memory, and are often used in HLS designs.  Proper handling of globals would allow
+for a larger subset of programs to be compiled, even allowing for larger benchmarks to be used, such
+as CHStone.
+
+#+begin_quote
+Vericert is a formally verified high-level synthesis (HLS) tool written in Coq, which translates C code
+into Verilog hardware designs.  One challenge in HLS is supporting different memory architectures
+and interacting with memory in an optimal way.
+
+Vericert currently lacks support for multiple memories, and instead only constructs one large memory
+which contains all elements of the stack.  Due to this limitation, Vericert mainly inlines all
+function calls and does not support global variables.  This project would be about adding support
+for multiple memories to Vericert and proving the correctness of the improved translation.  This
+would then allow globals to be compiled, which greatly increases the kinds of programs that can be
+translated.
+
+This project would be ideal for students interested in hardware and theorem proving in Coq.
+#+end_quote
+
+*** Type Support
+:PROPERTIES:
+:CUSTOM_ID: type-support
+:END:
+
+It would also be useful to have support for other datatypes in C, such as ~char~ or ~short~, as using
+these small datatypes is also quite popular in HLS to make the final designs more efficient.
+
+#+begin_quote
+Vericert is a formally verified high-level synthesis (HLS) tool written in Coq, which translates C code
+into Verilog hardware designs.  However, it currently only supports 32-bit integer types, which
+limits the effectiveness of HLS, as smaller types cannot be supported and are therefore not
+represented properly.
+
+This project would address this problem by adding support for multiple different types in Vericert,
+and prove the new translation correct in Coq.  Furthermore, the current memory in Vericert also only
+supports 32-bits, so this project could also address that by generalising the memory module and
+supporting smaller memory types.
+#+end_quote
+
+*** Register Allocation
+:PROPERTIES:
+:CUSTOM_ID: register-allocation
+:END:
+
+Register allocation is an optimisation which minimises the register resource usage of the design,
+while still keeping the same throughput in most cases.  Compared to standard software compilers, HLS
+tools do not normally require a lot of register allocation, however, this can still help in a lot of
+cases when registers are used unnecessarily.
+
+#+begin_quote
+Vericert is a formally verified high-level synthesis (HLS) tool written in Coq, which translates C code
+into Verilog hardware designs.
+#+end_quote
+
+*** Memory Partitioning
+:PROPERTIES:
+:CUSTOM_ID: memory-partitioning
+:END:
+
+Memory partitioning is quite an advanced optimisation, which could be combined with the support for
+globals so as to make memory layouts on the FPGA more efficient and run various memory operations in
+parallel.
+
+*** Loop pipelining
+:PROPERTIES:
+:CUSTOM_ID: loop-pipelining
+:END:
+
+Loop pipelining is an optimisation to schedule loops, instead of only scheduling the instructions
+inside of the loop.  There are two versions of loop pipelining, software and hardware loop
+pipelining.  The former is done purely on instructions, whereas the latter is performed in tandem
+with [[#scheduling][scheduling]].
+
+* Footnotes
+[fn:2] One example is with the ~Oshrximm~ instruction, which is represented using division in
+CompCert, whereas it should actually perform multiple shifts.
+
+[fn:1] Also known as RTL in the CompCert literature, however, we refer to it as 3AC to avoid
+confusion with register-transfer level, often used to describe the target of the HLS tool.
author	Yann Herklotz <git@yannherklotz.com>	2022-02-26 15:00:07 +0000
committer	Yann Herklotz <git@yannherklotz.com>	2022-02-26 15:00:07 +0000
commit	4dac87d44f2856f5db25726a2c381f869ee73af6 (patch)
tree	6edf60103fe6f35d01b27fef2877e50d1d577439 /content.org
parent	5e7713bc58833552a36b1dca4e582952860a5d2a (diff)
download	vericert-docs-4dac87d44f2856f5db25726a2c381f869ee73af6.tar.gz vericert-docs-4dac87d44f2856f5db25726a2c381f869ee73af6.zip