summaryrefslogtreecommitdiffstats
path: root/proof.tex
diff options
context:
space:
mode:
authorJohn Wickerson <j.wickerson@imperial.ac.uk>2021-08-03 11:34:42 +0000
committernode <node@git-bridge-prod-0>2021-08-03 12:37:17 +0000
commitf7e372cacdc85498828fb9f0fc3ea86099f9301e (patch)
treec7ecee89a7d0cc6af2d307892043aaa241117224 /proof.tex
parent85824b706017e69b12a250c8a873dd0a881d66cb (diff)
downloadoopsla21_fvhls-f7e372cacdc85498828fb9f0fc3ea86099f9301e.tar.gz
oopsla21_fvhls-f7e372cacdc85498828fb9f0fc3ea86099f9301e.zip
Update on Overleaf.
Diffstat (limited to 'proof.tex')
-rw-r--r--proof.tex15
1 files changed, 8 insertions, 7 deletions
diff --git a/proof.tex b/proof.tex
index 804c10d..4566619 100644
--- a/proof.tex
+++ b/proof.tex
@@ -2,7 +2,7 @@
Now that the Verilog semantics have been adapted to the CompCert model, we are in a position to formally prove the correctness of our C-to-Verilog compilation. This section describes the main correctness theorem that was proven and the main ideas behind the proof. The full Coq proof is available in auxiliary material.
-The main correctness theorem is analogous to that stated in \compcert{}~\cite{leroy09_formal_verif_realis_compil}: for all Clight source programs $C$, if the translation to the target Verilog code succeeds, and $C$ has safe observable behaviour $B$ when executed, then the target Verilog code will have the same behaviour $B$. Here, a `safe' execution is one that either converges or diverges, but does not ``go wrong''. If the program does admit some wrong behaviour (like undefined behaviour in C), the correctness theorem does not apply. A behaviour, then, is either a final state (in the case of convergence) or divergence. In \compcert{}, a behaviour is also associated with a trace of I/O events, but since external function calls are not supported in \vericert{}, this trace will always be empty. This correctness theorem is also appropriate for HLS, as HLS is often used as a part of a larger hardware design that is connected together using a hardware description language like Verilog. This means that HLS designs are normally triggered multiple times and results are returned each time when the computation terminates, which is the property that the correctness theorem states. Note that the compiler is allowed to fail and not produce any output; the correctness theorem only applies when the translation succeeds.
+The main correctness theorem is analogous to that stated in \compcert{}~\cite{leroy09_formal_verif_realis_compil}: for all Clight source programs $C$, if the translation to the target Verilog code succeeds, and $C$ has safe observable behaviour $B$ when executed, then the target Verilog code will have the same behaviour $B$. Here, a `safe' execution is one that either converges or diverges, but does not ``go wrong''. If the program does admit some wrong behaviour (like undefined behaviour in C), the correctness theorem does not apply. A behaviour, then, is either a final state (in the case of convergence) or divergence. In \compcert{}, a behaviour is also associated with a trace of I/O events, but since external function calls are not supported in \vericert{}, this trace will always be empty. This correctness theorem is also appropriate for HLS \JW{Perhaps it would be worth first explaining why somebody might think this correctness theorem might \emph{not} be appropriate for HLS. At the moment, it feels like you're giving the answer without saying the question. Is it to do with the fact that hardware tends to run forever?}, as HLS is often used as a part of a larger hardware design that is connected together using a hardware description language like Verilog. This means that HLS designs are normally triggered multiple times and results are returned each time when the computation terminates, which is the property that the correctness theorem states. Note that the compiler is allowed to fail and not produce any output; the correctness theorem only applies when the translation succeeds.
%The following `backwards simulation' theorem describes the correctness theorem, where $\Downarrow$ stands for simulation and execution respectively.
@@ -13,7 +13,7 @@ The main correctness theorem is analogous to that stated in \compcert{}~\cite{le
\end{equation*}
\end{theorem}
-The theorem is a `backwards simulation' result (from target to source), following the terminology used in the \compcert{} literature. The theorem does not demand the `if' direction too, because compilers are permitted to resolve any non-determinism present in their source programs.
+The theorem is a `backwards simulation' result (from target to source), following the terminology used in the \compcert{} literature~\cite{leroy09_formal_verif_realis_compil}. The theorem does not demand the `if' direction too, because compilers are permitted to resolve any non-determinism present in their source programs.
In practice, Clight programs are all deterministic, as are the Verilog programs in the fragment we consider. This means that we can prove the correctness theorem above by first inverting it to become a forwards simulation result, following standard \compcert{} practice.
Furthermore, to prove the forward simulation, it suffices to prove forward simulations between each intermediate language, as these results can be composed to prove the correctness of the whole HLS tool.
@@ -21,14 +21,15 @@ The forward simulation from 3AC to HTL is stated in Lemma~\ref{lemma:htl} (Secti
\subsection{Main challenges in the proof}
-The proof of correctness of the Verilog back end is quite different to the usual proofs performed in CompCert, mainly because the difference in Verilog semantics compared to the standard CompCert intermediate languages and because of the translation of the memory model.
+The proof of correctness of the Verilog back end is quite different to the usual proofs performed in CompCert, mainly because of the difference in Verilog semantics compared to the standard CompCert intermediate languages and because of the translation of the memory model.
-Because the memory model in our Verilog semantics is finite and concrete, whereas the CompCert memory model is more abstract and infinite with additional bounds, the equivalence of both these models needs to be proven. Moreover, our memory is word-addressed for efficiency reasons, whereas CompCert's memory is byte-addressed.
+First, because the memory model in our Verilog semantics is finite and concrete, but the CompCert memory model is more abstract and infinite with additional bounds, the equivalence of both these models needs to be proven. Moreover, our memory is word-addressed for efficiency reasons, whereas CompCert's memory is byte-addressed.
-The Verilog semantics operates quite differently to the usual intermediate languages in the backend. All the CompCert intermediate languages use a map from control-flow nodes to instructions. An instruction can therefore be selected using an abstract program pointer. On the other hand, in the Verilog semantics the whole design is executed at every clock cycle, because hardware is inherently parallel. The program pointer is part of the design as well, not just part of an abstract state. This makes the semantics of Verilog simpler, but comparing it to the semantics of 3AC becomes more challenging, as one has to map the abstract notion of the state to concrete values in registers.
+Second, the Verilog semantics operates quite differently to the usual intermediate languages in the backend. All the CompCert intermediate languages use a map from control-flow nodes to instructions. An instruction can therefore be selected using an abstract program pointer. On the other hand, in the Verilog semantics the whole design is executed at every clock cycle, because hardware is inherently parallel. The program pointer is part of the design as well, not just part of an abstract state. This makes the semantics of Verilog simpler, but comparing it to the semantics of 3AC becomes more challenging, as one has to map the abstract notion of the state to concrete values in registers.
Both these differences mean that translating 3AC directly to Verilog is infeasible, as the differences in the semantics is too large. Instead, a new intermediate language needs to be introduced, called HTL, which bridges the gap in the semantics between the two languages. HTL still consists of maps, like many of the other CompCert languages, however, each state corresponds to a Verilog statement.
+
\subsection{Forward simulation from 3AC to HTL}\label{sec:proof:3ac_htl}
As HTL is quite far removed from 3AC, this first translation is the most involved and therefore requires a larger proof, because the translation from 3AC instructions to Verilog statements needs to be proven correct in this step. In addition to that, the semantics of HTL are also quite different to the 3AC semantics, as instead of defining small-step semantics for each construct in Verilog, the semantics are instead defined over one clock cycle and mirror the semantics defined for Verilog. Lemma~\ref{lemma:htl} shows the result that needs to be proven in this subsection.
@@ -227,9 +228,9 @@ The final lemma we need is that the Verilog we generate is deterministic. This r
\label{tab:proof_statistics}
\end{table*}
-The lines of code for the implementation and proof of \vericert{} can be found in Table~\ref{tab:proof_statistics}. Overall, it took about 1.5 person-years to build \vericert{} -- about three person-months on implementation and 15 person-months on proofs. The largest proof is the correctness proof for the HTL generation, which required equivalence proofs between all integer operations supported by \compcert{} and those supported in hardware. From the 3069 lines of proof code in the HTL generation, 1189 are for the correctness proof of just the load and store instructions. These were tedious to prove correct because of the substantial difference between the memory models used, and the need to prove properties such as stores outside of the allocated memory being undefined, so that a finite array could be used. In addition to that, since pointers in HTL and Verilog are represented as integers, whereas there is a separate pointer value in the \compcert{} semantics, it was painful to reason about them and many new theorems had to be proven about integers and pointers in \vericert{}. In addition to that, the second largest proof of the correct RAM generation includes many proofs about the extensional equality of array operations, such as merging arrays with different assignments. As the negative edge implies two merges take place every clock cycle, the proofs about the equality of the arrays becomes more tedious as well.
+The lines of code for the implementation and proof of \vericert{} can be found in Table~\ref{tab:proof_statistics}. Overall, it took about 1.5 person-years to build \vericert{} -- about three person-months on implementation and 15 person-months on proofs. The largest proof is the correctness proof for the HTL generation, which required equivalence proofs between all integer operations supported by \compcert{} and those supported in hardware. From the 3069 lines of proof code in the HTL generation, 1189 are for the correctness proof of just the load and store instructions. These were tedious to prove correct because of the substantial difference between the memory models used, and the need to prove properties such as stores outside of the allocated memory being undefined, so that a finite array could be used. In addition to that, since pointers in HTL and Verilog are represented as integers, whereas there is a separate `pointer' value in the \compcert{} semantics, it was painful to reason about them and many new theorems had to be proven about integers and pointers in \vericert{}. In addition to that, \JW{Repeated `In addition to that'. I quite like `Moreover', for a bit of variety.} the second-largest proof of the correct RAM generation includes many proofs about the extensional equality of array operations, such as merging arrays with different assignments. As the negative edge implies two merges take place every clock cycle, the proofs about the equality of the arrays becomes more tedious as well.
-Looking at the trusted base of \vericert{}, the Verilog semantics are 431 lines of code. This, together with the Clight semantics from \compcert{}, are the only parts of the compiler that need to be trusted. Compared to the 1721 lines of the implementation that are written in Coq, which are the verified parts of the HLS tool, this is larger than the 431 lines of Verilog semantics specification, even if the Clight semantics are added. In addition to that, reading semantics specifications is simpler than trying to understand algorithms, meaning the trusted base has been successfully reduced.
+Looking at the trusted \JW{computing?} base of \vericert{}, the Verilog semantics are 431 lines of code. This, together with the Clight semantics from \compcert{}, are the only parts of the compiler that need to be trusted. Compared to the 1721 lines of the implementation that are written in Coq, which are the verified parts of the HLS tool, this is larger than the 431 lines of Verilog semantics specification, even if the Clight semantics are added. In addition to that, reading semantics specifications is simpler than trying to understand algorithms, meaning the trusted base has been successfully reduced.
%\JW{Can we include a comment about the size of the trusted base, in case we get that reviewer again?}