summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--archive/proof.tex4
-rw-r--r--proof.tex30
2 files changed, 23 insertions, 11 deletions
diff --git a/archive/proof.tex b/archive/proof.tex
index 9e962b7..6641c8d 100644
--- a/archive/proof.tex
+++ b/archive/proof.tex
@@ -1,3 +1,7 @@
+One example is the translation of instructions, where the translation function might contain many implementation specific details, such as in which order instructions are added to the data path or the control path, whereas the specification describes only describes which instructions were added. The specification, \texttt{tr\_instr} can then be described as follows, where \textit{fin}, \textit{rtrn}, $\sigma$ and \textit{stk} are the registers for the finished signal, return value, current state and stack respectively, $i$ is the 3AC instruction being translated, and \textit{data} and \textit{control} are the data-flow and control logic map respectively.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
Firstly, as the input representation is a map, all the keys of the map will be unique, which will be translated to the matched expressions in the case statement. In addition to that, the assumption is made that every possible evaluation of the state value will give a valid state, meaning the key will be present in the map. Both of these observations mean that after the translation, for every possible value of the state, the case statement will be able to match an expression and enter the correct statement. As each key is unique, if the state matches a case expression, then this also means that this is the only case expression that it could match, and it therefore has to be the correct case expression which contains the same statement as the map.
However, the proof of uniqueness of the keys only works if the translation function is \emph{injective}, meaning that the function will result in distinct outputs for all possible inputs. Otherwise, the proof of uniqueness of the keys for the input would not translate to a uniqueness of possible case expression matches in the output. However, in our case the translation function is actually not injective, even though it might at first seem like it, because the state is stored as an integer, whereas the map is addressable by any positive number. This means that if the positive number is greater than the maximum value that can be stored in the integer, the overflow would result in the wrong states being accessed. To make the function injective, we therefore have to prove that the input positive number is never greater than $2^{32}-1$ so that the uniqueness property also holds for the output.
diff --git a/proof.tex b/proof.tex
index 90472c1..210092b 100644
--- a/proof.tex
+++ b/proof.tex
@@ -1,6 +1,6 @@
\section{Proof}\label{sec:proof}
-Now that the Verilog semantics have been adapted to the CompCert model, we are in a position to formally prove the correctness of our C to Verilog compilation. This section describes the main correctness theorem that was proven and the main ideas behind the proofs.
+Now that the Verilog semantics have been adapted to the CompCert model, we are in a position to formally prove the correctness of our C to Verilog compilation. This section describes the main correctness theorem that was proven and the main ideas behind the proof.
The main correctness theorem is the exact same correctness theorem as stated in \compcert{}~\cite{leroy09_formal_verif_realis_compil} which states that for all Clight source programs $C$, if the translation from the source to the target Verilog code succeeds, assuming that $C$ has correct observable behaviour $B$ when executed, then the target Verilog code will simulate with the same behaviour $B$, where behaviour can either be convergent, divergent or ``going wrong'' and is associated with a trace $t$ of any external function calls. The following backwards simulation theorem describes this property, where $\Downarrow_{s}$ and $\Downarrow$ stand for simulation and execution respectively.
@@ -8,20 +8,22 @@ The main correctness theorem is the exact same correctness theorem as stated in
\forall C, V, B \notin \texttt{Wrong},\, \yhfunction{HLS} (C) = \yhconstant{OK} (V) \land V \Downarrow_{s} B \implies C \Downarrow B.
\end{equation*}
-To prove this top-level theorem, a forward simulation has to be done between each intermediate language, which implies that the forward simulation between the C and Verilog also holds. To then prove the backward simulation, it suffices to show that the Verilog semantics are deterministic, which implies that there is only one possible behaviour for the Verilog output, and that it therefore also has to imply that $C$ has the same behaviour.
+To prove this top-level theorem, we first have to build a specification of the translation algorith, with the properties that are needed to prove the simulation correct, which is described in Section~\ref{sec:proof:specification}. Next, a forward simulation has to be proven between the semantics of 3AC and HTL, and then between HTL and Verilog, which then implies that the forward simulation holds between C and Verilog. Then, to prove the backward simulation, it suffices to show that the Verilog semantics are deterministic, which implies that there is only one possible behaviour for the Verilog output, and that it therefore also has to imply that $C$ has the same behaviour.
-\subsection{Specification}
+\subsection{Specification}\label{sec:proof:specification}
-To simplify the proof, instead of using the translation algorithm as an assumption, as was done in the backward simulation stated above, a specification of the translation can be constructed instead which contains all the properties that are needed to prove the correctness. For example, for the translation from 3AC to HTL, the translation might be defined using the following function: $\yhfunction{transl\_3ac\_htl} (c) = \yhconstant{OK} (h)$, where $c$ is the 3AC input code and $h$ is the generated HTL code. However, instead we can define a relation \texttt{spec\_3ac\_htl} between the 3AC and HTL code, which contains all the properties about the translation that are needed, without any of the implementation. If the following can be proven, it can then be used instead of the translation algorithm when performing the proof of correctness.
+To simplify the proof, instead of using the translation algorithm as an assumption, as was done in the backward simulation stated above, a specification of the translation can be constructed instead which contains all the properties that are needed to prove the correctness. For example, for the translation from 3AC to HTL, the function implementing the translation from 3AC instructions to HTL Verilog statements is defined using the following function:
-\begin{equation*}
- \forall\ c\ h,\ \yhfunction{transl\_3ac\_htl} (c) = \yhconstant{OK}(h) \implies \texttt{spec\_3ac\_htl}\ c\ h.
-\end{equation*}
+\begin{minted}{coq}
+tr_instr : 3AC.instruction -> state -> res state
+\end{minted}
-One example is the translation of instructions, where the translation function might contain many implementation specific details, such as in which order instructions are added to the data path or the control path, whereas the specification describes only describes which instructions were added. The specification, \texttt{tr\_instr} can then be described as follows, where \textit{fin}, \textit{rtrn}, $\sigma$ and \textit{stk} are the registers for the finished signal, return value, current state and stack respectively, $i$ is the 3AC instruction being translated, and \textit{data} and \textit{control} are the data-flow and control logic map respectively.
+\noindent where the type \texttt{res} can either be an \texttt{Error}, or an \texttt{OK} with the translated resulting \texttt{state}, and the \texttt{state} is a collection of various results that are needed to build the final HTL code, such as the current data path, control logic and name of registers.
+
+However, instead we can define a specification for the translation of instructions by defining a relation \texttt{spec\_tr\_instr}, containing all the properties about the instruction translation that are needed, without any of the implementation details of the \texttt{tr\_instr} function, such as the exact structure of the \texttt{state} or order in which statements are added to the data path or the control logic.
\begin{equation*}
- \yhfunction{tr\_instr } \textit{fin rtrn }\ \sigma\ \textit{stk }\ i\ \textit{data }\ \textit{control}
+ \yhfunction{spec\_tr\_instr } \textit{fin rtrn }\ \sigma\ \textit{stk }\ i\ \textit{data }\ \textit{control}
\end{equation*}
\noindent An specific example of a rule describing the translation of an \texttt{Iop} operation in 3AC is the following:
@@ -32,6 +34,12 @@ One example is the translation of instructions, where the translation function m
\noindent This rule describes the specification of the translation by describing the nonblocking assignments that are added to the data path and control logic.
+If the following can be proven, it can then be used instead of the translation algorithm when performing the proof of correctness.
+
+\begin{equation*}
+ \forall\ c\ h,\ \yhfunction{transl\_3ac\_htl} (c) = \yhconstant{OK}(h) \implies \texttt{spec\_3ac\_htl}\ c\ h.
+\end{equation*}
+
\subsection{Forward Simulation}
The forward simulation between C and Verilog can be separated into forward simulations of each compiler pass, which can then be composed to provide a whole proof from C to Verilog. Therefore, only the forward simulations from 3AC to HTL and from the HTL to Verilog need to be proven correct.
@@ -87,9 +95,9 @@ Using the \texttt{match\_states}, we can then define the forward simulation for
\subsubsection{HTL to Verilog forward simulation}
-The HTL to Verilog simulation is quite simple, as the only transformation is from the map representation of the code to the case statement representation. As the representations are quite different though, to prove that they are equivalent the following observations have to be made.
+The HTL to Verilog simulation is conceptually simple, as the only transformation is from the map representation of the code to the case statement representation. As the representations are quite different though, to prove that they are equivalent the following observations have to be made.
-The translation from maps to case statements is done by turning each node of the tree into a case expression with the statements in each being the same. The main difficulty for the proof is that a structure that can be directly accessed is transformed into an inductive structure where a certain number of constructors need to be called to get to the correct case. The proof of the translation from maps to case statements follows by induction over the list of elements in the map and the fact that each key in this list will be unique. In addition to that, the statement that is currently being evaluated is guaranteed by the correctness of the list of elements to be in that list. The latter fact therefore eliminates the base case, as an empty list does not contain the element we know is in the list. The other two cases follow by the fact that either the key is equal to the evaluated value of the case expression, or it isn't. In the first case we can then evaluate the statement and get the state after the case expression, as the uniqueness of the key tells us that the key cannot show up in the list anymore. In the other case we can just apply the inductive hypothesis and remove the current case from the case statement, as it did not match.
+The translation from maps to case statements is done by turning each node of the tree into a case expression with the statements in each being the same. The main difficulty for the proof is that a structure that can be directly accessed is transformed into an inductive structure where a certain number of constructors need to be called to get to the correct case. The proof of the translation from maps to case statements follows by induction over the list of elements in the map and the fact that each key will be unique. In addition to that, the statement that is currently being evaluated is guaranteed by the correctness of the list of elements to be in that list. The latter fact therefore eliminates the base case, as an empty list does not contain the element we know is in the list. The other two cases follow by the fact that either the key is equal to the evaluated value of the case expression, or it isn't. In the first case we can then evaluate the statement and get the state after the case expression, as the uniqueness of the key tells us that the key cannot show up in the list anymore. In the other case we can just apply the inductive hypothesis and remove the current case from the case statement, as it did not match.
Another problem with the representation of the state as an actual register is that we have to make sure that the state does not overflow. Currently, the state register will always be 32 bits, meaning the maximum number of states can only be $2^{32} - 1$. We therefore have to prove that the state value will never go over that value. This means that during the translation we have to check for each state that it can fit into an integer. Finally, as we have to assume that there are $2^{32} - 1$ states, \vericert{} will error out when there are more instructions to be translated, which allows us to satisfy and prove that assumption correct.