4 files changed, 65 insertions, 46 deletions
diff --git a/appendix.tex b/appendix.tex
index 39dd72d..11bfe41 100644
--- a/appendix.tex
+++ b/appendix.tex
@@ -53,23 +53,6 @@
   \caption{Inferrence rules for statements.}\label{fig:inferrence_statements}
 \end{figure*}
 
-\begin{figure*}
-  \centering
-  \begin{minipage}{1.0\linewidth}
-    \begin{gather*}
-      \inferrule[Module]{\Gamma_{r} ! s_{t} = \texttt{Some } v \\ (m_{i}, \Gamma_{r}^{0}, \Gamma_{a}^{0}, \epsilon, \epsilon\ l)\ \longrightarrow_{\vec{m}} (m_{i}, \Gamma_{r}^{1}, \Gamma_{a}^{1}, \Delta_{r}^{1}, \Delta_{a}^{1}) \\ (\Gamma_{r}^{1} // \Delta_{r}^{1}) ! s_{t} = \texttt{Some } v'}{\texttt{State } \textit{sf }\ m\ v\ \Gamma_{r}^{0}\ \Gamma_{a}^{0} \longrightarrow \texttt{State } \textit{sf }\ m\ v'\ (\Gamma_{r}^{1} // \Delta_{r}^{1})\ (\Gamma_{a}^{1} // \Delta_{a}^{1})}\\
-      %
-      \inferrule[Finish]{\Gamma_{r}!\textit{fin} = \texttt{Some } 1 \\ \Gamma_{r}!\textit{ret} = \texttt{Some } r}{\texttt{State } \textit{sf }\ m\ s_{t}\ \Gamma_{r}\ \Gamma_{a} \longrightarrow \texttt{Returnstate } \textit{sf }\ r}\\
-      %
-      \inferrule[Call]{ }{\texttt{Callstate } \textit{sf }\ m\ \vec{r} \longrightarrow \texttt{State } \textit{sf }\ m\ n\ (\textit{init\_params }\ \vec{r}\ a // \{s_{t} \rightarrow n\})}\\
-      %
-      \inferrule[Return]{ }{\texttt{Returnstate } (\texttt{Stackframe } r\ m\ \textit{pc }\ \Gamma_{r}\ \Gamma_{a} :: \textit{sf}) \longrightarrow \texttt{State } \textit{sf }\ m\ \textit{pc }\ (\Gamma_{r} // \{ \textit{st} \rightarrow \textit{pc}, r \rightarrow i \})\ \epsilon}
-    \end{gather*}
-  \end{minipage}
-  \caption{Inferrence rules for modules}%
-  \label{fig:inferrence_module}
-\end{figure*}
-
 \begin{figure}
   \centering
 \begin{minted}{coq}
diff --git a/proof.tex b/proof.tex
index 5d0a7c6..a5ccb44 100644
--- a/proof.tex
+++ b/proof.tex
@@ -40,23 +40,6 @@ The forward simulation between C and Verilog can be separated into forward simul
 
 \subsubsection{3AC to HTL forward simulation}
 
-\begin{figure}
-  \centering
-  \begin{tikzpicture}
-    \begin{scope}
-      \node[circle,minimum size=2] (s1) at (0,3) {$S_{1}$};
-      \node[circle,minimum size=2] (r1) at (4,3) {$R_{1}$};
-      \node[circle,minimum size=2] (s2) at (0,0) {$S_{2}$};
-      \node[circle,minimum size=2] (r2) at (4,0) {$R_{2}$};
-      \draw (s1) -- node[above] {\textasciitilde{}} node[below] {\small\texttt{match\_states}} ++ (r1);
-      \draw[-{Latex[length=3mm]}] (s1) -- (s2);
-      \draw[dashed] (s2) -- node[above] {\textasciitilde{}} node[below] {\small\texttt{match\_states}} ++ (r2);
-      \draw[-{Latex[length=3mm]},dashed] (r1) -- node[left] {+} ++ (r2);
-    \end{scope}
-  \end{tikzpicture}
-  \caption{Simulation diagram proving the correctness of the translation from 3AC to HTL.}\label{fig:simulation_diagram}
-\end{figure}
-
 As HTL is quite different to 3AC, this first translation is the most involved and therefore requires a larger proof, because the translation from 3AC instructions to Verilog statements needs to be proven correct in this step.  In addition to that, the semantics of HTL are also quite different to the 3AC semantics, as instead of defining small-step semantics for each construct in Verilog, the semantics are instead defined over one clock cycle and mirror the semantics defined for Verilog.
 
 The first step in proving the forward simulation is to define a relation that matches an 3AC state to an HTL state, which shows when the states are equivalent.  This relation also defines assumptions that are made about the 3AC code that we receive, so that these assumptions can be used to prove the translations correct.  These assumptions then have to be proven to always hold assuming the HTL code was created by the translation algorithm.  Some of these assertions that need to be made about the 3AC and HTL code for a state to match are:
@@ -79,7 +62,23 @@ This \texttt{match\_states} predicate that is used to match the states of the 3A
   \item \texttt{match\_initial\_state} matches the initial call to the main function, and cannot be used for any other function calls, as the stack frame is assumed to be empty.
 \end{enumerate}
 
-Using the \texttt{match\_states}, we can then define the correctness theorem for the translation, shown as a simulation diagram in Figure~\ref{fig:simulation_diagram}.
+Using the \texttt{match\_states}, we can then define the correctness theorem for the translation, shown as a simulation diagram below:
+
+\begin{center}
+  \begin{tikzpicture}
+    \begin{scope}
+      \node[circle] (s1) at (0,1.5) {$R, M, \textit{pc}$};
+      \node[circle] (r1) at (6,1.5) {$\Gamma_{r}, \Gamma_{a}, \Gamma_{r}!\sigma$};
+      \node[circle] (s2) at (0,0) {$R', M', \textit{pc}'$};
+      \node[circle] (r2) at (6,0) {$\Gamma_{r}', \Gamma_{a}', \Gamma_{r}'!\sigma$};
+      \node at (6.3,3) {+};
+      \draw (s1) -- node[above] {$R \le \Gamma_{r} \land M \le \Gamma_{a} \land \textit{pc} = \Gamma_{r}!\sigma$} ++ (r1);
+      \draw[-{Latex}] ($(s1.south) + (0,0.3)$) -- ($(s2.north) - (0,0.3)$);
+      \draw[-{Latex},dashed] ($(r1.south) + (0,0.4)$) -- ($(r2.north) - (0,0.4)$);
+      \draw[dashed] (s2) -- node[above] {\textasciitilde{}} node[below] {\small\texttt{match\_states}} ++ (r2);
+    \end{scope}
+  \end{tikzpicture}
+\end{center}
 
 $S_{1}$ and $S_{2}$ are 3AC states and $R_{1}$ and $R_{2}$, HTL states and $\xrightarrow{t}$ is one step in the semantics of 3AC and $\xrightarrow{t}_{+}$ is one or more steps in the semantics of HTL.\@  The correctness theorem then says that for all possible starting states $S_{1}$ and for the resulting state $S_{2}$ after one step in the semantics of 3AC, for all HTL states $R_{1}$ that matches the state $S_{1}$, there should exist a state $R_{2}$ such that $R_{2}$ is the result of the simulation of the HTL semantics and that the states $S_{2}$ and $R_{2}$ should match as well.  Using this property, the forward simulation can then be proven correct by also showing that the initial and final states of the simulation match as well.
 
@@ -87,23 +86,19 @@ $S_{1}$ and $S_{2}$ are 3AC states and $R_{1}$ and $R_{2}$, HTL states and $\xri
 
 The HTL to Verilog simulation is quite simple, as the only transformation is from the map representation of the code to the case statement representation.  As the representations are quite different though, to prove that they are equivalent the following observations have to be made.
 
-Firstly, as the input representation is a map, all the keys of the map will be unique, which will be translated to the matched expressions in the case statement.  In addition to that, the assumption is made that every possible evaluation of the state value will give a valid state, meaning the key will be present in the map.  Both of these observations mean that after the translation, for every possible value of the state, the case statement will be able to match an expression and enter the correct statement.  As each key is unique, if the state matches a case expression, then this also means that this is the only case expression that it could match, and it therefore has to be the correct case expression which contains the same statement as the map.
-
-However, the proof of uniqueness of the keys only works if the translation function is \emph{injective}, meaning that the function will result in distinct outputs for all possible inputs.  Otherwise, the proof of uniqueness of the keys for the input would not translate to a uniqueness of possible case expression matches in the output.  However, in our case the translation function is actually not injective, even though it might at first seem like it, because the state is stored as an integer, whereas the map is addressable by any positive number.  This means that if the positive number is greater than the maximum value that can be stored in the integer, the overflow would result in the wrong states being accessed.  To make the function injective, we therefore have to prove that the input positive number is never greater than $2^{32}-1$ so that the uniqueness property also holds for the output.
-
 The translation from maps to case statements is done by turning each node of the tree into a case expression with the statments in each being the same.  The main difficulty for the proof is that a structure that can be directly accessed is transformed into an inductive structure where a certain number of constructors need to be called to get to the correct case.  The proof of the translation from maps to case statements follows by induction over the list of elements in the map and the fact that each key in this list will be unique.  In addition to that, the statement that is currently being evaluated is guaranteed by the correctness of the list of elements to be in that list.  The latter fact therefore eliminates the base case, as an empty list does not contain the element we know is in the list.  The other two cases follow by the fact that either the key is equal to the evaluated value of the case expression, or it isn't.  In the first case we can then evaluate the statement and get the state after the case expression, as the uniqueness of the key tells us that the key cannot show up in the list anymore.  In the other case we can just apply the inductive hypothesis and remove the current case from the case statement, as it did not match.
 
-Another problem with the representation of the state as an actual register is that we have to make sure that the state does not overflow.  Currently, the state register will always be 32 bits, meaning the maximum number of states can only be $2^{32} - 1$.  We therefore have to prove that the state value will never go over that value.  This means that during the translation we have to check for each state that it can fit into an integer. \JW{So I guess this means that Vericert will refuse to compile a program with 5 million instructions? If so, might be worth making that explicit, and maybe even making `Size of state register' one of your Key Challenges?}
+Another problem with the representation of the state as an actual register is that we have to make sure that the state does not overflow.  Currently, the state register will always be 32 bits, meaning the maximum number of states can only be $2^{32} - 1$.  We therefore have to prove that the state value will never go over that value.  This means that during the translation we have to check for each state that it can fit into an integer.  Finally, as we have to assume that there are $2^{32} - 1$ states, Vericert will error out when there are more instructions to be translated, which allows us to satisfy and prove that assumption correct.
 
 \subsection{Deterministic Semantics}
 
 Finally, to prove the backward simulation given the forward simulation, it has to be shown that if we generate hardware with a specific behaviour, that it is the only possible program with that behaviour.  This only has to be performed for the final intermediate language, which is Verilog, so that the backward simulation holds for the whole chain from Clight to Verilog.
 
-The Verilog semantics that are used are deterministic, as the order of operation of all the constructs is defined.
+The Verilog semantics that are used are deterministic, as the order of operation of all the constructs is defined, and there is therefore only one way to evaluate the module and only one possible behaviour.  This was proven correct for the small-step semantics shown in Figure~\ref{fig:inferrence_module}
 
 %\subsection{Coq Mechanisation}
 
-\JW{Would be nice to include a few high-level metrics here. How many person-years of effort was the proof (very roughly)? How many lines of Coq? How many files, how many lemmas? How long does it take for the Coq proof to execute?}
+%\JW{Would be nice to include a few high-level metrics here. How many person-years of effort was the proof (very roughly)? How many lines of Coq? How many files, how many lemmas? How long does it take for the Coq proof to execute?}
 
 \subsection{Coq Mechanisation}
 
@@ -124,11 +119,11 @@ The Verilog semantics that are used are deterministic, as the order of operation
     \textbf{Total}                                    & 1184 & 747 & 1007 & 5526 & 8464 \\
     \bottomrule
   \end{tabular}
-  \caption{Statistics about the proof.}
+  \caption{Statistics about the proof and implementation of \vericert{}.}
   \label{tab:proof_statistics}
 \end{table*}
 
-The Coq mechanisation 
+The lines of code for the implementation and proof of \vericert{} can be found in Table~\ref{tab:proof_statistics}.  In general it took 1 person year to finish the implementation and proofs of \vericert{}.  The main proof is the correctness proof for the HTL generation, which required the equivalence proofs between all integer operations supported by \compcert{} and the ones supported in hardware.  From the 3349 lines of proof code in the HTL generation, 1189 are only for the correctness proof of the load and store instructions.  These were tedious to prove correct because of the large difference in memory models used, and the need to prove properties such as writes to the outside of the allocated memory being undefined, so that a finite sized array could be used.  In addition to that, as pointers in HTL and Verilog are represented as integers, whereas there is a separate pointer value in the \compcert{} semantics, it was painful to reason about them and a many new theorems had to be proven about integers and pointers in \vericert{}.
 
 %%% Local Variables:
 %%% mode: latex
diff --git a/references.bib b/references.bib
index ec8e945..7df5cbe 100644
--- a/references.bib
+++ b/references.bib
@@ -639,3 +639,44 @@ year = {2020},
   year            = 1999,
   pages           = {7--es},
 }
+
+@inproceedings{loow19_verif_compil_verif_proces,
+	author = {L\"{o}\"{o}w, Andreas and Kumar, Ramana and Tan, Yong Kiam and
+                  Myreen, Magnus O. and Norrish, Michael and Abrahamsson, Oskar
+                  and Fox, Anthony},
+	title = {Verified Compilation on a Verified Processor},
+	tags = {verification},
+	booktitle = {Proceedings of the 40th ACM SIGPLAN Conference on Programming
+                  Language Design and Implementation},
+	year = 2019,
+	pages = {1041--1053},
+	doi = {10.1145/3314221.3314622},
+	url = {https://doi.org/10.1145/3314221.3314622},
+	acmid = 3314622,
+	address = {New York, NY, USA},
+	isbn = {978-1-4503-6712-7},
+	keywords = {compiler verification, hardware verification, program
+                  verification, verified stack},
+	location = {Phoenix, AZ, USA},
+	numpages = 13,
+	publisher = {ACM},
+	series = {PLDI 2019}
+}
+
+@inproceedings{loow19_proof_trans_veril_devel_hol,
+	author = {L\"{o}\"{o}w, Andreas and Myreen, Magnus O.},
+	title = {A Proof-producing Translator for Verilog Development in HOL},
+	tags = {hls, semantics},
+	booktitle = {Proceedings of the 7th International Workshop on Formal
+                  Methods in Software Engineering},
+	year = 2019,
+	pages = {99--108},
+	doi = {10.1109/FormaliSE.2019.00020},
+	url = {https://doi.org/10.1109/FormaliSE.2019.00020},
+	acmid = 3338828,
+	address = {Piscataway, NJ, USA},
+	location = {Montreal, Quebec, Canada},
+	numpages = 10,
+	publisher = {IEEE Press},
+	series = {FormaliSE '19}
+}
diff --git a/verilog.tex b/verilog.tex
index 1f3a3dc..03cafad 100644
--- a/verilog.tex
+++ b/verilog.tex
@@ -5,7 +5,7 @@
 This section describes the Verilog semantics that were \JW{was} chosen for the target language, including the changes that were made to the semantics to make it \JWcouldcut{a better fit} a \JW{suitable} HLS target.  The Verilog standard is quite large~\cite{06_ieee_stand_veril_hardw_descr_languag,05_ieee_stand_veril_regis_trans_level_synth}, but the syntax and semantics can be reduced to a small subset that \vericert{} needs to target.
 \NR{Have we discussed what our options were and why we chose the HOL4 semantics?} \JW{Good point -- we should cite the Maude semantics too. I think that's the only viable alternative.}\YH{Yes, I should make that more clear, I cited Meredith semantics in the algorithm section, will add maude as well, the main reason is practicality and we don't need continuous assignment, basically the same reasons that Loow used to justify creating his own semantics.}
 
-The Verilog semantics \JW{we use is ported to Coq from} a semantics written in HOL4  by \citet{loow19_verif_compil_verif_proces}.% which was used to create a formal translation from a logic representation encoded in the HOL4~\cite{slind08_brief_overv_hol4} theorem prover into an equivalent Verilog design. 
+The Verilog semantics we use is ported to Coq from a semantics written in HOL4  by \citet{loow19_proof_trans_veril_devel_hol}, used to prove the translation from HOL4 to Verilog~\cite{loow19_verif_compil_verif_proces}.% which was used to create a formal translation from a logic representation encoded in the HOL4~\cite{slind08_brief_overv_hol4} theorem prover into an equivalent Verilog design. 
 This semantics is quite practical as it is restricted to a small subset of Verilog, which can nonetheless be used to model all hardware constructs one would want to design.  The main features that are not supported by the syntax and semantics are continuous assignment and combinational always blocks.
 \NR{Shall we use special font fo always-blocks: maybe \alwaysblock{}}
 
@@ -44,7 +44,7 @@ When targeting a hardware description language such as Verilog, it is important
 
 \subsection{Changes to the Semantics}
 
-Some small changes were made to the semantics proposed by \citet{loow19_verif_compil_verif_proces} to make it suitable as a HLS target.
+Some small changes were made to the semantics proposed by \citet{loow19_proof_trans_veril_devel_hol} to make it suitable as a HLS target.
 
 The main change is the addition of support for two-dimensional arrays, which are needed to model RAM in Verilog.  RAM is needed to model the stack in C efficiently, without having to declare a variable for each possible stack location.  In the original semantics, RAMs, as well as inputs and outputs to the module could be modelled using a function from variable names (strings) to values, which could be modified accordingly to model inputs to the module.  This is quite an abstract description of memory, which can also be expressed as an array of bitvectors, which is the path we took instead.  This requires the addition of array operators to the semantics and correct reasoning of loads and stores to the array in different always blocks simultaneously.  Consider the following Verilog code: