3 files changed, 17 insertions, 19 deletions
diff --git a/algorithm.tex b/algorithm.tex
index e07406a..848a94f 100644
--- a/algorithm.tex
+++ b/algorithm.tex
@@ -117,7 +117,7 @@ The first translation performed in Vericert is from 3AC to a new hardware transl
 \begin{figure*}
   \centering
   \includegraphics[scale=0.3,trim={10cm 8cm 5cm 5cm},clip=true]{data/accumulator_fsmd2.pdf}
-  \caption{The FSMD for our running example. \JW{Maybe replace `State' with `Current State'? And maybe `Calculate State' could be clearer as `Calculate Next State'?} \JW{Is it worth distinguishing between the different types of box? We have a RAM box which is a single hardware block, then the Registers and State boxes both indicate a collection of registers, and finally the Update and Calculate boxes indicate combinational logic. Perhaps the combinational logic bits could be visualised as clouds? (Or if clouds are a bit of a faff to draw, then rounded rectangles, or something?)} \JW{Can we label `data path' and `control path' on the diagram? Looks like the diagram is nicely split into these two parts already.} \JW{Can state 15 (or should it be state 16??) have a dangling incoming arrow to indicate that it is the start state? And perhaps state 1 could have a double outline to indicate that it is an `accepting' state? Since there's space above the `Calculate State' box, I'd be mildly in favour of expanding that box a bit so that it included all 15 states explicitly (snaking back and forth).}}\label{fig:accumulator_diagram}
+  \caption{The FSMD for our running example. \JW{Maybe replace `State' with `Current State'? And maybe `Calculate State' could be clearer as `Calculate Next State'?} \JW{Is it worth distinguishing between the different types of box? We have a RAM box which is a single hardware block, then the Registers and State boxes both indicate a collection of registers, and finally the Update and Calculate boxes indicate combinational logic. Perhaps the combinational logic bits could be visualised as clouds? (Or if clouds are a bit of a faff to draw, then rounded rectangles, or something?)} \JW{Can state 15 (or should it be state 16??) have a dangling incoming arrow to indicate that it is the start state? And perhaps state 1 could have a double outline to indicate that it is an `accepting' state? Since there's space above the `Calculate State' box, I'd be mildly in favour of expanding that box a bit so that it included all 15 states explicitly (snaking back and forth).}\YH{If this is better I can mock up a tikz version of it maybe and fix the last bits then too.}}\label{fig:accumulator_diagram}
 \end{figure*}
 
 The translation from 3AC to HTL is straightforward, as each 3AC instruction either matches up quite well to a hardware construct, or does not have to be handled by the translation, such as function calls.
@@ -129,11 +129,10 @@ For example, in state 16 in figure~\ref{fig:accumulator_rtl}, the register \text
 
 \subsection{Translating HTL to Verilog}
 
-Finally, we have to translate the HTL code into proper Verilog and prove that it behaves the same as the 3AC according to the Verilog semantics.  The Verilog output is modelled as a complete abstract syntax tree (AST) instead of being an abstract map over the instructions that are executed. \JW{I find `an abstract map over the instructions that are executed' hard to unpick. How about ``Whereas HTL is a language that is specifically designed to represent the FSMDs we are interested in, Verilog is a general-purpose HDL. So the challenge here is to translate our FSMD representation into a Verilog AST.''}  However, as all the instructions are already expressed in Verilog, only the maps need to be translated to valid Verilog, and correct declarations for all the variables in the program need to be added as well.
+Finally, we have to translate the HTL code into proper Verilog and prove that it behaves the same as the 3AC according to the Verilog semantics.  Whereas HTL is a language that is specifically designed to represent the FSMDs we are interested in, Verilog is a general-purpose HDL.\@  So the challenge here is to translate our FSMD representation into a Verilog AST.  However, as all the instructions are already expressed in Verilog, only the maps need to be translated to valid Verilog, and correct declarations for all the variables in the program need to be added as well.
 
-This translation seems quite straightforward, however, proving that it is correct is not that simple, as all the implicit assumptions that were made in HTL need to be translated explicitly to Verilog and needs to have the same behaviour according to the semantics. 
-\JW{What does `needs' refer to? Missing `it' perhaps?}
-Figure~\ref{fig:accumulator_v} shows the final Verilog output that is generated.  In general, the structure is similar to \JWcouldcut{the structure}\JW{that} of the HTL code, however, the control and datapath maps have been translated to case statements \JWcouldcut{that serve the same purpose}.  The other main addition to the code is the initialisation of all the variables in the code to the correct bitwidths and the declaration of the inputs and outputs to the module, so that the module can be used inside a larger hardware design.  
+This translation seems quite straightforward, however, proving that it is correct is not that simple, as all the implicit assumptions that were made in HTL need to be translated explicitly to Verilog statements and needs it needs to be shown that these explicit behaviours are equivalent to the assumptions made in the HTL semantics.
+Figure~\ref{fig:accumulator_v} shows the final Verilog output that is generated.  In general, the structure is similar to that of the HTL code, however, the control and datapath maps have been translated to case statements \JWcouldcut{that serve the same purpose}.  The other main addition to the code is the initialisation of all the variables in the code to the correct bitwidths and the declaration of the inputs and outputs to the module, so that the module can be used inside a larger hardware design.
 
 \begin{figure}
   \centering
@@ -203,14 +202,9 @@ endmodule
 \end{minted}
     \caption{Verilog always block describing the control logic of the module.}\label{fig:accumulator_v_2}
   \end{subfigure}
-  \caption{Accumulator example using \vericert{} to translate the 3AC to a state machine expressed in Verilog. \JW{If space permits, it would probably be preferable to have this code in a single column, as splitting a single module across two subfigures is a bit jarring.}}\label{fig:accumulator_v}
+  \caption{Accumulator example using \vericert{} to translate the 3AC to a state machine expressed in Verilog. \JW{If space permits, it would probably be preferable to have this code in a single column, as splitting a single module across two subfigures is a bit jarring.}\YH{I actually don't mind it for some reason, because it separates control flow and data path, but it's true it is weird.  The only problem is it's very long if I don't do that.}}\label{fig:accumulator_v}
 \end{figure}
 
-\JW{What do you think about moving the following two paragraphs to the Proof section?}
-The translation from maps to case statements is done by turning each node of the tree into a case expression with the statments in each being the same.  The main difficulty for the proof is that a structure that can be directly accessed is transformed into an inductive structure where a certain number of constructors need to be called to get to the correct case.  The proof of the translation from maps to case statements follows by induction over the list of elements in the map and the fact that each key in this list will be unique.  In addition to that, the statement that is currently being evaluated is guaranteed by the correctness of the list of elements to be in that list.  The latter fact therefore eliminates the base case, as an empty list does not contain the element we know is in the list.  The other two cases follow by the fact that either the key is equal to the evaluated value of the case expression, or it isn't.  In the first case we can then evaluate the statement and get the state after the case expression, as the uniqueness of the key tells us that the key cannot show up in the list anymore.  In the other case we can just apply the inductive hypothesis and remove the current case from the case statement, as it did not match.
-
-Another problem with the representation of the state as an actual register is that we have to make sure that the state does not overflow.  Currently, the state register will always be 32 bits, meaning the maximum number of states can only be $2^{32} - 1$.  We therefore have to prove that the state value will never go over that value.  This means that during the translation we have to check for each state that it can fit into an integer. \JW{So I guess this means that Vericert will refuse to compile a program with 5 million instructions? If so, might be worth making that explicit, and maybe even making `Size of state register' one of your Key Challenges?}
-
 \subsection{Optimisations}
 
 Although \vericert{} is not yet a proper `optimising' HLS compiler, we have implemented a few optimisations that aim to improve the quality of the hardware designs it produces.
@@ -218,14 +212,7 @@ Although \vericert{} is not yet a proper `optimising' HLS compiler, we have impl
 \subsubsection{Byte- and word-addressable memories}
 
 One big difference between C and Verilog is how memory is represented.  In hardware, efficient RAMs are not as available as in software, and need to be explicitly implemented by declaring two-dimensional arrays with specific properties.  A major limitation is that RAMs often only allow one read and one write per clock cycle, \JWcouldcut{for example if implementing single port RAM, which is the most common type of RAM}.  To make loads and stores as efficient as possible, the RAM needs to be \JWcouldcut{implemented as being} word-addressable, so that an entire integer can be loaded or stored in one clock cycle.
-However, the memory model that \compcert{} uses for its intermediate languages~\cite{blazy05_formal_verif_memor_model_c} is byte-addressable.  It therefore has to be proven that the byte-addressable memory behaves in the same way as the word-addressable memory in hardware.  Any modifications of the bytes in the \compcert{} memory model also have to be shown to modify the word-addressable memory in the same way.  As only integer loads and stores are currently supported in our HLS back end, it follows that the addresses given to the loads and stores should be divisible by four.  If that is the case, then the translation from byte-addressed memory to word-addressed memory could be done by dividing the address by four and subtracting \JWcouldcut{by} the base address of the memory. \JW{Why does `subtracting the base address of the memory' have anything to do with whether the memory is byte or word addressed? Don't you have to do that either way? Or perhaps you're saying that your memory is not only word-addressed, but it also starts at address 0 rather than some random address like you'd get in software?}
-
-\subsubsection{Reset signals}
-
-\YH{This section could maybe go into the proof section instead.} 
-
-\JW{Yeah I agree that the rest of this paragraph describes a detail of the proof and should be in the Proof section.}
-Even though functions calls are not supported by \vericert{}, the initial function call that \compcert{} uses to enter the main function does need to be supported in the Verilog semantics, as explained further in Section~\ref{sec:verilog}.  The reset signal therefore has to be reasoned about correctly in the Semantics and in the initial function call to ensure that the initial state of the module is set correctly, as in 3AC, the entry point of the function is part of the function definition.
+However, the memory model that \compcert{} uses for its intermediate languages~\cite{blazy05_formal_verif_memor_model_c} is byte-addressable.  It therefore has to be proven that the byte-addressable memory behaves in the same way as the word-addressable memory in hardware.  Any modifications of the bytes in the \compcert{} memory model also have to be shown to modify the word-addressable memory in the same way.  As only integer loads and stores are currently supported in our HLS back end, it follows that the addresses given to the loads and stores should be divisible by four.  If that is the case, then the translation from byte-addressed memory to word-addressed memory could be done by dividing the address by four.
 
 \subsubsection{Implementing the \texttt{Oshrximm} instruction}
 
diff --git a/archive/algorithm.tex b/archive/algorithm.tex
index 376196a..1d66770 100644
--- a/archive/algorithm.tex
+++ b/archive/algorithm.tex
@@ -1,3 +1,10 @@
+\subsubsection{Reset signals}
+
+\YH{This section could maybe go into the proof section instead.}
+
+\JW{Yeah I agree that the rest of this paragraph describes a detail of the proof and should be in the Proof section.}
+Even though functions calls are not supported by \vericert{}, the initial function call that \compcert{} uses to enter the main function does need to be supported in the Verilog semantics, as explained further in Section~\ref{sec:verilog}.  The reset signal therefore has to be reasoned about correctly in the Semantics and in the initial function call to ensure that the initial state of the module is set correctly, as in 3AC, the entry point of the function is part of the function definition.
+
 \paragraph{Key challenge: Adding reset signals}
 In hardware, there is no notion of a stack that can be popped or pushed.  It is therefore necessary to add reset signals to the main module \JWcouldcut{that will be called} so that the proper state can be set at the start of the function.
 The main subtle change that was added to the code is the reset signal which sets the state to the starting state correctly.  In HTL, this was described directly in the semantics, where the entry point is stored in the module interface of HTL.  However, in Verilog we also want to verify that the hardware will be placed into the right state when we power it up or reset the design, and it therefore has to be encoded directly in the Verilog code.
diff --git a/proof.tex b/proof.tex
index 804a36a..14b0011 100644
--- a/proof.tex
+++ b/proof.tex
@@ -83,6 +83,10 @@ Firstly, as the input representation is a map, all the keys of the map will be u
 
 However, the proof of uniqueness of the keys only works if the translation function is \emph{injective}, meaning that the function will result in distinct outputs for all possible inputs.  Otherwise, the proof of uniqueness of the keys for the input would not translate to a uniqueness of possible case expression matches in the output.  However, in our case the translation function is actually not injective, even though it might at first seem like it, because the state is stored as an integer, whereas the map is addressable by any positive number.  This means that if the positive number is greater than the maximum value that can be stored in the integer, the overflow would result in the wrong states being accessed.  To make the function injective, we therefore have to prove that the input positive number is never greater than $2^{32}-1$ so that the uniqueness property also holds for the output.
 
+The translation from maps to case statements is done by turning each node of the tree into a case expression with the statments in each being the same.  The main difficulty for the proof is that a structure that can be directly accessed is transformed into an inductive structure where a certain number of constructors need to be called to get to the correct case.  The proof of the translation from maps to case statements follows by induction over the list of elements in the map and the fact that each key in this list will be unique.  In addition to that, the statement that is currently being evaluated is guaranteed by the correctness of the list of elements to be in that list.  The latter fact therefore eliminates the base case, as an empty list does not contain the element we know is in the list.  The other two cases follow by the fact that either the key is equal to the evaluated value of the case expression, or it isn't.  In the first case we can then evaluate the statement and get the state after the case expression, as the uniqueness of the key tells us that the key cannot show up in the list anymore.  In the other case we can just apply the inductive hypothesis and remove the current case from the case statement, as it did not match.
+
+Another problem with the representation of the state as an actual register is that we have to make sure that the state does not overflow.  Currently, the state register will always be 32 bits, meaning the maximum number of states can only be $2^{32} - 1$.  We therefore have to prove that the state value will never go over that value.  This means that during the translation we have to check for each state that it can fit into an integer. \JW{So I guess this means that Vericert will refuse to compile a program with 5 million instructions? If so, might be worth making that explicit, and maybe even making `Size of state register' one of your Key Challenges?}
+
 \subsection{Deterministic Semantics}
 
 Finally, to prove the backward simulation given the forward simulation, it has to be shown that if we generate hardware with a specific behaviour, that it is the only possible program with that behaviour.  This only has to be performed for the final intermediate language, which is Verilog, so that the backward simulation holds for the whole chain from Clight to Verilog.