3 files changed, 16 insertions, 12 deletions
diff --git a/introduction.tex b/introduction.tex
index 0188b93..cbf4780 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -2,29 +2,33 @@
 
 %% Motivation for why HLS might be needed
 
-The current approach to writing energy-efficient and high-throughput applications is to use application-specific hardware, instead of relying on a general purpose CPU.\@  However, custom hardware designs come at the cost of having to design and produce it, which can be a tedious and error-prone process using hardware description languages (HDL) such as Verilog.  Especially with the size of hardware designs grow over the years, it can become difficult to verify that the hardware design behaves in the expected way, as simulation of hardware description languages can be quite inefficient.  Furthermore, the algorithms that are being accelerated in hardware often already have a software implementation, meaning they have to be reimplemented efficiently in a hardware description language which can be time consuming.
+The current approach to writing energy-efficient and high-throughput applications is to use application-specific hardware, instead of relying on a general purpose \JW{general-purpose} CPU.\@  However, custom hardware designs come at the cost of having to design and produce it \JW{them?}, which can be a tedious and error-prone process using hardware description languages (HDL) such as Verilog.  Especially with the size of hardware designs grow \JW{growing} over the years, it can become difficult to verify that the hardware design behaves in the expected way, as simulation of hardware description languages \JW{HDLs} can be quite inefficient.  Furthermore, the algorithms that are being accelerated in hardware often already have a software implementation, meaning they have to be reimplemented efficiently in a hardware description language which can be time consuming \JW{time-consuming}.
 
 %% Definition and benefits of HLS
-%%\NR{What does "this" refer to in the first, second and fourth sentence? You had "this" in the previous paragraph too.}\YH{Explained better}
-One possible solution to tedious design process of custom hardware is to use high-level synthesis (HLS), which is the process of generating custom hardware, represented in a hardware description language, based on a behavioural description, often a subset of C.  This elevates the level of abstraction, because the description of the algorithm in C is inherently untimed, meaning actions don't have to be scheduled into clock cycles.  The higher level of abstraction makes it easier to reason about the algorithms and therefore also makes them easier to maintain.  This already reduces the time taken to design the hardware, especially if a software description of the algorithm already exists, because it won't have to be designed again at a lower level and directly in hardware.  However, another benefit of using HLS to design the hardware, is that functional verification of the design is much simpler and more efficient than if it was done at the HDL stage, as the whole software ecosystem can be used to do that.  Instead of having to run simulations of the hardware, the C code can just be compiled and executed natively, as the hardware design after HLS should have the same behaviour.
-\NR{The abstraction of HLS helps in two ways: improving productivity of hardware designers and reducing the entry barrier of hardware design for software programmers. Both these audiences benefit from better stand to benefit from the guarantees provided by verification and correct-by-construction C-to-Verilog generation.}
+One possible solution to \JW{the} tedious design process of custom hardware is to use high-level synthesis (HLS), which is the process of generating custom hardware, represented in a hardware description language \JW{HDL}, based on a behavioural description, often \JW{in} a subset of C.  This elevates the level of abstraction, because the description of the algorithm in C is inherently untimed, meaning actions don't have to be scheduled into clock cycles.  The higher level of abstraction makes it easier to reason about the algorithms and therefore also makes them easier to maintain.  This already reduces the time taken to design the hardware, especially if a software description of the algorithm already exists, because it won't \JW{doesn't} have to be designed again at a lower level and directly in hardware.  However, another benefit of using HLS to design the hardware, \JW{no comma here} is that functional verification of the design is much simpler and more efficient than if it was \JW{were} done at the HDL stage, as the whole software ecosystem can be used to do that.  Instead of having to run simulations of the hardware, the C code can just be compiled and executed natively, as the hardware design after HLS should have the same behaviour.
+\NR{The abstraction of HLS helps in two ways: improving productivity of hardware designers and reducing the entry barrier of hardware design for software programmers. Both these audiences stand to benefit from the guarantees provided by verified C-to-Verilog generation.} \JW{Yes, Nadesh makes a good point here. Worth incorporating.}
 
 %% Unreliability of HLS
 
-However, the fact that the behaviour is preserved after HLS cannot be guaranteed most existing tools,\YH{Mentor's catapult C can in some cases} meaning behavioural simulation of the hardware design still has to be performed. HLS tools are also known to be quite unreliable, for example, Intel's (formerly Altera's) OpenCL SDK compiler contained too many bugs to even be considered for random testing, as more than 25\% of the testcases failed~\cite{lidbury15_many_core_compil_fuzzin}.  In addition to that, designers often feel like HLS tools are quite unreliable and fragile with respect to which language features that are supported.\YH{Need citation}  As HLS tools are extremely complex and can therefore incorrectly change the behaviour of the design, it is not possible to guarantee that all the properties of the code that were proven in software will also hold for the generated hardware.
+However, the fact that the behaviour is preserved after HLS cannot be guaranteed most existing tools,\YH{Mentor's catapult C can in some cases} meaning behavioural simulation of the hardware design still has to be performed. HLS tools are also known to be quite unreliable, for example, Intel's (formerly Altera's) OpenCL SDK compiler contained too many bugs to even be considered for random testing, as more than 25\% of the testcases failed~\cite{lidbury15_many_core_compil_fuzzin}. In addition to that, designers often feel like HLS tools are quite unreliable and fragile with respect to which language features that are supported.\YH{Need citation}
+\JW{Here's some text that could be helpful in this paragraph... However, most HLS tools cannot guarantee that compilation is behaviour-preserving. In fact, on the contrary, there is some evidence that current HLS tools are actually quite \emph{unreliable} in this regard. For instance, an attempt by \citet{lidbury15_many_core_compil_fuzzin} to fuzz Altera's (now Intel's) OpenCL compiler had to be abandoned because the compiler ``either crashed or emitted an internal compiler error'' on so many of their test inputs. 
+Meanwhile, Xilinx's Vivado HLS has been shown to apply pipelining optimisations incorrectly\footnote{\url{https://bit.ly/vivado-hls-pipeline-bug}} or to silently generate wrong code should the programmer stray outside the fragment of C that it supports\footnote{\url{https://bit.ly/vivado-hls-pointer-bug}}.}
+% JW: Another candidate, probably less interesting:
+% https://bit.ly/intel-hls-memory-bug
+As HLS tools are extremely complex and can therefore incorrectly change the behaviour of the design, it is not possible to guarantee that all the properties of the code that were proven in software will also hold for the generated hardware.
 
 %% Current work in formal verification of HLS
 %%\NR{This is a good paragraph, but we need to relate it more to this work and why this work is different.}
 %%\NR{Focus on more high-level of "why this work is interesting"? Two key points we want to get across to the reader is that in existing works: validation is neccessary every time a new program is compiled and the verifying algorithm is not verified.}
 %%\NR{Also define words like validation, verifying algorithm (can you use the word ``verifier'',mechanisation}
 %%\NR{Having said that, keep the text for related work section.}\YH{Added into related works.}
-To mitigate the problems about the unreliability of synthesis tool, it is often required to check the generated hardware for functional correctness.  This can either be done by simulation with a large test bench, however, to be sure that the hardware does indeed behave in the same way as the C code, it may be necessary to prove that they are equivalent.  Translation validation~\cite{pnueli98_trans} is the main method which is used to prove that the HLS translation was successful, and has been successfully applied to many HLS optimisations~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synth,chouksey20_verif_sched_condit_behav_high_level_synth,banerjee14_verif_code_motion_techn_using_value_propag,chouksey19_trans_valid_code_motion_trans_invol_loops}.  However, the main problem is that the validator itself has often not been mechanically proven correct, meaning that the implementation is quite separate from the proof.  In addition to that, with large designs it may not be feasible to perform translation validation, as the state space would grow exponentially.  A mechanically verified HLS tool would remove the need to perform simulation after the synthesis process if one has proven desirable properties about the C code.  In addition to that, it would allow for the implementation of translation validated optimisation passes which are also proven correct mechanically, thereby greatly improving the reliability of these passes.
+To mitigate the problems about the unreliability of synthesis tool, it is often required to check the generated hardware for functional correctness.  This can either be done by simulation with a large test bench, however, to be sure that the hardware does indeed behave in the same way as the C code, it may be necessary to prove that they are equivalent. \JW{I think that point could be strengthened by emphasising that simulation with a test-bench only provides guarantees that are as good as the test-bench! That is, if the test-bench does not cover all possible inputs then bugs may remain.} Translation validation~\cite{pnueli98_trans} is the main method which is used to prove that the HLS translation was successful, and has been successfully applied to many HLS optimisations~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synth,chouksey20_verif_sched_condit_behav_high_level_synth,banerjee14_verif_code_motion_techn_using_value_propag,chouksey19_trans_valid_code_motion_trans_invol_loops}.  However, the main problem is that the validator itself has often not been mechanically proven correct, meaning that the implementation is quite separate from the proof.  In addition to that, with large designs it may not be feasible to perform translation validation, as the state space would grow exponentially. \JW{Does this link back to Mentor's Catapult-C, which you mentioned earlier? Does Catapult-C attempt to do translation validation as part of its HLS process? And if so, can you make the point that this effort is largely ineffective because once the design is a reasonable size, the translation validation usually fails anyway?} A mechanically verified HLS tool would remove the need to perform simulation after the synthesis process if one has proven desirable properties about the C code.  In addition to that, it would allow for the implementation of translation validated optimisation passes which are also proven correct mechanically, thereby greatly improving the reliability of these passes.
 
 CompCert~\cite{leroy06_formal_certif_compil_back_end} is a C compiler that has been written and formally verified in the Coq theorem prover~\cite{bertot04_inter_theor_provin_progr_devel}.  First of all, most of the compiler passes in CompCert have been proven correct, meaning that once the compiler is built, the proofs can be erased as the algorithm has been shown to be correct independent of the input.  However, some optimisation passes such as software pipelining require translation validation~\cite{tristan08_formal_verif_trans_valid}, in which case the correctness of the compiler pass needs to be checked at runtime.  However, even in this case the verifier itself is proven to only verify code correct that does indeed behave in the same way.
 
 %% Contributions of paper
 
-In this paper we describe a fully verified high-level synthesis tool called CoqUp, which adds a Verilog backend to CompCert and proves that the behaviour of the C code does not change according to existing Verilog semantics.  The main contributions of the paper are the following:
+In this paper we describe a fully verified high-level synthesis \JW{HLS} tool called CoqUp, which adds a Verilog backend to CompCert and proves that the behaviour of the C code is preserved with respect to an existing Verilog semantics.  The main contributions of the paper are the following:
 
 \begin{itemize}
   \item First mechanised and formally verified HLS flow from C to Verilog.
@@ -34,7 +38,7 @@ In this paper we describe a fully verified high-level synthesis tool called CoqU
 %%  \item \NR{We implement our Verilog semantics in CompCert and we are able to generate correct-by-construction Verilog for all programs in the CHStone benchmark suite, which is a well-known HLS benchmark.}
 \end{itemize}
 
-The first section will describe the Verilog semantics that were used and extended to fit into CompCert's model.  The second section will then describe the HLS algorithm, together with it's proof.
+The first section will describe the Verilog semantics that were used and extended to fit into CompCert's model.  The second section will then describe the HLS algorithm, together with its proof.
 
 CoqUp is open source and is hosted on Github\footnote{https://github.com/ymherklotz/coqup}.
 
diff --git a/main.tex b/main.tex
index a7e7fb5..40151be 100644
--- a/main.tex
+++ b/main.tex
@@ -50,7 +50,7 @@
 \usepackage{minted}
 
 \newif\ifCOMMENTS
-\COMMENTSfalse
+\COMMENTStrue
 \newcommand{\Comment}[3]{\ifCOMMENTS\textcolor{#1}{{\bf [\![#2:} #3{\bf ]\!]}}\fi}
 \newcommand\JW[1]{\Comment{red!75!black}{JW}{#1}}
 \newcommand\YH[1]{\Comment{green!50!blue}{YH}{#1}}
diff --git a/verilog.tex b/verilog.tex
index 6f1e5ee..13fee98 100644
--- a/verilog.tex
+++ b/verilog.tex
@@ -2,7 +2,7 @@
 
 Verilog is a hardware description language commonly used to design hardware.  A Verilog design can then be synthesised into more basic logic which describes how different gates connect to each other, called a netlist.  This representation can then be put onto either a field-programmable gate array (FPGA) or turned into an application-specific integrated circuit (ASPIC) to implement the design that was described in Verilog.  The Verilog standard is quite large though, and not all Verilog features are needed to be able to describe hardware.  Many Verilog features are only useful for simulation and do not affect the actual hardware itself, which means that these features do not have to be modelled in the semantics.  In addition to that, as the HLS algorithm dictates which Verilog constructs are generated, meaning the Verilog subset that has to be modelled by the semantics can be reduced even further to only support the constructs that are needed.  Only supporting a smaller subset in the semantics also means that there is less chance that the standard is misunderstood, and that the semantics actually model how the Verilog is simulated.
 
-The Verilog semantics are based on the semantics proposed by \citet{loow19_verif_compil_verif_proces}, which were used to create a formal translation from HOL logic into a Verilog circuit.  These semantics are quite practical as they restrict themselves to a small subset of Verilog, which can nonetheless be used to model all hardware constructs one would want to design.  The main syntax for the Verilog subset is the following:
+The Verilog semantics are based on the semantics proposed by \citet{loow19_verif_compil_verif_proces}, which were used to create a formal translation from HOL logic into a Verilog circuit.  These semantics are quite practical as they restrict themselves to a small subset of Verilog, which can nonetheless be used to model all hardware constructs one would want to design.  The main syntax for the Verilog subset is the following: \JW{Eventually the $e$ alternatives could be listed horizontally rather than with one per line, to save space.} \JW{This verilog syntax looks weird to me. I didn't think there was a `then' keyword, for instance. Perhaps you're aiming more at some sort of abstracted syntax of Verilog? What does the semicolon on its own mean? Some sort of skip statement? The case statement looks weird too -- how do you get multiple cases in a single switch statement, and where is the default case? }
 
 \begin{align*}
   v ::=&\; \mathit{sz} * n\\
@@ -10,7 +10,7 @@ The Verilog semantics are based on the semantics proposed by \citet{loow19_verif
   |&\; x\\[-2pt]
   |&\; e [e]\\[-2pt]
   |&\; e\ \mathit{op}\ e\\[-2pt]
-  |&\; \texttt{!} e\ |\ \texttt{~} e\\[-2pt]
+  |&\; \texttt{!} e\ |\ \texttt{\textasciitilde} e\\[-2pt]
   |&\; e \texttt{ ? } e \texttt{ : } e\\
   s ::=&\; s\ \texttt{;}\ s\ |\ \texttt{;}\\[-2pt]
   |&\; \texttt{if } e \texttt{ then } s \texttt{ else } s\\[-2pt]
@@ -22,7 +22,7 @@ The Verilog semantics are based on the semantics proposed by \citet{loow19_verif
 |&\; \text{\tt always @(posedge clk)}\ s
 \end{align*}
 
-The main addition to the Verilog syntax is the explicit declaration of inputs and outputs, as well as variables and arrays.  This means that the declarations have to be handled in the semantics as well, adding to the safety that all the registers are declared properly with the right size, as this affects how the Verilog module is synthesised and simulated.  In addition to that, literal values are not represented by a list of nested boolean values, but instead they are represented by a size and its value, meaning a boolean is represented as a value with size one.  Finally, the last difference is that the syntax supports two dimensional arrays in Verilog explicitly which model memory so that we can reason about array loads and stores properly.
+The main addition to the Verilog syntax is the explicit declaration of inputs and outputs, as well as variables and arrays.  This means that the declarations have to be handled in the semantics as well, adding to the safety that all the registers are declared properly with the right size, as this affects how the Verilog module is synthesised and simulated.  In addition to that, literal values are not represented by a list of nested boolean values, but instead they are represented by a size and its value \JW{But I wouldn't use `$*$' to separate the size and the value here, because it makes it look like you're multiplying them together. You could use the apostrophe symbol like real Verilog? \texttt{4'b5} and so on?}, meaning a boolean is represented as a value with size one.  Finally, the last difference is that the syntax supports two dimensional arrays in Verilog explicitly which model memory so that we can reason about array loads and stores properly.
 
 \subsection{Semantics}