Fix more maths

author: Yann Herklotz <git@yannherklotz.com> 2022-05-06 15:31:30 +0100
committer: Yann Herklotz <git@yannherklotz.com> 2022-05-06 15:31:30 +0100
commit: f6add5b74e9ec6e251d8c744d3084f5b869f8d48 (patch)
tree: f6ede425e9d77d14b41c2c50ce1ac5e67fa45e50
parent: 29abf1b418989b544295e004a215c0e9cfb18280 (diff)
download: lsr22_fvhls-f6add5b74e9ec6e251d8c744d3084f5b869f8d48.tar.gz
lsr22_fvhls-f6add5b74e9ec6e251d8c744d3084f5b869f8d48.zip
2 files changed, 39 insertions, 38 deletions
diff --git a/chapters/hls.tex b/chapters/hls.tex
index 6a86107..ad2b32c 100644
--- a/chapters/hls.tex
+++ b/chapters/hls.tex
@@ -44,17 +44,17 @@ pointers, recursive function calls, non-32-bit integers, floats, and global vari
   input software and the word-addressed memory that we implement in the output hardware, different
   handling of unsigned comparisons between C and Verilog, and carefully implementing memory reads
   and writes so that these behave properly as a RAM in hardware.
-\item In \in{Section}[sec:hls:evaluation], we evaluate Vericert on the PolyBench/C benchmark
-  suite~\cite[pouchet13_polyh], and compare the performance of our generated hardware against an
-  existing, unverified HLS tool called LegUp~\cite[canis11_legup]. We show that Vericert generates
-  hardware that is \slowdownOrig$\times$ slower (\slowdownDiv$\times$ slower in the absence of
-  division) and \areaIncr$\times$ larger than that generated by LegUp. This performance gap can be
-  largely attributed to Vericert's current lack of support for instruction-level parallelism and the
-  absence of an efficient, pipelined division operator. We intend to close this gap in the future by
-  introducing (and verifying) HLS optimisations of our own, such as scheduling and memory analysis.
-  This section also reports on our campaign to fuzz-test Vericert using over a hundred thousand
-  random C programs generated by Csmith~\cite{yang11_findin_under_bugs_c_compil} in order to confirm
-  that its correctness theorem is watertight.
+%\item In \in{Section}[sec:hls:evaluation], we evaluate Vericert on the PolyBench/C benchmark
+%  suite~\cite[pouchet13_polyh], and compare the performance of our generated hardware against an
+%  existing, unverified HLS tool called LegUp~\cite[canis11_legup]. We show that Vericert generates
+%  hardware that is \slowdownOrig$\times$ slower (\slowdownDiv$\times$ slower in the absence of
+%  division) and \areaIncr$\times$ larger than that generated by LegUp. This performance gap can be
+%  largely attributed to Vericert's current lack of support for instruction-level parallelism and the
+%  absence of an efficient, pipelined division operator. We intend to close this gap in the future by
+%  introducing (and verifying) HLS optimisations of our own, such as scheduling and memory analysis.
+%  This section also reports on our campaign to fuzz-test Vericert using over a hundred thousand
+%  random C programs generated by Csmith~\cite{yang11_findin_under_bugs_c_compil} in order to confirm
+%  that its correctness theorem is watertight.
 \stopitemize
 
 \section[sec:hls:design]{Designing a Verified HLS Tool}
@@ -190,7 +190,7 @@ control-flow using if-statements and case-statements.
     implementation in Verilog and it's corresponding state diagram.}]
   \startfloatcombination[nx=2, ny=1]
     \startplacesubfigure
-      \startframedtext[width={0.9\textwidth},frame=off,offset=none,loffset=3cm,bodyfont=11pt]
+      \startframedtext[width={0.6\textwidth},frame=off,offset=none,bodyfont=11pt]
       \starthlverilog
          module main(input rst, input y,
                      input clk, output reg z);
@@ -649,7 +649,7 @@ Vericert, the equivalence of the representation in terms of divisions and shifts
 integers and the specification, thereby making it simpler to prove the correctness of the Verilog
 implementation in terms of shifts.
 
-\section[title={A Formal Semantics for Verilog},reference={sec:verilog}]
+\section[title={A Formal Semantics for Verilog},reference={sec:hls:verilog}]
 
 This section describes the Verilog semantics that was chosen for the target language, including the
 changes that were made to the semantics to make it a suitable HLS target. The Verilog standard is
@@ -707,8 +707,8 @@ $v$.
 Finally, the following rule dictates how the whole module runs in one clock cycle:
 
 \placeformula\startformula \text{\sc Module }\ \frac{(\Gamma, \epsilon, \vec{m})\
-  \downarrow_{\text{module}} (\Gamma', \Delta')}{(\Gamma, \text{\tt module } \text{\tt main}
-  \text{\tt (...);}\ \vec{m}\ \text{\tt endmodule}) \downarrow_{\text{program}} (\Gamma'\ //\
+  \downarrow_{\text{module}} (\Gamma', \Delta')}{\left(\Gamma, \startmatrix\NC\text{\tt module}\ \text{\tt main}
+  \text{\tt (...);}\NR\NC\vec{m}\NR\NC \text{\tt endmodule}\NR\stopmatrix\right) \downarrow_{\text{program}} (\Gamma'\ //\
   \Delta')} \stopformula
 
 where $\Gamma$ is the initial state of all the variables, $\epsilon$ is the empty map because the
@@ -755,10 +755,12 @@ described in \in{Section}[sec:algorithm:optimisation:ram]), support for negative
 always-blocks was added to the semantics. This is shown in the modifications of the {\sc Module}
 rule shown below:
 
-\placeformula\startformula \text{\sc Module }\ \frac{(\Gamma, \epsilon, \vec{m})\
-  \downarrow_{\text{module}^{+}} (\Gamma', \Delta') \\ (\Gamma'\ //\ \Delta', \epsilon, \vec{m})
-  \downarrow_{\text{module}^{-}} (\Gamma'', \Delta'')}{(\Gamma, \text{\tt module}\ \text{\tt main}
-  \text{\tt (...);}\ \vec{m}\ \text{\tt endmodule}) \downarrow_{\text{program}} (\Gamma''\ //\
+\placeformula\startformula \text{\sc Module }\ \frac{
+\startmatrix
+(\NC \Gamma, \epsilon, \vec{m})\
+  \downarrow_{\text{module}^{+}} (\Gamma', \Delta') \NR\NC (\Gamma'\ //\ \Delta', \epsilon, \vec{m})
+  \downarrow_{\text{module}^{-}} (\Gamma'', \Delta'')\NR\stopmatrix}{\left(\Gamma, \startmatrix\NC\text{\tt module}\ \text{\tt main}
+  \text{\tt (...);}\NR\NC\vec{m}\NR\NC \text{\tt endmodule}\NR\stopmatrix\right) \downarrow_{\text{program}} (\Gamma''\ //\
   \Delta'')} \stopformula
 
 The main execution of the module $\downarrow_{\text{module}}$ is split into
@@ -789,7 +791,7 @@ inputs for Verilog modules.
 Finally, we use 32-bit integers to represent bitvectors rather than arrays of booleans. This is
 because Vericert (currently) only supports types represented by 32 bits.
 
-\subsection[sec:verilog:integrating]{Integrating the Verilog Semantics into CompCert's Model}
+\subsection[sec:hls:verilog:integrating]{Integrating the Verilog Semantics into CompCert's Model}
 
 \startplacefigure[reference={fig:inference_module},title={Top-level semantics used in CompCert to
     initiate the first call to \type{main} and well as return the final result of the \type{main}
@@ -880,7 +882,7 @@ for the \type{main} function of program $P$, where this \type{main} function nee
 current translation unit. The final state results in the program output of value $n$ when reaching a
 \type{Returnstate} with an empty stack frame.
 
-\subsection[sec:verilog:memory]{Memory Model}
+\subsection[sec:hls:verilog:memory]{Memory Model}
 
 The Verilog semantics do not define a memory model for Verilog, as this is not needed for a hardware
 description language. There is no preexisting architecture that Verilog will produce; it can
@@ -983,7 +985,7 @@ $\Delta_{\rm a}$. The optional values are present to ensure correct merging of t
 maps at the end of the clock cycle. The invariant used in the proofs is that block 0 should be
 equivalent to the merged representation of the $\Gamma_{\rm a}$ and $\Delta_{\rm a}$ maps.
 
-\section[sec:proof]{Correctness Proof}
+\section[sec:hls:proof]{Correctness Proof}
 
 Now that the Verilog semantics have been adapted to the CompCert model, we are in a position to
 formally prove the correctness of our C-to-Verilog compilation. This section describes the main
@@ -998,7 +1000,7 @@ Verilog and CompCert's existing intermediate languages.
 
 \startitemize
 \item
-  As already mentioned in \in{Section}[sec:verilog:memory], because the memory model in our Verilog
+  As already mentioned in \in{Section}[sec:hls:verilog:memory], because the memory model in our Verilog
   semantics is finite and concrete, but the CompCert memory model is more abstract and infinite with
   additional bounds, the equivalence of these models needs to be proven.  Moreover, our memory is
   word-addressed for efficiency reasons, whereas CompCert's memory is byte-addressed.
@@ -1067,13 +1069,13 @@ simulation result (following standard practice), which makes it easier to prove.
 forward simulation, it suffices to prove forward simulations between each pair of consecutive
 intermediate languages, as these results can be composed to prove the correctness of the whole HLS
 tool. The forward simulation from 3AC to HTL is stated in \in{Lemma}[lemma:htl]
-(\in{Section}[sec:proof:3ac_htl]), the forward simulation for the RAM insertion is shown in
-\in{Lemma}[lemma:htl_ram] (\in{Section}[sec:proof:ram_insertion]), then the forward simulation
-between HTL and Verilog is shown in \in{Lemma}[lemma:verilog] (\in{Section}[sec:proof:htl_verilog]),
+(\in{Section}[sec:hls:proof:3ac_htl]), the forward simulation for the RAM insertion is shown in
+\in{Lemma}[lemma:htl_ram] (\in{Section}[sec:hls:proof:ram_insertion]), then the forward simulation
+between HTL and Verilog is shown in \in{Lemma}[lemma:verilog] (\in{Section}[sec:hls:proof:htl_verilog]),
 and finally, the proof that Verilog is deterministic is given in \in{Lemma}[lemma:deterministic]
-(\in{Section}[sec:proof:deterministic]).
+(\in{Section}[sec:hls:proof:deterministic]).
 
-\subsection[sec:proof:3ac_htl]{Forward Simulation from 3AC to HTL}
+\subsection[sec:hls:proof:3ac_htl]{Forward Simulation from 3AC to HTL}
 
 As HTL is quite far removed from 3AC, this first translation is the most involved and therefore
 requires a larger proof, because the translation from 3AC instructions to Verilog statements needs
@@ -1098,7 +1100,7 @@ show that the specification implies the desired simulation result. This strategy
 standard practice.
 \stopproof
 
-\subsubsection[sec:proof:3ac_htl:specification]{From Implementation to Specification}
+\subsubsection[sec:hls:proof:3ac_htl:specification]{From Implementation to Specification}
 
 The specification for the translation of 3AC instructions into HTL data-path and control logic can
 be defined by the following predicate:
@@ -1108,8 +1110,8 @@ be defined by the following predicate:
 
 Here, the $\mathit{control}$ and $\mathit{data}$ parameters are the statements that the current 3AC
 instruction $i$ should translate to. The other parameters are the special registers defined in
-Section~\goto{{[}sec:verilog:integrating{]}}[sec:verilog:integrating].  An example of a rule
-describing the translation of an arithmetic/logical operation from 3AC is the following:
+\in{Section}[sec:hls:verilog:integrating].  An example of a rule describing the translation of an
+arithmetic/logical operation from 3AC is the following:
 
 \startformula
   \text{\sc Iop}\ \frac{\text{\tt tr\_op}\ \mathit{op}\ \vec{a} = \text{\tt OK}\ e}{\text{\tt spec\_instr}\
@@ -1190,7 +1192,7 @@ This simulation diagram is proven by induction over the operational semantics of
 matching state.
 \stopproof
 
-\subsection[sec:proof:ram_insertion]{Forward Simulation of RAM Insertion}
+\subsection[sec:hls:proof:ram_insertion]{Forward Simulation of RAM Insertion}
 
 \startplacefigure[reference={fig:htl_ram_spec},title={Specification for the memory implementation in
     HTL, where $r$ is the RAM, which is then implemented by equivalent Verilog code.}]
@@ -1275,7 +1277,7 @@ introduced by the RAM.
 \stopformula
 \stoplemma
 
-\subsection[sec:proof:htl_verilog]{Forward Simulation from HTL to Verilog}
+\subsection[sec:hls:proof:htl_verilog]{Forward Simulation from HTL to Verilog}
 
 The HTL-to-Verilog simulation is conceptually simple, as the only transformation is from the map
 representation of the code to the case-statement representation. The proof is more involved, as the
@@ -1301,7 +1303,7 @@ converted.
   constructors need to be called to get to the correct case.
 \stopproof
 
-\subsection[sec:proof:deterministic]{Deterministic Verilog Semantics}
+\subsection[sec:hls:proof:deterministic]{Deterministic Verilog Semantics}
 
 The final lemma we need is that the \pindex{determinism}Verilog semantics is deterministic.  This
 result allows us to replace the forwards simulation we have proved with the backwards simulation we
diff --git a/chapters/schedule.tex b/chapters/schedule.tex
index c1bb4c3..73cb4b2 100644
--- a/chapters/schedule.tex
+++ b/chapters/schedule.tex
@@ -12,10 +12,9 @@
 
 The structure of the thesis has been described in the previous chapters.  Out of these chapters, the
 work for \in{Chapter}[sec:hls] has been completed, the work for \in{Chapter}[sec:scheduling] is
-still ongoing and finally the work in \in{Chapter}[sec:pipelining] and \in{Chapter}[sec:dynamic] has
-not been started yet.  This chapter will describe how the rest of the work will be completed.  The
-current plan is to finish all the work including the thesis by June 2023, which is when the funding
-will run out.
+still ongoing and finally the work in \in{Chapter}[sec:pipelining] has not been started yet.  This
+chapter will describe how the rest of the work will be completed.  The current plan is to finish all
+the work including the thesis by June 2023, which is when the funding will run out.
 
 \section{Timeline}
author	Yann Herklotz <git@yannherklotz.com>	2022-05-06 15:31:30 +0100
committer	Yann Herklotz <git@yannherklotz.com>	2022-05-06 15:31:30 +0100
commit	f6add5b74e9ec6e251d8c744d3084f5b869f8d48 (patch)
tree	f6ede425e9d77d14b41c2c50ce1ac5e67fa45e50
parent	29abf1b418989b544295e004a215c0e9cfb18280 (diff)
download	lsr22_fvhls-f6add5b74e9ec6e251d8c744d3084f5b869f8d48.tar.gz lsr22_fvhls-f6add5b74e9ec6e251d8c744d3084f5b869f8d48.zip