Add more content to algorithm

author: Yann Herklotz <git@yannherklotz.com> 2020-07-01 18:50:18 +0100
committer: Yann Herklotz <git@yannherklotz.com> 2020-07-01 18:50:18 +0100
commit: c8ace4645eb389f95fcad9b499fee03c539b943e (patch)
tree: 0ca835c7bc119ec05558528eb71b6ca8b6b5ce19
parent: ae2883a6d42c36d19783061f7217994ebec9d9d6 (diff)
download: oopsla21_fvhls-c8ace4645eb389f95fcad9b499fee03c539b943e.tar.gz
oopsla21_fvhls-c8ace4645eb389f95fcad9b499fee03c539b943e.zip
2 files changed, 17 insertions, 3 deletions
diff --git a/algorithm.tex b/algorithm.tex
index c4d4838..90e7644 100644
--- a/algorithm.tex
+++ b/algorithm.tex
@@ -1,6 +1,6 @@
 \section{Turning CompCert into an HLS tool}
 
-%% Should maybe go in the introduction instead. 
+%% Should maybe go in the introduction instead.
 
 \begin{figure}
   \centering
@@ -105,7 +105,21 @@ Going back to to the previous accumulator example, the HTL code that is generate
 
 The translation from RTL to HTL is quite straightforward, as each RTL instruction either matches up quite well to a hardware construct, or does not have to be handled by the translation, such as function calls.  At each instruction, the control flow is separated from the data computation and is then added to the control logic and data-flow map respectively.  For example, in state 16 in the RTL code in figure~\ref{fig:accumulator_rtl}, the register \texttt{x9} is initialised to one and then moves on to state 15.  This is encoded in HTL by initialising a 32 bit register \texttt{reg\_9} to one as well in the data-flow graph section, and also adding a state transition to the state 15 in the controllogic transition.  Simple operator instructions are translated in a similar way.  For example, in state 5, the value in the array is added to the current value of the accumulated sum, which is simply translated to an addition of the equivalent registers in the HTL code.
 
-In the accumulator C code, the for loop is translated to a branch statement in state 3, which can also be translated to equivalent HTL code by placing the comparison into the control logic part of the main function, and not performing any data operations in the data-flow section.  On the next clock cycle, the state will therefore transition to state 7 or state 2 depending on if \texttt{reg\_3} is less than 3 or not.  One thing to note is that in this case, the comparison is signed.  By default, all operators and registers in Verilog and HTL are unsigned, so to force an operation to handle the bits as signed, both operators have to be forced to signed, otherwise the operation will
+The translation can be described by a relational specification, which can be described by the following relation, where \textit{fin}, \textit{rtrn}, $\sigma$ and \textit{stk} are the registers for the finished signal, return value, current state and stack respectively, $i$ is the RTL instruction being translated, and \textit{data} and \textit{control} are the data-flow and control logic map respectively.
+
+\begin{equation*}
+  \yhfunction{tr\_instr } \textit{fin rtrn }\ \sigma\ \textit{stk }\ i\ \textit{data }\ \textit{control}
+\end{equation*}
+
+\noindent An example of a rule describing the translation of an operation is the following:
+
+\begin{equation*}
+  \inferrule[Iop]{\yhfunction{tr\_op } \textit{op }\ \vec{a} = \yhconstant{OK } e}{\yhfunction{tr\_instr } \textit{fin rtrn }\ \sigma\ \textit{stk }\ (\yhconstant{Iop } \textit{op }\ \vec{a}\ d\ n)\ (d \Leftarrow e)\ (\sigma \Leftarrow n)}
+\end{equation*}
+
+In the accumulator C code, the for loop is translated to a branch statement in state 3, which can also be translated to equivalent HTL code by placing the comparison into the control logic part of the main function, and not performing any data operations in the data-flow section.  On the next clock cycle, the state will therefore transition to state 7 or state 2 depending on if \texttt{reg\_3} is less than 3 or not.  One thing to note is that in this case, the comparison is signed.  By default, all operators and registers in Verilog and HTL are unsigned, so to force an operation to handle the bits as signed, both operators have to be forced to signed.  In addition to that, Verilog resizes expressions to the largest needed size by default, which can affect the result of the computation.  This feature is also not supported by the Verilog semantics we used, and there would therefore be a mismatch between the Verilog semantics and the actual behaviour of Verilog according to the standard.  To bypass this issue braces are used to stop the Verilog simulator or synthesis tool from resizing anything inside the braces.  Instead, explicit resizing is used in the semantics and operations can only be performed on two registers that have the same size.
+
+Finally, this example also contains an array, which is stored on the stack for the main function.  Translating arrays means that load and store instructions need to be supported.  In hardware RAM is not as available as in software, and needs to be explicitly implemented by declaring a two dimensional array with specific properties, such as only being able to write and read from it once per clock cycle.  Therefore, to make memory as efficient as possible and support reads and writes in one clock cycle, a memory only addressable by word needs to be implemented, instead of the standard RAM which is addressable by byte.  This is because the clock cycles in devices such as FPGAs are normally an order of magnitude slower than CPUs, meaning taking four cycles to load an integer is not feasible.  However, this leads to various problems, the main one being that addresses need to be word-aligned, which is not the case in RTL or in C.  Therefore, addresses need to be divisible by four to also be valid addresses in HTL, which is shown in the load from the stack in state 6 of the RTL.  The equivalent HTL code is shown in the datapath branch where the same addition is done, however, the result is divided by 4.  To prove that these two loads have the same behaviour, we have to prove that
 
 \subsection{Verilog}
 
diff --git a/main.tex b/main.tex
index 3a4c120..c07b93a 100644
--- a/main.tex
+++ b/main.tex
@@ -52,7 +52,7 @@
 \setminted{fontsize=\small}
 
 \newif\ifCOMMENTS
-\COMMENTStrue
+\COMMENTSfalse
 \newcommand{\Comment}[3]{\ifCOMMENTS\textcolor{#1}{{\bf [\![#2:} #3{\bf ]\!]}}\fi}
 \newcommand\JW[1]{\Comment{red!75!black}{JW}{#1}}
 \newcommand\YH[1]{\Comment{green!50!blue}{YH}{#1}}
author	Yann Herklotz <git@yannherklotz.com>	2020-07-01 18:50:18 +0100
committer	Yann Herklotz <git@yannherklotz.com>	2020-07-01 18:50:18 +0100
commit	c8ace4645eb389f95fcad9b499fee03c539b943e (patch)
tree	0ca835c7bc119ec05558528eb71b6ca8b6b5ce19
parent	ae2883a6d42c36d19783061f7217994ebec9d9d6 (diff)
download	oopsla21_fvhls-c8ace4645eb389f95fcad9b499fee03c539b943e.tar.gz oopsla21_fvhls-c8ace4645eb389f95fcad9b499fee03c539b943e.zip