path: root/algorithm.tex
author: John Wickerson <j.wickerson@imperial.ac.uk> 2021-08-10 20:51:02 +0000
committer: node <node@git-bridge-prod-0> 2021-08-11 08:07:42 +0000
commit: 7837c2ecb1c7d34efb50a94a4981d19af2926c8d
tree: 66d242dbc106ae22065c401df8453b0215d40448 /algorithm.tex
parent: 4a545910310e186c82603e90a1e0292a5d1da29d
Update on Overleaf.
Diffstat (limited to 'algorithm.tex')
-rw-r--r-- algorithm.tex 17
1 files changed, 9 insertions, 8 deletions
diff --git a/algorithm.tex b/algorithm.tex
index c367917..8793e41 100644
--- a/algorithm.tex
+++ b/algorithm.tex
@@ -67,8 +67,7 @@ The .NET framework has been used as a basis for other HLS tools, such as Kiwi~\c
\draw[->,thick] (htl) -- (verilog);
\draw[->,thick] (htl.west) to [out=180,in=150] (4,-2.2) to [out=330,in=270] (htl.south);
\end{tikzpicture}%}
- \caption{\vericert{} as a Verilog back end to \compcert{}. For scale, the approximate lines of code (kloc) are shown for \vericert{}, as well as for the front end and back end of \compcert{}, including any comments and whitespace.
-%\JW{Did we ought to add CompCert's other back ends to the diagram? X86 etc? Otherwise it might look like we have a very out-of-date view of CompCert.}%
+ \caption{\vericert{} as a Verilog back end to \compcert{}. For scale, the approximate lines of code (kloc) are shown for \vericert{}, as well as for the front end and back end of \compcert{}, including any comments and whitespace. \JW{Nice. I like the inclusion of the LOCs -- it makes the diagram do a bit more work.}
}%
\label{fig:rtlbranch}
\end{figure}
@@ -87,13 +86,13 @@ It has an unlimited number of pseudo-registers, and is represented as a control
\subsection{An introduction to Verilog}
-This section will introduce Verilog for readers that may not be familiar with the language, introducing the parts of the language that are used in the output of \vericert{}. Verilog is a hardware description language (HDL) and is used to design hardware from CPU's which are eventually produced as an integrated circuit, or small application specific hardware accelerators that are placed on an FPGA. Verilog is a populare language, because it allows for fine-grained control over the hardware, as well as some high-level constructs to simplify the development. For example, a net list with transfer delay information could be implemented in Verilog, as well as a more abstract state machine which would first have to translated to the net list level.
+This section will introduce Verilog for readers that may not be familiar with the language, concentrating on the features that are used in the output of \vericert{}. Verilog is a hardware description language (HDL) and is used to design hardware ranging from complete CPUs that are eventually produced as an integrated circuit, to small application-specific accelerators that are placed on an FPGA. Verilog is a popular language because it allows for fine-grained control over the hardware, and also provides high-level constructs to simplify the development. For example, a net list with transfer delay information could be implemented in Verilog, as well as a more abstract state machine which would first have to translated to the net list level. \JW{I don't like that last sentence because I don't think it's pitched to the right audience. If you don't know Verilog, you won't understand `net list' or `transfer delay'. I'd cut it.}
-Verilog behaves quite differently to standard software programming languages due to it having to express the parallel nature of hardware. The basic construct to achieve this is an always-block, which is a block of assignments that are executed every time some event occurs. In the case of \vericert{}, this event is either a positive or a negative clock edge. Each always block triggering at the same event will be executed in parallel. In addition to that, control-flow can also be expressed in always-blocks using if-statements or case-statements.
+Verilog behaves quite differently to standard software programming languages due to it having to express the parallel nature of hardware. The basic construct to achieve this is the always-block, which is a block \JW{Can we avoid repeating `block'? Like, can we say `collection' here? It just feels a bit inelegant to say `an always-block is a block...'.} of assignments that are executed every time some event occurs. In the case of \vericert{}, this event is either a positive \JW{(rising)} or a negative \JW{(falling)} clock edge. All always-blocks triggering on the same event are executed in parallel. Always-blocks can also express control-flow using if-statements and case-statements.
-A simple state machine can therefore be implemented as follows:
+A simple state machine can therefore be implemented as follows: \JW{This should be floated out into a figure. And might it be nice to put a little picture of the corresponding state machine in the gap on the right-hand side?}
-\begin{minted}[linenos,xleftmargin=20pt]{verilog}{verilog}
+\begin{minted}[linenos,xleftmargin=20pt]{verilog}
module main(input state, input y, input clk, output z);
reg x;
always @(posedge clk)
@@ -109,7 +108,7 @@ module main(input state, input y, input clk, output z);
endmodule
\end{minted}
-At every positive edge of the clock (\texttt{clk}), both the always-blocks will trigger simultaneously. The first always-block controls the values of the internal state of the register \texttt{x} and the output \texttt{z}, whereas the second always-block controls the next state the state machine should go to. When the input \texttt{state} is 0, \texttt{x} will be assigned to the input \texttt{y} using nonblocking assignment, denoted by \texttt{<=}. Nonblocking assignment assigns registers in parallel at the end of the clock cycle, instead of sequentially throughout the always-block. In the second always block, the input \texttt{y} will be checked, and if it's high it will move on to the next state, otherwise it will stay in the current state. When \texttt{state} is 1, the first always block will reset the value of \texttt{x} and then set \texttt{z} to the original value of \texttt{x}, as nonblocking assignment does not change it's value until the end of the clock cycle. Finally, the last always block will set the state to be 0 again.
+At every positive edge of the clock (\texttt{clk}), both of the always-blocks will trigger simultaneously. The first always-block controls the values of the internal state of the register \JW{Is it ok to simplify `the values of the internal state of the register' to `the values in the register'?} \texttt{x} and the output \texttt{z}, while the second always-block controls the next state the state machine should go to. When the input \texttt{state} is 0, \texttt{x} will be assigned to the input \texttt{y} using nonblocking assignment, denoted by \texttt{<=}. Nonblocking assignment assigns registers in parallel at the end of the clock cycle, rather than sequentially throughout the always-block. In the second always-block, the input \texttt{y} will be checked, and if it's high it will move on to the next state, otherwise it will stay in the current state. When \texttt{state} is 1, the first always-block will reset the value of \texttt{x} and then set \texttt{z} to the original value of \texttt{x}, since nonblocking assignment does not change its value until the end of the clock cycle. Finally, the last always-block will set the state to be 0 again.
\begin{figure}
\centering
@@ -343,7 +342,9 @@ A high-level overview of the architecture and of the RAM interface can be seen i
Most 3AC instructions correspond to hardware constructs.
%Each 3AC instruction either corresponds to a hardware construct or does not have to be handled by the translation, such as function calls (because of inlining). \JW{Are function calls the only 3AC instruction that we ignore? (And I guess return statements too for the same reason.)}\YH{Actually, return instructions are translated (because you can return from main whenever), so call instructions (Icall, Ibuiltin and Itailcall) are the only functions that are not handled.}
% JW: Thanks; please check proposed new text.
-For example, line 2 in Figure~\ref{fig:accumulator_rtl} shows a 32-bit register \texttt{x5} being initialised to 3, after which the control flow moves execution to line 3. This initialisation is also encoded in the Verilog generated from HTL at state 8 in both the control logic and data-path always-blocks, shown at lines 33 and 16 respectively in Figure~\ref{fig:accumulator_v}. Simple operator instructions are translated in a similar way. For example, the add instruction is just translated to the built-in add operator, similarly for the multiply operator. All 32-bit instructions can be translated in this way, but some special instructions require extra care. One such is the \texttt{Oshrximm} instruction, which is discussed further in Section~\ref{sec:algorithm:optimisation:oshrximm}. Another is the \texttt{Oshldimm} instruction, which is a left rotate instruction that has no Verilog equivalent and therefore has to be implemented in terms of other operations and proven to be equivalent. In addition to any non-32-bit operations, the remaining instructions that we do not translate are those related to function calls (\texttt{Icall}, \texttt{Ibuiltin}, and \texttt{Itailcall}), because we enable inlining by default.
+For example, line 2 in Figure~\ref{fig:accumulator_rtl} shows a 32-bit register \texttt{x5} being initialised to 3, after which the control flow moves execution to line 3. This initialisation is also encoded in the Verilog generated from HTL at state 8 in both the control logic and data-path always-blocks, shown at lines 33 and 16 respectively in Figure~\ref{fig:accumulator_v}. Simple operator instructions are translated in a similar way. For example, the add instruction is just translated to the built-in add operator, similarly for the multiply operator. All 32-bit instructions can be translated in this way, but some special instructions require extra care. One such is the \texttt{Oshrximm} instruction, which is discussed further in Section~\ref{sec:algorithm:optimisation:oshrximm}. Another is the \texttt{Oshldimm} instruction, which is a left rotate instruction that has no Verilog equivalent and therefore has to be implemented in terms of other operations and proven to be equivalent. \JW{I reverted the following back to my version because it seems clearer to me.}
+% In addition to any non-32-bit operations, the remaining
+The only 32-bit instructions that we do not translate are those related to function calls (\texttt{Icall}, \texttt{Ibuiltin}, and \texttt{Itailcall}), because we enable inlining by default.
\subsubsection{Translating HTL to Verilog}