summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorYann Herklotz <git@yannherklotz.com>2021-04-15 10:42:47 +0100
committerYann Herklotz <git@yannherklotz.com>2021-04-15 10:42:47 +0100
commit968d2ef34f10ccf1d458d45fd8a49c7985fba8eb (patch)
tree784d38f7ddd73db0127ac45e4e35b1e1b1b77819
parent2bf1ff846a792ef3007fa1e97928697509f318f7 (diff)
downloadoopsla21_fvhls-968d2ef34f10ccf1d458d45fd8a49c7985fba8eb.tar.gz
oopsla21_fvhls-968d2ef34f10ccf1d458d45fd8a49c7985fba8eb.zip
Fix spelling
-rw-r--r--algorithm.tex4
-rw-r--r--appendix.tex2
-rw-r--r--proof.tex2
-rw-r--r--verilog.tex8
4 files changed, 8 insertions, 8 deletions
diff --git a/algorithm.tex b/algorithm.tex
index 2241ee6..21b25eb 100644
--- a/algorithm.tex
+++ b/algorithm.tex
@@ -46,7 +46,7 @@ The .NET framework has been used as a basis for other HLS tools, such as Kiwi~\c
\node[language] at (6.7,-1.5) (verilog) {Verilog};
\node at (0,1) {\bf\compcert{}};
\node at (0,-1.5) {\bf\vericert{}};
- \node[align=center] at (3.3,-2.4) {\footnotesize Memory\\[-0.5em]\footnotesize Inference};
+ \node[align=center] at (3.3,-2.4) {\footnotesize RAM\\[-0.5em]\footnotesize generation};
\draw[->,thick] (clight) -- (conta);
\draw[->,thick] (conta) -- (cminor);
\draw[->,thick] (cminor) -- (rtl);
@@ -332,7 +332,7 @@ However, the memory model that \compcert{} uses for its intermediate languages i
\subsubsection{Implementation of RAM templates}
-Verilog arrays can be used in a variety of ways, however, these do not all produce optimal hardware designs. If, for example, arrays in Verilog are accessed immediately in the data-path, then the synthesis tool is not be able to identify it as having the right properties for a RAM, and would instead implement the array using registers. This is extremely expensive, and for large memories this can easily blow up the area usage of the FPGA, and because of the longer wires that are needed, it would also affect the performance of the circuit. The synthesis tools therefore provide code snippets that they know how to transform into various constructs, including snippets that will generate proper RAMs in the final hardware. This process is called memory inferrence. The initial translation from 3AC to HTL converts loads and stores to direct accesses to the memory, as this preserves the same behaviour without having to insert more registers and logic. We therefore have another pass from HTL to itself which performs the translation from this na\"ive use of arrays to a representation which always allows for memory inferrence. This pass creates a separate always block to perform the loads and stores to the memory, and adds the necessary data, address and enable signals to communicate with that always-block from other always-blocks. This always-block is shown between lines 10-15 in Figure~\ref{fig:accumulator_v}.
+Verilog arrays can be used in a variety of ways, however, these do not all produce optimal hardware designs. If, for example, arrays in Verilog are accessed immediately in the data-path, then the synthesis tool is not be able to identify it as having the right properties for a RAM, and would instead implement the array using registers. This is extremely expensive, and for large memories this can easily blow up the area usage of the FPGA, and because of the longer wires that are needed, it would also affect the performance of the circuit. The synthesis tools therefore provide code snippets that they know how to transform into various constructs, including snippets that will generate proper RAMs in the final hardware. This process is called memory inference. The initial translation from 3AC to HTL converts loads and stores to direct accesses to the memory, as this preserves the same behaviour without having to insert more registers and logic. We therefore have another pass from HTL to itself which performs the translation from this na\"ive use of arrays to a representation which always allows for memory inference. This pass creates a separate always block to perform the loads and stores to the memory, and adds the necessary data, address and enable signals to communicate with that always-block from other always-blocks. This always-block is shown between lines 10-15 in Figure~\ref{fig:accumulator_v}.
There are two interesting parts to this RAM template. Firstly, the memory updates are triggered on the negative edge of the clock, out of phase with the rest of the design which is triggered on the positive edge of the clock. The main advantage is that instead of loads and stores taking three clock cycles and two clock cycles respectively, they only take two clock cycles and one clock cycle instead, greatly improving their performance. In addition to that, using the negative edge for the clock is supported by many synthesis tools, it therefore does not affect the maximum frequency of the final design.
diff --git a/appendix.tex b/appendix.tex
index defe44d..8d41fba 100644
--- a/appendix.tex
+++ b/appendix.tex
@@ -51,7 +51,7 @@
\inferrule[Nonblocking Array]{\yhkeyword{name}\ \textit{lhs} = \yhkeyword{OK}\ n \\ \textit{erun}\ \Gamma\ \textit{rhs}\ v_{\textit{rhs}}}{\textit{srun}\ (\Gamma_{r}, \Gamma_{a}, \Delta_{r}, \Delta_{a})\ (\textit{lhs} \Leftarrow \textit{rhs})\ (\Gamma_{r}, \Gamma_{a}, \Delta_{r} // \{n \rightarrow v_{\textit{rhs}}\}, \Delta_{a})}
\end{gather*}
\end{minipage}
- \caption{Inference rules for statements.}\label{fig:inferrence_statements}
+ \caption{Inference rules for statements.}\label{fig:inference_statements}
\end{figure*}
\begin{figure}
diff --git a/proof.tex b/proof.tex
index 1b1f234..9d250e0 100644
--- a/proof.tex
+++ b/proof.tex
@@ -143,7 +143,7 @@ The final lemma we need is that the Verilog we generate is deterministic. This r
\end{lemma}
\begin{proof}[Proof sketch]
- The Verilog semantics is deterministic because the order of operation of all the constructs is defined, and there is therefore only one way to evaluate the module and hence only one possible behaviour. This was proven for the small-step semantics shown in Figure~\ref{fig:inferrence_module}.
+ The Verilog semantics is deterministic because the order of operation of all the constructs is defined, and there is therefore only one way to evaluate the module and hence only one possible behaviour. This was proven for the small-step semantics shown in Figure~\ref{fig:inference_module}.
\end{proof}
%\subsection{Coq Mechanisation}
diff --git a/verilog.tex b/verilog.tex
index e95889c..c698b51 100644
--- a/verilog.tex
+++ b/verilog.tex
@@ -52,7 +52,7 @@ always @(posedge clk) begin x[0] = 1; x[1] <= 1; end
which modifies one array element using blocking assignment and then a second using nonblocking assignment. If the existing semantics were used to update the array, then during the merge, the entire array \texttt{x} from the nonblocking association map would replace the entire array from the blocking association map. This would replace \texttt{x[0]} with its original value and therefore behave incorrectly. Accordingly, we modified the maps so they record updates on a per-el\-em\-ent basis. Our state $\Gamma$ is therefore split up into $\Gamma_{r}$ for instantaneous updates to variables, and $\Gamma_{a}$ for instantaneous updates to arrays; $\Delta$ is split similarly. The merge function then ensures that only the modified indices get updated when $\Gamma_{a}$ is merged with the nonblocking map equivalent $\Delta_{a}$.
\paragraph{Adding negative edge support}
-To support memory inferrence efficiently and create and reason about a circuit that executes at the negative edge of the clock, support for the negative edge triggered \alwaysblock{}s was added to the semantics. This is shown in the modifications of the \textsc{Module} rule shown below:
+To support memory inference efficiently and create and reason about a circuit that executes at the negative edge of the clock, support for the negative edge triggered \alwaysblock{}s was added to the semantics. This is shown in the modifications of the \textsc{Module} rule shown below:
\begin{equation*}
\inferrule[Module]{(\Gamma, \epsilon, \vec{m})\ \downarrow_{\text{module}^{+}} (\Gamma', \Delta') \\ (\Gamma'\ //\ \Delta', \epsilon, \vec{m}) \downarrow_{\text{module}^{-}} (\Gamma'', \Delta'')}{(\Gamma, \yhkeyword{module } \yhconstant{main} \yhkeyword{(...);}\ \vec{m}\ \yhkeyword{endmodule}) \downarrow_{\text{program}} (\Gamma''\ //\ \Delta'')}
@@ -83,7 +83,7 @@ The main execution of the module $\downarrow_{\text{module}}$ is split into $\do
\end{gather*}
\end{minipage}
\caption{Top-level small-step semantics for Verilog modules in \compcert{}'s computational framework.}%
- \label{fig:inferrence_module}
+ \label{fig:inference_module}
\end{figure*}
The \compcert{} computation model defines a set of states through which execution passes. In this subsection, we explain how we extend our Verilog semantics with five special-purpose registers in order to integrate it into \compcert{}.
@@ -104,10 +104,10 @@ To support this computational model, we extend the Verilog module we generate wi
\item[stack] The function stack can be modelled as a RAM block, which is implemented using an array in the module, and denoted as \textit{stk}.
\end{description}
-Figure~\ref{fig:inferrence_module} shows the inference rules for moving between the computational states. The first, \textsc{Step}, is the normal rule of execution. It defines one step in the \texttt{State} state, assuming that the module is not being reset, that the finish state has not been reached yet, that the current and next state are $v$ and $v'$, and that the module runs from state $\Gamma$ to $\Gamma'$ using the \textsc{Step} rule. The \textsc{Finish} rule returns the final value of running the module and is applied when the \textit{fin} register is set; the return value is then taken from the \textit{ret} register.
+Figure~\ref{fig:inference_module} shows the inference rules for moving between the computational states. The first, \textsc{Step}, is the normal rule of execution. It defines one step in the \texttt{State} state, assuming that the module is not being reset, that the finish state has not been reached yet, that the current and next state are $v$ and $v'$, and that the module runs from state $\Gamma$ to $\Gamma'$ using the \textsc{Step} rule. The \textsc{Finish} rule returns the final value of running the module and is applied when the \textit{fin} register is set; the return value is then taken from the \textit{ret} register.
Note that there is no step from \texttt{State} to \texttt{Callstate}; this is because function calls are not supported, and it is therefore impossible in our semantics to ever reach a \texttt{Callstate} except for the initial call to main. So the \textsc{Call} rule is only used at the very beginning of execution; likewise, the \textsc{Return} rule is only matched for the final return value from the main function. %as there is no rule that allocates another stack frame \textit{sf} except for the initial call to main.
-Therefore, in addition to the rules shown in Figure~\ref{fig:inferrence_module}, an initial state and final state need to be defined:
+Therefore, in addition to the rules shown in Figure~\ref{fig:inference_module}, an initial state and final state need to be defined:
\begin{gather*}
\inferrule[Initial]{\yhfunction{is\_internal}\ (P.\texttt{main})}{\yhfunction{initial\_state}\ (\yhconstant{Callstate } []\ (P.\texttt{main})\ [])}\qquad