Push

author: Yann Herklotz <git@yannherklotz.com> 2020-11-19 10:01:56 +0000
committer: Yann Herklotz <git@yannherklotz.com> 2020-11-19 10:01:56 +0000
commit: ce1b8290fbacf709a1a5320c11e9d2157dadcc8f (patch)
tree: 8449fe4bc0d58102fa097357eb6575109aa47fdd /verilog.tex
parent: 8745901dc8b53042cc004049b2dfc49d0f7651fd (diff)
download: oopsla21_fvhls-ce1b8290fbacf709a1a5320c11e9d2157dadcc8f.tar.gz
oopsla21_fvhls-ce1b8290fbacf709a1a5320c11e9d2157dadcc8f.zip
1 files changed, 21 insertions, 27 deletions
diff --git a/verilog.tex b/verilog.tex
index 5768a43..2ca42dd 100644
--- a/verilog.tex
+++ b/verilog.tex
@@ -2,67 +2,63 @@
 
 \newcommand{\alwaysblock}{\texttt{always}-block}
 
-This section describes the Verilog semantics that were \JW{was} chosen for the target language, including the changes that were made to the semantics to make it \JWcouldcut{a better fit} a \JW{suitable} HLS target.  The Verilog standard is quite large~\cite{06_ieee_stand_veril_hardw_descr_languag,05_ieee_stand_veril_regis_trans_level_synth}, but the syntax and semantics can be reduced to a small subset that \vericert{} needs to target.
-\NR{Have we discussed what our options were and why we chose the HOL4 semantics?} \JW{Good point -- we should cite the Maude semantics too. I think that's the only viable alternative.}\YH{Yes, I should make that more clear, I cited Meredith semantics in the algorithm section, will add maude as well, the main reason is practicality and we don't need continuous assignment, basically the same reasons that Loow used to justify creating his own semantics.}
+This section describes the Verilog semantics that was chosen for the target language, including the changes that were made to the semantics to make it a suitable HLS target.  The Verilog standard is quite large~\cite{06_ieee_stand_veril_hardw_descr_languag,05_ieee_stand_veril_regis_trans_level_synth}, but the syntax and semantics can be reduced to a small subset that \vericert{} needs to target.
 
 The Verilog semantics we use is ported to Coq from a semantics written in HOL4  by \citet{loow19_proof_trans_veril_devel_hol}, used to prove the translation from HOL4 to Verilog~\cite{loow19_verif_compil_verif_proces}.% which was used to create a formal translation from a logic representation encoded in the HOL4~\cite{slind08_brief_overv_hol4} theorem prover into an equivalent Verilog design. 
-This semantics is quite practical as it is restricted to a small subset of Verilog, which can nonetheless be used to model all hardware constructs one would want to design.  The main features that are not supported by the syntax and semantics are continuous assignment and combinational always blocks.
-\NR{Shall we use special font fo always-blocks: maybe \alwaysblock{}}
+This semantics is quite practical as it is restricted to a small subset of Verilog, which can nonetheless be used to model all hardware constructs one would want to design.  The main features that are not supported by the syntax and semantics are continuous assignment and combinational \alwaysblock{}s, which are modelled in other semantics such as~\citet{meredith10_veril}, however, these are not necessarily needed, but require more complicated event queues and execution model.
 
-The semantics of Verilog differs from regular programming languages, as it is used to describe hardware directly, which is inherently parallel, rather than an algorithm, which is usually sequential.  The main construct in Verilog is the always block \JW{consider `always-block' as it's easier to parse}. \JWcouldcut{which is a construct that contains statements.}
-A module can contain multiple always blocks, all of which run in parallel.  These always blocks further contain statements such as if-statements or assignments to variables.  Each always block also contains a list of events at which it should trigger, which could either contain signals that are assigned to other signals in that always block, or a different signal such as a clock which will trigger the always block periodically, only the latter are actually supported in our target semantics. \JW{That sentence is a bit wordy. Can you just say something like: `We support only \emph{synchronous} logic, which means that the always block is triggered on (and only on) the rising edge of a clock signal.'?}  
-\NR{We should mention that variables cannot be driven by multiple \alwaysblock{}s, since one might get confused with data races when relating to concurrent processes in software.} \JW{Given the recent discussion on Teams, it seems to me that we perhaps don't need to mention here what happens if a variable is driven multiple times per clock cycle, especially since Vericert isn't ever going to do that.}
+The semantics of Verilog differs from regular programming languages, as it is used to describe hardware directly, which is inherently parallel, rather than an algorithm, which is usually sequential.  The main construct in Verilog is the \alwaysblock{}.
+A module can contain multiple \alwaysblock{}s, all of which run in parallel.  These \alwaysblock{}s further contain statements such as if-statements or assignments to variables.  We support only \emph{synchronous} logic, which means that the \alwaysblock{} is triggered on (and only on) the rising edge of a clock signal.
+%\NR{We should mention that variables cannot be driven by multiple \alwaysblock{}s, since one might get confused with data races when relating to concurrent processes in software.} \JW{Given the recent discussion on Teams, it seems to me that we perhaps don't need to mention here what happens if a variable is driven multiple times per clock cycle, especially since Vericert isn't ever going to do that.}
 
-As hardware designs normally describe events that will be executed periodically for an infinite amount of time, the top-level of the semantics is best described using small-step semantics, whereas the execution of one small step is described using big-step semantics. 
-\JW{I reckon that can be made clearer by talking about clock cycles. Maybe something like: ``The semantics combines the big-step and small-step styles. The overall execution of the hardware is described using a small-step semantics, with one small step per clock cycle; this is appropriate because hardware is routinely designed to run for an unlimited number of clock cycles and the big-step style is ill-suited to describing infinite executions. Then, within each clock cycle, a big-step semantics is used to execute all the statements.'' }
-An example of a rule \JWcouldcut{in the semantics} for executing an always block is shown below, where $\Sigma$ is the state of the registers in the module and $s$ is the statement inside the always block:
+The semantics combines the big-step and small-step styles. The overall execution of the hardware is described using a small-step semantics, with one small step per clock cycle; this is appropriate because hardware is routinely designed to run for an unlimited number of clock cycles and the big-step style is ill-suited to describing infinite executions. Then, within each clock cycle, a big-step semantics is used to execute all the statements.
+An example of a rule for executing an \alwaysblock{} is shown below, where $\Sigma$ is the state of the registers in the module and $s$ is the statement inside the \alwaysblock{}:
 
 \begin{equation*}
   \inferrule[Always]{(\Sigma, s)\downarrow_{\text{stmnt}} \Sigma'}{(\Sigma, \yhkeyword{always @(posedge clk) } s) \downarrow_{\text{always}} \Sigma'}
 \end{equation*}
 
-\noindent which shows that assuming the statement $s$ in the always block runs with state $\Sigma$ and produces the new state $\Sigma'$, the always block will result in the same final state.  As only clocked always blocks are supported, and one step in the semantics correspond to one clock cycle, it means that this rule is run once per clock cycle \NRcouldcut{which is what it is defined to do}.
-\NR{The mention about small steps being one cycle and 'only clocked/synchronous \alwaysblock{} are supported' can move earlier.} \JW{Actually couldn't this rule be run multiple times per clock cycle in the case where you have multiple always blocks in your module all triggered on the same rising edge?}\YH{Yes, it is run multiple times, which is controlled by the \textsc{Module} rule, which runs the always blocks sequentially. This is controlled by another big step semantics that either takes the Cons of the list and runs the left and right hand sides, or nil which returns the same state.}
+\noindent which shows that assuming the statement $s$ in the \alwaysblock{} runs with state $\Sigma$ and produces the new state $\Sigma'$, the \alwaysblock{} will result in the same final state.  As only clocked \alwaysblock{} are supported, and one step in the semantics correspond to one clock cycle, it means that this rule is run once per clock cycle for each \alwaysblock{}.
 
-Two types of assignments are supported in always blocks: nonblocking and blocking assignment.  Nonblocking assignment modifies the signal at the end of the clock cycle and atomically.  Blocking assignment, on the other hand, assigns the variable \NRreplace{directly}{instantaneously?} in the always block for later signals to pick up.  To model both these assignments, the state $\Sigma$ has to be split into two \NRreplace{parts}{sets?}: $\Gamma$, containing the current values of all variables, and $\Delta$, containing the values that will be assigned to the variables at the end of the clock cycle, we can therefore say that $\Sigma = (\Gamma, \Delta)$.~\NR{Can we say that $\Gamma$ contains instantaneous (ephemeral) updates for the current cycle and $\Delta$ contains periodical updates for the next cycle? Will be good to distinguish those two updates with the same terms across the paper, which can tie in with small-step and big-step distinction maybe asynchronous vs synchronous updates. }\YH{Hmm, so I don't think nonblocking and blocking ties in with big-step and smallstep, but yes it would be better to use the same terms throughoug the paper}  The nonblocking assignment can therefore be expressed as the following:
+Two types of assignments are supported in \alwaysblock{}: nonblocking and blocking assignment.  Nonblocking assignment modifies the signal at the end of the clock cycle and atomically.  Blocking assignment, on the other hand, assigns the variable instantaneously in the \alwaysblock{} for later signals to pick up.  To model both these assignments, the state $\Sigma$ has to be split into two maps: $\Gamma$, containing the current values of all variables, and $\Delta$, containing the values that will be assigned to the variables at the end of the clock cycle, we can therefore say that $\Sigma = (\Gamma, \Delta)$.  The nonblocking assignment can therefore be expressed as the following:
+
+%~\NR{Can we say that $\Gamma$ contains instantaneous (ephemeral) updates for the current cycle and $\Delta$ contains periodical updates for the next cycle? Will be good to distinguish those two updates with the same terms across the paper, which can tie in with small-step and big-step distinction maybe asynchronous vs synchronous updates. }\YH{Hmm, so I don't think nonblocking and blocking ties in with big-step and smallstep, but yes it would be better to use the same terms throughoug the paper}
 
 \begin{equation*}
   \inferrule[Nonblocking Reg]{\yhkeyword{name}\ d = \yhkeyword{OK}\ n \\ (\Gamma, e) \downarrow_{\text{expr}} v}{((\Gamma, \Delta), d\ \yhkeyword{ <= } e) \downarrow_{\text{stmnt}} (\Gamma, \Delta [n \mapsto v])}\\
 \end{equation*}
 
-\noindent where assuming that $\downarrow_{\text{expr}}$ evaluates an expression $e$ to a value $v$, then the nonblocking assignment $d\ \yhkeyword{ <= } e$ will update the future state of the variable $d$ with value $v$.  Finally, at the end of the clock cycle, the registers need to be updated with their future values in $\Delta$, which can be shown by the rule in the semantics that runs a whole module, $\vec{m}$ is a list of variable declarations and always blocks, and the $\downarrow_{\vec{m}}$ evaluates these sequentially and obtains the final state $(\Gamma', \Delta')$. \JW{I'm a little torn. On the one hand, this is interesting stuff, and I almost want to suggest putting in the `Blocking Reg' rule alongside the one above, because seeing the rules side-by-side makes it really clear how they both work. But I'm conscious that talking about the distinction between non-blocking and blocking assignments is all a bit moot because Vericert never produces blocking assignments (right?). And this text is all just explaining Loow's work, not your own work. So, in short, I'm a bit torn. Probably best to keep it as-is for now, but bear this in mind as a candidate for chopping nearer the deadline. What do yout think?}
+\noindent where assuming that $\downarrow_{\text{expr}}$ evaluates an expression $e$ to a value $v$, then the nonblocking assignment $d\ \yhkeyword{ <= } e$ will update the future state of the variable $d$ with value $v$.  Finally, at the end of the clock cycle, the registers need to be updated with their future values in $\Delta$, which can be shown by the rule in the semantics that runs a whole module, $\vec{m}$ is a list of variable declarations and \alwaysblock{}s, and the $\downarrow_{\vec{m}}$ evaluates these sequentially and obtains the final state $(\Gamma', \Delta')$.
+%\JW{I'm a little torn. On the one hand, this is interesting stuff, and I almost want to suggest putting in the `Blocking Reg' rule alongside the one above, because seeing the rules side-by-side makes it really clear how they both work. But I'm conscious that talking about the distinction between non-blocking and blocking assignments is all a bit moot because Vericert never produces blocking assignments (right?). And this text is all just explaining Loow's work, not your own work. So, in short, I'm a bit torn. Probably best to keep it as-is for now, but bear this in mind as a candidate for chopping nearer the deadline. What do yout think?}
 
 
 \begin{equation*}
   \inferrule[Module]{(\Gamma, \epsilon, \vec{m})\ \downarrow_{\text{module}} (\Gamma', \Delta')}{(\Gamma, \yhkeyword{module } \yhconstant{main} \yhkeyword{(...);}\ \vec{m}\ \yhkeyword{endmodule}) \downarrow_{\text{program}} (\Gamma' // \Delta')}
 \end{equation*}
 
-\noindent This rule dictates how the whole module runs in one clock cycle, from an initial state of all the variables $\Gamma$, to a final state $(\Gamma' // \Delta')$, where $//$ defines a merge between the two maps from variable names to values, where the right hand side values take priority if there is a conflict. \JW{Is $//$ a standard notation for this? If so, that's fine. It's just I haven't come across it before.}\YH{It is not, should I use different notation? Is there standard notation?} This rule correctly implements the behaviours for the blocking and nonblocking assignment by overwriting the assignments made through blocking assignment at the end of the clock cycle with the atomic assignments from the nonblocking assignments.
-
-\JW{I think most of the following paragraph could be chopped. We've already mentioned that we're only targeting a small fragment of Verilog, so we could mention at that point that in the fragment we target there is no discrepancy between the simulation behaviour and the synthesis behaviour. I don't know how important it is to talk about the details of evaluation order here -- does the reader really need to be aware of this in order to understand your work?}
-When targeting a hardware description language such as Verilog, it is important to be consistent between simulating the hardware and the behaviour of the synthesised result on actual hardware.  In the target Verilog semantics, only clocked always blocks are supported as these ensure that there are not mismatches between simulation and synthesis, as combinational always blocks that trigger on any change of an internal signal may behave differently in simulation or synthesis based on the order of operations.  This limitation in the semantics therefore helps keep the Verilog correct and consistent.  In addition to that, only nonblocking assignment is used in signals that are used in multiple always blocks, which stops any race conditions from occurring as all the signal updates happen deterministically. \NR{So the semantics allows multiple drivers then?}  Finally, a specific order of evaluation for the always blocks is chosen, and because of the limitations listed before, choosing an order is guaranteed to have the same behaviour as executing the always blocks in any order.\NR{Order is not guaranteed if we have multiple drivers.}
+\noindent This rule dictates how the whole module runs in one clock cycle, from an initial state of all the variables $\Gamma$, to a final state $(\Gamma' // \Delta')$, where $//$ defines a merge between the two maps from variable names to values, where the right hand side values take priority if there is a conflict.  This rule correctly implements the behaviours for the blocking and nonblocking assignment by overwriting the assignments made through blocking assignment at the end of the clock cycle with the atomic assignments from the nonblocking assignments.
+%\JW{Is $//$ a standard notation for this? If so, that's fine. It's just I haven't come across it before.}\YH{It is not, should I use different notation? Is there standard notation?}
 
 \subsection{Changes to the Semantics}
 
 Some small changes were made to the semantics proposed by \citet{loow19_proof_trans_veril_devel_hol} to make it suitable as a HLS target.
 
-The main change is the addition of support for two-dimensional arrays, which are needed to model RAM in Verilog.  RAM is needed to model the stack in C efficiently, without having to declare a variable for each possible stack location.  In the original semantics, RAMs, as well as inputs and outputs to the module could be modelled using a function from variable names (strings) to values, which could be modified accordingly to model inputs to the module.  This is quite an abstract description of memory, which can also be expressed as an array of bitvectors, which is the path we took instead.  This requires the addition of array operators to the semantics and correct reasoning of loads and stores to the array in different always blocks simultaneously.  Consider the following Verilog code:
+The main change is the addition of support for two-dimensional arrays, which are needed to model RAM in Verilog.  RAM is needed to model the stack in C efficiently, without having to declare a variable for each possible stack location.  In the original semantics, RAMs, as well as inputs and outputs to the module could be modelled using a function from variable names (strings) to values, which could be modified accordingly to model inputs to the module.  This is quite an abstract description of memory, which can also be expressed as an array of bitvectors, which is the path we took instead.  This requires the addition of array operators to the semantics and correct reasoning of loads and stores to the array in different \alwaysblock{}s simultaneously.  Consider the following Verilog code:
 
 \begin{minted}{verilog}
 always @(posedge clk) begin
-  x[0] = 23;
+  x[0] = 1;
   x[1] <= 1;
 end
 \end{minted}
 
-The code above modifies one array value using nonblocking assignment and it then modifies a second entry in the array with nonblocking assignment.  If the existing semantics used for values was used to update the array, then during the merge of values for the array, the array \texttt{x} from the nonblocking association map would replace the array from the blocking association map.  This would replace \texttt{x[0]} with it's original value and therefore behave incorrectly.  Instead, updates to arrays need to be recorded for each index in the array, and merging of the blocking and nonblocking queue needs to take into account each index separately.
+The code above modifies one array value using nonblocking assignment and it then modifies a second entry in the array with nonblocking assignment.  If the existing semantics used for values was used to update the array, then during the merge of values for the array, the array \texttt{x} from the nonblocking association map would replace the array from the blocking association map.  This would replace \texttt{x[0]} with it's original value and therefore behave incorrectly.  Instead, updates to arrays need to be recorded for each index in the array, and merging of the blocking and nonblocking queue needs to take into account each index separately.  Our state $\Gamma$ in practice are therefore split up into a map $\Gamma_{r}$, for instantaneous updates to variables, and $\Gamma_{a}$ for instantaneous updates to arrays, where the merge function then ensures that only the modified indices get updated when it is merged with the nonblocking map equivalent $\Delta_{a}$.
 
 Explicit support for declaring inputs, outputs and internal variables was added to the semantics to make sure that the generated Verilog also contains the correct declarations.  This adds some guarantees to the generated Verilog and ensures that it synthesises and simulates correctly.
 
 In addition to adding support for two-dimensional arrays, support for receiving external inputs was removed from the semantics for the case of simplicity, as these are not needed for an HLS target.  The main module in Verilog models the main function in C, and as the inputs to a C function shouldn't change during it's execution, there is no need to add support for external inputs for Verilog modules.  Finally, another simplification to the semantics that was made is to use 32 bit integers instead of arrays of booleans for the bitvector representation.  As the translation only currently has support for \texttt{int}, it is possible to simplify the semantics further and not have to handle bitvectors of an arbitrary size.
 
-\subsection{Integrating the Semantics in \compcert{}'s Model}
-\NR{The \textit{Verilog} semantics perhaps.}
+\subsection{Verilog Semantics in \compcert{}'s Model}
 
 \begin{figure*}
   \centering
@@ -81,12 +77,10 @@ In addition to adding support for two-dimensional arrays, support for receiving
   \label{fig:inferrence_module}
 \end{figure*}
 
-\NR{Would signposting this section at the start: "\compcert{} defines a specific model of computation which all semantics have to follow in order to prove properties about them. In this section, we first describe three of \compcert{}'s original computational states. Then, we describe the five registers and semantic rules we add to Verilog semantics to integrate it to the \compcert{} model."} 
-
-\compcert{} defines a specific model of computation which all semantics have to follow in order to prove properties about them.  \compcert{} has three main states that need to be defined:
+\compcert{} defines a specific model of computation which all semantics have to follow in order to prove properties about them. In this section, we first describe three of \compcert{}'s original computational states. Then, we describe the five registers and semantic rules we add to Verilog semantics to integrate it to the \compcert{} model.  \compcert{} has three main states that need to be defined:
 
 \begin{description}
-  \item[\texttt{State} \textit{sf} $m$ $v$ $\Gamma_{r}$ $\Gamma_{a}$] The main state when executing a function, with stack frame \textit{sf}, current module $m$, current state $v$ and variable states $\Gamma_{r}$ and $\Gamma_{a}$\NR{does r and a mean register and array?}.
+  \item[\texttt{State} \textit{sf} $m$ $v$ $\Gamma_{r}$ $\Gamma_{a}$] The main state when executing a function, with stack frame \textit{sf}, current module $m$, current state $v$ and variable states $\Gamma_{r}$ and $\Gamma_{a}$.
   \item[\texttt{Callstate} \textit{sf} $m$ $\vec{r}$] The state that is reached when a function is called, with the current stack frame \textit{sf}, current module $m$ and arguments $\vec{r}$.
   \item[\texttt{Returnstate} \textit{sf} $r$] The state that is reached when a function returns back to the caller, with stack frame \textit{sf} and return value $r$.
 \end{description}
author	Yann Herklotz <git@yannherklotz.com>	2020-11-19 10:01:56 +0000
committer	Yann Herklotz <git@yannherklotz.com>	2020-11-19 10:01:56 +0000
commit	ce1b8290fbacf709a1a5320c11e9d2157dadcc8f (patch)
tree	8449fe4bc0d58102fa097357eb6575109aa47fdd /verilog.tex
parent	8745901dc8b53042cc004049b2dfc49d0f7651fd (diff)
download	oopsla21_fvhls-ce1b8290fbacf709a1a5320c11e9d2157dadcc8f.tar.gz oopsla21_fvhls-ce1b8290fbacf709a1a5320c11e9d2157dadcc8f.zip