\section{Verilog}\label{sec:verilog}

This section describes the Verilog semantics that were chosen for the target language, including the changes that were made to the semantics to be a better fit as an HLS target.

The Verilog standard is quite large, and not all Verilog features are needed to be able to describe hardware.  Many Features are only useful for simulation and do not affect the hardware itself.  We can therefore restrict the Verilog semantics to the synthesisable subset of Verilog~\cite{05_ieee_stand_veril_regis_trans_level_synth}.  In addition to that, HLS dictates which Verilog constructs are generated, meaning the Verilog subset that has to be modelled by the semantics can be reduced even further to only support the constructs that are needed.  Supporting a smaller subset in the semantics also means that there is less chance that the standard is misunderstood, and that the semantics actually model how the Verilog is simulated and synthesised by existing tools and how it is dictated by the standard.

The Verilog semantics are based on the semantics proposed by \citet{loow19_verif_compil_verif_proces}, which was used to create a formal translation from a logic representation encoded in the HOL4~\cite{slind08_brief_overv_hol4} theorem prover into an equivalent Verilog design.  These semantics are quite practical as they restrict themselves to a small subset of Verilog, which can nonetheless be used to model all hardware constructs one would want to design.  The main features that are supported by the syntax and semantics are module items, such as variable declarations and always blocks, statements and expressions.  As hardware designs normally describe events that will be executed periodically for an infinite amount of time, the top-level of the semantics can be described using small-step semantics, whereas the execution of one small step is then described using big-step semantics.

The semantics of Verilog differ from regular programming languages, as it is used to describe hardware directly, which is inherently parallel, instead of describing an algorithm, which is often done sequentially.  The main construct in Verilog is the always block, which is construct that contains statements.  A module can contain multiple always blocks, which all run in parallel.  These always blocks further contain statements such as if-statements or assignments to variables.  Each always block also contains a list of events at which it should trigger, which could either contain signals that are assigned to other signals in that always block, or a different signal such as a clock which will trigger the always block periodically, only the latter are actually supported in our target semantics.  An example of a rule in the semantics for executing an always block in the semantics shown below, where $\Sigma$ is the state of the registers in the module and $s$ is the statement inside the always block:

\begin{equation*}
  \inferrule[Always]{(\Sigma, s)\longrightarrow_{\text{stmnt}} \Sigma'}{(\Sigma, \yhkeyword{always @(posedge clk) } s) \longrightarrow_{\text{always}} \Sigma'}
\end{equation*}

\noindent which shows that assuming the statement $s$ in the always block runs with state $\Sigma$ and produces the new state $\Sigma'$, the always block will result in the same final state.  As only clocked always blocks are supported, and one step in the semantics correspond to one clock cycle, it means that this rule is run once per clock cycle which is what it is defined to do.

Two types of assignments are supported in always blocks: nonblocking and blocking assignment.  Nonblocking assignment modifies the signal at the end of the timestep and atomically.  Blocking assignment, on the other hand, assigns the variable directly in the always block for later signals to pick up.  To model both these assignments, the state $\Sigma$ has to be split into two parts: $\Gamma$, containing the current values of all variables, and $\Delta$, containing the values that will be assigned to the variables at the end of the clock cycle, we can therefore say that $\Sigma = (\Gamma, \Delta)$.  The nonblocking assignment can therefore be expressed as the following:

\begin{equation*}
  \inferrule[Nonblocking Reg]{\yhkeyword{name}\ d = \yhkeyword{OK}\ n \\ (\Gamma, e) \longrightarrow_{\text{expr}} v}{((\Gamma, \Delta), d\ \yhkeyword{ <= } e) \longrightarrow_{\text{stmnt}} (\Gamma, \Delta [n \mapsto v])}\\
\end{equation*}

\noindent where assuming that $\longrightarrow_{\text{expr}}$ evaluates an expression $e$ to a value $v$, then the nonblocking assignment $d\ \yhkeyword{ <= } e$ will update the future state of the variable $d$ with value $v$.  Finally, at the end of the clock cycle, the registers need to be updated with their future values in $\Delta$, which can be shown by the rule in the semantics that runs a whole module, $\vec{m}$ is a list of variable declarations and always blocks, and the $\longrightarrow_{\vec{m}}$ evaluates these sequentially and obtains the final state $(\Gamma', \Delta')$.

\begin{equation*}
  \inferrule[Module]{(\Gamma, \epsilon, \vec{m})\ \longrightarrow_{\vec{m}} (\Gamma', \Delta')}{(\Gamma, \yhkeyword{module } \yhconstant{main} \yhkeyword{(...);}\ \vec{m}\ \yhkeyword{endmodule}) \longrightarrow_{m} (\Gamma' // \Delta')}
\end{equation*}

\noindent This rule dictates how the whole module runs in one clock cycle, from an initial state of all the variables $\Gamma$, to a final state $(\Gamma' // \Delta')$, where $//$ defines a merge between the two maps from variable names to values, where the right hand side values take priority if there is a conflict.  This rule correctly implements the behaviours for the blocking and nonblocking assignment by overwriting the assignments made through blocking assignment at the end of the clock cycle with the atomic assignments from the nonblocking assignments.

When targeting a hardware description language such as Verilog, it is important to be consistent between simulating the hardware and the behaviour of the synthesised result on actual hardware.  In the target Verilog semantics, only clocked always blocks are supported as these ensure that there are not mismatches between simulation and synthesis, as combinational always blocks that trigger on any change of an internal signal may behave differently in simulation or synthesis based on the order of operations.  This limitation in the semantics therefore helps keep the Verilog correct and consistent.  In addition to that, only nonblocking assignment is used in signals that are used in multiple always blocks, which stops any race conditions from occurring as all the signal updates happen deterministically.  Finally, a specific order of evaluation for the always blocks is chosen, and because of the limitations listed before, choosing an order is guaranteed to have the same behaviour as executing the always blocks in any order.

\subsection{Changes to the Semantics}

Some small changes were made to the semantics proposed by \citet{loow19_verif_compil_verif_proces} to adapt the semantics to work better as a HLS target, as well as adding support for features that were needed in the target language.

The main change to the semantics is the support for two-dimensional arrays, which are needed to model RAM in Verilog.  RAM is needed to model the stack in C efficiently, without having to declare a variable for each possible stack location.  In the original semantics, RAMs, as well as inputs and outputs to the module could be modelled using a function from variable names (strings) to values, which could be modified accordingly to model inputs to the module.  This is quite an abstract description of memory, which can also be expressed as an array of bitvectors, which is the path we took instead.  This requires the addition of array operators to the semantics and correct reasoning of loads and stores to the array in different always blocks simultaneously.  Consider the following Verilog code:

\begin{minted}{verilog}
always @(posedge clk) begin
  x[0] = 23;
  x[1] <= 1;
end
\end{minted}

The code above modifies one array value using nonblocking assignment and it then modifies a second entry in the array with nonblocking assignment.  If the existing semantics used for values was used to update the array, then during the merge of values for the array, the array \texttt{x} from the nonblocking association map would replace the array from the blocking association map.  This would replace \texttt{x[0]} with it's original value and therefore behave incorrectly.  Instead, updates to arrays need to be recorded for each index in the array, and merging of the blocking and nonblocking queue needs to take into account each index separately.

Explicit support for declaring inputs, outputs and internal variables was added to the semantics to make sure that the generated Verilog also contains the correct declarations.  This adds some guarantees to the generated Verilog and ensures that it synthesises and simulates correctly.

In addition to adding support for two-dimensional arrays, support for receiving external inputs was removed from the semantics for the case of simplicity, as these are not needed for an HLS target.  The main module in Verilog models the main function in C, and as the inputs to a C function shouldn't change during it's execution, there is no need to add support for external inputs for Verilog modules.  Finally, another simplification to the semantics that was made is to use 32 bit integers instead of arrays of booleans for the bitvector representation.  As the translation only currently has support for \texttt{int}, it is possible to simplify the semantics further and not have to handle bitvectors of an arbitrary size.

\subsection{Integrating the Semantics in \compcert{}'s Model}

\begin{figure*}
  \centering
  \begin{minipage}{1.0\linewidth}
    \begin{gather*}
      \inferrule[Step]{\Gamma\ !\ \textit{rst} = \yhconstant{Some } 0 \\ \Gamma\ !\ \textit{fin} = \yhconstant{Some } 0 \\ \Gamma\ !\ \sigma = \yhconstant{Some } v \\ (m, \Gamma)\ \longrightarrow_{m} \Gamma' \\ \Gamma'\ !\ \sigma = \yhconstant{Some } v'}{\yhconstant{State } \textit{sf }\ m\ v\ \Gamma \longrightarrow \yhconstant{State } \textit{sf }\ m\ v'\ \Gamma'}\\
      %
      \inferrule[Finish]{\Gamma\ !\ \textit{fin} = \yhconstant{Some } 1 \\ \Gamma\ !\ \textit{ret} = \yhconstant{Some } r}{\yhconstant{State } \textit{sf }\ m\ \sigma\ \Gamma \longrightarrow \yhconstant{Returnstate } \textit{sf }\ r}\\
      %
      \inferrule[Call]{ }{\yhconstant{Callstate } \textit{sf }\ m\ \vec{r} \longrightarrow \yhconstant{State } \textit{sf }\ m\ n\ ((\yhfunction{init\_params}\ \vec{r}\ a)[\sigma \mapsto n, \textit{fin} \mapsto 0, \textit{rst} \mapsto 0])}\\
      %
      \inferrule[Return]{ }{\yhconstant{Returnstate } (\yhconstant{Stackframe } r\ m\ \textit{pc }\ \Gamma :: \textit{sf}) \longrightarrow \yhconstant{State } \textit{sf }\ m\ \textit{pc }\ (\Gamma [ \textit{st} \mapsto \textit{pc}, r \mapsto i ])\ \epsilon}
    \end{gather*}
  \end{minipage}
  \caption{Inferrence rules for modules}%
  \label{fig:inferrence_module}
\end{figure*}

\compcert{} defines a specific model of computation which all semantics have to follow in order to prove properties about them.  \compcert{} has three main states that need to be defined:

\begin{description}
  \item[\texttt{State} \textit{sf} $m$ $v$ $\Gamma_{r}$ $\Gamma_{a}$] The main state when executing a function, with stack frame \textit{sf}, current module $m$, current state $v$ and variable states $\Gamma_{r}$ and $\Gamma_{a}$.
  \item[\texttt{Callstate} \textit{sf} $m$ $\vec{r}$] The state that is reached when a function is called, with the current stack frame \textit{sf}, current module $m$ and arguments $\vec{r}$.
  \item[\texttt{Returnstate} \textit{sf} $r$] The state that is reached when a function returns back to the caller, with stack frame \textit{sf} and return value $r$.
\end{description}

However, as Verilog behaves quite differently to software programming languages, this model does not match the Verilog semantics.  Instead, the module definition in Verilog needs to be enriched to support this model of computation, by adding required signals to the inputs and outputs of modules.

\begin{description}
  \item[program counter] First of all, the program counter needs to be modelled, which can be done using a register which keeps track of state, denoted as $\sigma$.
  \item[function entry point] When a function is called, the entry point denotes the first instruction that will be executed, which can be modelled using a reset signal that sets the state accordingly, denoted as $rst$.
  \item[return value] The return value can be modelled by setting a finished flag to one when the result is ready, and putting the result into a 32-bit output register, denoted as \textit{fin} and \textit{rtrn} respectively.
  \item[stack] The function stack can be modelled as a RAM using a two-dimensional array in the module, denoted as \textit{stk}.
\end{description}

Figure~\ref{fig:inferrence_module} shows the inferrence rules that convert from one state to another.  The first rule \textsc{Step} is the normal rule of execution.  Assuming that the module is not being reset, and that the finish state has not been reached yet and that the current and next state are $v$ and $v'$, and finally that the module runs from state $\Gamma$ to $\Gamma'$ using the \textsc{Module} rule, we can then define one step in the \texttt{State}.  The \textsc{Finish} rule is the rule that returns the final value of running the module and is applied when the \textit{fin} register is set.  The return value is then takeng from the \textit{ret} register.

The first thing to note about the \textsc{Call} rule is that there is no step from \texttt{State} to \texttt{Callstate}, as function calls are not supported, and it is therefore impossible in our semantics to ever reach a \texttt{Callstate} except for the initial call to main.  The same can be said about the \textsc{Return} state rule, which will only be matched for the final return value from the main function, as there is no rule that allocates another stack frame \textit{sf} except for the initial call to main.  Therefore, in addition to the rules shown in Figure~\ref{fig:inferrence_module}, an initial state and final state need to be defined.

\begin{align*}
  \texttt{initial:}\quad &\forall P,\ \yhconstant{Callstate } []\ P.\texttt{main } []\\
  \texttt{final:}\quad &\forall n,\ \yhconstant{Returnstate } []\ n \implies n
\end{align*}

\noindent where the initial state is defined as the \texttt{Callstate} with an empty stack frame and no arguments for the \texttt{main} function of program $P$, and the final state results in the program output of value $n$ when reaching a \texttt{Returnstate} with an empty stack frame.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: