summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorYann Herklotz <git@yannherklotz.com>2022-05-05 22:39:29 +0100
committerYann Herklotz <git@yannherklotz.com>2022-05-05 22:39:29 +0100
commit01e0d441b18a238125715ac853ccc34e0332f0b1 (patch)
tree8e60bf82dd10f1a70b97ecb6fe59a076bc297afe
parentc1943d233602adb5fdc3679ecbbf6ed5db7e6307 (diff)
downloadlsr22_fvhls-01e0d441b18a238125715ac853ccc34e0332f0b1.tar.gz
lsr22_fvhls-01e0d441b18a238125715ac853ccc34e0332f0b1.zip
Fix some figures
-rw-r--r--chapters/hls.tex255
1 files changed, 177 insertions, 78 deletions
diff --git a/chapters/hls.tex b/chapters/hls.tex
index ae162a0..6a6809f 100644
--- a/chapters/hls.tex
+++ b/chapters/hls.tex
@@ -140,10 +140,10 @@ formal semantics.
\stopplacefigure
\paragraph{Architecture of Vericert.} The main work flow of Vericert is given in
-Fig.~\ref{fig:rtlbranch}, which shows those parts of the translation that are performed in CompCert,
-and those that have been added. This includes translations to two new intermediate languages added
-in Vericert, \lindex{HTL}HTL and \lindex{Verilog}Verilog, as well as an additional optimisation pass
-labelled as \quotation{RAM insertion}.
+\in{Figure}[fig:rtlbranch], which shows those parts of the translation that are performed in
+CompCert, and those that have been added. This includes translations to two new intermediate
+languages added in Vericert, \lindex{HTL}HTL and \lindex{Verilog}Verilog, as well as an additional
+optimisation pass labelled as \quotation{RAM insertion}.
\def\numcompcertlanguages{ten}
@@ -214,18 +214,18 @@ control-flow using if-statements and case-statements.
\stopplacesubfigure
\startplacesubfigure
\hbox{\starttikzpicture
- \node[draw,circle,inner sep=6pt] (s0) at (0,0) {$S_{\mathit{start}} / \mono{x}$};
- \node[draw,circle,inner sep=8pt] (s1) at (1.5,-3) {$S_{1} / \mono{1}$};
- \node[draw,circle,inner sep=8pt] (s2) at (3,0) {$S_{0} / \mono{1}$};
- \node (s2s) at ($(s2.west) + (-0.3,1)$) {\mono{00}};
- \node (s2ss) at ($(s2.east) + (0.3,1)$) {\mono{1x}};
+ \node[draw,circle,inner sep=6pt] (s0) at (0,0) {$S_{\mathit{start}} / \text{\tt x}$};
+ \node[draw,circle,inner sep=8pt] (s1) at (1.5,-3) {$S_{1} / \text{\tt 1}$};
+ \node[draw,circle,inner sep=8pt] (s2) at (3,0) {$S_{0} / \text{\tt 1}$};
+ \node (s2s) at ($(s2.west) + (-0.3,1)$) {\type{00}};
+ \node (s2ss) at ($(s2.east) + (0.3,1)$) {\type{1x}};
\draw[-{Latex[length=2mm,width=1.4mm]}] ($(s0.west) + (-0.3,1)$) to [out=0,in=120] (s0);
\draw[-{Latex[length=2mm,width=1.4mm]}] (s0)
- to [out=-90,in=150] node[midway,left] {\mono{01}} (s1);
+ to [out=-90,in=150] node[midway,left] {\type{01}} (s1);
\draw[-{Latex[length=2mm,width=1.4mm]}] (s1)
- to [out=80,in=220] node[midway,left] {\mono{xx}} (s2);
+ to [out=80,in=220] node[midway,left] {\type{xx}} (s2);
\draw[-{Latex[length=2mm,width=1.4mm]}] (s2)
- to [out=260,in=50] node[midway,right] {\mono{01}} (s1);
+ to [out=260,in=50] node[midway,right] {\type{01}} (s1);
\draw[-{Latex[length=2mm,width=1.4mm]}] (s2)
to [out=120,in=40] ($(s2.west) + (-0.3,0.7)$) to [out=220,in=170] (s2);
\draw[-{Latex[length=2mm,width=1.4mm]}] (s2)
@@ -235,16 +235,16 @@ control-flow using if-statements and case-statements.
\stopfloatcombination
\stopplacemarginfigure
-A simple state machine can be implemented as shown in Fig.~\ref{fig:tutorial:state_machine}. At
-every positive edge of the clock (\mono{clk}), both of the always-blocks will trigger
-simultaneously. The first always-block controls the values in the register \mono{x} and the output
-\mono{z}, while the second always-block controls the next state the state machine should go to.
-When the \mono{state} is 0, \mono{tmp} will be assigned to the input \mono{y} using nonblocking
-assignment, denoted by \mono{<=}. Nonblocking assignment assigns registers in parallel at the end
+A simple state machine can be implemented as shown in \in{Figure}[fig:tutorial:state_machine]. At
+every positive edge of the clock (\type{clk}), both of the always-blocks will trigger
+simultaneously. The first always-block controls the values in the register \type{x} and the output
+\type{z}, while the second always-block controls the next state the state machine should go to.
+When the \type{state} is 0, \type{tmp} will be assigned to the input \type{y} using nonblocking
+assignment, denoted by \type{<=}. Nonblocking assignment assigns registers in parallel at the end
of the clock cycle, rather than sequentially throughout the always-block. In the second
-always-block, the input \mono{y} will be checked, and if it's high it will move on to the next
-state, otherwise it will stay in the current state. When \mono{state} is 1, the first always-block
-will reset the value of \mono{tmp} and then set \mono{z} to the original value of \mono{tmp}, since
+always-block, the input \type{y} will be checked, and if it's high it will move on to the next
+state, otherwise it will stay in the current state. When \type{state} is 1, the first always-block
+will reset the value of \type{tmp} and then set \type{z} to the original value of \type{tmp}, since
nonblocking assignment does not change its value until the end of the clock cycle. Finally, the
last always-block will set the state to 0 again.
@@ -560,7 +560,7 @@ supported in Vericert, it follows that the addresses given to the loads and stor
of four. Translating from byte-addressed memory to word-addressed memory can then be done by
dividing the address by four.
-\subsubsection[sec:hls:algorithm:optimisation:ram]{Implementation of RAM Interface}
+\subsubsection[sec:algorithm:optimisation:ram]{Implementation of RAM Interface}
The simplest way to implement loads and stores in Vericert would be to access the Verilog array
directly from within the data-path (i.e., inside the always-block on lines 16--32 of
@@ -653,14 +653,18 @@ implementation in terms of shifts.
This section describes the Verilog semantics that was chosen for the target language, including the
changes that were made to the semantics to make it a suitable HLS target. The Verilog standard is
-quite large~, but the syntax and semantics can be reduced to a small subset that needs to
-target. This section also describes how 's representation of memory differs from 's memory model.
-
-The Verilog semantics we use is ported to Coq from a semantics written in HOL4 by and used to prove
-the translation from HOL4 to Verilog~. This semantics is quite practical as it is restricted to a
-small subset of Verilog, which can nonetheless be used to model the hardware constructs required for
-HLS. The main features that are excluded are continuous assignment and combinational always-blocks;
-these are modelled in other semantics such as that by~.
+quite
+large~\cite[06_ieee_stand_veril_hardw_descr_languag,05_ieee_stand_veril_regis_trans_level_synth],
+but the syntax and semantics can be reduced to a small subset that Vericert needs to target. This
+section also describes how CompCert's representation of memory differs from CompCert's memory model.
+
+The Verilog semantics we use is ported to Coq from a semantics written in HOL4 by
+\cite[authoryears][loow19_proof_trans_veril_devel_hol] and used to prove the translation from HOL4
+to Verilog~\cite[loow19_verif_compil_verif_proces]. This semantics is quite practical as it is
+restricted to a small subset of Verilog, which can nonetheless be used to model the hardware
+constructs required for HLS. The main features that are excluded are continuous assignment and
+combinational always-blocks; these are modelled in other semantics such as that
+by~\cite[authoryears][meredith10_veril].
The semantics of Verilog differs from regular programming languages, as it is used to describe
hardware directly, which is inherently parallel, rather than an algorithm, which is usually
@@ -697,7 +701,7 @@ as follows:
\placeformula\startformula \text{\sc Nonblocking Reg }\ \frac{\text{\tt name}\ d = \text{\tt OK}\ n \\ (\Gamma, e) \downarrow_{\text{expr}} v}{((\Gamma, \Delta), d\ \text{\tt <= } e) \downarrow_{\text{stmnt}} (\Gamma, \Delta [n \mapsto v])}\\ \stopformula
where assuming that $\downarrow_{\text{expr}}$ evaluates an expression $e$ to a value $v$, the
-nonblocking assignment $d\ \mono{ <= } e$ updates the future state of the variable $d$ with value
+nonblocking assignment $d\ \text{\tt <= } e$ updates the future state of the variable $d$ with value
$v$.
Finally, the following rule dictates how the whole module runs in one clock cycle:
@@ -717,7 +721,8 @@ the cycle.
\subsection[changes-to-the-semantics]{Changes to the Semantics}
-Five changes were made to the semantics proposed by to make it suitable as an HLS target.
+Five changes were made to the semantics proposed by
+\cite[authoryears][loow19_proof_trans_veril_devel_hol] to make it suitable as an HLS target.
\subsubsection[adding-array-support]{Adding array support}
@@ -746,9 +751,9 @@ indices get updated when $\Gamma_{a}$ is merged with the nonblocking map equival
\subsubsection[adding-negative-edge-support]{Adding negative edge support}
To reason about circuits that execute on the negative edge of the clock (such as our RAM interface
-described in Section~\goto{{[}sec:algorithm:optimisation:ram{]}}[sec:algorithm:optimisation:ram]),
-support for negative-edge-triggered always-blocks was added to the semantics. This is shown in the
-modifications of the {\sc Module} rule shown below:
+described in \in{Section}[sec:algorithm:optimisation:ram]), support for negative-edge-triggered
+always-blocks was added to the semantics. This is shown in the modifications of the {\sc Module}
+rule shown below:
\placeformula\startformula \text{\sc Module }\ \frac{(\Gamma, \epsilon, \vec{m})\
\downarrow_{\text{module}^{+}} (\Gamma', \Delta') \\ (\Gamma'\ //\ \Delta', \epsilon, \vec{m})
@@ -782,7 +787,7 @@ inputs for Verilog modules.
\subsubsection[simplifying-representation-of-bitvectors]{Simplifying representation of bitvectors}
Finally, we use 32-bit integers to represent bitvectors rather than arrays of booleans. This is
-because (currently) only supports types represented by 32 bits.
+because Vericert (currently) only supports types represented by 32 bits.
\subsection[sec:verilog:integrating]{Integrating the Verilog Semantics into CompCert's Model}
@@ -815,11 +820,11 @@ because (currently) only supports types represented by 32 bits.
\stopalign \stopformula
\stopplacefigure
-The computation model defines a set of states through which execution passes. In this subsection, we
-explain how we extend our Verilog semantics with four special-purpose registers in order to
-integrate it into .
+The CompCert computation model defines a set of states through which execution passes. In this
+subsection, we explain how we extend our Verilog semantics with four special-purpose registers in
+order to integrate it into CompCert.
-executions pass through three main states:
+CompCert executions pass through three main states:
\desc{\type{State} $\mathit{sf}$ $m$ $v$ $\Gamma_{r}$ $\Gamma_{a}$} The main state when executing a
function, with stack frame $\mathit{sf}$, current module $m$, current state $v$ and variable states
@@ -848,7 +853,7 @@ and $\mathit{ret}$ respectively.
\desc{stack} The function stack can be modelled as a RAM block, which is implemented using an array
in the module, and denoted as $\mathit{stk}$.
-Fig.~\goto{{[}fig:inference_module{]}}[fig:inference_module] shows the inference rules for moving
+\in{Figure}[fig:inference_module] shows the inference rules for moving
between the computational states. The first, {\sc Step}, is the normal rule of execution. It defines
one step in the \type{State} state, assuming that the module is not being reset, that the finish
state has not been reached yet, that the current and next state are $v$ and $v'$, and that the
@@ -860,9 +865,8 @@ Note that there is no step from \type{State} to \type{Callstate}; this is becaus
not supported, and it is therefore impossible in our semantics ever to reach a \type{Callstate}
except for the initial call to main. So the {\sc Call} rule is only used at the very beginning of
execution; likewise, the {\sc Return} rule is only matched for the final return value from the main
-function. Therefore, in addition to the rules shown in
-Fig.~\goto{{[}fig:inference_module{]}}[fig:inference_module], an initial state and final state need
-to be defined:
+function. Therefore, in addition to the rules shown in \in{Figure}[fig:inference_module], an initial
+state and final state need to be defined:
\placeformula\startformula \startalign[n=1]
\NC\text{\sc Initial }\ \frac{\text{\tt is\_internal}\ P.\text{\tt main}}{\text{\tt initial\_state}\
@@ -870,9 +874,9 @@ to be defined:
\NC\text{\sc Final }\ \frac{}{\text{\sc final\_state}\ (\text{\sc Returnstate}\ []\ n)\ n}\NR
\stopalign \stopformula
-where the initial state is the \type{Callstate} with an empty stack frame and no arguments for the
-\type{main} function of program $P$, where this \type{main} function needs to be in the current
-translation unit. The final state results in the program output of value $n$ when reaching a
+\noindent where the initial state is the \type{Callstate} with an empty stack frame and no arguments
+for the \type{main} function of program $P$, where this \type{main} function needs to be in the
+current translation unit. The final state results in the program output of value $n$ when reaching a
\type{Returnstate} with an empty stack frame.
\subsection[sec:verilog:memory]{Memory Model}
@@ -882,25 +886,101 @@ description language. There is no preexisting architecture that Verilog will pro
describe any memory layout that is needed. Instead of having specific semantics for memory, the
semantics only needs to support the language features that can produce these different memory
layouts, these being Verilog arrays. We therefore define semantics for updating Verilog arrays
-using blocking and nonblocking assignment. We then have to prove that the C memory model that uses
-matches with the interpretation of arrays used in Verilog. The memory model is infinite, whereas our
-representation of arrays in Verilog is inherently finite. There have already been efforts to define
-a general finite memory model for all intermediate languages in , such as S~ or -TSO~, or keeping
+using blocking and nonblocking assignment. We then have to prove that the C memory model that
+CompCert uses matches with the interpretation of arrays used in Verilog. The CompCert memory model
+is infinite, whereas our representation of arrays in Verilog is inherently finite. There have
+already been efforts to define a general finite memory model for all intermediate languages in
+CompCert, such as CompCertS~\cite[besson18_compc] or CompCert-TSO~\cite[sevcik13_compc], or keeping
the intermediate languages intact and translate to a more concrete finite memory model in the back
-end, such as in ELF~. We also define such a translation from 's standard infinite memory model to
-finite arrays that can be represented in Verilog. There is therefore no more notion of an abstract
-memory model and all the interactions to memory are encoded in the hardware itself.
-
-This translation is represented in
-Fig.~\goto{{[}fig:memory_model_transl{]}}[fig:memory_model_transl]. defines a map from blocks to
-maps from memory addresses to memory contents. Each block represents an area in memory; for example,
-a block can represent a global variable or a stack for a function. As there are no global variables,
-the main stack can be assumed to be block 0, and this is the only block we translate. Meanwhile, our
-Verilog semantics defines two finite arrays of optional values, one for the blocking assignments map
-$\Gamma_{\rm a}$ and one for the nonblocking assignments map $\Delta_{\rm a}$. The optional values
-are present to ensure correct merging of the two association maps at the end of the clock cycle. The
-invariant used in the proofs is that block 0 should be equivalent to the merged representation of
-the $\Gamma_{\rm a}$ and $\Delta_{\rm a}$ maps.
+end, such as in CompCertELF~\cite[wang20_compc]. We also define such a translation from CompCert's
+standard infinite memory model to finite arrays that can be represented in Verilog. There is
+therefore no more notion of an abstract memory model and all the interactions to memory are encoded
+in the hardware itself.
+
+\definecolor[compcertmemmodel][x=E2CCEA]
+\definecolor[vericertmemmodel][x=CBE1DB]
+
+\startplacefigure[reference={fig:memory_model_transl},title={Change in the memory model during the
+ translation of 3AC into HTL. The state of the memories in each case is right after the
+ execution of the store to memory.}]
+ \hbox{\starttikzpicture
+ \fill[compcertmemmodel,rounded corners=3pt] (0,0) rectangle (5,-5);
+ \fill[vericertmemmodel,rounded corners=3pt] (7,0) rectangle (12,-5);
+ \node[right] at (0,-0.3) {\rmx {\bf CompCert's Memory Model}};
+ \node[right] at (7,-0.3) {\rmx {\bf Verilog Memory Representation}};
+ \node[right] (x0) at (0.2,-1.9) {\rmx 0};
+ \node[right] (x1) at (0.2,-2.5) {\rmx 1};
+ \node[rotate=90] (x2) at (0.43,-3.1) {$\cdots$};
+ \foreach \x in {0,...,6}{%
+ \node[right] (s\x) at (2.5,-1-\x*0.3) {\rmx \x};
+ \node[right] (t\x) at (4,-1-\x*0.3) {};
+ \draw[->] (s\x) -- (t\x);
+ }
+
+ \node[right] at (t0) {\rmx {\tt DE}};
+ \node[right] at (t1) {\rmx {\tt AD}};
+ \node[right] at (t2) {\rmx {\tt BE}};
+ \node[right] at (t3) {\rmx {\tt EF}};
+ \node[right] at (t4) {\rmx {\tt 12}};
+ \node[right] at (t5) {\rmx {\tt 34}};
+ \node[right] at (t6) {\rmx {\tt 56}};
+ \node[right] at (3.1,-3.1) {$\cdots$};
+
+ \node[right] at (3.1,-4) {$\cdots$};
+ \node[scale=1.3] at (6,-2.5) {\rmd $\Rightarrow$};
+
+ \draw[->] (x0) -- (s3);
+ \draw[->] (x1) -- (2.5,-4);
+ \draw (0,-4.3) -- (5,-4.3);
+ \node at (2.5,-4.7) {\rmx {\tt x[0] = 0xDEADBEEF;}};
+
+ \draw (7.2,-1.2) rectangle (9.4,-3.9);
+ \draw (9.6,-1.2) rectangle (11.8,-3.9);
+
+ \foreach \x in {0,...,8}{%
+ \draw (7.2,-1.2-\x*0.3) -- (9.4,-1.2-\x*0.3);
+ \draw (9.6,-1.2-\x*0.3) -- (11.8,-1.2-\x*0.3);
+ \node (b\x) at (8.3,-1.35-\x*0.3) {};
+ \node (nb\x) at (10.7,-1.35-\x*0.3) {};
+ }
+
+ \node at (b0) {{\ttxx 0: Some 00000000}};
+ \node at (b1) {{\ttxx 1: Some 12345600}};
+ \node at (b2) {{\ttxx 2: Some 00000000}};
+ \node at (b3) {{\ttxx 3: Some 00000000}};
+ \node at (b4) {{\ttxx 4: Some 00000000}};
+ \node at (b5) {{\ttxx 5: Some 00000000}};
+ \node at (b6) {{\ttxx 6: Some 00000000}};
+ \node at ($(b7) - (0,0.05)$) {$\cdots$};
+ \node at (b8) {\rmxx{\tt N: Some 00000000}};
+
+ \node at (nb0) {\rmxx{\tt 0: Some DEADBEEF}};
+ \node[left] at (nb1) {\rmxx{\tt 1: None}};
+ \node[left] at (nb2) {\rmxx{\tt 2: None}};
+ \node[left] at (nb3) {\rmxx{\tt 3: None}};
+ \node[left] at (nb4) {\rmxx{\tt 4: None}};
+ \node[left] at (nb5) {\rmxx{\tt 5: None}};
+ \node[left] at (nb6) {\rmxx{\tt 6: None}};
+ \node at ($(nb7) - (0,0.05)$) {$\cdots$};
+ \node[left] at (nb8) {\rmxx{\tt N: None}};
+
+ \node at (8.3,-1) {$\Gamma_{a}$};
+ \node at (10.7,-1) {$\Delta_{a}$};
+
+ \draw (7,-4.3) -- (12,-4.3);
+ \node at (9.5,-4.7) {\rmx {\tt stack[0] <= 0xDEADBEEF;}};
+\stoptikzpicture}
+\stopplacefigure
+
+This translation is represented in \in{Section}[fig:memory_model_transl]. defines a map from blocks
+to maps from memory addresses to memory contents. Each block represents an area in memory; for
+example, a block can represent a global variable or a stack for a function. As there are no global
+variables, the main stack can be assumed to be block 0, and this is the only block we
+translate. Meanwhile, our Verilog semantics defines two finite arrays of optional values, one for
+the blocking assignments map $\Gamma_{\rm a}$ and one for the nonblocking assignments map
+$\Delta_{\rm a}$. The optional values are present to ensure correct merging of the two association
+maps at the end of the clock cycle. The invariant used in the proofs is that block 0 should be
+equivalent to the merged representation of the $\Gamma_{\rm a}$ and $\Delta_{\rm a}$ maps.
\section[sec:proof]{Correctness Proof}
@@ -944,7 +1024,7 @@ The main correctness theorem is analogous to that stated in ~: for all Clight so
if the translation to the target Verilog code succeeds, $\mathit{Safe}(C)$ holds and the target
Verilog has behaviour $B$ when simulated, then $C$ will have the same behaviour $B$.
$\mathit{Safe}(C)$ means all observable behaviours of $C$ are safe, which can be defined as
-$\forall B,\ C \Downarrow B \implies B \in \mono{Safe}$. A behaviour is in \type{Safe} if it is
+$\forall B,\ C \Downarrow B \implies B \in \text{\tt Safe}$. A behaviour is in \type{Safe} if it is
either a final state (in the case of convergence) or divergent, but it cannot \quote{go
wrong}. (This means that the source program must not contain undefined behaviour.) In , a
behaviour is also associated with a trace of I/O events, but since external function calls are not
@@ -1104,18 +1184,32 @@ matching state.
\subsection[sec:proof:ram_insertion]{Forward Simulation of RAM Insertion}
-%\startformula \begin{gathered}
-% \inferrule[Idle]{\Gamma_{\rm r}[\mathit{r.en}] = \Gamma_{\rm r}[\mathit{r.u_{en}}]}{((\Gamma_{\rm r}, \Gamma_{\rm a}), \Delta, r) \downarrow_{\text{ram}} \Delta}\\
-%%
-% \inferrule[Load]{\Gamma_{\rm r}[\mathit{r.en}] \ne \Gamma_{\rm r}[\mathit{r.u_{en}}] \\ \Gamma_{\rm r}[\mathit{r.wr_{en}}] = 0}{((\Gamma_{\rm r}, \Gamma_{\rm a}), (\Delta_{\rm r}, \Delta_{\rm a}), r) \downarrow_{\text{ram}} (\Delta_{\rm r}[\mathit{r.en} \mapsto \mathit{r.u_{en}}, \mathit{r.d_{out}} \mapsto (\Gamma_{\rm a}[\mathit{r.mem}])[ \mathit{r.addr}]], \Delta_{\rm a}) }\\
-%%
-% \inferrule[Store]{\Gamma_{\rm r}[\mathit{r.en}] \ne \Gamma_{\rm r}[\mathit{r.u_{en}}] \\ \Gamma_{\rm r}[\mathit{r.wr_{en}}] = 1}{((\Gamma_{\rm r}, \Gamma_{\rm a}), (\Delta_{\rm r}, \Delta_{\rm a}), r) \downarrow_{\text{ram}} (\Delta_{\rm r}[\mathit{r.en} \mapsto \mathit{r.u\_en}], \Delta_{\rm a}[\mathit{r.mem} \mapsto (\Gamma_{\rm a}[ \mathit{r.mem}])[\mathit{r.addr} \mapsto \mathit{r.d_{in}}]]) }
-% \end{gathered} \stopformula
+\startplacefigure[reference={fig:htl_ram_spec},title={Specification for the memory implementation in
+ HTL, where $r$ is the RAM, which is then implemented by equivalent Verilog code.}]
+ \startformula \startalign[n=1]
+ \NC \text{\sc Idle}\ \frac{\Gamma_{\rm r}[\mathit{r.en}] = \Gamma_{\rm r}[\mathit{r.u_{en}}]}{((\Gamma_{\rm r}, \Gamma_{\rm a}), \Delta, r) \downarrow_{\text{ram}} \Delta}\NR
+%
+ \NC \text{\sc Load}\ \frac{\Gamma_{\rm r}[\mathit{r.en}] \ne \Gamma_{\rm
+ r}[\mathit{r.u_{en}}]\qquad \Gamma_{\rm r}[\mathit{r.wr_{en}}] = 0}{((\Gamma_{\rm r},
+ \Gamma_{\rm a}), (\Delta_{\rm r}, \Delta_{\rm a}), r) \downarrow_{\text{ram}} (\Delta_{\rm
+ r}\left[\startmatrix[n=1]
+\NC\mathit{r.en} \mapsto \mathit{r.u_{en}},\NR
+\NC\mathit{r.d_{out}} \mapsto (\Gamma_{\rm a}[\mathit{r.mem}\NR\stopmatrix\right])[ \mathit{r.addr}]], \Delta_{\rm a}) }\NR
+%
+ \NC \text{\sc Store}\ \frac{\Gamma_{\rm r}[\mathit{r.en}] \ne \Gamma_{\rm r}[\mathit{r.u_{en}}]\qquad
+ \Gamma_{\rm r}[\mathit{r.wr_{en}}] = 1}{((\Gamma_{\rm r}, \Gamma_{\rm a}), (\Delta_{\rm r},
+ \Delta_{\rm a}), r) \downarrow_{\text{ram}} \left(\startmatrix[n=1]
+\NC \Delta_{\rm r}[\mathit{r.en} \mapsto
+ \mathit{r.u\_en}], \NR
+\NC \Delta_{\rm a}[\mathit{r.mem} \mapsto (\Gamma_{\rm a}[
+ \mathit{r.mem}])[\mathit{r.addr} \mapsto \mathit{r.d_{in}}]]\NR\stopmatrix\right) } \NR
+ \stopalign \stopformula
+\stopplacefigure
HTL can only represent a single state machine, so we must model the RAM abstractly to reason about
the correctness of replacing the direct read and writes to the array by loads and stores to a
-RAM. The specification for the RAM is shown in Fig.~\goto{{[}fig:htl_ram_spec{]}}[fig:htl_ram_spec],
-which defines how the RAM $r$ will behave for all the possible combinations of the input signals.
+RAM. The specification for the RAM is shown in \in{Figure}[fig:htl_ram_spec], which defines how the
+RAM $r$ will behave for all the possible combinations of the input signals.
\subsubsection[from-implementation-to-specification]{From Implementation to Specification}
@@ -1128,9 +1222,14 @@ instruction is shown below, where $\sigma$ is state register, $r$ is the RAM, $d
input data-path and control logic maps, and $i$ is the current state. ($n$ is the newly inserted
state, which only applies to the translation of loads.)
-%\startformula \begin{gathered}
-% \inferrule[Store Spec]{ d[i] = (r.mem\mono{[}e_{1}\mono{]} \mono{ <= } e_{2}) \\ t = (r.u\_en \mono{ <= } \neg r.u\_en; r.wr\_en \mono{ <= } 1; r.d\_in \mono{ <= } e_{2}; r.addr \mono{ <= } e_{1})}%
-% {\yhfunction{spec\_ram\_tr}\ \sigma\ r\ d\ c\ d[i \mapsto t]\ c\ i\ n}\end{gathered} \stopformula
+\startformula
+ \text{\sc Store Spec}\ \frac{ d[i] = (r.mem\text{\tt [}e_{1}\text{\tt ]}\ \text{\tt <= } e_{2})
+ \quad t = \left(\startmatrix[n=1]
+ \NC r.u\_en\ \text{\tt <= } \neg r.u\_en; \NR
+ \NC r.wr\_en\ \text{\tt <= } 1; \NR
+ \NC r.d\_in\ \text{\tt <= } e_{2}; \NR
+ \NC r.addr\ \text{\tt <= } e_{1}\NR\stopmatrix\right)}%
+ {\text{\tt spec\_ram\_tr}\ \sigma\ r\ d\ c\ d[i \mapsto t]\ c\ i\ n}\stopformula
A similar specification is created for the load. We then also prove that the implementation of the
translation proves that the specification holds.