summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorYann Herklotz <git@yannherklotz.com>2022-05-06 16:04:15 +0100
committerYann Herklotz <git@yannherklotz.com>2022-05-06 16:04:15 +0100
commit3041f1d3971567bee4f79edace10ac17b5d35d08 (patch)
tree2714e8d0e3c184c631dc5e377836960f47a67c31
parent400912a60712af84cb87f472f421d010d62443fb (diff)
downloadlsr22_fvhls-3041f1d3971567bee4f79edace10ac17b5d35d08.tar.gz
lsr22_fvhls-3041f1d3971567bee4f79edace10ac17b5d35d08.zip
Some more fixes to the display of mono
-rw-r--r--chapters/hls.tex42
1 files changed, 21 insertions, 21 deletions
diff --git a/chapters/hls.tex b/chapters/hls.tex
index 71a4c1b..ad407b1 100644
--- a/chapters/hls.tex
+++ b/chapters/hls.tex
@@ -503,19 +503,19 @@ and of the RAM interface can be seen in \in{Figure}[fig:accumulator_diagram].
\paragraph{Translating instructions}
Most 3AC instructions correspond to hardware constructs. For example, line 2 in
-\in{Figure}{d}[fig:accumulator_c_rtl] shows a 32-bit register \mono{x5} being initialised to 3,
+\in{Figure}{d}[fig:accumulator_c_rtl] shows a 32-bit register \type{x5} being initialised to 3,
after which the control flow moves execution to line 3. This initialisation is also encoded in the
Verilog generated from HTL at state 8 in both the control logic and data-path always-blocks, shown
at lines 33 and 16 respectively in \in{Figure}{a}[fig:accumulator_c_rtl]. Simple operator
instructions are translated in a similar way. For example, the add instruction is just translated
to the built-in add operator, similarly for the multiply operator. All 32-bit instructions can be
translated in this way, but some special instructions require extra care. One such instruction is
-the \mono{Oshrximm} instruction, which is discussed further in
-\in{Section}[sec:algorithm:optimisation:oshrximm]. Another is the \mono{Oshldimm} instruction,
+the \type{Oshrximm} instruction, which is discussed further in
+\in{Section}[sec:algorithm:optimisation:oshrximm]. Another is the \type{Oshldimm} instruction,
which is a left rotate instruction that has no Verilog equivalent and therefore has to be
implemented in terms of other operations and proven to be equivalent. The only 32-bit instructions
-that we do not translate are case-statements (\mono{Ijumptable}) and those instructions related to
-function calls (\mono{Icall}, \mono{Ibuiltin}, and \mono{Itailcall}), because we enable
+that we do not translate are case-statements (\type{Ijumptable}) and those instructions related to
+function calls (\type{Icall}, \type{Ibuiltin}, and \type{Itailcall}), because we enable
\oindex{inlining}inlining by default.
\subsubsection{Translating HTL to Verilog}
@@ -568,7 +568,7 @@ directly from within the data-path (i.e., inside the always-block on lines 16--3
at several program points, the synthesis tool is unlikely to detect that it can be implemented as a
RAM block, and will resort to using lots of registers instead, ruining the circuit's area and
performance. To avert this, we arrange that the data-path does not access memory directly, but
-simply sets the address it wishes to access and then toggles the \mono{u\_en} flag. This activates
+simply sets the address it wishes to access and then toggles the \type{u\_en} flag. This activates
the RAM interface (lines 9--15 of \in{Figure}{a}[fig:accumulator_c_rtl]) on the next falling clock
edge, which performs the requested load or store. By factoring all the memory accesses out into a
separate interface, we ensure that the underlying array is only accessed from a single program point
@@ -593,9 +593,9 @@ take two clock cycles and one clock cycle instead, greatly improving their perfo
negative edge of the clock is widely supported by synthesis tools, and does not affect the maximum
frequency of the final design.
-Secondly, the logic in the enable signal of the RAM (\mono{en != u\_en}) is also atypical in
+Secondly, the logic in the enable signal of the RAM (\type{en != u\_en}) is also atypical in
hardware designs. Enable signals are normally manually controlled and inserted into the appropriate
-states, by using a check like the following in the RAM: \mono{en == 1}. This means that the RAM
+states, by using a check like the following in the RAM: \type{en == 1}. This means that the RAM
only turns on when the enable signal is set. However, to make the proof simpler and avoid reasoning
about possible side effects introduced by the RAM being enabled but not used, a RAM which disables
itself after every use would be ideal. One method for implementing this would be to insert an extra
@@ -605,21 +605,21 @@ after each load or store and disable the RAM in that state, but this could quick
complicated, especially in the case where the next state also contains a memory operation, and hence
the disable signal should not be added. The method we ultimately chose was to have the RAM become
enabled not when the enable signal is high, but when it \emph{toggles} its value. This can be
-arranged by keeping track of the old value of the enable signal in \mono{en} and comparing it to the
-current value \mono{u\_en} set by the data-path. When the values are different, the RAM gets
-enabled, and then \mono{en} is set to the value of \mono{u\_en}. This ensures that the RAM will
+arranged by keeping track of the old value of the enable signal in \type{en} and comparing it to the
+current value \type{u\_en} set by the data-path. When the values are different, the RAM gets
+enabled, and then \type{en} is set to the value of \type{u\_en}. This ensures that the RAM will
always be disabled straight after it was used, without having to insert or modify any other states.
\startplacemarginfigure[reference={fig:ram_load_store},title={Timing diagrams showing the execution
of loads and stores over multiple clock cycles.}]
\startfloatcombination[nx=2]
- \startplacesubfigure[title={Timing diagram for loads. At time 1, the \mono{u\_en} signal is
- toggled to enable the RAM. At time 2, \mono{d\_out} is set to the value stored at the
- address in the RAM, which is finally assigned to the register \mono{r} at time 3.}]
+ \startplacesubfigure[title={Timing diagram for loads. At time 1, the \type{u\_en} signal is
+ toggled to enable the RAM. At time 2, \type{d\_out} is set to the value stored at the
+ address in the RAM, which is finally assigned to the register \type{r} at time 3.}]
\externalfigure[figures/timing-1.pdf]
\stopplacesubfigure
- \startplacesubfigure[title={Timing diagram for stores. At time 1, the \mono{u\_en} signal is
- toggled to enable the RAM, and the address \mono{addr} and the data to store \mono{d\_in} are
+ \startplacesubfigure[title={Timing diagram for stores. At time 1, the \type{u\_en} signal is
+ toggled to enable the RAM, and the address \type{addr} and the data to store \type{d\_in} are
set. On the negative edge at time 2, the data is stored into the RAM.}]
\externalfigure[figures/timing-2.pdf]
\stopplacesubfigure
@@ -632,7 +632,7 @@ loaded and stored.
\subsubsection[sec:algorithm:optimisation:oshrximm]{Implementing the \mono{Oshrximm}
Instruction}
-Many of the CompCert instructions map well to hardware, but \mono{Oshrximm} (efficient signed
+Many of the CompCert instructions map well to hardware, but \type{Oshrximm} (efficient signed
division by a power of two using a logical shift) is expensive if implemented na\"ively. The problem
is that in CompCert it is specified as a signed division: (where $x, y \in \mathbb{Z}$,
$0 \leq y < 31$, and $-2^{31} \leq x < 2^{31}$) and instantiating divider circuits in hardware is
@@ -682,8 +682,8 @@ always-block that is triggered at the positive edge of the clock is shown below,
the state of the registers in the module and $s$ is the statement inside the always-block:
\placeformula\startformula
- \text{\sc Always }\ \dfrac{(\Sigma, s)\downarrow_{\text{stmnt}} \Sigma'}{(\Sigma, \text{\mono{always
- @(posedge clk) }} s) \downarrow_{\text{always}^{+}} \Sigma'}
+ \text{\sc Always }\ \dfrac{(\Sigma, s)\downarrow_{\text{stmnt}} \Sigma'}{(\Sigma, \text{\tt always
+ @(posedge clk)} s) \downarrow_{\text{always}^{+}} \Sigma'}
\stopformula
This rule says that assuming the statement $s$ in the always-block runs with state $\Sigma$ and
@@ -881,8 +881,8 @@ state and final state need to be defined:
\placeformula\startformula \startalign[n=1]
\NC\text{\sc Initial }\ \frac{\text{\tt is\_internal}\ P.\text{\tt main}}{\text{\tt initial\_state}\
- (\text{\tt Callstate}\ []\ P.\mono{\tt main}\ [])}\NR
- \NC\text{\sc Final }\ \frac{}{\text{\sc final\_state}\ (\text{\sc Returnstate}\ []\ n)\ n}\NR
+ (\text{\tt Callstate}\ []\ P.\text{\tt main}\ [])}\NR
+ \NC\text{\sc Final }\ \frac{}{\text{\tt final\_state}\ (\text{\tt Returnstate}\ []\ n)\ n}\NR
\stopalign \stopformula
\noindent where the initial state is the \type{Callstate} with an empty stack frame and no arguments