summaryrefslogtreecommitdiffstats
path: root/verilog.tex
diff options
context:
space:
mode:
authorYann Herklotz <git@yannherklotz.com>2021-04-12 15:51:21 +0100
committerYann Herklotz <git@yannherklotz.com>2021-04-12 15:51:21 +0100
commit6613a83ad971a00a35315018486b3b12884ebb83 (patch)
treec405231acc47a75098bdbd20999939635d49f0fe /verilog.tex
parentfadbf3345d6c015e2b2fab6cd0b39d4052979d9d (diff)
downloadoopsla21_fvhls-6613a83ad971a00a35315018486b3b12884ebb83.tar.gz
oopsla21_fvhls-6613a83ad971a00a35315018486b3b12884ebb83.zip
Update more sections
Diffstat (limited to 'verilog.tex')
-rw-r--r--verilog.tex34
1 files changed, 18 insertions, 16 deletions
diff --git a/verilog.tex b/verilog.tex
index 0cf13ab..c8b62ee 100644
--- a/verilog.tex
+++ b/verilog.tex
@@ -113,7 +113,7 @@ Therefore, in addition to the rules shown in Figure~\ref{fig:inferrence_module},
The Verilog semantics do not define a memory model for Verilog, as this is not needed for a hardware description language. There is no preexisting architecture that Verilog will produce, it can describe any memory layout that is needed. Instead of having specific semantics for memory, the semantics only need to support the language features that can produce these different memory layouts, these being Verilog arrays.
-However, in practice, assigning and reading from an array directly in the state machine will not produce a memory in the final hardware design, as the synthesis tool cannot identify the array as having the necessary properties that a RAM needs, such as only reading and writing to the array once per clock cycles. Instead, the synthesis tool expects a specific template that ensures these properties, and will then transform the template into a proper RAM during synthesis. We can therefore
+However, in practice, assigning and reading from an array directly in the state machine will not produce a memory in the final hardware design, as the synthesis tool cannot identify the array as having the necessary properties that a RAM needs, even though this is the most natural formulation of memory. Even though theoretically the memory will only be read from once per clock cycle, the synthesis tool cannot ensure that this is true, and will instead create a register for each memory location. This increases the size of the circuit dramatically, as the RAM on the FPGA chip will not be reused. Instead, the synthesis tool expects a specific template that ensures these properties, and will then transform the template into a proper RAM during synthesis. Therefore, a translation has to be performed from the naive use of memory in the state machine, to a proper use of a memory template.
\begin{figure}
\centering
@@ -128,33 +128,35 @@ However, in practice, assigning and reading from an array directly in the state
\inferrule[Write]{\Gamma_{\rm r}\ !\ \textit{r.en} \ne \Gamma_{\rm r}\ !\ \textit{r.u\_en} \\ \Gamma_{\rm r}\ !\ \textit{r.wr\_en} = 1}{ \textit{exec\_ram}\ (\Gamma_{\rm r}, \Gamma_{\rm a})\ (\Delta_{\rm r}, \Delta_{\rm a})\ (\yhconstant{Some}\ r)\ (\Delta_{\rm r}[\textit{r.en} \mapsto \textit{r.u\_en}], \Delta_{\rm a}[\textit{r.mem} \mapsto (\Gamma_{\rm a}\ !\ \textit{r.mem})[\textit{r.addr} \mapsto \textit{r.d\_in}]]) }
\end{gather*}
\end{minipage}
- \caption{Specification for memory implementation in HTL, which is then implemented by equivalent Verilog code.}\label{fig:verilog_syntax}
+ \caption{Specification for memory implementation in HTL, which is then implemented by equivalent Verilog code.}\label{fig:htl_ram_spec}
\end{figure}
\begin{figure}
\begin{minted}{verilog}
- reg [31:0] mem [2047:0];
+ reg [31:0] mem [SIZE-1:0];
always @(negedge clk)
- if (en ^ u_en) begin
+ if (en != u_en) begin
if (wr_en) mem[addr] <= d_in;
else d_out <= mem[addr];
en <= u_en;
end
\end{minted}
- \caption{Implementation of memory specification in Verilog of its HTL specification.}
+ \caption{Implementation of memory specification in Verilog of its HTL specification.}\label{fig:verilog_ram_impl}
\end{figure}
-\begin{figure}
- \centering
- \begin{subfigure}[t]{0.48\linewidth}
- \includegraphics[width=\linewidth]{diagrams/store_waveform.pdf}
- \caption{Store waveform.}
- \end{subfigure}\hfill%
- \begin{subfigure}[t]{0.48\linewidth}
- \includegraphics[width=\linewidth]{diagrams/load_waveform.pdf}
- \caption{Load waveform.}
- \end{subfigure}
-\end{figure}
+This memory template can be represented using the following semantics shown in Figure~\ref{fig:htl_ram_spec}, which is then translated to the equivalent Verilog implementation shown in Figure~\ref{fig:verilog_ram_impl}. There are two interesting parts to the memory template that is used for the stack of the main function. Firstly, the memory updates are triggered on the negative edge of the clock, out of phase with the rest of the design, which is triggered on the positive edge of the clock. The main advantage is that instead of loads and stores taking three and two clock cycles respectively, they only take two and one clock cycle instead, greatly improving their performance. In addition to that, using the negative edge for the clock is supported by many synthesis tools, it therefore does not affect the maximum frequency of the final design. Secondly, the logic in the enable signal of the RAM (\texttt{en != u\_en}) is also atypical. To make the proof simpler, the goal is to create a RAM which disables itself after every use, so that firstly, the proof can assume that the RAM is disabled at the start and end of every clock cycle, and secondly so that only the state which contains the load and store need to be modified to ensure this property.
+
+%\begin{figure}
+% \centering
+% \begin{subfigure}[t]{0.48\linewidth}
+% \includegraphics[width=\linewidth]{diagrams/store_waveform.pdf}
+% \caption{Store waveform.}
+% \end{subfigure}\hfill%
+% \begin{subfigure}[t]{0.48\linewidth}
+% \includegraphics[width=\linewidth]{diagrams/load_waveform.pdf}
+% \caption{Load waveform.}
+% \end{subfigure}
+%\end{figure}
\begin{figure}
\centering