summaryrefslogtreecommitdiffstats
path: root/algorithm.tex
diff options
context:
space:
mode:
Diffstat (limited to 'algorithm.tex')
-rw-r--r--algorithm.tex149
1 files changed, 14 insertions, 135 deletions
diff --git a/algorithm.tex b/algorithm.tex
index e43e804..f35e0c7 100644
--- a/algorithm.tex
+++ b/algorithm.tex
@@ -27,9 +27,11 @@
\caption{Verilog backend branching off at the RTL stage.}\label{fig:rtlbranch}
\end{figure}
-This section covers the main architecture of the HLS tool, and how the backend was added to CompCert. \JW{New paragraph.} CompCert is made up of 11 intermediate languages in between the Clight input and the assembly output. These intermediate languages each serve a different purpose and contain various different optimisations. When designing a new backend for CompCert, it is therefore crucial to know where to branch off and start the hardware generation. Many of the optimisations that the CompCert backend performs are not necessary when generating custom hardware and not relying on a CPU anymore, such as register allocation or even scheduling. It is therefore important to find the right intermediate language so that the HLS tool still benefits from many of the generic optimisations that CompCert performs, but does not receive the code transformations that are specific to CPU architectures.
+This section covers the main architecture of the HLS tool, and how the backend was added to CompCert.
-Existing HLS compilers usually use LLVM IR as an intermediate representation when performing HLS specific optimisations, as each instruction can be mapped quite well to hardware which performs the same behaviour. CompCert's RTL is the intermediate language that resembles LLVM IR the most, as it also has an infinite number of pseudo-registers and each instruction maps well to hardware. \JP{Perhaps this needs some further qualification? RTL uses the operations from the target architecture, and indeed performs architecture specific optimisations prior to RTL gen, so (for sake of example) switching from x86 RTL to RISC-V RTL could have a significant impact on performance.} In addition to that, many optimisations that are also useful for HLS are performed in RTL, which means that if it is supported as the input language, the HLS algorithm benefits from the same optimisations. It is therefore a good candidate to be chosen as the input language to the HLS backend. The complete flow that CoqUp takes is show in figure~\ref{fig:rtlbranch}.
+CompCert is made up of 11 intermediate languages in between the Clight input and the assembly output. These intermediate languages each serve a different purpose and contain various different optimisations. When designing a new backend for CompCert, it is therefore crucial to know where to branch off and start the hardware generation. Many of the optimisations that the CompCert backend performs are not necessary when generating custom hardware and not relying on a CPU anymore, such as register allocation or even scheduling. It is therefore important to find the right intermediate language so that the HLS tool still benefits from many of the generic optimisations that CompCert performs, but does not receive the code transformations that are specific to CPU architectures.
+
+Existing HLS compilers usually use LLVM IR as an intermediate representation when performing HLS specific optimisations, as each instruction can be mapped quite well to hardware which performs the same behaviour. CompCert's RTL is the intermediate language that resembles LLVM IR the most, as it also has an infinite number of pseudo-registers and each instruction maps well to hardware. \JP{Perhaps this needs some further qualification? RTL uses the operations from the target architecture, and indeed performs architecture specific optimisations prior to RTL gen, so (for sake of example) switching from x86 RTL to RISC-V RTL could have a significant impact on performance.}\YH{Yes will definitely include those, just have to think about where.} In addition to that, many optimisations that are also useful for HLS are performed in RTL, which means that if it is supported as the input language, the HLS algorithm benefits from the same optimisations. It is therefore a good candidate to be chosen as the input language to the HLS backend. The complete flow that CoqUp takes is show in figure~\ref{fig:rtlbranch}.
%%TODO: Maybe add why LTL and the other smaller languages are not that well suited
@@ -38,155 +40,32 @@ Existing HLS compilers usually use LLVM IR as an intermediate representation whe
\begin{figure}
\centering
\begin{subfigure}[b]{0.4\linewidth}
-\begin{minted}{c}
-int main() {
- int x[3] = {1, 2, 3};
- int sum = 0, incr = 1;
- for (int i = 0; i < 3; i=i+incr)
- sum += x[i];
- return sum;
-}
-\end{minted}
+ \inputminted{c}{data/accumulator.c}
\caption{Accumulator C code.}\label{fig:accumulator_c}
\end{subfigure}%
\begin{subfigure}[b]{0.6\linewidth}
-\begin{minted}[fontsize=\footnotesize]{c}
-main() {
- 16: x9 = 1
- 15: int32[stack(0)] = x9
- 14: x8 = 2
- 13: int32[stack(4)] = x8
- 12: x7 = 3
- 11: int32[stack(8)] = x7
- 10: x3 = 0
- 9: nop
- 8: x1 = 0
- 7: x6 = stack(0) (int)
- 6: x5 = int32[x6 + x1 * 4 + 0]
- 5: x3 = x3 + x5 + 0 (int)
- 4: x1 = x1 + 1 (int)
- 3: if (x1 <s 3) goto 7 else goto 2
- 2: x4 = x3
- 1: return x4
-}
-\end{minted}
+ \inputminted[fontsize=\footnotesize]{c}{data/accumulator.rtl}
\caption{Accumulator C code.}\label{fig:accumulator_rtl}
\end{subfigure}
- \caption{Accumulator example using CompCert to translate from C to RTL.}\label{fig:accumulator}
+ \caption{Accumulator example using CompCert to translate from C to RTL.}\label{fig:accumulator_c_rtl}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}[b]{0.5\linewidth}
-\begin{minted}[fontsize=\tiny]{systemverilog}
-main() {
- datapath {
- 16: reg_9 <= 32'd1;
- 15: reg_13[32'd0] <= reg_9;
- 14: reg_8 <= 32'd2;
- 13: reg_13[32'd1] <= reg_8;
- 12: reg_7 <= 32'd3;
- 11: reg_13[32'd2] <= reg_7;
- 10: reg_3 <= 32'd0;
- 9: ;
- 8: reg_1 <= 32'd0;
- 7: reg_6 <= 32'd0;
- 6: reg_5 <= reg_13[{{{reg_6 + 32'd0}
- + {reg_1 * 32'd4}} / 32'd4}];
- 5: reg_3 <= {reg_3 + {reg_5 + 32'd0}};
- 4: reg_1 <= {reg_1 + 32'd1};
- 3: ;
- 2: reg_4 <= reg_3;
- 1: reg_11 <= 1'd1; reg_12 <= reg_4;
- }
-
- controllogic {
- 16: reg_10 <= 32'd15;
- 15: reg_10 <= 32'd14;
- 14: reg_10 <= 32'd13;
- 13: reg_10 <= 32'd12;
- 12: reg_10 <= 32'd11;
- 11: reg_10 <= 32'd10;
- 10: reg_10 <= 32'd9;
- 9: reg_10 <= 32'd8;
- 8: reg_10 <= 32'd7;
- 7: reg_10 <= 32'd6;
- 6: reg_10 <= 32'd5;
- 5: reg_10 <= 32'd4;
- 4: reg_10 <= 32'd3;
- 3: reg_10 <= ({$signed(reg_1) < $signed(32'd3)}
- ? 32'd7 : 32'd2);
- 2: reg_10 <= 32'd1;
- 1: ;
- }
-}
-\end{minted}
- \caption{Accumulator C code.}\label{fig:accumulator_c}
+ \inputminted[fontsize=\tiny]{systemverilog}{data/accumulator.htl}
+ \caption{Accumulator C code.}\label{fig:accumulator_htl}
\end{subfigure}%
\begin{subfigure}[b]{0.5\linewidth}
-\begin{minted}[fontsize=\tiny]{systemverilog}
-module main(reg_14, reg_15, reg_16, reg_11, reg_12);
- always @(posedge reg_16)
- if ({reg_15 == 1'd1})
- reg_10 <= 32'd16;
- else
- case (reg_10)
- 32'd16: reg_10 <= 32'd15;
- 32'd8: reg_10 <= 32'd7;
- 32'd4: reg_10 <= 32'd3;
- 32'd12: reg_10 <= 32'd11;
- 32'd2: reg_10 <= 32'd1;
- 32'd10: reg_10 <= 32'd9;
- 32'd6: reg_10 <= 32'd5;
- 32'd14: reg_10 <= 32'd13;
- 32'd1: ;
- 32'd9: reg_10 <= 32'd8;
- 32'd5: reg_10 <= 32'd4;
- 32'd13: reg_10 <= 32'd12;
- 32'd3: reg_10 <= ({$signed(reg_1) < $signed(32'd3)}
- ? 32'd7 : 32'd2);
- 32'd11: reg_10 <= 32'd10;
- 32'd7: reg_10 <= 32'd6;
- 32'd15: reg_10 <= 32'd14;
- default:;
- endcase
- always @(posedge reg_16)
- case (reg_10)
- 32'd16: reg_9 <= 32'd1;
- 32'd8: reg_1 <= 32'd0;
- 32'd4: reg_1 <= {reg_1 + 32'd1};
- 32'd12: reg_7 <= 32'd3;
- 32'd2: reg_4 <= reg_3;
- 32'd10: reg_3 <= 32'd0;
- 32'd6: reg_5 <= reg_13[{{{reg_6 + 32'd0}
- + {reg_1 * 32'd4}} / 32'd4}];
- 32'd14: reg_8 <= 32'd2;
- 32'd1: begin reg_11 = 1'd1; reg_12 = reg_4; end
- 32'd9: ;
- 32'd5: reg_3 <= {reg_3 + {reg_5 + 32'd0}};
- 32'd13: reg_13[32'd1] <= reg_8;
- 32'd3: ;
- 32'd11: reg_13[32'd2] <= reg_7;
- 32'd7: reg_6 <= 32'd0;
- 32'd15: reg_13[32'd0] <= reg_9;
- default:;
- endcase
- reg [31:0] reg_13 [2:0];
- input [0:0] reg_16, reg_14, reg_15;
- reg [31:0] reg_8, reg_4, reg_10, reg_6,
- reg_1, reg_9, reg_5, reg_3, reg_7;
- output reg [31:0] reg_12;
- output reg [0:0] reg_11;
-endmodule
-\end{minted}
- \caption{Accumulator C code.}\label{fig:accumulator_rtl}
+ \inputminted[fontsize=\tiny]{systemverilog}{data/accumulator.v}
+ \caption{Accumulator C code.}\label{fig:accumulator_v}
\end{subfigure}
- \caption{Accumulator example using CompCert to translate from C to RTL.}\label{fig:accumulator}
+ \caption{Accumulator example using CompCert to translate from HTL to Verilog.\YH{I feel like these examples take up too much space, but don't really know of a different way to show off a complete example without the long code.}}\label{fig:accumulator_htl_v}
\end{figure}
-To describe the translation, we start with an example of how to translate a simple accumulator example, which is shown in figure~\ref{fig:accumulator}. Using this example, the different stages in the translation can be explained, together with how they were proven. The example includes constructs such as arrays, for loops and addition, however, CoqUp also supports all other operators like multiplication and division, as well as conditional statements. Function calls are also supported by inlining all of the functions, meaning recursive function calls are not supported.
+To describe the translation, we start with an example of how to translate a simple accumulator example, which is shown in figure~\ref{fig:accumulator_c}. Using this example, the different stages in the translation can be explained, together with how they were proven.
-The first step is to use the CompCert front end passes to convert the C code into RTL, which is the intermediate language that CompCert uses for most of its optimisations. The translation is shown in figure~\ref{fig:accumulator}, where the RTL code is depicted in figure~\ref{fig:accumulator_rtl}. After that, the code is
+The first step is to use the CompCert front end passes to convert the C code into RTL, which is the intermediate language that CompCert uses for most of its optimisations. The translation is shown in figure~\ref{fig:accumulator_c_rtl}, where the RTL code is depicted in figure~\ref{fig:accumulator_rtl}. After that, the code is
\subsection{CompCert RTL}