Remove compilation speed and fix results

author: Yann Herklotz <git@yannherklotz.com> 2021-04-16 13:56:32 +0100
committer: Yann Herklotz <git@yannherklotz.com> 2021-04-16 13:57:11 +0100
commit: d36bd7f187ddf0db1745c82400da996c97ed9a03 (patch)
tree: 0419b7a7725231788af65b475bf19e7ee2e7a7aa /archive
parent: 20ddd80e5eb18e261d6f228d8e9103a9090b7a39 (diff)
download: oopsla21_fvhls-d36bd7f187ddf0db1745c82400da996c97ed9a03.tar.gz
oopsla21_fvhls-d36bd7f187ddf0db1745c82400da996c97ed9a03.zip
1 files changed, 10 insertions, 18 deletions
diff --git a/archive/evaluation.tex b/archive/evaluation.tex
index 6c21cbf..aa952cf 100644
--- a/archive/evaluation.tex
+++ b/archive/evaluation.tex
@@ -150,24 +150,6 @@
 \end{subfigure}
 \end{figure}
 
-Firstly, before comparing any performance metrics, it is worth highlighting that any Verilog produced by \vericert{} is guaranteed to be \emph{correct}, whilst no such guarantee can be provided by \legup{}.
-This guarantee in itself provides a significant leap in terms of HLS reliability, compared to any other HLS tools available.
-
-Figure~\ref{fig:comparison_cycles} compares the cycle counts of our 27 programs executed by \vericert{} and \legup{} respectively.
-In most cases, we see that the data points are above the diagonal, which demonstrates that the \legup{}-generated hardware is faster than \vericert{}-generated Verilog.
-
-On average, \legup{} designs are $4.5\times$ faster than \vericert{} designs.
-This performance gap is mostly due to \legup{} optimisations such as scheduling and memory analysis, which are designed to extract parallelism from input programs.
-This gap does not represent the performance cost that comes with formally proving a HLS tool.
-Instead, it is simply a gap between an unoptimised \vericert{} versus an optimised \legup{}.
-It is notable that even without \vericert{} performing many optimisations, a few data points are close to the diagonal and even below it.
-We are very encouraged by these data points.
-As we improve \vericert{} by incorporating further  optimisations, this gap should reduce whilst preserving our correctness guarantees.
-
-Cycle count is one factor in calculating execution times; the other is the clock frequency, which determines the duration of each of these cycles. Figure~\ref{fig:comparison_time} compares the execution times of \vericert{} and \legup{}. Across the original \polybench{} benchmarks, we see that \vericert{} designs are about \slowdownOrig$\times$ slower than \legup{} designs. This dramatic discrepancy in performance can be largely attributed to \vericert's na\"ive implementations of division and modulo operations, as explained in Section~\ref{sec:evaluation:setup}. Indeed, \vericert{} achieved an average clock frequency of just 21MHz, while \legup{} managed about 247MHz. After replacing the division/modulo operations with our own C-based implementations, \vericert{}'s average clock frequency becomes about 112MHz. This is better, but still substantially below \legup{}, which uses various additional optimisations and Intel-specific IP blocks. Across the modified \polybench{} benchmarks, we see that \vericert{} designs are about \slowdownDiv$\times$ slower than \legup{} designs.
-
-\subsection{RQ2: How area-efficient is \vericert{}-generated hardware?}
-
 \begin{figure}
 \begin{subfigure}[t]{0.48\textwidth}
 \definecolor{resourceutilcol}{HTML}{e7298a}
@@ -221,3 +203,13 @@ \subsection{RQ2: How area-efficient is \vericert{}-generated hardware?}
 \label{fig:comparison_comptime}
 \end{subfigure}
 \end{figure}
+
+%These designs therefore fill less than 1\% of the FPGA.
+%The reason for the similar size in hardware is that
+%Synthesis tools such as Quartus generally require array accesses to be in a specific form in order for RAM inference to activate.
+%\legup{}'s Verilog generation is tailored to enable RAM inference by Quartus, while \vericert{} generates more generic array accesses. This may make \vericert{} more portable across different FPGA synthesis tools and vendors.
+%%For a fair comparison, we chose Quartus for these experiments because LegUp supports Quartus efficiently.
+%% Consequently, on average, \legup{} designs use $XX$ RAMs whereas \vericert{} use none.
+%Enabling RAM inference is part of our future plans.
+
+% We see that \vericert{} designs use between 1\% and 30\% of the available logic on the FPGA, averaging at around 10\%, whereas LegUp designs all use less than 1\% of the FPGA, averaging at around 0.45\%. The main reason for this is mainly because RAM is not inferred automatically for the Verilog that is generated by \vericert{}.  Other synthesis tools can infer the RAM correctly for \vericert{} output, so this issue could be solved by either using a different synthesis tool and targeting a different FPGA, or by generating the correct template which allows Quartus to identify the RAM automatically.