path: root/evaluation.tex
author    Yann Herklotz <git@yannherklotz.com>  2021-09-11 20:21:43 +0100
committer Yann Herklotz <git@yannherklotz.com>  2021-09-11 20:21:43 +0100
commit    28256833310c6fe14280c246a096f9dd45d20abd (patch)
tree      27a9f5b7c1ec4be71bd5428f9956fae9725fedf2 /evaluation.tex
parent    c7dc8c7f5140166b3a3424f3db83fbe5604084cb (diff)
download  oopsla21_fvhls-28256833310c6fe14280c246a096f9dd45d20abd.tar.gz
          oopsla21_fvhls-28256833310c6fe14280c246a096f9dd45d20abd.zip
Fix some small grammar mistakes
Diffstat (limited to 'evaluation.tex')
-rw-r--r--  evaluation.tex  2
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/evaluation.tex b/evaluation.tex
index 0147bdc..bd10ae3 100644
--- a/evaluation.tex
+++ b/evaluation.tex
@@ -155,7 +155,7 @@ We configured \polybench{}'s parameters so that only integer types are used. We
Firstly, before comparing any performance metrics, it is worth highlighting that any Verilog produced by \vericert{} is guaranteed to be \emph{correct}, whilst no such guarantee can be provided by \legup{}.
This guarantee in itself provides a significant leap in HLS reliability compared to any other available HLS tool.
-The top graphs of Fig.~\ref{fig:polybench-div} and Fig.~\ref{fig:polybench-nodiv} compare the execution time of the 27 programs executed by \vericert{} and the different optimisation levels of \legup{}. Each graph uses optimised \legup{} as the baseline. When division/modulo operations are present, \legup{} designs execute around 27$\times$ faster than \vericert{} designs. However, when division/modulo operations are replaced by the iterative algorithm, \legup{} designs are only 2$\times$ faster than \vericert{} designs. However, the benchmarks with division/modulo replaced show that \vericert{} actually achieves the same execution speed as \legup{} without LLVM optimisations and without operation chaining, which is encouraging and shows that the hardware generation is following the right steps. The execution time is calculated by dividing the number of clock cycles needed to complete the execution by the maximum frequency at which the design can run on the FPGA. We can therefore analyse the clock-cycle count and the maximum frequency separately.
+The top graphs of Fig.~\ref{fig:polybench-div} and Fig.~\ref{fig:polybench-nodiv} compare the execution time of the 27 programs executed by \vericert{} and the different optimisation levels of \legup{}. Each graph uses optimised \legup{} as the baseline. When division/modulo operations are present, \legup{} designs execute around 27$\times$ faster than \vericert{} designs. However, when division/modulo operations are replaced by the iterative algorithm, \legup{} designs are only 2$\times$ faster than \vericert{} designs. The benchmarks with division/modulo replaced show that \vericert{} actually achieves the same execution speed as \legup{} without LLVM optimisations and without operation chaining, which is encouraging and shows that the hardware generation is following the right steps. The execution time is calculated by dividing the number of clock cycles needed to complete the execution by the maximum frequency at which the design can run on the FPGA. We can therefore analyse the clock-cycle count and the maximum frequency separately.
First, looking at the difference in clock cycles, \vericert{} produces designs that have around 4.5$\times$ as many clock cycles as \legup{} designs in both cases: when division/modulo operations are enabled as well as when they are replaced. This performance gap can be explained in part by LLVM optimisations, which seem to account for a 2$\times$ decrease in clock cycles, as well as by operation chaining, which decreases the clock cycles by another 2$\times$. The rest of the speed-up is mostly due to \legup{} optimisations such as scheduling and memory analysis, which are designed to extract parallelism from input programs.
This gap does not represent the performance cost that comes with formally proving an HLS tool.
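
For reference, the execution-time calculation described in the changed paragraph can be written as the following LaTeX sketch; the symbols $T_{\text{exec}}$, $n_{\text{cyc}}$, $f_{\text{max}}$ and $T_{\text{clk}}$ are illustrative names chosen here and are not taken from evaluation.tex.

% Sketch (assumed notation, not from the paper): wall-clock execution time of a
% synthesised design, given the cycle count of the run and the maximum clock
% frequency the design achieves on the FPGA.
\[
  T_{\text{exec}} \;=\; \frac{n_{\text{cyc}}}{f_{\text{max}}}
                  \;=\; n_{\text{cyc}} \cdot T_{\text{clk}},
  \qquad T_{\text{clk}} = \frac{1}{f_{\text{max}}}
\]
% Because execution time factors into cycle count and clock frequency, the two
% can be analysed separately. Under this model the reported ~4.5x cycle-count
% gap decomposes roughly as 2x (LLVM optimisations) times 2x (operation
% chaining), with the remainder attributed to LegUp's scheduling and memory
% analysis.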