Fix some more comments

author: Yann Herklotz <git@yannherklotz.com> 2021-08-08 18:48:21 +0200
committer: Yann Herklotz <git@yannherklotz.com> 2021-08-08 18:48:21 +0200
commit: aeb8620c1f530d5a43302ea4333fa6abdc951a25 (patch)
tree: f703e76afd17f5a2ba90a3ff712356cf55410d82 /evaluation.tex
parent: 8f7485fa0209cc5857c64c700feee56640d73893 (diff)
1 files changed, 8 insertions, 5 deletions
diff --git a/evaluation.tex b/evaluation.tex
index a80b0fc..e1719e3 100644
--- a/evaluation.tex
+++ b/evaluation.tex
@@ -14,10 +14,12 @@ Our evaluation is designed to answer the following three research questions.
 \newcommand\legupnooptchain{\legup{} no-opt no-chaining}
 
 \paragraph{Choice of HLS tool for comparison.} We compare \vericert{} against \legup{} 4.0, because it is open-source and hence easily accessible, but still produces hardware ``of comparable quality to a commercial high-level synthesis tool''~\cite{canis11_legup}.  We also compare against \legup{} with different optimisation levels in an effort to understand which optimisations have the biggest impact on the performance discrepancies between \legup{} and \vericert{}.  The baseline \legup{} version has all the default automatic optimisations turned on.  % \vericert{} is also compared with other optimisation levels of \legup{}. %JW: removed because we said that a couple of sentences ago.
-First, we only turn off the LLVM optimisations in \legup{}, to eliminate all the optimisations that are common to standard software compilers, referred to as `\legupnoopt{}'.  Secondly, we also compare against \legup{} with LLVM optimisations and operation chaining turned off, referred to as `\legupnooptchain{}'. Operation chaining \JW{Should we cite https://ieeexplore.ieee.org/document/4397305 here? Do you think that's the right reference for op-chaining?}\NR{Interesting paper, but I am not sure if it is the seminal paper for chaining because of the year (2007).} is an HLS-specific optimisation that combines data-dependent operations into one clock cycle, and therefore dramatically reduces the number of cycles, without necessarily decreasing the clock speed.
+First, we only turn off the LLVM optimisations in \legup{}, to eliminate all the optimisations that are common to standard software compilers, referred to as `\legupnoopt{}'.  Secondly, we also compare against \legup{} with LLVM optimisations and operation chaining turned off, referred to as `\legupnooptchain{}'. Operation chaining~\cite{paulin89_sched_bindin_algor_high_level_synth,venkataramani07_operat} is an HLS-specific optimisation that combines data-dependent operations into one clock cycle, and therefore dramatically reduces the number of cycles, without necessarily decreasing the clock speed.
+
+% \JW{Should we cite https://ieeexplore.ieee.org/document/4397305 here? Do you think that's the right reference for op-chaining?}\NR{Interesting paper, but I am not sure if it is the seminal paper for chaining because of the year (2007).}
 
 \paragraph{Choice and preparation of benchmarks.} We evaluate \vericert{} using the \polybench{} benchmark suite (version 4.2.1)~\cite{polybench}, which is a collection of 30 numerical kernels. \polybench{} is popular in the HLS context~\cite{choi+18,poly_hls_pouchet2013polyhedral,poly_hls_zhao2017,poly_hls_zuo2013}, since it has affine loop bounds, making it attractive for streaming computation on FPGA architectures.
-We were able to use 27 of the 30 programs; three had to be discarded (\texttt{correlation},~\texttt{gramschmidt} and~\texttt{deriche}) because they involve square roots, requiring floats, which we do not support. 
+We were able to use 27 of the 30 programs; three had to be discarded (\texttt{cor\-re\-la\-tion},~\texttt{gram\-schmi\-dt} and~\texttt{de\-riche}) because they involve square roots, requiring floats, which we do not support.
 % Interestingly, we were also unable to evaluate \texttt{cholesky} on \legup{}, since it produce an error during its HLS compilation. 
 %In summary, we evaluate 27 programs from the latest Polybench suite. 
 We configured \polybench{}'s parameters so that only integer types are used.  We use \polybench{}'s smallest datasets for each program to ensure that data can reside within on-chip memories of the FPGA, avoiding any need for off-chip memory accesses. We have not modified the benchmarks to make them run through \legup{} optimally, e.g. by adding pragmas that trigger more advanced optimisations.
@@ -47,7 +49,7 @@ We configured \polybench{}'s parameters so that only integer types are used.  We
         vertical sep=5pt,
       },
       ymode=log,
-      ybar=0pt,
+      ybar=0.4pt,
       width=1\textwidth,
       height=0.4\textwidth,
       /pgf/bar width=3pt,
@@ -88,8 +90,9 @@ We configured \polybench{}'s parameters so that only integer types are used.  We
       \legend{\vericert{},\legupnooptchain{},\legupnoopt{}};
     \end{groupplot}
   \end{tikzpicture}
-  \caption{Performance of \vericert{} compared to \legup{}, with division and modulo operations enabled. The top graph compares the execution times and the bottom graph compares the area of the generated designs. In both cases, the performance of \vericert{}, \legup{} without LLVM optimisations and without operation chaining, and \legup{} without LLVM optimisations is compared against default \legup{}.\NR{Is it just my eyes or are the bars overlapping per group? Is that intentional?}}\label{fig:polybench-div}
+  \caption{Performance of \vericert{} compared to \legup{}, with division and modulo operations enabled. The top graph compares the execution times and the bottom graph compares the area of the generated designs. In both cases, the performance of \vericert{}, \legup{} without LLVM optimisations and without operation chaining, and \legup{} without LLVM optimisations is compared against default \legup{}.}\label{fig:polybench-div}
 \end{figure}
+%\NR{Is it just my eyes or are the bars overlapping per group? Is that intentional?}
 
 \pgfplotstableread[col sep=comma]{results/rel-time-nodiv.csv}{\nodivtimingtable}
 \pgfplotstableread[col sep=comma]{results/rel-size-nodiv.csv}{\nodivslicetable}
@@ -104,7 +107,7 @@ We configured \polybench{}'s parameters so that only integer types are used.  We
         vertical sep=5pt,
       },
       ymode=log,
-      ybar=0pt,
+      ybar=0.4pt,
       ytick={0.5,1,2,4,8},
       width=1\textwidth,
       height=0.4\textwidth,