summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorYann Herklotz <ymh15@ic.ac.uk>2020-09-12 09:52:07 +0000
committeroverleaf <overleaf@localhost>2020-09-12 10:03:11 +0000
commit47ffe25ba969f577e1829d136aae541b9eeceeb5 (patch)
treec465b3f4115929522efbe7e7dda5278d36f0018f
parent639377a488e3d079bf95d9ee53cb748a3d3bedba (diff)
downloadfccm21_esrhls-47ffe25ba969f577e1829d136aae541b9eeceeb5.tar.gz
fccm21_esrhls-47ffe25ba969f577e1829d136aae541b9eeceeb5.zip
Update on Overleaf.
-rw-r--r--eval_rewrite.tex4
-rw-r--r--intro.tex40
-rw-r--r--main.tex2
-rw-r--r--method-new.tex26
-rw-r--r--related.tex8
-rw-r--r--tool-figure.tex2
6 files changed, 46 insertions, 36 deletions
diff --git a/eval_rewrite.tex b/eval_rewrite.tex
index a55930c..f076fe8 100644
--- a/eval_rewrite.tex
+++ b/eval_rewrite.tex
@@ -80,6 +80,8 @@ int main() {
}
\end{minted}
+In the code above, \texttt{b} has value 1 when run in GCC, but has value 0 when run with LegUp 4.0. If the \texttt{volatile} keyword is removed from \texttt{a}, then the netlist contains the correct result. As \texttt{a} and \texttt{d} are constants, the if-statement should always produce go into the \texttt{true} branch, meaning \texttt{b} should never be set to 0.
+
Example 2
\begin{minted}{c}
@@ -108,8 +110,6 @@ int result() {
}
\end{minted}
-In the code above, \texttt{b} has value 1 when run in GCC, but has value 0 when run with LegUp 4.0. If the \texttt{volatile} keyword is removed from \texttt{a}, then the netlist contains the correct result. As \texttt{a} and \texttt{d} are constants, the if-statement should always produce go into the \texttt{true} branch, meaning \texttt{b} should never be set to 0.
-
\subsection{Bugs in Vivado HLS versions}
In addition to the explanation to bugs given in Section~\ref{sec:evaluation}, bugs found in various versions of Vivado are also analysed.
diff --git a/intro.tex b/intro.tex
index 390c323..7fc8826 100644
--- a/intro.tex
+++ b/intro.tex
@@ -6,36 +6,44 @@ It is even being used in high-assurance settings, such as financial services~\ci
Our method is brought over from the compiler testing literature. We use a tool called Csmith~\cite{yang11_findin_under_bugs_c_compil} to generate well-formed C programs from the subset of the C language that is supported by all the HLS tools we test, and then augment each program with randomly chosen HLS-specific directives. We synthesise each C program to RTL, and use a Verilog simulator to calculate its return value. If synthesis crashes, or if this return value differs from the return value obtained by executing a binary compiled from the C program by gcc, then we have found a candidate bug. We then use trial-and-error to reduce the C program to a minimal version that still triggers a bug.
+
+
+We have tested three widely used HLS tools: LegUp~\cite{canis13_legup}, Xilinx Vivado HLS~\cite{xilinx20_vivad_high_synth}, and the Intel HLS Compiler~\cite{?}. For all three tools, we were able to find valid C programs that cause crashes while compiling and valid C programs that cause wrong RTL to be generated. We have submitted a total of \ref{?} bug reports to the developers, \ref{?} of which have been confirmed and \ref{?} of which have now been fixed at the time of writing.
+
\begin{figure}
\centering
\begin{minted}{c}
unsigned int b = 0x1194D7FF;
int a[6] = {1, 1, 1, 1, 1, 1};
-int result() {
+int main() {
int c;
for (c = 0; c < 2; c++)
b = b >> a[c];
return b;
}
\end{minted}
- \caption{Miscompilation bug found in Vivado 2018.3 and 2019.2 which returns \texttt{6535FF} instead of \texttt{46535FF} which is the correct result.}\label{fig:vivado_bug1}
+ \caption{Miscompilation bug found in Vivado 2018.3 and 2019.2 which returns \code{0x006535FF} instead of \code{0x046535FF} which is the correct result.}\label{fig:vivado_bug1}
\end{figure}
-\NR{
-Notes on program:
-\begin{itemize}
- \item result is main
- \item array a needs to be at least 6 elements
- \item magic number as well
- \item for loop at least two.
- \item constant replacement fixes it too.
- \item all three versions of Vivado.
- \item
-\end{itemize}
-}
-
-We have tested three widely used HLS tools: LegUp~\cite{canis13_legup}, Xilinx Vivado HLS~\cite{xilinx20_vivad_high_synth}, and the Intel HLS Compiler~\cite{?}. For all three tools, we were able to find valid C programs that cause crashes while compiling and valid C programs that cause wrong RTL to be generated. We have submitted a total of \ref{?} bug reports to the developers, \ref{?} of which have been confirmed and \ref{?} of which have now been fixed at the time of writing.
+\paragraph{An example of a fuzzed buggy program}
+Figure~\ref{fig:vivado_bug1} shows a minimal example that produces the wrong result during RTL simulation in all three versions of VivadoHLS, compared to GCC execution.
+In this example, we right shift a large integer value \code{b} by values of array elements, in array \code{a}, within iterations of a \code{for}-loop.
+VivadoHLS returns \code{0x006535FF} instead of \code{0x046535FF} as in GCC.
+The circumstances in which we found this right shift bug shows the challenge of testing current HLS tools.
+
+For instance, the for-loop was also necessary to ensure that a bug was detected.
+Also, the shift value needs to be accessed from an array.
+Replacing the array accesses within the loop with constants result in the bug not surfacing.
+Additionally, the array \code{a} needed to be at least six elements in size although the for-loop only has two iterations.
+Any array smaller than that did not surface this bug.
+Finally, the value of \code{b} is an oracle that could not be changed without masking the bug.
+Producing such circumstances within C code for HLS testing is both arduous and counter-intuitive to human testers.
+In contrast, producing non-intuitive, complex but valid C programs is the cornerstone of fuzzing tools.
+Thus, it was natural to adopt program fuzzing for our HLS testing campaign.
+\NR{Yann, please double check my claims about the bug. I hope I accurately described what we discussed. }\YH{Yes I agree with all that, I think that is a good description of it}
+
+\paragraph{Our contribution}
We hope that our work serves to stimulate efforts to improve the quality of HLS tools.
diff --git a/main.tex b/main.tex
index 8dafae3..44cf1d6 100644
--- a/main.tex
+++ b/main.tex
@@ -31,6 +31,8 @@
\newcommand\NR[1]{\Comment{yellow!50!black}{NR}{#1}}
\newcommand\ZD[1]{\Comment{blue!50!black}{NR}{#1}}
+\newcommand{\code}[1]{\texttt{#1}}
+
%%
%% The majority of ACM publications use numbered citations and
%% references. The command \citestyle{authoryear} switches to the
diff --git a/method-new.tex b/method-new.tex
index ae44fb5..a3ad2ba 100644
--- a/method-new.tex
+++ b/method-new.tex
@@ -30,7 +30,19 @@ Questions:
\end{itemize}
}
-\newcommand{\code}[1]{\texttt{#1}}
+For our testing campaign, we require a random generator that produces program that are both valid and diverse in features.
+Csmith fits both these criteria.
+Csmith is randomised code generator of C programs for compiler testing, that has found more than 400 bugs in software compilers~\cite{yang11_findin_under_bugs_c_compil}.
+Csmith provides several properties to ensure generation of valid C programs.
+Validity is critical for us since these random programs are treated as our ground truth in our testing setup, as shown in Figure~\ref{fig:method:toolflow}.
+
+Additionally, Csmith allows users to diversify the features generated in these programs.
+These features include \NR{bla...}.
+By default, Csmith assigns a probability value for each of these features.
+Csmith also allows users to customise these values to tune program generation for their own purposes.
+The ability to customise program generation is vital for our work since we want to generate programs that are HLS-friendly.
+Configuring Csmith to generate HLS-friendly programs is mostly an exercise of reducing and restricting feature probabilities.
+
\begin{table}[]
\centering
@@ -63,18 +75,6 @@ Questions:
\label{tab:properties}
\end{table}
-For our testing campaign, we require a random generator that produces program that are both valid and diverse in features.
-Csmith fits both these criteria.
-Csmith is randomised code generator of C programs for compiler testing, that has found more than 400 bugs in software compilers~\cite{yang11_findin_under_bugs_c_compil}.
-Csmith provides several properties to ensure generation of valid C programs.
-Validity is critical for us since these random programs are treated as our ground truth in our testing setup, as shown in Figure~\ref{fig:method:toolflow}.
-
-Additionally, Csmith allows users to diversify the features generated in these programs.
-These features include \NR{bla...}.
-By default, Csmith assigns a probability value for each of these features.
-However, it also allows users to customise these values to tune program generation for their own purposes.
-The ability to customise program generation is vital for our work since we want to generate programs that are HLS-friendly.
-Configuring Csmith to generate HLS-friendly programs is mostly an exercise of reducing and restricting feature probabilities.
\paragraph{Significant probability changes}
Table~\ref{tab:properties} lists the important changes that we put in place to ensure that HLS tools are able to synthesise programs generated by Csmith.
diff --git a/related.tex b/related.tex
index 4b4c37d..024bcd2 100644
--- a/related.tex
+++ b/related.tex
@@ -2,15 +2,15 @@
\section{Related Work}
-\JW{Lidbury et al. \cite{lidbury15_many_core_compil_fuzzin} fuzz-tested several OpenCL compilers, including an HLS compiler from Altera (now Intel). They were only able to subject that compiler to superficial testing because so many of the test cases they generated led to it crashing. In comparison to our work: where Lidbury et al. generated target-independent OpenCL programs that could be used to test HLS tools and conventional compilers alike, we specifically generate programs that are tailored for HLS (e.g. with HLS-specific pragmas) with the aim of testing the HLS tools more deeply. Another difference is that where we test using sequential C programs, they test using highly concurrent OpenCL programs, and thus have to go to great lengths to ensure that any discrepancies observed between compilers cannot be attributed to the nondeterminism of concurrency.}
+Lidbury et al. \cite{lidbury15_many_core_compil_fuzzin} fuzz-tested several OpenCL compilers, including an HLS compiler from Altera (now Intel). They were only able to subject that compiler to superficial testing because so many of the test cases they generated led to it crashing. In comparison to our work: where Lidbury et al. generated target-independent OpenCL programs that could be used to test HLS tools and conventional compilers alike, we specifically generate programs that are tailored for HLS (e.g. with HLS-specific pragmas) with the aim of testing the HLS tools more deeply. Another difference is that where we test using sequential C programs, they test using highly concurrent OpenCL programs, and thus have to go to great lengths to ensure that any discrepancies observed between compilers cannot be attributed to the nondeterminism of concurrency.
-\JW{Herklotz et al.~\cite{verismith} fuzz-tested several FPGA synthesis tools using randomly generated Verilog programs. Where they concentrated on the RTL-to-netlist stage of hardware design, we focus our attention on the earlier C-to-RTL stage.}
+Herklotz et al.~\cite{verismith} fuzz-tested several FPGA synthesis tools using randomly generated Verilog programs. Where they concentrated on the RTL-to-netlist stage of hardware design, we focus our attention on the earlier C-to-RTL stage.
-\JW{Several authors have taken steps toward more rigorously engineered HLS tools~\cite{yann_to_add_some_examples_such_as_perna} that may fare better under testing campaigns like ours. However, [something about there not existing a proper mechanically verified HLS tool that is actually usable in practice].}
+Several authors have taken steps toward more rigorously engineered HLS tools~\cite{yann_to_add_some_examples_such_as_perna} that may fare better under testing campaigns like ours. However, .
Existing work has already been done in finding bugs in Intel's OpenCL SDK using metamorphic testing~\cite{lidbury15_many_core_compil_fuzzin}.
-There is not much past research been done on the subject of interest, which is verifying the reliability of HLS tools using the fuzz testing method. Fuzz testing method has been previously performed on C/C++ compilers \ref{Csmith}, FPGA synthesis tools \ref{Verismith}, and OpenCL compiler\ref{manycore}. Those are the tools/applications that engineers or programmers are highly relied on, similar to the HLS tool investigated in this thesis. Previous work has been focusing on questioning the correct conversion from Verilog to netlist done by FPGA synthesis tools, the proper C compilation between different C/C++ compilers, and correct behaviors generated by OpenCL implantations or devices.
+There is not much past research been done on the subject of interest, which is verifying the reliability of HLS tools using the fuzz testing method. Fuzz testing method has been previously performed on C/C++ compilers \ref{Csmith}, FPGA synthesis tools \ref{Verismith}, and OpenCL compiler\ref{manycore}. Those are the tools/applications that engineers or programmers highly rely on, similar to the HLS tool investigated in this thesis. Previous work has been focusing on questioning the correct conversion from Verilog to netlist done by FPGA synthesis tools, the proper C compilation between different C/C++ compilers, and correct behaviors generated by OpenCL implantations or devices.
Through investigating the discrepancies between Verilog and netlists, 13 bugs have been detected across four FPGA synthesis tools. Furthermore, over 400 compiler bugs have been found by fuzz testing C/C++ compilers. And over 50 OpenCL compiler bugs have been reported through testing many-core systems. The bugs mentioned here are introduced by the tool solely rather than the original design.
diff --git a/tool-figure.tex b/tool-figure.tex
index 8b450ed..de4961f 100644
--- a/tool-figure.tex
+++ b/tool-figure.tex
@@ -1,4 +1,4 @@
-\begin{figure*}
+\begin{figure*}[th]
\centering
\tikzset{
file/.style={draw=none, fill=black!25, shape=circle, minimum width=5mm},