path: root/intro.tex
author    John Wickerson <j.wickerson@imperial.ac.uk>  2020-11-10 11:06:38 +0000
committer overleaf <overleaf@localhost>                2020-11-23 16:28:58 +0000
commit    cccd6417481dd62e5b48ac5f18454110135daa08 (patch)
tree      adccf2ee1de47d6bb2036d18812f542313df04d4 /intro.tex
parent    25fc95d19a586f774a99630ea34e58fb76e4e629 (diff)
Update on Overleaf.
Diffstat (limited to 'intro.tex')
-rw-r--r--  intro.tex  44
1 file changed, 23 insertions, 21 deletions
diff --git a/intro.tex b/intro.tex
index 896b185..7f0d12f 100644
--- a/intro.tex
+++ b/intro.tex
@@ -3,22 +3,6 @@ High-level synthesis (HLS), which refers to the automatic translation of softwar
It promises to increase the productivity of hardware engineers by raising the abstraction level of their designs, and it promises software engineers the ability to produce application-specific hardware accelerators without having to understand hardware description languages (HDLs) such as Verilog and VHDL.
It is even being used in high-assurance settings, such as financial services~\cite{hls_fintech}, control systems~\cite{hls_controller}, and real-time object detection~\cite{hls_objdetect}. As such, HLS tools are increasingly relied upon. In this paper, we investigate whether they are trustworthy.
-\begin{figure}[t]
- \centering
-\begin{minted}{c}
-unsigned int b = 0x1194D7FF;
-int a[6] = {1, 1, 1, 1, 1, 1};
-
-int main() {
- for (int c = 0; c < 2; c++)
- b = b >> a[c];
- return b;
-}
-\end{minted}
- \caption{Miscompilation bug found in Xilinx Vivado HLS v2018.3, v2019.1 and v2019.2. The program returns \code{0x006535FF} but the correct result is \code{0x046535FF}.}
- \label{fig:vivado_bug1}
-\end{figure}
-
The approach we take in this paper is \emph{fuzzing}.
%To test the trustworthiness of HLS tools, we need a robust way of generating programs that both have good coverage and also explores various corner cases.
%Therein lies the difficulty in testing HLS tools.
@@ -26,7 +10,7 @@ The approach we take in this paper is \emph{fuzzing}.
%In this paper, we employ program fuzzing on HLS tools.
This is an automated testing method in which randomly generated programs are given to compilers to test their robustness~\cite{fuzzing+chen+13+taming,fuzz+sun+16+toward,fuzzing+liang+18+survey,fuzzing+zhang+19,yang11_findin_under_bugs_c_compil,lidbury15_many_core_compil_fuzzin}.
The generated programs are typically large and rather complex, and they often combine language features in ways that are legal but counter-intuitive; hence they can be effective at exercising corner cases missed by human-designed test suites.
-Fuzzing has been used extensively to test conventional compilers; for example, Yang \textit{et al.}~\cite{yang11_findin_under_bugs_c_compil} used it to reveal more than three hundred bugs in GCC and Clang. In this paper, we bring fuzzing to the HLS context.
+Fuzzing has been used extensively to test conventional compilers; for example, Yang \textit{et al.}~\cite{yang11_findin_under_bugs_c_compil} used it to reveal more than three hundred bugs in GCC and Clang \JW{Clang -> LLVM?}. In this paper, we bring fuzzing to the HLS context.
%We specifically target HLS by restricting a fuzzer to generate programs within the subset of C supported by HLS.
@@ -41,12 +25,27 @@ Fuzzing has been used extensively to test conventional compilers; for example, Y
% Program fuzzing is bla..
% Fuzzing enables us to overcome
-
-\paragraph{An example of a compiler bug found by fuzzing}
-Figure~\ref{fig:vivado_bug1} shows a program that produces the wrong result during RTL simulation in Xilinx Vivado HLS. The bug was initially revealed by a randomly generated program of around 113 lines, which we were able to reduce to the minimal example shown in the figure.
+\subsection*{An example of a compiler bug found by fuzzing}
+Figure~\ref{fig:vivado_bug1} shows a program that produces the wrong result during RTL simulation in Xilinx Vivado HLS.\footnote{\JW{I added the following paragraph -- please check.} This program, like all the others in this paper, includes a \code{main} function, which means that it compiles straightforwardly with GCC. To compile it with an HLS tool, we rename \code{main} to \code{main\_}, synthesise that function, and then add a new \code{main} function as a testbench that calls \code{main\_}.} The bug was initially revealed by a randomly generated program of around 113 lines, which we were able to reduce to the minimal example shown in the figure.
The program repeatedly shifts a large integer value \code{b} right by the values stored in array \code{a}.
Vivado HLS returns \code{0x006535FF}, but the result returned by GCC (and subsequently confirmed manually to be the correct one) is \code{0x046535FF}.
+\begin{figure}[t]
+ \centering
+\begin{minted}{c}
+unsigned int b = 0x1194D7FF;
+int a[6] = {1, 1, 1, 1, 1, 1};
+
+int main() {
+ for (int c = 0; c < 2; c++)
+ b = b >> a[c];
+ return b;
+}
+\end{minted}
+ \caption{Miscompilation bug found in Xilinx Vivado HLS v2018.3, v2019.1 and v2019.2. The program returns \code{0x006535FF} but the correct result is \code{0x046535FF}.}
+ \label{fig:vivado_bug1}
+\end{figure}
+
The circumstances in which we found this bug illustrate some of the challenges in testing HLS tools.
For instance, without the for-loop, the bug goes away.
Moreover, the bug only appears if the shift values are accessed from an array.
@@ -57,7 +56,7 @@ In contrast, producing counter-intuitive, complex but valid C programs is the co
For this reason, we find it natural to adopt fuzzing for our HLS testing campaign.
% \NR{Yann, please double check my claims about the bug. I hope I accurately described what we discussed. }\YH{Yes I agree with all that, I think that is a good description of it}
-\paragraph{Our contribution}
+\subsection*{Our contribution}
This paper reports on our campaign to test HLS tools by fuzzing.
\begin{itemize}
\item We use Csmith~\cite{yang11_findin_under_bugs_c_compil} to generate thousands of valid C programs from within the subset of the C language that is supported by all the HLS tools we test. We also augment each program with a random selection of HLS-specific directives.
@@ -68,6 +67,9 @@ This paper reports on our campaign to test HLS tools by fuzzing.
\item To investigate whether HLS tools are becoming more or less reliable over time, we also tested three different versions of Vivado HLS (v2018.3, v2019.1, and v2019.2). We found that, in general, versions v2019.1 and v2019.2 produced about half as many failures as v2018.3. However, some test cases failed only in versions v2019.1 and v2019.2, suggesting that new bugs were introduced alongside new features.
\end{itemize}
+
+\JW{I added the following paragraph -- please check.} The overall aim of our paper is to raise awareness about the (un)reliability of current HLS tools, and to serve as a call-to-arms for investment in better-engineered tools. We hope that future work on developing more reliable HLS tools will find our empirical study a valuable source of motivation.
+
% we test, and then augment each program with randomly chosen HLS-specific directives. We synthesise each C program to RTL, and use a Verilog simulator to calculate its return value. If synthesis crashes, or if this return value differs from the return value obtained by executing a binary compiled from the C program by GCC, then we have found a candidate bug. We then use trial-and-error to reduce the C program to a minimal version that still triggers a bug.
% We have tested three widely used HLS tools: LegUp~\cite{canis13_legup}, Xilinx Vivado HLS~\cite{xilinx20_vivad_high_synth}, and the Intel HLS Compiler~\cite{?}. For all three tools, we were able to find valid C programs that cause crashes while compiling and valid C programs that cause wrong RTL to be generated. We have submitted a total of \ref{?} bug reports to the developers, \ref{?} of which have been confirmed and \ref{?} of which have now been fixed at the time of writing.