author Yann Herklotz <ymh15@ic.ac.uk> 2020-09-14 15:05:15 +0000
committer overleaf <overleaf@localhost> 2020-09-14 15:47:05 +0000
commit f09e782d0925bc735aadc29bf595d1e3cc187351
tree 9c323f14437787f81274f7aba10be2289f5acc94 /intro.tex
parent 11f9b46c8c5b3152435b3e5de43008e558f980dc
Update on Overleaf.
Diffstat (limited to 'intro.tex')
-rw-r--r-- intro.tex | 76
1 files changed, 37 insertions, 39 deletions
diff --git a/intro.tex b/intro.tex
index 99bffa3..da09f55 100644
--- a/intro.tex
+++ b/intro.tex
@@ -1,4 +1,8 @@
+
\section{Introduction}
+High-level synthesis (HLS), which refers to the automatic translation of software into hardware, is becoming an increasingly important part of the computing landscape.
+It promises to increase the productivity of hardware engineers by raising the abstraction level of their designs, and it promises software engineers the ability to produce application-specific hardware accelerators without having to understand hardware description languages (HDLs) such as Verilog and VHDL.
+It is even being used in high-assurance settings, such as financial services~\cite{hls_fintech}, control systems~\cite{hls_controller}, and real-time object detection~\cite{hls_objdetect}. As such, HLS tools are increasingly relied upon. In this paper, we investigate whether they are trustworthy.
\begin{figure}[t]
\centering
@@ -7,29 +11,23 @@ unsigned int b = 0x1194D7FF;
int a[6] = {1, 1, 1, 1, 1, 1};
int main() {
- int c;
- for (c = 0; c < 2; c++)
- b = b >> a[c];
+ for (int c = 0; c < 2; c++) b = b >> a[c];
return b;
}
\end{minted}
- \caption{Miscompilation bug found in Vivado 2018.3 and 2019.2 which returns \code{0x006535FF} instead of \code{0x046535FF} which is the correct result.}\label{fig:vivado_bug1}
+ \caption{Miscompilation bug found in Xilinx Vivado HLS 2018.3 and 2019.2. The program returns \code{0x006535FF} but the correct result is \code{0x046535FF}. \JW{Collapse lines 5-7 into a single line?}\YH{Yes I think it's good like this}}
+ \label{fig:vivado_bug1}
\end{figure}
-High-level synthesis (HLS), which refers to the automatic translation of software into hardware, is becoming an increasingly important part of the computing landscape.
-It promises to increase the productivity of hardware engineers by raising the abstraction level of their designs, and it promises software engineers the ability to produce application-specific hardware accelerators without having to understand hardware desciption languages (HDL) such as Verilog and VHDL.
-It is even being used in high-assurance settings, such as financial services~\cite{hls_fintech}, control systems~\cite{hls_controller}, and real-time object detection~\cite{hls_objdetect}. As such, HLS tools are increasingly relied upon. In this paper, we investigate whether they are trustworthy.
-
-To test the trustworthiness of HLS tools, we need a robust way of generating programs that both have good coverage and also explores various corner cases.
-Therein lies the difficulty in testing HLS tools.
-Human testing may not achieve both these objectives, as HLS tools are often require complex inputs to trigger wrong behaviour.
-In this paper, we employ program fuzzing on HLS tools.
-
-Fuzzing is an automated testing method that provides unexpected, counter-intuitive and random programs to compilers to test their robustness~\cite{fuzzing+chen+13+taming,fuzz+sun+16+toward,fuzzing+liang+18+survey,fuzzing+zhang+19,yang11_findin_under_bugs_c_compil,lidbury15_many_core_compil_fuzzin}.
-Program fuzzing has been used extensively in testing software compilers.
-For example, Yang \textit{et al.}~\cite{yang11_findin_under_bugs_c_compil} found more than 300 bugs in GCC and clang.
-Despite of the influence of fuzzing on software compilers, to the best of our knowledge, it has not been explored significantly within the HLS context.
-We specifically target HLS by restricting a fuzzer to generate programs within the subset of C supported by HLS.
+The approach we take in this paper is \emph{fuzzing}.
+%To test the trustworthiness of HLS tools, we need a robust way of generating programs that both have good coverage and also explores various corner cases.
+%Therein lies the difficulty in testing HLS tools.
+%Human testing may not achieve both these objectives, as HLS tools are often require complex inputs to trigger wrong behaviour.
+%In this paper, we employ program fuzzing on HLS tools.
+This is an automated testing method in which randomly generated programs are given to compilers to test their robustness~\cite{fuzzing+chen+13+taming,fuzz+sun+16+toward,fuzzing+liang+18+survey,fuzzing+zhang+19,yang11_findin_under_bugs_c_compil,lidbury15_many_core_compil_fuzzin}.
+The generated programs are typically large and rather complex, and they often combine language features in ways that are legal but counter-intuitive; hence they can be effective at exercising corner cases missed by human-designed test suites.
+Fuzzing has been used extensively to test conventional compilers; for example, Yang \textit{et al.}~\cite{yang11_findin_under_bugs_c_compil} used it to reveal more than three hundred bugs in GCC and Clang. In this paper, we bring fuzzing to the HLS context.
+%We specifically target HLS by restricting a fuzzer to generate programs within the subset of C supported by HLS.
% Most fuzzing tools randomly generate random C programs that are then provided to the compiler under test.
@@ -44,31 +42,31 @@ We specifically target HLS by restricting a fuzzer to generate programs within t
% Fuzzing enables us to overcome
-\paragraph{An example of a fuzzed buggy program}
-Figure~\ref{fig:vivado_bug1} shows a minimal example that produces the wrong result during RTL simulation in VivadoHLS, compared to GCC execution.
-In this example, we right shift a large integer value \code{b} by values of array elements, in array \code{a}, within iterations of a \code{for}-loop.
-VivadoHLS returns \code{0x006535FF} instead of \code{0x046535FF} as in GCC.
-The circumstances in which we found this bug shows the challenge of testing HLS tools.
+\paragraph{An example of a compiler bug found by fuzzing}
+Figure~\ref{fig:vivado_bug1} shows a program that produces the wrong result during RTL simulation in Xilinx Vivado HLS. The bug was initially revealed by a large, randomly generated program, which we reduced to the minimal example shown in the figure.
+The program repeatedly shifts a large integer value \code{b} right by the values stored in array \code{a}.
+Vivado HLS returns \code{0x006535FF}, but the result returned by GCC (and subsequently manually confirmed to be the correct one) is \code{0x046535FF}.
-For instance, the for-loop is necessary to ensure that a bug was detected.
-Also, the shift value needs to be accessed from an array.
-Replacing the array accesses within the loop with constants result in the bug not surfacing.
-Additionally, the array \code{a} needed to be at least six elements in size although the for-loop only has two iterations.
-% Any array smaller than that did not surface this bug.
-Finally, the value of \code{b} is an oracle that could not be changed without masking the bug.
-Producing such circumstances within C code for HLS testing is both arduous and counter-intuitive to human testers.
-In contrast, producing non-intuitive, complex but valid C programs is the cornerstone of fuzzing tools.
-Thus, it was natural to adopt program fuzzing for our HLS testing campaign.
+The circumstances in which we found this bug illustrate some of the challenges in testing HLS tools.
+For instance, without the for-loop, the bug goes away.
+Moreover, the bug only appears if the shift values are accessed from an array.
+And -- particularly curiously -- even though the for-loop only has two iterations, the array \code{a} must have at least six elements; if it has fewer than six, the bug disappears.
+Even the seemingly random value of \code{b} could not be changed without masking the bug.
+It seems unlikely that a manually generated test program would bring together all of the components necessary for exposing this bug.
+In contrast, producing counter-intuitive, complex but valid C programs is the cornerstone of fuzzing tools.
+For this reason, we found it natural to adopt fuzzing for our HLS testing campaign.
% \NR{Yann, please double check my claims about the bug. I hope I accurately described what we discussed. }\YH{Yes I agree with all that, I think that is a good description of it}
-\paragraph{Our contributions}
-In this paper, we conduct a widespread testing campaign by fuzzing HLS compilers.
-We do so in the following manner:
+\paragraph{Our contribution}
+This paper reports on our campaign to test HLS tools by fuzzing.
\begin{itemize}
- \item We utilise Csmith~\cite{yang11_findin_under_bugs_c_compil} to generate well-formed C programs from the subset of the C language supported by HLS tools;
- \item Then, we test these programs together with a random selection of HLS directives by comparing the gcc and HLS outputs, and we also keep track of programs that crash HLS tools;
- \item As part of our testing campaign, we generate 10 thousand test cases that we test against the three well-known HLS tools: Vivado HLS~\cite{xilinx20_vivad_high_synth}, LegUp HLS~\cite{canis13_legup} and Intel HLS~\cite{intel20_sdk_openc_applic};
- \item During our testing campaign, we found \ref{XX} bugs that we discuss and also report to the respective developers, where \ref{XX} bugs have been confirmed.
+ \item We use Csmith~\cite{yang11_findin_under_bugs_c_compil} to generate ten thousand valid C programs from within the subset of the C language that is supported by all the HLS tools we test. We augment each program with a random selection of HLS-specific directives.
+
+ \item We give these programs to three widely used HLS tools: Vivado HLS~\cite{xilinx20_vivad_high_synth}, LegUp HLS~\cite{canis13_legup} and Intel HLS~\cite{intel20_sdk_openc_applic}. When we find a program that causes an HLS tool to crash, or to generate hardware that produces a different result from GCC, we reduce it to a minimal example with the help of the C-reduce tool~\cite{creduce}.
+
+ \item Our testing campaign revealed that all three tools could be made to crash while compiling or to generate wrong RTL. In total, we found \ref{XX} bugs across the three tools, all of which have been reported to the respective developers, and \ref{XX} of which have been confirmed at the time of writing.
+
+ \item To investigate whether HLS tools are getting more or less reliable over time, we also tested three different versions of Vivado HLS (2018.3, 2019.1, and 2019.2). \JW{Put a sentence here summarising our findings from this experiment, once we have them.}
\end{itemize}
% we test, and then augment each program with randomly chosen HLS-specific directives. We synthesise each C program to RTL, and use a Verilog simulator to calculate its return value. If synthesis crashes, or if this return value differs from the return value obtained by executing a binary compiled from the C program by gcc, then we have found a candidate bug. We then use trial-and-error to reduce the C program to a minimal version that still triggers a bug.