path: root/intro.tex
author: Yann Herklotz <ymh15@ic.ac.uk> 2021-04-04 20:12:20 +0000
committer: overleaf <overleaf@localhost> 2021-04-04 20:18:08 +0000
commit: 62a127dfb009b8ffe94ac348ecafb7f596406cbd
tree: 7dbee2f45b6baa1edc4054d32610ff2b1fad6b5b /intro.tex
parent: adc0afcec6fe025f85fbfdfdfc5ef522fa760d98
Update on Overleaf.
Diffstat (limited to 'intro.tex')
-rw-r--r-- intro.tex | 12
1 files changed, 7 insertions, 5 deletions
diff --git a/intro.tex b/intro.tex
index 4432fc1..0ceab4e 100644
--- a/intro.tex
+++ b/intro.tex
@@ -1,7 +1,7 @@
\section{Introduction}
-High-level synthesis (HLS), which refers to the automatic translation of software into hardware, is becoming an increasingly important part of the computing landscape, even in such high-assurance settings as financial services~\cite{hls_fintech}, control systems~\cite{hls_controller}, and real-time object detection~\cite{hls_objdetect}.
-The appeal of HLS is twofold: it promises hardware engineers an increase in productivity by raising the abstraction level of their designs, and it promises software engineers the ability to produce application-specific hardware accelerators without having to understand Verilog and VHDL.
+High-level synthesis (HLS), which refers to the automatic translation of software into hardware, is becoming an important part of the computing landscape, even in such high-assurance settings as financial services~\cite{hls_fintech}, control systems~\cite{hls_controller}, and real-time object detection~\cite{hls_objdetect}.
+The appeal of HLS is twofold: it promises hardware engineers an increase in productivity by raising the abstraction level of their designs, and it promises software engineers the ability to produce application-specific hardware accelerators without having to understand Verilog or VHDL.
As such, we are increasingly reliant on HLS tools. But are these tools reliable? Questions have been raised about the reliability of HLS before; for example, Andrew Canis, co-creator of the LegUp HLS tool, wrote that ``high-level synthesis research and development is inherently prone to introducing bugs or regressions in the final circuit functionality''~\cite[Section 3.4.6]{canis15_legup}. In this paper, we investigate whether there is substance to this concern by conducting an empirical evaluation of the reliability of several widely used HLS tools.
@@ -50,7 +50,7 @@ int main() {
\label{fig:vivado_bug1}
\end{figure}
-The example above demonstrates the effectiveness of fuzzing. It seems unlikely that a human-written test-suite would discover this particular bug, given that it requires several components all to coincide before the bug is revealed!
+The example above demonstrates the effectiveness of fuzzing. It seems unlikely that a human-written test-suite would discover this particular bug, given that it requires several components all to coincide before the bug is revealed. If the loop is unrolled, or the seemingly random value of \code{b} is simplified, or the array is declared with fewer than six elements (even though only two are accessed), then the bug goes away.
Yet this example also raises the question: do bugs found by fuzzers really \emph{matter}, given that they are usually found by combining language features in ways that are vanishingly unlikely to happen `in the real world'~\cite{marcozzi+19}? This question is especially pertinent for our particular context of HLS tools, which are well known to have restrictions on the language features they support. Nevertheless, although the \emph{test-cases} we generated do not resemble the programs that humans write, the \emph{bugs} that we exposed using those test-cases are real, and \emph{could also be exposed by realistic programs}.
%Moreover, it is worth noting that HLS tools are not exclusively provided with human-written programs to compile: they are often fed programs that have been automatically generated by another compiler.
@@ -61,9 +61,11 @@ Ultimately, we believe that any errors in an HLS tool are worth identifying beca
Our approach to fuzzing HLS tools comprises three steps.
First, we use Csmith~\cite{yang11_findin_under_bugs_c_compil} to generate thousands of valid C programs within the subset of the C language that is supported by all the HLS tools we test. We also augment each program with a random selection of HLS-specific directives. Second, we give these programs to four widely used HLS tools: Xilinx Vivado HLS~\cite{xilinx20_vivad_high_synth}, LegUp HLS~\cite{canis13_legup}, the Intel HLS Compiler, also known as i++~\cite{intel20_sdk_openc_applic}, and finally Bambu~\cite{pilato13_bambu}. Third, if we find a program that causes an HLS tool to crash or to generate hardware that produces a different result from GCC, we reduce it to a minimal example with the help of \creduce{}~\cite{creduce}.
-Our testing campaign revealed that all four tools could be made to generate an incorrect design. In total, \totaltestcases{} test-cases were run through each tool, of which \totaltestcasefailures{} failed in at least one of the tools. Test-case reduction was then performed on some of these failing test-cases to obtain at least \numuniquebugs{} unique failing test-cases.
+Our testing campaign revealed that all four tools could be made to generate an incorrect design. In total, \totaltestcases{} test-cases were run through each tool, of which \totaltestcasefailures{} failed in at least one of the tools. Test-case reduction was then performed on some of these failing test-cases to obtain at least \numuniquebugs{} unique failing test-cases, detailed on our companion webpage: \begin{center}
+ \url{https://ymherklotz.github.io/fuzzing-hls/}
+\end{center}
-To investigate whether HLS tools are getting more or less reliable over time, we also tested three different versions of Vivado HLS (v2018.3, v2019.1, and v2019.2). We found far fewer failures in versions v2019.1 and v2019.2 compared to v2018.3, but we also identified a few test-cases that only failed in versions v2019.1 and v2019.2; this suggests that some new features may have introduced bugs.
+To investigate whether HLS tools are getting more or less reliable, we also tested three different versions of Vivado HLS (v2018.3, v2019.1, and v2019.2). We found fewer failures in v2019.1 and v2019.2 compared to v2018.3, but also identified a few test-cases that only failed in v2019.1 and v2019.2; this suggests that new features may have introduced bugs.
In summary, the overall aim of our paper is to raise awareness about the reliability (or lack thereof) of current HLS tools, and to serve as a call-to-arms for investment in better-engineered tools. We hope that future work on developing more reliable HLS tools will find our empirical study a valuable source of motivation.
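
The three-step approach described in the introduction (generate programs, run them through each tool, flag any crash or any result that differs from the reference compiler) amounts to a differential-testing loop. The following is a minimal Python sketch of that loop under stated assumptions, not the authors' actual harness: the names `Tool`, `classify`, and `run_campaign` are hypothetical, and each tool is modelled as a plain function from program text to a result rather than as a real invocation of GCC or of any HLS tool. Program generation with Csmith and reduction with \creduce{} are deliberately left out.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical model: a "tool" takes C source text and returns the value the
# compiled program (or simulated hardware design) finally produces.
Tool = Callable[[str], int]


def classify(program: str, reference: Tool, tool: Tool) -> str:
    """Return 'pass', 'crash', or 'mismatch' for one tool on one program."""
    expected = reference(program)
    try:
        actual = tool(program)
    except Exception:
        # Any exception stands in for the tool crashing on this input.
        return "crash"
    return "pass" if actual == expected else "mismatch"


def run_campaign(
    programs: Iterable[str], reference: Tool, tools: Dict[str, Tool]
) -> List[Tuple[str, str, str]]:
    """Collect every (program, tool name, verdict) that is not a pass."""
    failures = []
    for program in programs:
        for name, tool in tools.items():
            verdict = classify(program, reference, tool)
            if verdict != "pass":
                failures.append((program, name, verdict))
    return failures
```

In a real campaign, each entry of `tools` would wrap an external tool invocation plus simulation of the generated design, and every tuple returned by `run_campaign` would be a candidate for test-case reduction.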