eval_rewrite.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181

\section{Evaluation}\label{sec:evaluation}

\definecolor{vivado}{HTML}{7fc97f}
\definecolor{intel}{HTML}{beaed4}
\definecolor{legup}{HTML}{fdc086}
\begin{figure}
  \resizebox{0.47\textwidth}{!}{%
    \begin{tikzpicture}
      \draw (-14.5,7.65) rectangle (0,-1);
      \fill[vivado,fill opacity=0.5] (-4.4,4.4) ellipse (3.75 and 2.75);
      \fill[intel,fill opacity=0.5] (-10.2,4.4) ellipse (3.75 and 2.75);
      \fill[legup,fill opacity=0.5] (-7.3,2) ellipse (3.75 and 2.75);
      \draw[black] (-4.4,4.4) ellipse (3.75 and 2.75); % making the
      \draw[black] (-10.2,4.4) ellipse (3.75 and 2.75); % outlines
      \draw[black] (-7.3,2) ellipse (3.75 and 2.75);    % fully opaque
      \node at (-10.2,6.3) {\Large\textsf{\textbf{Vivado HLS}}};
      \node at (-4.4,6.3) {\Large\textsf{\textbf{Intel HLS}}};
      \node at (-7.3,0) {\Large\textsf{\textbf{LegUp}}};

      \node at (-5.5,3) {\Huge 1 (\textcolor{red}{1})};
      \node at (-9.1,3) {\Huge 4 (\textcolor{red}{0})};
      \node at (-3,5) {\Huge 26 (\textcolor{red}{540})};
      \node at (-11.6,5) {\Huge 79 (\textcolor{red}{20})};
      \node at (-7.3,1) {\Huge 162 (\textcolor{red}{6})};
      \node at (-7.3,5.2) {\Huge 0 (\textcolor{red}{5})};
      \node at (-7.3,3.8) {\Huge 0 (\textcolor{red}{0})};
      \node at (-13.6,-0.5) {\Huge 5856};
    \end{tikzpicture}
  }
\caption{A Venn diagram showing the number of failures in each tool out of 6700 test cases that were run.  Overlapping regions mean that the test cases failed in all those tools.  The numbers in parentheses represent the number of test cases that timed out.}\label{fig:existing_tools}
\end{figure}

\begin{table}
  \centering
  \begin{tabular}{lr}\toprule
    \textbf{Tool} & \textbf{Unique Bugs}\\
    \midrule
    Vivado HLS & $\ge 2$\\
    LegUp HLS & $\ge 3$\\
    Intel HLS & $\ge 1$\\
    \bottomrule
  \end{tabular}
  \caption{Unique bugs found in each tool.}
  \label{tab:unique_bugs}
\end{table}

To evaluate the different HLS tools 6,700 test cases were generated and fed into each tool, keeping the test cases constant so that the comparison between the tools was fair.  Three different tools were tested, including three different versions of Vivado HLS, which are shown in Table~\ref{tab:unique_bugs}.  Bugs were found in all tools that were tested, and in total, \ref{??} unique bugs were found and reported to the tool vendors.
We were only able to test one version of LegUp HLS (version 4.0). LegUp 7.5 is GUI-based and not suitable for scripting; however, bugs we found in LegUp 4.0 were reproduced manually in LegUp 7.5.

\subsection{Bugs found}

This section describes some of the bugs that were found in the various tools that were tested, going over different types of bugs in the process.  Each of the following bugs was first reduced automatically using C-Reduce, and then reduced further manually to achieve the minimal test case.

\subsubsection{LegUp Assertion Error}

The piece of code shown in Figure~\ref{fig:eval:legup:assert} produces an assertion error in LegUp 4.0 and 7.5 even though it should compile without any errors.  The assertion error states that This assertion error is equivalent to an unexpected crash of the tool as it means that an unexpected state was reached by this input.  This shows that there is a bug in one of the compilation passes in LegUp, however, due to the assertion the bug is caught in the tool before it produces an incorrect design.

\begin{figure}
\begin{minted}{c}
int a[2][2][1] = {{{0},{1}},{{0},{0}}};

int main() { a[0][1][0] = 1; }
\end{minted}
\caption{An assertion bug in LegUp HLS when setting \texttt{NO\_INLINE} to prevent inlining.}\label{fig:eval:legup:assert}
\end{figure}

The buggy test case has to do with initialisation and assignment to a three dimensional array, for which the above piece of code is the minimal example.  However, in addition to that it requires the \texttt{NO\_INLINE} constant to be set, which disallows inlining of functions.  The code initialises the array with zeros except for \texttt{a[0][1][0]}, which is set to one.  Then the main function assigns one to that same location, which causes LegUp to crash with an assertion error.  This code on its own should not actually produce a result and should just terminate by returning 0, which is also what the design does that LegUp generates when the \texttt{NO\_INLINE} flag is turned off.

%The following code also produces an assertion error in LegUp, which is a different one this time.  This bug was not discovered during the main test runs of 10 thousand test cases, but beforehand, which meant that we disabled unions from being generated.  However, this bug also requires the \texttt{volatile} keyword which seems to be the reason for quite a few mismatches in LegUp and Vivado.
%
%\begin{minted}{c}
%union U1 { int a; };
%
%volatile union U1 un = {0};
%
%int main() { return un.a; }
%\end{minted}

\subsubsection{LegUp Miscompilation}

The test case in Figure~\ref{fig:eval:legup:wrong} produces an incorrect netlist in LegUp 4.0, meaning the result of simulating the design and running the C code directly is different.

\begin{figure}
\begin{minted}{c}
volatile int a = 0;
int b = 1;

int main() { 
  int d = 1;
  if (d + a) b || 1;
  else b = 0;
  return b;
}
\end{minted}
\caption{An output mismatch where GCC returns 1 and LegUp HLS returns 0.}\label{fig:eval:legup:wrong}
\end{figure}

In the code above, \texttt{b} has value 1 when run in GCC, but has value 0 when run with LegUp 4.0.  If the \texttt{volatile} keyword is removed from \texttt{a}, then the netlist contains the correct result.  As \texttt{a} and \texttt{d} are constants, the if-statement should always produce go into the \texttt{true} branch, meaning \texttt{b} should never be set to 0.  The \texttt{true} branch of the if-statement only executes an expression which is not assigned to any variable, meaning the initial state of all variables should not have changed.  However, LegUp HLS generates a design which enters the \texttt{else} branch instead and assigns \texttt{b} to be 0.  The cause of this bug seems to be the \texttt{volatile} keyword and the analysis that is performed to simplify the if-statement.

\subsubsection{Vivado Miscompilation}

Figure~\ref{fig:eval:vivado:mismatch} shows code that does not output the right value when compiled with all Vivado versions and GCC, as it returns \texttt{0x0} with Vivado whereas it should be returning \texttt{0xF}.  This test case is much longer compared to the other test cases that were reduced and could not be made any smaller, as everything in the code is necessary to trigger the bug.

The array \texttt{a} is initialised to all zeros, as well as the other global variables \texttt{g} and \texttt{c}, so as to not introduce any undefined behaviour.  However, \texttt{g} is also assigned the \texttt{volatile} keyword, which ensures that the variable is not optimised away.  The function \texttt{d} then accumulates the values \texttt{b} that it is passed into a hash that is stored in \texttt{c}.  Each \texttt{b} is eight bits wide, so function \texttt{e} calls the function 7 times for some of the bits in the 64 bit value of \texttt{f} that it is passed.  Finally, in the main function, the array is initialised partially with a for loop, after which the \texttt{e} function is called twice, once on the volatile function but also on a constant.  Interestingly enough, the second function call with the constant is also necessary to trigger the bug.

\begin{figure}
\begin{minted}{c}
volatile unsigned int g = 0;
int a[256] = {0};
int c = 0;

void d(char b) { c = (c & 4095) ^ a[(c ^ b) & 15]; }

void e(long f) {
  d(f); d(f >> 8); d(f >> 16); d(f >> 24);
  d(f >> 32); d(f >> 40); d(f >> 48);
}

int main() {
  for (int i = 0; i < 56; i++) a[i] = i;
  e(g); e(-2L);
  return c;
}
\end{minted}
\caption{An output mismatch where GCC returns \texttt{0xF}, whereas Vivado HLS return \texttt{0x0}.}\label{fig:eval:vivado:mismatch}
\end{figure}

\subsection{Bugs in Vivado HLS versions}

In addition to the explanation to bugs given in Section~\ref{sec:evaluation}, bugs found in various versions of Vivado are also analysed, which are shown in Figure~\ref{fig:sankey_diagram}.  The figure depicts failing test cases in 3645 test cases that were passed to Vivado 2018.3, 2019.1 and 2019.2.  All test cases that fail in the same tools are grouped together into a ribbon, showing when a bug is present in one of the tools.

There is a group of failing test cases that is constant between all versions of Vivado HLS, meaning these are bugs that were not fixed between the versions.  Other ribbons can be seen weaving in and out of failing for a version, meaning these bugs were fixed or reintroduced in those versions.  The diagram demonstrates that Vivado HLS 2018.3 contains the most failing test cases compared to the other versions, having 62 test cases fail in total.  Interestingly, Vivado HLS 2019.1 and 2019.2 have a different number of failing test cases, meaning feature improvements that introduced bugs as well as bug fixes between those minor versions.

\definecolor{ribbon1}{HTML}{8dd3c7}
\definecolor{ribbon2}{HTML}{b3de69}
\definecolor{ribbon3}{HTML}{bebada}
\definecolor{ribbon4}{HTML}{fb8072}
\definecolor{ribbon5}{HTML}{80b1d3}
\definecolor{ribbon6}{HTML}{fdb462}
\begin{figure}
  \centering
  \begin{tikzpicture}
    \draw[white, fill=ribbon1] (-1.0,4.1) -- (0.0,4.1000000000000005) to [out=0,in=180] (2.0,4.1000000000000005) to [out=0,in=180] (4.0,4.1000000000000005) -- (6.0,4.1000000000000005) -- %(7.55,3.325) -- 
    (6.0,2.5500000000000003) -- (4.0,2.5500000000000003) to [out=180,in=0] (2.0,2.5500000000000003) to [out=180,in=0] (0.0,2.5500000000000003) -- (-1.0,2.55) -- cycle;
    \draw[white, fill=ribbon2] (-1.0,2.55) -- (0.0,2.5500000000000003) to [out=0,in=180] (2.0,1.8) to [out=0,in=180] (4.0,1.55) -- (6.0,1.55) -- %(7.3,0.9) -- 
    (6.0,0.25) -- (4.0,0.25) to [out=180,in=0] (2.0,0.5) to [out=180,in=0] (0.0,1.25) -- (-1.0,1.25) -- cycle;
    \draw[white, fill=ribbon3] (-1.0,1.25) -- (0.0,1.25) to [out=0,in=180] (2.0,2.5500000000000003) to [out=0,in=180] (4.0,0.25) -- (6.0,0.25) -- %(6.05,0.225) -- 
    (6.0,0.2) -- (4.0,0.2) to [out=180,in=0] (2.0,2.5) to [out=180,in=0] (0.0,1.2000000000000002) -- (-1.0,1.2) -- cycle;
    \draw[white, fill=ribbon4] (-1.0,0.5) -- (0.0,0.5) to [out=0,in=180] (2.0,2.5) to [out=0,in=180] (4.0,0.2) -- (6.0,0.2) -- %(6.2,0.1) --
    (6.0,0.0) -- (4.0,0.0) to [out=180,in=0] (2.0,2.3000000000000003) to [out=180,in=0] (0.0,0.30000000000000004) -- (-1.0,0.3) -- cycle;
    \draw[white, fill=ribbon5] (-1.0,1.2) -- (0.0,1.2000000000000002) to [out=0,in=180] (2.0,0.5) to [out=0,in=180] (4.0,2.5500000000000003) -- (6.0,2.5500000000000003) -- %(6.2,2.45) --
    (6.0,2.35) -- (4.0,2.35) to [out=180,in=0] (2.0,0.30000000000000004) to [out=180,in=0] (0.0,1.0) -- (-1.0,1.0) -- cycle;
    \draw[white, fill=ribbon6] (-1.0,0.3) -- (0.0,0.30000000000000004) to [out=0,in=180] (2.0,0.30000000000000004) to [out=0,in=180] (4.0,2.35) -- (6.0,2.35) -- %(6.3,2.2) -- 
    (6.0,2.0500000000000003) -- (4.0,2.0500000000000003) to [out=180,in=0] (2.0,0.0) to [out=180,in=0] (0.0,0.0) -- (-1.0,0.0) -- cycle;

    \draw[white, fill=black] (-0.4,4.1) rectangle (0.0,1.0);
    \draw[white, fill=black] (1.8,4.1) rectangle (2.2,2.3);
    \draw[white, fill=black] (3.8,4.1) rectangle (4.2,2.05);

    \node at (-0.2,4.5) {2018.3};
    \node at (2,4.5) {2019.1};
    \node at (4,4.5) {2019.2};
    %\node at (2,5) {Vivado HLS};

    \node at (5.5,3.325) {31};
    \node at (5.5,0.9) {26};
    \node at (5.5,2.2) {6};

    \node[white] at (-0.2,1.2) {62};
    \node[white] at (2,2.5) {36};
    \node[white] at (4,2.25) {41};
  \end{tikzpicture}
  \caption{A Sankey diagram that tracks 3645 test cases through three different versions of Vivado HLS. The ribbons collect the test cases that pass and fail together. The 3573 test cases that pass in all three versions are not depicted. }\label{fig:sankey_diagram}
\end{figure}

From this diagram it can also be observed that there must be at least six individual bugs that were found by the fuzzer in Vivado HLS, as a ribbon must contain at least one unique bug.\YH{Contradicts value of 3 in Table~\ref{tab:unique_bugs}, maybe I can change that to 6?}.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: