summaryrefslogtreecommitdiffstats
path: root/main.tex
blob: 6b478a1a079459d149851afc38f2cad5189cf1cc (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
%% For double-blind review submission, w/o CCS and ACM Reference (max submission space)
\documentclass[sigplan,10pt,pagebackref=true,review]{acmart}\settopmatter{printfolios=true,printccs=false,printacmref=false}
%\documentclass[pagebackref=true,acmsmall,review]{acmart}\settopmatter{printfolios=true,printccs=false,printacmref=false}
%% For double-blind review submission, w/ CCS and ACM Reference
%\documentclass[acmsmall,review,anonymous]{acmart}\settopmatter{printfolios=true}
%% For single-blind review submission, w/o CCS and ACM Reference (max submission space)
%\documentclass[acmsmall,review]{acmart}\settopmatter{printfolios=true,printccs=false,printacmref=false}
%% For single-blind review submission, w/ CCS and ACM Reference
%\documentclass[acmsmall,review]{acmart}\settopmatter{printfolios=true}
%% For final camera-ready submission, w/ required CCS and ACM Reference
%\documentclass[acmsmall]{acmart}\settopmatter{}

%% Journal information
%% Supplied to authors by publisher for camera-ready submission;
%% use defaults for review submission.
\acmJournal{PACMPL}
\acmVolume{1}
\acmNumber{CONF} % CONF = POPL or ICFP or OOPSLA
\acmArticle{1}
\acmYear{2018}
\acmMonth{1}\acmDOI{}
\startPage{1}

%% Copyright information
%% Supplied to authors (based on authors' rights management selection;
%% see authors.acm.org) by publisher for camera-ready submission;
%% use 'none' for review submission.
\setcopyright{none}
%\setcopyright{acmcopyright}
%\setcopyright{acmlicensed}
%\setcopyright{rightsretained}
%\copyrightyear{2018}

%% Bibliography style
\bibliographystyle{ACM-Reference-Format}
%% Citation style
%\citestyle{acmauthoryear}

\usepackage{amsmath}
\usepackage{booktabs}
\usepackage{mathpartir}
\usepackage{pgfplots}
\usepackage{soul}
\usepackage{subcaption}
\usepackage{tikz}
\usepackage{minted}
\usepackage{microtype}
\usepackage{epigraph}

\usetikzlibrary{shapes,calc,arrows.meta}
\pgfplotsset{compat=1.16}

\setminted{fontsize=\small}
\usemintedstyle{manni}

\newif\ifANONYMOUS
\ANONYMOUSfalse

\newif\ifCOMMENTS
\COMMENTStrue
\newcommand{\Comment}[3]{\ifCOMMENTS\textcolor{#1}{{\bf [\![#2:} #3{\bf ]\!]}}\fi}
\newcommand\JW[1]{\Comment{red!75!black}{JW}{#1}}
\newcommand\YH[1]{\Comment{green!50!blue}{YH}{#1}}

\newcommand\JWcouldcut[1]{{\st{#1}}}
\newcommand\JWreplace[2]{{\st{#1} \JW{#2}}}

\definecolor{compcert}{HTML}{66c2a5}
\definecolor{formalhls}{HTML}{fc8d62}
\definecolor{keywordcolour}{HTML}{8f0075}
\definecolor{functioncolour}{HTML}{721045}
\definecolor{constantcolour}{HTML}{0000bb}

\newcommand\yhkeywordsp[1]{\;\;\texttt{\textcolor{keywordcolour}{#1}}}
\newcommand\yhkeyword[1]{\texttt{\textcolor{keywordcolour}{#1}}}
\newcommand\yhfunctionsp[1]{\;\;\texttt{\textcolor{functioncolour}{#1}}}
\newcommand\yhfunction[1]{\texttt{\textcolor{functioncolour}{#1}}}
\newcommand\yhconstant[1]{\texttt{\textcolor{constantcolour}{#1}}}

\ifANONYMOUS
\newcommand{\vericert}{HLS\-Cert}
\else
\newcommand{\vericert}{Veri\-cert}
\fi
\newcommand{\compcert}{Comp\-Cert}
\newcommand{\legup}{Leg\-Up}

\newcommand{\slowdownOrig}{56}
\newcommand{\slowdownDiv}{10}
\newcommand{\areaIncr}{21}


\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem*{remark}{Remark}

\begin{document}

%% Title information
\title[HLS Tools Should be Proven Correct]{High-Level Synthesis Tools Should be Proven Correct}
                                        %% when present, will be used in
                                        %% header instead of Full Title.
%\titlenote{with title note}             %% \titlenote is optional;
                                        %% can be repeated if necessary;
                                        %% contents suppressed with 'anonymous'
%\subtitle{Subtitle}                     %% \subtitle is optional
%\subtitlenote{with subtitle note}       %% \subtitlenote is optional;
                                        %% can be repeated if necessary;
                                        %% contents suppressed with 'anonymous'


%% Author information
%% Contents and number of authors suppressed with 'anonymous'.
%% Each author should be introduced by \author, followed by
%% \authornote (optional), \orcid (optional), \affiliation, and
%% \email.
%% An author may have multiple affiliations and/or emails; repeat the
%% appropriate command.
%% Many elements are not rendered, but should be provided for metadata
%% extraction tools.

%% Author with single affiliation.
\author{Yann Herklotz}
\orcid{0000-0002-2329-1029}
\affiliation{
  \institution{Imperial College London, UK}
  %\country{UK}
}
\email{yann.herklotz15@imperial.ac.uk}

\author{John Wickerson}
\orcid{0000-0000-0000-0000}
\affiliation{
  \institution{Imperial College London, UK}
  %\country{UK}
}
\email{j.wickerson@imperial.ac.uk}

% JW notes:
% - High-level synthesis is increasingly being relied upon.
% - But it's really flaky. (Cite bugs from FCCM submission etc.)
% - There exist some workarounds. (Testing the output, formally verifying the output, etc.)
% - The time has come to prove the tool itself correct. (Mention success of Compcert and other fully verified tools?)
% - We've made some encouraging progress on this front in a prototype tool called Vericert. (Summarise current state of Vericert, and how it compares performance-wise to LegUp.)
% - But there's still a long way to go. (List the main things left to do, and how you expect Vericert to compare to LegUp after those things are done.)

\begin{abstract}
  With hardware designs becoming ever more complex and demand for custom accelerators ever growing, high-level synthesis (HLS) is increasingly being relied upon.  However, HLS is known to be quite flaky, with each tool supporting subtly different fragments of the input language and sometimes even generating incorrect designs.

  We argue that a formally verified HLS tool could solve this issue by dramatically reducing the amount of trusted code, and providing a formal description of the input language that is supported. To this end, we have developed \vericert{}, a formally verified HLS tool, based on \compcert{}, a formally verified C compiler.
\end{abstract}

%% 2012 ACM Computing Classification System (CSS) concepts
%% Generate at 'http://dl.acm.org/ccs/ccs.cfm'.
\begin{CCSXML}
<ccs2012>
<concept>
<concept_id>10011007.10011006.10011008</concept_id>
<concept_desc>Software and its engineering~General programming languages</concept_desc>
<concept_significance>500</concept_significance>
</concept>
<concept>
<concept_id>10003456.10003457.10003521.10003525</concept_id>
<concept_desc>Social and professional topics~History of programming languages</concept_desc>
<concept_significance>300</concept_significance>
</concept>
</ccs2012>
\end{CCSXML}

\ccsdesc[500]{Software and its engineering~General programming languages}
\ccsdesc[300]{Social and professional topics~History of programming languages}
%% End of generated code


%% Keywords
%% comma separated list
\keywords{\compcert{}, Coq, high-level synthesis, C, Verilog}

\maketitle

\section{Introduction}

%\JW{I removed the `All' from the title, as I think that's a bit strong; it also means that the title fit on one line now :).}

\renewcommand{\epigraphsize}{\normalsize}
\renewcommand{\epigraphflush}{center}
\renewcommand{\epigraphrule}{0pt}
\epigraph{\textit{High-level synthesis research and development is inherently prone to introducing bugs or regressions in the final circuit functionality.}}{--- Andrew Canis~\cite{canis15_legup}\\Co-founder of LegUp Computing}

%\JW{Nice quote; I'd be tempted to tinker with whether it can be formatted a bit more elegantly, like at https://style.mla.org/styling-epigraphs/} 
Research in high-level synthesis (HLS) often concentrates on performance\JWreplace{,}{:} trying to achieve the lowest area with the shortest run-time.  What is often overlooked is ensuring that the \JWreplace{high-level synthesis}{HLS} tool is indeed correct, which means that it outputs \JWreplace{a correct hardware design}{hardware designs that are equivalent to the software input}. \JW{As discussed, the rest of this paragraph could be used to justify the claim that existing techniques suffice.} Instead, the design is often meticulously tested, often using the higher level design as a model. \JWreplace{As}{Because/Since} these tests are performed on the hardware design directly, they have to be run on a simulator, which takes much longer than if the original C was tested. \JW{And} Any formal properties obtained from the C code would also have to be checked again in the resulting design, to ensure that these hold there as well, as the synthesis tool may have translated the input incorrectly.

It is often assumed that current HLS tools transform the input specification into a semantically equivalent design, as mentioned in \citet{lahti19_are_we_there_yet}.  However, this is not the case, and as with all complex pieces of software there are bugs in HLS tools as well.  For example, Vivado HLS was found to incorrectly apply pipelining optimisations\footnote{\url{https://bit.ly/vivado-hls-pipeline-bug}} or generate wrong designs with valid C code as input\JWcouldcut{, but which the HLS tool interprets differently compared to a C compiler, leading to undefined behaviour}.  These types of bugs are difficult to identify, and exist because firstly it is not quite clear what input these tools support, and secondly \JWreplace{if}{whether} the output design actually behaves the same as the input.

\paragraph{Our position} is that a formally verified high-level synthesis tool could be the solution to these problems.  It not only guarantees that the output is correct, but also brings a formal specification of the input and output language semantics.  These are the only parts of the compiler that need to be trusted, and if these are well-specified, then the behaviour of the resulting design can be fully trusted.  In addition to that, if the semantics of the input language are taken from a tool that is widely trusted already, then there should not be any strange behaviour; the resultant design will either behave exactly like the input specification, or the translation will fail early at compile time.

\JW{Outline what the rest of the paper is about. E.g. `In what follows, we will argue our position by presenting several possible \emph{objections} to our position, and then responding to each in turn.'}

\section{Arguments against formalised HLS}

\paragraph{Objection 1: People should not be designing hardware in C to begin with.}
\JWcouldcut{In fact,} formally verifying HLS of C is the wrong approach\JWreplace{, as}{. C} it should not be used to design hardware, let alone hardware \JWreplace{that is important to be reliable}{where reliability is crucial}. Instead, there have been many efforts to formally verify the translation of high-level hardware description languages like Bluespec with K\^{o}i\-ka~\cite{bourgeat20_essen_blues}, formalising the synthesis of Verilog into \JWreplace{technology mapped}{technology-mapped} \JWreplace{net lists}{net-lists} with Lutsig~\cite{loow21_lutsig}, or work on formalising circuit design in Coq itself to ease design verification~\cite{choi17_kami,singh_silver_oak}.\JW{Yes, very nicely put.}

\paragraph{Our response:} However, verifying HLS \JW{is} also important.  Not only \JWreplace{in}{is} HLS becoming more popular, as it requires much less design effort to produce new hardware~\cite{lahti19_are_we_there_yet}, but much of that convenience comes from the easy behavioural testing that HLS allows to ensure correct functionality of the design.  This assumes that HLS tools are correct. \JW{Richard Bornat used to have a nice saying about `proving the programs that people actually write'. You could say you're engaging with the kind of toolflow (i.e. C to Verilog) that hardware designers actually use (and for that, you could cite that magazine article we cited in the PLDI submission about `most hardware designs' start life as a C program').}

\paragraph{Existing approaches for testing or formally verifying hardware designs are sufficient for ensuring reliability.}

\citet{du21_fuzzin_high_level_synth_tools} showed that on average 2.5\% of randomly generated C \JW{programs}, tailored to the specific HLS tool, end up with incorrect designs.  These bugs were reported and confirmed to be new bugs in the tools, demonstrating that existing internal tests did not catch \JWreplace{these bugs}{them}. \JWreplace{In addition to that,}{And} existing verification techniques for checking the output of HLS tools may not be enough to catch these bugs reliably.  Checking the final design against the original model using a test bench may miss many edge cases that produce bugs.

There has been research on performing equivalence checks \JWreplace{on}{between} the \JW{output} design and the behavioural input, \JW{There are a few different phrases in the paper: `behavioural input', `software input', `higher level design', `input code' etc. I think it would be good to consistentify this a bit. Also it's sometimes not completely clear whether `design' means `software design' or `hardware design'.} focusing on creating translation validators~\cite{pnueli98_trans} to prove the equivalence between the design and input code, while supporting various optimisations such as scheduling~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synth,chouksey20_verif_sched_condit_behav_high_level_synth} or code motion~\cite{banerjee14_verif_code_motion_techn_using_value_propag,chouksey19_trans_valid_code_motion_trans_invol_loops}.  However, these aren't perfect solutions either, as there is no guarantee that these proofs really compose with each other.  This means that \JWcouldcut{an} equivalence checkers are \JWreplace{normally}{often} designed to check the translation from start to finish, which is computationally expensive, as well as possibly being \JWreplace{unreliable}{highly incomplete?} due to a combination of optimisations producing an output that cannot be matched to the input, even though it is correct.

The radical solution to this problem is to formally verify the \JWreplace{complete}{whole} tool.  This has been \JWreplace{proven}{shown?} to be successful, in \compcert{}~\cite{leroy09_formal_verif_realis_compil}, for example, which is a formally verified C compiler written in Coq~\cite{coquand86}.  The reliability of \JWreplace{formal verification in Coq}{a formally verified compiler} was demonstrated by CSmith~\cite{yang11_findin_under_bugs_c_compil}, a random, valid C generator, which found more than 300 bugs in GCC and Clang, whereas only 5 bugs were found in the unverified parts of \compcert{}, which prompted the verification of those parts as well. \JW{If this needs condensing, you could say: `whereas no bugs were found in the verified parts of \compcert{}.'}

\paragraph{HLS applications don't require the levels of reliability that a formally verified compiler affords.}

One might argue that developing a formally verified tool in a theorem prover and proving correctness theorems about it might be too tedious and take too long, and that HLS tools specifically do not need that kind of reliability.  With our experience developing a verified HLS tool called \vericert{}~\cite{herklotz21_formal_verif_high_level_synth} based on \compcert{}, we found that it normally takes $5\times$ or $10\times$ longer to prove a translation correct compared to writing the algorithm.

However, this could be seen as being beneficial, as proving the correctness of the HLS tool proves the absence of any bugs according to the language semantics, meaning much less time has to be spent on fixing bugs.  In addition to that, verification also forces the algorithm to deal with many different edge cases that may be hard to identify normally, and may even allow for more optimisations as one can be certain about assumptions one would usually have to make.

\paragraph{Any HLS tool that is simple enough for formal verification to be feasible won't produce sufficiently optimised designs to be useful.}

Another concern might be that a verified HLS tool might not be performant enough to be usable in practice.  If that is the case, then the verification effort could be seen as useless, as it could not be used.  Taking \vericert{} as an example, which does not currently include many optimisations, we found that, when comparing the fully verified bits of \vericert{} and \legup{}~\cite{canis11_legup}, the speed and area were comparable ($1\times$--$1.5\times$) to that of \legup{} without LLVM optimisations and without operation chaining.  With those optimisations fully turned on, \vericert{} is around $4.5\times$ slower than \legup{}, with half of the speed-up being due to LLVM.\@

There are many optimisations that need to be added to \vericert{} to turn it into a viable and competitive HLS tool.  First of all, the most important addition is a good scheduling implementation, which supports operation chaining and proper pipelining of operators.  Our main focus is implementing scheduling based on systems of difference constraints~\cite{cong06_sdc}, which is the same algorithm \legup{} uses.  With this optimisation turned on, \vericert{} is only $\sim 2\times$ to $\sim 3\times$ slower than fully optimised \legup{}, with a slightly larger area.  The scheduling step is implemented using verified translation validation, meaning the scheduling algorithm can be tweaked and optimised without ever having to touch the correctness proof.

\paragraph{Even a formally verified HLS tool can't give absolute guarantees about the hardware designs it produces.}

It is true that a verified tool is still allowed to fail at compilation time, meaning none of the correctness proofs need to hold if no output is produced.  However, this is mostly a matter of putting more engineering work into the tool to make it reliable.  If a test bench is available, it is also quite simple to check this property, as it just has to be randomly tested without even having to execute the output.

In addition to that, specifically for an HLS tool taking C as input, undefined behaviour will allow the HLS tool to behave any way it wishes.  This becomes even more important when passing the C to a verified HLS tool, as if it is not free of undefined behaviour, then none of the proofs will hold.  Extra steps therefore need to be performed to ensure that the input is free of any undefined behaviour.

Finally, the input and output language semantics also need to be trusted, as the proofs only hold as long as the semantics are a true representation of the language.  In \vericert{} this comes down to trusting the C semantics developed by \compcert{}~\cite{blazy09_mechan_seman_cligh_subset_c_languag}. Trusting the Verilog semantics is not that easy, since it is not quite clear which semantics to take.  Verilog can either be simulated or synthesised into hardware, and has different semantics based on how it is used.  We therefore use existing Verilog semantics by \citet{loow19_proof_trans_veril_devel_hol}, which are designed to only model a small part of the semantics, while ensuring that the behaviour for synthesis and simulation stays the same.

\section{Conclusion}

In conclusion, we have demonstrated that a realistic HLS tool should be proven correct, and even though its performance does not yet match that of state-of-the-art HLS tools, as formally verified optimisations are added, similar performance may be achieved.  In addition to that, a formally verified HLS tool, written in a theorem prover, would provide a good base to mechanically verify other HLS-specific optimisations and help develop more reliable HLS tools in the future.

%% Acknowledgments
%\begin{acks}%% acks environment is optional
%% contents suppressed with 'anonymous'
  %% Commands \grantsponsor{<sponsorID>}{<name>}{<url>} and
  %% \grantnum[<url>]{<sponsorID>}{<number>} should be used to
  %% acknowledge financial support and will be used by metadata
  %% extraction tools.
%\end{acks}


%% Bibliography
\bibliography{references.bib}


%% Appendix
%\input{appendix}

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% TeX-command-extra-options: "-shell-escape"
%%% End: