summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorJohn Wickerson <j.wickerson@imperial.ac.uk>2021-02-25 16:16:40 +0000
committeroverleaf <overleaf@localhost>2021-02-25 17:04:33 +0000
commitcd3dcfdfe35f04449e5e06871640f7b5c6755c74 (patch)
treea62728ef27d4b4318afece83731cff0a9de16834
parent8e3dfda8b700385317b9ed26a93d4f49eb2137e1 (diff)
downloadlatte21_hlstpc-cd3dcfdfe35f04449e5e06871640f7b5c6755c74.tar.gz
latte21_hlstpc-cd3dcfdfe35f04449e5e06871640f7b5c6755c74.zip
Update on Overleaf.
-rw-r--r--main.tex26
1 files changed, 14 insertions, 12 deletions
diff --git a/main.tex b/main.tex
index 7e8468f..9e9864b 100644
--- a/main.tex
+++ b/main.tex
@@ -145,9 +145,10 @@
% - But there's still a long way to go. (List the main things left to do, and how you expect Vericert to compare to LegUp after those things are done.)
\begin{abstract}
- High-level synthesis (HLS) is increasingly being relied upon as the size of designs and needs for hardware accelerators increases. However, HLS is known to be quite flaky with each tool supporting different input languages and sometimes even generating incorrect designs.
+ High-level synthesis (HLS) is increasingly being relied upon as the size of designs and needs for hardware accelerators increases.
+ \JW{I'd be tempted to switch that around to make a more interesting opening: `With hardware designs becoming ever more complex and demand for custom accelerators ever growing, high-level synthesis (HLS) is increasingly being relied upon.'} However, HLS is known to be quite flaky \JW{,} with each tool supporting \JW{subtly} different \JWreplace{input languages}{fragments of the input language} and sometimes even generating incorrect designs.
- We argue that formally verifying HLS could solve this issue by dramatically reducing the amount of trusted code, and providing a formal description of the input language that is supported. In this respect, we have developed \vericert{}, a formally verified HLS tool, based on \compcert{}, a formally verified C compiler.
+ We argue that \JWreplace{formally verifying HLS}{a formally verified HLS tool} could solve this issue by dramatically reducing the amount of trusted code, and providing a formal description of the input language that is supported. \JWcouldcut{In this respect,} \JW{To this end,} we have developed \vericert{}, a formally verified HLS tool, based on \compcert{}, a formally verified C compiler.
\end{abstract}
%% 2012 ACM Computing Classification System (CSS) concepts
@@ -188,27 +189,28 @@
\epigraph{\textit{High-level synthesis research and development is inherently prone to introducing bugs or regressions in the final circuit functionality.}}{--- Andrew Canis~\cite{canis15_legup}\\Co-founder of LegUp Computing}
%\JW{Nice quote; I'd be tempted to tinker with whether it can be formatted a bit more elegantly, like at https://style.mla.org/styling-epigraphs/}
-Research in high-level synthesis (HLS) often concentrates on performance, trying to achieve the lowest area with the shortest run-time. What is often overlooked is ensuring that the high-level synthesis tool is indeed correct, which means that it outputs a correct hardware design. Instead, the design is often meticulously tested, often using the higher level design as a model. As these tests are performed on the hardware design directly, they have to be run on a simulator, which takes much longer than if the original C was tested. Any formal properties obtained from the C code would also have to be checked again in the resulting design, to ensure that these hold there as well, as the synthesis tool may have translated the input incorrectly.
+Research in high-level synthesis (HLS) often concentrates on performance\JWreplace{,}{:} trying to achieve the lowest area with the shortest run-time. What is often overlooked is ensuring that the \JWreplace{high-level synthesis}{HLS} tool is indeed correct, which means that it outputs \JWreplace{a correct hardware design}{hardware designs that are equivalent to the software input}. \JW{As discussed, the rest of this paragraph could be used to justify the claim that existing techniques suffice.} Instead, the design is often meticulously tested, often using the higher level design as a model. \JWreplace{As}{Because/Since} these tests are performed on the hardware design directly, they have to be run on a simulator, which takes much longer than if the original C was tested. \JW{And} Any formal properties obtained from the C code would also have to be checked again in the resulting design, to ensure that these hold there as well, as the synthesis tool may have translated the input incorrectly.
-It is often assumed that because current HLS tools should transform the input specification into a semantically equivalent design, such as mentioned in \citet{lahti19_are_we_there_yet}. However, this is not the case, and as with all complex pieces of software there are bugs in HLS tools as well. For example, Vivado HLS was found to incorrectly apply pipelining optimisations\footnote{\url{https://bit.ly/vivado-hls-pipeline-bug}} or generate wrong designs with valid C code as input, but which the HLS tool interprets differently compared to a C compiler, leading to undefined behaviour. These types of bugs are difficult to identify, and exist because firstly it is not quite clear what input these tools support, and secondly if the output design actually behaves the same as the input.
+It is often assumed that because \JW{This sentence is missing a word or two?} current HLS tools should transform the input specification into a semantically equivalent design, such as mentioned in \citet{lahti19_are_we_there_yet}. However, this is not the case, and as with all complex pieces of software there are bugs in HLS tools as well. For example, Vivado HLS was found to incorrectly apply pipelining optimisations\footnote{\url{https://bit.ly/vivado-hls-pipeline-bug}} or generate wrong designs with valid C code as input\JWcouldcut{, but which the HLS tool interprets differently compared to a C compiler, leading to undefined behaviour}. These types of bugs are difficult to identify, and exist because firstly it is not quite clear what input these tools support, and secondly \JWreplace{if}{whether} the output design actually behaves the same as the input.
-Our position is that a formally verified high-level synthesis tool could be the solution to these problems. It not only guarantees that the output is correct, but also brings a formal specification of the input and output language semantics. These are the only parts of the compiler that need to be trusted, and if these are well-specified, then the behaviour of the resulting design can be fully trusted. In addition to that, if the semantics of the input semantics are taken from a tool that is widely trusted already, then there should not be any strange behaviour; the resultant design will either behave exactly like the input specification, or the translation will fail early at compile time.
+\paragraph{Our position} is that a formally verified high-level synthesis tool could be the solution to these problems. It not only guarantees that the output is correct, but also brings a formal specification of the input and output language semantics. These are the only parts of the compiler that need to be trusted, and if these are well-specified, then the behaviour of the resulting design can be fully trusted. In addition to that, if the semantics of the input semantics are taken from a tool that is widely trusted already, then there should not be any strange behaviour; the resultant design will either behave exactly like the input specification, or the translation will fail early at compile time.
-\section{Arguments against formalised HLS}
+\JW{Outline what the rest of the paper is about. E.g. `In what follows, we will argue our position by presenting several possible \emph{objections} to our position, and then responding to each in turn.'}
-\paragraph{People should not be designing hardware in C to begin with.}
+\section{Arguments against formalised HLS}
-In fact, formally verifying HLS of C is the wrong approach, as it should not be used to design hardware, let alone hardware that is important to be reliable. Instead, there have been many efforts to formally verify the translation of high-level hardware description languages like Bluespec with K\^{o}i\-ka~\cite{bourgeat20_essen_blues}, formalising the synthesis of Verilog into technology mapped net lists with Lutsig~\cite{loow21_lutsig}, or work on formalising circuit design in Coq itself to ease design verification~\cite{choi17_kami,singh_silver_oak}.
+\paragraph{Objection 1: People should not be designing hardware in C to begin with.}
+\JWcouldcut{In fact,} formally verifying HLS of C is the wrong approach\JWreplace{, as}{. C} it should not be used to design hardware, let alone hardware \JWreplace{that is important to be reliable}{where reliability is crucial}. Instead, there have been many efforts to formally verify the translation of high-level hardware description languages like Bluespec with K\^{o}i\-ka~\cite{bourgeat20_essen_blues}, formalising the synthesis of Verilog into \JWreplace{technology mapped}{technology-mapped} \JWreplace{net lists}{net-lists} with Lutsig~\cite{loow21_lutsig}, or work on formalising circuit design in Coq itself to ease design verification~\cite{choi17_kami,singh_silver_oak}.\JW{Yes, very nicely put.}
-However, verifying HLS also important. Not only in HLS becoming more popular, as it requires much less design effort to produce new hardware~\cite{lahti19_are_we_there_yet}, but much of that convenience comes from the easy behavioural testing that HLS allows to ensure correct functionality of the design. This assumes that HLS tools are correct.
+\paragraph{Our response:} However, verifying HLS \JW{is} also important. Not only \JWreplace{in}{is} HLS becoming more popular, as it requires much less design effort to produce new hardware~\cite{lahti19_are_we_there_yet}, but much of that convenience comes from the easy behavioural testing that HLS allows to ensure correct functionality of the design. This assumes that HLS tools are correct. \JW{Richard Bornat used to have a nice saying about `proving the programs that people actually write'. You could say you're engaging with the kind of toolflow (i.e. C to Verilog) that hardware designers actually use (and for that, you could cite that magazine article we cited in the PLDI submission about `most hardware designs' start life as a C program').}
\paragraph{Existing approaches for testing or formally verifying hardware designs are sufficient for ensuring reliability.}
-\citet{du21_fuzzin_high_level_synth_tools} showed that on average 2.5\% of randomly generated C, tailored to the specific HLS tool, end up with incorrect designs. These bugs were reported and confirmed to be new bugs in the tools, demonstrating that existing internal tests did not catch these bugs. In addition to that, existing verification techniques for checking the output of HLS tools may not be enough to catch these bugs reliably. Checking the final design against the original model using a test bench may miss many edge cases that produce bugs.
+\citet{du21_fuzzin_high_level_synth_tools} showed that on average 2.5\% of randomly generated C \JW{programs}, tailored to the specific HLS tool, end up with incorrect designs. These bugs were reported and confirmed to be new bugs in the tools, demonstrating that existing internal tests did not catch \JWreplace{these bugs}{them}. \JWreplace{In addition to that,}{And} existing verification techniques for checking the output of HLS tools may not be enough to catch these bugs reliably. Checking the final design against the original model using a test bench may miss many edge cases that produce bugs.
-There has been research on performing equivalence checks on the design and the behavioural input, focusing on creating translation validators~\cite{pnueli98_trans} to prove the equivalence between the design and input code, while supporting various optimisations such as scheduling~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synth,chouksey20_verif_sched_condit_behav_high_level_synth} or code motion~\cite{banerjee14_verif_code_motion_techn_using_value_propag,chouksey19_trans_valid_code_motion_trans_invol_loops}. However, these aren't perfect solutions either, as there is no guarantee that these proofs really compose with each other. This means that an equivalence checkers are normally designed to check the translation from start to finish, which is computationally expensive, as well as possibly being unreliable due to a combination of optimisations producing an output that cannot be matched to the input, even though it is correct.
+There has been research on performing equivalence checks \JWreplace{on}{between} the \JW{output} design and the behavioural input, \JW{There are a few different phrases in the paper: `behavioural input', `software input', `higher level design', `input code' etc. I think it would be good to consistentify this a bit. Also it's sometimes not completely clear whether `design' means `software design' or `hardware design'.} focusing on creating translation validators~\cite{pnueli98_trans} to prove the equivalence between the design and input code, while supporting various optimisations such as scheduling~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synth,chouksey20_verif_sched_condit_behav_high_level_synth} or code motion~\cite{banerjee14_verif_code_motion_techn_using_value_propag,chouksey19_trans_valid_code_motion_trans_invol_loops}. However, these aren't perfect solutions either, as there is no guarantee that these proofs really compose with each other. This means that \JWcouldcut{an} equivalence checkers are \JWreplace{normally}{often} designed to check the translation from start to finish, which is computationally expensive, as well as possibly being \JWreplace{unreliable}{highly incomplete?} due to a combination of optimisations producing an output that cannot be matched to the input, even though it is correct.
-The radical solution to this problem is to formally verify the complete tool. This has been proven to be successful, in \compcert{}~\cite{leroy09_formal_verif_realis_compil}, for example, which is a formally verified C compiler written in Coq~\cite{coquand86}. The reliability of formal verification in Coq was demonstrated by CSmith~\cite{yang11_findin_under_bugs_c_compil}, a random, valid C generator, which found more than 300 bugs in GCC and Clang, whereas only 5 bugs were found in the unverified parts of \compcert{}, which prompted the verification of those parts as well.
+The radical solution to this problem is to formally verify the \JWreplace{complete}{whole} tool. This has been \JWreplace{proven}{shown?} to be successful, in \compcert{}~\cite{leroy09_formal_verif_realis_compil}, for example, which is a formally verified C compiler written in Coq~\cite{coquand86}. The reliability of \JWreplace{formal verification in Coq}{a formally verified compiler} was demonstrated by CSmith~\cite{yang11_findin_under_bugs_c_compil}, a random, valid C generator, which found more than 300 bugs in GCC and Clang, whereas only 5 bugs were found in the unverified parts of \compcert{}, which prompted the verification of those parts as well. \JW{If this needs condensing, you could say: `whereas no bugs were found in the verified parts of \compcert{}.'}
\paragraph{HLS applications don't require the levels of reliability that a formally verified compiler affords.}