From 433f6cf9cb6b6acfaf7094bd8727a87718a20830 Mon Sep 17 00:00:00 2001
From: Yann Herklotz
Date: Wed, 6 Apr 2022 20:48:13 +0100
Subject: Fix up the fonts

---
 hls.tex | 107 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/hls.tex b/hls.tex
index 4ea51d8..4feaa14 100644
--- a/hls.tex
+++ b/hls.tex
@@ -1 +1,108 @@
\chapter{Formal Verification of High-Level Synthesis}

\newcommand{\slowdownOrig}{27}
\newcommand{\slowdownDiv}{2}
\newcommand{\areaIncr}{1.1}

\paragraph{Can you trust your high-level synthesis tool?}

As latency, throughput, and energy efficiency become increasingly important, custom hardware
accelerators are being designed for numerous applications. Alas, designing these accelerators can
be a tedious and error-prone process using a hardware description language (HDL) such as Verilog.
An attractive alternative is \emph{high-level synthesis} (HLS), in which hardware designs are
automatically compiled from software written in a high-level language like C. Modern HLS tools such
as LegUp~\cite{canis11_legup}, Vivado HLS~\cite{xilinx20_vivad_high_synth}, Intel
i++~\cite{intel_hls}, and Bambu HLS~\cite{bambu_hls} promise designs with performance and energy
efficiency comparable to hand-written HDL designs~\cite{homsirikamol+14, silexicahlshdl, 7818341},
while offering the convenient abstractions and rich ecosystems of software development. But
existing HLS tools cannot always guarantee that the hardware designs they produce are equivalent to
the software they were given, and this undermines any reasoning conducted at the software level.

Indeed, there are reasons to doubt that HLS tools actually \emph{do} always preserve equivalence.
For instance, Vivado HLS has been shown to apply pipelining optimisations
incorrectly\footnote{\url{https://bit.ly/vivado-hls-pipeline-bug}} or to silently generate wrong
code should the programmer stray outside the fragment of C that it
supports.\footnote{\url{https://bit.ly/vivado-hls-pointer-bug}} Meanwhile,
\citet{lidbury15_many_core_compil_fuzzin} had to abandon their attempt to fuzz-test Altera's (now
Intel's) OpenCL compiler since it ``either crashed or emitted an internal compiler error'' on so
many of their test inputs. More recently,
\citet{herklotz21_empir_study_reliab_high_level_synth_tools} fuzz-tested three commercial HLS tools
using Csmith~\cite{yang11_findin_under_bugs_c_compil}, and despite restricting the generated
programs to the C fragment explicitly supported by all the tools, they still found that on average
2.5\% of test cases were compiled to designs that behaved incorrectly.

\paragraph{Existing workarounds}

Aware of the reliability shortcomings of HLS tools, hardware designers routinely check the
generated hardware for functional correctness. This is commonly done by simulating the generated
design against a large test-bench. But unless the test-bench covers all inputs exhaustively --
which is often infeasible -- there is a risk that bugs remain.

One alternative is to use \emph{translation validation}~\cite{pnueli98_trans} to prove equivalence
between the input program and the output design. Translation validation has been successfully
applied to several HLS
optimisations~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synth,chouksey20_verif_sched_condit_behav_high_level_synth,banerjee14_verif_code_motion_techn_using_value_propag,chouksey19_trans_valid_code_motion_trans_invol_loops}.
Nevertheless, translation validation is an expensive task, especially for large designs, and it
must be repeated every time the compiler is invoked. For example, the translation validation for
Catapult C~\cite{mentor20_catap_high_level_synth} may require several rounds of expert
`adjustments'~\cite[p.~3]{slec_whitepaper} to the input C program before validation succeeds. And
even when it succeeds, translation validation does not provide watertight guarantees unless the
validator itself has been mechanically proven
correct~\cite[e.g.][]{tristan08_formal_verif_trans_valid}, which has not been the case in HLS tools
to date.

Our position is that none of the above workarounds are necessary if the HLS tool can simply be
trusted to work correctly.

\paragraph{Our solution}

We have designed a new HLS tool in the Coq theorem prover and proved that any output design it
produces always has the same behaviour as its input program. Our tool, called Vericert, is
automatically extracted from Coq to an OCaml program, which ensures that the object of the proof is
the same as the implementation of the tool. Vericert is built by extending the CompCert verified C
compiler~\cite{leroy09_formal_verif_realis_compil} with a new hardware-specific intermediate
language and a Verilog back end. It supports most C constructs, including integer operations,
function calls (which are all inlined), local arrays, structs, unions, and general control-flow
statements, but currently excludes support for case statements, function pointers, recursive
function calls, non-32-bit integers, floats, and global variables.
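To give a concrete sense of this fragment, the listing below shows a small, purely illustrative C
program (a hypothetical example, not drawn from our benchmarks) that stays within the supported
subset: 32-bit integer arithmetic, a local array, and general control flow.

\begin{verbatim}
/* Hypothetical example of the supported C fragment: 32-bit integers,
 * a local array, and general control flow; no floats, recursion,
 * global variables, or function pointers. */
int main() {
    int buf[8];
    int acc = 0;
    int i;
    for (i = 0; i < 8; i++)
        buf[i] = i * 3;          /* fill a local array */
    for (i = 0; i < 8; i++) {
        if (buf[i] > 10)         /* general control flow */
            acc = acc + buf[i];
    }
    return acc;                  /* result of the computation */
}
\end{verbatim}

A program like this can be compiled directly to a Verilog design, whereas a variant that used
floats, a global array, or a recursive helper function would fall outside the subset currently
supported.
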
\paragraph{Contributions and Outline}
The contributions of this paper are as follows:

\begin{itemize}
  \item We present Vericert, the first mechanically verified HLS tool that compiles C to Verilog.
        In Section~\ref{sec:design}, we describe the design of Vericert, including certain
        optimisations related to memory accesses and division.
  \item We state the correctness theorem of Vericert with respect to an existing semantics for
        Verilog due to \citet{loow19_proof_trans_veril_devel_hol}. In Section~\ref{sec:verilog}, we
        describe how we extended this semantics to make it suitable as an HLS target. We also
        describe how the Verilog semantics is integrated into CompCert's language execution model
        and its framework for performing simulation proofs. A mapping of CompCert's infinite memory
        model onto a finite Verilog array is also described.
  \item In Section~\ref{sec:proof}, we describe how we proved the correctness theorem. The proof
        follows standard CompCert techniques -- forward simulations, intermediate specifications,
        and determinism results -- but we encountered several challenges peculiar to our
        hardware-oriented setting. These include handling discrepancies between the byte-addressed
        memory assumed by the input software and the word-addressed memory that we implement in the
        output hardware, differences between C and Verilog in the handling of unsigned comparisons,
        and carefully implementing memory reads and writes so that these behave properly as a RAM
        in hardware.
  \item In Section~\ref{sec:evaluation}, we evaluate Vericert on the PolyBench/C benchmark
        suite~\cite{polybench}, and compare the performance of our generated hardware against that
        generated by an existing, unverified HLS tool called LegUp~\cite{canis11_legup}. We show
        that Vericert generates hardware that is \slowdownOrig$\times$ slower (\slowdownDiv$\times$
        slower in the absence of division) and \areaIncr$\times$ larger than that generated by
        LegUp. This performance gap can be largely attributed to Vericert's current lack of support
        for instruction-level parallelism and the absence of an efficient, pipelined division
        operator. We intend to close this gap in the future by introducing (and verifying) HLS
        optimisations of our own, such as scheduling and memory analysis. This section also reports
        on our campaign to fuzz-test Vericert using over a hundred thousand random C programs
        generated by Csmith~\cite{yang11_findin_under_bugs_c_compil} in order to confirm that its
        correctness theorem is watertight.
\end{itemize}

\paragraph{Companion material}
Vericert is fully open source and available on GitHub at
\url{https://github.com/ymherklotz/vericert}. A snapshot of the Vericert development is also
available in a Zenodo repository~\cite{yann_herklotz_2021_5093839}.