From 433f6cf9cb6b6acfaf7094bd8727a87718a20830 Mon Sep 17 00:00:00 2001
From: Yann Herklotz
Date: Wed, 6 Apr 2022 20:48:13 +0100
Subject: Fix up the fonts

---
 hls.tex | 107 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/hls.tex b/hls.tex
index 4ea51d8..4feaa14 100644
--- a/hls.tex
+++ b/hls.tex
@@ -1 +1,108 @@
\chapter{Formal Verification of High-Level Synthesis}

\newcommand{\slowdownOrig}{27}
\newcommand{\slowdownDiv}{2}
\newcommand{\areaIncr}{1.1}

\paragraph{Can you trust your high-level synthesis tool?}

As latency, throughput, and energy efficiency become increasingly important, custom hardware
accelerators are being designed for numerous applications. Alas, designing these accelerators can
be a tedious and error-prone process using a hardware description language (HDL) such as Verilog.
An attractive alternative is \emph{high-level synthesis} (HLS), in which hardware designs are
automatically compiled from software written in a high-level language like C. Modern HLS tools such
as LegUp~\cite{canis11_legup}, Vivado HLS~\cite{xilinx20_vivad_high_synth}, Intel
i++~\cite{intel_hls}, and Bambu HLS~\cite{bambu_hls} promise designs with performance and energy
efficiency comparable to hand-written HDL designs~\cite{homsirikamol+14, silexicahlshdl, 7818341},
while offering the convenient abstractions and rich ecosystems of software development. But
existing HLS tools cannot always guarantee that the hardware designs they produce are equivalent to
the software they were given, and this undermines any reasoning conducted at the software level.

Indeed, there are reasons to doubt that HLS tools actually \emph{do} always preserve equivalence.
For instance, Vivado HLS has been shown to apply pipelining optimisations
incorrectly\footnote{\url{https://bit.ly/vivado-hls-pipeline-bug}} or to silently generate wrong
code should the programmer stray outside the fragment of C that it
supports.\footnote{\url{https://bit.ly/vivado-hls-pointer-bug}} Meanwhile,
\citet{lidbury15_many_core_compil_fuzzin} had to abandon their attempt to fuzz-test Altera's (now
Intel's) OpenCL compiler since it ``either crashed or emitted an internal compiler error'' on so
many of their test inputs. More recently,
\citet{herklotz21_empir_study_reliab_high_level_synth_tools} fuzz-tested three commercial HLS tools
using Csmith~\cite{yang11_findin_under_bugs_c_compil}, and despite restricting the generated
programs to the C fragment explicitly supported by all the tools, they still found that on average
2.5\% of test cases were compiled to designs that behaved incorrectly.

\paragraph{Existing workarounds}

Aware of the reliability shortcomings of HLS tools, hardware designers routinely check the
generated hardware for functional correctness. This is commonly done by simulating the generated
design against a large test-bench. But unless the test-bench covers all inputs exhaustively --
which is often infeasible -- there is a risk that bugs remain.

One alternative is to use \emph{translation validation}~\cite{pnueli98_trans} to prove equivalence
between the input program and the output design. Translation validation has been successfully
applied to several HLS
optimisations~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synth,chouksey20_verif_sched_condit_behav_high_level_synth,banerjee14_verif_code_motion_techn_using_value_propag,chouksey19_trans_valid_code_motion_trans_invol_loops}.
Nevertheless, translation validation is an expensive task, especially for large designs, and it
must be repeated every time the compiler is invoked. For example, the translation validation for
Catapult C~\cite{mentor20_catap_high_level_synth} may require several rounds of expert
`adjustments'~\cite[p.~3]{slec_whitepaper} to the input C program before validation succeeds. And
even when it succeeds, translation validation does not provide watertight guarantees unless the
validator itself has been mechanically proven
correct~\cite[e.g.][]{tristan08_formal_verif_trans_valid}, which has not been the case in HLS tools
to date.

Our position is that none of the above workarounds are necessary if the HLS tool can simply be
trusted to work correctly.

\paragraph{Our solution}

We have designed a new HLS tool in the Coq theorem prover and proved that any output design it
produces always has the same behaviour as its input program. Our tool, called Vericert, is
automatically extracted from Coq to an OCaml program, which ensures that the object of the proof is
the same as the implementation of the tool. Vericert is built by extending the CompCert verified C
compiler~\cite{leroy09_formal_verif_realis_compil} with a new hardware-specific intermediate
language and a Verilog back end. It supports most C constructs, including integer operations,
function calls (which are all inlined), local arrays, structs, unions, and general control-flow
statements, but currently excludes support for case statements, function pointers, recursive
function calls, non-32-bit integers, floats, and global variables.
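To give a concrete sense of this fragment, the listing below shows a small, purely illustrative C
program (a hypothetical example, not drawn from our benchmarks) that stays within the supported
subset: 32-bit integer arithmetic, a local array, and general control flow.

\begin{verbatim}
/* Hypothetical example of the supported C fragment: 32-bit integers,
 * a local array, and general control flow; no floats, recursion,
 * global variables, or function pointers. */
int main() {
    int buf[8];
    int acc = 0;
    int i;
    for (i = 0; i < 8; i++)
        buf[i] = i * 3;          /* fill a local array */
    for (i = 0; i < 8; i++) {
        if (buf[i] > 10)         /* general control flow */
            acc = acc + buf[i];
    }
    return acc;                  /* result of the computation */
}
\end{verbatim}

A program like this can be compiled directly to a Verilog design, whereas a variant that used
floats, a global array, or a recursive helper function would fall outside the subset currently
supported.
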
\paragraph{Contributions and Outline}
The contributions of this paper are as follows:

\begin{itemize}
  \item We present Vericert, the first mechanically verified HLS tool that compiles C to Verilog.
        In Section~\ref{sec:design}, we describe the design of Vericert, including certain
        optimisations related to memory accesses and division.
  \item We state the correctness theorem of Vericert with respect to an existing semantics for
        Verilog due to \citet{loow19_proof_trans_veril_devel_hol}. In Section~\ref{sec:verilog}, we
        describe how we extended this semantics to make it suitable as an HLS target. We also
        describe how the Verilog semantics is integrated into CompCert's language execution model
        and its framework for performing simulation proofs. A mapping of CompCert's infinite memory
        model onto a finite Verilog array is also described.
  \item In Section~\ref{sec:proof}, we describe how we proved the correctness theorem. The proof
        follows standard CompCert techniques -- forward simulations, intermediate specifications,
        and determinism results -- but we encountered several challenges peculiar to our
        hardware-oriented setting. These include handling discrepancies between the byte-addressed
        memory assumed by the input software and the word-addressed memory that we implement in the
        output hardware, differences between C and Verilog in the handling of unsigned comparisons,
        and carefully implementing memory reads and writes so that these behave properly as a RAM
        in hardware.
  \item In Section~\ref{sec:evaluation}, we evaluate Vericert on the PolyBench/C benchmark
        suite~\cite{polybench}, and compare the performance of our generated hardware against that
        generated by an existing, unverified HLS tool called LegUp~\cite{canis11_legup}. We show
        that Vericert generates hardware that is \slowdownOrig$\times$ slower (\slowdownDiv$\times$
        slower in the absence of division) and \areaIncr$\times$ larger than that generated by
        LegUp. This performance gap can be largely attributed to Vericert's current lack of support
        for instruction-level parallelism and the absence of an efficient, pipelined division
        operator. We intend to close this gap in the future by introducing (and verifying) HLS
        optimisations of our own, such as scheduling and memory analysis. This section also reports
        on our campaign to fuzz-test Vericert using over a hundred thousand random C programs
        generated by Csmith~\cite{yang11_findin_under_bugs_c_compil} in order to confirm that its
        correctness theorem is watertight.
\end{itemize}

\paragraph{Companion material}
Vericert is fully open source and available on GitHub at
\url{https://github.com/ymherklotz/vericert}. A snapshot of the Vericert development is also
available in a Zenodo repository~\cite{yann_herklotz_2021_5093839}.