summaryrefslogtreecommitdiffstats
path: root/chapters
diff options
context:
space:
mode:
Diffstat (limited to 'chapters')
-rw-r--r--chapters/background.tex2
-rw-r--r--chapters/hls.tex46
-rw-r--r--chapters/introduction.tex35
3 files changed, 47 insertions, 36 deletions
diff --git a/chapters/background.tex b/chapters/background.tex
index 81ca653..6ca5563 100644
--- a/chapters/background.tex
+++ b/chapters/background.tex
@@ -13,7 +13,7 @@
% Catapult C~\cite{mentor20_catap_high_level_synth} & \cmark{} & \cmark{} & \cmark{} & \xmark{} & \cmark{}\\
% Chapman \textit{et al.}~\cite{chapman92_verif_bedroc} & \cmark{} & \xmark{} & \cmark{} & \cmark{} & \cmark{}\\
% Clarke \textit{et al.}~\cite{clarke03_behav_c_veril} & \cmark{} & \cmark{} & \cmark{} & \xmark{} & \cmark{}\\
-% Ellis~\cite{ellis08} & \xmark{} & \xmark{} & \cmark{} & \cmark{} & \cmark{}\\
+% Ellis~\cite{ellis08_correc} & \xmark{} & \xmark{} & \cmark{} & \cmark{} & \cmark{}\\
% Karfa \textit{et al.}~\cite{karfa06_formal_verif_method_sched_high_synth} & \cmark{} & \xmark{} & \cmark{} & \xmark{} & \cmark{}\\
% Kundu \textit{et al.}~\cite{kundu08_valid_high_level_synth} & \cmark{} & \xmark{} & \cmark{} & \xmark{} & \cmark\\
% LegUp~\cite{canis13_legup} & \cmark{} & \cmark{} & \xmark{} & \xmark{} & \cmark{}\\
diff --git a/chapters/hls.tex b/chapters/hls.tex
index 39da3b3..18227e6 100644
--- a/chapters/hls.tex
+++ b/chapters/hls.tex
@@ -9,6 +9,10 @@
\chapter[sec:hls]{High-Level Synthesis}
+This chapter outlines the base implementation of a formally verified \HLS\ tool called
+Vericert~\cite{herklotz21_veric}. This chapter is based on a paper describing the initial
+implementation in detail~\citef{herklotz21_formal_verif_high_level_synth}.
+
We have designed a new HLS tool in the Coq theorem prover and proved that any output design it
produces always has the same behaviour as its input program. Our tool, called Vericert, is
automatically extracted to an OCaml program from Coq, which ensures that the object of the proof is
@@ -39,12 +43,12 @@ function calls, non-32-bit integers, floats, and global variables.
handling of unsigned comparisons between C and Verilog, and carefully implementing memory reads
and writes so that these behave properly as a RAM in hardware.
\item In \in{Section}[sec:hls:evaluation], we evaluate Vericert on the PolyBench/C benchmark
- suite~\cite[polybench], and compare the performance of our generated hardware against an existing,
- unverified HLS tool called LegUp~\cite[canis11_legup]. We show that Vericert generates hardware
- that is \slowdownOrig$\times$ slower (\slowdownDiv$\times$ slower in the absence of division) and
- \areaIncr$\times$ larger than that generated by LegUp. This performance gap can be largely
- attributed to Vericert's current lack of support for instruction-level parallelism and the absence
- of an efficient, pipelined division operator. We intend to close this gap in the future by
+ suite~\cite[pouchet13_polyh], and compare the performance of our generated hardware against an
+ existing, unverified HLS tool called LegUp~\cite[canis11_legup]. We show that Vericert generates
+ hardware that is \slowdownOrig$\times$ slower (\slowdownDiv$\times$ slower in the absence of
+ division) and \areaIncr$\times$ larger than that generated by LegUp. This performance gap can be
+ largely attributed to Vericert's current lack of support for instruction-level parallelism and the
+ absence of an efficient, pipelined division operator. We intend to close this gap in the future by
introducing (and verifying) HLS optimisations of our own, such as scheduling and memory analysis.
This section also reports on our campaign to fuzz-test Vericert using over a hundred thousand
random C programs generated by Csmith~\cite{yang11_findin_under_bugs_c_compil} in order to confirm
@@ -60,7 +64,7 @@ function calls, non-32-bit integers, floats, and global variables.
\blank[big]
A snapshot of the Vericert development is also available in a Zenodo
-repository~\cite{yann_herklotz_2021_5093839}.
+repository~\citef{herklotz21_veric}.
\section[sec:hls:design]{Designing a Verified HLS Tool}
@@ -72,15 +76,15 @@ into hardware, expressed in the Verilog language.
\paragraph{Choice of source language.} C was chosen as the source language as it remains the most
common source language amongst production-quality HLS tools~\cite[canis11_legup,
- xilinx20_vivad_high_synth, intel_hls, bambu_hls]. This, in turn, may be because it is
-\quotation{[t]he starting point for the vast majority of algorithms to be implemented in
- hardware}~\cite{5522874}, lending a degree of practicality. The availability of
+ xilinx20_vivad_high_synth, intel20_high_synth_compil, pilato13_bambu]. This, in turn, may be
+because it is \quotation{[t]he starting point for the vast majority of algorithms to be implemented
+ in hardware}~\cite{gajski10_what_hls}, lending a degree of practicality. The availability of
CompCert~\cite{leroy09_formal_verif_realis_compil} also provides a solid basis for formally verified
C compilation. We considered Bluespec~\cite{nikhil04_blues_system_veril}, but decided that although
-it \quotation{can be classed as a high-level language}~\cite{greaves_note}, it is too hardware-oriented to
-be suitable for traditional HLS. We also considered using a language with built-in parallel
-constructs that map well to parallel hardware, such as occam~\cite{page91_compil_occam},
-Spatial~\cite{spatial} or Scala~\cite{chisel}.
+it \quotation{can be classed as a high-level language}~\cite{greaves19_resear_note}, it is too
+hardware-oriented to be suitable for traditional HLS. We also considered using a language with
+built-in parallel constructs that map well to parallel hardware, such as
+occam~\cite{page91_compil_occam}, Spatial~\cite{spatial} or Scala~\cite{bachrach12_chisel}.
\paragraph{Choice of target language.} Verilog~\cite{06_ieee_stand_veril_hardw_descr_languag} is an
HDL that can be synthesised into logic cells which can either be placed onto a field-programmable
@@ -102,9 +106,9 @@ validated C parser~\cite{jourdan12_valid_lr_parser}. The Vellvm
framework~\cite{zhao12_formal_llvm_inter_repres_verif_progr_trans} was also considered because
several existing HLS tools are already LLVM-based, but additional work would be required to support
a high-level language like C as input. The .NET framework has been used as a basis for other HLS
-tools, such as Kiwi~\cite{kiwi}, and LLHD~\cite{schuiki20_llhd} has been recently proposed as an
-intermediate language for hardware design, but neither are suitable for us because they lack formal
-semantics.
+tools, such as Kiwi~\cite{greaves08_kiwi}, and LLHD~\cite{schuiki20_llhd} has been recently proposed
+as an intermediate language for hardware design, but neither are suitable for us because they lack
+formal semantics.
\startplacefigure[title={Vericert as a Verilog back end to CompCert.}]
\hbox{\starttikzpicture [language/.style={fill=white,rounded corners=3pt,minimum height=7mm},
@@ -164,10 +168,10 @@ We select CompCert's three-address code (3AC)\footnote{This is known as register
tool.} as the starting point. Branching off \emph{before} this point (at CminorSel or earlier)
denies CompCert the opportunity to perform optimisations such as constant propagation and dead-code
elimination, which, despite being designed for software compilers, have been found useful in HLS
-tools as well~\cite{cong+11}. And if we branch off \emph{after} this point (at LTL or later) then
-CompCert has already performed register allocation to reduce the number of registers and spill some
-variables to the stack; this transformation is not required in HLS because there are many more
-registers available, and these should be used instead of RAM whenever possible.
+tools as well~\cite{cong11_high_level_synth_fpgas}. And if we branch off \emph{after} this point (at
+LTL or later) then CompCert has already performed register allocation to reduce the number of
+registers and spill some variables to the stack; this transformation is not required in HLS because
+there are many more registers available, and these should be used instead of RAM whenever possible.
3AC is also attractive because it is the closest intermediate language to LLVM IR, which is used by
several existing HLS compilers. It has an unlimited number of pseudo-registers, and is represented
diff --git a/chapters/introduction.tex b/chapters/introduction.tex
index d96ff55..8588d3f 100644
--- a/chapters/introduction.tex
+++ b/chapters/introduction.tex
@@ -3,21 +3,28 @@
\chapter{Introduction}
\startthesis
-An optimising high-level synthesis tool can be proven correct using an interactive theorem prover
-while also remaining practical.
+ An optimising \infull{HLS} tool can be proven correct using an interactive theorem prover while
+ also remaining practical.
\stopthesis
\noindent As latency, throughput, and energy efficiency become increasingly important, custom
hardware accelerators are being designed for numerous applications. Alas, designing these
accelerators can be a tedious and error-prone process using a hardware description language (HDL)
-such as Verilog. An attractive alternative is \emph{high-level synthesis} (HLS), in which hardware
-designs are automatically compiled from software written in a high-level language like C. Modern
-HLS tools such as LegUp~\cite{canis11_legup}, Vivado HLS~\cite{xilinx20_vivad_high_synth}, Intel
-i++~\cite{intel_hls}, and Bambu HLS~\cite{bambu_hls} promise designs with comparable performance and
-energy-efficiency to those hand-written in an HDL~\cite{homsirikamol+14, silexicahlshdl, 7818341},
-while offering the convenient abstractions and rich ecosystems of software development. But
-existing HLS tools cannot always guarantee that the hardware designs they produce are equivalent to
-the software they were given, and this undermines any reasoning conducted at the software level.
+such as Verilog. An attractive alternative is \emph{\infull{HLS}} (\HLS), in which hardware designs
+are automatically compiled from software written in a high-level language like C. Modern \HLS\
+tools such as LegUp~\cite{canis11_legup}, Vivado HLS~\cite{xilinx20_vivad_high_synth}, Intel
+i++~\cite{intel20_high_synth_compil}, and Bambu HLS~\cite{pilato13_bambu} promise designs with
+comparable performance and energy-efficiency to those hand-written in an
+HDL~\cite{homsirikamol14_can, gauthier20_high_level_synth, pelcat16_desig_hdl}, while offering the
+convenient abstractions and rich ecosystems of software development.
+
+There are two main use-cases of \HLS. Firstly, it is a very convenient tool to use for prototyping
+designs, as this means that the software model can be reused for the initial design of the hardware.
+Secondly, \HLS\ is also often used design hardware that has quite regular control-flow and is
+compute intensive, one example being DSP hardware design.
+
+But existing HLS tools cannot always guarantee that the hardware designs they produce are equivalent
+to the software they were given, and this undermines any reasoning conducted at the software level.
Indeed, there are reasons to doubt that HLS tools actually \emph{do} always preserve equivalence.
For instance, Vivado HLS has been shown to apply pipelining optimisations
@@ -27,7 +34,7 @@ supports.\footnote{\goto{bit.ly/vivado-hls-pointer-bug}[url(https://bit.ly/vivad
Meanwhile, \cite{lidbury15_many_core_compil_fuzzin} had to abandon their attempt to fuzz-test
Altera's (now Intel's) OpenCL compiler since it \quotation{either crashed or emitted an internal
compiler error} on so many of their test inputs. More recently,
-\cite{herklotz21_empir_study_reliab_high_level_synth_tools} fuzz-tested three commercial HLS tools
+\citef{herklotz21_empir_study_reliab_high_level_synth_tools} fuzz-tested three commercial HLS tools
using Csmith~\cite{yang11_findin_under_bugs_c_compil}, and despite restricting the generated
programs to the C fragment explicitly supported by all the tools, they still found that on average
2.5\% of test-cases were compiled to designs that behaved incorrectly.
@@ -44,9 +51,9 @@ optimisations~\cite{kim04_autom_fsmd,karfa06_formal_verif_method_sched_high_synt
Nevertheless, it is an expensive task, especially for large designs, and it must be repeated every
time the compiler is invoked. For example, the translation validation for Catapult
C~\cite{mentor20_catap_high_level_synth} may require several rounds of expert
-`adjustments'~\cite[alternative=authoryear,right={, p.~3)}][slec_whitepaper] to the input C program
-before validation succeeds. And even when it succeeds, translation validation does not provide
-watertight guarantees unless the validator itself has been mechanically proven
+`adjustments'~\cite[alternative=authoryear,right={, p.~3)}][chauhan20_formal_ensur_equiv_c_rtl] to
+the input C program before validation succeeds. And even when it succeeds, translation validation
+does not provide watertight guarantees unless the validator itself has been mechanically proven
correct~\cite[tristan08_formal_verif_trans_valid], which has not been the case in HLS tools to date.
Our position is that none of the above workarounds are necessary if the HLS tool can simply be