summaryrefslogtreecommitdiffstats
path: root/evaluation.tex
diff options
context:
space:
mode:
authorJohn Wickerson <j.wickerson@imperial.ac.uk>2020-11-20 22:32:06 +0000
committeroverleaf <overleaf@localhost>2020-11-20 22:32:09 +0000
commit221579087f9442b9f66ff66c381433464f3fab35 (patch)
tree3760b0e9467639ee621506d41b1a78552b3575e1 /evaluation.tex
parent94c69b06fc80d399a7084745b18b01a067c69a1f (diff)
downloadoopsla21_fvhls-221579087f9442b9f66ff66c381433464f3fab35.tar.gz
oopsla21_fvhls-221579087f9442b9f66ff66c381433464f3fab35.zip
Update on Overleaf.
Diffstat (limited to 'evaluation.tex')
-rw-r--r--evaluation.tex4
1 files changed, 3 insertions, 1 deletions
diff --git a/evaluation.tex b/evaluation.tex
index 4d4eb32..056aa18 100644
--- a/evaluation.tex
+++ b/evaluation.tex
@@ -101,7 +101,9 @@ It is notable that even without \vericert{} performing many optimisations, a few
%We are very encouraged by these data points.
As we improve \vericert{} by incorporating further optimisations, this gap should reduce whilst preserving our correctness guarantees.
-Cycle count is one factor in calculating execution latency. The other factor is the clock frequency, which tells us how fast a cycle executes. Figure~\ref{fig:comparison_time} compares the execution times of \vericert{} and \legup{}. On average, \vericert{} designs are about $10\times$ slower than \legup{} designs. As mentioned earlier, we modify the Polybench programs to utilise C-based divides and modulos. We had to avoid using the built-in Verilog divides and modulos, since Quartus interprets them as single-cycle operations. This interpretation affects clock frequency drastically. On average, when using the built-in Verilog approach, \vericert{}'s clock frequency was 21MHz, compared to \legup{}'s clock frequency of 247MHz. By moving to the C-based approach, our average clock frequency is now 112MHz. Hence, we reduce the frequency gap from approximately $12\times$ to $2\times$. This gap still exists because \legup{} uses various optimisation tactics and Intel-specific IPs, which requires further engineering effort and testing from our side.
+Cycle count is one factor in calculating execution times; the other is the clock frequency, which determines the duration of each of these cycles. Figure~\ref{fig:comparison_time} compares the execution times of \vericert{} and \legup{}. Across the original Polybench/C benchmarks, we see that \vericert{} designs are about $60\times$ slower than \legup{} designs. This more dramatic
+
+As mentioned earlier, we modify the Polybench programs to utilise C-based divides and modulos. We had to avoid using the built-in Verilog divides and modulos, since Quartus interprets them as single-cycle operations. This interpretation affects clock frequency drastically. On average, when using the built-in Verilog approach, \vericert{}'s clock frequency was 21MHz, compared to \legup{}'s clock frequency of 247MHz. By moving to the C-based approach, our average clock frequency is now 112MHz. Hence, we reduce the frequency gap from approximately $12\times$ to $2\times$. This gap still exists because \legup{} uses various optimisation tactics and Intel-specific IPs, which requires further engineering effort and testing from our side.
% The difference in cycle counts shows the degree of parallelism that \legup{}'s scheduling and memory system can offer. However, their Verilog generation is not guaranteed to be correct. Although the runtime LegUp outputs are tested to be correct for these programs, this does not provide any confidence on the correctness of Verilog generation of any other programs. Our Coq proof mechanisation guarantees that generated Verilog is correct for any input program that uses the subset of \compcert{} instructions that we have proven to be correct.