Move signedness discussion into limitations

author: Yann Herklotz <git@yannherklotz.com> 2021-07-27 11:24:13 +0200
committer: Yann Herklotz <git@yannherklotz.com> 2021-07-27 11:24:13 +0200
commit: 019d76c997486c0a45ab2f0dd6b86b59fbd84285 (patch)
tree: caba9885fc07076cbaba8b82282d9bb359ab003d /algorithm.tex
parent: befb0f0edd9f0af617464610e08bc16f8f0ebbf2 (diff)
download: oopsla21_fvhls-019d76c997486c0a45ab2f0dd6b86b59fbd84285.tar.gz
oopsla21_fvhls-019d76c997486c0a45ab2f0dd6b86b59fbd84285.zip
1 files changed, 11 insertions, 7 deletions
diff --git a/algorithm.tex b/algorithm.tex
index 676df46..5fe6016 100644
--- a/algorithm.tex
+++ b/algorithm.tex
@@ -302,12 +302,6 @@ A high-level overview of the architecture can be seen in Figure~\ref{fig:accumul
 Each 3AC instruction either corresponds to a hardware construct, or does not have to be handled by the translation, such as function calls (because of inlining).
 For example, line 2 in Figure~\ref{fig:accumulator_rtl} shows a 32-bit register \texttt{x5} being initialised to 3, after which the control flow moves execution to line 3. This initialisation is also encoded in HTL at state 8 in both the control logic and data-path. always-blocks, shown in Figure~\ref{fig:accumulator_v}.  Simple operator instructions are translated in a similar way.
 
-C and Verilog handle signedness quite differently. By default, all operators and registers in Verilog (and HTL) are unsigned, so to force an operation to handle the bits as signed, both operators have to be forced to be signed. Moreover, Verilog implicitly resizes expressions to the largest needed size by default, which can affect the result of the computation.  This feature is not supported by the Verilog semantics we adopted, so to match the semantics to the behaviour of the simulator and synthesis tool, braces are placed around all expressions to inhibit implicit resizing.  Instead, explicit resizing is used in the semantics, and operations can only be performed on two registers that have the same size.
-
-In addition to that, equality checks between \emph{unsigned} variables is actually not supported, because this requires supporting the comparison of pointers, which should only be performed between pointers with the same provenance.  In \vericert{} there is currently no way to determine the provenance of a pointer, and it therefore cannot model the semantics of unsigned comparison in \compcert{}. This is not a severe restriction in practice, however, because, as dynamic allocation is not supported either, equality comparison of pointers is rarely needed, and equality comparison of integers can still be performed by casting them both to signed integers.
-
-Finally, the \texttt{mulhs} and \texttt{mulhu} instructions, which fetch the upper bits of a 32-bit multiplication, are not translated by \vericert{} either, because 64-bit numbers are not supported. These instructions are only generated to optimise divisions by a constant that is not a power of two, so turning off constant propagation will allow these programs to pass without error.
-
 \subsubsection{Translating HTL to Verilog}
 
 Finally, we have to translate the HTL code into proper Verilog. % and prove that it behaves the same as the 3AC according to the Verilog semantics.
@@ -342,7 +336,7 @@ Therefore, an extra compiler pass is added from HTL to HTL to extract all the di
 
 There are two interesting parts to the inserted RAM interface.  Firstly, the memory updates are triggered on the negative edge of the clock, out of phase with the rest of the design which is triggered on the positive edge of the clock.  The main advantage is that instead of loads and stores taking three clock cycles and two clock cycles respectively, they only take two clock cycles and one clock cycle instead, greatly improving their performance.  In addition to that, using the negative edge for the clock is supported by many synthesis tools, and does not affect the maximum frequency of the final design.
 
-Secondly, the logic in the enable signal of the RAM (\texttt{en != u\_en}) is also atypical.  To make the proof simpler, the goal is to create a RAM which disables itself after every use, so that firstly, the proof can assume that the RAM is disabled at the start and end of every clock cycle, and secondly so that only the state which contains the load and store need to be modified to ensure this property.  With a simple enable signal, it would not be possible to disable it in the RAM itself, because this would result in a register being driven twice from two different locations, once from the RAM interface and once from the data-path.  The only other solutions are to either insert extra states that disable the RAM accordingly, thereby eliminating the speed advantage of the negative edge triggered RAM, or to check the next state after a load and store and insert disables into that state.  The latter solution can quickly become complicated though, especially as this new state could contain another memory operation, in which case the disable signal should not be added to that state.  A solution to this problem is to create another enable signal that is controlled by the self-disabling RAM, which is always set to be equal to the enable signal set by the data-path.  The RAM is then considered enabled if the data-path enable and the RAM enable are different.
+Secondly, the logic in the enable signal of the RAM (\texttt{en != u\_en}) is also atypical.  In existing hardware designs, enables would be manually controlled and turned on and off explicitly in the data-path by the hardware designer whenever it is necessary, by using a check like the following in the RAM:\@ \texttt{en == 1}.  This means that the RAM only turns on when the enable signal is set.  However, to make the proof simpler as many side effects need to be eliminated from the hardware as possible, which can be achieved by creating a RAM which disables itself after every use, by using a toggle instead.  This ensures that the RAM will always be disabled directly after it was used without having to add extra signals in later states, as it is not a high or low value that triggers the RAM block, but the fact that the user enable signal \texttt{u\_en} changed in value. This simplifies the proof, because firstly it can assume that the RAM is disabled at the start and end of every clock cycle, and secondly only the state which contains the load and store statements need to be modified to ensure this property.  The only other solutions would be to either insert extra states that disable the RAM accordingly, thereby eliminating the speed advantage of the negative edge triggered RAM, or to check the next state after a load and store and insert disables into that state.  The latter solution can quickly become complicated though, especially as this new state could contain another memory operation, in which case the disable signal should not be added to that state.  A solution to this problem is to create another enable signal that is controlled by the self-disabling RAM, which is always set to be equal to the enable signal set by the data-path.  The RAM is then considered enabled if the data-path enable and the RAM enable are different.
 
 %We can instead generate a second enable signal that is set by the user, and the original enable signal is then updated by the RAM to be equal to the value that the user set.  This means that the RAM should be enabled whenever the two signals are different, and disabled otherwise.
 
@@ -431,6 +425,16 @@ In a similar vein, the introduction of pipelined operators, especially for divis
 
 \vericert{} is currently limited in terms of I/O as well, because the main function cannot accept any arguments for the Clight program to be well-formed.  In addition to that, external function calls that produce traces have also not been implemented yet, however, these could enable the C program to read and write values to a bus that is shared by various other components in the hardware design.
 
+\paragraph{Language restrictions}
+
+There are minor language restrictions: comparisons of unsigned integers as well as minor instructions are not supported.
+
+C and Verilog handle signedness quite differently. By default, all operators and registers in Verilog (and HTL) are unsigned, so to force an operation to handle the bits as signed, both operators have to be forced to be signed. Moreover, Verilog implicitly resizes expressions to the largest needed size by default, which can affect the result of the computation.  This feature is not supported by the Verilog semantics we adopted, so to match the semantics to the behaviour of the simulator and synthesis tool, braces are placed around all expressions to inhibit implicit resizing.  Instead, explicit resizing is used in the semantics, and operations can only be performed on two registers that have the same size.
+
+In addition to that, equality checks between \emph{unsigned} variables are actually not supported, because this requires supporting the comparison of pointers, which should only be performed between pointers with the same provenance.  In \vericert{} there is currently no way to determine the provenance of a pointer, and it therefore cannot model the semantics of unsigned comparison in \compcert{}. This is not a severe restriction in practice, however, because, as dynamic allocation is not supported either, equality comparison of pointers is rarely needed, and equality comparison of integers can still be performed by casting them both to signed integers.
+
+Finally, the \texttt{mulhs} and \texttt{mulhu} instructions, which fetch the upper bits of a 32-bit multiplication, are not translated by \vericert{} either, because 64-bit numbers are not supported. These instructions are only generated to optimise divisions by a constant that is not a power of two, so turning off constant propagation will allow these programs to pass without error.
+
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "main"
author	Yann Herklotz <git@yannherklotz.com>	2021-07-27 11:24:13 +0200
committer	Yann Herklotz <git@yannherklotz.com>	2021-07-27 11:24:13 +0200
commit	019d76c997486c0a45ab2f0dd6b86b59fbd84285 (patch)
tree	caba9885fc07076cbaba8b82282d9bb359ab003d /algorithm.tex
parent	befb0f0edd9f0af617464610e08bc16f8f0ebbf2 (diff)
download	oopsla21_fvhls-019d76c997486c0a45ab2f0dd6b86b59fbd84285.tar.gz oopsla21_fvhls-019d76c997486c0a45ab2f0dd6b86b59fbd84285.zip