summaryrefslogtreecommitdiffstats
path: root/chapters/scheduling.tex
diff options
context:
space:
mode:
Diffstat (limited to 'chapters/scheduling.tex')
-rw-r--r--chapters/scheduling.tex34
1 files changed, 20 insertions, 14 deletions
diff --git a/chapters/scheduling.tex b/chapters/scheduling.tex
index 9412ea4..4b897ef 100644
--- a/chapters/scheduling.tex
+++ b/chapters/scheduling.tex
@@ -10,7 +10,7 @@
\oindex{static scheduling}The use of multicore processors has been increasing drastically, thereby
making parallelising compilers ever more important. In addition to that, the need for custom
hardware accelerators is also increasing, as traditional processors are not always the best choice
-for all applications. Compilers that support optimisations which can automatically parallelise it's
+for all applications. Compilers that support optimisations which can automatically parallelise its
input programs are therefore also becoming more important, as these back ends all benefit from
it. For processors, one wants to optimise the pipeline usage as much as possible, whereas in
hardware one can create custom pipelines, as well as making more use out of the spacial properties
@@ -31,7 +31,7 @@ can never move instructions past a branching instruction, even when this might d
the performance. Trace scheduling~\cite[fisher81_trace_sched] is an alternative that addresses this
issue, by creating paths that might cross branching instructions, and scheduling the instructions in
that path. However, the main problem with trace scheduling is that it is often infeasible on large
-programs, and because of the large design space, it's sometimes hard to find an optimal
+programs, and because of the large design space, it is sometimes hard to find an optimal
schedule. \index{superblock}Superblock~\cite[hwu93_super] and
\index{hyperblock}hyperblock~\cite[mahlke92_effec_compil_suppor_predic_execut_using_hyper]
scheduling are two subsets of trace scheduling, which compromise on the flexibility of trace
@@ -254,23 +254,29 @@ hyperblocks. It has a few limitations on the kind of conditional statements that
First, only conditional statements without loops can be translated, therefore, one must identify
cycles in the control-flow before performing the if-conversion. Secondly, if-conversion will not
always result in more efficient code, so it should not be applied to any conditional
-statements. Instead, it is best applied to conditional statements where each branch will take a
-similar amount of time to execute.
+statements.
From an implementation point of view, if-conversion can be implemented naïvely, however, it could be
improved by adding better heuristics for which conditional statements to transform into predicated
instructions. For hyperblocks, conditionals where each branch approximately execute in the same
number of cycles are an ideal candidate for the transformation, as it means that no time is lost due
-to having to execute extra predicated instructions that are false. However, in addition to that if
-one has an estimate on the probability of picking a branch, then one can introduce more interesting
-heuristics such as also transforming conditionals who have disproportionate branches, but where one
-is more likely to execute the longest branch every time.
+to having to execute extra predicated instructions that are false. However, as hyperblocks are an
+extension of superblocks, single paths through the control-flow graph can also be represented. In
+our implementation, we use simple static heuristics to pick these paths~\cite[??].
-For back ends that do not support predicates, it is also important to eliminate the predicates again
-using a process called reverse if-conversion, which creates branches out of the predicates and
-groups these together again. This allows for hyperblock scheduling to also be used with back ends
-that do not support predicates, which are most the default back ends included in CompCert. However,
-this is currently out of scope for the implementation.
+\subsection[proof-if-conversion]{Correctness of If-Conversion}
+
+The correctness of the if-conversion pass does not rely on which paths are chosen to be combined.
+The proof therefore does not have to cover the static heuristics that are generated, because these
+only affect the performance. Even though the translation is quite simple, the reasoning about it's
+correctness does have its difficulties. We have a similar problem as before, where we sometimes
+need to execute multiple steps in the input program to reach one step in the output program. The
+solution is to perform a similar style proof as for the basic block generation.
+
+The main difference is the matching condition, as the hyperblock is not reduced instruction by
+instruction anymore, but section by section, depending on if a section was if-converted or not.
+This means that at each point, one is either at the head of a section or in the middle of an
+if-converted section.
\section[sec:scheduling]{Scheduling Implementation}
@@ -681,7 +687,7 @@ grained comparison are the following:
predicates are combined using the or operation.
\item
Iterate through the expressions in both maps and compare the predicates using the SAT solver. If
- an expression is not present in one of the maps, then it's predicate should be equivalent to
+ an expression is not present in one of the maps, then its predicate should be equivalent to
$\perp$.
\stopitemize