aboutsummaryrefslogtreecommitdiffstats
path: root/scheduling/BTLroadmap.md
blob: 27e74de553d9034de1c52b8990123061a8ca4747 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
# BTL Development Roadmap

BTL aims to be an IR dedicated to defensive certification of middle-end optimizations (before register allocation).
It provides a CFG of "loop-free" blocks, where each block is run in one step emitting at most a single observational event.
The "local" optimizations (i.e. preserving "locally" the semantics of such blocks) would be checked by symbolic execution with rewriting rules.
The only info required from oracles: a "dupmap" mapping block entries.
In contrast, the "global" optimizations would require some invariants annotations at block entry (provided by oracles).

Examples of optimizations that we aim to support:

 - instruction scheduling
 - instruction rewritings (instruction selection)
 - branch duplication, "if-lifting" (e.g. side-exit moved up in "traces").
 - strength-reduction
 - SSA optimizations

We sketch below the various kinds of supported optimizations in development order...

## Block boundaries, branch duplication or factorization

Two possibility of branch duplications (e.g tail-duplication, loop unrolling, etc):

- during RTL -> BTL (while "building" BTL blocks)
- during BTL -> BTL. Typically, the "if-lifting" à la Justus could be performed/verified in a single pass here !

Branch factorization should also be possible in BTL -> RTL pass. Example: revert "loop-unrolling".

**IMPLEM NOTE:** a single verifier for RTL -> BTL and BTL -> RTL simulations, with a dupmap: BTL block-entries -> RTL nodes.

**CURRENT STATUS** 

- verifier: specified but not yet implemented.
- BTL -> RTL: proof started.

## "Basic" symbolic execution / symbolic simulation

IMPLEM PLAN in two steps:

1. a very basic version without rewriting rules, and without support for simulation modulo-liveness.

2. Support a "SVundef" symbolic value, and implement symbolic simulation wrt "less_undef" relation. 
This will allow more general rewriting rules and will generalize modulo-liveness (see next section).

**IMPLEM NOTE:** use a symbolic execution in Continuation-Passing Style, in order to have of simple handling of "side-exits" within "branch unfolding". 

## Simulation modulo liveness

Extends BTL with the possibility of destroying a set of registers at each exit (a destroyed register is simply set to Vundef).

We can the provide a "MarkUnused" pass of BTL -> BTL that insert these destructions at each exit using liveness info !
(NB: the liveness info could be provided by the RTL -> BTL oracle like on RTLpath).

In the symbolic execution, destroying a register corresponds to associate it SVundef value.

NB: this approach through "destroyed registers" and "SVundef" seems more modular (thus simpler) and more general than the current approach on RTLpath.

The proof of "simulation modulo liveness" of a BTL B1 by a BTL B2 will simply corresponds to compose two forward simulations:  B1 --> MarkUnused(B1) --> B2.

**IMPLEM NOTE:** "MarkUnused" checks the correctness of liveness informations while performing its transformation.
It should computes alive register from block entry, by forward proceeding of "internal instructions" (like in RTLPathLivegen).
When arriving on a final instruction, we marked as "destroyed" all alive registers that are not in the input registers of the exit
Semantically, the destruction of registers happens "in parallel" of the final step of the block.

## Port of RTLpath optimizations to BTL

- Generalize superblock scheduling for a notion of "extended-blocks" such that each instruction of a block, except the block entry, has a single predecessor in the block. 
- Port rewriting rules of RTLpath.
- Justus's "poor man SSA" + if-lifting.

## Invariants at block entry

Extends the symbolic simulation test to support invariants at block entry, including global renaming of registers.

Case-study: support of strength-reduction.

## Support of SSA-optimizations

Extends BTL with "register renamings" at exits (in a sense, "register renaming" extends the feature of "register destroying").

This should enable to represent SSA-forms in BTL IR, more or less like in MLIR.

## Alias analysis in the symbolic simulation

...