| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|\ |
|
| | |
|
| | |
|
|\| |
|
| |\ |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Another remnant of trying to devise a complicated algorithm for a
problem that was, in fact, very simple: I just had to check whether the
branch was within the loop body.
I tested it functionally on the benchmarks: only heapsort is changed, in
slightly worst (4-5%), because the old get_loop_info had done a buggy
guess that proved to be lucky for that particular case.
The other benchmarks are unchanged: the predictions stay the exact same.
The get_loop_info could potentially be improved by having a natural loop
detection that extends to outer loops (not just inner loops), though I
expect the performance improvements would be very small.
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
While I was developing the new trace linearize, I started off with
implementing a big algorithm reasoning on dependencies etc.., but I
realized later that it was giving a too different performance (sometimes
better, sometimes worst) than the original CompCert. So I
stripped it off gradually until its performance (on regular code with
just branch prediction) was on par with the base Linearize of CompCert.
I was aiming here for something that is either equal, or better, in
terms of performance.
My (then and current) theory is that I have stripped it out so much that
now it's just like the algorithm of CompCert, but with a modification
for Lcond instructions (see the new linearize_aux_cb). However, I never
tested that theory: the code worked, so I left it as is, without any
simplification. But now that I need to get a clear version for my
manuscript, I'm digging into it.
It turns out my theory is not really exact.
A difference is that instead of taking the minpc across the chain, I take
the pc of the very first block of the chain I create. This was (I think)
out of laziness in the middle of two iterations, except that I forgot
about it.
I tested my new theory by deleting all the stuff about dependencies
calculation (commited), and also computing a minpc just like original
compcert (not commited): I get the same exact Mach code than
linearize_aux_cb.
So right now, the only difference between linearize_aux_cb and
linearize_aux_trace is that slightly different minpc computation.
I think transitionning to linearize_aux_cb will be 1) much clearer than
this Frankenstein monster of linearize_aux_trace that I made, and 2)
might be better performing too.
I don't have access to Kalray machines today so i'm leaving this on hold
for now, but tomorrow I will test performance wise to see if there is a
regression. If there isn't, I will commit this (and it will be the
version narrated by my manuscript).
If there is a regression, it would mean selecting the pc of the first
node (in opposition to the minpc) is more performant, so i'd backtrack
the change to linearize_aux_cb anyway and there should then be 0
difference in the generated code.
|
| | | |
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It only works correctly if both profiling and static prediction are
used: it then compares both and gives stats in COMPCERT_PREDICT_STATS
file.
The stats are of the form:
total correct mispredicts missed
total = number of total CBs encountered
correct = number of correct predictions
mispredicts = times when static prediction did a wrong guess (predicted
the opposite from profiling, or predicted Some _ when profiling said
None)
missed = times when static prediction was not able to give a verdict,
though the profiling gave one
|
| | | |
|
| | | |
|
| | |
| | |
| | |
| | | |
cfrontend/C2C.ml
|
| | | |
|
|\ \ \
| |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | | |
PARTIAL MERGE (PARTLY BROKEN).
See unsolved conflicts in: aarch64/TO_MERGE and riscV/TO_MERGE
WARNING:
interface of va_args and assembly sections have changed
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
On PowerPC/Diab, common declarations must not be used for small data sections.
Add a `~common` option to `PrintAsmaux.variable_section` to control
the use of common declarations. The default is whatever is specified
on the command line using the `-fcommon` and `-fno-common` options.
Use `~common:false` for `Section_small_data` on PowerPC / Diab.
Note that on PowerPC/Linux, GCC uses common declarations for uninitialized
variables in small data section, so we keep doing this in CompCert as well.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Distinguish between:
- uninitialized variables, which can go in COMM if supported
- variables initialized with fixed, numeric quantities,
which can go in a readonly section if "const"
- variables initialized with symbol addresses which may need relocation,
which cannot go in a readonly section even if "const",
but can go in a special "const_data" section.
Also: on macOS, use ".const" instead of ".literal8" for literals,
as not all literals have size 8.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This is a generalization of the previous PrintAsmaux.common_section
function that
- handles initialized variables in addition to uninitialized variables;
- can be used for Section_const, not just for Section_data.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This avoids a new warning of Coq 8.13.
Eventually these `Global Hint` should become `#[export] Hint`,
with a cleaner but different meaning than `Global Hint`.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The previous branch tunneling was missing optimization opportunities
introduced by the optimization of conditional branches. For example:
L1: instr; branch L2
L2: if cond then branch L3 else branch L4
L3: branch L4
L4: ...
was transformed into
L1: instr; branch L2
L2: branch L4
L3: branch L4
L4: ...
missing a tunneling opportunity (branch L2 -> branch L4).
This commit improves branch tunneling so that the expected code is produced:
L1: instr; branch L4
L2: branch L4
L3: branch L4
L4: ...
To this end, additional equalities are introduced in the union-find
data structure corresponding to optimizable conditional branches.
In rare cases these additional equalities trigger new opportunities for
optimizing conditional branches. Hence we iterate the analysis
until no optimizable conditional branch remains.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We used to define an instrumented version record_goto' that also
builds the measure f, prove it correct, then show equivalence with
record_goto.
The new proofs make do without the instrumented version. They prove
strong existence of the measure, as in
`{ f | branch_map_correct (record_goto fn) f}`.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Since Coq 8.12, `omega` is flagged as deprecated and scheduled for removal.
Also replace CompCert's homemade tactics `omegaContradiction`, `xomega`,
and `xomegaContradiction` with `lia` and `extlia`.
Turn back on the deprecation warning for uses of `omega`.
Make the proof of `Ctypes.sizeof_pos` more robust to variations in `lia`.
|
| | |
| | |
| | |
| | |
| | |
| | | |
Instead of being a simple boolean we now use an option type to record
the number of fixed (non-vararg) arguments. Hence, `None` means
not vararg, and `Some n` means `n` fixed arguments followed with varargs.
|
| |/
|/| |
|
|\ \
| | |
| | |
| | | |
Merge remote-tracking branch 'origin/kvx-better2-cse3' into kvx-work
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| | | |
|
| |\ \ |
|
| | | | |
|
| | | | |
|