Another remnant of trying to devise a complicated algorithm for a
problem that was, in fact, very simple: I just had to check whether the
branch was within the loop body.
I tested it functionally on the benchmarks: only heapsort changed,
slightly for the worse (4-5%), because the old get_loop_info had made a
buggy guess that happened to be lucky for that particular case.
The other benchmarks are unchanged: the predictions stay exactly the same.
get_loop_info could potentially be improved by a natural loop
detection that extends to outer loops (not just inner loops), though I
expect the performance improvement would be very small.
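As a rough illustration of the check (hypothetical names and verdict encoding, not CompCert's actual code), the prediction can reduce to asking whether the branch node lies within the loop body:

```python
def predict_branch(node, target, loop_body, loop_header):
    """Return True (predict taken), False (predict not taken), or None (no verdict)."""
    if node in loop_body and target == loop_header:
        return True    # back edge from inside the loop: assume another iteration
    if node in loop_body and target not in loop_body:
        return False   # exit branch from inside the loop: assume we stay in
    return None        # branch unrelated to this loop: no verdict

body = {1, 2, 3}   # nodes of an inner natural loop whose header is node 1
print(predict_branch(3, 1, body, 1))   # back edge
print(predict_branch(3, 9, body, 1))   # loop exit
print(predict_branch(7, 1, body, 1))   # branch outside the loop body
```

The whole decision hinges on the `node in loop_body` membership test, which is the "very simple" check mentioned above.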
While I was developing the new trace linearize, I started off by
implementing a big algorithm reasoning on dependencies etc., but I
later realized that its performance differed too much (sometimes
better, sometimes worse) from the original CompCert. So I
stripped it down gradually until its performance (on regular code with
just branch prediction) was on par with the base Linearize of CompCert.
I was aiming for something that is either equal or better in
terms of performance.
My (then and current) theory is that I stripped it down so much that
it is now just like the algorithm of CompCert, but with a modification
for Lcond instructions (see the new linearize_aux_cb). However, I never
tested that theory: the code worked, so I left it as is, without any
simplification. But now that I need a clear version for my
manuscript, I'm digging into it.
It turns out my theory is not quite exact.
One difference is that instead of taking the minpc across the chain, I take
the pc of the very first block of the chain I create. This was (I think)
out of laziness in the middle of two iterations, except that I forgot
about it.
I tested my new theory by deleting all the dependency-calculation
code (committed), and also computing a minpc just like the original
CompCert (not committed): I get exactly the same Mach code as
linearize_aux_cb.
So right now, the only difference between linearize_aux_cb and
linearize_aux_trace is this slightly different minpc computation.
I think transitioning to linearize_aux_cb will be 1) much clearer than
this Frankenstein monster of linearize_aux_trace that I made, and 2)
possibly better performing too.
I don't have access to Kalray machines today so I'm leaving this on hold
for now, but tomorrow I will test performance-wise to see if there is a
regression. If there isn't, I will commit this (and it will be the
version described in my manuscript).
If there is a regression, it would mean selecting the pc of the first
node (as opposed to the minpc) performs better, so I'd backtrack
the change to linearize_aux_cb anyway, and there should then be zero
difference in the generated code.
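A minimal sketch of that one remaining difference, with chains of basic-block pcs represented as plain lists (hypothetical names, assumed representation): keying each chain by its minimum pc versus by the pc of its first block can order the chains, and hence the emitted code, differently:

```python
def key_min(chain):    # minpc across the chain, as in linearize_aux_cb
    return min(chain)

def key_first(chain):  # pc of the chain's very first block, as in linearize_aux_trace
    return chain[0]

chains = [[12, 4, 7], [5, 6]]
print(sorted(chains, key=key_min))    # [[12, 4, 7], [5, 6]]
print(sorted(chains, key=key_first))  # [[5, 6], [12, 4, 7]]
```

When every chain starts at its own minimum pc the two keys coincide, which is why the difference only shows up on chains whose entry block is not the lowest-numbered one.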
It only works correctly if both profiling and static prediction are
used: it then compares the two and writes stats to the
COMPCERT_PREDICT_STATS file.
The stats are of the form:
total correct mispredicts missed
total = total number of CBs encountered
correct = number of correct predictions
mispredicts = times when static prediction made a wrong guess (predicted
the opposite of profiling, or predicted Some _ when profiling said
None)
missed = times when static prediction was unable to give a verdict,
though profiling gave one
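A sketch of how the four counters could be computed, assuming each CB yields a (static, profiled) pair of verdicts encoded as True/False/None (illustrative only; predict_stats is a hypothetical helper, not the actual instrumentation):

```python
def predict_stats(pairs):
    """pairs: (static, profiled) verdicts per CB; each is True, False, or None."""
    total = len(pairs)
    correct = mispredicts = missed = 0
    for static, profiled in pairs:
        if static is None and profiled is not None:
            missed += 1         # profiling had a verdict, static prediction didn't
        elif static is not None and (profiled is None or static != profiled):
            mispredicts += 1    # opposite guess, or Some _ where profiling said None
        elif static is not None and static == profiled:
            correct += 1
        # static None, profiled None: counted only in total
    return total, correct, mispredicts, missed

stats = predict_stats([(True, True), (False, True), (True, None),
                       (None, False), (None, None)])
print(stats)   # (5, 1, 2, 1)
```

Note that a CB where neither side gives a verdict still counts toward total, so total may exceed correct + mispredicts + missed.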
Merge remote-tracking branch 'origin/kvx-better2-cse3' into kvx-work
gricad-gitlab.univ-grenoble-alpes.fr:sixcy/CompCert into kvx-work
Note: the issue is still present later in Duplicateproof. This is
because I am examining an "identity ptree" which is far too big. I am
looking into whether I can make this ptree smaller by not including
nodes that map to themselves.
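A toy sketch of the idea, with a plain dict standing in for the ptree (hypothetical representation, not the actual Coq data structure): dropping every node that maps to itself leaves only the entries that carry information.

```python
def prune_identity(mapping):
    # keep only the entries that actually redirect a node somewhere else;
    # a lookup miss can then be read as "maps to itself"
    return {src: dst for src, dst in mapping.items() if src != dst}

revmap = {1: 1, 2: 5, 3: 3, 4: 9}
print(prune_identity(revmap))   # {2: 5, 4: 9}
```

The pruned map must then be queried with a default ("absent means identity") instead of assuming every node is present.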