aboutsummaryrefslogtreecommitdiffstats
path: root/backend
Commit message (Collapse)AuthorAgeFilesLines
...
* | | | Monday's work on RTLTunnelingproofPierre Goutagny2021-06-071-31/+153
| | | |
* | | | Add RTLTunnelingproof.vPierre Goutagny2021-06-041-0/+170
| | | |
* | | | Fix check_instr Icond target conditionsPierre Goutagny2021-06-041-2/+2
| | | |
* | | | Write RTLTunneling.vPierre Goutagny2021-06-031-0/+125
| | | |
* | | | Add RTLTunneling.vPierre Goutagny2021-06-031-0/+0
|/ / /
* | | Merge remote-tracking branch 'verimag/manuscript' into kvx-workCyril SIX2021-06-011-13/+11
|\ \ \
| * | | Do not rotate if the CB was already at the end.Cyril SIX2021-04-281-1/+5
| | | |
| * | | Heuristic counter updateCyril SIX2021-04-281-12/+6
| |/ /
* | | Merge remote-tracking branch 'origin/kvx-work' into merge_master_8.13.1Cyril SIX2021-06-014-545/+286
|\| |
| * | Merge remote-tracking branch 'origin/manuscript' into kvx-worksubmission_OOPSLA2021_AARCH64_KVXCyril SIX2021-04-133-511/+129
| |\ \
| | * | CleaningCyril SIX2021-04-021-2/+1
| | | |
| | * | More efficientCyril SIX2021-04-021-8/+12
| | | |
| | * | Outermost loop detection worksCyril SIX2021-04-021-9/+10
| | | |
| | * | Getting all loop bodiesCyril SIX2021-04-021-2/+36
| | | |
| | * | get_loop_headers simplification (using the new get_loop_backedges)Cyril SIX2021-04-021-39/+7
| | | |
| | * | Simple backedge detection (modified code from get_loop_headers)Cyril SIX2021-04-022-0/+43
| | | |
| | * | Big simplification of get_loop_infoCyril SIX2021-03-311-111/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Another remnant of trying to devise a complicated algorithm for a problem that was, in fact, very simple: I just had to check whether the branch was within the loop body. I tested it functionally on the benchmarks: only heapsort is changed, in slightly worst (4-5%), because the old get_loop_info had done a buggy guess that proved to be lucky for that particular case. The other benchmarks are unchanged: the predictions stay the exact same. The get_loop_info could potentially be improved by having a natural loop detection that extends to outer loops (not just inner loops), though I expect the performance improvements would be very small.
| | * | Simplification of the Linearize heuristic (same result functionally)Cyril SIX2021-03-301-216/+6
| | | |
| | * | Simplifications on Linearize - details belowCyril SIX2021-03-291-205/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While I was developing the new trace linearize, I started off with implementing a big algorithm reasoning on dependencies etc.., but I realized later that it was giving a too different performance (sometimes better, sometimes worst) than the original CompCert. So I stripped it off gradually until its performance (on regular code with just branch prediction) was on par with the base Linearize of CompCert. I was aiming here for something that is either equal, or better, in terms of performance. My (then and current) theory is that I have stripped it out so much that now it's just like the algorithm of CompCert, but with a modification for Lcond instructions (see the new linearize_aux_cb). However, I never tested that theory: the code worked, so I left it as is, without any simplification. But now that I need to get a clear version for my manuscript, I'm digging into it. It turns out my theory is not really exact. A difference is that instead of taking the minpc across the chain, I take the pc of the very first block of the chain I create. This was (I think) out of laziness in the middle of two iterations, except that I forgot about it. I tested my new theory by deleting all the stuff about dependencies calculation (commited), and also computing a minpc just like original compcert (not commited): I get the same exact Mach code than linearize_aux_cb. So right now, the only difference between linearize_aux_cb and linearize_aux_trace is that slightly different minpc computation. I think transitionning to linearize_aux_cb will be 1) much clearer than this Frankenstein monster of linearize_aux_trace that I made, and 2) might be better performing too. I don't have access to Kalray machines today so i'm leaving this on hold for now, but tomorrow I will test performance wise to see if there is a regression. If there isn't, I will commit this (and it will be the version narrated by my manuscript). If there is a regression, it would mean selecting the pc of the first node (in opposition to the minpc) is more performant, so i'd backtrack the change to linearize_aux_cb anyway and there should then be 0 difference in the generated code.
| * | | Adding more precise heuristic measuresCyril SIX2021-04-131-2/+62
| | | |
| * | | Adding overpredictsCyril SIX2021-04-131-3/+6
| | | |
| * | | Recording of prediction stats with COMPCERT_PROFILING_STATS environment flagCyril SIX2021-04-131-33/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It only works correctly if both profiling and static prediction are used: it then compares both and gives stats in COMPCERT_PREDICT_STATS file. The stats are of the form: total correct mispredicts missed total = number of total CBs encountered correct = number of correct predictions mispredicts = times when static prediction did a wrong guess (predicted the opposite from profiling, or predicted Some _ when profiling said None) missed = times when static prediction was not able to give a verdict, though the profiling gave one
| * | | removing unusued proof lineLéo Gourdin2021-04-091-1/+0
| | | |
| * | | Important commit on expansions' mini CSE, and a draft for addptrofsLéo Gourdin2021-04-061-1/+2
| | | |
* | | | [BROKEN] Merge with v3.9 : something broken for __builtin_expect in ↵Cyril SIX2021-06-014-10/+18
| | | | | | | | | | | | | | | | cfrontend/C2C.ml
* | | | replacing omega with lia in some fileLéo Gourdin2021-03-293-7/+8
| | | |
* | | | Merge branch 'master' into merge_master_8.13.1Sylvain Boulmé2021-03-2335-708/+737
|\ \ \ \ | |/ / / |/| | / | | |/ | |/| | | | | | | | | | | | | PARTIAL MERGE (PARTLY BROKEN). See unsolved conflicts in: aarch64/TO_MERGE and riscV/TO_MERGE WARNING: interface of va_args and assembly sections have changed
| * | Fix regression on PowerPC / DiabXavier Leroy2021-02-231-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On PowerPC/Diab, common declarations must not be used for small data sections. Add a `~common` option to `PrintAsmaux.variable_section` to control the use of common declarations. The default is whatever is specified on the command line using the `-fcommon` and `-fno-common` options. Use `~common:false` for `Section_small_data` on PowerPC / Diab. Note that on PowerPC/Linux, GCC uses common declarations for uninitialized variables in small data section, so we keep doing this in CompCert as well.
| * | Section handling: finer control of variable initializationXavier Leroy2021-02-233-10/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Distinguish between: - uninitialized variables, which can go in COMM if supported - variables initialized with fixed, numeric quantities, which can go in a readonly section if "const" - variables initialized with symbol addresses which may need relocation, which cannot go in a readonly section even if "const", but can go in a special "const_data" section. Also: on macOS, use ".const" instead of ".literal8" for literals, as not all literals have size 8.
| * | Introduce and use PrintAsmaux.variable_sectionXavier Leroy2021-02-231-7/+12
| | | | | | | | | | | | | | | | | | | | | This is a generalization of the previous PrintAsmaux.common_section function that - handles initialized variables in addition to uninitialized variables; - can be used for Section_const, not just for Section_data.
| * | Qualify `Hint` as `Global Hint` where appropriateXavier Leroy2021-01-2110-60/+60
| | | | | | | | | | | | | | | | | | | | | This avoids a new warning of Coq 8.13. Eventually these `Global Hint` should become `#[export] Hint`, with a cleaner but different meaning than `Global Hint`.
| * | Improve branch tunnelingXavier Leroy2021-01-132-60/+328
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous branch tunneling was missing optimization opportunities introduced by the optimization of conditional branches. For example: L1: instr; branch L2 L2: if cond then branch L3 else branch L4 L3: branch L4 L4: ... was transformed into L1: instr; branch L2 L2: branch L4 L3: branch L4 L4: ... missing a tunneling opportunity (branch L2 -> branch L4). This commit improves branch tunneling so that the expected code is produced: L1: instr; branch L4 L2: branch L4 L3: branch L4 L4: ... To this end, additional equalities are introduced in the union-find data structure corresponding to optimizable conditional branches. In rare cases these additional equalities trigger new opportunities for optimizing conditional branches. Hence we iterate the analysis until no optimizable conditional branch remains.
| * | Revised correctness proof for record_gotoXavier Leroy2021-01-131-68/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to define an instrumented version record_goto' that also builds the measure f, prove it correct, then show equivalence with record_goto. The new proofs make do without the instrumented version. They prove strong existence of the measure, as in `{ f | branch_map_correct (record_goto fn) f}`.
| * | Replace `omega` tactic with `lia`Xavier Leroy2020-12-2927-636/+636
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since Coq 8.12, `omega` is flagged as deprecated and scheduled for removal. Also replace CompCert's homemade tactics `omegaContradiction`, `xomega`, and `xomegaContradiction` with `lia` and `extlia`. Turn back on the deprecation warning for uses of `omega`. Make the proof of `Ctypes.sizeof_pos` more robust to variations in `lia`.
| * | Changed cc_varargs to an option typeBernhard Schommer2020-12-251-1/+1
| | | | | | | | | | | | | | | | | | Instead of being a simple boolean we now use an option type to record the number of fixed (non-vararg) arguments. Hence, `None` means not vararg, and `Some n` means `n` fixed arguments followed with varargs.
* | | add has_type infoDavid Monniaux2021-01-311-1/+3
| |/ |/|
* | Conditions now propagated by CSE3David Monniaux2021-01-205-264/+934
|\ \ | | | | | | | | | Merge remote-tracking branch 'origin/kvx-better2-cse3' into kvx-work
| * | totally switch off conditions in cse3David Monniaux2020-12-092-16/+21
| | |
| * | begin implementing -fcse3-conditionsDavid Monniaux2020-12-092-8/+15
| | |
| * | CSE3 + conditions proofDavid Monniaux2020-12-092-34/+58
| | |
| * | apply_cond_soundDavid Monniaux2020-12-091-0/+14
| | |
| * | apply_cond0_soundDavid Monniaux2020-12-091-1/+25
| | |
| * | apply_cond1_soundDavid Monniaux2020-12-091-0/+30
| | |
| * | proof for jumptableDavid Monniaux2020-12-091-1/+27
| | |
| * | one 'admit' lessDavid Monniaux2020-12-091-2/+36
| | |
| * | avancement dans les preuvesDavid Monniaux2020-12-091-1/+34
| | |
| * | CSE3 compiles again, but some admitted lemmasDavid Monniaux2020-12-091-10/+8
| | |
| * | progress moving to list of itemsDavid Monniaux2020-12-093-92/+233
| | |
| * | analysis with Abst_sameDavid Monniaux2020-12-083-22/+39
| | |
| * | CSE3 now runs on its own fixpoint iterator not based on Kildall.vDavid Monniaux2020-12-082-114/+3
| | |