Add a canonical encoding of identifiers as numbers and use it in clightgen (#353)

Within CompCert, identifiers (names of C functions, variables, types, etc.) are represented by unique positive numbers, sometimes called "atoms".

In the original implementation, atoms 1, 2, ..., N are assigned to identifiers as they are encountered. The resulting numbers are small and are efficient when used as keys in data structures such as PTrees. However, the mapping from C source-level identifiers to atoms differs between compilation units. This is not a problem for CompCert but complicates CompCert-based verification tools that need to combine several compilation units.

This commit introduces an alternate implementation of atoms, suggested by Andrew Appel. The choice between implementations is governed by the Boolean reference `Camlcoq.use_canonical_atoms`.

In the alternate implementation, identifiers are converted to bit sequences via a Huffman encoding, then the bits are represented as positive numbers. The same identifier is always represented by the same number. However, the numbers are usually bigger than in the original implementation, making PTree operations slower: lookups and updates take time linear in the length of the identifier, instead of logarithmic time in the number of identifiers encountered.

The CompCert compiler (the `ccomp` executable) still uses the original implementation, but the `clightgen` tool used in conjunction with the VST program logic can use either implementation:

- The alternate "canonical atoms" implementation is used by default, and also if the `-canonical-idents` option is given.
- The original implementation is used if the `-short-idents` option is given.

Closes: #222
Closes: #311
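
For illustration only, the difference between the two schemes can be sketched in plain OCaml. This is not the code added to `Camlcoq`: the `positive` type below is a local mirror of Coq's positive numbers, `make_sequential`, `canonical_atom`, `append_char` and `depth` are hypothetical names, and a fixed 8-bits-per-character code stands in for the Huffman encoding.

```ocaml
(* Local mirror of Coq's positive numbers: XH = 1, XO p = 2p, XI p = 2p+1.
   Assumption: defined here only to keep the sketch self-contained. *)
type positive = XH | XO of positive | XI of positive

(* Original scheme: atoms 1, 2, ..., N in order of first encounter.
   Each call to [make_sequential] models one compilation unit. *)
let make_sequential () : string -> int =
  let tbl = Hashtbl.create 16 and next = ref 0 in
  fun (s : string) ->
    match Hashtbl.find_opt tbl s with
    | Some n -> n
    | None -> incr next; Hashtbl.add tbl s !next; !next

(* Push the 8 bits of [c] on top of [p].
   Assumption: fixed-width code, standing in for the Huffman code. *)
let append_char (c : char) (p : positive) : positive =
  let code = Char.code c in
  let rec go i acc =
    if i = 8 then acc
    else go (i + 1) (if code land (1 lsl i) <> 0 then XI acc else XO acc)
  in
  go 0 p

(* Canonical scheme: the atom depends only on the characters of [s]. *)
let canonical_atom (s : string) : positive =
  let p = ref XH in
  for i = String.length s - 1 downto 0 do
    p := append_char s.[i] !p
  done;
  !p

(* Number of binary digits below the leading one; a PTree lookup
   walks one tree level per such digit. *)
let rec depth = function
  | XH -> 0
  | XO p | XI p -> 1 + depth p

let () =
  (* Two units that encounter the same names in different orders. *)
  let unit_a = make_sequential () and unit_b = make_sequential () in
  ignore (unit_a "main"); ignore (unit_a "printf");
  ignore (unit_b "printf"); ignore (unit_b "main");
  assert (unit_a "main" <> unit_b "main");                 (* unit-dependent *)
  assert (canonical_atom "main" = canonical_atom "main");  (* unit-independent *)
  assert (depth (canonical_atom "main") = 8 * String.length "main")
```

In this sketch the canonical atom for a 4-character name is 32 levels deep, which is the cost mentioned above: the key length grows with the identifier length rather than with the logarithm of the number of identifiers. A Huffman code shortens the per-character cost for frequent characters but keeps the same linear behaviour.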
author: Xavier Leroy <xavierleroy@users.noreply.github.com> 2020-05-19 10:25:45 +0200
committer: GitHub <noreply@github.com> 2020-05-19 10:25:45 +0200
commit: 0eba6f63b6bc458d856e477f6f8ec6b78ef78c58 (patch)
tree: 6ab59e34aea0369013ed7e7461f3bf2f732d97bd /exportclight/Clightgen.ml
parent: 4a676623badb718da4055b7f26ee05f5097f4e7b (diff)
download: compcert-kvx-0eba6f63b6bc458d856e477f6f8ec6b78ef78c58.tar.gz
compcert-kvx-0eba6f63b6bc458d856e477f6f8ec6b78ef78c58.zip
1 files changed, 7 insertions, 2 deletions
diff --git a/exportclight/Clightgen.ml b/exportclight/Clightgen.ml
index f7279a5e..637454f0 100644
--- a/exportclight/Clightgen.ml
+++ b/exportclight/Clightgen.ml
@@ -98,6 +98,8 @@ Recognized source files:
   .i or .p       C source file that should not be preprocessed
 Processing options:
   -normalize     Normalize the generated Clight code w.r.t. loads in expressions
+  -canonical-idents  Use canonical numbers to represent identifiers  (default)
+  -short-idents  Use small, non-canonical numbers to represent identifiers
   -E             Preprocess only, send result to standard output
   -o <file>      Generate output in <file>
 |} ^
@@ -142,6 +144,8 @@ let cmdline_actions =
 (* Processing options *)
  [ Exact "-E", Set option_E;
   Exact "-normalize", Set option_normalize;
+  Exact "-canonical-idents", Set Camlcoq.use_canonical_atoms;
+  Exact "-short-idents", Unset Camlcoq.use_canonical_atoms;
   Exact "-o", String(fun s -> option_o := Some s);
   Prefix "-o", Self (fun s -> let s = String.sub s 2 ((String.length s) - 2) in
                               option_o := Some s);]
@@ -175,12 +179,13 @@ let cmdline_actions =
   ]
 
 let _ =
-  try
+try
   Gc.set { (Gc.get()) with
               Gc.minor_heap_size = 524288; (* 512k *)
               Gc.major_heap_increment = 4194304 (* 4M *)
          };
   Printexc.record_backtrace true;
+  Camlcoq.use_canonical_atoms := true;
   Frontend.init ();
   parse_cmdline cmdline_actions;
   if !option_o <> None && !num_input_files >= 2 then
@@ -188,7 +193,7 @@ let _ =
   if !num_input_files = 0 then
     fatal_error no_loc "no input file";
   perform_actions ()
-      with
+with
   | Sys_error msg
   | CmdError msg -> error no_loc "%s" msg; exit 2
   | Abort -> exit 2