diff options
author | xleroy <xleroy@fca1b0fc-160b-0410-b1d3-a4f43f01ea2e> | 2010-03-03 10:25:25 +0000 |
---|---|---|
committer | xleroy <xleroy@fca1b0fc-160b-0410-b1d3-a4f43f01ea2e> | 2010-03-03 10:25:25 +0000 |
commit | 93d89c2b5e8497365be152fb53cb6cd4c5764d34 (patch) | |
tree | 0de8d05bbd0eeaeb5e4b85395f8dd576984b6a9e /cil/doc/cil004.html | |
parent | 891377ce1962cdb31357d6580d6546ec22df2b4f (diff) | |
download | compcert-93d89c2b5e8497365be152fb53cb6cd4c5764d34.tar.gz compcert-93d89c2b5e8497365be152fb53cb6cd4c5764d34.zip |
Getting rid of CIL
git-svn-id: https://yquem.inria.fr/compcert/svn/compcert/trunk@1270 fca1b0fc-160b-0410-b1d3-a4f43f01ea2e
Diffstat (limited to 'cil/doc/cil004.html')
-rw-r--r-- | cil/doc/cil004.html | 350 |
1 files changed, 0 insertions, 350 deletions
diff --git a/cil/doc/cil004.html b/cil/doc/cil004.html deleted file mode 100644 index 16fde392..00000000 --- a/cil/doc/cil004.html +++ /dev/null @@ -1,350 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" - "http://www.w3.org/TR/REC-html40/loose.dtd"> -<HTML> -<HEAD> - - - -<META http-equiv="Content-Type" content="text/html; charset=ANSI_X3.4-1968"> -<META name="GENERATOR" content="hevea 1.08"> - -<base target="main"> -<script language="JavaScript"> -<!-- Begin -function loadTop(url) { - parent.location.href= url; -} -// --> -</script> -<LINK rel="stylesheet" type="text/css" href="cil.css"> -<TITLE> -Compiling C to CIL -</TITLE> -</HEAD> -<BODY > -<A HREF="cil003.html"><IMG SRC ="previous_motif.gif" ALT="Previous"></A> -<A HREF="ciltoc.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A> -<A HREF="cilly.html"><IMG SRC ="next_motif.gif" ALT="Next"></A> -<HR> - -<H2 CLASS="section"><A NAME="htoc4">4</A> Compiling C to CIL</H2><A NAME="sec-cabs2cil"></A> -In this section we try to describe a few of the many transformations that are -applied to a C program to convert it to CIL. The module that implements this -conversion is about 5000 lines of OCaml code. In contrast a simple program -transformation that instruments all functions to keep a shadow stack of the -true return address (thus preventing stack smashing) is only 70 lines of code. -This example shows that the analysis is so much simpler because it has to -handle only a few simple C constructs and also because it can leverage on CIL -infrastructure such as visitors and pretty-printers.<BR> -<BR> -In no particular order these are a few of the most significant ways in which -C programs are compiled into CIL: -<OL CLASS="enumerate" type=1><LI CLASS="li-enumerate"> -CIL will eliminate all declarations for unused entities. This means that -just because your hello world program includes <TT>stdio.h</TT> it does not mean -that your analysis has to handle all the ugly stuff from <TT>stdio.h</TT>.<BR> -<BR> -<LI CLASS="li-enumerate">Type specifiers are interpreted and normalized: -<PRE CLASS="verbatim"><FONT COLOR=blue> -int long signed x; -signed long extern x; -long static int long y; - -// Some code that uses these declaration, so that CIL does not remove them -int main() { return x + y; } -</FONT></PRE> -See the <A HREF="examples/ex1.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Anonymous structure and union declarations are given a name. -<PRE CLASS="verbatim"><FONT COLOR=blue> - struct { int x; } s; -</FONT></PRE> -See the <A HREF="examples/ex2.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Nested structure tag definitions are pulled apart. This means that all -structure tag definitions can be found by a simple scan of the globals. -<PRE CLASS="verbatim"><FONT COLOR=blue> -struct foo { - struct bar { - union baz { - int x1; - double x2; - } u1; - int y; - } s1; - int z; -} f; -</FONT></PRE> -See the <A HREF="examples/ex3.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">All structure, union, enumeration definitions and the type definitions -from inners scopes are moved to global scope (with appropriate renaming). This -facilitates moving around of the references to these entities. -<PRE CLASS="verbatim"><FONT COLOR=blue> -int main() { - struct foo { - int x; } foo; - { - struct foo { - double d; - }; - return foo.x; - } -} -</FONT></PRE> -See the <A HREF="examples/ex4.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Prototypes are added for those functions that are called before being -defined. Furthermore, if a prototype exists but does not specify the type of -parameters that is fixed. But CIL will not be able to add prototypes for those -functions that are neither declared nor defined (but are used!). -<PRE CLASS="verbatim"><FONT COLOR=blue> - int f(); // Prototype without arguments - int f(double x) { - return g(x); - } - int g(double x) { - return x; - } -</FONT></PRE> -See the <A HREF="examples/ex5.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Array lengths are computed based on the initializers or by constant -folding. -<PRE CLASS="verbatim"><FONT COLOR=blue> - int a1[] = {1,2,3}; - int a2[sizeof(int) >= 4 ? 8 : 16]; -</FONT></PRE> -See the <A HREF="examples/ex6.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Enumeration tags are computed using constant folding: -<PRE CLASS="verbatim"><FONT COLOR=blue> -int main() { - enum { - FIVE = 5, - SIX, SEVEN, - FOUR = FIVE - 1, - EIGHT = sizeof(double) - } x = FIVE; - return x; -} - -</FONT></PRE> -See the <A HREF="examples/ex7.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Initializers are normalized to include specific initialization for the -missing elements: -<PRE CLASS="verbatim"><FONT COLOR=blue> - int a1[5] = {1,2,3}; - struct foo { int x, y; } s1 = { 4 }; -</FONT></PRE> -See the <A HREF="examples/ex8.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Initializer designators are interpreted and eliminated. Subobjects are -properly marked with braces. CIL implements -the whole ISO C99 specification for initializer (neither GCC nor MSVC do) and -a few GCC extensions. -<PRE CLASS="verbatim"><FONT COLOR=blue> - struct foo { - int x, y; - int a[5]; - struct inner { - int z; - } inner; - } s = { 0, .inner.z = 3, .a[1 ... 2] = 5, 4, y : 8 }; -</FONT></PRE> -See the <A HREF="examples/ex9.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">String initializers for arrays of characters are processed -<PRE CLASS="verbatim"><FONT COLOR=blue> -char foo[] = "foo plus bar"; -</FONT></PRE> -See the <A HREF="examples/ex10.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">String constants are concatenated -<PRE CLASS="verbatim"><FONT COLOR=blue> -char *foo = "foo " " plus " " bar "; -</FONT></PRE> -See the <A HREF="examples/ex11.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Initializers for local variables are turned into assignments. This is in -order to separate completely the declarative part of a function body from the -statements. This has the unfortunate effect that we have to drop the <TT>const</TT> -qualifier from local variables ! -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x = 5; - struct foo { int f1, f2; } a [] = {1, 2, 3, 4, 5 }; -</FONT></PRE> -See the <A HREF="examples/ex12.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Local variables in inner scopes are pulled to function scope (with -appropriate renaming). Local scopes thus disappear. This makes it easy to find -and operate on all local variables in a function. -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x = 5; - int main() { - int x = 6; - { - int x = 7; - return x; - } - return x; - } -</FONT></PRE> -See the <A HREF="examples/ex13.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Global declarations in local scopes are moved to global scope: -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x = 5; - int main() { - int x = 6; - { - static int x = 7; - return x; - } - return x; - } -</FONT></PRE> -See the <A HREF="examples/ex14.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Return statements are added for functions that are missing them. If the -return type is not a base type then a <TT>return</TT> without a value is added. -The guaranteed presence of return statements makes it easy to implement a -transformation that inserts some code to be executed immediately before -returning from a function. -<PRE CLASS="verbatim"><FONT COLOR=blue> - int foo() { - int x = 5; - } -</FONT></PRE> -See the <A HREF="examples/ex15.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">One of the most significant transformations is that expressions that -contain side-effects are separated into statements. -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x, f(int); - return (x ++ + f(x)); -</FONT></PRE> -See the <A HREF="examples/ex16.txt">CIL output</A> for this -code fragment<BR> -<BR> -Internally, the <TT>x ++</TT> statement is turned into an assignment which the -pretty-printer prints like the original. CIL has only three forms of basic -statements: assignments, function calls and inline assembly.<BR> -<BR> -<LI CLASS="li-enumerate">Shortcut evaluation of boolean expressions and the <TT>?:</TT> operator are -compiled into explicit conditionals: -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x; - int y = x ? 2 : 4; - int z = x || y; - // Here we duplicate the return statement - if(x && y) { return 0; } else { return 1; } - // To avoid excessive duplication, CIL uses goto's for - // statement that have more than 5 instructions - if(x && y || z) { x ++; y ++; z ++; x ++; y ++; return z; } -</FONT></PRE> -See the <A HREF="examples/ex17.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">GCC's conditional expression with missing operands are also compiled -into conditionals: -<PRE CLASS="verbatim"><FONT COLOR=blue> - int f();; - return f() ? : 4; -</FONT></PRE> -See the <A HREF="examples/ex18.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">All forms of loops (<TT>while</TT>, <TT>for</TT> and <TT>do</TT>) are compiled -internally as a single <TT>while(1)</TT> looping construct with explicit <TT>break</TT> -statement for termination. For simple <TT>while</TT> loops the pretty printer is -able to print back the original: -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x, y; - for(int i = 0; i<5; i++) { - if(i == 5) continue; - if(i == 4) break; - i += 2; - } - while(x < 5) { - if(x == 3) continue; - x ++; - } -</FONT></PRE> -See the <A HREF="examples/ex19.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">GCC's block expressions are compiled away. (That's right there is an -infinite loop in this code.) -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x = 5, y = x; - int z = ({ x++; L: y -= x; y;}); - return ({ goto L; 0; }); -</FONT></PRE> -See the <A HREF="examples/ex20.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">CIL contains support for both MSVC and GCC inline assembly (both in one -internal construct)<BR> -<BR> -<LI CLASS="li-enumerate">CIL compiles away the GCC extension that allows many kinds of constructs -to be used as lvalues: -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x, y, z; - return &(x ? y : z) - & (x ++, x); -</FONT></PRE> -See the <A HREF="examples/ex21.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">All types are computed and explicit casts are inserted for all -promotions and conversions that a compiler must insert:<BR> -<BR> -<LI CLASS="li-enumerate">CIL will turn old-style function definition (without prototype) into -new-style definitions. This will make the compiler less forgiving when -checking function calls, and will catch for example cases when a function is -called with too few arguments. This happens in old-style code for the purpose -of implementing variable argument functions.<BR> -<BR> -<LI CLASS="li-enumerate">Since CIL sees the source after preprocessing the code after CIL does -not contain the comments and the preprocessing directives.<BR> -<BR> -<LI CLASS="li-enumerate">CIL will remove from the source file those type declarations, local -variables and inline functions that are not used in the file. This means that -your analysis does not have to see all the ugly stuff that comes from the -header files: -<PRE CLASS="verbatim"><FONT COLOR=blue> -#include <stdio.h> - -typedef int unused_type; - -static char unused_static (void) { return 0; } - -int main() { - int unused_local; - printf("Hello world\n"); // Only printf will be kept from stdio.h -} -</FONT></PRE> -See the <A HREF="examples/ex22.txt">CIL output</A> for this -code fragment</OL> -<HR> -<A HREF="cil003.html"><IMG SRC ="previous_motif.gif" ALT="Previous"></A> -<A HREF="ciltoc.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A> -<A HREF="cilly.html"><IMG SRC ="next_motif.gif" ALT="Next"></A> -</BODY> -</HTML> |