aboutsummaryrefslogtreecommitdiffstats
path: root/cparser
Commit message (Collapse)AuthorAgeFilesLines
* Ignore unnecessary foldersYann Herklotz2023-04-271-1/+0
|
* wchar_t is unsigned int for arm_littleendian, arm_bigendian and aarch64Bernhard Schommer2023-02-011-2/+7
| | | | | | However, it is signed int for aarch64_apple. Fixes: #475
* Export `name_of_ikind` and `name_of_fkind`Xavier Leroy2023-02-011-0/+3
|
* Introduce `wchar_ikind` in machine descriptionsXavier Leroy2023-01-244-20/+12
| | | | | | This replaces `sizeof_wchar` and `wchar_signed`, and makes it possible to request that `wchar_t` is `long` instead of `int`, something that seems needed for PowerPC.
* Elaboration of compound initializers: reverse list of generated global variablesMichael Schmidt2023-01-231-1/+1
| | | | | | So as to respect dependency order when compound initializers are nested. Fixes: #471
* Ignore debug statements before the first case of a `switch`Xavier Leroy2022-11-031-1/+7
| | | | | | | | | | | | | | | | This can occur in debug mode if there are declarations before the first case, as in ``` switch (x) { int x; case 0: ... } ``` Without this commit, the code above is a structured switch if -g is not given, and a non-structured switch if -g is given. Co-authored-by: Michael Schmidt <github@mschmidt.me>
* Handle unstructured 'switch' statements such as Duff's deviceXavier Leroy2022-10-295-7/+217
| | | | | | | | | | | | | | | - New elaboration pass: SwitchNorm - recognizes structured 'switch' statements and puts them in a normalized form; - if selected, transforms unstructured 'switch' statements into a structured switch with goto actions + the original switch body with appropriate labels and gotos. - C2C treatment of 'switch' statements is simplified accordingly. - New language support option `-funstructured-switch`. - Some tests were added (test/regression/switch3.c).
* Unblock: never put debug info before a labelXavier Leroy2022-10-291-11/+12
| | | | | | | This ensures that normalized switch statements remain normalized. However, if the label and the labeled statement are on different lines, add an extra line directive corresponding to the label before the labeled statement.
* Revised passing of options to `Parse.preprocessed_file`Xavier Leroy2022-09-272-34/+28
| | | | | | Instead of passing a string that encodes the optional source-to-source transformations to perform, just pass one optional, named argument for each transformation. It is clearer and easier to extend this way.
* Export the functions that control warningsXavier Leroy2022-09-231-0/+13
| | | | | Activation and deactivation of a given warning, plus set-as-error and remove-as-error.
* Support C11 Unicode string literals and character constants (#452)Xavier Leroy2022-09-1912-110/+257
| | | | | | | | | | | | | | | | | * Support C11 Unicode string literals and character constants * Add tests for C11 string literals and character constants * Better error message for ill-formed universal character names E.g. \u followed by fewer than 4 hex digits, or \U followed by fewer than 8 hex digits. * Add new warning `invalid-utf8` for byte sequences that are not valid UTF8. The warning is activated but not fatal by default. * Warn on uses of C11 Unicode character constants and string literals This uses the `c11-extensions` warning, which is off by default. * Support preprocessing option -finput-charset= for GNU toolchains
* `_Generic` is a C11 feature, should trigger the corresponding warning if activeXavier Leroy2022-09-041-0/+1
|
* Rework of struct member offsets for debug info.Bernhard Schommer2022-09-031-2/+0
| | | | | | | | | | | | | The struct member byte and bit offsets are now set based upon the new function `struct_layout` from Ctypes.v, thus using the same code to compute as used to generate the actual struct access. The struct offset member information is addded using the types computed after the translation in C2C. Therefore we need to store the new internal names of the members as well as the composites and use them when adding the offset information. Fixes: #445
* Improve control-flow analysis of "noreturn" function callsXavier Leroy2022-05-191-4/+18
| | | | | | | | - Honor "noreturn" attributes and _Noreturn declarations on calls to function pointers. - Don't crash if a function type is an unknown typedef. (This can happen with local typedefs.) Instead, conservatively assume this function can return.
* Revised warning for non-prototyped function declarationsXavier Leroy2022-05-131-17/+22
| | | | | | - Emit the warning even for functions without parameters `int f() {...}` - Show what the type is after prototyping - Refactor this part of `elab_fundef` slightly
* Support _Generic from ISO C 2011Xavier Leroy2022-05-137-711/+1031
| | | | Entirely handled during elaboration. No impact on the verified part of CompCert.
* Fix the parsing of unprototyped function types in casts (#431)Xavier Leroy2022-05-091-2/+2
| | | | | Before, in casts, `int (*)()` was parsed like `int (*)(void)`. This caused expressions like `((int (*)()) &f)(42)` to be rejected. Declarations were correctly parsed.
* Completely avoid line breaks in types when printing error messagesXavier Leroy2022-05-071-6/+10
| | | | | The `pp_indication` optional argument that governs formatting boxes in types was not propagated recursively, causing boxes to appear.
* Enum is only compatible with IInt.Xavier Leroy2022-05-061-4/+4
| | | | | | Enum types should only be compatible with the underlying integer type, not all integer types. Bug 33212
* Revised checks for multi-character constants 'xyz'Xavier Leroy2021-11-161-24/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous code for elaborating character constants has a small bug: the value of a wide character constant consisting of several characters was normalized to type `int`, while, statically, it has type `wchar_t`. If `wchar_t` is `unsigned short`, for example, the constant `L'ab'` would elaborate to 6357090, which is not of type `unsigned short`. This commit fixes the bug by normalizing wide character constants to type `wchar_t`, regardless of how many characters they contain. The previous code was odd in another respect: leading `\0` characters in multi-character constants were ignored. Hence, `'\0bcde'` was accepted while `'abcde'` caused a warning. This commit implements a more predictable behavior: the number of characters in a character literal is limited a priori to sizeof(type of result) / sizeof(type of each character) So, for non-wide character constants we can typically have up to 4 characters (sizeof(int) / sizeof(char)), while for wide character constants we can only have one character. In effect, multiple-character wide character literals are not supported. This is allowed by the ISO C99 standard and seems consistent with GCC and Clang. Finally, a multi-character constant with too many characters was reported either as an error (if the computation overflowed the 64-bit accumulator) or as a warning (otherwise). Here, we make this an error in all cases. GCC and Clang only produce warnings, and truncate the value of the character constant, but an error feels safer.
* Resurrect a warning for bit fields of enum typesXavier Leroy2021-11-121-15/+33
| | | | | | | | | | | | | | | | Earlier CompCert versions would warn if a bit field of width N and type enum E was too small for the values of the enumeration: whether the field is interpreted as a N-bit signed integer or a N-bit unsigned integer, some values of the enumeration are not representable. This warning was performed in the Bitfields emulation pass, which went away during the reimplementation of bit fields within the verified part of CompCert. In this commit, we resurrect the warning and perform it during the Elab pass. In passing, some of the code that elaborates bit fields was moved to a separate function "check_bitfield".
* Ignore unnamed bit fields for initialization of unionsBernhard Schommer2021-09-282-7/+16
| | | | | | | | | | | | When a union is initialized with an initializer without designator the first named member should be initialized. This commit skips members without names during the elaboration of union initializers. Note that anonymous members (unnamed members of struct or union type) should not be skipped, and are not skipped since elaboration give names to these members. Bug 31982
* Ignore unnamed plain members of structs and unionsXavier Leroy2021-09-281-10/+15
| | | | | | | | | | | | | | | | | | | E.g. `struct { int; int x; };`. The `int;` declaration provides no name, is not a bit field, and is not a C11 anonymous struct/union member. Such declarations are not allowed by the C99 grammar, even though GCC, Clang and CompCert tolerate them. The C11 grammar allows these declarations but the standard text gives them no meaning. CompCert used to warn about such declarations, yet include them in the struct or union as unnamed members, similar to an unnamed bit field. This is incorrect and inconsistent with what GCC and Clang do. With this commit, CompCert still warns, then ignores the declaration and does not create an unnamed member. This is consistent with GCC and Clang. Fixes: #411
* Merge branch 'bitfields' (#400)Xavier Leroy2021-08-225-654/+46
|\
| * Native support for bit fields (#400)Xavier Leroy2021-08-225-654/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This big PR adds support for bit fields in structs and unions to the verified part of CompCert, namely the CompCert C and Clight languages. The compilation of bit field accesses to normal integer accesses + shifts and masks is done and proved correct as part of the Cshmgen pass. The layout of bit fields in memory is done by the functions in module Ctypes. It follows the ELF ABI layout algorithm. As a bonus, basic soundness properties of the layout are shown, such as "two different bit fields do not overlap" or "a bit field and a regular field do not overlap". All this replaces the previous emulation of bit fields by source-to-source rewriting in the unverified front-end of CompCert (module cparse/Bitfield.ml). This emulation was prone to errors (see nonstandard layout instead. The core idea for the PR is that expressions in l-value position denote not just a block, a byte offset and a type, but also a bitfield designator saying whether all the bits of the type are accessed (designator Full) or only some of its bits (designator Bits). Designators of the Bits kind appear when the l-value is a bit field access; the bit width and bit offset in Bits are computed by the functions in Ctypes that implement the layout algorithm. Consequently, both in the semantics of CompCert C and Clight and in the SimplExpr, SimplLocals and Cshmgen compilation passes, pairs of a type and a bitfield designator are used in a number of places where a single type was used before. The introduction of bit fields has a big impact on static initialization (module cfrontend/Initializers.v), which had to be rewritten in large part, along with its soundness proof (cfrontend/Initializersproof.v). Both static initialization and run-time manipulation of bit fields are tested in test/abi using differential testing against GCC and randomly-generated structs. This work exposed subtle interactions between bit fields and the volatile modifier. Currently, the volatile modifier is ignored when accessing a bit field (and a warning is printed at compile-time), just like it is ignored when accessing a struct or union as a r-value. Currently, the natural alignment of bit fields and their storage units cannot be modified with the aligned attribute. _Alignas on bit fields is rejected as per C11, and the packed modifier cannot be applied to a struct containing bit fields.
* | Support `# 0 ...` preprocessed line directiveXavier Leroy2021-06-011-1/+1
|/ | | | | | | Before, the line number had to start with a nonzero digit. However, the GCC 11 preprocessor was observed to produce `# 0 ...` directives. Fixes: #398
* Use the LGPL instead of the GPL for dual-licensed filesXavier Leroy2021-05-0848-192/+240
| | | | | | The GPL makes sense for whole applications, but the dual-licensed Coq and OCaml files are more like libraries to be combined with other code, so the LGPL is more appropriate.
* Fix evaluation order in emulation of bitfield assignmentXavier Leroy2021-05-021-2/+2
| | | | | | | | | | | | | | | | A bitfield assignment `x.b = f()` is expanded into a read-modify-write on `x.carrier`. Wrong results can occur if `x.carrier` is read before the call to `f()`, and `f` itself modifies a bitfield with the same carrier `x.carrier`. In this temporary fix, we play on the evaluation order implemented by the SimplExpr pass of CompCert (left-to-right for side-effecting subexpression) to make sure the read part of the read-modify-write sequence occurs after the evaluation of the right-hand side. More substantial fixes will be considered later. Fixes: #395
* Support __builtin_unreachableXavier Leroy2021-05-021-1/+5
| | | | Not yet used for optimizations.
* Fix spurious error on initialization of struct with flexible array memberXavier Leroy2021-05-021-0/+3
| | | | | | | | | | | | The following is correct but was causing a "wrong type for array initializer" fatal error. ``` struct s { int n; int d[]; }; void f(void) { struct s x = {0}; } ``` Co-authored-by: Michael Schmidt <github@mschmidt.me>
* Elab bitfields: check size of type <=32bit rather than checking rank (#387)Amos Robinson2021-04-191-1/+1
| | | | | | | | | | | | When desugaring a bitfield, allow any integral type that is 32 bits or smaller. Previously this was checking the rank of the type rather than the size. This rank check caused issues with standard headers that declare `uint32_t` to be an `unsigned long` rather than an `unsigned int`. Here, any bitfields declared as `uint32_t` were failing to compile even though they are still actually 32 bits. Co-authored-by: Amos Robinson <amos@gh.st>
* Refactor cparser/Parse.mlXavier Leroy2021-04-191-31/+29
| | | | | | | | | - Use pipeline notation `|>` for legibility and better GC behavior (in bytecode at least). - Introduce auxiliary functions. - Remove useless function parameters. - Fix the timing of the "Emulations" pass (because of an extra parameter, what was timed took zero time).
* Ensure compatibility with future versions of MenhirLibXavier Leroy2021-04-191-6/+7
| | | | | | | | After Menhir version 20210310, the `Fail_pr` constructor of the `parse_result` type becomes `Fail_pr_full` with two extra arguments. This PR enables CompCert to handle both versions of the `parse_result` type in MenhirLib.
* "macosx" is now called "macos"Xavier Leroy2021-01-182-3/+3
| | | | | The configure script still accepts "macosx" for backward compatibility, but every other part of CompCert now uses "macos".
* Change warning for pragmas inside functionsXavier Leroy2021-01-161-1/+1
| | | | | | | Follow-up to 35e2b11db. Put the warning "pragmas are ignored inside functions" inside the Unnamed category, so that it is displayed by default and cannot be disabled.
* Ignore and warn about pragmas inside functionsXavier Leroy2021-01-071-1/+4
| | | | | | | | | | | | | | | Pragmas can occur either outside external declarations, at the top level of a compilation unit, or within a compound statement, inside a function definition. The parse tree in cparse/C.mli cannot represent pragmas occuring within a compound statement. In this case, the elaborator used to silently move the pragma to top level, just before the function definition where the pragma occurs. It looks safer to just ignore pragmas occurring inside a function definition, and emit a specific warning.
* AArch64: macOS portXavier Leroy2020-12-262-0/+4
| | | | | This commit adds support for macOS (and probably iOS) running on AArch64 / ARM 64-bit / "Apple silicon" processors.
* C parser: handle other built-in types than __builtin_va_listXavier Leroy2020-12-261-1/+2
| | | | | All the built-in types declared in $ARCH/CBuiltins.ml are now recognized as type names initially.
* Check ptr arithmetic for ++ and --Bernhard Schommer2020-09-201-10/+16
| | | | | Also: improve check for ptr - integer. (Added by Xavier Leroy <xavier.leroy@college-de-france.fr>)
* Add new static-assert token for deLexer utility; bug 29273Michael Schmidt2020-08-041-0/+1
|
* Add comments we missed to sync to GitHubChristoph Cullmann2020-07-301-0/+2
|
* Allow string_literals_list in _Static_assert.Bernhard Schommer2020-07-272-24/+25
| | | | | | Not all pre-processors concatenate string literal lists, however they are allowed in _Static_assert. This is similar to the rules for inline assembly etc.
* More checks for __builtin_va_start (#250)Bernhard Schommer2020-07-211-6/+10
| | | | We check that this builtin function is only called from within a variadic function and has the correct number of arguments.
* cparser/handcrafted.messages: missing blank lineXavier Leroy2020-07-211-0/+1
|
* Updated handcrafted.messages.Bernhard Schommer2020-07-211-0/+108
| | | | | Added error descriptions for the new syntax errors introduced by '_Static_assert'.
* Support _Static_assert from C11Xavier Leroy2020-07-217-1060/+1116
|
* Support __builtin_constant_p as in GCC and Clang (#367)Xavier Leroy2020-07-211-0/+10
| | | | | Returns 1 if the argument is a constant expression, 0 otherwise. Closes: #366
* Use the correct location for Slabaled in transform.Bernhard Schommer2020-07-211-2/+2
|
* Added error for redefined builtin.Bernhard Schommer2020-07-203-0/+6
| | | | | | We check in the initial environment if a function is already defined to avoid redefinition of functions that are part of the builtin environment.
* Introduce additional "branch" build information.Bernhard Schommer2020-07-081-5/+5
|