author m8pple <dt10@imperial.ac.uk> 2017-01-27 18:37:47 +0000
committer GitHub <noreply@github.com> 2017-01-27 18:37:47 +0000
commit 80f0f762c9ff26b38d2fb75f72d1e02107bc55e0
tree 8f27d4a0e42af7c2ecf505298670c7e8a0e603c0 /1-lexer.md
parent e78934877daa8f0fdb8c67e063724af26c20670e
Update string literal spec. Thanks to @ps-george. Closes #2.
Diffstat (limited to '1-lexer.md')
-rw-r--r-- 1-lexer.md 26
1 files changed, 25 insertions, 1 deletions
diff --git a/1-lexer.md b/1-lexer.md
index 9395704..809f197 100644
--- a/1-lexer.md
+++ b/1-lexer.md
@@ -46,7 +46,9 @@ token, and the top-level output will be an array of those dictionaries.
Each dictionary should have the following properties:
- "Text" : The original text of the token, not including any
- surrounding white-space.
+ surrounding white-space. A special case is made for string literals, where
+ the surrounding '"'s can be omitted, though if they are included (appropriately
+ escaped) it [is fine too](#2).
- "Class" : Describes the class of token, with one of the following classes:
@@ -157,6 +159,28 @@ the ordering of `key:value` pairs does not matter. However,
the ordering of tokens within the toplevel array `[]` _does_
matter.
+There is some [ambiguity](#2) over string literals, which might
+or might not include the quotes. _Both_ forms will be accepted as
+valid, so a session with a lexer that omits the quotes
+could be (preferred):
+````
+$ cat wibble.c
+z="wibble"
+$ cat wibble.c | bin/c_lexer
+[ { "Class" : "Identifier", "Text": "z" },
+ { "Class" : "Operator", "Text": "=" },
+ { "Class" : "StringLiteral", "Text": "wibble" }
+]
+````
+or
+````
+
+$ cat wibble.c | bin/c_lexer
+[ { "Class" : "Identifier", "Text": "z" },
+ { "Class" : "Operator", "Text": "=" },
+ { "Class" : "StringLiteral", "Text": "\"wibble\"" }
+]
+````
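Editorial sketch (not part of the commit): one way a checker could treat the two string-literal forms as equivalent is to strip a single pair of surrounding quotes from `StringLiteral` tokens before comparing. The function names `normalise_string_text` and `tokens_equivalent`, and the sample token lists for the input `z="wibble"`, are assumptions made for illustration. Nothing here is taken from the actual marking scripts.

```python
def normalise_string_text(text):
    """Strip one pair of surrounding double quotes, if present.

    The unquoted body of a C string literal cannot begin with an
    unescaped '"', so a leading '"' is taken to mean the quoted form.
    """
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return text[1:-1]
    return text


def tokens_equivalent(expected, actual):
    """Compare two token lists in order, ignoring quote style on string literals."""
    if len(expected) != len(actual):
        return False
    for exp, act in zip(expected, actual):
        if exp["Class"] != act["Class"]:
            return False
        exp_text, act_text = exp["Text"], act["Text"]
        if exp["Class"] == "StringLiteral":
            exp_text = normalise_string_text(exp_text)
            act_text = normalise_string_text(act_text)
        if exp_text != act_text:
            return False
    return True


# Hypothetical outputs from two lexers for the input: z="wibble"
bare = [
    {"Class": "Identifier", "Text": "z"},
    {"Class": "Operator", "Text": "="},
    {"Class": "StringLiteral", "Text": "wibble"},
]
quoted = [
    {"Class": "Identifier", "Text": "z"},
    {"Class": "Operator", "Text": "="},
    {"Class": "StringLiteral", "Text": '"wibble"'},
]
```

With these definitions, `tokens_equivalent(bare, quoted)` accepts both forms, while token order and class still have to match exactly, in line with the rule that ordering within the top-level array matters.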
Pre-Processor
-------------