author    Yann Herklotz <git@yannherklotz.com>  2022-04-21 18:45:13 +0100
committer Yann Herklotz <git@yannherklotz.com>  2022-04-21 18:45:13 +0100
commit    06d2ac31a9a07bd8d73f9f4792abd8dc97c4649e (patch)
tree      b2dc191360e393bcd4c2b1d69d34d9ca16b80a62
parent    f9c4176a48c61530aa9bc1da77d9b1a6b3212871 (diff)
Add an index
-rw-r--r--  .gitignore               |  2
-rw-r--r--  chapters/background.tex  | 32
-rw-r--r--  chapters/hls.tex         | 98
-rw-r--r--  chapters/schedule.tex    |  8
-rw-r--r--  chapters/scheduling.tex  | 76
-rw-r--r--  lsr_env.tex              |  4
-rw-r--r--  main.tex                 |  6
7 files changed, 119 insertions, 107 deletions
diff --git a/.gitignore b/.gitignore
index 3a8cb0d..2999112 100644
--- a/.gitignore
+++ b/.gitignore
@@ -281,7 +281,7 @@ TSWLatexianTemp*
# auto folder when using emacs and auctex
./auto/*
-*.el
+.auctex-auto/
# expex forward references with \gathertags
*-tags.tex
diff --git a/chapters/background.tex b/chapters/background.tex
index 7208050..19a8c15 100644
--- a/chapters/background.tex
+++ b/chapters/background.tex
@@ -213,15 +213,15 @@ of the compiler, meaning this method of proving algorithms correct ensures a cor
\subsection{CompCert}
-CompCert~\cite{leroy09_formal_verif_compil_back_end} is a formally verified C compiler written in
-Coq~\cite{bertot04_inter_theor_provin_progr_devel}. Coq is a theorem prover, meaning algorithms
-written in Coq can be reasoned about in Coq itself by proving various properties about the
-algorithms. To then run these algorithms that have been proven correct, they can be extracted
-directly to OCaml code and then executed, as there is a straightforward correspondence between Coq
-and OCaml code. During this translation, all the proofs are erased, as they are not needed during
-the execution of the compiler, as they are only needed when the correctness of the compiler needs to
-be checked. With this process, one can have a Compiler that satisfies various correctness
-properties and can therefore be proven to preserve the behaviour of the code.
+\index{CompCert}CompCert~\cite{leroy09_formal_verif_compil_back_end} is a formally verified C
+compiler written in Coq~\cite{bertot04_inter_theor_provin_progr_devel}. Coq is a theorem prover,
+meaning algorithms written in Coq can be reasoned about in Coq itself by proving various properties
+about the algorithms. To then run these algorithms that have been proven correct, they can be
+extracted directly to OCaml code and then executed, as there is a straightforward correspondence
+between Coq and OCaml code. During this translation, all the proofs are erased, since they are not
+needed during the execution of the compiler; they are only needed when its correctness is checked.
+With this process, one obtains a compiler that satisfies various correctness properties and can
+therefore be proven to preserve the behaviour of the code.
CompCert contains eleven intermediate languages, which are used to gradually translate C code into
assembly that has the same behaviour. Proving the translation directly without going through the
@@ -237,13 +237,13 @@ allocation and making sure that the stack variables are correctly aligned.
\subsubsection{CompCertSSA}
-CompCertSSA is an extension of CompCert with an additional \SSA\ intermediate language. This
-language enforces \SSA\ properties and therefore allows for many convenient proofs about
-optimisations performed on this intermediate language, as many assumptions about variables can be
-made when these are encountered. The main interesting porperty that arises from using \SSA\ is the
-\emph{equational lemma}, stating that given a variable, if it was assigned by an operation that does
-not depend on memory, then loading the destination value of that variable is the same as recomputing
-the value of that variable with the current context.
+\index{CompCertSSA}CompCertSSA is an extension of CompCert with an additional \SSA\ intermediate
+language. This language enforces \SSA\ properties and therefore allows for many convenient proofs
+about optimisations performed on this intermediate language, as many assumptions about variables can
+be made when these are encountered. The main property of interest that arises from using \SSA\ is
+the \emph{equational lemma}, stating that given a variable, if it was assigned by an operation that
+does not depend on memory, then loading the destination value of that variable is the same as
+recomputing the value of that variable with the current context.
Given a well formed SSA program $p$, a reachable state $\Sigma\ s\ f\ \sigma\ R\ M$, a memory
independent operation which was defined at a node $d$ as $\mono{Iop}\ \mathit{op}\ \vec{a}\ x\ n$
diff --git a/chapters/hls.tex b/chapters/hls.tex
index ef75e78..5e7237a 100644
--- a/chapters/hls.tex
+++ b/chapters/hls.tex
@@ -18,12 +18,12 @@
We have designed a new HLS tool in the Coq theorem prover and proved that any output design it
produces always has the same behaviour as its input program. Our tool, called Vericert, is
automatically extracted to an OCaml program from Coq, which ensures that the object of the proof is
-the same as the implementation of the tool. Vericert is built by extending the CompCert verified C
-compiler~\cite[leroy09_formal_verif_realis_compil] with a new hardware-specific intermediate
-language and a Verilog back end. It supports most C constructs, including integer operations,
-function calls (which are all inlined), local arrays, structs, unions, and general control-flow
-statements, but currently excludes support for case statements, function pointers, recursive
-function calls, non-32-bit integers, floats, and global variables.
+the same as the implementation of the tool. Vericert is built by extending the
+\index{CompCert}CompCert verified C compiler~\cite[leroy09_formal_verif_realis_compil] with a new
+hardware-specific intermediate language and a Verilog back end. It supports most C constructs,
+including integer operations, function calls (which are all inlined), local arrays, structs, unions,
+and general control-flow statements, but currently excludes support for case statements, function
+pointers, recursive function calls, non-32-bit integers, floats, and global variables.
\paragraph{Contributions and Outline.} The contributions of this paper are as follows:
@@ -142,8 +142,8 @@ formal semantics.
\paragraph{Architecture of Vericert.} The main work flow of Vericert is given in
Fig.~\ref{fig:rtlbranch}, which shows those parts of the translation that are performed in CompCert,
and those that have been added. This includes translations to two new intermediate languages added
-in Vericert, HTL and Verilog, as well as an additional optimisation pass labelled as \quotation{RAM
-insertion}.
+in Vericert, \lindex{HTL}HTL and \lindex{Verilog}Verilog, as well as an additional optimisation pass
+labelled as \quotation{RAM insertion}.
\def\numcompcertlanguages{ten}
@@ -152,9 +152,9 @@ assembly output via a sequence of intermediate languages; we must decide which o
\numcompcertlanguages{} languages is the most suitable starting point for the HLS-specific
translation stages.
-We select CompCert's three-address code (3AC)\footnote{This is known as register transfer language
- (RTL) in the CompCert literature. `3AC' is used in this paper instead to avoid confusion with
- register-transfer level (RTL), which is another name for the final hardware target of the HLS
+We select CompCert's \lindex{RTL}three-address code (3AC)\footnote{This is known as register transfer
+ language (RTL) in the CompCert literature. `3AC' is used in this paper instead to avoid confusion
+ with register-transfer level (RTL), which is another name for the final hardware target of the HLS
tool.} as the starting point. Branching off \emph{before} this point (at CminorSel or earlier)
denies CompCert the opportunity to perform optimisations such as constant propagation and dead-code
elimination, which, despite being designed for software compilers, have been found useful in HLS
@@ -357,12 +357,12 @@ but this feature is not supported in most HLS tools anyway~\cite{davidthomas_asa
\subsubsection{Translating 3AC to HTL}
-The next translation is from 3AC to a new hardware translation language (HTL). This involves going
-from a CFG representation of the computation to a finite state machine with data-path (FSMD)
-representation~\cite{hwang99_fsmd}. The core idea of the FSMD representation is that it separates
-the control flow from the operations on the memory and registers. Hence, an HTL program consists of
-two maps from states to Verilog statements: the \emph{control logic} map, which expresses state
-transitions, and the \emph{data-path} map, which expresses computations.
+The next translation is from 3AC to a new \lindex{HTL}hardware translation language (HTL). This
+involves going from a CFG representation of the computation to a \index{FSMD}finite state machine
+with data-path (FSMD) representation~\cite{hwang99_fsmd}. The core idea of the FSMD representation
+is that it separates the control flow from the operations on the memory and registers. Hence, an
+HTL program consists of two maps from states to Verilog statements: the \emph{control logic} map,
+which expresses state transitions, and the \emph{data-path} map, which expresses computations.
Fig.~\ref{fig:accumulator_diagram} shows the resulting FSMD architecture. The right-hand block is
the control logic that computes the next state, while the left-hand block updates all the registers
and RAM based on the current program state.
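The two-map structure described in this hunk can be made concrete. The following is an illustrative Python model of an FSMD, not the Coq definition: the state numbers, register names, and the example computation are all hypothetical.

```python
# An FSMD as two maps from states to actions: the data-path map
# updates registers, while the control-logic map computes the next
# state from the current registers (illustrative sketch only).
control = {
    0: lambda regs: 1,                           # unconditional transition
    1: lambda regs: 1 if regs["x"] < 3 else 2,   # loop until x reaches 3
    2: lambda regs: 2,                           # final state (self-loop)
}
datapath = {
    0: lambda regs: regs.update(x=0),            # initialise the register
    1: lambda regs: regs.update(x=regs["x"] + 1),  # increment
    2: lambda regs: None,                        # done: no update
}

def step(state, regs):
    datapath[state](regs)        # data-path: update registers and RAM
    return control[state](regs)  # control logic: compute the next state

regs = {}
state = 0
for _ in range(6):               # simulate six clock cycles
    state = step(state, regs)
```

Separating the two maps mirrors the hardware split: the control logic becomes the next-state function of the FSM, and the data path becomes the register updates guarded by the current state.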
@@ -513,7 +513,7 @@ Another is the \mono{Oshldimm} instruction, which is a left rotate instruction t
equivalent and therefore has to be implemented in terms of other operations and proven to be
equivalent. The only 32-bit instructions that we do not translate are case-statements
(\mono{Ijumptable}) and those instructions related to function calls (\mono{Icall}, \mono{Ibuiltin},
-and \mono{Itailcall}), because we enable inlining by default.
+and \mono{Itailcall}), because we enable \oindex{inlining}inlining by default.
\subsubsection{Translating HTL to Verilog}
@@ -547,14 +547,15 @@ similar syntax to C arrays, they must be treated quite differently. To make load
efficient as possible, the RAM needs to be word-addressable, which means that an entire integer can
be loaded or stored in one clock cycle. However, the memory model that CompCert uses for its
intermediate languages is byte-addre\-ssa\-ble~\cite{blazy05_formal_verif_memor_model_c}. If a
-byte-addressable memory was used in the target hardware, which is closer to CompCert's memory model,
-then a load and store would instead take four clock cycles, because a RAM can only perform one read
-and write per clock cycle. It therefore has to be proven that the byte-addressable memory behaves
-in the same way as the word-addressable memory in hardware. Any modifications of the bytes in the
-CompCert memory model also have to be shown to modify the word-addressable memory in the same way.
-Since only integer loads and stores are currently supported in Vericert, it follows that the
-addresses given to the loads and stores will be multiples of four. Translating from byte-addressed
-memory to word-addressed memory can then be done by dividing the address by four.
+byte-addressable \index{memory model}memory was used in the target hardware, which is closer to
+CompCert's memory model, then a load and store would instead take four clock cycles, because a RAM
+can only perform one read and write per clock cycle. It therefore has to be proven that the
+byte-addressable memory behaves in the same way as the word-addressable memory in hardware. Any
+modifications of the bytes in the CompCert memory model also have to be shown to modify the
+word-addressable memory in the same way. Since only integer loads and stores are currently
+supported in Vericert, it follows that the addresses given to the loads and stores will be multiples
+of four. Translating from byte-addressed memory to word-addressed memory can then be done by
+dividing the address by four.
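The byte- versus word-addressing argument above can be sketched as follows; this is a hedged Python model of the two memory views, not the Coq memory model, and the little-endian byte layout is an assumption for illustration.

```python
# Byte-addressed vs word-addressed memory: with only 32-bit,
# 4-byte-aligned loads and stores, dividing the byte address by four
# gives the equivalent word address (illustrative model only).

def store_bytes(mem, addr, value):
    """Four byte writes (four cycles on a byte-addressable RAM)."""
    for i in range(4):
        mem[addr + i] = (value >> (8 * i)) & 0xFF

def load_bytes(mem, addr):
    """Reassemble a 32-bit value from four bytes, little-endian."""
    return sum(mem[addr + i] << (8 * i) for i in range(4))

def store_word(mem, addr, value):
    """One word write (one cycle on a word-addressable RAM)."""
    mem[addr // 4] = value

byte_mem, word_mem = {}, {}
store_bytes(byte_mem, 8, 0xDEADBEEF)   # four writes in the byte view
store_word(word_mem, 8, 0xDEADBEEF)    # one write in the word view
assert load_bytes(byte_mem, 8) == word_mem[8 // 4]
```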
\subsubsection[sec:hls:algorithm:optimisation:ram]{Implementation of RAM Interface}
@@ -1124,24 +1125,25 @@ translation proves that the specification holds.
\subsubsection[from-specification-to-simulation-1]{From Specification to Simulation}
Another simulation proof is performed to prove that the insertion of the RAM is semantics
-preserving. As in \in{Lemma}[lemma:simulation_diagram], we require some invariants that always hold
-at the start and end of the simulation. The invariants needed for the simulation of the RAM
-insertion are quite different to the previous ones, so we can define these invariants
-$\mathcal{I}_{r}$ to be the following:
+preserving. As in \index{simulation diagram}\in{Lemma}[lemma:simulation_diagram], we require some
+invariants that always hold at the start and end of the simulation. The invariants needed for the
+simulation of the RAM insertion are quite different to the previous ones, so we can define these
+invariants $\mathcal{I}_{r}$ to be the following:
\startitemize
\item
- The association map for arrays $\Gamma_{a}$ always needs to have the same arrays present, and
- these arrays should never change in size.
+ The \index{association map}association map for arrays $\Gamma_{a}$ always needs to have the same
+ arrays present, and these arrays should never change in size.
\item
The RAM should always be disabled at the start and the end of each simulation step. (This is why
self-disabling RAM is needed.)
\stopitemize
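The second invariant above can be illustrated with a small model of a self-disabling RAM; this is a hypothetical Python sketch of the invariant, not the generated Verilog, and the field names are invented for illustration.

```python
# Sketch of a self-disabling RAM: the enable flag is cleared as soon
# as a request is serviced, so it is low again at every simulation
# step boundary (hypothetical model of the invariant).

class SelfDisablingRam:
    def __init__(self, size):
        self.mem = [0] * size
        self.en = False      # enable signal; must be low at step edges
        self.addr = 0
        self.wdata = 0

    def request_write(self, addr, value):
        """Raise the enable signal with a pending write request."""
        self.en, self.addr, self.wdata = True, addr, value

    def clock(self):
        """One clock cycle: service a pending request, then disable."""
        if self.en:
            self.mem[self.addr] = self.wdata
            self.en = False  # self-disable after the transaction

ram = SelfDisablingRam(16)
assert not ram.en            # invariant holds at the step start
ram.request_write(3, 42)
ram.clock()
assert not ram.en            # invariant holds again at the step end
```

Because the enable signal is provably low at both edges of every step, the simulation relation never has to reason about an in-flight memory transaction.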
-The other invariants and assumptions for defining two matching states in HTL are quite similar to
-the simulation performed in \in{Lemma}[lemma:simulation_diagram], such as ensuring that the states
-have the same value, and that the values in the registers are less defined. In particular, the less
-defined relation matches up all the registers, except for the new registers introduced by the RAM.
+The other \pindex{invariants}invariants and assumptions for defining two matching states in HTL are
+quite similar to the simulation performed in \in{Lemma}[lemma:simulation_diagram], such as ensuring
+that the states have the same value, and that the values in the registers are less defined. In
+particular, the less defined relation matches up all the registers, except for the new registers
+introduced by the RAM.
\startlemma[lemma:htl_ram]
Given an HTL program, the forward-simulation relation should hold after inserting the RAM and
@@ -1161,8 +1163,9 @@ semantics of a map structure is quite different to that of the case-statement to
converted.
\startlemma[lemma:verilog]
- In the following, we write $\text{\tt tr\_verilog}$ for the translation from HTL to Verilog.
- (Note that this translation cannot fail, so we do not need the constructor here.)
+ In the following, we write $\text{\tt tr\_verilog}$ for the \pindex{translation+HTL to
+ Verilog}translation from HTL to Verilog. (Note that this translation cannot fail, so we do not
+ need the constructor here.)
\startformula
\forall h, V, B \in \text{\tt Safe},\quad \text{\tt tr\_verilog} (h) = V \land h \Downarrow B
@@ -1180,8 +1183,9 @@ converted.
\subsection[sec:proof:deterministic]{Deterministic Verilog Semantics}
-The final lemma we need is that the Verilog semantics is deterministic. This result allows us to
-replace the forwards simulation we have proved with the backwards simulation we desire.
+The final lemma we need is that the \pindex{determinism}Verilog semantics is deterministic. This
+result allows us to replace the forwards simulation we have proved with the backwards simulation we
+desire.
\startlemma[lemma:deterministic]
If a Verilog program $V$ admits behaviours $B_1$ and $B_2$, then
@@ -1236,12 +1240,12 @@ proofs about the extensional equality of array operations, such as merging array
assignments. As the negative edge implies that two merges take place every clock cycle, the proofs
about the equality of the contents in memory and in the merged arrays become more tedious too.
-Looking at the trusted computing base of , the Verilog semantics is 431 lines of code. This and the
-Clight semantics from are the only parts of the compiler that need to be trusted. The Verilog
-semantics specification is therefore much smaller compared to the 1721 lines of the implementation
-that are written in Coq, which are the verified parts of the HLS tool, even when the Clight
-semantics are added. In addition to that, understanding the semantics specification is simpler than
-trying to understand the translation algorithm. We conclude that the trusted base has been
-successfully reduced.
+Looking at the trusted computing base of Vericert, the Verilog semantics is 431 lines of code. This
+and the Clight semantics from CompCert are the only parts of the compiler that need to be trusted. The
+Verilog semantics specification is therefore much smaller compared to the 1721 lines of the
+implementation that are written in Coq, which are the verified parts of the HLS tool, even when the
+Clight semantics are added. In addition to that, understanding the semantics specification is
+simpler than trying to understand the translation algorithm. We conclude that the trusted base has
+been successfully reduced.
\stopcomponent
diff --git a/chapters/schedule.tex b/chapters/schedule.tex
index bf62c59..d367e4b 100644
--- a/chapters/schedule.tex
+++ b/chapters/schedule.tex
@@ -42,10 +42,10 @@ proof.
Vericert to pass through \SSA, as well as GSA, and finally generate dynamic HTL (or some other
similar hardware language). During my visit at Inria Rennes, I worked with Sandrine Blazy and
Delphine Demange to extend CompCertSSA~\cite[barthe14_formal_verif_ssa_based_middl_end_compc] with
-\GSA~\cite[ottenstein90_progr_depen_web]. Not only should this allow for global
-optimisations of \SSA\ code, but it should also allow for a more straightforward translation to
-hardware as the GSA code can just be viewed as a \DFG. Instead of working on modulo scheduling, it
-could therefore be interesting to work on a formally verified dynamic scheduling translation.
+\GSA~\cite[ottenstein90_progr_depen_web]. Not only should this allow for global optimisations of
+\SSA\ code, but it should also allow for a more straightforward translation to hardware as the GSA
+code can just be viewed as a \DFG. Instead of working on modulo scheduling, it could therefore be
+interesting to work on a formally verified dynamic scheduling translation.
\startplacemarginfigure[
title={Gantt chart describing the future plan for 2022 and 2023. The main funding runs out
diff --git a/chapters/scheduling.tex b/chapters/scheduling.tex
index 15850ce..8457a1e 100644
--- a/chapters/scheduling.tex
+++ b/chapters/scheduling.tex
@@ -7,18 +7,18 @@
\section[introduction]{Introduction}
-The use of multicore processors has been increasing drastically, thereby making parallelising
-compilers ever more important. In addition to that, the need for custom hardware accelerators is
-also increasing, as traditional processors are not always the best choice for all
-applications. Compilers that support optimisations which can automatically parallelise it's input
-programs are therefore also becoming more important, as these back ends all benefit from it. For
-processors, one wants to optimise the pipeline usage as much as possible, whereas in hardware one
-can create custom pipelines, as well as making more use out of the spacial properties that hardware
-offers, making instruction parallelism important as well as speculative execution. However, with
-more powerful optimisations there is a greater chance to make mistakes in the compiler, making it
-possible to generate programs that behave differently to the input program. An important step
-towards making compilers more reliable is to formally verify these optimisations so that it can be
-guaranteed that bugs are not introduced.
+\oindex{static scheduling}The use of multicore processors has been increasing drastically, thereby
+making parallelising compilers ever more important. In addition to that, the need for custom
+hardware accelerators is also increasing, as traditional processors are not always the best choice
+for all applications. Compilers that support optimisations which can automatically parallelise their
+input programs are therefore also becoming more important, as these back ends all benefit from
+them. For processors, one wants to optimise the pipeline usage as much as possible, whereas in
+hardware one can create custom pipelines, as well as make more use of the spatial properties
+that hardware offers, making instruction-level parallelism important as well as speculative
+execution. However, with more powerful optimisations there is a greater chance to make mistakes in
+the compiler, making it possible to generate programs that behave differently to the input
+program. An important step towards making compilers more reliable is to formally verify these
+optimisations so that it can be guaranteed that bugs are not introduced.
A popular optimisation for processors increasing the instruction parallelism of the resulting
program is to use scheduling. This is an optimisation that is especially important for processors
@@ -31,11 +31,12 @@ can never move instructions past a branching instruction, even when this might d
the performance. Trace scheduling~ is an alternative that addresses this issue, by creating paths
that might cross branching instructions, and scheduling the instructions in that path. However, the
main problem with trace scheduling is that it is often infeasible on large programs, and because of
-the large design space, it's sometimes hard to find an optimal schedule. Superblock~ and hyperblock~
-scheduling are two subsets of trace scheduling, which compromise on the flexibility of trace
-scheduling to instead build more tractable algorithms. Superblock scheduling produces more
-efficient code for processors with a low issue rate, whereas hyperblock scheduling produces more
-performant code for processors with a high issue rate~.
+the large design space, it's sometimes hard to find an optimal
+schedule. \index{superblock}Superblock~ and \index{hyperblock}hyperblock~ scheduling are two subsets
+of trace scheduling, which compromise on the flexibility of trace scheduling to instead build more
+tractable algorithms. Superblock scheduling produces more efficient code for processors with a low
+issue rate, whereas hyperblock scheduling produces more performant code for processors with a high
+issue rate~.
CompCert~ is a compiler that is formally verified in Coq. In addition to that, have also added a
formally verified hardware back end to CompCert, allowing it to also generate provably correct
@@ -59,7 +60,7 @@ the key points of this paper are the following:
\section[sec:scheduling]{Implementation of Hyperblocks in CompCert}
-This section describes the structure of hyperblocks in
+\index{hyperblock}This section describes the structure of hyperblocks in
Section~\goto{2.1}[scheduling:hyperblocks]. Then, the structure of two extra intermediate languages
that were added to CompCert to implement hyperblocks is also shown in
Section~\goto{2.2}[sec:rtlblockdef].
@@ -81,13 +82,13 @@ predicates can be removed, as these instructions will never be activate at the s
\subsection[sec:rtlblockdef]{RTLBlock and RTLPar Intermediate Language Definition}
-Figure~\goto{{[}fig:compcert_interm{]}}[fig:compcert_interm] shows the intermediate languages
-introduced to implement hyperblocks in CompCert. This consists of new RTLBlock and RTLPar
-intermediate languages, which implement the sequential and parallel semantics of basic blocks
-respectively. The semantics of RTLBlock and RTLPar are based on the CompCert RTL intermediate
-language. However, instead of mapping from states to instructions, RTLBlock maps from states to
-hyperblocks, and RTLPar maps from states to parallel hyperblocks, which will be described in the
-next sections.
+\lindex{RTLBlock}\lindex{RTLPar}Figure~\goto{{[}fig:compcert_interm{]}}[fig:compcert_interm] shows
+the intermediate languages introduced to implement hyperblocks in CompCert. This consists of new
+RTLBlock and RTLPar intermediate languages, which implement the sequential and parallel semantics of
+basic blocks respectively. The semantics of RTLBlock and RTLPar are based on the CompCert RTL
+intermediate language. However, instead of mapping from states to instructions, RTLBlock maps from
+states to hyperblocks, and RTLPar maps from states to parallel hyperblocks, which will be described
+in the next sections.
\placeformula \startformula \startmathalignment
\NC i\ \ \colon\colon= \NC \ \ \mono{RBnop} \NR
@@ -180,16 +181,17 @@ schedule that is shown in Figure~\goto{{[}fig:op_chain_d{]}}[fig:op_chain_d].
\section[abstr_interp]{Abstract Interpretation of Hyperblocks}
-The correctness of the scheduling algorithm is checked by comparing the symbolic expressions of each
-register before and after scheduling. This section describes how these symbolic expressions are
-constructed for each hyperblock.
+\index{abstract interpretation}The correctness of the scheduling algorithm is checked by comparing
+the symbolic expressions of each register before and after scheduling. This section describes how
+these symbolic expressions are constructed for each hyperblock.
\subsection[abstr_language]{Construction of Abstract Language}
-The main difficulty when constructing symbolic expressions for hyperblocks is the presence of
-predicates, which can be used by the scheduling algorithm to reorder instructions further, even if
-these contain data dependencies. The scheduling algorithm will manipulate predicates in the
-following ways, which the comparison function needs to take into account.
+The main difficulty when constructing \index{symbolic expression}symbolic expressions for
+hyperblocks is the presence of predicates, which can be used by the scheduling algorithm to reorder
+instructions further, even if these contain data dependencies. The scheduling algorithm will
+manipulate predicates in the following ways, which the comparison function needs to take into
+account.
\startitemize
\item
@@ -271,11 +273,11 @@ meaning if the predicate evaluates to true, all other predicates must evaluate t
\subsection[linear-flat-predicated-expressions]{Linear (Flat) Predicated Expressions}
-The update function for predicated expressions is quite a bit more complex compared to standard
-update functions for symbolic execution, such as the one used by . The main reason for this is the
-non-recursive nature of the predicated expressions. As the predicates are only available at the
-top-level of the expressions, combining multiple expressions means that the conditions and multiple
-possibilities also need to trickle up to the top.
+\index{symbolic expression}The update function for predicated expressions is quite a bit more
+complex compared to standard update functions for symbolic execution, such as the one used by . The
+main reason for this is the non-recursive nature of the predicated expressions. As the predicates
+are only available at the top-level of the expressions, combining multiple expressions means that
+the conditions and multiple possibilities also need to trickle up to the top.
Instead, if the predicated instructions were recursive, one would not have to change any predicates
when evaluating the argument lists of instructions, for example, as these can just stay
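The "trickling up" of conditions described in this hunk can be sketched concretely. The following is an illustrative Python model of flat predicated expressions, assuming a simple representation as (guard, value) pairs; it is not the Coq update function, and the predicate names are hypothetical.

```python
# Flat predicated expressions as lists of (guard, expr) pairs.
# Because guards live only at the top level, combining two predicated
# expressions must pair every alternative of one with every
# alternative of the other, conjoining their guards -- the conditions
# "trickle up" to the top (illustrative sketch only).

def combine(f, xs, ys):
    """Apply binary operator f under every pair of guards."""
    return [((gx, gy), f(x, y)) for gx, x in xs for gy, y in ys]

# x is 1 when p holds and 2 otherwise; y is 10 when q holds, 20 otherwise
x = [("p", 1), ("not p", 2)]
y = [("q", 10), ("not q", 20)]

# x + y: four alternatives, each guarded by a conjunction of predicates
z = combine(lambda a, b: a + b, x, y)
assert (("p", "q"), 11) in z
```

With a recursive representation the guards could instead stay attached to sub-expressions, which is why the flat encoding makes the update function noticeably more complex.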
diff --git a/lsr_env.tex b/lsr_env.tex
index e8dde80..188ebe4 100644
--- a/lsr_env.tex
+++ b/lsr_env.tex
@@ -471,4 +471,8 @@
\setupfootnotes[align={stretch,verytolerant,hz,hanging,hyphenated}]
+\define[1]\lindex{\index{intermediate language+#1}}
+\define[1]\oindex{\index{optimisation+#1}}
+\define[1]\pindex{\index{proof+#1}}
+
\stopenvironment
diff --git a/main.tex b/main.tex
index 3d3a974..cc6492c 100644
--- a/main.tex
+++ b/main.tex
@@ -13,8 +13,8 @@
% \showmakeup
% \showframe
% \showbodyfontenvironment
-%\showbodyfont
-%\showjustification
+% \showbodyfont
+% \showjustification
\component title
@@ -35,6 +35,8 @@
\stopbodymatter
\startbackmatter
+ \chapter{Index}
+ \placeindex
\chapter{Bibliography}
\placelistofpublications
\stopbackmatter