XRM 88: Add an internal documentation.

https://svn.lrde.epita.fr/svn/xrm/trunk

Index: ChangeLog
from  SIGOURE Benoit  <sigoure.benoit@lrde.epita.fr>

	Add an internal documentation.
	* src/str/xrm-front.str: Fix comments.
	* doc/user-guide.txt: Add an internal documentation.

 doc/user-guide.txt    |  230 ++++++++++++++++++++++++++++++++++++++++++++++++--
 src/str/xrm-front.str |    4
 2 files changed, 223 insertions(+), 11 deletions(-)

Index: src/str/xrm-front.str
--- src/str/xrm-front.str (revision 87)
+++ src/str/xrm-front.str (working copy)
@@ -36,13 +36,13 @@
       if must-parse-pctl then // output PCTL source
         log-timed(xtc-transform(!"pp-pctl", pass-verbose)
                  | "pretty printed PCTL code", 2)
-      else // output PRISM source
+      else // output PRISM source (default)
         log-timed(xtc-transform(!"pp-prism", pass-verbose)
                  | "pretty printed PRISM code", 2)
       end
     end
   end
-  /* default: if we don't get inside the previous if, we will output
+  /* if we don't get inside the previous if, we will output
    * binary ATerms. */

 /** pipeline of transformations performed by xrm-front */
Index: doc/user-guide.txt
--- doc/user-guide.txt (revision 87)
+++ doc/user-guide.txt (working copy)
@@ -27,7 +27,7 @@

 The goal of eXtended Reactive Modules is to provide a comprehensive solution
 to these problems by adding syntactic extensions to the PRISM
-language. XRM comes with a set of tools with enables the programmer to work
+language. XRM comes with a set of tools which enable the programmer to work
 with the extended version of the PRISM language. The main tool, xrm-front,
 will compile an XRM source file (written in the extended language) in a PRISM
 source file (the base language, as used by the tools PRISM and APMC).
@@ -48,16 +48,18 @@
     o OS with executable stack.
     o ATerm 2.4.2 or newer.
     o SDF2-Bundle 2.3.4 or newer.
+    o GNU make.

   - Using Nix:
     Nix is a package management system that ensures safe and complete
     installation of packages. You can get Nix from
     http://www.cs.uu.nl/wiki/Trace/Nix (pick up the latest _unstable_ release).
+    /!\ Download nix-X.YYpreZZZZ not nixpkgs-X.YYpreZZZZ.
     Once Nix is installed, use the following commands (you might need to be
     root depending on how you installed nix):
     $ nix-channel --add http://nix.cs.uu.nl/dist/stratego/channels-v3/strategoxt-unstable
-    $ nix-env -u '*'
+    $ nix-channel --update
     $ nix-env --install aterm sdf2-bundle strategoxt
     There you are! Add the following line to your .bashrc/.zshrc:
@@ -137,7 +139,7 @@
 copy is available in the prism/ directory).

 The XRM language is 100% PRISM compliant and only offers extension to the
-language.
+base language.

 Many people used to overcome the limitations of the PRISM language by
 generating PRISM code using scripts (Shell scripts, M4 scripts, etc...).
@@ -451,7 +453,7 @@
   // XPCTL code here
 end
 If you use property sections, you will need to pass an additional argument
-to xrm-front to specify the PCTL property file where the options must be
+to xrm-front to specify the PCTL property file where the properties must be
 saved. The option switch is --p-output f or -po f (where `f' is a path).
 However, this option is not mandatory. If you specify property sections but
 omit this switch, xrm-front will discard the properties and issue a warning
@@ -536,14 +538,224 @@

 # General design
 --------------
-
-  // FIXME
+ This sections explains the general design used when developing XRM.
+ The sources are located under the src/ folder. They are divided in 5
+ sections:
+ * src/lib: Contains strategies/code exported in libraries linked with
+   the tools. In fact (at this time) XRM doesn't use any
+   library but the code under this path should be exported
+   through shared and static libraries.
+ * src/lib/native: Contains C code used by Stratego code (with `prim').
+   Actually this code features several extensions used to
+   manipulate float values in Stratego. This was implemented in
+   C because it wasn't available in stratego-lib and because
+   doing the same work would be a pain in C.
+   This code also provide a random number generator.
+ * src/lib/<lang>/pp: Where <lang> is {pctl,prism,xpctl,xrm}. Contains
+   the pretty-printing strategies used by the pretty printers.
+   These strategies are meant to be available in libraries but
+   at this time they are simply imported by the
+   pretty-printers. Exporting them in libraries would reduce
+   compilation time. They should rather be exported in static
+   libraries (if possible) because dynamic libraries have
+   inconvenient: the .so must be installed and available in
+   the LD_LIBRARY_PATH (or must be installed in standard
+   paths). Installing one dynamic library per pretty-printer
+   does not seem quite attractive.
+ * src/sig: Contains nothing. This is where signatures for the 4
+   languages (pctl, prism, xpctl, xrm) will be generated (under
+   the build dir)
+ * src/str: Contains the Stratego code used by xrm-front.
+ * src/syn/<lang>: Contains the SDF grammar of the language <lang>. Each
+   grammar has Stratego embeddings (for concrete syntax)
+   defined in Stratego<LANG>.sdf and optionally
+   <LANG>-MetaCongruences.sdf and <LANG>-MetaVars.sdf.
+   Renamed version of each grammar are generated during the
+   build so that the grammars can easily be assimilated within
+   other host languages later on.
+ * src/tools: Contains Stratego sources for the pretty-printers and
+   parsers (which are registered as XTC components).

  o The build system
- o How the tools (parsers, pretty printers...) are made, XTC
- o FIXME
+ ----------------
+ XRM use the autotools to set up the build system. It also uses the
+ Makefile provided by autoxt and Transformers' Makefile. The configure
+ script tries to guess the correct value for PKG_CONFIG_PATH if it is
+ not provided by looking for common locations where Stratego/XT is
+ installed by Nix. Many GNU make extensions are used by the Makefiles.
+ Parallel builds work and are encouraged.
+
+ o Creating tools with XTC
+ -----------------------
+ XRM uses XTC-registered components. For the documentation of XTC see
+ the chapters 28 and 29 of the Stratego/XT manual
+ (http://nix.cs.uu.nl/dist/stratego/strategoxt-manual-unstable-latest/manual/c...,
+ http://nix.cs.uu.nl/dist/stratego/strategoxt-manual-unstable-latest/manual/c...).
+ Basically, XTC components are simply applications or files which are
+ registered by a program called `xtc'. This program creates a file where
+ registered components are listed (the repository). When a component is
+ registered in the repository, its name, version and path are saved.
+ This is useful so that you can develop tiny applications and call them
+ one after the other in a pipeline. Calling them from Stratego is made
+ easy because the Stratego programs know where the XTC repository is and
+ it can query that repository to find the auxiliary tools it needs. For
+ instance one could register ls, grep and wc in an XTC repository. Then
+ one could invoke them from a Stratego program one after the other in
+ order to simulate the command `ls -l | grep rwx | wc -l` for instance.
+ Note that XTC repositories are in fact ATerms and you can display them
+ with pp-aterm.
+ For instance this is used by parsers which need to invoke sglr with the
+ correct parse table in argument. The parse table is generated somewhere
+ and is registered in the XTC repository. The parser looks up the
+ location of the parse table through the repository and then invoke sglr
+ with the right path to the parse table.
+ However, this has a disadvantage: Invoking XTC components is quite
+ inefficient. This is because when you do this, the current term is
+ saved in a temporary file under /tmp. When the current term is quite
+ large this can produce files from several hundred mega-bytes up to
+ several giga-bytes. Then the XTC component you invoked is called with
+ that temporary file as input (-i) and another new temporary file as
+ output (-o). Once it has finished, the current processes reloads its
+ current term from the temporary file where the XTC component sent its
+ output. This leads to many i/o operations on disk which tend to be
+ quite slow. That's why XRM tries to avoid to use XTC for external
+ processes as much as possible. But since we still want to be modular,
+ instead of using XTC components, we use external libraries. At this
+ time XRM doesn't use libraries because it was easier to import the
+ libraries required than to export them through libraries which would
+ then be linked with the binary. However this makes the compilation time
+ longer. We should really consider exporting common things in external
+ libraries.
+
+ The --help and --about options are handled using tool-doc
+ (https://svn.cs.uu.nl:12443/repos/StrategoXT/strategoxt/trunk/stratego-regula...).
+ The pretty-printers directly include the strategies they need (the
+ pretty-printing rules are written in Stratego under src/lib/<lang>/pp).
+ This enables them to be stand-alone, they can pretty-print without
+ using intermediate files. They also use libstratego-gpp which is quite
+ recent. This enables them to use abox-to-text without to call an
+ external XTC component and go through intermediate temporary files.

 # xrm-front's pipeline
 --------------------
- // FIXME
+ In this section we will review the pipeline of xrm-front. We will explain
+ the different passes of the pipeline, how the dynamic rules are used and
+ how the transformations are performed.
+
+ Everything begins in src/str/xrm-front.str. The first thing performed by
+ xrm-front is to check whether the options it was invoked with are
+ consistent. For instance, if the user invokes xrm-front with the flags -b
+ (request output in binary ATerm format) and -P (request output in
+ pretty-printed ATerm format) the switch -P is ignored and a warning is
+ issued. In former version of xrm-front, there were more conflicting options.
+
+ Then xrm-front parses its input files with the correct parser. When the
+ switch -p (or --pctl) is provided, XRM must parse XPCTL source code whereas
+ it parses XRM source code by default. The parsing is performed by an
+ external XTC component. It would be useful to use library-based parser
+ instead to improve performances. Then the strategy xrm-front-pipeline is
+ invoked.
+
+ This where the transformations begin. The real pipeline can be found in
+ src/str/xrm-to-prism.str.
+
+ The first pass removes the XRM sugar. This is a simple innermost traversal
+ where basic transformations rules are applied to remove what is merely
+ simple syntactic sugar. We also use this pass to catch calls to the XRM
+ builtin rand in property files (which is not allowed). In XRM files, these
+ calls are simply replaced by a variable which will be the random variable.
+ At this time each random variable is controlled by a separate module
+ generated by xrm-front. This module is stored in a DR named RandGenModules.
+
+ The second pass collects static const declarations, formula (and
+ parameterized) declarations. For property files, the formulas are removed
+ when they are collected (because they are not allowed in standard property
+ files). Parameterized formulas are always removed once they are collected
+ because they do not exist in the base languages. These declarations are
+ collected in dynamic rules as follows:
+ * ExpandStaticConsts: id -> value
+ * ExpandFormulas: id -> value
+ * ExpandPFormulas: PFormulaCall(name, a*) -> e
+   this DR rewrites a call to the parameterized formula `name' (invoked
+   with the arguments `a*') to `e' which is the inlined version of the
+   formula `name' with its formal parameters replaced by the parameters
+   provided in `a*'.
+
+ The third pass of xrm-front is to check that meta-vars are used correctly.
+ The whole code is traversed using a hand-crafted traversal that collects
+ the declarations of meta-vars (using the appropriate scopes). For instance,
+ when the traversal enters a for loop, it registers the meta-var used to
+ iterate in the loop in a scoped DR. Several things must be evaluable down
+ to simple integers at compile time (such as array subscripts or the
+ `from' or `to' of a for loop for instance. Indeed if you can't evaluate
+ the latter at compile time, how can you unroll the loop if you don't know
+ how many iterations it has to go through?). Once we know what are the
+ meta-vars (as well as static consts, formulas etc.), we can easily check
+ whether an expression is evaluable as a literal value at compile time. If
+ the expression contains only literals and identifiers known to be evaluable
+ at compile time (such as static consts and meta-vars for instance) then the
+ expression itself is evaluable at compile time.
+
+ Then an important pass starts: the meta-code is evaluated (leading to code
+ generation). Once again, this pass uses a hand-made traversal and features:
+ evaluation of static ifs, lazy evaluation of operator `&' and operator `|'
+ (so that the user can rely on them to prevent invalid code from being
+ evaluated, eg: x > 0 && a[x] to prevent an invalid array access). For loops
+ are unrolled. Calls to parameterized are inlined. When a call to a
+ parameterized formula is inlined, we must re-start the pass on the code
+ inlined.
+
+ The fifth pass consists of removal of array declarations, evaluation of
+ calls to the XRM builtin static_rand and collection of non-array local
+ variable declarations. This is 3 different things but they
+ are combined together to reduce the number of traversals. Array
+ declarations are rewritten as declarations of lists of variables. xrm-front
+ ensures that array declarations are not overlapping with each other.
+ Each array declaration is recorded in the DR DeclaredArrays which maps an
+ Identifier(idf) -> aa-list where `aa-list' is the list of array subscript
+ declared for that array. When an array is declared in multiple parts, each
+ part is added in the same entry of DeclaredArrays (we first fetch the parts
+ previously declared, check that they don't overlap with the parts being
+ declared, and if they don't, we concatenate them and save them back in the
+ DR). This part of the code might be a bit hard to read because an abstract
+ factory is used to build declaration lists resulting of array declarations
+ and this is a bit hard to follow. Hopefully the numerous comments in the
+ file will help the reader to understand how the rewriting is performed.
+ Non-array declarations are also collected in a DR named DeclaredIdentifers
+ which maps an identifier to some information about it (such as its type,
+ its definition range, its initial value). These information are gathered
+ for the type-checking pass. Note that type information of variables
+ declared in arrays are not yet gathered.
+
+ So now, array declarations have been rewritten as declaration lists. This
+ introduces several unwanted nested lists in the AST which are then removed
+ using `flatten-list'.
+
+ Then comes the type-checking pass. At this time, this pass is quite basic
+ and only checks that array subscripts do not lead to out of bound array
+ accesses. We can easily do this thanks to the DR DeclaredArrays which maps
+ an identifier to the dimensions declared for that array. The pass also
+ ensures that all array accesses are made on declared arrays.
+
+ Then, we paste the module generated by the calls to the rand builtin
+ (which were saved in the DR RandGenModules) at the end of the file.
+
+ Now we can remove all array accesses. This is done by flattening them, eg
+ x[1][2][3] is rewritten as x_1_2_3. For property files, we also use this
+ pass to expand all non-parameterized formulas since their expansions wasn't
+ forced by any previous pass. We must force the expansion of
+ non-parameterized formulas in property files because they are not allowed
+ in the base language.
+
+ In the end, we can now re-order the content of the modules. Indeed, in XRM
+ we allow the declarations and commands to be freely intertwined whereas in
+ PRISM declarations must always come before the commands. This is done
+ using a scoped DR. We traverse all the Modules and collect declarations and
+ commands in DRs (respectively CommandList and DeclarationList). Once we
+ have them all we can easily re-construct the module with the declarations
+ first, followed by the commands.
+
+ The resulting AST is further desugared if the -D | --desugar switch was
+ provided, then it is pretty-printed in the format required by the command
+ line (default: PRISM code).
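
Illustration (not part of the patch): the array-access flattening pass
described in the documentation added by this change rewrites x[1][2][3]
as x_1_2_3. The following is only a minimal Python sketch of that
rewriting, applied to plain text. xrm-front itself is written in Stratego
and works on the AST, so the function name flatten_array_accesses, the
regular expression and the restriction to literal integer subscripts are
all assumptions made for this example.

```python
import re

# Hypothetical helper, NOT taken from xrm-front: matches an identifier
# followed by one or more literal integer subscripts, e.g. x[1][2][3].
_ARRAY_ACCESS = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)((?:\[\d+\])+)")


def flatten_array_accesses(source: str) -> str:
    """Rewrite every constant array access name[i][j]... as name_i_j..."""

    def rewrite(match: re.Match) -> str:
        name = match.group(1)
        # Extract the integer subscripts in order of appearance.
        subscripts = re.findall(r"\d+", match.group(2))
        return "_".join([name, *subscripts])

    return _ARRAY_ACCESS.sub(rewrite, source)


if __name__ == "__main__":
    print(flatten_array_accesses("x[1][2][3] + y[0]"))
```

In the real pipeline the subscripts have already been evaluated to
integers by the earlier meta-code evaluation pass, which is why this
sketch only handles literal subscripts.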