XRM 88: Add an internal documentation.

18 Jul 2006

https://svn.lrde.epita.fr/svn/xrm/trunk


Index: ChangeLog
from  SIGOURE Benoit  <sigoure.benoit@lrde.epita.fr>

	Add an internal documentation.

	* src/str/xrm-front.str: Fix comments.
	* doc/user-guide.txt: Add an internal documentation.

 doc/user-guide.txt    |  230 ++++++++++++++++++++++++++++++++++++++++++++++++--
 src/str/xrm-front.str |    4 
 2 files changed, 223 insertions(+), 11 deletions(-)

Index: src/str/xrm-front.str

--- src/str/xrm-front.str	(revision 87)
+++ src/str/xrm-front.str	(working copy)
@@ -36,13 +36,13 @@
           if must-parse-pctl then		// output PCTL source
             log-timed(xtc-transform(!"pp-pctl", pass-verbose)
                      | "pretty printed PCTL code", 2)
-          else					// output PRISM source
+          else				// output PRISM source (default)
             log-timed(xtc-transform(!"pp-prism", pass-verbose)
                      | "pretty printed PRISM code", 2)
           end
         end
       end
-      /* default: if we don't get inside the previous if, we will output
+      /* if we don't get inside the previous if, we will output
        * binary ATerms. */
 
   /** pipeline of transformations performed by xrm-front */
Index: doc/user-guide.txt
--- doc/user-guide.txt	(revision 87)
+++ doc/user-guide.txt	(working copy)
@@ -27,7 +27,7 @@
 
   The goal of eXtended Reactive Modules is to provide a comprehensive
 solution to these problems by adding syntactic extensions to the PRISM
-language. XRM comes with a set of tools with enables the programmer to work
+language. XRM comes with a set of tools which enable the programmer to work
 with the extended version of the PRISM language. The main tool, xrm-front,
 will compile an XRM source file (written in the extended language) in a PRISM
 source file (the base language, as used by the tools PRISM and APMC).
@@ -48,16 +48,18 @@
       o OS with executable stack.
       o ATerm 2.4.2 or newer.
       o SDF2-Bundle 2.3.4 or newer.
+      o GNU make.
 
   - Using Nix:
       Nix is a package management system that ensures safe and complete
       installation of packages.
       You can get Nix from http://www.cs.uu.nl/wiki/Trace/Nix (pick up the
       latest _unstable_ release).
+      /!\ Download nix-X.YYpreZZZZ not nixpkgs-X.YYpreZZZZ.
       Once Nix is installed, use the following commands (you might need to be
       root depending on how you installed nix):
       $ nix-channel --add http://nix.cs.uu.nl/dist/stratego/channels-v3/strategoxt-unstable
-      $ nix-env -u '*'
+      $ nix-channel --update
       $ nix-env --install aterm sdf2-bundle strategoxt
       There you are!
       Add the following line to your .bashrc/.zshrc:
@@ -137,7 +139,7 @@
   copy is available in the prism/ directory).
 
   The XRM language is 100% PRISM compliant and only offers extension to the
-  language.
+  base language.
 
   Many people used to overcome the limitations of the PRISM language by
   generating PRISM code using scripts (Shell scripts, M4 scripts, etc...).
@@ -451,7 +453,7 @@
       // XPCTL code here
     end
   If you use property sections, you will need to pass an additional argument
-  to xrm-front to specify the PCTL property file where the options must be
+  to xrm-front to specify the PCTL property file where the properties must be
   saved. The option switch is --p-output f or -po f (where `f' is a path).
   However, this option is not mandatory. If you specify property sections but
   omit this switch, xrm-front will discard the properties and issue a warning
@@ -536,14 +538,224 @@
 
   # General design
     --------------
-
-  // FIXME
+    This section explains the general design used when developing XRM.
+    The sources are located under the src/ folder. They are divided into 5
+    sections:
+      * src/lib: Contains strategies/code meant to be exported in libraries
+                 linked with the tools. At this time XRM doesn't build any
+                 library, but the code under this path should be exported
+                 through shared and static libraries.
+      * src/lib/native: Contains C code used by Stratego code (with `prim').
+                 Currently this code features several extensions used to
+                 manipulate float values in Stratego. This was implemented in
+                 C because it wasn't available in stratego-lib and because
+                 doing the same work in Stratego would be a pain.
+                 This code also provides a random number generator.
+      * src/lib/<lang>/pp: Where <lang> is {pctl,prism,xpctl,xrm}. Contains
+                 the pretty-printing strategies used by the pretty printers.
+                 These strategies are meant to be available in libraries but
+                 at this time they are simply imported by the
+                 pretty-printers. Exporting them in libraries would reduce
+                 compilation time. They should rather be exported in static
+                 libraries (if possible) because dynamic libraries have a
+                 drawback: the .so must be installed and available in
+                 the LD_LIBRARY_PATH (or must be installed in standard
+                 paths). Installing one dynamic library per pretty-printer
+                 does not seem very attractive.
+      * src/sig: Contains nothing. This is where signatures for the 4
+                 languages (pctl, prism, xpctl, xrm) will be generated (under
+                 the build dir).
+      * src/str: Contains the Stratego code used by xrm-front.
+      * src/syn/<lang>: Contains the SDF grammar of the language <lang>. Each
+                 grammar has Stratego embeddings (for concrete syntax)
+                 defined in Stratego<LANG>.sdf and optionally
+                 <LANG>-MetaCongruences.sdf and <LANG>-MetaVars.sdf.
+                 Renamed versions of each grammar are generated during the
+                 build so that the grammars can easily be assimilated within
+                 other host languages later on.
+      * src/tools: Contains Stratego sources for the pretty-printers and
+                 parsers (which are registered as XTC components).
     
     o The build system
-    o How the tools (parsers, pretty printers...) are made, XTC
-    o FIXME
+      ----------------
+      XRM uses the autotools to set up the build system. It also uses the
+      Makefile provided by autoxt and Transformers' Makefile. If
+      PKG_CONFIG_PATH is not provided, the configure script tries to guess
+      its value by looking in common locations where Nix installs
+      Stratego/XT. Many GNU make extensions are used by the Makefiles.
+      Parallel builds work and are encouraged.
+
+    o Creating tools with XTC
+      -----------------------
+      XRM uses XTC-registered components. For the documentation of XTC see
+      chapters 28 and 29 of the Stratego/XT manual
+      (http://nix.cs.uu.nl/dist/stratego/strategoxt-manual-unstable-latest/manual/c...,
+      http://nix.cs.uu.nl/dist/stratego/strategoxt-manual-unstable-latest/manual/c...).
+      Basically, XTC components are simply applications or files which are
+      registered by a program called `xtc'. This program creates a file where
+      registered components are listed (the repository). When a component is
+      registered in the repository, its name, version and path are saved.
+      This is useful so that you can develop tiny applications and call them
+      one after the other in a pipeline. Calling them from Stratego is made
+      easy because Stratego programs know where the XTC repository is and
+      can query that repository to find the auxiliary tools they need. For
+      instance one could register ls, grep and wc in an XTC repository. Then
+      one could invoke them from a Stratego program one after the other in
+      order to simulate the command `ls -l | grep rwx | wc -l` for instance.
+      Note that XTC repositories are in fact ATerms and you can display them
+      with pp-aterm.
+      For instance this is used by parsers which need to invoke sglr with the
+      correct parse table as argument. The parse table is generated somewhere
+      and is registered in the XTC repository. The parser looks up the
+      location of the parse table through the repository and then invokes
+      sglr with the right path to the parse table.
+      However, this has a disadvantage: invoking XTC components is quite
+      inefficient. When you do this, the current term is saved in a
+      temporary file under /tmp. When the current term is quite large this
+      can produce files from several hundred megabytes up to several
+      gigabytes. Then the XTC component you invoked is called with
+      that temporary file as input (-i) and another new temporary file as
+      output (-o). Once it has finished, the current process reloads its
+      current term from the temporary file where the XTC component sent its
+      output. This leads to many I/O operations on disk which tend to be
+      quite slow. That's why XRM tries to avoid using XTC for external
+      processes as much as possible. But since we still want to be modular,
+      instead of using XTC components, we would rather use libraries. At
+      this time XRM doesn't use libraries because it was easier to import
+      the required code directly than to export it through libraries which
+      would then be linked with the binary. However this makes the
+      compilation time longer. We should really consider exporting common
+      things in external libraries.
+
+      The --help and --about options are handled using tool-doc
+      (https://svn.cs.uu.nl:12443/repos/StrategoXT/strategoxt/trunk/stratego-regula...).
+      The pretty-printers directly include the strategies they need (the
+      pretty-printing rules are written in Stratego under src/lib/<lang>/pp).
+      This enables them to be stand-alone: they can pretty-print without
+      using intermediate files. They also use libstratego-gpp which is quite
+      recent. This enables them to use abox-to-text without calling an
+      external XTC component and going through intermediate temporary files.
 
   # xrm-front's pipeline
     --------------------
 
-  // FIXME
+  In this section we will review the pipeline of xrm-front. We will explain
+  the different passes of the pipeline, how the dynamic rules are used and
+  how the transformations are performed.
+
+  Everything begins in src/str/xrm-front.str. The first thing performed by
+  xrm-front is to check whether the options it was invoked with are
+  consistent. For instance, if the user invokes xrm-front with the flags -b
+  (request output in binary ATerm format) and -P (request output in
+  pretty-printed ATerm format) the switch -P is ignored and a warning is
+  issued. In former versions of xrm-front, there were more conflicting options.
+
+  Then xrm-front parses its input files with the correct parser. When the
+  switch -p (or --pctl) is provided, XRM must parse XPCTL source code whereas
+  it parses XRM source code by default. The parsing is performed by an
+  external XTC component. It would be useful to use a library-based parser
+  instead to improve performance. Then the strategy xrm-front-pipeline is
+  invoked.
+
+  This is where the transformations begin. The real pipeline can be found in
+  src/str/xrm-to-prism.str.
+
+  The first pass removes the XRM sugar. This is a simple innermost traversal
+  where basic transformation rules are applied to remove what is merely
+  syntactic sugar. We also use this pass to catch calls to the XRM builtin
+  rand in property files (which is not allowed). In XRM files, these calls
+  are simply replaced by a variable which will be the random variable. Each
+  random variable is currently controlled by a separate module generated by
+  xrm-front. This module is stored in a dynamic rule (DR) named RandGenModules.
+  
+  The second pass collects static const declarations and formula (plain and
+  parameterized) declarations. For property files, the formulas are removed
+  when they are collected (because they are not allowed in standard property
+  files). Parameterized formulas are always removed once they are collected
+  because they do not exist in the base languages. These declarations are
+  collected in dynamic rules as follows:
+    * ExpandStaticConsts: id -> value
+    * ExpandFormulas: id -> value
+    * ExpandPFormulas: PFormulaCall(name, a*) -> e
+      this DR rewrites a call to the parameterized formula `name' (invoked
+      with the arguments `a*') to `e' which is the inlined version of the
+      formula `name' with its formal parameters replaced by the parameters
+      provided in `a*'.
+
+  The third pass of xrm-front checks that meta-vars are used correctly.
+  The whole code is traversed using a hand-crafted traversal that collects
+  the declarations of meta-vars (using the appropriate scopes). For instance,
+  when the traversal enters a for loop, it registers the meta-var used to
+  iterate in the loop in a scoped DR. Several things must be evaluable down
+  to simple integers at compile time, such as array subscripts or the
+  `from' or `to' of a for loop. Indeed, if you can't evaluate the latter
+  at compile time, you don't know how many iterations the loop has to go
+  through, so you can't unroll it. Once we know what the meta-vars are (as
+  well as static consts, formulas etc.), we can easily check whether an
+  expression is evaluable as a literal value at compile time. If the
+  expression contains only literals and identifiers known to be evaluable
+  at compile time (such as static consts and meta-vars) then the
+  expression itself is evaluable at compile time.
+
+  Then an important pass starts: the meta-code is evaluated (leading to code
+  generation). Once again, this pass uses a hand-made traversal and features:
+  evaluation of static ifs, lazy evaluation of operator `&' and operator `|'
+  (so that the user can rely on them to prevent invalid code from being
+  evaluated, e.g. x > 0 & a[x] to prevent an invalid array access). For loops
+  are unrolled. Calls to parameterized formulas are inlined. When a call to
+  a parameterized formula is inlined, we must re-start the pass on the
+  inlined code.
+
+  The fifth pass consists of removal of array declarations, evaluation of
+  calls to the XRM builtin static_rand and collection of non-array local
+  variable declarations. These are 3 different things but they
+  are combined to reduce the number of traversals. Array
+  declarations are rewritten as declarations of lists of variables. xrm-front
+  ensures that array declarations do not overlap with each other.
+  Each array declaration is recorded in the DR DeclaredArrays which maps an
+  Identifier(idf) -> aa-list where `aa-list' is the list of array subscripts
+  declared for that array. When an array is declared in multiple parts, each
+  part is added in the same entry of DeclaredArrays (we first fetch the parts
+  previously declared, check that they don't overlap with the parts being
+  declared, and if they don't, we concatenate them and save them back in the
+  DR). This part of the code might be a bit hard to read because an abstract
+  factory is used to build the declaration lists resulting from array
+  declarations. Hopefully the numerous comments in the
+  file will help the reader to understand how the rewriting is performed.
+  Non-array declarations are also collected in a DR named DeclaredIdentifers
+  which maps an identifier to some information about it (such as its type,
+  its definition range, its initial value). This information is gathered
+  for the type-checking pass. Note that type information of variables
+  declared in arrays is not yet gathered.
+
+  So now, array declarations have been rewritten as declaration lists. This
+  introduces several unwanted nested lists in the AST which are then removed
+  using `flatten-list'.
+
+  Then comes the type-checking pass. At this time, this pass is quite basic
+  and only checks that array subscripts do not lead to out-of-bounds array
+  accesses. We can easily do this thanks to the DR DeclaredArrays which maps
+  an identifier to the dimensions declared for that array. The pass also
+  ensures that all array accesses are made on declared arrays.
+
+  Then, we paste the modules generated by the calls to the rand builtin
+  (which were saved in the DR RandGenModules) at the end of the file.
+
+  Now we can remove all array accesses. This is done by flattening them, e.g.
+  x[1][2][3] is rewritten as x_1_2_3. For property files, we also use this
+  pass to expand all non-parameterized formulas since their expansion wasn't
+  forced by any previous pass. We must force the expansion of
+  non-parameterized formulas in property files because they are not allowed
+  in the base language.
+
+  Finally, we can re-order the content of the modules. Indeed, in XRM
+  we allow the declarations and commands to be freely intertwined whereas in
+  PRISM declarations must always come before the commands. This is done
+  using a scoped DR. We traverse all the Modules and collect declarations and
+  commands in DRs (respectively DeclarationList and CommandList). Once we
+  have them all we can easily re-construct the module with the declarations
+  first, followed by the commands.
+
+  The resulting AST is further desugared if the -D | --desugar switch was
+  provided, then it is pretty-printed in the format required by the command
+  line (default: PRISM code).
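To make the array handling described in the pipeline section of the patch more concrete, here is a minimal Python sketch of the DeclaredArrays bookkeeping, the bounds check and the flattening of x[1][2][3] into x_1_2_3. It is not the Stratego code of xrm-front: the function names and the representation of a subscript as a tuple of integers are assumptions made for illustration only.

```python
# Illustrative sketch only, not xrm-front's Stratego implementation.
# Assumption: a subscript is a tuple of ints, e.g. (1, 2, 3) for x[1][2][3].

def declare_array(declared_arrays, name, subscripts):
    """Record one part of an array declaration, rejecting overlapping parts."""
    previous = declared_arrays.get(name, [])
    overlap = set(previous) & set(subscripts)
    if overlap:
        raise ValueError(f"array {name} declared twice at {sorted(overlap)}")
    # Concatenate the new part with the parts declared earlier.
    declared_arrays[name] = previous + list(subscripts)


def check_access(declared_arrays, name, subscript):
    """Type-checking step: the array and the accessed cell must be declared."""
    if name not in declared_arrays:
        raise KeyError(f"access to undeclared array {name}")
    if tuple(subscript) not in declared_arrays[name]:
        raise IndexError(f"out-of-bounds access {name}{list(subscript)}")


def flatten_access(name, subscript):
    """Rewrite an array access as a plain identifier: x[1][2][3] -> x_1_2_3."""
    return "_".join([name] + [str(i) for i in subscript])


declared = {}
declare_array(declared, "x", [(0,), (1,)])   # first part of x
declare_array(declared, "x", [(2,)])         # second part, no overlap
check_access(declared, "x", (2,))            # declared cell, no error
print(flatten_access("x", (1, 2, 3)))        # prints x_1_2_3
```

Keeping the check and the flattening as separate functions mirrors the separate type-checking and array-removal passes described in the guide.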

    

SIGOURE Benoit
