The Interchange Development Group
http://www.icdevgroup.org

ICDEVGROUP Documentation Set
http://git.icdevgroup.org/?p=xmldocs.git;a=tree

This is a technical look at the XMLDOCS system, and it tells you how to get
the documentation generated. For XMLDOCS authoring, see the file WRITING.


INTRODUCTION

The (new) Interchange XML documentation set is completely self-contained.
To build the complete documentation set, just run:

  make git
  make

Sometimes you want to build multiple versions of the documentation at once
(for example, one online, one offline, one profiled for Debian GNU, etc.).
To do this, you can change the OUTPUT/ directory to something else (named
OUTPUT<yourstring>). However, OUTPUT/ is still hardcoded in places where
variables could not be inserted, so OUTPUT is always made a symbolic link to
the output directory of your choice.

Here's an example:

  OUTPUT=-std make skel

After that point, you can omit OUTPUT= from subsequent calls to "make"
and things will still fit together nicely.
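
The effect of OUTPUT=-std can be pictured with the sketch below. It is only
an illustration run in a scratch directory, assuming the Makefile does
something equivalent to a plain 'ln -s'; the variable names (workdir, suffix,
link_target) are made up for this example and do not come from the Makefile.

```shell
# Work in a scratch directory so no real build tree is touched.
workdir=$(mktemp -d)

suffix="-std"                                  # the value passed as OUTPUT=-std
mkdir -p "$workdir/OUTPUT$suffix"              # the real output directory
ln -sfn "OUTPUT$suffix" "$workdir/OUTPUT"      # OUTPUT points at it, so
                                               # hardcoded OUTPUT/ paths work

link_target=$(readlink "$workdir/OUTPUT")
echo "OUTPUT -> $link_target"
```

Anything that writes to OUTPUT/ then ends up in OUTPUT-std/.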

Also, it is possible to generate documentation for a specific Interchange
release, such as 5.0.0 or 5.2.0. To do so, use the TARGET environment variable:

  TARGET="5.2.0" make refxmls

Just make sure to have TARGET defined on each 'make' invocation.
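
One way to avoid retyping TARGET is to export it once per shell session.
The sketch below only demonstrates that an exported variable reaches child
processes, which is how 'make' would see it. The variable seen_by_child is
purely illustrative.

```shell
# Export once; every later command started from this shell inherits it.
TARGET="5.2.0"
export TARGET

# A child process (standing in for 'make' here) sees the same value.
seen_by_child=$(sh -c 'printf %s "$TARGET"')
echo "child sees TARGET=$seen_by_child"
```

After the export, 'make refxmls' followed by a plain 'make' in the same
shell would both run with TARGET set.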

**  -- -- --   -- -- -- --   -- -- -- --   -- -- -- --   -- -- -- --  **

To build specific targets, see the Makefile for target names. A few useful
targets:

 -- Those that are not part of the 'make' routine:
  make git          (creates the complete sources/ directory, or updates an
                    existing one)
  make clean        (removes OUTPUT/)
  make distclean    (removes OUTPUT/ and tmp/)
  make look-clean   (clean + 'mv tmp tmp.temporary'. Useful to make the tree
                    look clean (to perform Git operations without errors);
                    the directory is then automatically renamed back to
                    'tmp/', which avoids the overhead of regenerating
                    OlinkDB files).

 -- And those that are part of 'make':
  make skel
  make cache
  make refxmls
  make olinkdbs-nc olinkdbs-c
  make OUTPUT/xmldocs.css
  make tmp/iccattut-nc.db tmp/iccattut-c.db
  make OUTPUT/iccattut.html OUTPUT/iccattut
  ...

See Makefile for a complete list, of course.


PREREQUISITES

To perform a successful build, the following programs and modules
must be available:

  - Perl
  - Shell commands: mkdir, cp, ln, find, tar, gzip, bzip2, make
  - git-core
  - docbook-xml
  - docbook-xsl
  - xsltproc
  - tetex-extra (and the packages it depends on)

Optional Perl modules:

  - XML::Twig

(Install the Twig module if you want to syntax-check your additions by
calling e.g. "bin/refs-autogen --validate git-head").

Red Hat setups

If running on Red Hat and not Debian GNU, apply this patch:
http://icdevgroup.org/~docelic/xmldocs-rh.diff . It allows you to get
usable results, although it's flaky and Debian GNU is really the preferred
platform. (I suppose that with a better patch, by someone who knows Red Hat
and its DocBook XML setup, Red Hat would produce completely good results
too; patches welcome.)

On Gentoo GNU/Linux systems, the prerequisite ebuild packages are as follows:

  - dev-lang/perl
  - dev-util/git
  - app-arch/gzip
  - app-arch/bzip2
  - app-text/docbook-xml-dtd
  - app-text/docbook-xsl-stylesheets
  - dev-libs/libxslt
  - dev-util/ctags
  - app-text/tetex
  - dev-perl/XML-Twig

If running on Gentoo GNU/Linux then you will need to apply the following
patch to correct the hard-coded, Debian-specific file locations:

    http://icdevgroup.org/~kwalsh/xmldocs-gentoo.diff

FINAL OUTPUT

During the invocation of 'make', a few files will be created:

  docbook/auto*.ent - Files containing XML entities that we can use in our
                   sources. For example, configuration directive Include is
                   referenced simply as &conf-Include;, tag [include] is 
                   referenced simply as &tag-include;.
  
  tmp/*.db       - OLink DB files generated from source .xml files,
                   and from other .xml files created on the fly.

  cache/<ver>/*  - Various Interchange source tree statistics, available
                   over a filesystem interface. (For XInclusion in .xml
                   sources and similar purposes). The files are generated
                   by bin/mkreport which depends on cache/<ver>/.cache.bin.
                   cache/<ver>/.cache.bin is generated by bin/stattree.

                   The cache is Perl's portable Storable dump. It was
                   originally kept in Git (so others could re-use it),
                   but that didn't work out well in practice, so now everyone
                   building the docs needs to have the sources/ directory
                   ready and run 'make caches' themselves to get the .bin
                   files generated.
  
  OUTPUT/        - Autogenerated:
                   directory containing the actual completely self-contained
                   and interlinked documentation set. Once it's created, you
                   can move it out of the build tree and package as you see
                   fit.
                   
                   This can also be OUTPUT<yourstring>, if you pass e.g.
                   OUTPUT=-std in a call to make (as already shown above).

  tmp/missing[2] - After you build the documentation, an autogenerated file
                   named tmp/missing will list the symbols, each with the
                   parts of its documentation that are still missing for the
                   item to be considered complete.
                   tmp/missing2 will also be created, listing the HOWTO
                   items and glossary entries that need work.


DEVELOPMENT NOTES

 The directory structure:
   Makefile      - Main Makefile
   bin           - Helper tools
   cache         - Interchange source trees metadata
   docbook       - DocBook XML support files
   files         - Support files, such as downloadable examples etc.
   guides        - Collection of guides
   refs          - Collection of reference pages
   glossary      - Collection of glossary items
   images        - All images
   tmp           - Scratch and temporary file space
   pending       - A "pending" directory.
                   If you have a chunk which you'd like to integrate in
                   the docset, but don't have the time to prepare it
                   yourself, just drop it in there and someone will pick
                   it up.
   sources       - (Not in Git.) Run 'make git' to auto-create this
                   directory with all needed contents.
   whatsnew      - A directory containing whatsnew.xml, a continuously
                   updated "what's new" list. The items are added
                   automatically; just place copies of the Git commit
                   messages in the whatsnew/ directory. Every time you run
                   bin/whatsnew-update, it will process the files and update
                   whatsnew.xml, which you can then push to Git.


 Updating cache/:
   The dotfiles found in cache/ can only be generated when the sources/
   directory is present as described above, and contains Interchange releases
   in directories named after release numbers (with the exception of
   "git-head"). Run 'make git' to create that sources/ directory with
   all the contents. Run 'make up-all' to invoke git update on all
   versions (or 'make up-<ver>' for a specific one). ('make git' can also be
   used to update the repositories.)
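
To see which release directories are actually present under sources/, a
small helper along these lines can be used. It is only a sketch: the function
name list_versions and the scratch tree are made up for illustration, and it
assumes nothing more than "one subdirectory per release" as described above.

```shell
# Print the name of every subdirectory of the directory given as $1.
list_versions() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue      # skip the unexpanded pattern if empty
    basename "$dir"
  done
}

# Demonstration on a scratch tree (the release names are examples only).
demo=$(mktemp -d)
mkdir -p "$demo/sources/5.0.0" "$demo/sources/5.2.0" "$demo/sources/git-head"

version_count=$(list_versions "$demo/sources" | wc -l | tr -d ' ')
list_versions "$demo/sources"
```

In a real tree you would call 'list_versions sources' and compare the result
with the versions you expect to build.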


   ** Leftover information, not important for standard procedure: ************
   * Once sources/ is in place, make sure all available versions are mentioned
   * in Makefile's IC_VERSIONS variable, and then run 'make cache'.
   * This will regenerate files for the versions you have.
   *
   * It is important to have as many releases as possible in sources/, because
   * when you generate the documentation (reference pages most notably), the
   * symbols there are considered "added" the first time they're encountered
   * in the sources (so they'll appear more recent than they actually are).
   ** End of leftover information ********************************************


   When bin/mkreport runs later, it parses the .cache.bin file and produces
   a number of output files (the interesting "leaf nodes" in a hash). Those
   files are a filesystem interface to tree-level statistics, and can be
   used in numerous ways, XInclude for example.
   Like: "Interchange consists of <xi:include file='.../total.files'> files".
   (Currently this is not available; bin/mkreport is outdated and broken, and
   will be fixed when I need it).


 The XML "preprocessor" tool:
   The bin/pp tool lets you write larger blocks of XML more conveniently.
   See the script itself for usage notes.

 The "CVS CO/UP" script:
   The bin/coup tool helps you manage the sources/ directory.
   See the script or the Makefile for invocation examples.
   ('make git' and 'make up-<ver|all>' invoke bin/coup).


 Autogeneration of the reference pages: ** IMPORTANT **
 Creation of new documentation parts:   ** IMPORTANT **

   When bin/stattree runs, it collects information about all the "symbols"
   it can find in the source (a symbol can be anything: pragmas, global
   variables, functions, tags, config directives ...).
   It collects the symbol names together with all the files
   and line numbers (and a few lines of context around them) where they
   appear. This is the first step of reference page autogeneration.

   Some of the symbol information is derived from the source automatically;
   other parts must be added manually by us.

   Please note that the symbol name must match the symbol used in the source,
   although tags may be written in either of two forms (dash or underscore).
   
   Let's take the "post_page" pragma as an example (the principle is the same
   for any symbol). User-supplied information is found in either:
   
   o  refs/<symbol>/ directory, or
   o  refs/<symbol> file, where multiple sections are defined using the 
      __NAME__ <section> and __END__ tokens (similar to IC profiles ;-).
      Everything outside __NAME__ <name> ... __END__ blocks is discarded
      and can effectively be used as a comment area.

   The refs/<symbol> single-file approach is now preferred. It's more
   convenient, and keeps the number of files in Git to a minimum. The
   bin/editem script advises using it.
   Use refs/<symbol>/* only in special cases (which is never, or close to it).
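
The single-file layout can be illustrated with the sketch below. It writes a
made-up refs-style file to a temporary location and pulls one section back
out with awk. This is not the project's parser: the helper extract_section,
the placeholder texts, and the assumption that __NAME__ and __END__ start at
the beginning of a line are all illustrative.

```shell
# Print the body of section $1 from the refs-style file $2.
extract_section() {
  awk -v want="$1" '
    $1 == "__NAME__" {                 # start of a block: remember its name
      name = $0
      sub(/^__NAME__[ \t]+/, "", name)
      inside = (name == want)
      next
    }
    $1 == "__END__" { inside = 0; next }   # end of the current block
    inside { print }                        # text outside blocks is dropped
  ' "$2"
}

# A hypothetical refs/post_page file; all content is placeholder text.
ref_file=$(mktemp)
cat > "$ref_file" <<'EOF'
Anything out here is discarded, so it can serve as a comment area.

__NAME__ purpose
placeholder purpose text
__END__

__NAME__ description
placeholder description,
possibly spanning several lines
__END__
EOF

purpose_text=$(extract_section purpose "$ref_file")
echo "purpose: $purpose_text"
```

The same helper can be pointed at any other section name defined in the file,
for example 'extract_section description "$ref_file"'.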
   
   Regardless of the way you document an item, the following information
   is needed to consider the symbol documentation complete:

     - Pieces that must be user-supplied because defaults are only placeholders:
        purpose, synopsis, description, examples, see also

     - Pieces that could be user-supplied but have meaningful default:
        notes, bugs, authors, copyright

     - Autogenerated (can be overridden if really necessary):
        id, name, symbol type, availability, source occurrences


   *** Obscure information START /// Not needed in general ***
   * All of the above fields can be either overridden or appended to with
   * user-supplied information:
   *
   *oo  refs/<symbol> method (one-file, the preferred way):
   *
   *   To fill the template of the reference page, you add content to sections
   *   in the following way:
   *
   *   __NAME__ section name
   *   section content
   *   __END__
   *
   *   Over time it appeared we only want to append information and never
   *   override it, so this method does not have a way to override a value
   *   like refs/<symbol>/control (in multi-file method) can do.
   * 
   *oo  refs/<symbol>/* method (one-directory, multiple-files):
   *
   *   To unconditionally override values and/or provide one-liner contents, use
   *   refs/<symbol>/control file. It has pretty much inflexible
   *   "field: content" line format, but # comments can be used.
   *
   *   To append information, you use refs/<symbol>/<X>, where <X> is
   *   the name of an existing section, maybe followed by an arbitrary string.
   *   With the exception of example files, you generally only create one
   *   file for each section.
   *   To supply more examples, you could keep them in an informal structure
   *   like this:
   *   refs/post_page/example
   *   refs/post_page/example2
   *   refs/post_page/example-relative_pages
   *   refs/post_page/example:used-often
   *   refs/post_page/example.something
   *
   *   (also, nothing prevents you from having more <example>s in the same file
   *   if you like).
   *
   *   You can't use # comments in the non-control files (they'd be left as-is),
   *   but you can use XML comments <!-- commented section -->.
   *   To avoid having to escape all HTML entities and everything, simply
   *   enclose "dirty" blocks in <![CDATA[ ... ]]>.
   *** Obscure information END /// Not needed in general ***
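
For the multi-file method, the "field: content" format of the control file
can be pictured with the sketch below. It is an assumption-laden illustration
rather than the actual parser: the helper parse_control is made up, the sample
field is a placeholder, and it assumes that a comment is a whole line starting
with '#' and that the first colon separates field from content.

```shell
# Print "field<TAB>content" for each "field: content" line of file $1.
parse_control() {
  awk '
    /^[ \t]*#/ || /^[ \t]*$/ { next }      # skip comments and blank lines
    {
      pos = index($0, ":")
      if (pos == 0) next                   # not a "field: content" line
      field = substr($0, 1, pos - 1)
      content = substr($0, pos + 1)
      sub(/^[ \t]+/, "", content)          # drop spaces after the colon
      printf "%s\t%s\n", field, content
    }
  ' "$1"
}

# A hypothetical refs/<symbol>/control file with placeholder content.
control=$(mktemp)
cat > "$control" <<'EOF'
# comment lines like this one are ignored
purpose: placeholder one-line purpose
EOF

parsed=$(parse_control "$control")
echo "$parsed"
```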


** To create the documentation for an item that does not exist yet, your best
** bet is to start off by copying an existing item over.

** When adding documentation entries, please favor QUALITY over QUANTITY.
** That means you should grep the whole Git repo for ALL information about a 
** symbol (and supplement that with your own invaluable historical and
** technical information), and then write the item documentation that
** includes all that information.
** Also make sure to check the actual symbol source; in many places I've
** found undocumented options, and variables being used or checked.

** After you build the documentation, there will be a file named
** tmp/missing autogenerated, and it will contain a list of symbols with all
** the parts of the documentation they still miss for the item to be
** considered complete. tmp/missing2 will also be created that will list
** HOWTO items and glossary entries that need work.
** Clearing out this list is a priority;
** given that the new system is so convenient and cool, you have no excuse ;-)


Davor Ocelic, docelic@icdevgroup.org