<?xml version="1.0"?>
<!-- ============================================================ -->
<!--                                                              -->
<!--    This file is part of the RenderX XSL Test Suite           -->
<!--                                                              -->
<!--    Author: Alexander Peshkov                                 -->
<!--                                                              -->
<!--    (c) RenderX, 2003                                         -->
<!--                                                              -->
<!-- ============================================================ -->
<document>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:dcterms="http://purl.org/dc/terms/"
           xmlns:db="http://www.oasis-open.org/docbook/xml/4.2/">
    <rdf:Description rdf:about="http://xep.xattic.com/testsuite/usecases/index.xml">
        <dc:creator>Alexander Peshkov</dc:creator>
        <dc:title>Creating document indexes using RenderX extensions</dc:title>
        <dc:description>
          <db:para>
            Creates document indexes using RenderX extensions.
            Single-page and range references, different styling for index entries.
          </db:para>
        </dc:description>
        <dc:date>2003-07-07</dc:date>
        <dcterms:requires rdf:resource="http://xep.xattic.com/testsuite/usecases/generic.xsl"/>
        <dcterms:requires rdf:resource="http://xep.xattic.com/testsuite/usecases/index.xsl"/>
    </rdf:Description>
  </rdf:RDF> 
  <section>
  <title>Creating indexes using RenderX extensions</title>
  <para>
    Building page number lists for indexes is not possible within XSL 1.0.
    RenderX XEP provides this functionality via extension elements/properties.
  </para>
  <para>
    Index creation can be divided into two steps: the first is defining index terms and ranges in the text;
    the second is inserting the index page and setting the index entry properties that control visual appearance.
    The first step uses the <code>rx:key</code> property (applicable to any element that can carry an id attribute)
    and the <code>rx:begin-index-range</code> and <code>rx:end-index-range</code> elements, which are useful for creating ranges in the index.
    The second step uses the <code>rx:page-index</code> element together with <code>rx:index-item</code> elements.
    Both can have standard font and style properties, while the latter can also carry four special properties:
    <code>ref-key</code> (required), <code>range-separator</code>,
    <code>merge-subsequent-page-numbers</code>, and <code>link-back</code>.
    A detailed description of these elements and properties can be found in the XEP documentation included in the distribution.
    <citation>
    	XEP 4.0 Reference for Java, section 3.6.3. "Indexes" (reference.pdf)
    </citation>
  </para>
  <para>
	The following excerpt shows what code with two standalone index terms and one index range looks like:
	<codeblock>
    &lt;fo:block ...&gt;Processing a Stylesheet&lt;/fo:block&gt;
    &lt;fo:block ...&gt;
      &lt;rx:begin-index-range rx:key="transformation-primary" id="transformation-id"/&gt;
      An &lt;fo:inline rx:key="XSL"&gt;XSL&lt;/fo:inline&gt;
      &lt;fo:inline rx:key="stylesheet processor-primary"&gt;stylesheet processor&lt;/fo:inline&gt;
      accepts a document or data in XML and an XSL stylesheet and produces...
      ...
      &lt;rx:end-index-range ref-id="transformation-id"/&gt;
    &lt;/fo:block&gt;
	</codeblock>
	The following code shows how the index page is organized:
	<codeblock>
      &lt;fo:block font="bold 16pt Helvetica" ...&gt;INDEX&lt;/fo:block&gt;
      ...
          &lt;fo:block font="12pt Times"&gt;
         stylesheet processor
            &lt;rx:page-index&gt;
              &lt;rx:index-item ref-key="stylesheet processor-primary"
                             color="blue"
                             font-style="italic"
                             font-weight="bold"
                             link-back="true"/&gt;
            &lt;/rx:page-index&gt;
          &lt;/fo:block&gt;
          ...
          &lt;fo:block font="12pt Times"&gt;
         transformation
            &lt;rx:page-index&gt;
              &lt;rx:index-item ref-key="transformation-primary"
                             color="blue"
                             font-style="italic"
                             font-weight="bold"
                             link-back="true"/&gt;
            &lt;/rx:page-index&gt;
          &lt;/fo:block&gt;
          ...
          &lt;fo:block font="12pt Times"&gt;
         XSL
            &lt;rx:page-index&gt;
              &lt;rx:index-item ref-key="XSL-primary"
                             color="blue"
                             font-style="italic"
                             font-weight="bold"
                             link-back="true"/&gt;
              &lt;rx:index-item ref-key="XSL"
                             color="blue"
                             font-style="italic"
                             font-weight="normal"
                             link-back="true"/&gt;
            &lt;/rx:page-index&gt;
          &lt;/fo:block&gt;
      ...
	</codeblock>
	It creates three index entries, all linked back to the locations of the index terms in the main text.
	By default, page numbers are separated by a comma followed by a space;
	subsequent page numbers are not merged; and an en dash ("&amp;ndash;", U+2013) is used to separate page
	numbers in a range.
  </para>
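  <para>
	As a minimal sketch of overriding these defaults, the fragment below sets two of the special properties
	named above on a single index item. The <code>ref-key</code> value and the " to " separator are
	illustrative assumptions, not taken from the test suite stylesheets; consult the XEP reference for the
	exact values each property accepts.
	<codeblock>
      &lt;!-- Illustrative only: the key name and separator string are assumptions. --&gt;
      &lt;rx:page-index&gt;
        &lt;rx:index-item ref-key="formatting"
                       range-separator=" to "
                       merge-subsequent-page-numbers="true"
                       link-back="true"/&gt;
      &lt;/rx:page-index&gt;
	</codeblock>
	With settings like these, a range would be expected to print with " to " between its first and last
	page numbers instead of an en dash, and runs of consecutive page numbers should be merged into ranges.
  </para>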
  <para>
	Starting on the next page is some formatted text containing a number of keywords
	that should be placed in the index. Some index terms span a range of text and so must be
	represented in the index by a page range. The text below is divided into several page sequences in
	order to demonstrate how index ranges work across page-sequence boundaries.
  </para>
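  <para>
	A range that crosses a page-sequence boundary might be marked up roughly as follows. This is a
	simplified sketch: the <code>master-reference</code> name "page" and the abbreviated text are
	assumptions made for illustration, and the actual output of the test suite stylesheets may differ.
	<codeblock>
      &lt;!-- Hypothetical page master name; block content is abbreviated. --&gt;
      &lt;fo:page-sequence master-reference="page"&gt;
        &lt;fo:flow flow-name="xsl-region-body"&gt;
          &lt;fo:block&gt;
            &lt;rx:begin-index-range rx:key="transformation-primary" id="transformation-id"/&gt;
            An XSL stylesheet processor accepts a document or data in XML...
          &lt;/fo:block&gt;
        &lt;/fo:flow&gt;
      &lt;/fo:page-sequence&gt;
      &lt;fo:page-sequence master-reference="page"&gt;
        &lt;fo:flow flow-name="xsl-region-body"&gt;
          &lt;fo:block&gt;
            ...document with the formatting objects and properties.
            &lt;rx:end-index-range ref-id="transformation-id"/&gt;
          &lt;/fo:block&gt;
        &lt;/fo:flow&gt;
      &lt;/fo:page-sequence&gt;
	</codeblock>
	The start and end markers are paired through their <code>id</code> and <code>ref-id</code> values,
	which is what lets the two halves of a range sit in different page sequences.
  </para>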
  </section>
<section>
<title>Introduction and Overview</title>
<para>
This specification defines the <indexterm primary="true" term="XSL">Extensible Stylesheet Language
(XSL)</indexterm>. XSL is a language for expressing stylesheets. Given a class of
arbitrarily structured XML documents or data files, designers use
an XSL stylesheet to express their intentions about how that
structured content should be presented; that is, how the source
content should be styled, laid out, and paginated onto some
presentation medium, such as a window in a Web browser or a
hand-held device, or a set of
physical pages in a catalog, report, pamphlet, or book.
</para>
<subtitle>Processing a Stylesheet</subtitle>
<para><indexrange class="start" primary="true" term="transformation" id="transformation-id"/>
An <indexterm term="XSL">XSL</indexterm> <indexterm primary="true" term="stylesheet processor">stylesheet processor</indexterm> accepts a document
or data in XML
and an XSL stylesheet and produces the presentation of that XML source
content that was intended by the designer of that
stylesheet. There are two aspects of this
presentation process:
first, constructing a result tree from the XML source tree and second,
interpreting the result tree to produce formatted
results suitable for presentation on a
display, on paper, in speech, or onto other media. The first
aspect is called <indexterm primary="true" term="tree transformation">tree transformation</indexterm> and the
second
is called <indexterm primary="true" term="formatting">formatting</indexterm>. The process of formatting
is performed by the <indexterm term="formatter">formatter</indexterm>. This formatter
may simply be a rendering engine inside a browser.
</para>
<para>
<indexterm term="tree transformation">Tree transformation</indexterm> allows the structure
of the result tree to be significantly
different from the structure of the source tree. For example, one could add
a table-of-contents as a filtered selection of an original source document,
or one could rearrange source data into a sorted tabular presentation.
In constructing the result tree,
the <indexterm term="tree transformation">tree transformation</indexterm> process also adds
the information necessary to format that result tree.
</para>
<para>Formatting is enabled by including formatting semantics
in the result tree. Formatting semantics are
expressed in terms of a catalog of classes of
<indexterm term="formatting objects">formatting objects</indexterm>. The nodes of the result tree are formatting
objects. The classes of <indexterm primary="true" term="formatting objects">formatting objects</indexterm>
denote typographic abstractions such as page, paragraph,
table, and so forth. Finer control over the presentation of these abstractions is
provided by a set of <indexterm term="formatting properties">formatting properties</indexterm>, such as
those controlling indents, word- and
letter spacing, and widow, orphan, and hyphenation control.
In <indexterm term="XSL">XSL</indexterm>, the classes of <indexterm term="formatting objects">formatting objects</indexterm> and
<indexterm primary="true" term="formatting properties">formatting properties</indexterm>
provide the vocabulary for expressing presentation intent.
</para>
<para>
The <indexterm term="XSL">XSL</indexterm> processing model is intended to be
conceptual only.  An implementation is not
mandated to provide these as separate
processes.  Furthermore, implementations are free to process
the source document in any way that produces the same result
as if it were processed using the conceptual XSL processing
model.  A diagram depicting the detailed conceptual model is
shown below.</para>
</section>

<section>
<subtitle>Tree Transformations</subtitle>
<para>
<indexrange term="tree transformation" class="start" id="tree-id"/>
<indexterm term="tree transformation">Tree transformation</indexterm> constructs the result tree.  In XSL,
this tree is called the <indexterm primary="true" term="element and attribute tree">element and attribute
tree</indexterm>, with objects primarily in the
"formatting object" namespace. In this tree, a formatting
object
is represented as an XML element, with the properties represented by a
set of XML attribute-value pairs. The content of the formatting object
is the content of the XML element. Tree
transformation is defined in the <indexterm term="XSLT" primary="true">XSLT</indexterm> Recommendation.
A diagram depicting this conceptual process is
shown below.</para>
<para>
The XSL stylesheet is used in tree transformation. A stylesheet
contains a set of tree construction rules. The tree construction rules
have two parts: a pattern that is matched against elements in the
source tree and a template that constructs a portion of the result
tree. This allows a stylesheet to be applicable to a wide class of
documents that have similar source tree structures.
</para>
<para>In some implementations of <indexterm term="XSLT">XSL/XSLT</indexterm>,
the result of tree construction
can be output as an XML document. This would allow an XML document
which contains <indexterm term="formatting objects">formatting objects</indexterm> and <indexterm term="formatting properties">formatting properties</indexterm> to be
output. This capability is neither necessary for an XSL processor nor
is it encouraged. There are, however, cases where this is important,
such as a server preparing input for a known client; for example, the way
that a <indexterm primary="true" term="WAP">WAP</indexterm> 
(<ulink url="http://www.wapforum.org/faqs/index.htm">http://www.wapforum.org/faqs/index.htm</ulink>)
server prepares specialized input for a <indexterm term="WAP">WAP</indexterm> capable
hand held device. To preserve accessibility, designers of Web systems
should not develop architectures that require (or use) the
transmission of documents containing <indexterm term="formatting objects">formatting objects</indexterm>
and properties unless either the transmitter knows that the client can accept
formatting objects and properties or the transmitted document contains
a reference to the source document(s) used in the construction of the
document with the <indexterm term="formatting objects">formatting objects</indexterm> and properties.
<indexrange class="end" id="tree-id"/>
<indexrange class="end" id="transformation-id"/>
</para>
</section>

<section>
<subtitle>Formatting</subtitle>
<para>
<indexrange class="start" term="formatting" id="formatting-id"/>
Formatting interprets the result tree in its <indexterm term="formatting object tree">formatting object tree</indexterm>
form to produce the presentation intended by the designer
of the stylesheet from which the XML element and attribute
tree in the "fo" namespace was constructed.
</para>
<para>
The vocabulary of <indexterm term="formatting objects">formatting objects</indexterm> supported by XSL - the set of
<code>fo:</code> element types - represents the set of
typographic abstractions available to the
designer. Semantically, each formatting object represents a
specification for a part of the pagination, layout, and styling
information that will be applied to the content of that formatting
object as a result of formatting the whole result tree. Each
formatting object class represents a particular kind of formatting
behavior. For example, the block formatting object class represents
the breaking of the content of a paragraph into lines. Other parts of
the specification may come from other <indexterm term="formatting objects">formatting objects</indexterm>; for
example, the formatting of a paragraph (block formatting
object)
depends on both the specification of properties on the block
formatting object and the specification of the layout structure into
which the block is placed by the <indexterm primary="true" term="formatter">formatter</indexterm>.
</para>
<para>
The properties associated with an instance of a formatting object
control the formatting of that object. Some of the properties, for
example "color", directly specify the formatted result.
Other properties, for example 'space-before', only constrain the set
of possible formatted results without specifying any particular
formatted result. The <indexterm term="formatter">formatter</indexterm> may make choices among other
possible considerations such as esthetics.
</para>
<para>
Formatting consists of the generation of a tree
of geometric areas, called the <indexterm primary="true" term="area tree">area tree</indexterm>. The
geometric areas are
positioned on a sequence of one or more pages (a browser typically
uses a single page). Each geometric area has a position on the page, a
specification of what to display in that area and may have a
background, padding, and borders. For example, formatting a
single character generates an area sufficiently large enough to hold the
glyph that is used to present the character visually and the glyph is
what is displayed in this area. These areas may be nested. For
example, the glyph may be positioned within a line, within a
block, within a page.
</para>
<para>
Rendering takes the <indexterm term="area tree">area tree</indexterm>, the abstract model of the presentation (in terms
of pages and their collections of areas), and causes a
presentation to appear on the relevant medium, such as a browser
window on a computer display screen or sheets of paper. The semantics
of rendering are not described in detail in this specification.
</para>
<para>
The first step in formatting is to "objectify" the element
and attribute
tree obtained via an <indexterm term="XSLT">XSLT</indexterm> transformation.
Objectifying the tree basically consists of turning the
elements in the
tree into formatting object nodes and the
attributes into property specifications. The result of this step is
the <indexterm primary="true" term="formatting object tree">formatting object tree</indexterm>.
</para>
<para>
As part of the step of objectifying, the characters that occur in
the result tree are replaced by fo:character nodes.
Characters in text nodes which consist
solely of white space characters and
which are children of elements whose corresponding <indexterm>formatting objects</indexterm> do
not permit fo:character nodes as children are ignored.  Other characters
within elements whose corresponding <indexterm>formatting objects</indexterm> do not permit
fo:character nodes as children are errors.
</para>
<para>The content of the fo:instream-foreign-object
is not objectified;
instead the object representing the fo:instream-foreign-object
element points to the appropriate node in the element and
attribute tree.
Similarly any non-XSL namespace child element
of fo:declarations is not objectified; instead the object representing
the fo:declarations element points to the appropriate node in the
element and attribute tree.
</para>
<para>
The second phase in formatting is to refine the formatting
object tree
to produce the <indexterm primary="true" term="refined formatting object tree">refined formatting object tree</indexterm>.
The <indexterm term="refinement">refinement</indexterm> process handles the mapping from properties
to traits.  This consists of: (1) shorthand expansion into
individual properties, (2) mapping of corresponding
properties, (3) determining computed values (may include
expression evaluation),
(4) handling white-space-treatment and linefeed-treatment
property effects, and (5) inheritance.
Details on
<indexterm term="refinement">refinement</indexterm> are found in "&#x00A7; 5 - Property <indexterm term="refinement">Refinement</indexterm>/Resolution"
</para>
<para>
The third step in formatting is the construction of the area
tree. The <indexterm term="area tree">area tree</indexterm> is generated as described
in the semantics of each formatting object. The traits applicable to each formatting
object
class control how the areas are generated. Although every
formatting property may be specified on every formatting object, for
each formatting object class, only a subset of the formatting
properties are used to determine the traits for objects of
that class.
<indexrange class="end" id="formatting-id"/>
</para>
</section>
<section>
<title>INDEX</title>
<index link-back="true"/>
</section>
</document>