XPP Index Spec. Deconstructed

The IDPF ePub3 Indexing Specification is abstract, incomplete and relatively mundane. It should be low priority for any reading system developer or publisher producing digital content.

It is not totally irrelevant, but it is incomplete and unnecessary. It adds little whilst increasing production costs and the long-term cost of digital content ownership.

Flavour du Jour

It is probably not the place of any specification to give vision statements on what could be done with it, but that is what happens in the voluminous Appendices to this specification.

The XPP Index is less than (<) the XPP Content Fragment Identifier (CFI) interpretation of the dead XLink specification. It brings no value now and questionable future value.

Here we have attempted to provide a neutral analysis.

Read the Specification

Presumably very few people have read or tried to digest this draft specification, or applied it to real content for use in a reading system. Axis12 has done both in our evaluation.

There are also few who have seriously tagged Indexes for future value and variation processing. Here are the challenges presented by the specification:

  1. Reading Systems are meant to be able to interpret and use these "indexes", as the XPP specification states.
  2. Publishers and their service providers are supposed to be able to produce content that implements the tagging structures so reading systems can use it.
  3. The IDPF continues to believe that because it has written words in a specification, those words matter to publishers.
  4. The specification is an epub:type encapsulation of a decade-old unsuccessful XML tagging strategy for indexes. It could be seen as a step back in the digital-content marketplace.

Index Spec Discussion History.

Axis12 has a significant HTML5 digital content management system that relies on the fact that ordered means exactly that and unordered means the sequence of "items" is not significant.

Indexes and the art of Indexing are complex. No one is denying or belittling this. The Cross Platform Publisher eIndexer reflects the complexity of the mental process a human Indexer has to go through. Cross Platform Publisher:eIndexer is instantly ready for every aspect of this specification and can be upgraded quickly and easily when required.

The conundrum is: should we support this specification pre-emptively (as we have supported ePub3 in production tools and reading systems for two years) or wait for a publisher to demand an XPP index? Other than the production and processing testing we have done so far, we will wait.

Fortunately Axis12 have a module attached to Cross Platform Publisher called Cross Platform Publisher:Formats On Demand. Its purpose is to take highly value tagged content and simplify it for ePub3 reading systems, the Kindle drone format and possibly now the ePub3 Index specification if a publisher requests it.

Complexity Without Benefit

Numerous new rules and epub:type properties have been released by the Index specification.

For Documents That Are Only Indexes

If the document is only an Index, use this declaration in the metadata.

<dc:type>index</dc:type>

Manifest

The manifest shouldn't be forgotten. The manifest item for an index document must carry this property.

properties="index"

Index is now right up there with MathML, SVG and Scripts.
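
As a minimal sketch, assuming a production tool written in Python, the two declarations above could be checked in a package document like this. The sample package, its file names and the function name find_index_declarations are hypothetical and are not taken from the specification.

```python
import xml.etree.ElementTree as ET

# Standard namespaces used in an ePub3 package document.
NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical package document for a publication that is only an index.
SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:hypothetical</dc:identifier>
    <dc:type>index</dc:type>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="idx" href="index.xhtml" media-type="application/xhtml+xml" properties="index"/>
  </manifest>
</package>
"""


def find_index_declarations(opf_text):
    """Return (is_index_only, hrefs of manifest items flagged as indexes)."""
    root = ET.fromstring(opf_text.strip())
    dc_types = [
        el.text.strip()
        for el in root.findall("opf:metadata/dc:type", NS)
        if el.text
    ]
    index_items = [
        item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
        # "properties" is a space-separated list, so split before testing.
        if "index" in (item.get("properties") or "").split()
    ]
    return "index" in dc_types, index_items


print(find_index_declarations(SAMPLE_OPF))
```

A check of this kind only confirms that the declarations are present. It says nothing about whether the index content itself is tagged consistently.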

And the epub:type Vocabulary

More interestingly, the specification creates a radically extended epub:type vocabulary with lots of new property values. Here they are for the digital content production aficionados. This is not an explanation but a simple list. The vocabulary is mostly self-explanatory; how the values are meant to be used is not.

  1. index-editor-note
  2. index-entry
  3. index-entry-list
  4. index-group
  5. index-headnotes
  6. index
  7. index-legend
  8. index-locator
  9. index-locator-list
  10. index-locator-range
  11. index-term
  12. index-term-categories
  13. index-term-category
  14. index-xref-preferred
  15. index-xref-related
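
To make the list a little more concrete, here is a minimal Python sketch that counts which of these values appear in an index document and flags any index-like value that is not in the list. The sample markup is a hypothetical illustration, not the specification's own sample, and it contains a deliberate misspelling (index-locater) to show the check. The function name count_index_types is also an assumption.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Fully qualified name of the epub:type attribute.
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"

# The fifteen values listed above.
INDEX_VOCABULARY = {
    "index-editor-note", "index-entry", "index-entry-list", "index-group",
    "index-headnotes", "index", "index-legend", "index-locator",
    "index-locator-list", "index-locator-range", "index-term",
    "index-term-categories", "index-term-category",
    "index-xref-preferred", "index-xref-related",
}

# Hypothetical index fragment; "index-locater" is a deliberate typo.
SAMPLE_INDEX = """
<section xmlns="http://www.w3.org/1999/xhtml"
         xmlns:epub="http://www.idpf.org/2007/ops" epub:type="index">
  <ul epub:type="index-entry-list">
    <li epub:type="index-entry">
      <span epub:type="index-term">apples</span>
      <a epub:type="index-locator" href="chapter1.xhtml#apples-1">12</a>
      <a epub:type="index-locater" href="chapter2.xhtml#apples-2">47</a>
    </li>
  </ul>
</section>
"""


def count_index_types(xhtml_text):
    """Return (Counter of known index values, set of unknown index-like values)."""
    used = Counter()
    unknown = set()
    for element in ET.fromstring(xhtml_text.strip()).iter():
        # epub:type holds a space-separated list of values.
        for value in element.get(EPUB_TYPE, "").split():
            if value in INDEX_VOCABULARY:
                used[value] += 1
            elif value.startswith("index"):
                unknown.add(value)
    return used, unknown


used, unknown = count_index_types(SAMPLE_INDEX)
print(dict(used), unknown)
```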

Down to Nav

Indexes must be referenced from the nav.XHTML file. Add in something such as a DocBook "role" statement:

collection role="index"

collection role="index-group"

Strange Manual Linking

The specification goes into detail on tagging index terms in an index with epub:type properties. It introduces the concept of property inheritance, which means there is little need to use the properties defined. Axis12 have been using sub-group inheritance in Cross Platform Publisher:FoundationXHTML since 2006.

What is of considerable interest is the index link targets. In the online Index sample they have used a very manually applied href value and #href which doesn't adequately illustrate the complexities of real-world digital-content target index linking.

A target can be an item, a start or an end point, but there is no epub:type property that can be applied to a target to deliver, for example, different highlighting or other visual and accessibility clues in the content body.
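
The three kinds of target can be illustrated with a small Python sketch that turns a sequence of tagged targets into locator labels, pairing each start with its end. The tuple layout, the kind names "item", "start" and "end", and the function name format_locators are assumptions made for illustration. Nothing here is defined by the specification.

```python
def format_locators(targets):
    """Turn (page, kind) targets into a locator label such as "12, 47-52".

    kind is one of "item", "start" or "end" (hypothetical names).
    """
    labels = []
    range_start = None
    for page, kind in targets:
        if kind == "item":
            labels.append(str(page))
        elif kind == "start":
            range_start = page
        elif kind == "end":
            if range_start is None:
                raise ValueError(f"end target on page {page} has no start")
            labels.append(f"{range_start}-{page}")
            range_start = None
        else:
            raise ValueError(f"unknown target kind: {kind!r}")
    if range_start is not None:
        raise ValueError(f"start target on page {range_start} has no end")
    return ", ".join(labels)


print(format_locators([(12, "item"), (47, "start"), (52, "end")]))
```

Because the kind travels with each target in this sketch, a production system could also use it to style start and end points differently in the content body, which is the capability the specification does not provide.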

Index Power

In Cross Platform Publisher, Index links are symmetrical and semantic. Linking symmetry is for crude reading systems. The semantics are for more advanced reading systems that support JavaScript. It is possible to get to an Index term of interest from the index reference in the text.

Axis12 use HTML5 data- attributes for link types rather than class statements because they are part of Cross Platform Publisher:FoundationXHTML. That means a line of JavaScript can create a reference back to the Index term.
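
The symmetry can be sketched as a pair of generated links. The text describes creating the back reference with JavaScript in the reading system; the sketch below instead generates both directions in Python at production time, purely to show the pairing. The attribute name data-index-role, its values, the file names and the function names are hypothetical and are not the actual FoundationXHTML markup.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class IndexReference:
    term_id: str       # id of the term in the index document
    target_id: str     # id of the tagged reference point in the content
    content_file: str  # content document holding the reference point


def forward_link(ref, label):
    """Link from the index term to the reference point in the content."""
    href = escape(f"{ref.content_file}#{ref.target_id}", quote=True)
    return f'<a href="{href}" data-index-role="locator">{escape(label)}</a>'


def back_link(ref, index_file="index.xhtml"):
    """Anchor in the content that links back to the index term."""
    href = escape(f"{index_file}#{ref.term_id}", quote=True)
    anchor_id = escape(ref.target_id, quote=True)
    return f'<a id="{anchor_id}" href="{href}" data-index-role="reference"></a>'


ref = IndexReference("term-apples", "ref-apples-1", "chapter1.xhtml")
print(forward_link(ref, "12"))
print(back_link(ref))
```

Both links are derived from the same record, so the two directions cannot drift apart.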

In a full ePub3 package, Index terms are block packaged to the index references to empower horizontal Index term navigation. The horizontal navigation packages are processed and created at packaging time so the reader CPU/Memory resource consumption is minimised. The reading system should not be responsible for all presentation and interactivity. This dilutes innovation, which is where the ePub3 specification is struggling.

Horizontal Index Linking

Main term example with page numbers. This is suitable for documents that have page breaks included and a source reference to the print work the page breaks represent. Ideally the links resolve to the actual index reference point on that page, not just the page number.

Main term example with direction indicators. Suitable where no page break references are available and the index reference is tagged in the content. This is built into the Cross Platform Publisher eIndexer for the production of front-list print and digital only books. Notice that the arrows of single references point to whether the index reference is before or after the current location. This sense of direction is essential in horizontal navigation tools.

Sub-term example with page numbers. Suitable for documents that have page breaks included and a source reference to the print work the page breaks represent. The abstraction with page numbers in digital content is that the user could have clicked the Page 12 index term reference, the page 47 start index-term reference or the page 52 end index-term reference.

Sub-term example with direction indicators. Suitable where no page break references are available and the index reference is tagged in the content. This is built into the Cross Platform Publisher eIndexer for the production of front-list print and digital only books. The sub-term is listed first with the primary term second, and other links referenced appropriately.

The value of this approach is a user can explore all index references and sub-references for any specific index term and easily return to their starting reading point from any of the other index blobs.
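
A minimal Python sketch of how such a navigation block might be computed at packaging time follows. It assumes the references for one term are already listed in reading order. The function name build_navigation_block and the ASCII indicators are hypothetical stand-ins for the arrows described above.

```python
def build_navigation_block(reference_ids, current_id):
    """Return (reference_id, indicator) pairs for one index term.

    reference_ids must be in reading order. The indicator shows whether each
    reference lies before ("<-") or after ("->") the current location.
    """
    current_pos = reference_ids.index(current_id)  # ValueError if not found
    block = []
    for pos, ref_id in enumerate(reference_ids):
        if pos < current_pos:
            indicator = "<-"
        elif pos > current_pos:
            indicator = "->"
        else:
            indicator = "here"
        block.append((ref_id, indicator))
    return block


print(build_navigation_block(["ref-1", "ref-2", "ref-3"], "ref-2"))
```

One block would be generated per reference point, so the reading system only has to display precomputed links.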

While a reading system could theoretically assemble this package dynamically, there are reasons not to rely on it:

  1. It doesn't make sense to commit tablets to the memory and CPU resources required.
  2. Multiple reading systems are unlikely to handle this consistently across all platforms.
  3. There is an insistence on using general HTML5.

No New Ideas. Production Nightmare

IDPF has made index production rigid, complex, incomplete and expensive. Without target properties the system is very limited as to what can actually be created.

It would appear that the dynamic and interactive potential of indexes has been turned into "XML-like" cement, encapsulated in epub:type properties and other peculiar items.

The good/bad news is: no publisher will be able to produce a consistently tagged index that any reading system will be able to use for years unless they use an advanced XHTML production system like Cross Platform Publisher.

Is there a point to specifications of this nature? Axis12 understands the goodwill and nature of the specification. It reflects the fact that the current IDPF management and specification writers are not involved on a daily basis with real-world digital content. It is not difficult or expensive.

Summary

Ultimately the IDPF has added some epub:type properties to standard index terms while using unordered lists.

In their demonstration Index they also sneaked in an index link to an arcane XLink type CFI item, which is a link to the irrelevant arcane.

They give acknowledgement to the DocBook specification, which is potentially alarming to serious digital content practitioners. A deceased and failed XML method should not influence a 2013 HTML5 specification. There are real, new and powerful options in 2013.

The outcome is a reading system specification that defines a rigid Index tagging strategy that publishers must create for a reading system to understand, which is questionable.

It is this lock-down approach that continues to put stress on ePub3. The inspirational outcomes of the AAP conference and the as yet undocumented abstract outcomes of the recent EDUPUB conference are being ignored by the IDPF.

Cross Platform Publisher is one of the very few systems that can easily process epub:type properties into anything, such as this specification. Currently an ePub3 produced with Cross Platform Publisher includes around 90% of epub:type properties.
