LTM Change Proposal 1.3

Abstract

This document contains the initial proposal for new features to be introduced in version 1.3 of the Linear Topic Maps Notation. The purpose of this document is to invite discussion before the changes are made official.

This is $Revision: 1.6 $.

1 Introduction

In version 1.3 the following changes are proposed:

Adding the #INCLUDE directive, which was proposed for 1.2 but rejected.
Adding support for variant names.
Adding a #VERSION directive for declaring the version of LTM in use in an LTM file.
Adding URI prefixes, and perhaps also more convenient references.
Adding support for reification.
Adding a section that describes the deserialization of LTM documents into TMDM model instances and, at the same time, clarifies the rules and requirements for merging.

If all of these changes are accepted, LTM 1.3 will be no less expressive than XTM 1.0. It will still not support the full TMDM model, since full representation of source locators will not be supported. Not all of the changes are expected to be accepted; this document is intended more as a trial balloon than anything else.

It must be admitted that some cruft has been allowed to gather in the syntax over versions 1.0, 1.1, and 1.2. This proposal points out that cruft and suggests how it might be dealt with, by deprecating some features and adding alternatives. A later 2.0 version might then remove the deprecated features.

2 The #INCLUDE directive

Splitting large topic maps up into separate files can make their maintenance substantially easier, and at the same time opens the way for reuse of individual modules. LTM 1.2 added the #MERGEMAP directive, which allowed external topic map documents to be merged in. These documents had their own namespaces, however, which meant that the only way to merge was by subject identifier or subject address.

The #INCLUDE directive

[1] include ::= '#' 'INCLUDE' WS STRING

The STRING is a URI reference to the LTM topic map to include. This inclusion mechanism causes the external LTM file to be treated as if its content had appeared at the point of reference. The benefit is that authors can specify merging much more easily, since topic IDs are shared between the including and the included file.

Example of use: #INCLUDE "geography.ltm".
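
The textual-inclusion behaviour described above can be sketched in Python. This is only an illustration under stated assumptions, not a definition of how a conforming processor must work: the function name expand_includes, the load callback, and the in-memory FILES table are all hypothetical, and the sketch ignores relative URI resolution as well as directives that happen to appear inside comments or strings. Skipping a file that has already been included is also an assumption; the proposal does not say how repeated or circular inclusion should be handled.

```python
import re

# A line consisting solely of an #INCLUDE directive, following production [1].
INCLUDE_RE = re.compile(r'^\s*#INCLUDE\s+"([^"]*)"\s*$')

def expand_includes(uri, load, seen=None):
    """Return the text of `uri` with each #INCLUDE line replaced by the
    (recursively expanded) text of the file it refers to."""
    seen = set() if seen is None else seen
    if uri in seen:
        # Assumption: a file is included at most once, which also
        # prevents infinite recursion on circular includes.
        return ""
    seen.add(uri)
    out = []
    for line in load(uri).splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            out.append(expand_includes(match.group(1), load, seen))
        else:
            out.append(line)
    return "\n".join(out)

# Hypothetical in-memory "files" standing in for real URI retrieval.
FILES = {
    "main.ltm": '#INCLUDE "geography.ltm"\n[oslo : city = "Oslo"]',
    "geography.ltm": '[city = "City"]',
}

expanded = expand_includes("main.ltm", FILES.__getitem__)
```

Because the included text is spliced in before parsing, the topic ID city in geography.ltm and the reference to city in main.ltm end up in the same namespace, which is the difference from #MERGEMAP.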

3 The #VERSION directive

Given the increasing number of LTM versions, and the prospect that later versions might not be backwards compatible, it might be helpful if LTM documents could declare which version they are written in. This might help implementations select the right parser, or even allow a special forwards-compatible mode for versions newer than the latest supported version.

The syntax would be as follows:

The #VERSION directive

[2] version ::= '#' 'VERSION' STRING

where the STRING would be the version number of the LTM version used.
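
A document using the directive might begin with the line #VERSION "1.3". How an implementation could use the declaration to pick a parser is sketched below in Python. The names SUPPORTED, declared_version, and parser_mode are hypothetical, and two points are assumptions that the proposal leaves open: that a document without a directive is treated as LTM 1.2, and that the directive may be searched for anywhere in the text (a real parser would tokenize properly).

```python
import re

# Matches the directive from production [2]; whitespace before the
# string is tolerated here, although the production does not mention it.
VERSION_RE = re.compile(r'#VERSION\s*"([^"]*)"')

# Assumption: the versions this hypothetical implementation can parse,
# in ascending order.
SUPPORTED = ("1.0", "1.1", "1.2", "1.3")

def declared_version(text, default="1.2"):
    """Return the declared version, or `default` if there is no directive."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else default

def version_key(version):
    """Turn '1.3' into (1, 3) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def parser_mode(text):
    """Choose a strict parser for known versions and a forwards-compatible
    mode for versions newer than the latest supported one."""
    version = declared_version(text)
    if version in SUPPORTED:
        return "strict-" + version
    if version_key(version) > version_key(SUPPORTED[-1]):
        return "forwards-compatible"
    raise ValueError("unknown LTM version: " + version)
```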

4 URI prefixes

LTM files that declare PSIs for many of their topics usually become quite cluttered, because the long URIs take up a lot of visual space. A typical example is the first draft XMLvoc ontology, which is nearly unreadable because of all the PSIs.

A related problem is that of writing modular LTM files that get merged together to form other topic maps. It's possible to solve this using #INCLUDE, but only at the expense of some loss in modularity, since the IDs then need to be aligned in the various files. The only way to avoid this at present is to redefine the common topics with PSIs in each file, which leads to LTM files like those in the Pepys topic map.

One solution to this is to allow the user to declare prefixes for the URIs using a directive, in much the same way that prefixes work in tolog. These prefixes could then be used everywhere topic IDs are used today.

The syntax for this might be as follows:

The #URIPREFIX directive

[3] uriprefix ::= '#' 'URIPREFIX' WS NAME WS ('@' | '%') STRING

With this new syntax the beginning of the XMLvoc file referenced above could have been written as follows:

Example: xmlvoc.ltm in LTM 1.3

#URIPREFIX xtm @"http://www.topicmaps.org/xtm/1.0/core.xtm#"
#URIPREFIX srg @"http://psi.xml.org/stdsreg/#"

/* -------------- housekeeping topics -------------- */

[xtm:super-sub = "superclass-subclass relationship"
               = "superclass(es)" /xtm:sub
               = "subclass(es)" /xtm:super]
[xtm:super     = "superclass"]
[xtm:sub       = "subclass"]
[xtm:sort      = "sort"]

This would also greatly simplify the Pepys topic map fragment, which could skip the long list of "imports" at the top of the file, and instead write the first event as:

Example: Pepys map in LTM 1.3

[event-16611010-01 : event:working-event = "Samuel works at the Navy Office (10th October 1661)";"16611010-01"]
event:occurs(event-16611010-01 : event:event, today : event:on)
event:participation(event-16611010-01 : event:event, wiki:Samuel_Pepys : event:worker, navy-office : event:place)

The downside of this proposal is that earlier LTM parsers would interpret prefix:postfix as equivalent to prefix : postfix, which has an entirely different meaning (a topic ID followed by a type or role reference). The author is thus not entirely certain whether this proposal is acceptable and would welcome feedback on this issue.
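
To make the intended expansion concrete, the following Python sketch reads #URIPREFIX declarations and resolves a prefixed reference to a full URI. It is a sketch under assumptions, not part of the proposal: the names read_prefixes and resolve are hypothetical, the NAME pattern is a guess at what the LTM grammar permits, and treating a reference with an undeclared prefix as a plain topic ID is only one possible choice.

```python
import re

# Production [3]: '#' 'URIPREFIX' WS NAME WS ('@' | '%') STRING.
# The NAME pattern below is an approximation, not taken from the LTM grammar.
PREFIX_RE = re.compile(r'#URIPREFIX\s+([A-Za-z_][\w.-]*)\s+([@%])"([^"]*)"')

def read_prefixes(text):
    """Map each declared prefix to a (kind, base URI) pair, where kind is
    '@' (subject identifier) or '%' (subject address)."""
    return {name: (kind, base) for name, kind, base in PREFIX_RE.findall(text)}

def resolve(ref, prefixes):
    """Expand 'prefix:local' into (kind, URI); anything else stays a topic ID."""
    prefix, sep, local = ref.partition(":")
    if sep and prefix in prefixes:
        kind, base = prefixes[prefix]
        return kind, base + local
    # Assumption: an undeclared prefix is not an error in this sketch.
    return "id", ref

declarations = (
    '#URIPREFIX xtm @"http://www.topicmaps.org/xtm/1.0/core.xtm#"\n'
    '#URIPREFIX srg @"http://psi.xml.org/stdsreg/#"\n'
)
prefixes = read_prefixes(declarations)
resolved = resolve("xtm:super-sub", prefixes)
```

With the declarations from the XMLvoc example, xtm:super-sub resolves to the subject identifier formed by appending super-sub to the xtm base URI, while an unprefixed reference such as paris is left as an ordinary topic ID.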

5 Reification support

There are six constructs in topic maps which may be reified: topic maps, base names, variant names, occurrences, associations, and association roles. LTM already allows the topic map to be reified, but none of the other constructs. What's needed is a change to the syntax such that a connection can be made between a topic and one of these constructs.

Using the '@' character followed by the ID has already been suggested, but that character has already been taken for another purpose. The same applies to '%'. The '~' character, however, is free, and might work. (The '&' character is also free, but its visual appearance does not feel right for this purpose.)

The idea is that the sequence ~ id can be used after a topic map construct to specify the ID of the topic that reifies the construct. The LTM processor must then create a URI and assign it to the reified construct as a source locator and to the reifying topic as a subject identifier.

Issue (ltm-reifiying-subject-identifier):

Should LTM 1.3 define the subject identifier used for reification?

Example: Using reification

[ltm : syntax = "LTM" ~ltm-name / acronym]
[ltm-name : name = "LTM"]

invented-by(ltm-name : invention, steve-pepper : inventor) ~invented
[invented : association = "Invention of LTM"]

{ltm, specification, "http://www.ontopia.net/download/ltm.html"} ~ltmspec
[ltmspec : occurrence = "The LTM specification"]
written-by(ltmspec : work, lmg : author)

Note that the reification ID must be given before any scope on the construct being reified.
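
What the LTM processor has to do when it meets ~ id can be sketched in Python as follows. The classes and the reify function are hypothetical stand-ins for a real topic map engine, and the URI scheme (base URI plus a fragment built from the topic ID) is purely an assumption, since the proposal deliberately leaves the choice of URI open (see the issue above).

```python
class Construct:
    """A reifiable construct: a name, occurrence, association, and so on."""
    def __init__(self, kind):
        self.kind = kind
        self.source_locators = set()

class Topic:
    """A topic, reduced to the properties this sketch needs."""
    def __init__(self, topic_id):
        self.topic_id = topic_id
        self.subject_identifiers = set()

def reify(construct, topic, base_uri):
    """Connect `topic` to `construct` by giving both the same URI."""
    # Hypothetical URI scheme; the proposal does not define one.
    uri = base_uri + "#--reified--" + topic.topic_id
    construct.source_locators.add(uri)   # source locator on the construct
    topic.subject_identifiers.add(uri)   # subject identifier on the topic
    return uri

# The first line of the example: the name "LTM" is reified by ltm-name.
ltm_name = Construct("topic name")
reifier = Topic("ltm-name")
uri = reify(ltm_name, reifier, "http://example.org/ltm.ltm")
```

The connection between the two objects exists only through the shared URI, which is how reification is expressed in the TMDM at the time of this proposal.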

6 Variant names

LTM already supports two special cases of variant names: sort names and display names. An example of a topic declaration that uses both might be:

[paris : city = "Paris (France)"; "paris"; "Paris"]

In this case, the latter two names are variants of a predefined kind (sort name and display name, in that order). The question is how to allow additional scopes on these two, and also how to allow additional variant names with arbitrary scopes to be specified. The interaction between the scope of the variants and the scope of the entire topic name is also an issue.

One way to do this might be to simply allow more names to be added after the display name, separated by semicolons in the same way, as shown below:

Adding variants to topic names

[4] topname ::= '=' basename variantlist? scope?
[5] variantlist ::= ';' (sortname | sortname? ';' (dispname | dispname? (';' variant scope)+))
[6] variant ::= STRING scope
[7] sortname ::= STRING scope
[8] dispname ::= STRING scope

An example of this might be:

[xml = "Extensible Markup Language"; ;
         ; "XML" /acronym
         ; "Extended Markup Language" /erroneous]

Note that there is an issue here with the interaction between the scope (and reification) of the last variant name and that of the entire topic name (base name + variants). However, since scope is required for variants in the LTM syntax (because TMDM requires this), the first scope after the last variant must necessarily be that of the variant.

Example: A scoped topic name

[xml = "Extensible Markup Language"; ;
         ; "eXtensible Markup Language" /erroneous /english]

In the example above the entire topic name has the scope "english", while the variant within it has the scope "english" and "erroneous". This shows that there is no syntactical ambiguity here. Since reification IDs appear before the scopes they cannot cause any ambiguities, either.
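
The scope computation in this example can be written down as a small Python sketch. The function name effective_variant_scope is hypothetical; the rule it encodes is the one stated above, namely that a variant must carry a scope of its own and also takes on the scope of the topic name it belongs to.

```python
def effective_variant_scope(name_scope, variant_scope):
    """Return the full scope of a variant: its own scoping topics plus
    those of the enclosing topic name."""
    own = frozenset(variant_scope)
    if not own:
        # Scope is required for variants, as noted in the text above.
        raise ValueError("a variant must declare at least one scoping topic")
    return frozenset(name_scope) | own

# The scoped topic name example: the name is scoped /english,
# the variant within it is scoped /erroneous.
scope = effective_variant_scope({"english"}, {"erroneous"})
```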

7 Deserialization specification

The deserialization specification would explain how, given an LTM document, to build a TMDM model instance. It would thus fill the same role as the XTM Syntax Specification. The idea is to firm up the definition of LTM now that there are multiple implementations, to clarify the merging rules (which were deliberately vague in previous versions), and to test that the TMDM works as intended for relating different topic map syntaxes to one another.