NNS - Navibars Native Sitemap 1.0

Specification, 21 August 2005

This version:
http://navibar.oaklett.org/specs/nns-10/NNS-10-20050821.html
Latest version:
http://navibar.oaklett.org/specs/nns-10/
Previous version:
http://navibar.oaklett.org/specs/negs-10-wd/
Author:
Markus Siebeneicher, <siebeneicher@oaklett.org>

Abstract

Navibar Native Sitemap (NNS) is an RDF/XML file that holds information about the webpages of a website. It additionally describes the content structure of the site as a tree. Common information about a single page, such as its title and URL, is expressed with the Dublin Core Metadata Element Set; the tree representation comes from the Minimal Access Plan.

Although NNS is the preferred format for representing your website's content, there is also NEGS, the Navibars Extension for Google Sitemaps format.

MAP - Minimal Access Plan

The Minimal Access Plan (MAP for short) is a small set of RDF predicates that can be used in an RDF/XML file to describe a tree-like map of a whole website. MAP grew out of practical needs and is the basis for the Firefox Navibar Extension. To understand MAP, it helps to look at Navibar's goals: once Navibar is installed, whenever you visit a website Navibar requests a single file from the same domain, the sitemap.rdf (e.g. http://www.mozilla.org/sitemap.rdf). If the website provides that Sitemap, Navibar builds a tree-like map of the contents from the RDF facts in sitemap.rdf.

Because RDF is structurally flat and has no built-in mechanism for processing a tree without following known graph routes, the decision was made to create some RDF predicates that tell Navibar how to build a tree from the flat RDF graph. RDFS and OWL exist, but they aren't available in Firefox yet, so two predicates were introduced:

http://www.oaklett.org/map/1.0#container
http://www.oaklett.org/map/1.0#embedded

Each predicate indicates a new set of children of the current page. The difference between them lies in the point-of-view, which is explained later.

With these two predicates Navibar can represent a flexible tree structure, including its own context-menus. But what about the title and URL of a page? This is where Dublin Core comes in: the http://purl.org/dc/elements/1.1/title predicate represents the title of a page and http://purl.org/dc/elements/1.1/identifier represents its URL. Basically, that is all we need for a tree-like representation of a whole website using RDF as the source of the sitemap. Of course, that is not everything...

If you have no idea what RDF is about, it is worth investing some time to understand the basic concepts behind it. Dave Beckett maintains http://www.ilrt.bris.ac.uk/discovery/rdf/resources/, which is a good starting point for finding more information about RDF.

Namespace

The namespace of MAP is:

http://www.oaklett.org/map/1.0#

In the RDF/XML Sitemap source it looks like:

<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:map="http://www.oaklett.org/map/1.0#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">

  (...)

</rdf:RDF>
    

The prefix map after xmlns: is arbitrary and can be changed, but remember to change the XML namespace prefix in front of all MAP predicates as well.
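
As a sketch of this prefix rule, the MAP namespace can be bound to a different prefix. The prefix m below is an arbitrary choice made for illustration; the root resource and the container predicate it uses are introduced in the following sections.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:m="http://www.oaklett.org/map/1.0#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">

  <!-- "m" is bound to the MAP namespace, so every MAP predicate is written with m: -->
  <rdf:Description rdf:about="urn:sitemap:root">
    <m:container>
      <rdf:Seq>
        <rdf:li rdf:resource="mydomain:welcome"/>
      </rdf:Seq>
    </m:container>
  </rdf:Description>

</rdf:RDF>
```

Only the namespace URI identifies the predicate, so m:container here names the same predicate as map:container in the other examples.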

The Root Resource

Because RDF has no default start or root resource, we have to define a special root resource:

urn:sitemap:root

This URN is the starting point from which Navibar builds the tree. If there is no root resource in the Sitemap, no tree will be built. The urn:sitemap:root resource is a page like all others in your tree, except that it is hidden in the tree: only the children of the root resource are shown. To define the children of the root page we use the http://www.oaklett.org/map/1.0#container predicate:

<rdf:Description rdf:about="urn:sitemap:root">
  <map:container>
    <rdf:Seq>
      <rdf:li rdf:resource="mydomain:welcome"/>
      <rdf:li rdf:resource="mydomain:news"/>
      <rdf:li rdf:resource="mydomain:articles"/>
      <rdf:li rdf:resource="mydomain:forum"/>
      <rdf:li rdf:resource="mydomain:about"/>
    </rdf:Seq>
  </map:container>
</rdf:Description>

More about the container predicate follows below.

MAP Predicates

container and embedded

The most important MAP predicates are:

http://www.oaklett.org/map/1.0#container
http://www.oaklett.org/map/1.0#embedded

These predicates indicate the children of the current page resource. The example above (The Root Resource) shows the starting point of the Sitemap source, so let us define the page with the resource mydomain:articles. Only two facts are needed to describe a valid page:

<rdf:Description rdf:about="mydomain:articles">
  <dc:title>Sport-Articles</dc:title>
  <dc:identifier>http://www.mydomain.org/articles/</dc:identifier>
</rdf:Description>

The dc:title and dc:identifier elements represent the title and URL of the page. But what about the children of that page? We can define them with the container or embedded predicate:

<rdf:Description rdf:about="mydomain:articles">
  <dc:title>Sport-Articles</dc:title>
  <dc:identifier>http://www.mydomain.org/articles/</dc:identifier>
  <map:container>
    <rdf:Seq>
      <rdf:li rdf:resource="mydomain:articles:soccer"/>
      <rdf:li rdf:resource="mydomain:articles:polo"/>
      <rdf:li rdf:resource="mydomain:articles:chess"/>
    </rdf:Seq>
  </map:container>
</rdf:Description>
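
For comparison, here is a sketch of the same page with its children attached through the embedded predicate instead. Only the predicate element changes; the fragment assumes the namespace declarations shown in the Namespace section.

```xml
<rdf:Description rdf:about="mydomain:articles">
  <dc:title>Sport-Articles</dc:title>
  <dc:identifier>http://www.mydomain.org/articles/</dc:identifier>
  <!-- map:embedded instead of map:container; the list of children is unchanged -->
  <map:embedded>
    <rdf:Seq>
      <rdf:li rdf:resource="mydomain:articles:soccer"/>
      <rdf:li rdf:resource="mydomain:articles:polo"/>
      <rdf:li rdf:resource="mydomain:articles:chess"/>
    </rdf:Seq>
  </map:embedded>
</rdf:Description>
```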

The small difference between the container and embedded predicates lies in the point-of-view of a tree. Imagine a simple tree:

That simple tree is a set of tree-items seen from one point-of-view. Now imagine a tree inside a tree-item:

In fact it is a simple tree too, but with one difference: the blue tree-items can be processed in a different manner than the red tree-items. This point-of-view gives the Firefox Navibar Extension the ability to build a content-tree in the Firefox Sidebar and a context-tree. The following image clarifies the difference:

[Image: context-menu]

The above tree represented in a Sitemap source would look like:

<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	 xmlns:map="http://www.oaklett.org/map/1.0#"
	 xmlns:dc="http://purl.org/dc/elements/1.1/">

  <rdf:Description rdf:about="urn:sitemap:root">
    <map:container>
      <rdf:Seq>
	<rdf:li rdf:resource="1"/>
	<rdf:li rdf:resource="2"/>
	<rdf:li rdf:resource="3"/>
      </rdf:Seq>
    </map:container>
  </rdf:Description>

  <rdf:Description rdf:about="1">
    <dc:title>1</dc:title>
    <dc:identifier>1</dc:identifier>
  </rdf:Description>

  <rdf:Description rdf:about="2">
    <dc:title>2</dc:title>
    <dc:identifier>2</dc:identifier>
    <map:container>
      <rdf:Seq>
	<rdf:li rdf:resource="A"/>
	<rdf:li rdf:resource="B"/>
	<rdf:li rdf:resource="C"/>
      </rdf:Seq>
    </map:container>
  </rdf:Description>

  <rdf:Description rdf:about="3">
    <dc:title>3</dc:title>
    <dc:identifier>3</dc:identifier>
  </rdf:Description>

  <rdf:Description rdf:about="A">
    <dc:title>A</dc:title>
    <dc:identifier>A</dc:identifier>
  </rdf:Description>

  <rdf:Description rdf:about="B">
    <dc:title>B</dc:title>
    <dc:identifier>B</dc:identifier>
  </rdf:Description>

  <rdf:Description rdf:about="C">
    <dc:title>C</dc:title>
    <dc:identifier>C</dc:identifier>
    <map:embedded>
      <rdf:Seq>
	<rdf:li rdf:resource="I"/>
	<rdf:li rdf:resource="II"/>
	<rdf:li rdf:resource="III"/>
      </rdf:Seq>
    </map:embedded>
  </rdf:Description>

  <rdf:Description rdf:about="I">
    <dc:title>I</dc:title>
    <dc:identifier>I</dc:identifier>
  </rdf:Description>

  <rdf:Description rdf:about="II">
    <dc:title>II</dc:title>
    <dc:identifier>II</dc:identifier>
  </rdf:Description>

  <rdf:Description rdf:about="III">
    <dc:title>III</dc:title>
    <dc:identifier>III</dc:identifier>
  </rdf:Description>

</rdf:RDF>
    

Put simply, the container predicate indicates a new set of children in the current point-of-view, while the embedded predicate indicates a new set of children in a new point-of-view.

[Image: deep context-menu]

Following the rule that the container predicate indicates a new set of children in the same point-of-view, it is good practice to use the container predicate when defining a new set of children of a context-tree-item. (For simplicity, however, both predicates, container and embedded, are accepted for building sets of children deeper than the first one in the context-tree.) Furthermore, it is possible to use both predicates in one page.
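
A minimal sketch of one page that uses both predicates follows. The resource mydomain:forum comes from the root example; its title, URL and the four child resources are hypothetical names chosen for illustration.

```xml
<rdf:Description rdf:about="mydomain:forum">
  <dc:title>Forum</dc:title>
  <dc:identifier>http://www.mydomain.org/forum/</dc:identifier>
  <!-- children in the current point-of-view -->
  <map:container>
    <rdf:Seq>
      <rdf:li rdf:resource="mydomain:forum:general"/>
      <rdf:li rdf:resource="mydomain:forum:support"/>
    </rdf:Seq>
  </map:container>
  <!-- children in a new point-of-view -->
  <map:embedded>
    <rdf:Seq>
      <rdf:li rdf:resource="mydomain:forum:search"/>
      <rdf:li rdf:resource="mydomain:forum:rules"/>
    </rdf:Seq>
  </map:embedded>
</rdf:Description>
```

Each child resource still needs its own rdf:Description with dc:title and dc:identifier, as in the examples above.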

The depth of the trees, whether content-tree or context-tree, is limited only by the patience of your fingers when writing the sitemap source by hand. So you can have many nested sets of children in a context-tree too!

To represent a list in RDF you can use three container types: rdf:Alt, rdf:Bag and rdf:Seq. All of them can be used in the Sitemap source.
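
As an illustration, the sketch below uses rdf:Bag, the standard RDF container type for unordered groups, in place of rdf:Seq. The title, URL and child resources are hypothetical, and this specification does not state in which order Navibar displays the members of a Bag or an Alt.

```xml
<rdf:Description rdf:about="mydomain:news">
  <dc:title>News</dc:title>
  <dc:identifier>http://www.mydomain.org/news/</dc:identifier>
  <map:container>
    <!-- rdf:Bag instead of rdf:Seq; in RDF a Bag carries no intended order -->
    <rdf:Bag>
      <rdf:li rdf:resource="mydomain:news:2005"/>
      <rdf:li rdf:resource="mydomain:news:2004"/>
    </rdf:Bag>
  </map:container>
</rdf:Description>
```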

It is also possible to use a resource (webpage) more than once. For example, the webpage titled "Create Something" could be defined as a child of both "Authors" and "Contributors", and it would be shown as a child of each. The benefit is that you only have to describe it once.
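
The reuse described above might look like the following sketch. All resource names and URLs are hypothetical; the point is that mydomain:create is described once and referenced from two parents.

```xml
<rdf:Description rdf:about="mydomain:authors">
  <dc:title>Authors</dc:title>
  <dc:identifier>http://www.mydomain.org/authors/</dc:identifier>
  <map:container>
    <rdf:Seq>
      <rdf:li rdf:resource="mydomain:create"/>
    </rdf:Seq>
  </map:container>
</rdf:Description>

<rdf:Description rdf:about="mydomain:contributors">
  <dc:title>Contributors</dc:title>
  <dc:identifier>http://www.mydomain.org/contributors/</dc:identifier>
  <map:container>
    <rdf:Seq>
      <rdf:li rdf:resource="mydomain:create"/>
    </rdf:Seq>
  </map:container>
</rdf:Description>

<!-- described once, referenced from both parents above -->
<rdf:Description rdf:about="mydomain:create">
  <dc:title>Create Something</dc:title>
  <dc:identifier>http://www.mydomain.org/create/</dc:identifier>
</rdf:Description>
```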

Note that using webpages and their children recursively is not possible. In any case, it makes no sense to embed a resource in itself or again deeper in its own subtree.

http://www.oaklett.org/map/1.0#icon

The MAP icon predicate can be used to show your own icon instead of the plain document icon to the left of a tree-item. The value of http://www.oaklett.org/map/1.0#icon should be a valid URL (literal) that points to a 15x15 image. PNG is the preferred image format.

An example RDF fact could look like:

urn:myapp:startpage -> http://www.oaklett.org/map/1.0#icon -> http://myapp.org/images/startpage-icon.png
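
In RDF/XML the same fact could be written roughly as in the sketch below. The dc:title and dc:identifier values are assumptions added to make the page complete; the icon URL is written as element text because the value is a literal.

```xml
<rdf:Description rdf:about="urn:myapp:startpage">
  <dc:title>Startpage</dc:title>
  <dc:identifier>http://myapp.org/</dc:identifier>
  <!-- the icon URL is given as a literal, not through rdf:resource -->
  <map:icon>http://myapp.org/images/startpage-icon.png</map:icon>
</rdf:Description>
```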

Dublin Core Metadata Element Set, Version 1.1

To describe basic information about a page, Dublin Core Metadata comes into play. Dublin Core Metadata is not part of MAP; the Dublin Core Metadata Element Set and MAP complement one another. The following table is an excerpt from the Dublin Core Metadata Element Set, Version 1.1: Reference Description.

Required

Element Name: Title
Label: Title
Definition: A name given to the resource.
Comment: Typically, Title will be a name by which the resource is formally known.
Element Name: Identifier
Label: Resource Identifier
Definition: An unambiguous reference to the resource within a given context.
Comment: Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system. Formal identification systems include but are not limited to the Uniform Resource Identifier (URI) (including the Uniform Resource Locator (URL)), the Digital Object Identifier (DOI) and the International Standard Book Number (ISBN).

Optional

Element Name: Creator
Label: Creator
Definition: An entity primarily responsible for making the content of the resource.
Comment: Examples of Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity.
Element Name: Subject
Label: Subject and Keywords
Definition: A topic of the content of the resource.
Comment: Typically, Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
Element Name: Description
Label: Description
Definition: An account of the content of the resource.
Comment: Examples of Description include, but are not limited to: an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content.
Element Name: Publisher
Label: Publisher
Definition: An entity responsible for making the resource available.
Comment: Examples of Publisher include a person, an organization, or a service. Typically, the name of a Publisher should be used to indicate the entity.
Element Name: Contributor
Label: Contributor
Definition: An entity responsible for making contributions to the content of the resource.
Comment: Examples of Contributor include a person, an organization, or a service. Typically, the name of a Contributor should be used to indicate the entity.
Element Name: Date
Label: Date
Definition: A date of an event in the lifecycle of the resource.
Comment: Typically, Date will be associated with the creation or availability of the resource. Recommended best practice for encoding the date value is defined in a profile of ISO 8601 [W3CDTF] and includes (among others) dates of the form YYYY-MM-DD.
Element Name: Type
Label: Resource Type
Definition: The nature or genre of the content of the resource.
Comment: Type includes terms describing general categories, functions, genres, or aggregation levels for content. Recommended best practice is to select a value from a controlled vocabulary (for example, the DCMI Type Vocabulary [DCT1]). To describe the physical or digital manifestation of the resource, use the FORMAT element.
Element Name: Format
Label: Format
Definition: The physical or digital manifestation of the resource.
Comment: Typically, Format may include the media-type or dimensions of the resource. Format may be used to identify the software, hardware, or other equipment needed to display or operate the resource. Examples of dimensions include size and duration. Recommended best practice is to select a value from a controlled vocabulary (for example, the list of Internet Media Types [MIME] defining computer media formats).
Element Name: Source
Label: Source
Definition: A Reference to a resource from which the present resource is derived.
Comment: The present resource may be derived from the Source resource in whole or in part. Recommended best practice is to identify the referenced resource by means of a string or number conforming to a formal identification system.
Element Name: Language
Label: Language
Definition: A language of the intellectual content of the resource.
Comment: Recommended best practice is to use RFC 3066 [RFC3066] which, in conjunction with ISO639 [ISO639], defines two- and three-letter primary language tags with optional subtags. Examples include "en" or "eng" for English, "akk" for Akkadian, and "en-GB" for English used in the United Kingdom.
Element Name: Relation
Label: Relation
Definition: A reference to a related resource.
Comment: Recommended best practice is to identify the referenced resource by means of a string or number conforming to a formal identification system.
Element Name: Coverage
Label: Coverage
Definition: The extent or scope of the content of the resource.
Comment: Typically, Coverage will include spatial location (a place name or geographic coordinates), temporal period (a period label, date, or date range) or jurisdiction (such as a named administrative entity). Recommended best practice is to select a value from a controlled vocabulary (for example, the Thesaurus of Geographic Names [TGN]) and to use, where appropriate, named places or time periods in preference to numeric identifiers such as sets of coordinates or date ranges.
Element Name: Rights
Label: Rights Management
Definition: Information about rights held in and over the resource.
Comment: Typically, Rights will contain a rights management statement for the resource, or reference a service providing such information. Rights information often encompasses Intellectual Property Rights (IPR), Copyright, and various Property Rights. If the Rights element is absent, no assumptions may be made about any rights held in or over the resource.
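
As a sketch of how optional elements sit next to the two required ones, the page below adds a few of them. All values are made up for illustration, and this specification does not state how Navibar displays optional elements.

```xml
<rdf:Description rdf:about="mydomain:articles:chess">
  <!-- required -->
  <dc:title>Chess</dc:title>
  <dc:identifier>http://www.mydomain.org/articles/chess/</dc:identifier>
  <!-- optional; all values below are hypothetical -->
  <dc:creator>Jane Doe</dc:creator>
  <dc:description>Articles about chess openings and tournaments.</dc:description>
  <dc:date>2005-08-21</dc:date>
  <dc:language>en</dc:language>
</rdf:Description>
```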

Overview

The RDF facts that define a valid page with sets of children are:

dc = http://purl.org/dc/elements/1.1/
map = http://www.oaklett.org/map/1.0#
rdf = http://www.w3.org/1999/02/22-rdf-syntax-ns#

(resourceOfPage) -> dc:title      -> (titleOfPage)
                    dc:identifier -> (urlOfPage)
                    dc:*                               (optional)
                    map:container -> rdf:Seq|Bag|Alt   (optional, except for the urn:sitemap:root resource)
                    map:embedded  -> rdf:Seq|Bag|Alt   (optional)

Examples

You can copy and modify all examples for your own needs. To try them out, upload them to the public root directory of your HTTP server and rename them to sitemap.rdf. Then open any page on your domain, and the example should load in the sidebar after a few seconds.

Simple Example

This simple example is a good starting point for your own homepage. Here you can download/open the Simple Example.

Fully Extended Example

When you choose to use embedded pages, which are shown as context-menu-items, you have great flexibility in creating your own navigation structure. You can also disable the common context-menus to give your end users a clearer menu. This Sitemap example makes intensive use of container and embedded pages and additionally uses its own icons.

Here you can download/open the Fully Extended Example (221KB).

If you have installed the Firefox Navibar Extension you can try out the example in a live demo at http://myexampleapp.oaklett.org.

A Word on Filesize

Without doubt you should try to minimize the file size of the Sitemap source, whether it is served on the Internet or on an intranet. The RDF/XML format can be really greedy on file size, so it is good practice to remove useless facts and whitespace.
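
One possible way to save bytes, sketched here under assumptions: the RDF/XML syntax allows literal-valued properties to be written as attributes of rdf:Description, which states the same facts in less space. Whether Navibar's RDF parser accepts this abbreviated form is not stated in this specification, so test it before relying on it.

```xml
<!-- long form: two facts as property elements -->
<rdf:Description rdf:about="1">
  <dc:title>1</dc:title>
  <dc:identifier>1</dc:identifier>
</rdf:Description>

<!-- abbreviated form: the same two facts as property attributes -->
<rdf:Description rdf:about="1" dc:title="1" dc:identifier="1"/>
```

The two forms are alternatives; a Sitemap would contain only one of them for a given page.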

You should also compress the file before sending it to the client, which greatly decreases network traffic. To enable file compression, consult the documentation of your web server. If you use the Apache httpd web server, see: http://httpd.apache.org/docs-2.0/mod/mod_deflate.html.

If you are in a well-known network environment (e.g. an intranet), you can use larger and more complex sitemaps. Internet sitemaps should be simple and, if possible, compressed.

Statistics

The following file-size statistics are based on generated (makeLargeSitemap.pl), plain (uncompressed) sitemaps that are as simple as possible (large sample sitemap). The sitemaps contain:

Pages(width/depth) => filesize
-------------------------------
20(4/2) => 5KB
39(3/3) => 10KB
84(4/3) => 20KB
155(5/3) => 35KB
363(3/5) => 85KB
584(8/3) => 125KB
1110(10/3) => 234KB
3615(15/3) => 750KB
3905(5/5) => 858KB
111110(10/5) => 23MB
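
The page counts in this table are consistent with a complete tree in which every page has width children, down to depth levels below the hidden root. This is an observation from the numbers, not something documented for the generator script. Under that assumption, with width w and depth d, the number of pages N is:

```latex
N(w, d) = \sum_{k=1}^{d} w^{k} = \frac{w\,(w^{d} - 1)}{w - 1}, \qquad w > 1
% Checks against the table:
%   N(4, 2)  = 4 + 16               = 20
%   N(3, 5)  = 3 + 9 + 27 + 81 + 243 = 363
%   N(10, 5) = 10 + 10^2 + ... + 10^5 = 111110
```

Dividing the listed file sizes by the page counts gives roughly 0.2 to 0.26 KB per page for these minimal sitemaps, which can serve as a rough estimate for sites of other sizes.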
    

Questions?

If you have questions or comments about usage, contact Markus Siebeneicher <siebeneicher@oaklett.org>.