
7.1 Module Status

According to the specification, "Modules are classified as Proposed until accepted as Standard by members of the RSS-DEV working group or a sub-membership thereof focused on the area addressed by the module."

Currently, only three modules are classified as Standard (Dublin Core, Syndication, and Content) and at least 16 are Proposed. The Proposed classification, however, should not stop you from using a module: it indicates only the lack of a schedule for voting on the module, not a lack of merit. These modules may well be accepted as Standard in the future. To reflect this, here are the current modules, in alphabetical order.

mod_admin

The Administration module, written by Aaron Swartz and Ken Macleod, provides information on the feed's owner and the toolkit used to produce it. This helps the RSS user work with his provider to get things right, and it helps the RSS community at large to identify problems with certain systems.

Recommended Usage

It is good manners to include this module as a matter of course. The data is not dynamically created, so it can be included within a template and just left to do its job.

Namespace

The namespace prefix for this module is admin:, which should point to http://webns.net/mvcb/. Therefore, the root element of an RSS 1.0 document containing mod_admin should look like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:admin="http://webns.net/mvcb/">

Elements

The mod_admin elements occur as subelements of channel only. They consist of:

<admin:errorReportsTo rdf:resource="URI"/>

The URI is typically a mailto: URL for contacting the feed administrator to report technical errors.

<admin:generatorAgent rdf:resource="URI"/>

The URI is the home page of the software used to generate the feed. If possible, this should be a page that specifies a version number within the URI.

Example

Example 7-1. mod_admin in the channel element
<?xml version="1.0" encoding="utf-8"?> 
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:admin="http://webns.net/mvcb/"> 
  <channel rdf:about="http://rss.benhammersley.com/index.rdf">
    <title>Content Syndication with RSS</title>
    <link>http://rss.benhammersley.com</link>
    <description>Content Syndication with RSS, the blog</description>
    <admin:errorReportsTo rdf:resource="mailto:ben@benhammersley.com"/>
    <admin:generatorAgent rdf:resource="http://www.movabletype.org/?v=2.1"/>
...
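
As a rough illustration of how a consumer might read these two elements, here is a minimal Python sketch using only the standard library. The function name admin_info and the closing tags added to the sample feed are assumptions made for this example; they are not part of the module.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"
ADMIN_NS = "http://webns.net/mvcb/"

# The channel from Example 7-1, closed off so that it parses on its own.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://rss.benhammersley.com/index.rdf">
    <title>Content Syndication with RSS</title>
    <link>http://rss.benhammersley.com</link>
    <description>Content Syndication with RSS, the blog</description>
    <admin:errorReportsTo rdf:resource="mailto:ben@benhammersley.com"/>
    <admin:generatorAgent rdf:resource="http://www.movabletype.org/?v=2.1"/>
  </channel>
</rdf:RDF>"""


def admin_info(feed_xml):
    """Return the mod_admin URIs found on the channel (None when absent)."""
    root = ET.fromstring(feed_xml)
    channel = root.find(f"{{{RSS_NS}}}channel")
    info = {"errorReportsTo": None, "generatorAgent": None}
    if channel is None:
        return info
    for name in info:
        element = channel.find(f"{{{ADMIN_NS}}}{name}")
        if element is not None:
            # Both elements carry their value in the rdf:resource attribute.
            info[name] = element.get(f"{{{RDF_NS}}}resource")
    return info


info = admin_info(FEED)
print(info["errorReportsTo"])
print(info["generatorAgent"])
```

A reader application could, for instance, show the errorReportsTo address next to a feed that fails to parse.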
mod_aggregation

The Aggregation module plays a small but useful part in the life cycle of information passing through the Web. It allows news aggregators, such as Meerkat, Snewp, and so on (all covered in Chapter 12), to display the sources of their items. These services gather items from many other sources and group them by subject; mod_aggregation tells us where each item originated.

This, of course, works over generations: as long as the mod_aggregation elements are left in place, a Meerkat feed that uses a Snewp item from a Moreover feed that is itself an aggregation (for example) will still credit the original source. There is not, as yet, any feature for describing an aggregation history, however. You only know about the primary source.

Aggregators are the only people generating these elements — if you're building such a system, consider including them. The act of parsing such elements, however, is good for everyone. One can easily envisage an HTML representation of an RSS 1.0 feed with a "link via x" section. This is already done manually by many weblog owners, so why not include the feature in your RSS parsing scripts?

Namespace

mod_aggregation takes ag: as its prefix and http://purl.org/rss/1.0/modules/aggregation/ as its identifying URI. Therefore, an RSS 1.0 root element that uses it should look like this:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:ag="http://purl.org/rss/1.0/modules/aggregation/" >

Elements

mod_aggregation's elements are all subelements of item. There are three, and they are all mandatory if you are using the module:

ag:source

The name of the source of the item (no character limit).

ag:sourceURL

The URL of the source of the item (no character limit).

ag:timestamp

The time the item was published by the original source, in the ISO 8601 standard (ccyy-mm-ddThh:mm:ss+hh:mm).

Example

Example 7-2. mod_aggregation in action
<?xml version="1.0"?> 
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:ag="http://purl.org/rss/1.0/modules/aggregation/"
>   
   
  <channel rdf:about="http://meerkat.oreillynet.com/?_fl=rss1.0">
    <title>Meerkat</title>
    <link>http://meerkat.oreillynet.com</link>
    <description>Meerkat: An Open Wire Service</description>
  </channel>
   
  <items>
    <rdf:Seq>
      <rdf:li rdf:resource="http://c.moreover.com/click/here.pl?r123" />
    </rdf:Seq>
  </items>
   
  <item rdf:about="http://c.moreover.com/click/here.pl?r123" >
    <title>XML: A Disruptive Technology</title>
    <link>http://c.moreover.com/click/here.pl?r123</link>
    <description>
    XML is placing increasingly heavy loads on the existing technical
    infrastructure of the Internet.
    </description>
    <ag:source>XML.com</ag:source>
    <ag:sourceURL>http://www.xml.com</ag:sourceURL>
    <ag:timestamp>2000-01-01T12:00+00:00</ag:timestamp>
  </item>
</rdf:RDF>
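
The "link via x" idea mentioned earlier could be sketched along the following lines. This is only an illustration in Python: the function name render_items and the HTML layout are assumptions for the example, and the sample feed is a trimmed copy of Example 7-2.

```python
import xml.etree.ElementTree as ET
from html import escape

NS = {
    "rss": "http://purl.org/rss/1.0/",
    "ag": "http://purl.org/rss/1.0/modules/aggregation/",
}

# A trimmed copy of the item from Example 7-2.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:ag="http://purl.org/rss/1.0/modules/aggregation/">
  <item rdf:about="http://c.moreover.com/click/here.pl?r123">
    <title>XML: A Disruptive Technology</title>
    <link>http://c.moreover.com/click/here.pl?r123</link>
    <ag:source>XML.com</ag:source>
    <ag:sourceURL>http://www.xml.com</ag:sourceURL>
    <ag:timestamp>2000-01-01T12:00+00:00</ag:timestamp>
  </item>
</rdf:RDF>"""


def render_items(feed_xml):
    """Return one HTML line per item, crediting the original source if known."""
    root = ET.fromstring(feed_xml)
    lines = []
    for item in root.findall("rss:item", NS):
        title = item.findtext("rss:title", default="", namespaces=NS)
        link = item.findtext("rss:link", default="", namespaces=NS)
        html = f'<a href="{escape(link)}">{escape(title)}</a>'
        source = item.findtext("ag:source", namespaces=NS)
        source_url = item.findtext("ag:sourceURL", namespaces=NS)
        # Only add the credit when both the name and the URL are present.
        if source and source_url:
            html += f' (via <a href="{escape(source_url)}">{escape(source)}</a>)'
        lines.append(html)
    return lines


rendered = render_items(FEED)
print(rendered[0])
```

Items without the mod_aggregation elements simply render as a plain link, so the same function works for ordinary feeds.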
mod_annotation

mod_annotation is the smallest module. It consists of one element, which refers to a URL where a discussion of the item is being held. It might point to a discussion group, a commenting service, Usenet, an Annotea service, etc.

For sites that host such discussions, the addition of this module into the RSS feed should be simple and worthwhile. Weblogs, for example, might only need to point the element to the URL of the main entry page for a particular item.

If you want to parse this module into HTML, you should, as with many of these modules, have no problems simply assigning a separate div or span for the contents of the element, wrapping it within an <a href="URL">, and formatting it as you wish. This would probably only make sense if your parser is also taking notice of either the description element or the data provided by mod_content, simply because it is hard to have a discussion based solely on a headline.

Namespace

mod_annotation is identified by the namespace prefix annotate: and the URI http://purl.org/rss/1.0/modules/annotate/. Hence, the root element looks like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:annotate="http://purl.org/rss/1.0/modules/annotate/"
>

Element

There's only one element, a subelement of item, and here it is:

<annotate:reference rdf:resource="URL" />

The URL points to a discussion on the item.

However, this element can also take subelements of its own from the Dublin Core modules, mod_dublincore and mod_DCTerms. We'll cover these modules soon, but Example 7-4 will give you an idea.

Do you see how the namespaces system works? In Example 7-3, we have a feed using only the mod_annotation system. We've added one additional namespace and used the element correctly. In Example 7-4, we want to use another module to describe something in terms that the currently available elements cannot. So we decide upon mod_dublincore, add in the namespace declaration, and go ahead.

Also notice that in Example 7-3 annotate:reference is a one-line element, with a closing />, whereas in Example 7-4 annotate:reference contains the mod_dublincore elements before closing. This means that the mod_dublincore elements refer to annotate:reference (that is, to the discussion), not to the item or channel. As we'll see, mod_dublincore can get addictive, and you might find yourself describing everything in your feed. This is not bad at all, but it may get confusing. By paying attention to which elements are within which, you can see what is happening.

Examples

Example 7-3. mod_annotation inside an item element
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:annotate="http://purl.org/rss/1.0/modules/annotate/"
>

<item rdf:about="http://www.example.com/item1">
    <title>RSS 0.9 or RSS 1.0...Discuss</title>
    <link>http://www.example.com/item1</link>
    <annotate:reference rdf:resource="http://www.example.com/discuss/item1"/>
</item>
Example 7-4. mod_annotation with additional mod_dublincore data
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:annotate="http://purl.org/rss/1.0/modules/annotate/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
>
   
. . .
   
   
<item rdf:about="http://www.example.com/item1">
    <title>RSS 0.9 or RSS 1.0...Discuss</title>
    <link>http://www.example.com/item1</link>
    <annotate:reference rdf:resource="http://www.example.com/discuss/item1">
      <dc:subject>XML</dc:subject>
      <dc:description>A discussion group on the subject in hand</dc:description>
    </annotate:reference>
</item>
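
The point about nesting can be checked with a parser. The following Python sketch, which assumes a well-formed copy of the Example 7-4 item wrapped in a closed rdf:RDF element, shows that the dc:subject belongs to annotate:reference and not to the item itself. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "annotate": "http://purl.org/rss/1.0/modules/annotate/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:annotate="http://purl.org/rss/1.0/modules/annotate/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://www.example.com/item1">
    <title>RSS 0.9 or RSS 1.0...Discuss</title>
    <link>http://www.example.com/item1</link>
    <annotate:reference rdf:resource="http://www.example.com/discuss/item1">
      <dc:subject>XML</dc:subject>
      <dc:description>A discussion group on the subject in hand</dc:description>
    </annotate:reference>
  </item>
</rdf:RDF>"""

root = ET.fromstring(FEED)
item = root.find("rss:item", NS)
reference = item.find("annotate:reference", NS)

# A bare "dc:subject" path matches direct children only; it does not descend.
item_subjects = [e.text for e in item.findall("dc:subject", NS)]
discussion_subjects = [e.text for e in reference.findall("dc:subject", NS)]
discussion_url = reference.get(f"{{{NS['rdf']}}}resource")

print(item_subjects)        # subjects attached to the item itself
print(discussion_subjects)  # subjects attached to the discussion
print(discussion_url)
```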
mod_audio

mod_audio is the first of the RSS 1.0 modules we have seen that points at something other than a text page. It is specifically designed for the syndication of MP3 files — its elements matching those of the ID3 tag standard — but it can be used for any audio format.

It was designed by Brian Aker, who also wrote the mp3 module for the Apache web server. That Apache module not only streams MP3s from a server, but also creates RSS playlists.

If you're syndicating audio, or pointing at feeds that are syndicating audio, this is a must. Also, consider using mod_streaming, the module for streaming.

Namespace

mod_audio uses the prefix audio: and is identified by the URI http://media.tangent.org/rss/1.0/. Hence:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns="http://purl.org/rss/1.0/"
          xmlns:audio="http://media.tangent.org/rss/1.0/" >

Elements

mod_audio elements are all subelements of item. None of them are mandatory to the module, but you should make an effort to include as many as possible per track.

audio:songname

The title of the song.

audio:artist

The name of the artist.

audio:album

The name of the album.

audio:year

The year of the track.

audio:comment

Any text comment on the track.

audio:genre

The genre of the track (should match genre_id).

audio:recording_time

The length of the track in seconds.

audio:bitrate

The bitrate of the track, in kbps.

audio:track

The number of the track on the album.

audio:genre_id

The genre ID number, as defined by the ID3 standard.

audio:price

The price of the track, if you're selling it.

Example

Example 7-5. An item using mod_audio
<item rdf:about="http://www.example.com/boyband.mp3" >
     <title>BoyBand's Latest Track!</title>
     <description>The latest track from the fab five.</description>
     <link>http://www.example.com/boyband.mp3</link>
     <audio:songname>One Likes to Get Funky</audio:songname>
     <audio:artist>BoyBand</audio:artist>
     <audio:album>Not Just Another</audio:album>
     <audio:year>2005</audio:year>
     <audio:genre>Top 40</audio:genre>
     <audio:genre_id>60</audio:genre_id>
</item>

Applications

It could be said that some of these elements are superfluous, since they can be replaced by other elements (for example, audio:songname could be replaced by title). This is true in many cases, but it is much neater to use a simple MP3 tag-reading script to generate the RSS and map ID3 elements across directly. There are many ID3 tag-reading libraries available, including Chris Nandor's MP3::Info for Perl.
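
To illustrate the direct mapping, here is a minimal Python sketch that turns a dictionary of ID3-style tags into an item. It assumes the tags have already been read by whatever ID3 library you prefer; the dictionary keys, the ID3_TO_AUDIO table, and the function name audio_item are hypothetical choices for this example, not part of mod_audio.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"
AUDIO_NS = "http://media.tangent.org/rss/1.0/"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("audio", AUDIO_NS)

# Hypothetical tag keys (left) mapped to mod_audio element names (right).
ID3_TO_AUDIO = {
    "title": "songname",
    "artist": "artist",
    "album": "album",
    "year": "year",
    "comment": "comment",
    "genre": "genre",
    "track": "track",
}


def audio_item(url, tags):
    """Build an RSS 1.0 item for one track from a dict of ID3-style tags."""
    item = ET.Element(f"{{{RSS_NS}}}item", {f"{{{RDF_NS}}}about": url})
    ET.SubElement(item, f"{{{RSS_NS}}}title").text = str(tags.get("title", url))
    ET.SubElement(item, f"{{{RSS_NS}}}link").text = url
    for tag_key, element_name in ID3_TO_AUDIO.items():
        value = tags.get(tag_key)
        # Skip tags that are missing or empty; no element is mandatory.
        if value is not None and value != "":
            ET.SubElement(item, f"{{{AUDIO_NS}}}{element_name}").text = str(value)
    return item


track = audio_item(
    "http://www.example.com/boyband.mp3",
    {"title": "One Likes to Get Funky", "artist": "BoyBand", "year": 2005},
)
print(ET.tostring(track, encoding="unicode"))
```

Because every mod_audio element is optional, the function emits only the elements for which a tag is actually present.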

mod_changedpage

mod_changedpage does for RSS 1.0 what the cloud element does for RSS 0.9x — it introduces a form of Publish and Subscribe. We'll discuss Publish and Subscribe in detail in Chapter 12, but basically it enables a system in which you can "subscribe" to a feed and be notified when something new is published.

mod_changedpage uses only one element, which points to a changedPage server. Users wishing to be told when the feed has updated send an HTTP POST request of a certain format to this server. Upon updating, this server sends a similar POST request back to the user. The user's client then knows about the update. Again, Chapter 12 examines this in detail.

Namespace

mod_changedpage takes the namespace prefix cp: and is identified by the URI http://my.theinfo.org/changed/1.0/rss/. Hence, its declaration looks like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"  
         xmlns:cp="http://my.theinfo.org/changed/1.0/rss/">

Element

mod_changedpage takes only one element, a subelement of channel:

<cp:server rdf:resource="URL" />

The URL is the address of the changedPage server.

Example

Example 7-6. mod_changedpage in the channel
<?xml version="1.0" encoding="utf-8"?> 
   
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:cp="http://my.theinfo.org/changed/1.0/rss/"
>
   
<channel rdf:about="http://meerkat.oreillynet.com/?_fl=rss1.0">
  <title>Meerkat</title>
  <link>http://meerkat.oreillynet.com</link>
  <description>Meerkat: An Open Wire Service</description>
  <cp:server rdf:resource="http://example.org/changedPage" />
</channel>
...
mod_company

mod_company allows RSS feeds to deliver business news metadata. Like mod_audio, this is another example of RSS 1.0 stretching the bounds of RSS functionality; this module could lead to RSS being used as a specialist business news vehicle rather than just a generalized list of links.

Namespace

mod_company takes the namespace prefix company: and is identified by the URI http://purl.org/rss/1.0/modules/company/. By now you'll realize that this means the root element of an RSS 1.0 document containing mod_company will resemble this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns="http://purl.org/rss/1.0/"
          xmlns:company="http://purl.org/rss/1.0/modules/company/">

Elements

mod_company provides four elements, all of which are subelements of item. None of them are defined as mandatory, but there's little hassle and much reward in including all of them.

company:name

The name of the company.

company:symbol

The ticker symbol of the company's stock.

company:market

The abbreviation of the market in which the stock is traded.

company:category

The category of the company, expressed using the Taxonomy module. For more details, see mod_taxonomy later in this chapter.

Example

Example 7-7. mod_company being used within an item
<item rdf:about="http://www.example.com/financial_news/00001.html">
     <title>Cisco Stock moves either up or down!</title>
     <description>A brief story about a thing happening today</description>
     <link>http://www.example.com/financial_news/00001.html</link>
     <company:symbol>CSCO</company:symbol>
     <company:market>NASDAQ</company:market>
     <company:name>Cisco Systems Inc.</company:name>
     <company:category>
       <taxo:topic rdf:resource="http://dmoz.org/Computers/Data_Communications/Vendors/Manufacturers/" />
     </company:category>
</item>
mod_content

mod_content is perhaps the most misunderstood module of all. Its purpose is not only to allow much richer content (the entire site, images and all, for example) to be included within an RSS 1.0 item, but also to give a complete RDF description of this content. Now, not only can we make RDF graphs from channel to item, but we can also make them from item to an image within an item. An RDF query of "Find all the feeds that point to articles accompanied by a picture of an elephant" can now be executed easily, as mod_content provides not just the content itself, but the relationship metadata as well. It can also be used to split the object to which an item points into smaller sections, from the standpoint of an RDF parser.

The syntax for this can look a little long-winded (RDF is rather verbose when written in XML) and, because of this, mod_content feeds can often look scary. They're not really, and reformatting them in a text editor can give you an idea of what is happening. Despite this apparent complexity, it is one of the few modules to have been officially accepted by the RSS-DEV working group.

It must be noted that mod_content is not to be confused with the core specification's description subelement of item. Some RSS 1.0 feeds use description to contain the content the item represents. While this may be common practice with RSS 0.9x users, RSS 1.0 users may wish to do it properly. description is for a description of the content; mod_content is for the content itself.

Namespace

mod_content is identified by the namespace prefix content: and the URI http://purl.org/rss/1.0/modules/content/. Hence, the root element looks like this:

<rdf:RDF  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns="http://purl.org/rss/1.0/"
          xmlns:content="http://purl.org/rss/1.0/modules/content/">

Elements

mod_content is slightly more complex than the other modules—it has a specific structure that must be followed. It consists of one element with various subelements that have important attributes of their own, some of which are mandatory, while others are not.

The first element, content:items, is a subelement of item. It consists of an rdf:Bag that contains as many content:item elements as needed, each enveloped in an rdf:li element, as shown in Example 7-8.

Example 7-8. The basic structure of mod_content items
<item>
...
<content:items>
<rdf:Bag>
  <rdf:li>
      <content:item rdf:about=""/>
  </rdf:li>
  <rdf:li>
      <content:item />
  </rdf:li>
</rdf:Bag>
</content:items>
</item>

Notice that one of the content:item elements in Example 7-8 has an rdf:about attribute, but the other does not. This difference is to show that if the content is available on the Web at a specific address, the rdf:about attribute contains the URI of the content, including any part of the content that is directly addressable (an image, for example). Hence, a deeper level of RDF relationship is declared.

Now, you will also notice that the content:item element in Example 7-8 is empty. This is not much use, so we'll look into filling it. Content, as you know, can come in many formats: plain text, HTML 4.0, XHTML 1.1, and so on. What you do with such content depends on its format, so mod_content needs to be able to describe the format. It does this with a content:format subelement.

This element takes one attribute, rdf:resource, which points to a URI that represents the format of the content. Basically, this attribute declares the namespace of the content. For example, for XHTML 1.0 Strict, the URI is http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict. The URI for HTML 4.0 is http://www.w3.org/TR/html4/. Further examples can be found in the RDDL natures document at http://www.rddl.org/natures/.

The content:format element is required. If you don't include it, you force anyone parsing your feed to guess your content's format.

Because you have declared the format of the content:item using an RDF declaration, you must now envelop the actual content inside an rdf:value element. Example 7-9 shows a simple version.

Example 7-9. A simple version of a mod_content item
<item>
...
<content:items>
  <rdf:Bag>
    <rdf:li>
      <content:item>
      <content:format rdf:resource="http://www.w3.org/TR/html4/" />
        <rdf:value>
          <![CDATA[<em>This is<strong>very</em> cool</strong>.]]>
        </rdf:value>
      </content:item>
    </rdf:li>
  </rdf:Bag>
</content:items>
</item>

Example 7-9 shows a single item containing a single content:item, containing a line of HTML 4.0 that reads <em>This is<strong>very</em> cool</strong>. Note that the HTML content is encased in a CDATA section. As with all XML (see Appendix A for details), non-XML-compliant content must be wrapped away in this manner inside an RSS feed.

HTML, however, is not the only content type, and newer content types are fully XML-compliant. XHTML, for example, does not need to be wrapped away, as long as the parser is made aware that the contents of the rdf:value element can be treated accordingly. For this, we use rdf:value's optional attributes, rdf:parseType and xmlns. Example 7-10 shows the same content as Example 7-9, but reformatted into XHTML. Note the differences in the content:format, content:encoding, and rdf:value elements.

Example 7-10. A simple version of a mod_content item, with XHTML
<item>
...
<content:items>
  <rdf:Bag>
    <rdf:li>
      <content:item>
      <content:format rdf:resource="http://www.w3.org/1999/xhtml"/>
       <content:encoding rdf:resource="http://www.w3.org/TR/REC-xml#dt-wellformed" />
        <rdf:value rdf:parseType="Literal" xmlns="http://www.w3.org/1999/xhtml">
           <em>This is <strong>very</strong> </em> <strong>cool</strong>.
        </rdf:value>
      </content:item>
    </rdf:li>
  </rdf:Bag>
</content:items>
</item>

In Example 7-10, we've told the rdf:value element that its contents are both parsable as XML and in the namespace represented by the URI http://www.w3.org/1999/xhtml. We declare all of this to prevent RDF parsers from getting confused. We humans, of course, are anything but.

The content itself is now well-formed XML. To show this, we can include a new subelement of content:item , the optional content:encoding. This points to the rdf:resource of the URI of well-formed XML, http://www.w3.org/TR/REC-xml#dt-wellformed.

If no content:encoding is present, we assume that the content is plain character data, either enclosed in a CDATA section or surrounded by escaped characters such as &lt;b&gt;.

In summary:

content:items

Contains a subelement of rdf:Bag.

rdf:Bag

Contains one or more subelements of rdf:li.

rdf:li

Contains a mandatory subelement of content:item.

content:item

Takes the mandatory subelements content:format and rdf:value, and the optional subelement content:encoding. content:item must take the attribute rdf:about="URI" if the object can be directly addressed.

content:format

Takes the attribute rdf:resource="URI", where the URI represents the format of the content.

rdf:value

Contains the actual content. It can take two attributes. If its content is well-formed XML, it must take the attribute rdf:parseType="Literal" and an xmlns attribute naming the namespace of the content (for XHTML, xmlns="http://www.w3.org/1999/xhtml").

content:encoding

Takes the attribute rdf:resource="URI", where the URI represents the format in which the content is encoded.
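
The summary above translates fairly directly into parsing logic. The following Python sketch is one possible reading of it, not a reference implementation: the function name content_items and the dictionary keys it returns are assumptions for the example. It treats a content:item as XML only when content:encoding points to the well-formed XML URI, and otherwise takes the value as character data.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}
RESOURCE = f"{{{NS['rdf']}}}resource"
WELLFORMED_XML = "http://www.w3.org/TR/REC-xml#dt-wellformed"

FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:content="http://purl.org/rss/1.0/modules/content/">
<item rdf:about="http://example.org/item/">
 <content:items>
  <rdf:Bag>
   <rdf:li>
    <content:item>
     <content:format rdf:resource="http://www.w3.org/1999/xhtml" />
     <content:encoding rdf:resource="http://www.w3.org/TR/REC-xml#dt-wellformed" />
     <rdf:value rdf:parseType="Literal" xmlns="http://www.w3.org/1999/xhtml"><em>This is <strong>very</strong> cool</em></rdf:value>
    </content:item>
   </rdf:li>
   <rdf:li>
    <content:item>
     <content:format rdf:resource="http://www.w3.org/TR/html4/" />
     <rdf:value><![CDATA[<em>Escaped</em> HTML]]></rdf:value>
    </content:item>
   </rdf:li>
  </rdf:Bag>
 </content:items>
</item>
</rdf:RDF>"""


def content_items(item):
    """Return the format, encoding flag, and body of each content:item."""
    results = []
    for ci in item.findall("content:items/rdf:Bag/rdf:li/content:item", NS):
        fmt = ci.find("content:format", NS)
        enc = ci.find("content:encoding", NS)
        value = ci.find("rdf:value", NS)
        is_xml = enc is not None and enc.get(RESOURCE) == WELLFORMED_XML
        body = ""
        if value is not None:
            body = value.text or ""
            if is_xml:
                # Well-formed XML content arrives as child elements.
                body += "".join(ET.tostring(c, encoding="unicode") for c in value)
        results.append({
            "format": None if fmt is None else fmt.get(RESOURCE),
            "is_xml": is_xml,
            "body": body,
        })
    return results


parsed = content_items(ET.fromstring(FEED).find("rss:item", NS))
for entry in parsed:
    print(entry["format"], entry["is_xml"])
```

Note that ElementTree chooses its own namespace prefixes when it serializes the XHTML children, so the body of an XML-encoded item may not match the source text character for character.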

Examples

Example 7-11. A fully mod_contented item
<item rdf:about="http://example.org/item/">
 <title>The Example Item</title>
 <link>http://example.org/item/</link>
 <description>I am an example item</description>
 <content:items>
  <rdf:Bag>
   
   <rdf:li>
    <content:item>
    <content:format rdf:resource="http://www.w3.org/1999/xhtml" />
    <content:encoding rdf:resource="http://www.w3.org/TR/REC-xml#dt-wellformed" />
     <rdf:value rdf:parseType="Literal" xmlns="http://www.w3.org/1999/xhtml">
      <em>This is a <strong>very cool</strong> example of mod_content</em>
     </rdf:value>
    </content:item>
   </rdf:li>
   
   <rdf:li>
    <content:item>
     <content:format rdf:resource="http://www.w3.org/TR/html4/" />
     <rdf:value>
      <![CDATA[You can include content in lots of formats. <a                
       href="http://www.oreillynet.com">links</a> too. ]]>
     </rdf:value>
    </content:item>
   </rdf:li>
   
  </rdf:Bag>
 </content:items>
</item>

It may either amuse or terrify you to realize that as content:item can contain any XML-formatted content, it can itself contain other RSS feeds. This might be of use for an RSS tutorial website, syndicating its lessons. Here, in Example 7-12, is an early version of this very section of this book, represented as an item, stopping right here to prevent a spiral of recursion.

Example 7-12. This page, formatted into an RSS 1.0 item
<item rdf:about="http://example.org/item/">
	<title>Examples</title>
	<description>The text of the first part of the Examples section of the mod_content 
	bit of chapter 7 of Content Syndication with XML and RSS</description>
	   
	<content:items>
	<rdf:Bag>
	   
	<rdf:li>
	<content:item>
	<content:format rdf:resource="http://www.w3.org/TR/html4/" />
	<rdf:value>
	<![CDATA[ <h2>Examples</h2>]]>
	</rdf:value>
	</content:item>
	</rdf:li>
	   
	<rdf:li>
	<content:item>
	<content:format rdf:resource="http://purl.org/rss/1.0/" />
	<content:encoding rdf:resource="http://www.w3.org/TR/REC-xml#dt-wellformed" />
	<rdf:value rdf:parseType="Literal" 
		   xmlns="http://purl.org/rss/1.0/"
		   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
		   xmlns:content="http://purl.org/rss/1.0/modules/content/">
	<item rdf:about="http://example.org/item/">
	<title>The Example Item</title>
	<link>http://example.org/item/</link>
	<description>I am an example item</description>
	<content:items><rdf:Bag>
	<rdf:li><content:item>
	<content:format rdf:resource="http://www.w3.org/1999/xhtml" />
	<content:encoding rdf:resource="http://www.w3.org/TR/REC-xml#dt-wellformed" />
	<rdf:value rdf:parseType="Literal" xmlns="http://www.w3.org/1999/xhtml"><em>This is a <strong>very cool</strong> example of mod_content</em></rdf:value>
	</content:item></rdf:li>
	<rdf:li><content:item>
	<content:format rdf:resource="http://www.w3.org/TR/html4/" />
	<rdf:value><![CDATA[You can include content in lots of formats. <a href="http://www.oreillynet.com">links</a> too. ]]></rdf:value>
	</content:item></rdf:li>
	</rdf:Bag></content:items>
	</item>
	</rdf:value>
	</content:item>
	</rdf:li>
	   
	<rdf:li>
	<content:item>
	<content:format rdf:resource="http://www.w3.org/TR/html4/" />
	<rdf:value><![CDATA[ <p><i> Example 7.12 A fully mod_contented &lt;item&gt;</i></p><p>
	It may either amuse or terrify you to realize that as &lt;content:item&gt; can
	contain any XML-formatted content, it can itself contain other RSS feeds.
	This might be of use for a RSS tutorial website, syndicating its lessons.
	Here, in example 7.13, is this very section of this book, represented as an
	&lt;item&gt;, stopping right here to prevent a spiral of recursion.</p>]]>
	</rdf:value>
	</content:item>
	</rdf:li>
	</rdf:Bag>
	</content:items>
</item>
mod_dublincore

The second of the Standard modules to be examined in this chapter, mod_dublincore is the most-used of all the RSS 1.0 modules. It allows an RSS 1.0 feed to express the additional metadata formalized by the Dublin Core Metadata Initiative. Chapter 5 discusses this initiative in detail, so let's move on to the details of the module itself.

Namespace

mod_dublincore is identified by the prefix dc: and the URI http://purl.org/dc/elements/1.1/. So, in the grand tradition, the root element appears as:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
>

Elements

mod_dublincore can be used in two ways: the simpler and the more RDF-based.

In either usage, mod_dublincore elements are entirely optional and can be applied to the channel, an item, an image, a textinput element, or all of them, as liberally as you wish, as long as the information you are relating makes sense. It is rather addictive, I must say, and I encourage you to put Dublin Core metadata all over your feeds. Here's what we can include:

dc:title

The title of the item.

dc:creator

The name of the creator of the item (i.e., a person, organization, or system). If the creator is a person, this information is customarily in the format Firstname Lastname (email@domain.com).

dc:subject

The subject of the item.

dc:description

A brief description of the item.

dc:publisher

The name of the publisher, either a person or an organization. If the publisher is a person, this information is customarily in the format Firstname Lastname (email@domain.com).

dc:contributor

The name of a contributor, customarily in the format Firstname Lastname (email@domain.com).

dc:date

The publishing date, in the W3CDTF format (e.g., 2000-01-01T12:00+00:00).

dc:type

The nature of the item, taken from the list of Dublin Core types at http://dublincore.org/documents/dcmi-type-vocabulary/:

Collection

A collection is an aggregation of items, described as a group; its parts can be described and navigated separately (for example, a weblog).

Dataset

A dataset is information encoded in a defined structure (for example, lists, tables, and databases), intended to be useful for direct machine processing.

Event

According to the official definition of the Dublin Core authors, an event is a nonpersistent, time-based occurrence. Examples include any exhibition, webcast, conference, workshop, open-day, performance, battle, trial, wedding, tea-party, conflagration, or orgy. The soon-to-be-described mod_event has a lot to do with this sort of thing.

Image

They are worth a thousand words, you know.

Interactive resource

The official Dublin Core definition of an interactive resource is "a resource which requires interaction from the user to be understood, executed, or experienced. For example - forms on web pages, applets, multimedia learning objects, chat services, virtual reality." In the RSS world, such resources could be either pointers to programs or the textinput element itself.

Service

Technically, a service is a system that provides one or more functions of value to the end user. Assuming that just providing information doesn't count, a service could be used to point to web applications or web services, as long as you create an RSS feed that provides the necessary details (using mod_content to syndicate WSDL files, for example).

Software

You know what software is. In this case, it is distinguished from an interactive resource by being downloadable, rather than run on a remote server.

Sound

Officially, a sound is a resource with content primarily intended to be rendered as audio.

Text

Plain text content.

dc:format

This differs from dc:type by a degree of sophistication. Whereas dc:type provides a top-level indication of the feed's nature, dc:format should point to the exact MIME type of the content itself.

dc:identifier

The identifier should be an unambiguous reference to the resource within a given context. So, in RSS 1.0 terms, this is the same as the item's rdf:about attribute.

dc:source

In RSS 1.0 terms, this element can do the same job as the ag:sourceURL of the mod_aggregation module. It should point to an unambiguous reference of the source of the item. Unlike the ag:sourceURL element, however, dc:source is not restricted to URLs. Any sufficiently unambiguous reference works (ISBN numbers, for example).

dc:language

The language in which the item is written, using the standard language code, as covered in Appendix B.

dc:relation

The URI of a related resource. See mod_DCTerms later in this chapter for more details.

dc:coverage

According to the Dublin Core authors, "Coverage will typically include spatial location (a place name or geographic coordinates), temporal period (a period label, date, or date range), or jurisdiction (such as a named administrative entity). Recommended best practice is to select a value from a controlled vocabulary (for example, the Thesaurus of Geographic Names [TGN]) and that, where appropriate, named places or time periods be used in preference to numeric identifiers such as sets of coordinates or date ranges."

dc:rights

This element should contain any copyright, copyleft, public domain, or similar declaration. The absence of this element does not imply anything whatsoever.

The more complex version of mod_dublincore adds RDF and the mod_taxonomy module to give a richer meaning to dc:subject. For example, dc:subject can be used simply like this:

<dc:subject>World Cup</dc:subject>

or combined with a definition of a topic, in a richer RDF version:

<dc:subject>
  <rdf:Description>
    <taxo:topic rdf:resource="http://dmoz.org/Sports/Soccer/" />
    <rdf:value>World Cup</rdf:value>
  </rdf:Description>
</dc:subject>

This not only defines the subject, but also provides it with a wider contextual meaning. In this example, we're saying the subject is "the World Cup of soccer" (or, more correctly, we're saying that "this item is on the subject represented by the term 'World Cup' in the context provided by the URI http://dmoz.org/Sports/Soccer/"). After all, there is more than one World Cup. This approach is especially useful for describing homonyms, such as:

<dc:subject>
  <rdf:Description>
    <taxo:topic rdf:resource="http://dmoz.org/Business/Industries/Food_and_Related_Products/Beverages/Soft_Drinks" />
    <rdf:value>Coke</rdf:value>
  </rdf:Description>
</dc:subject>

as opposed to:

<dc:subject>
  <rdf:Description>
    <taxo:topic rdf:resource="http://dmoz.org/Health/Addictions/Substance_Abuse/Illegal_Drugs/" />
    <rdf:value>Coke</rdf:value>
  </rdf:Description>
</dc:subject>
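
A consumer has to cope with both forms of dc:subject. The following is a minimal Python sketch of how that might be done with the standard library; the function name read_subject and the sample item are our own, made up for illustration, and are not part of any module specification.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "taxo": "http://purl.org/rss/1.0/modules/taxonomy/",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]


def read_subject(subject):
    """Return (label, topic_uri) for one dc:subject element.

    topic_uri is None for the simple, text-only form.
    """
    desc = subject.find("rdf:Description", NS)
    if desc is None:
        # Simple form: <dc:subject>World Cup</dc:subject>
        return (subject.text or "").strip(), None
    # Rich form: the label lives in rdf:value, the context in taxo:topic
    label = desc.findtext("rdf:value", default="", namespaces=NS).strip()
    topic = desc.find("taxo:topic", NS)
    uri = topic.get(RDF_RESOURCE) if topic is not None else None
    return label, uri


# Hypothetical item carrying one subject in each form
SAMPLE = """<item xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/">
  <dc:subject>World Cup</dc:subject>
  <dc:subject>
    <rdf:Description>
      <taxo:topic rdf:resource="http://dmoz.org/Sports/Soccer/" />
      <rdf:value>World Cup</rdf:value>
    </rdf:Description>
  </dc:subject>
</item>"""

subjects = [read_subject(s) for s in ET.fromstring(SAMPLE).findall("dc:subject", NS)]
```

Both subjects carry the same label, but only the second tells you which World Cup is meant.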

Example

Example 7-13. An RSS 1.0 feed with mod_dublincore
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns="http://purl.org/rss/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
>   
   
<channel rdf:about="http://meerkat.oreillynet.com/?_fl=rss1.0">
  <title>Meerkat</title>
  <link>http://meerkat.oreillynet.com</link>
  <description>Meerkat: An Open Wire Service</description>
  <dc:publisher>The O'Reilly Network</dc:publisher>
  <dc:creator>Rael Dornfest (mailto:rael@oreilly.com)</dc:creator>
  <dc:rights>Copyright &#169; 2000 O'Reilly &amp; Associates, Inc.</dc:rights>
  <dc:date>2000-01-01T12:00+00:00</dc:date>
  <dc:type>Interactive Resource</dc:type>
  <image rdf:resource="http://meerkat.oreillynet.com/icons/meerkat-powered.jpg" />
  <textinput rdf:resource="http://meerkat.oreillynet.com" />
   
  <items>
    <rdf:Seq>
      <rdf:li resource="http://c.moreover.com/click/here.pl?r123" />
    </rdf:Seq>
  </items>
   
</channel>
   
<image rdf:about="http://meerkat.oreillynet.com/icons/meerkat-powered.jpg">
  <title>Meerkat Powered!</title>
  <url>http://meerkat.oreillynet.com/icons/meerkat-powered.jpg</url>
  <link>http://meerkat.oreillynet.com</link>
  <dc:creator>Rael Dornfest (mailto:rael@oreilly.com)</dc:creator>
  <dc:type>image</dc:type>
</image>
   
<textinput rdf:about="http://meerkat.oreillynet.com">
  <title>Search Meerkat</title>
  <description>Search Meerkat's RSS Database...</description>
  <name>s</name>
  <link>http://meerkat.oreillynet.com/</link>
</textinput>
   
<item rdf:about="http://c.moreover.com/click/here.pl?r123">
  <title>XML: A Disruptive Technology</title>
  <link>http://c.moreover.com/click/here.pl?r123</link>
  <dc:description>This is the description of the article</dc:description>
  <dc:publisher>The O'Reilly Network</dc:publisher>
  <dc:creator>Simon St.Laurent (mailto:simonstl@simonstl.com)</dc:creator>
  <dc:rights>Copyright &#169; 2000 O'Reilly &amp; Associates, Inc.</dc:rights>
  <dc:subject>XML</dc:subject>
</item>
</rdf:RDF>
mod_DCTerms

Once Dublin Core metadata has sunk its I-must-add-metadata-to-everything addictive nature into your very soul, you soon realize that the core terms are lacking in depth. For example, dc:relation means "is related to," but in what way? We don't know, unless we use mod_DCTerms.

mod_DCTerms introduces 28 new subelements to channel, item, image, and textinput, as appropriate. These subelements are related, within Dublin Core, to the core elements found within mod_dublincore, but mod_DCTerms does not express this relationship. For example, dcterms:created is actually a refinement of dc:date.

Namespace

mod_DCTerms takes the namespace prefix dcterms: and is identified by the URI http://purl.org/dc/terms/. So, the root element looks like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dcterms="http://purl.org/dc/terms/"
>

Elements

You have a lot to choose from with this module. As we've said, the elements can be subelements of channel, item, image, or textinput. Apply liberally and with gusto.

dcterms:alternative

An alternative title for the item. For example:

<title>Programming Perl</title>
<dc:title>Programming Perl</dc:title>
<dcterms:alternative>The Camel Book</dcterms:alternative>
dcterms:created

The date the object was created, in W3CDTF format (for example, YYYY-MM-DDThh:mm:ssTZD, where TZD is the time zone designator).

dcterms:issued

The date the object was first made available. This should be used, for backward compatibility, with dc:date, and it should contain the same value. Again, the date must be in W3CDTF format.

dcterms:modified

The date the content of the object last changed, in W3CDTF format. This can sit inside channel, item, or both.

dcterms:extent

The size of the document referred to by the section of the feed in which the element appears, in bytes.

dcterms:medium

The HTTP Content-Type of the object to which the parent element refers. The HTTP Content-Type is made up of the MIME type, followed optionally by the character set, denoted by the string ;charset=. For example:

<dcterms:medium>text/html; charset=UTF-8</dcterms:medium>
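
If you need to pull the MIME type and character set back out of a dcterms:medium value, Python's standard email package already knows how to read a Content-Type string. This is only a sketch; the helper name split_medium is our own invention and is not defined by the module.

```python
from email.message import Message


def split_medium(value):
    """Return (mime_type, charset) from a dcterms:medium value.

    charset is None when the value has no ;charset= part.
    Note that the email package lower-cases both results.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


medium = split_medium("text/html; charset=UTF-8")
bare = split_medium("image/png")
```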

Paired elements

Some of the mod_DCTerms elements come paired together naturally. When we talk about two separate items, it is important to remember that the following paired elements must work together:

dcterms:isVersionOf and dcterms:hasVersion

This pair of elements works together to point to different versions of an object. For example, you could use it to list versions in different languages or different formats. Their values should point to each other, should be URIs, and, for complete RDF compatibility, should be encased in an rdf:resource attribute. There is also nothing to stop you from providing further information about the version, via additional RDF markup, like so:

<dcterms:hasVersion rdf:resource="URI OF RESOURCE">
  <dc:title>TITLE OF OTHER VERSION</dc:title>
</dcterms:hasVersion>
dcterms:isReplacedBy and dcterms:replaces

Used to denote an item that points to a more recent version of the object in question. The syntax is the same as the dcterms:isVersionOf pair — it takes an rdf:resource attribute that points to the URI of the object in question.

dcterms:isRequiredBy and dcterms:requires

Used to denote an object relationship in which, according to the Dublin Core specification, "the described resource requires the referenced resource to support its function, delivery, or coherence of content." As you might expect by now, this pair takes the attribute rdf:resource to denote the URI of the object to which you're pointing, and may be augmented by additional RDF.

dcterms:isPartOf and dcterms:hasPart

The mod_DCTerms elements have quite self-explanatory names, and this pair is no exception. It denotes objects that are subsections of other objects. It's the traditional syntax of an rdf:resource attribute, with the option of additional RDF within the element.

dcterms:isReferencedBy and dcterms:references

A pair in which one object refers to or cites the other. Its syntax is the usual drill—an rdf:resource attribute and some additional RDF if you're feeling generous.

dcterms:isFormatOf and dcterms:hasFormat

This final pair of elements denotes two objects that contain the same intellectual content but differ in format. For example, one object could be a color PDF and the other could be a Word document. The syntax is the same as the other paired elements, but with the additional recommendation that you include dc:format, dc:language, or another element that helps the end user tell the difference between the two separate versions. Also bear in mind that URIs must be unique, so anyone using content negotiation on their server must give different URIs for each format, whether or not it is actually necessary.

Using DCSV values

There are three mod_DCTerms elements that take a special syntax to denote a timespan. This syntax, Dublin Core Structured Values (DCSV), represents complex values together in one simple string. It takes the following format (all the attributes are optional):

name=ASSOCIATED NAME; start=START TIME; end=END TIME; scheme=W3C-DTF;
dcterms:temporal

This element denotes any timespan of the item's subject matter. For example:

<dcterms:temporal>
name=World War 2; start=1939; end=1945; scheme=W3C-DTF;
</dcterms:temporal>
dcterms:valid

This denotes the timespan during which the item's content is valid. For example:

<dcterms:valid>start=2003-01-01; end=2003-02-01; scheme=W3C-DTF;</dcterms:valid>
dcterms:available

This denotes the timespan during which the object to which the item points is available (i.e., network-retrievable).
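
Because a DCSV value is just a string of name=value components separated by semicolons, splitting one apart takes only a few lines. Here is a rough Python sketch, assuming that values never contain an escaped semicolon or equals sign (the sketch makes no attempt to handle escaping); the function name parse_dcsv is ours, chosen for illustration.

```python
def parse_dcsv(value):
    """Split a DCSV string into a dict of component name -> value."""
    parts = {}
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece left by a trailing semicolon
        name, sep, val = chunk.partition("=")
        if sep:  # ignore anything that is not a name=value pair
            parts[name.strip()] = val.strip()
    return parts


period = parse_dcsv("name=World War 2; start=1939; end=1945; scheme=W3C-DTF;")
```

Since every component is optional, code that uses the result should look values up with period.get("start") rather than assume a key is present.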

mod_event

mod_event really breaks RSS 1.0 out of the datacentric model and into the real world. Its purpose is to describe details of real-world events. You can then use this data in your calendar applications, display it on a page, email it, or use it for whatever purpose you like.

According to Søren Roug, the module's author, "This specification is not a reimplementation of RFC 2445 iCalendar in RDF. In particular, it lacks such things as TODO and repeating events, and there is no intention of adding those parts to the specification."

Namespace

The events module takes the shapely ev: as its namespace prefix, and it is identified by the pleasingly regular http://purl.org/rss/1.0/modules/event/. So, the root element looks like this:

<?xml version="1.0" encoding="utf-8"?> 
   
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:ev="http://purl.org/rss/1.0/modules/event/"
>

Elements

The mod_event elements are all subelements of item. None of them are mandatory, but common sense should prevail regarding usage: the more the better.

ev:startdate

The time and date of the start of the event, in W3CDTF format.

ev:enddate

The time and date of the end of the event, in W3CDTF format.

ev:location

The location of the event. This can be a simple string or a URI, or it can be semantically augmented via RDF. For example:

<ev:location>At Ben's house</ev:location>
or
<ev:location>http://www.example.org/benshouse</ev:location>
or
<ev:location rdf:resource="http://www.mapquest.com">
<rdf:value>
http://www.mapquest.com/maps/map.adp?map.x=177&amp;map.y=124&amp;mapdata=xU4YXdELrnB2xoPaJ66QjsffE4Zu%252bP6OZQy2y1Ah8EPehGZcP7zX7a3LAujflI
6g%252boY5z8%252b7lqnLexYmGmo96xAPLE%252bMe4H2TaN0PDMZ5pH9rjsN3owqiP9AOg8%252fOX
tNlI1FGCb4fddEaWl23DGyUhXfazgpROqIrCGP%252fmKvh2vwRs0lc8k9F0ltIpaTc%252foiXwyvfB
CMSvv2EAvYEbNgn6ztUAlmEA%252bK2tqfR5jD9QRdgA0yRNovXEpgRakMia3g2jRzTo06OcbL8TDJru
fAn11sl6d5CQUD8xjR1nJj3ieObeWOVwRB0w8T4MSHFQLg9SoPaSN3LMG2PixeD2X5%252bs4Sg3K1JS
4LqmvDON%252bugHKDenLg%252b%252fxQhtVGFuhugqWLosZ%252fSo2wQ7Y%253d&amp;click=center
</rdf:value>
</ev:location>
ev:organizer

The name of the organizer of the event. Again, we can semantically augment this element to include more information. For example:

<ev:organizer>Ben Hammersley</ev:organizer>
ev:type

According to the specification, this should be "the type of event, such as conference, deadline, launch, project meeting. The purpose is to promote or filter out certain types of events that the user has a particular (lack of) interest for. Avoid the use of subject-specific wording. Use instead the Dublin Core subject element."

Example

Example 7-14. An RSS 1.0 feed with mod_event
<?xml version="1.0" encoding="utf-8"?> 
   
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns:ev="http://purl.org/rss/1.0/modules/event/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns="http://purl.org/rss/1.0/"
> 
   
  <channel rdf:about="http://events.oreilly.com/?_fl=rss1.0">
    <title>O'Reilly Events</title>
    <link>http://events.oreilly.com/</link>
    <description>O'Reilly Events</description>
   
    <items>
      <rdf:Seq>
        <rdf:li resource="http://conferences.oreilly.com/p2p/" />
        <rdf:li resource="http://www.oreilly.com/catalog/progxmlrpc/" />
      </rdf:Seq>
    </items>
   
  </channel>
   
  <item rdf:about="http://conferences.oreilly.com/p2p/">
    <title>The O'Reilly Peer-to-Peer and Web Services Conference</title> 
    <link>http://conferences.oreilly.com/p2p/</link>
    <ev:type>conference</ev:type>
    <ev:organizer>O'Reilly</ev:organizer>
    <ev:location>Washington, DC</ev:location>
    <ev:startdate>2001-09-18</ev:startdate>
    <ev:enddate>2001-09-21</ev:enddate>
    <dc:subject>P2P</dc:subject>
  </item> 
  <item rdf:about="http://www.oreilly.com/catalog/progxmlrpc/">
    <title>Programming Web Services with XML-RPC</title> 
    <link>http://www.oreilly.com/catalog/progxmlrpc/</link>
    <ev:startdate>2001-06-20</ev:startdate>
    <ev:type>book release</ev:type>
    <dc:subject>XML-RPC</dc:subject>
    <dc:subject>Programming</dc:subject>
  </item> 
</rdf:RDF>
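
To give a flavor of what a calendar application might do with a feed like this, here is a small Python sketch that reads the event elements and sorts the items by start date. It assumes date-only values, as in Example 7-14; W3CDTF also allows times and time zones, which this sketch does not attempt to handle. The function name read_events and the trimmed-down feed are ours.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {
    "rss": "http://purl.org/rss/1.0/",
    "ev": "http://purl.org/rss/1.0/modules/event/",
}

# A cut-down version of Example 7-14
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:ev="http://purl.org/rss/1.0/modules/event/"
  xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://conferences.oreilly.com/p2p/">
    <title>The O'Reilly Peer-to-Peer and Web Services Conference</title>
    <ev:type>conference</ev:type>
    <ev:startdate>2001-09-18</ev:startdate>
    <ev:enddate>2001-09-21</ev:enddate>
  </item>
  <item rdf:about="http://www.oreilly.com/catalog/progxmlrpc/">
    <title>Programming Web Services with XML-RPC</title>
    <ev:startdate>2001-06-20</ev:startdate>
    <ev:type>book release</ev:type>
  </item>
</rdf:RDF>"""


def read_events(xml_text):
    """Return the feed's events as dicts, earliest start date first."""
    events = []
    for item in ET.fromstring(xml_text).findall("rss:item", NS):
        start = item.findtext("ev:startdate", namespaces=NS)
        if not start:
            continue  # no start date, so nothing to put on a calendar
        end = item.findtext("ev:enddate", namespaces=NS)
        events.append({
            "title": item.findtext("rss:title", default="", namespaces=NS).strip(),
            "type": item.findtext("ev:type", namespaces=NS),
            "start": date.fromisoformat(start.strip()),
            "end": date.fromisoformat(end.strip()) if end else None,
        })
    return sorted(events, key=lambda e: e["start"])


events = read_events(FEED)
```

Because every element is optional, the sketch treats a missing ev:enddate as None rather than guessing at a duration.
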
mod_rss091

The mod_rss091 module is designed to give RSS 1.0 "sideways compatibility" with RSS 0.91. Because the three core subelements of the item element are the same in both standards, including mod_rss091 elements in your RSS 1.0 feed allows for dynamic downgrading of the feed for parsers that can't be bothered with all the RDF stuff. Because the data is rather simple and mostly static, including this module within your RSS 1.0 feed is straightforward. With this in mind, it's worth doing.

Namespace

The prefix for this module is the self-explanatory rss091:, and the module is represented by the URI http://purl.org/rss/1.0/modules/rss091#. Hence:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:rss091="http://purl.org/rss/1.0/modules/rss091#"
>

Elements

The mod_rss091 elements represent the same elements within RSS 0.91. Chapter 4 provides details on each of those elements.

Subelements of channel

rss091:language

The language of the feed.

rss091:rating

The PICS rating of the feed.

rss091:managingEditor

The managing editor of the feed.

rss091:webmaster

The webmaster of the feed.

rss091:pubDate

The publication date of the feed.

rss091:lastBuildDate

The date of the feed's last build.

rss091:copyright

The copyright notice of the feed.

rss091:skipHours rdf:parseType="Literal"

The skipHours element, with correct RDF syntax.

rss091:hour

The hours, in GMT, during which the feed should not be retrieved.

rss091:skipDays rdf:parseType="Literal"

The skipDays element, with correct RDF syntax.

rss091:day

The days during which a feed should not be retrieved (Monday is 1 and Sunday is 7).

Subelements of image

rss091:width

The width of the image.

rss091:height

The height of the image.

Subelement of item

rss091:description

The description of the item. While this element is replicated by core RSS 1.0, it is listed here for the sake of completeness.

Example

Example 7-15. RSS 1.0 feed elements using mod_rss091
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:rss091="http://purl.org/rss/1.0/modules/rss091#"
>   
  <channel rdf:about="http://www.xml.com/xml/news.rss">
    <title>XML.com</title>
    <link>http://xml.com/pub</link>
    <description>
    XML.com features a rich mix of information and services for the XML community.
    </description>
    <rss091:language>en-us</rss091:language>
    <rss091:rating>(PICS-1.1 "http://www.rsac.org/ratingsv01.html"
     l gen true comment "RSACi North America Server" 
     for "http://www.rsac.org" on "1996.04.16T08:15-0500" 
     r (n 0 s 0 v 0 l 0))</rss091:rating>
    <rss091:managingEditor>Edd Dumbill</rss091:managingEditor>
    <rss091:webmaster>(mailto:webmaster@xml.com)</rss091:webmaster>
    <rss091:pubDate>Sat, 01 Jan 2000 12:00:00 GMT</rss091:pubDate>
    <rss091:lastBuildDate>Sat, 01 Jan 2000 12:00:00 GMT</rss091:lastBuildDate>
    <rss091:skipHours rdf:parseType="Literal">
    <rss091:hour>12</rss091:hour>
    </rss091:skipHours>
    <rss091:skipDays rdf:parseType="Literal">
    <rss091:day>Thursday</rss091:day>
    </rss091:skipDays>
  </channel>
   
  <image rdf:about="http://xml.com/universal/images/xml_tiny.gif">
    <title>XML.com</title>
    <link>http://www.xml.com</link>
    <url>http://xml.com/universal/images/xml_tiny.gif</url>
    <rss091:width>88</rss091:width>
    <rss091:height>31</rss091:height>
    <rss091:description>XML.com...</rss091:description>
  </image>
   
  <item rdf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html" position="1">
     <title>Processing Inclusions with XSLT</title>
     <link>http://xml.com/pub/2000/08/09/xslt/xslt.html</link>
     <rss091:description>
     Processing document inclusions with general XML tools can be problematic. 
     This article proposes a way of preserving inclusion information through 
     SAX-based processing.
     </rss091:description>
  </item>
   
</rdf:RDF>
mod_servicestatus

mod_servicestatus is one of the latest RSS 1.0 modules. Its purpose is to allow RSS 1.0 to display details of the status and current availability of services and servers.

You should bear in mind the difference between services and servers. One service may rely on more than one server in the back end. For the user, however, such information is irrelevant — something either works or it doesn't. With mod_servicestatus, you cannot differentiate between a virtual service and an actual physical server, but you can combine servers into services at the parsing stage. This means that one feed can be used for multiple things: a detailed display for sysadmins, and a simplified version for end users.

Namespace

The mod_servicestatus prefix is ss:, and the module is identified by the URI http://purl.org/rss/1.0/modules/servicestatus/. So, a mod_servicestatus feed will start like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:ss="http://purl.org/rss/1.0/modules/servicestatus/"
>

Elements

The first element of mod_servicestatus is a subelement of channel:

ss:aboutStats

A URI that points to a page explaining the results and methodology being used.

All the other elements within mod_servicestatus are subelements of item. As with many modules, all of these elements are optional, but the more you include the more fun you'll have.

ss:responding

This can be either true or false, and it refers to whether the server is responding.

ss:lastChecked

The date and time the server was last checked, in W3CDTF format.

ss:lastSeen

The date and time the server last responded, in W3CDTF format. In conjunction with ss:lastChecked, this enables you to work out down times.

ss:availability

A figure that describes server availability. Usually an integer percentage, this should be explained in the document referenced by ss:aboutStats.

ss:averageResponseTime

The average response time of the server, usually in seconds. This should also be explained in the ss:aboutStats document.

ss:statusMessage

A message aimed at the end user. For example: "We know this is broken, and we're working on it," or "Please log out, pushing of the Big Red Button is imminent," or "Run for the door! Run! Run!"

Example

Example 7-16. An RSS 1.0 feed with mod_servicestatus
<?xml version="1.0" encoding="utf-8"?>
   
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:ss="http://purl.org/rss/1.0/modules/servicestatus/"
>
   
<channel rdf:about="http://my.organisation.com">
  <title>An Example</title>
  <description>Just an example of system statuses</description>
  <link>http://my.organisation.com</link> 
  <ss:aboutStats>http://my.organisation.com/status.html</ss:aboutStats>
  <items>
    <rdf:Seq>
      <rdf:li resource="http://my.organisation.com/website" />
      <rdf:li resource="http://my.organisation.com/database" />
    </rdf:Seq>
  </items>
</channel>
   
<item rdf:about="http://my.organisation.com/website">
  <title>Website</title>
  <link>http://my.organisation.com/website</link>
  <ss:responding>true</ss:responding>
  <ss:lastChecked>2002-05-10T19:20:30.45+01:00</ss:lastChecked>
  <ss:lastSeen>2002-05-10T19:20:30.45+01:00</ss:lastSeen>
  <ss:availability>85</ss:availability>
  <ss:averageResponseTime>5.2</ss:averageResponseTime>
</item>
   
<item rdf:about="http://my.organisation.com/database">
  <title>Database server</title>
  <link>http://my.organisation.com/database</link>
  <ss:responding>false</ss:responding>
  <ss:lastChecked>2002-05-10T19:20:30.45+01:00</ss:lastChecked>
  <ss:lastSeen>2002-05-09T13:43:56.24+01:00</ss:lastSeen>
  <ss:availability>77</ss:availability>
  <ss:averageResponseTime>12.2</ss:averageResponseTime>
  <ss:statusMessage>Engineers are investigating.</ss:statusMessage>
</item>
   
</rdf:RDF>
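
Working out down times from ss:lastChecked and ss:lastSeen is simple date arithmetic. The Python sketch below assumes that every timestamp carries fractional seconds and a numeric time zone offset, as the ones in Example 7-16 do; W3CDTF permits other shapes, which would need a more forgiving parser. The function name downtime is ours.

```python
from datetime import datetime, timedelta

# Matches timestamps shaped like 2002-05-10T19:20:30.45+01:00 (an assumption)
STAMP = "%Y-%m-%dT%H:%M:%S.%f%z"


def downtime(responding, last_checked, last_seen):
    """How long the server has been silent, as far as the feed can tell."""
    if responding == "true":
        return timedelta(0)
    checked = datetime.strptime(last_checked, STAMP)
    seen = datetime.strptime(last_seen, STAMP)
    return checked - seen


# Values taken from the two items in Example 7-16
website = downtime("true",
                   "2002-05-10T19:20:30.45+01:00",
                   "2002-05-10T19:20:30.45+01:00")
database = downtime("false",
                    "2002-05-10T19:20:30.45+01:00",
                    "2002-05-09T13:43:56.24+01:00")
```

The result is only a lower bound on the outage, since the server may have gone down at any point after it was last seen.
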
mod_slash

Slash is the software originally written to run the popular technology news site, Slashdot. It has spread quite far lately, and now hundreds of sites use it for their content management system. Slash's unique features do not fit into the core RSS 1.0 specification, so Rael Dornfest and Chris Nandor wrote this module. The features are most easily understood after a look at a Slash-based site, so go over to http://www.slashdot.org to see what's happening.

Namespace

The namespace prefix is slash:, and the identifying URI is http://purl.org/rss/1.0/modules/slash/. Hence:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
>

Elements

All the mod_slash elements are subelements of item. They are all mandatory.

slash:section

The title of the section in which the article appears.

slash:department

The title of the department in which the article appears (in most Slash sites, this title is a joke).

slash:comments

The number of comments attached to an article.

slash:hit_parade

A comma-separated list of the number of comments displayable at each karma threshold (this will make sense to you if you look at a Slash-based site). There should be seven figures, matching karma thresholds of -1, 0, 1, 2, 3, 4, and 5.

Example

Example 7-17. An item element containing mod_slash
<item rdf:about="http://slashdot.org/article.pl?sid=02/07/01/164242">
<title>LotR Two Towers Trailer Online</title>
<link>http://slashdot.org/article.pl?sid=02/07/01/164242</link>
<dc:creator>CmdrTaco</dc:creator>
<dc:subject>movies</dc:subject>
<dc:date>2002-07-01T17:08:24+00:00</dc:date>
<slash:department>provided-you-have-sorenson-and-bandwidth</slash:department>
<slash:section>articles</slash:section>
<slash:comments>20</slash:comments>
<slash:hit_parade>20,19,11,8,3,0,0</slash:hit_parade>
</item>
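
The hit parade is easier to work with once each figure is matched to its threshold. A minimal Python sketch follows; the function name parse_hit_parade is our own, and the sketch simply rejects any value that does not hold exactly seven figures.

```python
# Karma thresholds, in the order the figures appear
THRESHOLDS = (-1, 0, 1, 2, 3, 4, 5)


def parse_hit_parade(value):
    """Map each karma threshold to the number of comments visible at it."""
    counts = [int(figure) for figure in value.split(",")]
    if len(counts) != len(THRESHOLDS):
        raise ValueError(
            "expected %d figures, got %d" % (len(THRESHOLDS), len(counts))
        )
    return dict(zip(THRESHOLDS, counts))


visible = parse_hit_parade("20,19,11,8,3,0,0")
```

With the value from Example 7-17, visible[2] tells you how many comments a reader browsing at a threshold of 2 would see.
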
mod_streaming

mod_streaming was designed by me (happily enough) to take care of the additional needs of anyone who wants to create a feed that points to streaming-media presentations. You will notice elements for the live events—start times, end times, and so forth. These can also be used to split a single stream into chunks and provide associated metadata with each section.

Namespace

mod_streaming takes str: as its prefix, and http://hacks.benhammersley.com/rss/streaming/ is its identifying URI. So, its root element looks like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:str="http://hacks.benhammersley.com/rss/streaming/"
>

Elements

All the elements within mod_streaming are subelements of item except str:type, which can be a subelement of channel as well. All elements are optional.

str:type

This can take audio, video, or both as its value, and it can be a subelement of either item or channel. video implies a video, regardless of the presence of a soundtrack. both implies a mixture of video and audio items, and hence is for use only within a channel description.

str:associatedApplication

The name of any special application required to play back the stream.

str:associatedApplication.version

The version number of the associated application, if applicable.

str:associatedApplication.downloadUri

The URI for downloading the associated application.

str:codec

The name of the codec in which the stream is encoded.

str:codec.version

The version number of the codec, if applicable.

str:codec.downloadUri

The URI for downloading the codec.

str:codec.sampleRate

The value of any audio's sample rate, in kHz.

str:codec.stereo

Either stereo or mono, depending on the audio being used.

str:codec.ResolutionX

The number of pixels in the X axis (width).

str:codec.ResolutionY

The number of pixels in the Y axis (height).

str:duration

The length of the item, in the W3C format of HH:MM:SS.

str:live

Either live or recorded, as applicable.

str:live.scheduledStartTime

A W3CDTF-encoded date and time for the start of live broadcasts, or just HH:MM:SS.ss for the start time in the timecode of a recording.

str:live.scheduledEndTime

The end time of a live broadcast or recording, in the same format as str:live.scheduledStartTime.

str:live.location

This can be a literal string, as per the Dublin Core location guidelines, or it can use RDF with additional location-specific namespaces.

str:live.contactUri

A URI to contact the live show (e.g., mailto:, http:, aim:, or irc:). Think "radio phone-in show."

Example

Example 7-18. An RSS 1.0 feed using mod_streaming
<?xml version="1.0" encoding="utf-8"?> 
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:str="http://hacks.benhammersley.com/rss/streaming/"
> 
<channel rdf:about="http://www.streamsRus.com/">
  <title>Streams R Us</title>
  <link>http://www.streamsRus.com</link>
  <description>Streams R Us: An Entirely Fictional Site</description>
  <str:type>both</str:type>
  <image rdf:resource="http://www.streamsRus.com/icons/stream.jpg" />
<items>
  <rdf:Seq>
    <rdf:li rdf:resource="http://www.streamsRus.com/example.ram" />
    <rdf:li rdf:resource="http://www.streamsRus.com/example2.mp3" />
    <rdf:li rdf:resource="http://www.streamsRus.com/example3.mov" />
  </rdf:Seq>
</items>
</channel>
   
<item rdf:about="http://www.streamsRus.com/example.ram">
  <title>RSS Rocks Out</title> 
  <link>http://www.streamsRus.com/example.ram</link>
  <str:associatedApplication>realplayer</str:associatedApplication>
  <str:associatedApplication.downloadUri>http://www.real.com/</str:associatedApplication.downloadUri>
  <str:duration>00:04:30</str:duration>
  <str:live>recorded</str:live>
</item> 
   
<item rdf:about="http://www.streamsRus.com/example2.mp3">
  <title>RSS Rocks Out Live</title> 
  <link>http://www.streamsRus.com/example2.mp3</link>
  <str:associatedApplication>winamp</str:associatedApplication>
  <str:associatedApplication.downloadUri>http://www.winamp.com/</str:associatedApplication.downloadUri>
  <str:duration>00:04:30</str:duration>
  <str:live>live</str:live>
  <str:live.scheduledStartTime>2002-04-03T00:00:00Z</str:live.scheduledStartTime>
  <str:live.scheduledEndTime>2002-04-03T00:04:30Z</str:live.scheduledEndTime>
</item> 
  
<item rdf:about="http://www.streamsRus.com/example3.mov">
  <title>RSS Rocks Out Live on Video</title> 
  <link>http://www.streamsRus.com/example3.mov</link>
  <str:type>video</str:type>
  <str:codec>sorenson</str:codec>
  <str:associatedApplication>Quicktime</str:associatedApplication>
  <str:associatedApplication.downloadUri>http://www.apple.com/quicktime</str:associatedApplication.downloadUri>
  <str:duration>00:02:32</str:duration>
  <str:live>live</str:live>
  <str:live.scheduledStartTime>2002-04-03T00:00:00Z</str:live.scheduledStartTime>
  <str:live.scheduledEndTime>2002-04-03T00:02:32Z</str:live.scheduledEndTime>
  <str:codec.ResolutionX>600</str:codec.ResolutionX>
  <str:codec.ResolutionY>400</str:codec.ResolutionY>
  <str:live.contactUri>mailto:ben@benhammersley.com</str:live.contactUri>
</item>   
   
</rdf:RDF>
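
A player or aggregator will probably want str:duration as a number rather than a string. The Python sketch below turns HH:MM:SS values into timedelta objects and adds up the running time of the three items in Example 7-18. It assumes whole seconds and exactly three fields; the function name parse_duration is ours.

```python
from datetime import timedelta


def parse_duration(value):
    """Turn an HH:MM:SS str:duration value into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# The three str:duration values from Example 7-18
durations = [parse_duration(d) for d in ("00:04:30", "00:04:30", "00:02:32")]
total = sum(durations, timedelta())
```
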
mod_syndication

mod_syndication gives aggregators and feed users an idea of how often the feed changes. By giving this information, you prevent everyone from wasting time and bandwidth by asking for your feed too often or, indeed, too seldom. It is the third module to achieve Standard status.

mod_syndication supersedes the skipHours and skipDays elements of mod_rss091. Clients usually prefer the mod_syndication values over mod_rss091.

Namespace

mod_syndication takes sy: as its prefix and http://purl.org/rss/1.0/modules/syndication/ as its identifying URI. Thus:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
>

Elements

The mod_syndication elements are all subelements of channel:

sy:updatePeriod

Takes a value of hourly, daily, weekly, monthly, or yearly.

sy:updateFrequency

A number representing the number of times the feed should be refreshed during the updatePeriod. For example, an updatePeriod of hourly and an updateFrequency of 2 will make the aggregator refresh the feed twice an hour. If this element is missing, the default is 1.

sy:updateBase

The date and time, in W3CDTF format, from which all calculations should originate.

Example

Example 7-19. A part of a channel containing the mod_syndication elements
<channel rdf:about="http://meerkat.oreillynet.com/?_fl=rss1.0">
<title>Meerkat</title>
<link>http://meerkat.oreillynet.com</link>
<description>Meerkat: An Open Wire Service</description>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>2</sy:updateFrequency>
<sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
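
Here is one way an aggregator might turn these three elements into a polling schedule, sketched in Python. The lengths used for monthly and yearly are our own nominal assumptions (30 and 365 days), and the function names refresh_interval and next_refresh are invented for this illustration.

```python
from datetime import datetime, timedelta, timezone

PERIODS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),   # assumption: a nominal 30-day month
    "yearly": timedelta(days=365),   # assumption: a nominal 365-day year
}


def refresh_interval(update_period, update_frequency=1):
    """Time between refreshes; a missing updateFrequency counts as 1."""
    return PERIODS[update_period] / update_frequency


def next_refresh(update_base, interval, now):
    """First slot after `now` on the grid that starts at update_base."""
    elapsed_slots = (now - update_base) // interval
    return update_base + (elapsed_slots + 1) * interval


# Values from Example 7-19: hourly, twice per period, based at noon UTC
interval = refresh_interval("hourly", 2)
base = datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)
upcoming = next_refresh(base, interval,
                        datetime(2000, 1, 1, 12, 40, tzinfo=timezone.utc))
```

Anchoring the schedule to sy:updateBase means every client that honors it asks for the feed on the same half-hour grid, whenever it happened to start up.
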
mod_taxonomy

mod_taxonomy allows the classification of objects under a defined taxonomic scheme — basically, you describe the topics of your objects.

The object can be anything: a channel, an item, or a reference from another module. Because of this universality, mod_taxonomy can be used heavily throughout an RSS 1.0 feed, which may cause some confusion. As with many modules, a good bit of reformatting may help clarify things.

The taxonomic definitions are always given as URIs. As shown in Chapter 5, URIs are used, like namespaces, to differentiate between homonyms. Python (the language) and Python (the snake) need to be distinguished, because you may want to run away from one of them.

One good source of taxonomic URIs is the Open Directory Project, at http://www.dmoz.org. All the examples in this section originate from this source.

Namespace

mod_taxonomy takes the stylish moniker of taxo: and the identifying URI of http://purl.org/rss/1.0/modules/taxonomy/. Hence, the lovely root element:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns="http://purl.org/rss/1.0/"
  xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
>

Elements

mod_taxonomy can be used in two ways: the simple and the more defined. The simple method uses one element, and it can be used as a subelement of item or channel:

<taxo:topics>
<rdf:Bag>
<rdf:li resource="URI TO TAXONOMIC REFERENCE" />
<rdf:li resource="URI TO TAXONOMIC REFERENCE" />
</rdf:Bag>
</taxo:topics>

This nesting of elements gives a list of topics that are associated with the channel or the item that contains it. This structure remains the same, with additional <rdf:li resource=""/> elements for every new topic.

This provides a straightforward method for giving a list of defining URIs for an RSS object. Sometimes, however, we'd like to define more details of each of the topic URIs themselves. For this, we use the taxo:topic element. This element is a subelement of rdf:RDF — i.e., on the same level as channel, item, and so on.

Within the grammar of RDF, taxo:topic allows us to assign metadata to the URI that we use elsewhere in the feed in taxo:topics. It takes one subelement of its own module, taxo:link, and then any other module's element that can be a subelement of channel. The most popular elements come from mod_dublincore:

<taxo:topic rdf:about="URI OF TAXONOMIC RESOURCE">
<taxo:link>URL TO TAXONOMIC RESOURCE HERE</taxo:link>
<dc:subject>EXAMPLE</dc:subject>
OTHER ELEMENTS HERE
</taxo:topic>

The taxo:topic element itself can contain taxo:topics, as shown in Example 7-20.

Example

Example 7-20. A partial RSS 1.0 feed demonstrating mod_taxonomy
<item rdf:about="http://c.moreover.com/click/here.pl?r123" position="1">
  <title>XML: A Disruptive Technology</title> 
  <link>http://c.moreover.com/click/here.pl?r123</link>
  <description>
  XML is placing increasingly heavy loads on the existing technical
  infrastructure of the Internet.
  </description>
  <taxo:topics>
    <rdf:Bag>
     <rdf:li resource="http://meerkat.oreillynet.com/?c=cat23" />
     <rdf:li resource="http://meerkat.oreillynet.com/?c=47" />
     <rdf:li resource="http://dmoz.org/Computers/Data_Formats/Markup_Languages/XML/" />
    </rdf:Bag>
  </taxo:topics>
</item> 
   
<taxo:topic rdf:about="http://meerkat.oreillynet.com/?c=cat23">
  <taxo:link>http://meerkat.oreillynet.com/?c=cat23</taxo:link>
  <dc:title>Data: XML</dc:title>
  <dc:description>A Meerkat channel</dc:description>
</taxo:topic>
   
<taxo:topic rdf:about="http://dmoz.org/Computers/Data_Formats/Markup_Languages/XML/">
  <taxo:link>http://dmoz.org/Computers/Data_Formats/Markup_Languages/XML/</taxo:link>
  <dc:title>XML</dc:title>
  <dc:subject>XML</dc:subject>
  <dc:description>DMOZ category</dc:description>
  <taxo:topics>
    <rdf:Bag>
     <rdf:li resource="http://meerkat.oreillynet.com/?c=cat23" />
     <rdf:li resource="http://dmoz.org/Computers/Data_Formats/Markup_Languages/SGML/" />
     <rdf:li resource="http://dmoz.org/Computers/Programming/Internet/" />
    </rdf:Bag>
  </taxo:topics>
</taxo:topic>

Example 7-20 shows an item using taxo:topics to describe itself, followed by two taxo:topic elements that define two of the topic URIs it uses. The last taxo:topic uses taxo:topics itself to define its own subject with more finesse.

Note that the taxo:topic elements—which define the URIs we use within the <item><taxo:topics></taxo:topics></item> section — are on the same level as the item within the document. RSS 1.0's structure, unlike RSS 0.9x, gives them both equal weight.
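
To make the relationship between the two levels concrete, here is a minimal Python sketch of how a consumer might join an item's taxo:topics list with the top-level taxo:topic definitions. It uses only the standard library; the helper names topic_titles and item_topics and the trimmed-down feed are assumptions made for illustration, not part of the module.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "taxo": "http://purl.org/rss/1.0/modules/taxonomy/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# A cut-down version of the example feed, used only for this sketch.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/">
  <item rdf:about="http://c.moreover.com/click/here.pl?r123">
    <title>XML: A Disruptive Technology</title>
    <link>http://c.moreover.com/click/here.pl?r123</link>
    <taxo:topics>
      <rdf:Bag>
        <rdf:li resource="http://meerkat.oreillynet.com/?c=cat23"/>
        <rdf:li resource="http://meerkat.oreillynet.com/?c=47"/>
      </rdf:Bag>
    </taxo:topics>
  </item>
  <taxo:topic rdf:about="http://meerkat.oreillynet.com/?c=cat23">
    <taxo:link>http://meerkat.oreillynet.com/?c=cat23</taxo:link>
    <dc:title>Data: XML</dc:title>
  </taxo:topic>
</rdf:RDF>"""


def topic_titles(root):
    """Map each top-level taxo:topic URI to its dc:title (None if absent)."""
    return {
        topic.get(RDF_ABOUT): topic.findtext("dc:title", namespaces=NS)
        for topic in root.findall("taxo:topic", NS)
    }


def item_topics(item, titles):
    """List (uri, title) pairs for the topics an item refers to."""
    pairs = []
    for li in item.findall("taxo:topics/rdf:Bag/rdf:li", NS):
        # Accept the bare attribute used here as well as rdf:resource.
        uri = li.get("resource") or li.get(RDF_RESOURCE)
        pairs.append((uri, titles.get(uri)))
    return pairs


root = ET.fromstring(FEED)
titles = topic_titles(root)
for item in root.findall("rss:item", NS):
    print(item_topics(item, titles))
```

A topic URI with no matching taxo:topic, such as the second one here, simply comes back with no title; the feed is still valid, because the definitions are optional.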

mod_threading

mod_threading provides a system to describe the children of an item (for example, replies to a weblog entry). This module is still in a state of flux — a great deal of work is being done to finalize a system for the description of message threads within RSS and RDF. This is one of the goals of the ThreadML developmental effort (http://www.quicktopic.com/7/H/rhSrjkWgjnvRq).

With this in mind, mod_threading can get complicated quickly. Unfortunately, as complex as you might logically make it, the lack of standardization means that anything but the simplest usage will likely be misunderstood by most parsers. Therefore, in this chapter we restrict ourselves to defining children only within the limited scope of a single document. If true message threading is your goal, check with the mailing lists and weblogs for more details.

Namespace

mod_threading takes the prefix thr: and the identifying URI http://purl.org/rss/1.0/modules/threading/. Hence, the root element:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns="http://purl.org/rss/1.0/"
         xmlns:thr="http://purl.org/rss/1.0/modules/threading/"
>

Element

There's only one element within mod_threading. It is a subelement of item, and it contains an rdf:Seq whose rdf:li elements each give the URI of an item that is a child of the containing item:

<thr:children>
  <rdf:Seq>
    <rdf:li rdf:resource="URI OF CHILD ITEM" />
  </rdf:Seq>
</thr:children>

For simplicity's sake, the child item, and hence the URI, must also be contained within the same RSS 1.0 document.
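
One way to check that constraint is sketched below in Python: it collects the rdf:about URI of every item in the document and reports any thr:children entry that points elsewhere. The function name missing_children and the sample feed are hypothetical, chosen only to illustrate the idea.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "thr": "http://purl.org/rss/1.0/modules/threading/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Sample document: child1 is present as an item, child2 is not.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
  xmlns:thr="http://purl.org/rss/1.0/modules/threading/">
  <item rdf:about="http://www.example.com/parent">
    <title>Parent</title>
    <link>http://www.example.com/parent</link>
    <thr:children>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.example.com/child1"/>
        <rdf:li rdf:resource="http://www.example.com/child2"/>
      </rdf:Seq>
    </thr:children>
  </item>
  <item rdf:about="http://www.example.com/child1">
    <title>Child 1</title>
    <link>http://www.example.com/child1</link>
  </item>
</rdf:RDF>"""


def missing_children(root):
    """Return child URIs that no item in this document declares."""
    known = {item.get(RDF_ABOUT) for item in root.findall("rss:item", NS)}
    missing = []
    for li in root.findall("rss:item/thr:children/rdf:Seq/rdf:li", NS):
        uri = li.get(RDF_RESOURCE)
        if uri not in known:
            missing.append(uri)
    return missing


root = ET.fromstring(FEED)
print(missing_children(root))
```

An empty result means every listed child is described somewhere in the same document.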

Example

Example 7-21. mod_threading within an item element
<item rdf:about="http://c.moreover.com/click/here.pl?r123">
    <title>XML: A Disruptive Technology</title> 
    <link>http://c.moreover.com/click/here.pl?r123</link>
    <thr:children>
     <rdf:Seq>
       <rdf:li rdf:resource="http://www.example.com/child1"/>
       <rdf:li rdf:resource="http://www.example.com/child2"/>
       <rdf:li rdf:resource="http://www.example.com/child3"/>
     </rdf:Seq>
   </thr:children>
</item>

mod_wiki

Wikis—web pages that grant editing rights to everyone—are increasingly popular, but they give RSS feed creators plenty of special problems. Because wikis contain extensive information about how the page has been edited, and by whom, they require their own module to supply all the necessary elements.

Namespace

mod_wiki's prefix is wiki:, and the identifying URI is http://purl.org/rss/1.0/modules/wiki/. mod_wiki also uses mod_dublincore for some of its elements. Hence, the lovely root element:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:wiki="http://purl.org/rss/1.0/modules/wiki/"
 >

Elements

wiki:interwiki

An optional subelement of channel, wiki:interwiki gives the moniker of the wiki in question if it is part of an interwiki setup. It can take one of two forms, the simpler:

<wiki:interwiki>INTERWIKI MONIKER</wiki:interwiki>

or the more complex, which may be unparsable for simple parsers:

<wiki:interwiki>
  <rdf:Description link="URL TO WIKI">
    <rdf:value>WIKI NAME</rdf:value>
  </rdf:Description>
</wiki:interwiki>

wiki:version

An optional subelement of item, containing the version number of the page.

wiki:status

An optional subelement of item, denoting it as new, updated, or deleted.

wiki:importance

An optional subelement of item, describing the importance of the change to the page (either major or minor).

wiki:diff

An optional subelement of item that provides a URL to a page showing the differences between this version of the page and the previous one.

wiki:history

An optional subelement of item that provides a URL to a list of changes to the page.

wiki:host

A special optional attribute, carried by the rdf:Description inside the dc:contributor element from mod_dublincore. It contains the IP address (or, as in Example 7-22, the hostname) of the person who made the change to the wiki page. It should be in the following format:

<dc:contributor>
  <rdf:Description wiki:host="192.168.1.10">
    <rdf:value>A.N.Person</rdf:value>
  </rdf:Description>
</dc:contributor>

Example

Example 7-22. mod_wiki within an item element
<item rdf:about="http://www.usemod.com/cgi-bin/mb2.pl?action=browse&amp;id=JohnKellden&amp;revision=30">
  <title>JohnKellden</title>
  <link>http://www.usemod.com/cgi-bin/mb2.pl?JohnKellden</link>
  <description></description>
  <dc:date>2002-07-03T06:47:19+00:00</dc:date>
  <dc:contributor>
    <rdf:Description wiki:host="pc88-86.norrkoping.se">
      <rdf:value>pc88-86.norrkoping.se</rdf:value>
    </rdf:Description>
  </dc:contributor>
  <wiki:status>updated</wiki:status>
  <wiki:importance>major</wiki:importance>
  <wiki:diff>http://www.usemod.com/cgi-bin/mb2.pl?action=browse&amp;diff=4&amp;id=JohnKellden</wiki:diff>
  <wiki:version>30</wiki:version>
  <wiki:history>http://www.usemod.com/cgi-bin/mb2.pl?action=history&amp;id=JohnKellden</wiki:history>
</item>
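
For readers who want to consume such an item, the following Python sketch pulls the mod_wiki fields into a dictionary, including the wiki:host attribute nested inside dc:contributor. The function name wiki_change is hypothetical, and the item is a shortened copy of the example with its namespaces declared inline so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wiki": "http://purl.org/rss/1.0/modules/wiki/",
}
WIKI_HOST = "{%s}host" % NS["wiki"]

# Shortened copy of the example item, for illustration only.
ITEM = """<item xmlns="http://purl.org/rss/1.0/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:wiki="http://purl.org/rss/1.0/modules/wiki/"
  rdf:about="http://www.usemod.com/cgi-bin/mb2.pl?action=browse&amp;id=JohnKellden&amp;revision=30">
  <title>JohnKellden</title>
  <link>http://www.usemod.com/cgi-bin/mb2.pl?JohnKellden</link>
  <dc:contributor>
    <rdf:Description wiki:host="pc88-86.norrkoping.se">
      <rdf:value>pc88-86.norrkoping.se</rdf:value>
    </rdf:Description>
  </dc:contributor>
  <wiki:status>updated</wiki:status>
  <wiki:importance>major</wiki:importance>
  <wiki:version>30</wiki:version>
</item>"""


def wiki_change(item):
    """Collect the mod_wiki details of one item; absent fields are None."""
    desc = item.find("dc:contributor/rdf:Description", NS)
    return {
        "title": item.findtext("rss:title", namespaces=NS),
        "status": item.findtext("wiki:status", namespaces=NS),
        "importance": item.findtext("wiki:importance", namespaces=NS),
        "version": item.findtext("wiki:version", namespaces=NS),
        # wiki:host is an attribute on rdf:Description, not an element.
        "host": desc.get(WIKI_HOST) if desc is not None else None,
    }


change = wiki_change(ET.fromstring(ITEM))
print(change)
```

Because every mod_wiki element is optional, the sketch returns None for anything a feed leaves out rather than raising an error.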