I’ve recently read yet another post about RSS… Blogging gurus say that every blog should have a post explaining RSS, and so we do.
Strangely, these pages are extremely bad at explaining RSS. They contain plenty of convincing words to make you subscribe, but generally make zero effort to actually explain it.
So, do you want to know what is really inside? Read on.
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Software updates</title>
    <description>Updates for software featured at www.Rarst.net</description>
    <link>https://www.Rarst.net</link>
    <lastBuildDate>Thu, 25 Sep 2008 21:54:37 +0200</lastBuildDate>
    <generator>RSSUpdGen by Rarst</generator>
    <item>
      <title><![CDATA[CCleaner 2.12.651]]></title>
      <description><![CDATA[CCleaner 2.12.651<br />
Read more at <a href="https://www.rarst.net/software/ccleaner/">https://www.rarst.net/software/ccleaner/</a><br />
Download at <a href="http://www.ccleaner.com/download/builds">http://www.ccleaner.com/download/builds</a>]]></description>
      <pubDate>Thu, 25 Sep 2008 21:54:37 +0200</pubDate>
      <link><![CDATA[http://www.ccleaner.com/download/builds]]></link>
      <guid isPermaLink="false"><![CDATA[CCleaner 2.12.651]]></guid>
    </item>
  </channel>
</rss>
This is an example of a real RSS feed that I create to generate the software updates list in my sidebar. Looks scary? It really isn’t, just pay attention to the tags (those words between <>). They are made to be human-readable, which is one of the strong points of XML (and in turn RSS).
- version attributes at the beginning tell us (and the software that reads the feed) which versions of XML and RSS are used. XML is almost always 1.0, and I generally go with version 2.0 of RSS because it’s really easy to create and work with;
- channel contains information about the feed and the feed entries (items). Notice that channel is closed only at the very end, followed just by the closing rss tag;
- title is the name of the feed;
- description is a human-readable description of the feed content and purpose;
- link is the address of the site that the feed comes from;
- lastBuildDate is the time when the feed content last changed. It must be in the RFC 822 date format, which is human-readable (but a huge pain to code);
- generator is the name of the software that created the feed. In this case I am using a small program I coded for myself in AutoIt;
- item is one feed entry. A feed usually contains several of them, I left one for simplicity:
  - title again, but now we are inside item, so it is the name of this item. Notice how it is wrapped in additional CDATA brackets. This is done so there is less risk of special characters breaking XML rules and the feed itself;
  - description again, but inside item it is the actual content of the entry. It may be plain text, but usually it has (X)HTML markup so it can have rich formatting, include images, etc.;
  - pubDate is the time when the item was published;
  - link is the outgoing link for the item, in this case to the download page. This tag is optional, and some feeds rely on guid instead; see the next one for an explanation;
  - guid is one of the most important tags. It is the unique identifier of an item, so if the item itself changes but guid stays the same, it won’t be highlighted as updated. In most feeds guid contains a link: since it doesn’t make sense for several items to point to the same link, it makes a good identifier, and feed readers assume that guid is a link by default. In my case that fails. When a program is updated its download link stays the same, so the item wouldn’t show up as updated in feed readers. So I use the combination of program name and version as the identifier and add the isPermaLink="false" attribute to tell feed readers that they should look for the link elsewhere (in this case in the link tag above).
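To make the guid logic a bit more concrete, here is a rough Python sketch of how a reader might decide which items are new. This is my guess at the general idea, not how any particular feed reader actually works; the names item_identity, new_items and make_item are made up for the example, and the second version string is deliberately fake.

```python
import xml.etree.ElementTree as ET

DOWNLOAD = "http://www.ccleaner.com/download/builds"


def item_identity(item):
    """Return (identifier, link) for one <item> element.

    guid counts as a permalink unless isPermaLink="false" is set.
    """
    guid = item.find("guid")
    link = item.findtext("link")
    if guid is None:
        # Assumption: with no guid at all, fall back to the link as identifier.
        return link, link
    ident = (guid.text or "").strip()
    is_permalink = guid.get("isPermaLink", "true").lower() == "true"
    if link is None and is_permalink:
        # The guid doubles as the link when the reader may treat it as one.
        link = ident
    return ident, link


def new_items(items, seen):
    """Yield items whose identifier is not in `seen`, recording them as we go."""
    for item in items:
        ident, _ = item_identity(item)
        if ident not in seen:
            seen.add(ident)
            yield item


def make_item(guid_text):
    """Build a tiny <item> that always points to the same download link."""
    item = ET.Element("item")
    ET.SubElement(item, "link").text = DOWNLOAD
    guid = ET.SubElement(item, "guid", isPermaLink="false")
    guid.text = guid_text
    return item


seen = set()
first = list(new_items([make_item("CCleaner 2.12.651")], seen))
# Same download link, different (hypothetical) version in guid.
second = list(new_items(
    [make_item("CCleaner 2.12.651"), make_item("CCleaner 0.0.0-hypothetical")],
    seen,
))
print(len(first), len(second))
```

Both fetches report exactly one new item, even though every item shares the same download link. If guid held the link instead, the second fetch would find nothing new.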
An RSS feed is an XML file in a specific RSS format that contains structured information about the feed itself as well as the entries in it.
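If you prefer to see that split in code, here is a minimal sketch that reads a trimmed copy of my feed with Python's standard xml.etree.ElementTree module. The read_feed function and the dictionary keys are just names I picked for illustration, and a real reader would handle far more tags and errors.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the feed from this post (descriptions left out).
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Software updates</title>
    <link>https://www.Rarst.net</link>
    <item>
      <title><![CDATA[CCleaner 2.12.651]]></title>
      <pubDate>Thu, 25 Sep 2008 21:54:37 +0200</pubDate>
      <link><![CDATA[http://www.ccleaner.com/download/builds]]></link>
      <guid isPermaLink="false"><![CDATA[CCleaner 2.12.651]]></guid>
    </item>
  </channel>
</rss>"""


def read_feed(xml_text):
    """Split a feed into channel-level info and a list of entries."""
    channel = ET.fromstring(xml_text).find("channel")
    info = {
        "title": channel.findtext("title"),
        "link": channel.findtext("link"),
    }
    entries = [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "guid": item.findtext("guid"),
        }
        for item in channel.findall("item")
    ]
    return info, entries


info, entries = read_feed(FEED)
print(info["title"], "-", len(entries), "item(s)")
print(entries[0]["title"])
```

The parser unwraps the CDATA brackets on its own, so the title comes back as plain text.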
Because a feed is plain text, RSS feeds are easy to work with, and it is easy to code software that creates and processes them. There are more tags that can be used, and all of them are described in the RSS 2.0 specification.
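As a taste of the creating side, here is a small Python sketch that builds a feed similar to mine using only the standard library. To be clear, this is not my AutoIt generator, just an illustration; build_feed and the keys of the entry dictionaries are made up for the example. Note that ElementTree escapes special characters instead of wrapping text in CDATA, which serves the same purpose.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def build_feed(title, link, description, items, built):
    """Return RSS 2.0 text; `items` is a list of dicts (keys are my own choice)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "description").text = description
    ET.SubElement(channel, "link").text = link
    # format_datetime writes the RFC 822 style date, so no manual formatting.
    ET.SubElement(channel, "lastBuildDate").text = format_datetime(built)
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "pubDate").text = format_datetime(entry["published"])
        ET.SubElement(item, "link").text = entry["link"]
        # Name plus version as identifier, so the guid is not a link.
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = entry["guid"]
    return ET.tostring(rss, encoding="unicode")


# The timestamp from the feed in this post, with its +0200 offset.
when = datetime(2008, 9, 25, 21, 54, 37, tzinfo=timezone(timedelta(hours=2)))
feed_text = build_feed(
    "Software updates",
    "https://www.Rarst.net",
    "Updates for software featured at www.Rarst.net",
    [{
        "title": "CCleaner 2.12.651",
        "published": when,
        "link": "http://www.ccleaner.com/download/builds",
        "guid": "CCleaner 2.12.651",
    }],
    built=when,
)
print(feed_text)
```

The date formatting that is such a pain to code by hand comes for free here, because email messages use the same RFC 822 style dates.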
Lost in technical details? Ask questions about feeds in the comments, I like to talk about them. :)