Workshop: Reading RSS Syndication Feeds

There are hundreds of XML dialects out there representing data in a platform-independent, software-independent manner. One of the most popular is RSS, a format for sharing headlines and links from online news sites, weblogs, and other sources of information.

RSS makes web content available in web-accessible XML files called feeds, a form that is easy for software to read. RSS readers, called news aggregators, have been adopted by several million information junkies to track all of their favorite websites. There are also web applications that collect and share RSS items.
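
To make the shape of a feed concrete, here is a minimal sketch of the element hierarchy in an RSS 2.0 file: an rss root, one channel, and item children that each hold a title and a link. It deliberately uses the JDK's built-in DOM parser instead of XOM so that it runs with no extra library and no network connection. The feed text, the class name RssShape, and the headlines() method are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class RssShape {
    // A made-up two-item feed, used only for illustration
    static final String SAMPLE =
        "<rss version=\"2.0\"><channel>"
        + "<title>Example Feed</title>"
        + "<item><title>First story</title>"
        + "<link>http://example.com/1</link></item>"
        + "<item><title>Second story</title>"
        + "<link>http://example.com/2</link></item>"
        + "</channel></rss>";

    // Returns one "title -> link" string per item element in the feed text.
    // Assumes every item has both a title and a link, as SAMPLE does.
    static String[] headlines(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
            NodeList items = doc.getElementsByTagName("item");
            String[] result = new String[items.getLength()];
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                // look up title and link inside this item only
                String title = item.getElementsByTagName("title")
                    .item(0).getTextContent();
                String link = item.getElementsByTagName("link")
                    .item(0).getTextContent();
                result[i] = title + " -> " + link;
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }

    public static void main(String[] arguments) {
        for (String line : headlines(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Running the class prints one line for each of the two items. The channel's own title is not included because the lookup is limited to the title inside each item.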

The hard-working Builder class in the nu.xom package can load XML over the Internet from any URL:


String rssUrl = "http://search.csmonitor.com/rss/top.rss";
Builder builder = new Builder();
Document doc = builder.build(rssUrl);


This hour's workshop employs this technique to read an RSS 2.0 file, presenting the 15 most recent items.

Open your editor and enter the text of Listing 21.4. Save the result as Aggregator.java.

Listing 21.4. The Full Text of Aggregator.java

import java.io.*;
import nu.xom.*;

public class Aggregator {
    public String[] title = new String[15];
    public String[] link = new String[15];
    public int count = 0;

    public Aggregator(String rssUrl) {
        try {
            // retrieve the XML document
            Builder builder = new Builder();
            Document doc = builder.build(rssUrl);
            // retrieve the document's root element
            Element root = doc.getRootElement();
            // retrieve the root's channel element
            Element channel = root.getFirstChildElement("channel");
            // retrieve the item elements in the channel
            if (channel != null) {
                Elements items = channel.getChildElements("item");
                for (int current = 0; current < items.size(); current++) {
                    // stop once the arrays are full
                    if (count >= title.length) {
                        break;
                    }
                    // retrieve the current item
                    Element item = items.get(current);
                    Element titleElement = item.getFirstChildElement("title");
                    Element linkElement = item.getFirstChildElement("link");
                    // skip items that lack a title or a link
                    if (titleElement == null || linkElement == null) {
                        continue;
                    }
                    // store the item in the next free slot
                    title[count] = titleElement.getValue();
                    link[count] = linkElement.getValue();
                    count++;
                }
            }
        } catch (ParsingException exception) {
            System.out.println("XML error: " + exception.getMessage());
            exception.printStackTrace();
        } catch (IOException ioException) {
            System.out.println("IO error: " + ioException.getMessage());
            ioException.printStackTrace();
        }
    }

    public void listItems() {
        // display only the items that were stored
        for (int i = 0; i < count; i++) {
            System.out.println("\n" + title[i]);
            System.out.println(link[i]);
        }
    }

    public static void main(String[] arguments) {
        if (arguments.length > 0) {
            Aggregator aggie = new Aggregator(arguments[0]);
            aggie.listItems();
        } else {
            System.out.println("Usage: java Aggregator rssUrl");
        }
    }
}
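
The listing stores headlines in two fixed 15-element arrays, so its loop must stop before it writes to index 15. As an alternative sketch, not the approach this workshop takes, the same cap can be expressed with a list that grows only until the limit is reached. The class name CappedHeadlines, the constant MAX_ITEMS, and the method firstTitles() are hypothetical names chosen for this example, and plain strings stand in for the XOM elements.

```java
import java.util.ArrayList;
import java.util.List;

class CappedHeadlines {
    // Same limit as the arrays in the listing
    static final int MAX_ITEMS = 15;

    // Copies at most MAX_ITEMS non-null titles, preserving their order
    static List<String> firstTitles(String[] allTitles) {
        List<String> kept = new ArrayList<>();
        for (String candidate : allTitles) {
            if (kept.size() >= MAX_ITEMS) {
                break; // the cap has been reached
            }
            if (candidate != null) {
                kept.add(candidate); // skip entries with no title
            }
        }
        return kept;
    }

    public static void main(String[] arguments) {
        // 20 made-up titles, one of them missing
        String[] titles = new String[20];
        for (int i = 0; i < titles.length; i++) {
            titles[i] = "Story " + i;
        }
        titles[3] = null;
        List<String> kept = firstTitles(titles);
        System.out.println(kept.size() + " titles kept");
    }
}
```

Because the list tracks its own size, there is no separate counter to keep in step with an array index, which is where off-by-one mistakes tend to creep in.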


After you compile the application successfully, it can be run with any RSS 2.0 feed. Here's a command to try it with the Top Stories feed from the Christian Science Monitor newspaper:


java Aggregator http://search.csmonitor.com/rss/top.rss


Sample output from the feed follows:


As Britain copes, a massive hunt for London bombers
http://www.csmonitor.com/2005/0711/p07s01-woeu.html

The new Al Qaeda: local franchises
http://www.csmonitor.com/2005/0711/p01s01-woeu.html

Tough job: Can anyone govern California?
http://www.csmonitor.com/2005/0711/p02s01-uspo.html


By the way

You can find out more about the RSS 2.0 XML dialect from the RSS Advisory Board website at http://blogs.law.harvard.edu/tech. The author of this book is a member of the board, which offers guidance on the format and a directory of software that can be used to read RSS feeds.

There also are two other formats with similar functionality and appeal: RSS 1.0 and Atom.

