PHP: converting RSS to JSON

Posted on August 17, 2011

Converting RSS to JSON requires only a working knowledge of the json_encode() function and of one of the XML extensions that ship with PHP. In this post I'll show you a basic routine to convert an RSS feed to a JSON file using PHP's DOM extension.
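
The same idea can also be sketched on the client side in JavaScript. This is only an illustration, not the PHP routine from the post: the function names textOf and rssToJson are hypothetical, and doc is assumed to be an already parsed XML document (for example the responseXML of an Ajax request).

```javascript
// Sketch: turn the <item> elements of an RSS document into a JSON string.
// `doc` is assumed to be an XML Document (e.g. XMLHttpRequest's responseXML).

// Hypothetical helper: text of the first <tagName> descendant, or "" if missing.
function textOf(parent, tagName) {
  var nodes = parent.getElementsByTagName(tagName);
  return nodes.length > 0 ? nodes[0].textContent : "";
}

function rssToJson(doc) {
  var items = doc.getElementsByTagName("item");
  var entries = [];
  for (var i = 0, len = items.length; i < len; i++) {
    entries.push({
      title: textOf(items[i], "title"),
      link: textOf(items[i], "link"),
      pubDate: textOf(items[i], "pubDate")
    });
  }
  // JSON.stringify plays the role that json_encode() plays in PHP
  return JSON.stringify({ items: entries });
}
```

Only three children of each item are copied here. Which fields you keep depends on what the consumer of the JSON file needs.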

jQuery: RSS feed plugin

Posted on July 22, 2011

The following video shows a practical use of the zRSSFeed plugin for jQuery. This plugin allows you to display any RSS feed on your web pages with minimal effort. It's surely one of the easiest ways to parse and render a local or remote RSS feed with jQuery.

jQuery: RSS reader

Posted on June 11, 2011

How to fetch a remote RSS feed with jQuery is one of the most frequently asked questions. jQuery can't handle a remote feed by itself because of the same-origin policy that browsers apply to AJAX requests. For that reason, we need a server-side script which accepts two parameters, namely the absolute URL of the feed and the number of items we want to display. In our example we'll use PHP, obviously after making sure that the passed URL is a valid URL and that we're actually dealing with an RSS feed. Here's the script:
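
The parameter checks described above can be sketched in JavaScript as well. This is only an illustration under assumptions, not the PHP script: the name validateFeedParams and the upper limit of 50 items are hypothetical. Checking that the response is really an RSS feed can only happen after the feed has been fetched, so it's not covered here.

```javascript
// Sketch: validate the feed URL and the number of items before fetching.
// The limit of 50 items is an arbitrary assumption.
function validateFeedParams(url, count) {
  var parsed;
  try {
    parsed = new URL(url); // throws on relative or malformed URLs
  } catch (e) {
    return null;
  }
  // Accept only absolute http(s) URLs
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return null;
  }
  // The count must consist of digits only
  if (!/^\d+$/.test(String(count))) {
    return null;
  }
  var n = parseInt(count, 10);
  if (n < 1 || n > 50) {
    return null;
  }
  return { url: parsed.href, count: n };
}
```

The function returns null for anything suspicious, so the calling code can refuse the request with a single check.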

Formatting a WordPress post date in the RSS format

Posted on May 28, 2011

WordPress automatically formats the post_date field in the wp_posts table using various date formats, including the RSS and Atom ones. The default data type for this field is datetime, which means that the date stored in this field has the format YYYY-MM-DD HH:MM:SS. However, sometimes we may need to format this data type directly, for example if something goes wrong with our RSS or Atom feeds and we have to create a physical feed manually. Here's how we can do it:
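
To make the conversion itself concrete, here is a small JavaScript sketch (an illustration only, not WordPress code). The name datetimeToRssDate is hypothetical, and the stored value is assumed to be in UTC, which is why the output always ends with +0000.

```javascript
var DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"];
var MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Left-pad a number to two digits
function pad(n) {
  return n < 10 ? "0" + n : String(n);
}

// Sketch: convert "YYYY-MM-DD HH:MM:SS" to the date format used by RSS 2.0.
// Assumption: the input is expressed in UTC.
function datetimeToRssDate(datetime) {
  var m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(datetime);
  if (!m) {
    return null; // not a datetime string
  }
  var d = new Date(Date.UTC(+m[1], m[2] - 1, +m[3], +m[4], +m[5], +m[6]));
  return DAYS[d.getUTCDay()] + ", " + pad(d.getUTCDate()) + " " +
    MONTHS[d.getUTCMonth()] + " " + d.getUTCFullYear() + " " +
    pad(d.getUTCHours()) + ":" + pad(d.getUTCMinutes()) + ":" +
    pad(d.getUTCSeconds()) + " +0000";
}
```

For example, datetimeToRssDate("2011-05-28 10:30:00") returns "Sat, 28 May 2011 10:30:00 +0000".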

CSS: styling an RSS feed reader

Posted on April 19, 2011

Styling an RSS feed reader is easy with CSS. In this post I'm going to use the main RSS feed of the jQuery blog to show you some CSS techniques that you can reuse in your own projects. The feed will be fetched with jQuery using a local copy of it, but you can always use a server-side language to retrieve its contents. We'll see how to accomplish this as well. First, our basic markup structure:

jQuery: RSS feed rotator

Posted on April 4, 2011

Let's say that we want to create an RSS feed rotator with jQuery. To accomplish this, we need a server-side script to fetch the feed, jQuery's AJAX methods and a JavaScript timer to create the intervals between feed entries. Since fetching a feed requires some time, we hide the elements while the process is running and then we reveal them one by one with a certain delay. First, let's take a look at our PHP script:
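
The timer part of the rotator can be sketched in plain JavaScript, independently of the PHP script and of jQuery. This is only a minimal sketch: createRotator and the show callback are hypothetical names, and entries is assumed to be an array that has already been filled from the feed.

```javascript
// Sketch: cycle through feed entries, showing one at a time.
// `show` is a hypothetical callback that renders a single entry.
function createRotator(entries, show) {
  var index = -1;
  return {
    // Advance to the next entry, wrapping around at the end of the list
    next: function () {
      if (entries.length === 0) {
        return null;
      }
      index = (index + 1) % entries.length;
      show(entries[index]);
      return entries[index];
    },
    // Call next() every `delay` milliseconds; the returned id can be
    // passed to clearInterval() to stop the rotation
    start: function (delay) {
      return setInterval(this.next, delay);
    }
  };
}
```

Calling start(3000) on the returned object would reveal a new entry every three seconds, while next() can also be called by hand, for example from a "next" button.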

XML structure of a FeedBurner feed

Posted on June 16, 2010

A FeedBurner feed is usually located at the address you've chosen as your FeedBurner account name. For example, my FeedBurner feed is located at http://feeds.feedburner.com/blogspot/onwebdev. If you use a command-line download utility (such as wget), you can download it very easily. This kind of feed is automatically generated by a server-side script, so you may stumble on some problems if you try to fetch this resource from your website. In my case, I tried to use PHP's DOM extension and SimpleXML to load my feed directly, but it didn't work, because this kind of approach seems to require a static XML file. However, if you first retrieve the feed with a more common stream-based approach (like file_get_contents()), it works fine.

In any case, you need to know in advance the structure of a FeedBurner feed if you want to succeed. In this post, we'll look into the details of this kind of feed.

Root element and namespaces

<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

The root element is rss with five namespaces attached to it. The namespaces are:

atom: http://www.w3.org/2005/Atom
openSearch: http://a9.com/-/spec/opensearch/1.1/
georss: http://www.georss.org/georss
thr: http://purl.org/syndication/thread/1.0
feedburner: http://rssnamespace.org/feedburner/ext/1.0

Many of the elements that we'll find within our feed belong to one of these namespaces, so it must be clear from the start that we have to take these namespaces into account while parsing our feed. Whether we choose a server-side approach or a client-side one, if we don't know the namespace structure of our feed well, we're likely to run into problems.

The channel element

The channel element contains both item elements and some additional information that we may use to add some description to our feed. Basically, the elements directly contained within the channel element that are relevant for our purpose are:

title: the title of the whole feed
description: a brief description of our feed
link: the URI of our blog or website
lastBuildDate: the date of the latest update to our feed
managingEditor: the author of the feed

But the most important element of the channel element is surely item. It's discussed below.

The item element

Each item element contains the relevant information about a post of our blog or website. The most important children of this element are:

pubDate: the date when our post has been published
category: the category under which our post has been published (one or more elements)
title: the title of our post
description: the content of our post; XHTML tags are inserted by encoding the < and > characters as entities, namely &lt; and &gt;
feedburner:origLink: the original link of your post
author: the author of the post

We have only one problem here: the feedburner:origLink element belongs to the feedburner namespace, so we have to take this aspect into account during parsing. Basically, you have to select an element that "lives" inside the feedburner namespace. The most obvious solution to this problem is using a DOM method such as getElementsByTagNameNS() or, if you use SimpleXML or a similar extension, an approach like this:

$link_ns = $item->children('http://rssnamespace.org/feedburner/ext/1.0');
$link = $link_ns->origLink;

For more info on this solution, read this article.
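
On the client side, the getElementsByTagNameNS() approach mentioned above might look like the following sketch. The function name getOrigLink is hypothetical, and item is assumed to be an item element taken from the parsed feed.

```javascript
var FEEDBURNER_NS = "http://rssnamespace.org/feedburner/ext/1.0";

// Sketch: read the feedburner:origLink value of a single <item>.
// `item` is assumed to be an Element from the parsed feed document.
function getOrigLink(item) {
  // Select by namespace URI and local name, not by the "feedburner:" prefix
  var nodes = item.getElementsByTagNameNS(FEEDBURNER_NS, "origLink");
  return nodes.length > 0 ? nodes[0].textContent : null;
}
```

Matching on the namespace URI rather than on the prefix keeps the code working even if a feed binds the same namespace to a different prefix.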

Parsing RSS feeds with the DOM and JavaScript

Posted on May 22, 2010

Parsing RSS feeds with the traditional DOM approach is not the simplest way to perform this task. However, if you want to get a finer control over the whole process, this may be a feasible way. Here's the basic code to achieve this goal:

function XMLDoc() {
 var me = this;
 var req = null;
 if (window.XMLHttpRequest) {
  req = new XMLHttpRequest();
 } else if (window.ActiveXObject) {
  // Fallback for old versions of Internet Explorer
  try {
   req = new ActiveXObject("MSXML2.XMLHttp.6.0");
  } catch(e) {
   try {
    req = new ActiveXObject("MSXML2.XMLHttp.3.0");
   } catch(e2) {
    req = null;
   }
  }
 }

 this.request = req;
 this.loadXMLDoc = function (url, handler) {
  if (this.request) {
   this.request.open("GET", url, true);
   this.request.onreadystatechange = function () {
    handler(me);
   };
   this.request.setRequestHeader("Content-Type", "text/xml");
   this.request.send(null);
  }
 };
}

function initXML () {
 var newrequest = new XMLDoc();
 newrequest.loadXMLDoc("rss.xml", getRSS);
}

function getRSS (req) {
 req = req.request;
 var content = document.getElementById("content");
 var div = document.createElement("div");
 div.className = "entries";
 var h3 = document.createElement("h3");
 h3.innerHTML = "Recent entries";
 div.appendChild(h3);
 var ol = document.createElement("ol");
 div.appendChild(ol);

 // Go on only when the response has been fully received
 if (req.readyState == 4 && req.status == 200) {
  var root = req.responseXML.documentElement;
  var items = root.getElementsByTagName("item");

  // Create a list item with a link for each feed entry
  for (var i=0, len=items.length; i<len; i++) {
   var title = items[i].getElementsByTagName("title")[0].firstChild.nodeValue;
   var link = items[i].getElementsByTagName("link")[0].firstChild.nodeValue;
   var li = document.createElement("li");
   li.innerHTML = "<a href='" + link + "'>" + title + "</a>";
   ol.appendChild(li);
  }

  content.appendChild(div);
 }
}


window.onload = initXML;

The first object, XMLDoc, creates and stores an XMLHttpRequest object. Its loadXMLDoc() method opens the given resource with the HTTP verb GET, sets the content type for the Ajax request and accepts a function reference to handle the request via the onreadystatechange event. The function initXML() uses an instance of XMLDoc to set the event handler to the function getRSS().

The getRSS() function simply retrieves the XML document from the XMLDoc object, loops through all item elements by starting at the root of the XML document (using responseXML.documentElement as reference) and then extracts the values of the link and title elements to create XHTML links.

Although this approach allows you to see all the details of an Ajax connection, it's better to use the Ajax functionality of a JavaScript library (such as jQuery or Prototype) in order to simplify the code and avoid redundancy. Further, if you want to use this DOM approach, you should cache your data and use local variable references instead of global references. For example, instead of writing:

var content = document.getElementById("content");
var div = document.createElement("div");

you should write:

var doc = document;
var content = doc.getElementById("content");
var div = doc.createElement("div");

By doing so, you avoid global lookups and improve the performance of your script.

RSS Feeds: basics

Posted on April 3, 2010

This is a good video, very plain and simple, useful for beginners. I actually need to learn something from its simplicity.

XSLT and RSS: test results

Posted on March 8, 2010

I started my tests by styling a static RSS file with CSS after transforming it with XSLT and linking to the page through the link element. In this case browsers apply their own custom template to the RSS file and display it accordingly, so there's no way to circumvent the browser's default formatting for an RSS document (static in this case). I need more time to try server-side XSLT processing for displaying an RSS document generated on the fly. Time will tell.

XSLT and RSS: planning tests

Posted on March 5, 2010

As many of you know, browsers that support RSS show your feeds using their own formatting template. That's cool, but what happens if we want to transform an RSS feed with XSLT? We could write something like this:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="style.xsl" type="text/xsl"?>

What happens then? That's what I'm going to test. Stay tuned!

How to make an RSS feed

Posted on February 28, 2010

I find this video really interesting (maybe I should propose inserting something like this in my guide on RSS... but I look awful in videos!).

What is RSS?

Posted on February 20, 2010

Actually, I'm writing a guide on RSS but still.

The image element in RSS

Posted on February 13, 2010

Theoretically speaking, the image element in RSS should be used to insert graphics into an RSS feed. Just theoretically! In fact, according to some tests made during the writing of an RSS guide for Html.it, this element works only in Firefox. What's more, Firefox accepts it only when it's a direct descendant of the channel element. That is, it doesn't work in an item element. Developers are then forced to use the string <img /> to insert an image into their RSS feed. Oh well...

onwebdev

Web development by Gabriele Romanato
