18 May 2008

Google Reader Sync Support: a progress report

Hello folks, this is Arnold here. Lars had posted about my Summer of Code project: Google Reader Integration with Liferea.

The project was recently mentioned in a Free Software Magazine column (The 2008 Google Summer of Code: 21 Projects I'm Excited About), which also got slashdotted.

Anyway, a lot of work has been done, and it is (seemingly!) pretty much usable. So here is a progress report.

Installation and migration issues



Installation. Check out the latest sources from the Subversion repository and build them. Then right-click in the left panel, choose New->Source, and select Google Source. Enter your email ID (non-Gmail Google IDs work too) and password, and you're set. You should see your feed list in the left panel, and the "Read/Unread" status of all your items should be preserved from Google Reader.

A few notes about migration. If you have a Google Source from a previous installation, it will automatically be converted to a synchronized Google Source, so beware if you don't want Liferea to modify your Google Reader data automatically. You might also notice that some "Read/Unread" statuses have changed after migrating; this is because the synchronized Google Source gives preference to the "Read/Unread" statuses from Google Reader.

What works



Let's say you have been using Google Reader online. (For clarity, by Google Reader I mean the online Google Reader API and/or interfaces; I use the term Google Source for the Google functionality within Liferea.) On adding the Google Source, the first thing you will notice is that the "Read/Unread" statuses are retrieved from Google.

Liferea will synchronize subscription lists and "Read/Unread" statuses both to and from Google Reader. This is the main functionality. If you are offline when a local change is made, Liferea will propagate the change to Google Reader later, once you are connected again.
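The offline behaviour described above can be pictured as a queue of pending changes. What follows is only a minimal Python sketch of that idea, not Liferea's actual code (Liferea itself is written in C). The names PendingChanges, record, flush and the send callback are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Tuple

# A change is (item_id, is_read). This layout is an assumption made
# purely for illustration.
Change = Tuple[str, bool]


class PendingChanges:
    """Queue of local read/unread edits waiting to be pushed to the server."""

    def __init__(self, send: Callable[[Change], bool]) -> None:
        # `send` stands in for the real network call (hypothetical);
        # it returns True when the server accepted the change.
        self._send = send
        self._queue: Deque[Change] = deque()

    def record(self, item_id: str, is_read: bool) -> None:
        """Remember a local edit, then try to push it right away."""
        self._queue.append((item_id, is_read))
        self.flush()

    def flush(self) -> int:
        """Push queued edits in order, stopping at the first failure.

        Returns the number of edits that were sent.
        """
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break  # still offline: keep the edit for the next attempt
            self._queue.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._queue)
```

Calling flush() again whenever the connection comes back would push whatever accumulated while offline, in the order the edits were made.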

Another cool feature that you will notice is the "broadcast-friends" node. You can now read all the posts shared by your Google Reader friends from within Liferea, although as of right now it doesn't show the name of the person who shared each post.

What does not work, or does not work correctly



Comments do not work. This is an inherent issue with using Google Reader as a source: if you have been using Google Reader, you will have noticed that it does not show you a list of comments on an item, while Liferea can (for most feeds which support it). Liferea relies on some information in the feed, which the Google Reader API discards. Until Google makes changes to their API, we really can't do much about it.

Labels, or folders. All the posts within the Google Source in Liferea fall into the same folder. Google Reader categorizes feeds by labels, which can be used to produce a hierarchical folder structure. I will definitely be implementing this over the summer.

Stars, and the "Important" flag. Liferea can flag an item as "Important", and Google Reader can mark an item as "starred". Eventually, I would like to synchronize these two.

Sharing. While you can see posts shared by others, you cannot share an item from within Liferea.

Efficient Updating


In my SoC abstract, I proposed that we could use Google Reader to save the user's bandwidth while updating. Here is the idea I had in mind:

Normally a user can have hundreds of feeds in their feed list. Whenever Liferea does an update, it has to update each one of them, even if usually only one or two have changed. This wastes both time and bandwidth. Using the Google Reader API, however, it is possible to download just the "reading-list", a single list of items across all subscriptions, in one HTTP request. In some sense I'm downloading all the changes to all my feeds in one go. I can use this to recover the "Read/Unread" flags: any item in the reading-list is marked as unread, and any unread item under the Google Source that does not appear in the reading-list is marked as read.
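The flag-recovery rule in the previous paragraph can be written down in a few lines. This is a rough Python sketch of the rule only, assuming items can be identified by a string id. The function name reconcile_read_flags and the data layout are made up for illustration and are not taken from Liferea or the Google Reader API.

```python
from typing import Dict, Set


def reconcile_read_flags(local: Dict[str, bool], reading_list: Set[str]) -> Dict[str, bool]:
    """Return updated {item_id: is_read} flags after one reading-list download.

    `local` maps each locally known item id to its current read flag.
    `reading_list` holds the ids found in the downloaded reading-list.
    """
    merged: Dict[str, bool] = {}

    # A locally known item is unread exactly when it appears in the
    # reading-list; anything missing from the list is treated as read.
    for item_id in local:
        merged[item_id] = item_id not in reading_list

    # Items seen for the first time arrive through the reading-list,
    # so they start out unread.
    for item_id in reading_list:
        merged.setdefault(item_id, False)

    return merged


if __name__ == "__main__":
    before = {"a": False, "b": False, "c": True}
    print(reconcile_read_flags(before, {"b", "d"}))
    # {'a': True, 'b': False, 'c': True, 'd': False}
```

In the usage example, "a" was unread locally but is absent from the reading-list, so it becomes read; "d" is new and arrives unread.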

However, as simple and clean as this might sound, it is flawed. Say that while you are at the office, some new posts arrive in the Google Reader webapp. You read them there, so those items are marked as read. Once you are back home, you do an update in Liferea: these new posts will not appear in the "reading-list", so Liferea will never learn that they existed. You might say it's not a cause for concern, since you have already read those posts, but maybe you want them for offline reading.
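The office/home scenario can be replayed with plain sets. This is only a toy model, written under the same assumption as the text above, namely that the reading-list carries nothing but unread items. All ids are invented.

```python
# Everything Google Reader has fetched so far (made-up ids).
server_items = {"post-1", "post-2", "post-3"}

# "post-3" arrived and was read in the webapp at the office.
read_on_server = {"post-3"}

# What Liferea at home knew before the update.
local_items = {"post-1"}

# Under the assumption above, the reading-list holds only unread items.
reading_list = server_items - read_on_server

# After the update Liferea knows its old items plus the reading-list.
known_after_update = local_items | reading_list

# Items that exist on the server but never reach Liferea.
missed = server_items - known_after_update
print(sorted(missed))  # ['post-3']
```

The post that was read elsewhere is exactly the one that never shows up locally.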

So ideally, instead of requesting the "reading-list", I would like to request a list of "changes since such-and-such date." Unfortunately, the Google Reader API has no such feature, as far as I and the pyrfeed Google Reader API page know. I am keen on getting feedback and suggestions regarding this, since I would love to have this feature myself!

Feedback, Suggestions and About Me



For a good part of the IST-day, you can catch me on #liferea by the nick arnstein. You can also mail me at arnstein87 AT gmail DOT com.

I have just completed my undergraduate studies at Chennai Mathematical Institute, and will be joining the University of Pennsylvania for a PhD in Computer Science this fall. I blog here.

15 comments:

Anonymous said...

> Comments do not work.

The solution seems to be painfully obvious: just fetch the feed URLs from Google Reader and then look at them normally...

Lars said...

@anonymous: Your solution is painfully obviously wrong. The idea of having Google Reader support in the first place is to have full read state synchronisation. How is this possible when not fetching the items via Google Reader?

Anonymous said...

I think the other anonymous reader means: fetch the post's URL (via Google) and then check that site's RSS/Atom feed for the comment info.

Lars said...

I know, this is how I understood him too. But it is very questionable to fetch the whole feed from two sources just because one of them doesn't support comment feed links. It also assumes that the items from the two different feed sources (their item ids) can be easily matched.

I think this is a bad idea. One reason is bandwidth sanity; the other is that it means implementing strange workarounds.

Amblin said...

While this completely ROCKS and will probably make Liferea even more my default reader, it would be even sweeter if the backend was something other than gReader. Something I can run myself (Gregarius, for example).

arnie said...

@amblin: That's a great idea! I had thought that supporting a personal server would involve writing the server API. Now that you have pointed me to Gregarius, I will definitely consider it after I'm done with Google Reader.

arnie said...

WARNING! WARNING! WARNING!

Those of you who did decide to pull the sources from SourceForge: check out the latest and recompile. Otherwise, when you delete your Google Source root node, you could potentially lose all or part of your subscriptions on Google Reader.

Anonymous said...

How about working with tt-rss for a synced Liferea client solution, using tt-rss as a server component and alternate client?

georgey said...

> The idea of having Google Reader support in the first place is to have full read state synchronisation.

nicely put!

Marilu said...

Great work.

Cagnulein said...

Good Job!

Hezy said...

Very nice.

I have already made the transition to an open source online reader (fastladder). Is there a chance you'll work on that too?

שמייקל said...

I've noticed the Google Reader login and password are stored as plain text in the configuration file. Is there any option to secure the info, or otherwise not store it (retyping it every time)?

Lars said...

Right now there is no way to store logins/passwords encrypted. Everything is kept in plain text in ~/.liferea.

Anonymous said...

And the label/folder is still not there. It's 2011 :-)

Looks like I will have to wait a little more to use Liferea.