21 December 2008

Wanted: Release Manager

Is anyone interested in the job of Liferea release manager? I think it would help to have someone enforcing a tighter release schedule, who could also prioritize bug reports and follow up on them. Now and then (especially lately) releases do not arrive within the intended 2-4 week period, and I believe a dedicated release manager could improve this.

Here is what this unpaid job is about:

  • Deciding on release dates
  • Creating releases (SVN checkout, creation of the distribution tarball, creation of a new SF release)
  • Propagating release (SF news feed, mailing list, webpage)
  • Closing solved bug reports

I know this is quite a lot of work, as I have been doing this for some years now. It would be nice to have someone take care of it, so that I could spend more time on the code and the bugs.

If you are interested (no matter how much experience you have), post a comment or write a mail to the mailing list. All contributions are welcome!

12 December 2008

Webkit Support Progress

In SVN trunk the Webkit rendering now has correct context menu support. While I'm not sure that the part of the Webkit API handling the popup is a stable interface, for now it works. Epiphany doesn't seem to use the popup handling right now...

The good news is that this was the last missing functionality for the Webkit rendering. Now you can zoom in/out, copy links, open links in the internal and external browser, and bookmark pages with Webkit too.

Note: There are stability issues with Webkit. Users report random crashes when surfing and also crashes when using Flash. Hopefully this will improve in the future for Webkit rendering to become useful.

10 December 2008

64bit Flash for Linux

The new 64bit Flash 10 player for Linux seems to do real wonders. I read more and more reports of people who solved their crashing issues for Firefox and Liferea with the new plugin.

25 October 2008

Compiling with Automake 1.10

Right now Liferea doesn't build with Automake 1.10 (it works up to 1.9.x). When running configure you get errors like these:

configure.ac:178: warning: macro `AM_GCONF_SOURCE_2' not found in library
configure.ac:16: warning: LT_AC_PROG_SED is m4_require'd but not m4_defun'd
acinclude.m4:6: LIFEREA_CONFIG_NICE is expanded from...
configure.ac:16: the top level
configure.ac:16: warning: LT_AC_PROG_SED is m4_require'd but not m4_defun'd
acinclude.m4:6: LIFEREA_CONFIG_NICE is expanded from...
configure.ac:16: the top level
configure.ac:16: warning: LT_AC_PROG_SED is m4_require'd but not m4_defun'd
acinclude.m4:6: LIFEREA_CONFIG_NICE is expanded from...
configure.ac:16: the top level
configure.ac:7: error: possibly undefined macro: AC_ENABLE_SHARED
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.ac:8: error: possibly undefined macro: AC_ENABLE_STATIC
configure.ac:20: error: possibly undefined macro: AC_LIBTOOL_DLOPEN
configure.ac:21: error: possibly undefined macro: AC_PROG_LIBTOOL
configure.ac:178: error: possibly undefined macro: AM_GCONF_SOURCE_2
configure:2694: error: possibly undefined macro: LT_AC_PROG_SED
autoreconf: /usr/bin/autoconf failed with exit status: 1

I've searched the net for answers and tried a lot of things, but couldn't find much on the issue. Does anyone reading the blog know what to do to migrate from automake 1.9 to 1.10?

Update: After a hint from Adrian Bunk the solution to the problem above is to ensure a correct libtool installation. Reinstalling the libtool package did solve the problem for me.

21 September 2008

Filter for laconi.ca Micro Blog Feeds

Jens Vierbuchen posted a hint on the mailing list about a filter for laconi.ca micro blog feeds. With this filter you get items using the same HTML layout that the laconi.ca websites themselves use (shaded background, author images...).

("identiger" screenshot taken from project page)

06 September 2008

Finished GSoC Project

Some days ago the 2008 Google Summer of Code final examinations took place, and you might ask how the Liferea application worked out. Well, Arnold Noronha reached all the goals we defined back in May/June, and we now have full Google Reader synchronisation for the 1.5.x branch!

So thanks to Google for sponsoring the student and thanks to Arnold for completing everything planned!

17 August 2008

Google Reader Synchronization

As you might know I've been working on Google Reader Synchronization in Liferea for a Google SoC project.

Let me go through what has been done, and what has not.

What's new

A noticeable change is the ability to mark items as "Important" or, in the Google Reader lingo, "Starred". Liferea will synchronize this flag to, and from, Google Reader.

Another cool feature that has been added is an efficient feed updater (let's call this fast-update). Here's how it works: every 10 minutes, Liferea makes a request to Google Reader asking for a list of modified feeds. The response is very small, so this hardly affects your bandwidth. From this small response we can determine which feeds have been updated, and download exactly these feeds. This means that when a new post is available, you will get the update within 10 minutes! Every 24 hours a full-update is done to complete the synchronization, because there are some situations where fast-update can miss subscriptions.
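The scheduling idea can be sketched in a few lines of Python. This is only an illustration, not the Liferea code (which is written in C): the function name `plan_update`, the constants and the plain sets of feed names are all made up for the example, and fetching the list of modified feeds from Google Reader is left out entirely.

```python
FAST_INTERVAL = 10 * 60        # seconds between fast-updates (from the post)
FULL_INTERVAL = 24 * 60 * 60   # seconds between full-updates (from the post)


def plan_update(now, last_full, modified_feeds, all_feeds):
    """Return the feeds to download in this update cycle (hypothetical helper).

    now, last_full: timestamps in seconds.
    modified_feeds: feeds reported as changed by the small status request.
    all_feeds: every subscription below the Google Source.
    """
    if now - last_full >= FULL_INTERVAL:
        # Full-update: also catches subscriptions a fast-update may miss.
        return sorted(all_feeds)
    # Fast-update: download only the feeds reported as modified.
    return sorted(feed for feed in modified_feeds if feed in all_feeds)


feeds = {"a", "b", "c"}
print(plan_update(now=FAST_INTERVAL, last_full=0,
                  modified_feeds={"b"}, all_feeds=feeds))
print(plan_update(now=FULL_INTERVAL, last_full=0,
                  modified_feeds=set(), all_feeds=feeds))
```

The first call is a fast-update and selects only the modified feed; the second one has reached the 24 hour mark and selects every subscription.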

What does not work

Comments do not work (as I have pointed out in the past, this is something I cannot fix). I decided not to implement Labels as Folders, simply because some users (who might have been more generous at tagging their subscriptions!) would not like it.

Synchronization with other Web-based aggregators

True --- I wanted to work on this. However, I couldn't get myself motivated enough, because I didn't think I would be using it. :-) I would have had to learn the intricacies of the new API. (Btw, the Google Reader API is *nasty*!) And since I won't be using it, I won't be able to maintain it properly. (Oh, excuses!) If you need help implementing synchronization for another web-based aggregator, I will be ready to help. As long as they provide a clean API, it should not be too hard.

That's all folks!

Although SoC is almost over, I will continue developing and maintaining the Google Reader code in Liferea. So you are most welcome to give me suggestions at any time.

12 August 2008

Video Bug Reports

Today I found this video, which a Liferea user created and posted on his YouTube channel:

It describes two problems with the current handling of the 'updated' item state:
  1. Changes are not persistent. There is a bug preventing correct saving of the state change into the DB.
  2. Many users confuse this state with the item read state and wonder why "Mark All Read" cannot be used to reset the 'updated' state.
While the first problem is a functional one and could be fixed, I still decided to solve everything by removing the current 'updated' state UI (the icon in the item list). The reason is that I believe it to be of low value to most users, unintuitive to distinguish from the 'read' state, and visually disturbing when you do see it in the item list. There are just too many disadvantages, and removing it will save code, documentation and support effort.

If you were ever confused by this feature: this will be fixed starting with 1.4.19.

01 August 2008

How to run VACUUM

As explained in the last post I see no way to automatically run the "VACUUM" command of sqlite which more or less defragments the DB structure. Nonetheless for everyone who wants to run it manually here is how to do it:
  1. Shut down Liferea
  2. Start the sqlite client by running: "sqlite3 ~/.liferea_1.4/liferea.db"
  3. At the prompt enter: "VACUUM;"
  4. Wait until the prompt reappears.
  5. Restart Liferea
Situations when you might want to VACUUM
  • When the DB file (~/.liferea_1.4/liferea.db) is very large (e.g. >50MB)
  • When you have only a few feeds with a low cache setting (e.g. 30 feeds and 100 items) and believe Liferea to be unreasonably slow.
  • When you have run Liferea for ages.
If you don't know what this is all about: please do not worry about it. In many cases you might not need to do anything.
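If you prefer a script over the interactive sqlite3 client, the manual steps above could look roughly like this in Python. Treat it as a sketch: the helper name `vacuum` is made up, the standard `sqlite3` module is assumed to be available, and Liferea must still be shut down before you run it.

```python
import os
import sqlite3


def vacuum(db_path):
    """Run VACUUM on db_path and return (size_before, size_after) in bytes."""
    before = os.path.getsize(db_path)
    # isolation_level=None puts the connection into autocommit mode;
    # VACUUM cannot be executed inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")  # returns once the DB file has been rebuilt
    finally:
        conn.close()
    return before, os.path.getsize(db_path)
```

A call such as `vacuum(os.path.expanduser("~/.liferea_1.4/liferea.db"))` then returns the file size before and after, which gives a quick impression of whether the run was worth it.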

Why auto-VACUUM is no good...

During the various recent performance discussions, people here and there suggested running "VACUUM" on the Liferea database once it gets slow. This is in line with the sqlite documentation, which says:

When an object (table, index, or trigger) is dropped from the database, it leaves behind empty space. This makes the database file larger than it needs to be, but can speed up inserts. In time inserts and deletes can leave the database file structure fragmented, which slows down disk access to the database contents.

The VACUUM command cleans the main database by copying its contents to a temporary database file and reloading the original database file from the copy. This eliminates free pages, aligns table data to be contiguous, and otherwise cleans up the database file structure.

The problem is that it also takes very long. With a 50MB DB file I experienced a runtime of over 1 minute. This is why it can only be a tool for experienced users who know how to run it manually and know what to expect. Executing such a long-running operation automatically at runtime would surely be unacceptable to the unsuspecting user. Also, there is no good way to decide when to do a VACUUM to save disk space and improve performance.

28 July 2008

Fix for 100% CPU Usage Problem

After reports from several users that tested with the new release 1.4.18 I believe the 100% CPU issue along with other related symptoms is fixed. Thanks to everyone who retested!

Here is what happened: release 1.4.16 brought a DB schema migration that fixed a design problem which caused comment items to get "lost" in the DB. Comments must be removed when their parent items are removed, and this didn't work well before 1.4.16. Now I have to admit that I tested the comment removal with 1.4.16 very well and it worked as expected, but I failed to notice that the changed DB schema caused the parent item removal to silently do nothing.
The effect is a slow one: your DB file grows, and on each feed merge you get more and more old items that should have been removed due to the cache size setting. Merging (sometimes including full text comparison) against a growing list of items becomes slower each time. For users with lots of regularly updated feeds Liferea finally became unusable because it was merging items constantly.

Now when you start 1.4.18 you still might see some CPU usage during the first update run, because it has to delete a lot of stale items, but afterwards it should run as fast as earlier versions.

Everyone please upgrade to 1.4.18

26 July 2008

Performance Poll Results

First thanks to everyone who answered the three questions!

Your feedback definitely helped to get a better picture of the types of setups out there. Here are my conclusions based on the feedback:
  1. A significant number of users are subscribed to far more feeds than expected, often twice as many. The Liferea target use case that I had in mind up to now was only up to 100 feeds. I think it is necessary to correct the number of acceptable subscriptions to somewhere around 250 and ensure performance with such a number of feeds.
  2. Not all but many users (feels like 80%) do suffer from bad performance. I consider all GUI delays for simple actions (e.g. switching feeds, marking a single feed read) > 2s as bad.
  3. Definitely all users suffer from the linear cost of the complex operations (loading huge item lists, full text search).
Now just redefining the target use case won't do any good. The question is whether the current implementation can be tuned to significantly improve the performance. And the answer is simple: No. It can't. The current overly simple design was chosen and is limited based on the effort spent on the project. "Simple" means both simple use cases and no elaborate internal architecture.

So the problem is to solve the issues above. To enable better scaling the internal architecture has to be changed. Right now feed merging (downloading the XML, parsing it, merging items against DB) is done synchronously in the GUI thread. This makes it easy to implement, but hurts each time you navigate Liferea while a background update is in progress and results are processed. The second point from above could be addressed by correctly decoupling GUI and merging. The third point could be solved by shifting the focus from processing subscription caches as a whole to batch processing their items in background...
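To make the decoupling idea a bit more concrete, here is a small Python sketch of a worker thread that merges items while the caller only hands over jobs and picks up results. It is purely illustrative and says nothing about how Liferea will actually do it: `merge_worker`, `run_merges` and the list-based "merge" are invented for the example, and a real GUI would poll the result queue from its main loop instead of waiting for the worker to finish.

```python
import queue
import threading


def merge_worker(jobs, results):
    """Merge downloaded items against cached items outside the GUI thread."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: no more work
            break
        feed_id, new_items, cached_items = job
        # Toy merge: keep the cache and append items not seen before.
        merged = cached_items + [i for i in new_items if i not in cached_items]
        results.put((feed_id, merged))


def run_merges(job_list):
    """Hand all jobs to one worker thread and collect the merged results."""
    jobs, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=merge_worker, args=(jobs, results))
    worker.start()
    for job in job_list:
        jobs.put(job)
    jobs.put(None)
    worker.join()                # a GUI would not block here
    return [results.get() for _ in job_list]


print(run_merges([("feed-1", ["b", "c"], ["a", "b"])]))
```

The point of the design is that the expensive comparison happens in `merge_worker`, so the thread that draws the window is free to react to clicks while an update is being processed.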

But not to forget: these are only ideas. And Liferea is in need of developers! I'm a professional SW tester with some administration skills and do the development only as a hobby. While I'll try to improve the program, other really skilled developers might be able to do the same much, much better with less effort.

So again: Please consider contributing code!

Concerning immediate solutions: please try the upcoming release 1.4.18 which should solve the 100% CPU issue.

20 July 2008

Performance Poll

Today I'd like everyone following this blog (or accidentally reading this) to take part in a small poll. I'm interested in subjective performance feedback.

Please answer the following questions using a comment post:
  1. Is Liferea loading feeds / search folders quickly enough?
  2. How many feed subscriptions do you have?
  3. What is the longest period of unresponsiveness, in seconds, when you use Liferea? Please name the feature causing it.
Just post something like

no, 100, 5s clicking on "Unread" search folder...

You can post comments anonymously!

01 July 2008

Serious Problems with XulRunner 1.9

With more and more distributions upgrading to the new XulRunner version 1.9 more and more users send bug reports of Liferea becoming unusable because of constant 100% CPU usage.

Right now I'm sorry to say that I have no clue what causes this; debugging is still going on. Hopefully we will find the problem.

Affected versions: There seems to be no limit to the affected versions. I got reports from 1.4.12 up to 1.4.16b, all affected. The common factor is that all these setups use a recent XulRunner 1.9. Until now there have been no reports about setups with XulRunner 1.8 having the problem.

If someone reading this with insight in XulRunner/Gecko/Mozilla has any idea please post a comment or send a mail!

22 May 2008

Flash 10 with Liferea using XulRunner

Stebalien explains how to get Flash working when your Liferea installation uses XulRunner for rendering, but your distro didn't install the Flash plugin for XulRunner.

18 May 2008

Google Reader Sync Support: a progress report

Hello folks, this is Arnold here. Lars had posted about my Summer of Code project: Google Reader Integration with Liferea.

The project got mentioned recently in a Free Software Magazine column (The 2008 Google Summer of Code: 21 Projects I'm Excited About), that also got slashdotted.

Anyway, a lot of work has been done, and it is (seemingly!) pretty much usable. So here is a progress report.

Installation and migration issues

Installation. Check out the latest sources from the Subversion repository and build them. Now in the left panel, right click, choose New->Source, and choose Google Source. Give your email ID (which can also be a non-Gmail Google ID) and password. And you're set. You should see your feed list in the left panel. You will observe that the "Read/Unread" status of all your items is preserved from Google Reader.

A few notes about migration. If you have a Google Source from a previous installation, it will automatically be converted to a synchronized Google Source. So beware if you don't want Liferea to automatically modify your Google Reader data. You might also notice that some of the "Read/Unread" statuses before migrating have changed, this is because the synchronized Google Source gives preference to the "Read/Unread" statuses from Google Reader.

What works

Let's say you have been using Google Reader online (for clarity, by Google Reader I will be referring to the online Google Reader API and/or interfaces. I will use the term Google Source for the Google functionality within Liferea). Now, on adding the Google Source, the first thing you will notice is that the "Read/Unread" statuses are retrieved from Google.

Liferea will synchronize subscription lists, and "Read/Unread" statuses both to, and from, Google Reader. This is the main functionality. If you are offline when a local change is made, it will propagate the changes to Google Reader later, when you get connected.

Another cool feature that you will notice is the "broadcast-friends" node. You can now read all the posts that are shared by your Google Reader friends from within Liferea. Although, as of right now, this doesn't show the name of the person who shared it.

What does not work, or does not work correctly

Comments do not work. This is an inherent issue with using Google Reader as a source: if you have been using Google Reader, you would have noticed that Google Reader does not show you a list of comments to an item, while Liferea can (for most feeds which support it). Liferea relies on some information in the feed, which the Google Reader API discards. Until Google makes changes to their API, we really can't do much about it.

Labels. Or Folders. All the posts within Google Source in Liferea, will fall in the same folder. Google Reader categorizes feeds by labels, which can be used to produce a hierarchical folder structure. I will definitely be implementing this over the summer.

Stars, and the "Important" flag. Liferea can flag an item as "Important", and Google Reader can mark an item as "starred". Eventually, I would like to synchronize these two.

Sharing. While you can see posts shared by others, you cannot share an item from within Liferea.

Efficient Updating

In my SoC abstract, I proposed that we can use Google Reader to save the user's bandwidth while updating. Here is the idea I had in mind:

Normally a user can have hundreds of feeds in his feedlist. Whenever Liferea does an update, it has to update each one of these, even if usually only one or two of them have changed. This wastes both time and bandwidth. Using Google Reader API, however, it is possible for us to download only the "reading-list" -- the reading list over all subscriptions -- in one HTTP request. In some sense I'm downloading all the changes to all my feeds in one go. I can use this to recover the "Read/Unread" flags: for any item in the reading-list, the item is marked as Unread. Any unread item under the Google Source, which does not appear in the reading-list, is marked as read.
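As a rough Python sketch of this reconciliation rule (illustrative only: the function name and the dictionary layout are made up, and the actual download of the reading list through the Google Reader API is not shown):

```python
def apply_reading_list(local_items, reading_list_ids):
    """Return a new id -> is_read mapping after one reading-list download.

    local_items: dict mapping item id -> True if the item is read locally.
    reading_list_ids: ids contained in the downloaded reading list, which
    is assumed here to contain unread items only.
    """
    unread = set(reading_list_ids)
    updated = {}
    for item_id in local_items:
        # A known item stays unread only if the reading list still has it.
        updated[item_id] = item_id not in unread
    for item_id in unread:
        # An item seen for the first time arrives as unread.
        updated.setdefault(item_id, False)
    return updated


state = apply_reading_list({"a": False, "b": False}, ["b", "c"])
print(state)
```

Here item "a" was unread locally but is missing from the reading list, so it gets marked as read; "b" stays unread and "c" is added as a new unread item.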

However, as simple and clean as this might sound, it is flawed: say, while you are at the office, you receive some new posts in the Google Reader webapp. You then read them, and so those items are marked as read. Now once you are back at home, you do an update in Liferea: these new posts will not appear in the "reading-list", so Liferea will never come to know that they ever existed. You might say it's not a cause for concern, since you have already read those posts -- but maybe you want them for offline reading.

So ideally, instead of requesting the "reading-list", I would like to request a list of "Changes since so-and-so date." Unfortunately, the Google Reader API has no such feature as far as I and the pyrfeed Google Reader API page know. I am keen on getting feedback and suggestions regarding this, since I would love to have this feature myself!

Feedback, Suggestions and About Me

For a good part of the IST-day, you can catch me on #liferea by the nick arnstein. You can also mail me at arnstein87 AT gmail DOT com.

I have just completed my undergraduate studies from Chennai Mathematical Institute, and will be joining University of Pennsylvania for a PhD in Computer Science this fall. I blog here.

06 May 2008

New Subscription Options in 1.5.3

Release 1.5.3 will introduce new subscription options that might help with some minor use cases where you want to modify the Liferea default behaviour for certain subscriptions:

03 May 2008

Full Google Reader Sync Support

With the currently ongoing Google Summer of Code 2008, Arnold Noronha is implementing full Google Reader synchronization support in Liferea! He has already started coding, and SVN trunk already provides item download via Google Reader, while previously there was only Google Reader to Liferea feed list synchronization. So everyone who ever asked for that feature: have a look at Liferea again in two months, after the Google Summer of Code 2008 project is over!

16 March 2008

Advanced Searching

Over time many users asked to be able to make more complex searches using the search box (e.g. matching multiple terms or doing exclusive matches...). The new unstable release 1.5.1 introduces an "Advanced..." button in the standard search dialog. When you click this button the dialog changes into one very similar to the search folder properties dialog. Here you can define one or more search rules to realize much more complex queries. This advanced search functionality should thereby cover a lot more use cases.

Note: Liferea still does not allow you to search only the current feed or all feeds of a given folder. The current DB schema doesn't allow building views with such filters. But given time this might be improved...

12 March 2008

Liferea + Firefox + Ubuntu

Recently quite a few Ubuntu users had trouble getting feed subscription with Firefox to work. In all cases it turned out that the Ubuntu package firefox-gnome-support was missing. So if you are using Ubuntu and Firefox please check if you have this package installed!

01 March 2008

Favicons and Hosted Blogging

When looking at your subscription list you might notice that many feeds have the same icon. For example a white-on-orange "B" for Blogger, a blue pencil for LiveJournal, a flame icon for FeedBurner and probably others...

If you visit the websites of the respective feeds, your browser will usually present a different icon in the URL bar. Now one might ask why Liferea cannot use the same one.

The problem is that there are two ways of retrieving these icons.
  1. Relative to the URL of the website (e.g. as "<webserver>/favicon.ico")
  2. As a specific icon file linked in each HTML document served.
Of course just placing a "favicon.ico" file in the root directory of the website is the easiest way to provide a favicon. But this doesn't work anymore with hosted blogging (as provided by Blogger, LiveJournal and many others) or feed caching (as used by FeedBurner and many others). The hosted blogging solutions just do not allow you to put a "favicon.ico" file anywhere (thereby breaking discovery variant #1), and the feed cachers usually work with URL redirection to serve the cached feed content (thereby breaking discovery variant #2).

So what should I do to help the feed reader to find my favicon?

Solution for hosted blogging: You cannot rely on a "favicon.ico" file, so to replace your provider's icon you have to upload a favicon image (with an arbitrary name) and add a link to it directly in your HTML template. The link needs to be placed inside the <head> element and could look like this:

<link rel="shortcut icon" type="image/png" href="http://myhoster.com/content?blogId=4396446&fileId=4387343">

Note that Liferea relies on the MIME type and will refuse all images without a specified MIME type.
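For readers who want to check what a feed reader is likely to find on their page, the two discovery variants can be sketched in Python. This is not Liferea's implementation, just an approximation under stated assumptions: the class and function names are invented, links without a `type` attribute are skipped to mirror the MIME type requirement, and "/favicon.ico" is appended as the fallback of variant #1.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collect <link rel="... icon ..."> hrefs that declare a MIME type."""

    def __init__(self):
        super().__init__()
        self.icons = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        # Skip links without href or without an explicit MIME type.
        if "icon" in rel and attributes.get("href") and attributes.get("type"):
            self.icons.append(attributes["href"])


def favicon_candidates(page_url, html):
    """Return icon URLs to try: linked icons first, then /favicon.ico."""
    parser = IconLinkParser()
    parser.feed(html)
    linked = [urljoin(page_url, href) for href in parser.icons]
    return linked + [urljoin(page_url, "/favicon.ico")]
```

Feeding your own blog's HTML into `favicon_candidates` shows quickly whether your template really exposes a typed icon link or only the provider's default location.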

Solution for feed caching: You can either use the "shortcut icon" link mechanism described above or, if you use Atom feeds, you can also specify the original "favicon.ico" file there. For RSS feeds you must fall back to specifying the icon link in the website HTML.

If you think there are better solutions please let me hear about it in the comments!

28 February 2008

Attention Profile

Liferea is a news aggregator and each day allows its users to read maybe hundreds of new blog posts, news articles or podcasts. Many of those are tagged by their authors with descriptive categories. So if it knows what the user likes to read most, why can't it preselect those favourite "types" of articles?

The new 1.5 code now keeps track of the absolute number of read categories. Under the "Tools" menu you can now find a new option "Attention Profile" to view the per-category count.
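The bookkeeping behind such a per-category count is simple. A tiny Python sketch (names are invented for the example; Liferea keeps its own statistics internally):

```python
from collections import Counter


def attention_profile(read_items):
    """Count how often each category occurs among the items that were read.

    read_items: one list of category strings per read item.
    """
    counts = Counter()
    for categories in read_items:
        counts.update(categories)
    return counts


profile = attention_profile([["linux", "gnome"], ["linux"], []])
print(profile.most_common(1))
```

Sorting the result by count is all that is needed to get a list of the most-read categories.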

While this might not yet be very useful, this statistic keeping opens up the possibility for more sophisticated features. For example search folders for your most favourite categories, feed and item rating, APML exporting...

Be warned: this is experimental. It might work out, it might not. It might hurt performance, or not. It also raises ethical questions about creating user profiles. These are all things that still need to be thought about.

Update: Due to performance problems, the Attention Profile has been disabled for Liferea 1.6

27 February 2008

"All Rules Match" Search Folders

Until now search folder rules were either "additive" or "removing". This meant that an item was displayed by the search folder when at least one of the "additive" rules and none of the "removing" rules matched it. User feedback over time showed that this is not always intuitive and does not fit every use case.

To improve this the search folder properties for 1.5 have changed:

Instead of the long logic explanation there are now two radio buttons allowing to define the intended logic. With "Any Rule Matches" you can create search folders that for example match several rare terms. And using "All Rules Must Match" you can filter all feeds for items on a specific topic identified by one or more keywords.
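The difference between the two radio buttons boils down to "any" versus "all". A minimal Python sketch, assuming each rule is just a case-insensitive substring match (real search folder rules are richer than that, and the function name is made up):

```python
def matches(item_text, terms, mode="any"):
    """Check an item against keyword rules.

    mode="any": at least one term must occur ("Any Rule Matches").
    mode="all": every term must occur ("All Rules Must Match").
    """
    text = item_text.lower()
    hits = (term.lower() in text for term in terms)
    return all(hits) if mode == "all" else any(hits)


title = "GNOME 2.24 released with new Epiphany"
print(matches(title, ["gnome", "kde"], mode="any"))
print(matches(title, ["gnome", "kde"], mode="all"))
```

The same two terms give a match in "any" mode but not in "all" mode, which is exactly the choice the new dialog exposes.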

To give proper credit I must mention that this change was motivated by the search dialog of RSSOwl (a great platform-independent Java-based aggregator), which has even more nice features like instant preview and live updating.

26 February 2008

Release Schedule Calendar

For everyone who needs to know when the next Liferea version will be released (approximately) I created an online calendar, which is embedded at the bottom of the blog main page and can be subscribed in ICAL and Atom format.

12 February 2008

Better Handling Plain Text Content

Current Liferea releases do not handle plain text RSS item content gracefully. If the feed generator does not deliver item content as escaped HTML markup, all text of such an item ends up on one line without its line breaks being rendered. This doesn't look very good and makes lists or formatted plain text unreadable.

For 1.5.x the plan is to solve the problem by auto-detecting the text type of the item description. If it contains no markup then it is treated as plain text and all ASCII line breaks are converted to HTML line breaks for correct rendering. The critical point here is the plain text/HTML detection. The test implementation in SVN trunk currently only checks for physical HTML tags like <i>, <b> or <a href=""> indicating HTML markup. The risk of this approach is adding extra line breaks to valid HTML content that is not correctly recognized.
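In Python the detection idea could be sketched like this. Note the assumptions: the regular expression accepts anything that looks like a tag rather than the specific tag list checked in SVN trunk, the function name is invented, and no escaping of special characters is attempted.

```python
import re

# Anything that looks like an opening or closing tag counts as markup.
TAG_RE = re.compile(r"</?[a-zA-Z][^>]*>")


def render_description(text):
    """Return text unchanged if it looks like HTML, else add <br/> breaks."""
    if TAG_RE.search(text):
        return text  # treated as HTML, left alone
    # Treated as plain text: normalize line endings, then convert them.
    return text.replace("\r\n", "\n").replace("\n", "<br/>")


print(render_description("first line\nsecond line"))
print(render_description("<b>bold</b>\nalready HTML"))
```

The second call shows the risk mentioned above from the other side: as soon as one tag is found, no line break is converted at all, so plain text that merely quotes a tag would lose its line breaks.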

If you try 1.5.x/SVN trunk and experience formatting problems, such as twice as many line breaks or missing line breaks for pure plain text, please give some feedback!

Handling Redundancy in Content

Nowadays many feed sources provide content using Atom or RSS and augment it with application-specific namespaces whose tags often duplicate the content of the container format. For example an iTunes podcast can have an item <description> in the Atom/RSS <item> tag along with an <itunes:summary> description of different quality.

Up until 1.4.x Liferea had a simple implementation primarily using the Atom/RSS description. The exception was the <content:encoded> tag from the Content namespace which, depending on tag order, would overrule the default description. Only if there was no default item description were additional namespace infos (atom:summary, dc:description...) used as a content source.

This was an unsatisfactory solution for several reasons:

  • More detailed infos in application specific namespaces are invisible.
  • Ordering problems with <description> and <content:encoded> did sometimes hide better content.
  • Dublin Core description (while rare to encounter) did never win.
  • The scenario of a better summary than description always caused the short description to win.
As a simple solution Liferea 1.5.x now selects the "best content" by simple length comparison. The assumption is that the format of the content (plain text, HTML, XHTML...) doesn't matter, or more exactly that the additional length of (X)HTML encoding indicates better content.
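Expressed as a Python sketch, the selection rule is a one-liner. The function name and the dictionary of candidates are assumptions for illustration; which tags Liferea really collects is not spelled out here.

```python
def best_content(candidates):
    """Pick the longest non-empty content among all candidate sources.

    candidates: dict mapping a source tag name to its text (or None).
    Returns None when no candidate has any content.
    """
    non_empty = [text for text in candidates.values() if text]
    return max(non_empty, key=len, default=None)


item = {
    "description": "Short teaser.",
    "content:encoded": "<p>The full article text with <b>markup</b>.</p>",
    "itunes:summary": None,
}
print(best_content(item))
```

With these candidates the <content:encoded> text wins simply because it is the longest, regardless of the order in which the tags appeared in the feed.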

As a result you might see additional content in namespace-rich feeds (e.g. iTunes podcast feeds).

31 January 2008

Improved LiveJournal BlogRoll Support

Today's stable release 1.4.12 adds support for the non-standard "xmlURL" OPML attributes LiveJournal uses in its blogrolls, as mentioned in yesterday's post. So if you use LiveJournal and want to subscribe to your friends' blogroll then you should upgrade to 1.4.12!

30 January 2008

LiveJournal BlogRoll Export Problems

Nikolasco describes in detail how to work around LiveJournal blogroll export problems using a Ruby script and cron.