01 August 2008

Why auto-VACUUM is no good...

During the various performance discussions during the last time here and there people suggested to run "VACUUM" on the Liferea database once it gets slow. This is in line with the sqlite documentation which says:

When an object (table, index, or trigger) is dropped from the database, it leaves behind empty space. This makes the database file larger than it needs to be, but can speed up inserts. In time inserts and deletes can leave the database file structure fragmented, which slows down disk access to the database contents.

The VACUUM command cleans the main database by copying its contents to a temporary database file and reloading the original database file from the copy. This eliminates free pages, aligns table data to be contiguous, and otherwise cleans up the database file structure.


The problem with it is that it also takes very long. With a 50MB DB file I experienced a runtime of over 1 minute. This is why this can be only a tool for experienced users that know how to do it manually knowing what to expect. For executing such a long term operation automatically on runtime would surely be unacceptable to the unsuspecting user. Also there is no good way how to decide when to do a VACUUM to save disk space and improve performance.

2 comments:

Anonymous said...

What about a button in the Settings window to run vacuum? Either directly (modally, tell the user "this is going to take a few minutes" and it's ok), or on next Quit.

Lars said...

To do this Liferea would have to detach from the DB and to reattach afterwards. Also there should be a way for the user to cancel this long-term operation. IMO this is quite hard to implement. And last but not least that's bad usability.