PmWiki / FlatFileAdvantages

PmWiki stores pages in flat files instead of using a relational database such as MySQL. This page explains why that design decision was made.

Pm's Explanation

Pm: I chose flat files to store PmWiki pages because I haven't seen any real advantage to using a database, and there are definitely some disadvantages. For the standard operations (view, edit, page revisions), reading the information from flat files is clearly faster than fetching it from a database, and with page caching (coming soon) it will be even faster. The only operations that really benefit from a database are searches, but I've always believed that for fast, flexible search it's much better to use an existing search program such as ht://Dig or Google than to reinvent another search engine. PmWiki's Site.Search is functional and fast enough for most purposes, and if more performance is needed it's better to switch to a real search engine.

Indeed, as of January 2004 Wikipedia uses a MySQL database to store its 190K+ entries, but even with the database, Wikipedia has disabled its online search because of performance issues and simply forwards search queries to Google.
(See the talk page.)

And there are big disadvantages to using a database: we'd have to write a bunch of "administrative" tools and scripts to handle things such as mass page deletions, backups and restores of pages, recovering pages that were wrongly deleted, and so on. A flat file system eliminates much of that programming overhead, because admins can use existing tools they are already comfortable with (FTP clients, web-based file/directory managers, shell commands). It's also much easier to build sophisticated, customized page management tools and scripts for specialized applications.
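To illustrate how little code such page management scripts need, here is a minimal sketch of a backup and a mass deletion written with nothing but the Python standard library. It assumes one file per page, named Group.PageName, in a single pages directory such as wiki.d/. The function names backup_pages and delete_group are hypothetical and are not part of PmWiki.

```python
import tarfile
from pathlib import Path


def backup_pages(wiki_dir: Path, archive: Path) -> int:
    """Pack every file under wiki_dir into a gzip-compressed tar archive.

    Returns the number of files archived.
    """
    # Collect the file list first, so the archive itself is never picked up
    # even if it is created inside wiki_dir.
    files = sorted(p for p in wiki_dir.rglob("*") if p.is_file())
    with tarfile.open(archive, "w:gz") as tar:
        for path in files:
            # Store paths relative to wiki_dir so a restore is a plain extract.
            tar.add(path, arcname=str(path.relative_to(wiki_dir)))
    return len(files)


def delete_group(wiki_dir: Path, group: str) -> list[str]:
    """Delete every page file named '<group>.<something>' (assumed layout).

    Returns the names of the deleted files, sorted.
    """
    removed = []
    for path in sorted(wiki_dir.glob(f"{group}.*")):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Taking a backup before a mass deletion, for example backup_pages(Path("wiki.d"), Path("pages-backup.tar.gz")) followed by delete_group(Path("wiki.d"), "Test"), means a wrongly deleted page can be recovered by extracting one file from the archive.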

Finally, PmWiki is already structured such that the flat file structure can be easily replaced by a database if it ever proves necessary. However, even PmWiki sites with more than 40 000 pages function well in a flat file system without any noticeable performance problems.

PmWiki can subdivide the wiki.d/ directory into a separate subdirectory for each group, which avoids the "too large" directory problem of keeping every page file in a single directory. See Cookbook:PerGroupSubDirectories for more information.
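The effect of per-group subdirectories on where a page file lives can be sketched as follows. This is only an illustration of the idea, not the recipe itself (which is configured in PmWiki's own PHP settings). The layout wiki.d/Group/Group.PageName and the helper name page_path are assumptions made for the example.

```python
from pathlib import Path


def page_path(wiki_dir: Path, pagename: str, per_group: bool = True) -> Path:
    """Return the file path for a page named 'Group.Name'.

    With per_group=True the file goes into a subdirectory named after the
    group (assumed layout); otherwise every page shares one directory.
    """
    group, sep, name = pagename.partition(".")
    if not sep or not group or not name:
        raise ValueError(f"expected 'Group.Name', got {pagename!r}")
    if per_group:
        return wiki_dir / group / pagename
    return wiki_dir / pagename


# Flat layout:      wiki.d/Main.HomePage
# Per-group layout: wiki.d/Main/Main.HomePage
print(page_path(Path("wiki.d"), "Main.HomePage", per_group=False).as_posix())
print(page_path(Path("wiki.d"), "Main.HomePage").as_posix())
```

With this layout, the number of entries in any one directory is bounded by the size of the largest group rather than by the size of the whole wiki.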

Comments:

the sectors can be on any disk or server. Result is you can use one server/disk for DB, another server for PHP and a third for web server. You can share out load and get better overall performance even in very heavy usage. Of course that may not be the goal of PmWiki, ;-). -- Peter
Well, you can always use NFS if you want your files on another server. But in both cases, NFS or a DB, running them on another server is actually likely to increase your latency and not necessarily increase your throughput. The advantage of a separate DB is more apparent when you need more than one client accessing it at the same time, which, of course, you can also do with NFS. The DB might provide better locking mechanisms, but they are not likely to be important to PmWiki (it is not write-heavy enough). How do you suggest running PHP on a different server from your web server? And, whatever your solution for this, wouldn't it also be available without a DB? -- Martin Fick

Category: PmWiki Design

This page may have a more recent version on pmwiki.org: PmWiki:FlatFileAdvantages, and a talk page: PmWiki:FlatFileAdvantages-Talk.